A First Encounter with SIESTA¶
Are you using this tutorial for a school?
Ask tutors if you need to set up your local environment. Do that now before proceeding.
This first tutorial provides a first acquaintance with SIESTA by studying one simple molecule, CH4. It covers most of the basic features and concepts, without worrying too much about issues of convergence or fine-tuning.
background Background Information¶
Additional information: Slides
These slides from previous presentations might help improve your understanding of the concepts presented in this tutorial:
Presentation by Emilio Artacho
As a general background for basis sets, you can watch the 2021 talks by Emilio Artacho:
This tutorial will only cover basic, general options for basis set generation. We encourage you to jump into the dedicated tutorial for basis set optimization afterwards.
The inputs of SIESTA¶
The essential inputs of SIESTA are the FDF (Flexible Data Format) file and the pseudopotential file.
FDF file¶
It is the core input, since it provides information about the system and sets the values of several parameters regarding the quality of the calculation, as well as the pseudopotential information.
The FDF file (input.fdf) is a plain-text input file. It is
structured in name-value pairs—that is, a name characterizing the data and
its value. Units can be specified after the value. These name-value pairs
can be part of more complex structures called blocks
(%block label [...] %endblock
). The FDF file is case-insensitive,
and characters -
, _
, and .
in a data label are ignored.
The list of available descriptors is extensive; however, most of them have default values. To describe the system for a calculation, the minimum information needed is provided by the following descriptors:
General system information: SystemLabel, NumberOfSpecies, NumberOfAtoms
Structural information: LatticeConstant, LatticeVectors, AtomicCoordinatesFormat, AtomicCoordinatesAndAtomicSpecies.
As a part of the FDF file we can find the information regarding the basis set used. SIESTA offers many options for the specification of the basis set. Two of the key concepts are the cardinality of the set and range of the orbitals; both can be specified, in most cases, simply with a couple of lines.
The cardinality of the set, or its size, can be set by PAO.BasisSize. It follows the usual convention of amount of functions per valence orbital, the possible options being SZ, SZP, DZ, DZP… For example, for C, a SZ basis set has 4 functions (s+p) and a DZP basis set has 13 functions (2*s, 2*p, and a d polarization shell).
The energy shift is a physical concept determining the increase in energy brought about by the confinement of the orbital; the lower the energy shift, the larger the cut-off radii of basis orbitals. It is set using PAO.EnergyShift.
As in any DFT code, the exchange-correlation functional must be specified; the default option is using the local density approximation (LDA) for our calculations. However, it is also possible to use other functionals, such as those of GGA type. The information of the exchange-correlation functional should be provided like this:
XC.functional GGA
XC.authors PBE
Notice that both XC.Authors and
XC.Functional are needed, and they must be
consistent, check the
manual.
For XC.Functional GGA
, the most used variants are BLYP, PBE and PBEsol. For
XC.Functional VDW
, frequently used functionals are DRSLL, LMKLL and VV. See
the manual for more options on XC functionals.
Pseudopotentials¶
The other essential input of SIESTA is the pseudopotential information. SIESTA uses pseudopotentials to represent the core electron-ion interaction, as do most plane-wave codes (in contrast to so-called “all-electron” programs). In particular, the pseudopotentials are of the “norm-conserving” kind. The choice of a pseudopotential significantly impacts the quality of the calculation.
SIESTA reads pseudopotentials from different files based on the species information in %block ChemicalSpeciesLabel. By default, files are searched for in the current directory. SIESTA accepts different formats, but PSML is the one recommended. Curated databases of high-quality PSML files are available. In particular, the Pseudo-Dojo project offers PSML files for almost the entire periodic table, along with reports detailing the tests performed during their generation. The database includes variants for scalar and fully relativistic pseudopotentials using LDA, PBE, and PBEsol functionals.
Generating a new pseudopotential from scratch is beyond the scope of this tutorial, but more details can be found in Pseudopotentials. Note that ATOM generates psf files, which are also supported by SIESTA.
The pseudopotential should be tested. The method for testing depends on the type of system being simulated, so reviewing existing literature is strongly recommended.
The output¶
SIESTA writes a log of its workings to standard output
, which is
recommended to be redirected to an output file. In the standard output,
the entire thinking process of SIESTA can be followed: starting from
reading the input parameters, defining the system, and proceeding to
the self-consistent-field convergence.
System properties can be found in it; however, not all of them are written by default. More detailed information will be given in the exercises: see below.
basic Look at the input files¶
Are you under a school environment?
Enter directory 01-CH4-Basic
You will find an input file named ch4.fdf, along with the files C.psml and H.psml which contain the information about the pseudopotentials.
A peek at the fdf file
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 | #General system specifications SystemName CH4 molecule SystemLabel ch4 NumberOfAtoms 5 NumberOfSpecies 2 %block ChemicalSpeciesLabel 1 6 C # Species index, atomic number, species label 2 1 H # Species index, atomic number, species label %endblock ChemicalSpeciesLabel #Atomic coordinates AtomicCoordinatesFormat Ang %block AtomicCoordinatesAndAtomicSpecies 0.000 0.000 0.000 1 1.219 -0.284 -0.377 2 -0.284 1.219 -0.377 2 -0.140 -0.140 1.219 2 -0.833 -0.833 -0.503 2 %endblock AtomicCoordinatesAndAtomicSpecies #Unit cell for the calculation LatticeConstant 15 Ang %block LatticeVectors 1.000 0.000 0.000 0.000 1.000 0.000 0.000 0.000 1.000 %endblock LatticeVectors # Basis set definition PAO.EnergyShift 250 meV PAO.SplitNorm 0.15 PAO.BasisSize SZ #Real space grid MeshCutoff 125.0 Ry # Convergence of SCF MaxSCFIterations 50 DM.MixingWeight 0.4 DM.NumberPulay 2 # Type of solution SolutionMethod diagon |
We find first the descriptors that specify the system
2 3 4 5 6 7 8 9 10 | SystemName CH4 molecule SystemLabel ch4 NumberOfAtoms 5 NumberOfSpecies 2 %block ChemicalSpeciesLabel 1 6 C # Species index, atomic number, species label 2 1 H # Species index, atomic number, species label %endblock ChemicalSpeciesLabel |
Note
You will notice several lines starting with #. These are comments on the file and will not be read by SIESTA.
The SystemName is only used for tagging purposes, while the SystemLabel will determine the root name for all output files (i.e., they will all begin by ch4.* in our case). NumberOfAtoms indicates the total number of atoms in our simulation box, and NumberOfSpecies indicates the different kinds of atoms present in the system.
Pay special attention to the %block ChemicalSpeciesLabel. In this block, you assign an index, an atomic number, and a label to each atomic species. The atomic number is, as usual, the nuclear charge of the atom. The label will allow SIESTA to recognize the files containing the information about the pseudopotential and the basis set (when provided); this means that the name of the label must coincide with the name of the pseudopotential file. The amount of species in this block must be the one specified by the NumberOfSpecies input.
Note
Options in the fdf file are case-insensitive. This means that NumberOfAtoms is the same as numberofatoms or NUMBEROFATOMS.
A few lines below, we find the coordinates for the atoms in the system
13 14 15 16 17 18 19 20 21 | AtomicCoordinatesFormat Ang %block AtomicCoordinatesAndAtomicSpecies 0.000 0.000 0.000 1 1.219 -0.284 -0.377 2 -0.284 1.219 -0.377 2 -0.140 -0.140 1.219 2 -0.833 -0.833 -0.503 2 %endblock AtomicCoordinatesAndAtomicSpecies |
In this block, each of the lines corresponds to an atom. The first three numbers are the X, Y and Z coordinates for each atom, and the last (integer) number indicates the species index in the %block ChemicalSpeciesLabel. In this case, there is a Carbon atom (species index = 1) in the position (0.0,0.0,0.0), and the rest are the four Hydrogens (species index = 2).
The AtomicCoordinatesFormat option can also be set to Bohr (which is the default value), or to Fractional if one desires to set the coordinates as a fraction of the lattice vectors (see below).
You might have wondered about the existence of this block in the input file:
24 25 26 27 28 29 30 | #Unit cell for the calculation LatticeConstant 15 Ang %block LatticeVectors 1.000 0.000 0.000 0.000 1.000 0.000 0.000 0.000 1.000 %endblock LatticeVectors |
It does not seem to make sense for a ‘molecule’ calculation. In fact, SIESTA always uses periodic boundary conditions (PBC); this means that we are doing a calculation for an infinite collection of regularly spaced molecules. If we want to simulate an isolated molecule, it is important to have enough distance between the molecule and its neighboring images. At the very minimum, there should not be overlap between the orbitals on different image molecules; however, this size should be bigger for molecules with a dipole, which will have a long-range interaction with their images in PBC.
The input option LatticeConstant is mandatory and multiplies the lattice vectors by that constant. In this case, we will end up with a cubic simulation box of 15 x 15 x 15 Angstrom.
Additional information: Periodic crystals
Imagine that instead of studying methane, we want to study bulk silicon.
Silicon is a periodic face-centered cubic structure that can be defined
by the primitive unit cell with a = 5.546790 Å
, given by the lattice
vectors:
\(a_1 = \frac{a}{2}(\hat{y} + \hat{z})\), \(a_2 = \frac{a}{2}(\hat{x} + \hat{z})\), and \(a_3 = \frac{a}{2}(\hat{x} + \hat{y})\).
The unit cell contains two atoms positioned at \((-0.125, -0.125, -0.125)\) and \((0.125, 0.125, 0.125)\) in fractional coordinates.
Answer: Can you write the structural parameters to define silicon bulk?
The descriptors that define the structure are:
NumberOfSpecies
,NumberOfAtoms
,%block ChemicalSpeciesLabel
,LatticeConstant
,%block LatticeVectors
,AtomicCoordinatesFormat
and%block AtomicCoordinatesAndAtomicSpecies
Thus, the input will have an structure like the following:
SystemLabel Si NumberOfSpecies 1 NumberOfAtoms 2 %block ChemicalSpeciesLabel 1 14 Si %endblock ChemicalSpeciesLabel LatticeConstant 5.546790 Ang %block LatticeVectors 0.00 0.50 0.50 0.50 0.00 0.50 0.50 0.50 0.00 %endblock LatticeVectors AtomicCoordinatesFormat Fractional %block AtomicCoordinatesAndAtomicSpecies -0.125 -0.125 -0.125 1 28.086 0.125 0.125 0.125 1 28.086 %endblock AtomicCoordinatesAndAtomicSpecies
In the rest of the file you will find options regarding the quality of the calculation and other performance related variables:
33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 | # Basis set definition PAO.EnergyShift 250 meV PAO.SplitNorm 0.15 PAO.BasisSize SZ #Real space grid MeshCutoff 125.0 Ry # Convergence of SCF MaxSCFIterations 50 DM.MixingWeight 0.4 DM.NumberPulay 2 # Type of solution SolutionMethod diagon |
The specifics of these will be explored in other tutorials; but to provide a brief overview:
The first three variables are a rough definition the quality of the basis set. The number of orbitals per atom is defined by the parameter PAO.BasisSize, and we have set it to select a minimal basis (SZ) for a quick, cheap calculation disregarding quality.
The parameter MeshCutoff controls the fineness of the real-space grid used to compute the integrals for the matrix elements of the Hamiltonian. A higher value corresponds to a finer grid.
The next three options control the performance of the self-consistent field (SCF) cycle: MaxSCFIterations, DM.MixingWeight, DM.NumberPulay.
The kind of solver used is set by SolutionMethod.
While the parameters specifying the system are mandatory, all other parameters have some default values and, in principle, it is not necessary to explicitly include them in the input file. However, it is important to note that the default values do not always guarantee a fast or a good quality calculation.
basic Running SIESTA¶
To run SIESTA via command line, you just need to do
$ siesta < ch4.fdf > ch4.out
or simply
$ siesta ch4.fdf > ch4.out
You will see that a lot of output files are generated in the simulation directory. We will begin by having a look at the main output file, ch4.out (or any name that you gave to the standard output).
basic A look at the output¶
Take a look at ch4.out once the program has finished. The file contains a human-readable account of the doings of SIESTA for this calculation:
A header with the version and compilation options, and a copy (in the first mode above) or a mention (in the second mode) of the input file.
Check it!
Siesta Version : 5.0 Architecture : ---- Compiler version: GNU-11.4.0 Compiler flags : -fallow-argument-mismatch;-O3 -march=native PP flags : ---- Libraries : ---- Parallelisations: MPI [...] ************************** Dump of input data file **************************** #General system specifications SystemName CH4 molecule SystemLabel ch4 NumberOfAtoms 5 NumberOfSpecies 2 [...] # Type of solution SolutionMethod diagon ************************** End of input data file *****************************
A summary of existing species and pseudopotential information.
Check it!
initatom: Reading input for the pseudopotentials and atomic orbitals ---------- Species number: 1 Atomic number: 6 Label: C Species number: 2 Atomic number: 1 Label: H [...] Valence configuration for ps generation: (assumed as above) For C, standard SIESTA heuristics set lmxkb to 2 (one more than the basis l, including polarization orbitals). Use PS.lmax or PS.KBprojectors blocks to override. For H, standard SIESTA heuristics set lmxkb to 1 (one more than the basis l, including polarization orbitals). Use PS.lmax or PS.KBprojectors blocks to override.
Details on the basis set generation.
Check it!
<basis_specs> =============================================================================== C Z= 6 Mass= 12.011 Charge= 0.17977+309 Lmxo=1 Lmxkb= 2 BasisType=split Semic=F L=0 Nsemic=0 Cnfigmx=2 i=1 nzeta=1 polorb=0 (2s) [...] izeta = 1 lambda = 1.000000 rc = 4.810269 energy = -0.447136 kinetic = 0.945150 potential(screened) = -1.392287 potential(ionic) = -1.930221 atom: Total number of Sankey-type orbitals: 1 atm_pop: Valence configuration (for local Pseudopot. screening): 1s( 1.00) Vna: chval, zval: 1.00000 1.00000 Vna: Cut-off radius for the neutral-atom potential: 4.810269
A summary of the values used in the calculation for the most important parameters.
Check it!
siesta: ******************** Simulation parameters **************************** siesta: siesta: The following are some of the parameters of the simulation. siesta: A complete list of the parameters used, including default values, siesta: can be found in file out.fdf siesta: redata: Spin configuration = none redata: Number of spin components = 1 [...] ts: ************************************************************** ts: Save H and S matrices = F ts: Save DM and EDM matrices = F ts: Only save the overlap matrix S = F ts: **************************************************************
Information of the geometry and matrix sparsity for a given geometry.
Check it!
outcell: Unit cell vectors (Ang): 15.000000 0.000000 0.000000 0.000000 15.000000 0.000000 0.000000 0.000000 15.000000 outcell: Cell vector modules (Ang) : 15.000000 15.000000 15.000000 outcell: Cell angles (23,13,12) (deg): 90.0000 90.0000 90.0000 outcell: Cell volume (Ang**3) : 3375.0000 <dSpData1D:S at geom step 0 <sparsity:sparsity for geom step 0 nrows_g=8 nrows=8 sparsity=1.0000 nnzs=64, refcount: 7> <dData1D:(new from dSpData1D) n=64, refcount: 1> refcount: 1> new_DM -- step: 1 [...] InitMesh: MESH = 108 x 108 x 108 = 1259712 InitMesh: Mesh cutoff (required, used) = 125.000 143.274 Ry New grid distribution [1]: sub = 2 stepf: Fermi-Dirac step function
Followed by a full breakdown of the initial, non-converged energy.
Check it!
siesta: Program's energy decomposition (eV): siesta: Ebs = -86.615317 siesta: Eions = 380.812454 [...] siesta: Etot = -214.688379 siesta: FreeEng = -214.688379
A log of the self-consistent-field (SCF) cycle.
Check it!
iscf Eharris(eV) E_KS(eV) FreeEng(eV) dDmax Ef(eV) dHmax(eV) scf: 1 -224.477055 -214.688379 -214.688379 1.130398 -6.803327 1.039357 timer: Routine,Calls,Time,% = IterSCF 1 0.314 28.15 scf: 2 -214.708050 -214.704182 -214.704182 0.022332 -6.481242 0.243330 scf: 3 -214.704223 -214.704283 -214.704283 0.002871 -6.378875 0.148294 scf: 4 -214.704298 -214.704314 -214.704314 0.001596 -6.231017 0.052566 scf: 5 -214.704324 -214.704321 -214.704321 0.000517 -6.269207 0.001236 scf: 6 -214.704321 -214.704321 -214.704321 0.000031 -6.268437 0.000383 SCF Convergence by DM+H criterion max |DM_out - DM_in| : 0.0000307918 max |H_out - H_in| (eV) : 0.0003831179 SCF cycle converged after 6 iterations
A full breakdown of the total energy decomposition and forces, and stress.
Check it!
siesta: Program's energy decomposition (eV): siesta: Ebs = -88.492716 siesta: Eions = 380.812454 [...] siesta: Total = -214.704321 siesta: Fermi = -6.268437 [...] siesta: Atomic forces (eV/Ang): siesta: 1 0.140447 0.140447 -1.039696 siesta: 2 -2.496202 0.517116 0.806177 siesta: 3 0.517116 -2.496202 0.806177 siesta: 4 0.363575 0.363575 -1.130990 siesta: 5 1.475788 1.475788 0.558249 siesta: ---------------------------------------- siesta: Tot 0.000724 0.000724 -0.000084 siesta: Stress tensor (static) (eV/Ang**3): siesta: 0.001324 -0.000018 -0.000133 siesta: -0.000018 0.001324 -0.000133 siesta: -0.000133 -0.000133 0.000671
The last part of the output file includes information on the cell pressure and the total electric dipole of the system.
Check it!
siesta: Cell volume = 3375.000000 Ang**3 siesta: Pressure (static): siesta: Solid Molecule Units siesta: -0.00001205 0.00000000 Ry/Bohr**3 siesta: -0.00110663 0.00000026 eV/Ang**3 siesta: -1.77300959 0.00042191 kBar (Free)E+ p_basis*V_orbitals = -214.210260 (Free)Eharris+ p_basis*V_orbitals = -214.210260 siesta: Electric dipole (a.u.) = -0.011736 -0.011736 0.009580 siesta: Electric dipole (Debye) = -0.029830 -0.029830 0.024350
And to finish, literature regarding the simulation done.
Check it!
cite: Please see "ch4.bib" for an exhaustive BiBTeX file. cite: Please clearly indicate Siesta version in published work: 5.1-MaX cite: This calculation has made use of the following articles cite: which are encouraged to be cited in a published work. Primary SIESTA paper DOI: www.doi.org/10.1088/0953-8984/14/11/302
As you can see, SIESTA generates a ton of output files besides the main log/out file. Each of these might interest you depending on what you intend to do with your simulation, but some of the most commonly relevant files are:
*.DM files (ch4.DM) contain a restart of the Density Matrix. Useful to restart already converged or close-to-convergence calculations without doing so from scratch.
*.XV files (ch4.XV) contain a restart of the current geometry and cell size. This is particularly useful to restart or continue molecular dynamics runs.
*.FA files (ch4.FA) contain total forces over all atoms. Each line corresponds to the cartesian force (x, y and z) over each atom (in eV/Ang).
*.EIG files (ch4.EIG) contain the eigenvalues of the Hamiltonian for all k-points involved. k-points are explored in more detail in Sampling of the BZ with k-points.
*.times and TIMES files (ch4.times and TIMES) contain timing information, which is invaluable when optimizing parameters for fast performance.
Now that you had a brief overview of how running SIESTA works, let’s play a bit with your CH4 inputs. See what happens if…:
You change the fineness of the real-space grid. Try an unreasonably low value for the the parameter MeshCutoff (around 10 Ry) and check the resulting total energy and forces (you can also find the forces in the file ch4.FA). Then experiment with a much higher one (300 Ry). More on this topic will be covered in this tutorial.
Answer: Change of MeshCutoff.
Everytime that you run SIESTA in a folder, all the output will be overwritten, thus to be able to compare the results of the diferent runs, a specific directory should be created for each of them.
The information or the total energy can be extracted in a more automatic way doing a
grep
ofTotal =
of the standard output file and to extract the total force look for the line that containssiesta: Tot
; for example if you have structured the calculation in different folders:$ grep -i "siesta: Tot" */output.dat10ry/output.dat:siesta: Tot 1.798364 1.798364 -3.227789 300ry/output.dat:siesta: Tot -0.000058 -0.000058 -0.000291$ grep -i "siesta: Total =" */output.dat10ry/output.dat:siesta: Total = -230.840670 300ry/output.dat:siesta: Total = -222.538059You can play with the size of the lattice parameters to go from ‘interacting’ molecules to effectively isolated ones. Look at the variation in the total energy as a function of the cell size, to see how the interaction between molecules decreases with increasing distance between images. For this non-polar molecule, the interaction should be very small.
Answer: Change of lattice parameter.
Everytime that you run SIESTA in a folder, all the output will be overwritten, thus to be able to compare the results of the diferent runs, a specific directory should be created for each of them.
The information or the total energy can be extracted in a more automatic way doing a
grep
ofTotal =
of the standard output file and to extract the total force look for the line that containssiesta: Tot
. For example, if you have structured the calculation in different folders:$ grep -i "Total =" */output.dat15AA/output.dat:siesta: Total = -222.538221 35AA/output.dat:siesta: Total = -222.538164 7_5AA/output.dat:siesta: Total = -222.538303
basic Simple basis-sets options¶
Now let’s change some options related to basis-sets.
Are you under a school environment?
Enter directory 02-CH4-Basis
You can find the input files here:
A peek at the fdf file
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 | #General system specifications SystemName CH4 molecule SystemLabel ch4 NumberOfAtoms 5 NumberOfSpecies 2 # Species index, atomic number, species label %block ChemicalSpeciesLabel 1 6 C 2 1 H %endblock ChemicalSpeciesLabel #Atomic coordinates AtomicCoordinatesFormat Ang %block AtomicCoordinatesAndAtomicSpecies 0.000 0.000 0.000 1 1.219 -0.284 -0.377 2 -0.284 1.219 -0.377 2 -0.140 -0.140 1.219 2 -0.833 -0.833 -0.503 2 %endblock AtomicCoordinatesAndAtomicSpecies #Unit cell for the calculation LatticeConstant 15 Ang %block LatticeVectors 1.000 0.000 0.000 0.000 1.000 0.000 0.000 0.000 1.000 %endblock LatticeVectors #Real space grid MeshCutoff 125.0 Ry # Convergence of SCF DM.MixingWeight 0.4 DM.NumberPulay 2 # Variables for Geometry optimization # MD.TypeOfRun CG # MD.NumCGsteps 50 # Basis set definition PAO.BasisSize SZ PAO.EnergyShift 250 meV |
Change basis-set size or cardinality of the basis. Set PAO.BasisSize
to any
of: DZ, SZP, DZP, TZP. See what happens with the total energy.
Answer: Change of basis-set size.
In order to modify the bases-set size you should edit the line with the specific descriptor in the fdf file.
33 34 35 36 | # Basis set definition PAO.EnergyShift 250 meV PAO.SplitNorm 0.15 PAO.BasisSize SZ |
SZ should be replaced by DZ, SZP or DZP.
Everytime that you run SIESTA in a folder, all the output will be overwritten; thus, to be able to compare the results of the diferent runs, a specific directory should be created for each of them.
The information of the total energy can be obtained by manually checking the
standard output (as suggested above) or in a more automatic way doing a
grep
of Total =
of the standard output file. For example, if you have
structured the calculation in different folders:
$ grep -i "Total =" */output.dat
ene_DZ/output.dat:siesta: Total = -223.913569
ene_DZP/output.dat:siesta: Total = -224.380785
ene_SZ/output.dat:siesta: Total = -222.538221
ene_SZP/output.dat:siesta: Total = -222.825577
Let’s look at the effect of changing the basis set in some parameters with physical meaning, like the bond length. In this tutorial you will be doing a geometry optimization for CH4.
The input file ch4.fdf is already set up to optimize the geometry of a
single methane molecule, but until now it was comemented. Let’s uncomment the
corresponding lines by removing #
(line 39 and 40).
38 39 40 | # Variables for Geometry optimization # MD.TypeOfRun CG # MD.NumCGsteps 50 |
At the bottom of the file, you will find two options used to define the basis set:
PAO.BasisSize SZ
PAO.EnergyShift 250 meV
The idea of this exercise is to see what happens with the total energy and the bond lengths of the system when modifying these options. The bond lengths can be found in the ch4.BONDS_FINAL file.
The SystemLabel.BONDS_FINAL contains the information regarding the atomic position of each atom of the unit cell and their distance with respect to each of the other ones.
See what happens when you increase the size of the basis set to SZP and DZP.
Answer: Effect on the bond’s length.
The bond length will be found in the ch4.BONDS_FINAL. As we are studing ch4 we will have 1 C and 4 H, for each of them we will know the distances to their nearest neighbors. Let’s check the output for the SZ case focusing on the carbon atom:
Neighbors of: 1 C at: -0.0150 -0.0150 -0.0133 4 H 1.1911 Ang. Really at: -0.1868 -0.1868 2.2244 3 H 1.1913 Ang. Really at: -0.5086 2.0880 -0.6471 2 H 1.1913 Ang. Really at: 2.0880 -0.5086 -0.6471 5 H 1.1915 Ang. Really at: -1.4499 -1.4499 -0.9885
The index and position of carbon are given by the first line, then, for each H atom, we have the index, label and distance to the carbon atom. Also, their absolute position is included. Your final results should look similar to this:
Size Total energy C-H2 C-H3 C-H4 C-H5 -------------------------------------------------------------- SZ : Total: -222.996782 eV, Bonds: 1.1913/1.1913/1.1911/1.1915 SZP: Total: -223.522888 eV, Bonds: 1.1601/1.1601/1.1596/1.1591 DZP: Total: -225.830785 eV, Bonds: 1.1126/1.1126/1.1146/1.1149
For DZP, try changing the energy shift to 100 meV, 50 meV and 10 meV.
Answer: Change of the energy shift.
Shift Total energy C-H2 C-H3 C-H4 C-H5 -------------------------------------------------------------- 100: Total: -225.929648 eV, Bonds: 1.1220/1.1220/1.1222/1.1223 50 : Total: -225.929092 eV, Bonds: 1.1194/1.1194/1.1196/1.1197 10 : Total: -225.760365 eV, Bonds: 1.1289/1.1289/1.1292/1.1296
Consider that the experimental bond-length for gas methane is 1.087 Å.
Which has the biggest effect in the energy, increasing the cardinality/size of the basis set, or changing the energy shift?
Answer: Effect in the energy.
From the first experiment, we can see that the cardinality of the basis set induces a change in the range of a few eV, which is much larger than that induced by the energy shift. Be aware that this happens when the basis set is relatively small; for basis sets larger than DZP (such as TZP) you don’t usually get a huge improvement. However, going to SZ to SZP, or from SZP to DZP, the change in flexibility of the basis set is a determining factor in the quality of the results.
In which case would you consider the basis set is “good enough”?
Answer: How do we select the good one?
We definitely need the DZP basis set to start getting reasonable results. But with regards to the energy shift, we can see that 50 meV is already reasonable enough to get good results. The total energy seems to be already converged to the meV level, and the bond lengths are also reasonable. That said, increasing the energy shift is not detrimental in this case, it will just increase the time of the calculation.
Is it ok to compare the results to an experimental bond length? Or would it be better to check against an already converged basis set?
Answer: Is the comparison with the experimental possible?
The question is tricky and depends on what you are going after. If you are testing convergence of parameters, then by all means, compare to an already converged case, even if it is with another type of basis set (such as gaussians or plane-waves). That said, if you are already sure that your input parameters are good enough, then comparing to experimental values is a good idea: it will give you a hint on how your level of theory (i.e., DFT with a given functional) is able to model the real world data. In most cases, the best approach is to compare trends between experiments and theory, rather than absolute values.
basic DFT functional¶
Are you under a school environment?
Enter directory 03-CH4-XC-Functional
You can find the input files here:
Pseudos GGA:
C.gga.psml
andH.gga.psml
Input file:
ch4.fdf
A peek at the fdf file
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 | #General system specifications SystemName CH4 molecule SystemLabel ch4 NumberOfAtoms 5 NumberOfSpecies 2 # Species index, atomic number, species label %block ChemicalSpeciesLabel 1 6 C 2 1 H %endblock ChemicalSpeciesLabel #Atomic coordinates AtomicCoordinatesFormat Ang %block AtomicCoordinatesAndAtomicSpecies 0.000 0.000 0.000 1 1.219 -0.284 -0.377 2 -0.284 1.219 -0.377 2 -0.140 -0.140 1.219 2 -0.833 -0.833 -0.503 2 %endblock AtomicCoordinatesAndAtomicSpecies #Unit cell for the calculation LatticeConstant 15 Ang %block LatticeVectors 1.000 0.000 0.000 0.000 1.000 0.000 0.000 0.000 1.000 %endblock LatticeVectors # Basis set definition PAO.EnergyShift 250 meV PAO.SplitNorm 0.15 PAO.BasisSize SZ # Exchange-Correlation functional XC.Functional LDA # Try: GGA XC.Authors PZ # Try: PBE, PBEsol #Real space grid MeshCutoff 125.0 Ry # Convergence of SCF DM.MixingWeight 0.4 DM.NumberPulay 2 |
Up to now we have been implicitly using the local density approximation (LDA) for our calculations, which is the default option in SIESTA. However, it is also possible to use other functionals, such as those of GGA type. Edit the ch4.fdf file to include this block:
XC.functional GGA
XC.authors PBE
Notice that both XC.Authors and
XC.Functional are needed, and they must be
consistent. For XC.Functional GGA
, the most used variants are BLYP, PBE
and PBEsol. For XC.Functional VDW
, frequently used functionals are DRSLL,
LMKLL and VV. See the manual for more options on XC functionals.
Run SIESTA and look for possible lines with WARNING
or ERROR
in them.
Answer: Change of the functional.
Even though the program ran successfully, you will find the following warning in the output:
xc_check: WARNING: Pseudopotential generated with LDA PW92 functional
You are using a GGA functional with a pseudopotential generated using LDA, as this is not consistent!. Fortunately, we have produced also the pseudopotentials using GGA for you. They are in the files C.gga.psml and H.gga.psml.
Modify the input file again to use these files by simply changing the species strings:
%block ChemicalSpeciesLabel
1 6 C.gga # Species index, atomic number, species label
2 1 H.gga # Species index, atomic number, species label
%endblock ChemicalSpeciesLabel
Remember, the species name can be anything you want, but it must be consistent with the name of the pseudopotential files. Run the program again and check whether the warning disappears from the output.
It is worth noting that, as with pseudopotentials, DFT functionals are not easy to compare between each other systematically. Again, it is worth expending some time searching the literature for the type of simulations you are trying to run.
Additional information: Visualizing the structure
To visualize the structure included in the fdf file siesta provides a utility that allows you to convert it to stardand crystal formats, such us xsf or POSCAR, which then can be open with most of the graphical tools.
xv2xsf
[it is available with the usual SIESTA compilation]. Converts the XV output, where atomic position and lattice vectors are saved, to an xsf file.
$ xv2xsf
It will ask which is the SystemLabel used, which, for this initial tutorial, it was ch4.
Specify SystemLabel (or 'siesta' if none): ch4
Found and opened: ch4.XV
Then, a ch4.XSF is genereated, and it can be visualize, for example using ovito.
sgeom [it is an exerternal tool part to the sisl python library]. Converts either fdf file or the XV to POSCAR file, additionally allows you to do some simple modifications to the structure.
$ sgeom ch4.fdf POSCAR
or
$ sgeom ch4.fdf ch4.xsf