A First Encounter - Part 1: Running SIESTA¶
In this exercise we will get a first acquaintance with SIESTA by studying two simple molecules, CH4 and CH3. We will cover quite a lot of features and concepts, without worrying too much about issues of convergence.
The input files - fdf¶
Hint
Enter the directory ‘first-encounter’
You will find an input file named ch4.fdf
, along with the files C.psf
and H.psf
which contain the information about the pseudopotentials. The FDF
(Flexible Data Format) file is the core input in SIESTA, since it provides
information about the system and sets the value of several parameters regarding
the quality of the calculation.
System information¶
We find first the inputs that specify the system:
SystemName CH4 molecule
SystemLabel ch4
NumberOfAtoms 5
NumberOfSpecies 2
%block ChemicalSpeciesLabel
1 6 C # Species index, atomic number, species label
2 1 H # Species index, atomic number, species label
%endblock ChemicalSpeciesLabel
You will notice several lines starting with #. These are comments on the file and will not be read by SIESTA.
The SystemName is only used for tagging purposes, while the SystemLabel will
determine the root name for all output files (i.e., they will all begin by
ch4.* in our case). NumberOfAtoms
indicates the total number of atoms
in our simulation box, and NumberOfSpecies
indicates the different kinds of
atoms present in the system.
Pay special attention to the %block ChemicalSpeciesLabel
. In this block, you
assign an index, an atomic number, and a label to each atomic species. The
atomic number is, as usual, the nuclear charge of the atom. The label will allow
to recognize the files containing the information about the pseudopotential and
the basis set (when provided); this means that the name of the label must
coincide with the pseudopotential file. The amount of species in this block must
be the one specified by the NumberOfSpecies
input.
Note
Options in the fdf file are case-insensitive. This means that
NumberOfAtoms
is the same as numberofatoms
or NUMBEROFATOMS
.
A few lines below, we find the coordinates for the atoms in the system:
AtomicCoordinatesFormat Ang
%block AtomicCoordinatesAndAtomicSpecies
0.000 0.000 0.000 1
1.219 -0.284 -0.377 2
-0.284 1.219 -0.377 2
-0.140 -0.140 1.219 2
-0.833 -0.833 -0.503 2
%endblock AtomicCoordinatesAndAtomicSpecies
In this block, each of the lines corresponds to an atom. The first three numbers are the X, Y and Z coordinates for each atom, and the last (integer) number indicates the species index in the ChemicalSpeciesLabel block. In this case, there is a Carbon atom (species index = 1) in the position (0.0,0.0,0.0), and the rest are the four Hydrogens (species index = 2).
The AtomicCoordinatesFormat
option can also be set to Bohr
(which is
the default value), or to Fractional
if one desires to set the coordinates
as a fraction of the lattice vectors (see below).
Simulation box and Periodic Boundary Condtions¶
You might have wondered about the appearance of this block in the input file:
#Unit cell for the calculation
LatticeConstant 15 Ang
%block LatticeVectors
1.000 0.000 0.000
0.000 1.000 0.000
0.000 0.000 1.000
%endblock LatticeVectors
which does not seem to make sense for a ‘molecule’ calculation. In fact, SIESTA always uses periodic boundary conditions (PBC); this means that we are doing a calculation for an infinite collection of regularly spaced molecules. If we want to simulate an isolated molecule it is important to have enough distance between the molecule and its neighboring images. At the very minimum, there should not be overlap between the orbitals on different image molecules; however, this size should be bigger for molecules with a dipole, which will have a long-range interaction with their images in PBC.
The input option LatticeConstant
is mandatory and multiplies the lattice
vectors by that constant. In this case, we will end up with a cubic simulation
box of 15 x 15 x 15 Angstrom.
Further Options¶
In the rest of the file you will find options regarding the quality of the calculation and other performance related variables:
# Basis set definition
PAO.EnergyShift 250 meV
PAO.SplitNorm 0.15
PAO.BasisSize SZ
#Real space grid
MeshCutoff 125.0 Ry
# Convergence of SCF
MaxSCFIterations 50
DM.MixingWeight 0.4
DM.NumberPulay 2
# Type of solution
SolutionMethod diagon
The specifics of these will be explored in other tutorials; but to provide a brief overview:
The first three variables are a rough definition the quality of the basis set. The number of orbitals per atom is defined by the parameter
PAO.BasisSize
, and we have set it to select a minimal basis (SZ) for a quick, cheap calculation disregarding quality.The parameter
MeshCutoff
controls the fineness of the real-space grid used to compute the integrals for the matrix elements of the Hamiltonian. A higher value corresponds to a finer grid.The next three options control the performance of the self-consistent field (SCF) cycle.
SolutionMethod
indicates the kind of solver that we are using.
Note
Options for PAO.BasisSize
follow the usual convention of amount of
functions per valence orbital. For C, a SZ basis set has 4 functions (s+p)
and a DZP basis set has 13 functions (2*s, 2*p, and a d polarization shell).
While the parameters specifying the system are mandatory, all other parameters have some default values and, in principle, it is not necessary to explicitly include them in the input file. However, it is important to note that the default values do not always guarantee a fast or a good quality calculation.
Running SIESTA¶
To run SIESTA via command line, you just need to do
siesta < ch4.fdf > ch4.out # traditional way
or simply:
siesta ch4.fdf > ch4.out
You will see that a lot of output files are generated in the simulation
directory. We will begin by having a look at the main output file, ch4.out
.
A look at the output¶
Take a look at ch4.out
once the program has finished. The file
contains a human-readable account of the doings of SIESTA for
this calculation:
A header with the version and compilation options, and a copy (in the first mode above) or a mention (in the second mode) of the input file:
Siesta Version : 5.0 Architecture : ---- Compiler version: GNU-11.4.0 Compiler flags : -fallow-argument-mismatch;-O3 -march=native PP flags : ---- Libraries : ---- Parallelisations: MPI [...] ************************** Dump of input data file **************************** #General system specifications SystemName CH4 molecule SystemLabel ch4 NumberOfAtoms 5 NumberOfSpecies 2 [...] # Type of solution SolutionMethod diagon ************************** End of input data file *****************************
A summary of existing species and pseudopotential information:
initatom: Reading input for the pseudopotentials and atomic orbitals ---------- Species number: 1 Atomic number: 6 Label: C Species number: 2 Atomic number: 1 Label: H [...] Valence configuration for ps generation: (assumed as above) For C, standard SIESTA heuristics set lmxkb to 2 (one more than the basis l, including polarization orbitals). Use PS.lmax or PS.KBprojectors blocks to override. For H, standard SIESTA heuristics set lmxkb to 1 (one more than the basis l, including polarization orbitals). Use PS.lmax or PS.KBprojectors blocks to override.
Details on the basis set generation:
<basis_specs> =============================================================================== C Z= 6 Mass= 12.011 Charge= 0.17977+309 Lmxo=1 Lmxkb= 2 BasisType=split Semic=F L=0 Nsemic=0 Cnfigmx=2 i=1 nzeta=1 polorb=0 (2s) [...] izeta = 1 lambda = 1.000000 rc = 4.810269 energy = -0.447136 kinetic = 0.945150 potential(screened) = -1.392287 potential(ionic) = -1.930221 atom: Total number of Sankey-type orbitals: 1 atm_pop: Valence configuration (for local Pseudopot. screening): 1s( 1.00) Vna: chval, zval: 1.00000 1.00000 Vna: Cut-off radius for the neutral-atom potential: 4.810269
A summary of the values used in the calculation for the most important parameters:
siesta: ******************** Simulation parameters **************************** siesta: siesta: The following are some of the parameters of the simulation. siesta: A complete list of the parameters used, including default values, siesta: can be found in file out.fdf siesta: redata: Spin configuration = none redata: Number of spin components = 1 [...] ts: ************************************************************** ts: Save H and S matrices = F ts: Save DM and EDM matrices = F ts: Only save the overlap matrix S = F ts: **************************************************************
Information of the geometry and matrix sparsity for a given geometry:
outcell: Unit cell vectors (Ang): 15.000000 0.000000 0.000000 0.000000 15.000000 0.000000 0.000000 0.000000 15.000000 outcell: Cell vector modules (Ang) : 15.000000 15.000000 15.000000 outcell: Cell angles (23,13,12) (deg): 90.0000 90.0000 90.0000 outcell: Cell volume (Ang**3) : 3375.0000 <dSpData1D:S at geom step 0 <sparsity:sparsity for geom step 0 nrows_g=8 nrows=8 sparsity=1.0000 nnzs=64, refcount: 7> <dData1D:(new from dSpData1D) n=64, refcount: 1> refcount: 1> new_DM -- step: 1 [...] InitMesh: MESH = 108 x 108 x 108 = 1259712 InitMesh: Mesh cutoff (required, used) = 125.000 143.274 Ry New grid distribution [1]: sub = 2 stepf: Fermi-Dirac step function
Followed by a full breakdown of the initial, non-converged energy:
siesta: Program's energy decomposition (eV): siesta: Ebs = -86.615317 siesta: Eions = 380.812454 [...] siesta: Etot = -214.688379 siesta: FreeEng = -214.688379
A log of the self-consistent-field (SCF) cycle:
iscf Eharris(eV) E_KS(eV) FreeEng(eV) dDmax Ef(eV) dHmax(eV) scf: 1 -224.477055 -214.688379 -214.688379 1.130398 -6.803327 1.039357 timer: Routine,Calls,Time,% = IterSCF 1 0.314 28.15 scf: 2 -214.708050 -214.704182 -214.704182 0.022332 -6.481242 0.243330 scf: 3 -214.704223 -214.704283 -214.704283 0.002871 -6.378875 0.148294 scf: 4 -214.704298 -214.704314 -214.704314 0.001596 -6.231017 0.052566 scf: 5 -214.704324 -214.704321 -214.704321 0.000517 -6.269207 0.001236 scf: 6 -214.704321 -214.704321 -214.704321 0.000031 -6.268437 0.000383 SCF Convergence by DM+H criterion max |DM_out - DM_in| : 0.0000307918 max |H_out - H_in| (eV) : 0.0003831179 SCF cycle converged after 6 iterations
A full breakdown of the total energy decomposition and forces, and stress:
siesta: Program's energy decomposition (eV): siesta: Ebs = -88.492716 siesta: Eions = 380.812454 [...] siesta: Total = -214.704321 siesta: Fermi = -6.268437 [...] siesta: Atomic forces (eV/Ang): siesta: 1 0.140447 0.140447 -1.039696 siesta: 2 -2.496202 0.517116 0.806177 siesta: 3 0.517116 -2.496202 0.806177 siesta: 4 0.363575 0.363575 -1.130990 siesta: 5 1.475788 1.475788 0.558249 siesta: ---------------------------------------- siesta: Tot 0.000724 0.000724 -0.000084 siesta: Stress tensor (static) (eV/Ang**3): siesta: 0.001324 -0.000018 -0.000133 siesta: -0.000018 0.001324 -0.000133 siesta: -0.000133 -0.000133 0.000671
The last part of the output file includes information on the cell pressure and the total electric dipole of the system:
siesta: Cell volume = 3375.000000 Ang**3 siesta: Pressure (static): siesta: Solid Molecule Units siesta: -0.00001205 0.00000000 Ry/Bohr**3 siesta: -0.00110663 0.00000026 eV/Ang**3 siesta: -1.77300959 0.00042191 kBar (Free)E+ p_basis*V_orbitals = -214.210260 (Free)Eharris+ p_basis*V_orbitals = -214.210260 siesta: Electric dipole (a.u.) = -0.011736 -0.011736 0.009580 siesta: Electric dipole (Debye) = -0.029830 -0.029830 0.024350
And to finish, literature regarding the simulation done:
cite: Please see "ch4.bib" for an exhaustive BiBTeX file. cite: Please clearly indicate Siesta version in published work: 5.1-MaX cite: This calculation has made use of the following articles cite: which are encouraged to be cited in a published work. Primary SIESTA paper DOI: www.doi.org/10.1088/0953-8984/14/11/302
Other Output files¶
As you can see, SIESTA generates a ton of output files besides the main log/out file. Each of these might interest you depending on what you intend to do with your simulation, but some of the most commonly relevant files are:
*.DM files (
ch4.DM
) contain a restart of the Density Matrix. Useful to restart already converged or close-to-convergence calculations without doing so from scratch.*.XV files (
ch4.XV
) contain a restart of the current geometry and cell size. This is particularly useful to restart or continue molecular dynamics runs.*.FA files (
ch4.FA
) contain total forces over all atoms. Each line corresponds to the cartesian force (x, y and z) over each atom (in eV/Ang).*.EIG files (
ch4.EIG
) contain the eigenvalues of the Hamiltonian for all k-points involved. k-points are explored in more detail in Sampling of the BZ with k-points.*.times and TIMES files (
ch4.times
andTIMES
) contain timing information, which is invaluable when optimizing parameters for fast performance.
Let’s play!¶
Now that you had a brief overview of how running SIESTA works, let’s play a bit with your CH4 inputs. See what happens when:
You change basis-set size: Set
PAO.BasisSize
to any of: DZ, SZP, DZP, TZP. See what happens with the total energy and the runtime. More on this topic in this tutorial.You change the fineness of the real-space grid: Try an unreasonably low value for the the parameter
MeshCutoff
(around 10 Ry) and check the resulting total energy and forces (you can also find the forces in the filech4.FA
). Then experiment with a much higher one (300 Ry). More on this topic in this tutorial.You can play with the size of the lattice parameters to go from ‘interacting’ molecules to effectively isolated ones. Look at the variation in the total energy as a function of the cell size, to see how the interaction between molecules decreases with increasing distance between images. For this non-polar molecule, the interaction should be very small.