A First Encounter - Part 1: Running SIESTA¶

In this exercise we will get a first acquaintance with SIESTA by studying two simple molecules, CH4 and CH3. We will cover quite a lot of features and concepts, without worrying too much about issues of convergence.

The input files - fdf¶

Hint

Enter the directory ‘first-encounter’

You will find an input file named ch4.fdf, along with the files C.psf and H.psf which contain the information about the pseudopotentials. The FDF (Flexible Data Format) file is the core input in SIESTA, since it provides information about the system and sets the value of several parameters regarding the quality of the calculation.

System information¶

We find first the inputs that specify the system:

SystemName          CH4 molecule
SystemLabel         ch4
NumberOfAtoms       5
NumberOfSpecies     2

%block ChemicalSpeciesLabel
 1  6 C   # Species index, atomic number, species label
 2  1 H   # Species index, atomic number, species label
%endblock ChemicalSpeciesLabel

You will notice several lines starting with #. These are comments on the file and will not be read by SIESTA.

The SystemName is only used for tagging purposes, while the SystemLabel will determine the root name for all output files (i.e., they will all begin by ch4.* in our case). NumberOfAtoms indicates the total number of atoms in our simulation box, and NumberOfSpecies indicates the different kinds of atoms present in the system.

Pay special attention to the %block ChemicalSpeciesLabel. In this block, you assign an index, an atomic number, and a label to each atomic species. The atomic number is, as usual, the nuclear charge of the atom. The label will allow to recognize the files containing the information about the pseudopotential and the basis set (when provided); this means that the name of the label must coincide with the pseudopotential file. The amount of species in this block must be the one specified by the NumberOfSpecies input.

Note

Options in the fdf file are case-insensitive. This means that NumberOfAtoms is the same as numberofatoms or NUMBEROFATOMS.

A few lines below, we find the coordinates for the atoms in the system:

AtomicCoordinatesFormat  Ang
%block AtomicCoordinatesAndAtomicSpecies
  0.000      0.000      0.000   1
  1.219     -0.284     -0.377   2
 -0.284      1.219     -0.377   2
 -0.140     -0.140      1.219   2
 -0.833     -0.833     -0.503   2
%endblock AtomicCoordinatesAndAtomicSpecies

In this block, each of the lines corresponds to an atom. The first three numbers are the X, Y and Z coordinates for each atom, and the last (integer) number indicates the species index in the ChemicalSpeciesLabel block. In this case, there is a Carbon atom (species index = 1) in the position (0.0,0.0,0.0), and the rest are the four Hydrogens (species index = 2).

The AtomicCoordinatesFormat option can also be set to Bohr (which is the default value), or to Fractional if one desires to set the coordinates as a fraction of the lattice vectors (see below).

Simulation box and Periodic Boundary Condtions¶

You might have wondered about the appearance of this block in the input file:

#Unit cell for the calculation
LatticeConstant 15 Ang
%block LatticeVectors
1.000 0.000 0.000
0.000 1.000 0.000
0.000 0.000 1.000
%endblock LatticeVectors

which does not seem to make sense for a ‘molecule’ calculation. In fact, SIESTA always uses periodic boundary conditions (PBC); this means that we are doing a calculation for an infinite collection of regularly spaced molecules. If we want to simulate an isolated molecule it is important to have enough distance between the molecule and its neighboring images. At the very minimum, there should not be overlap between the orbitals on different image molecules; however, this size should be bigger for molecules with a dipole, which will have a long-range interaction with their images in PBC.

The input option LatticeConstant is mandatory and multiplies the lattice vectors by that constant. In this case, we will end up with a cubic simulation box of 15 x 15 x 15 Angstrom.

Further Options¶

In the rest of the file you will find options regarding the quality of the calculation and other performance related variables:

# Basis set definition
PAO.EnergyShift 250 meV
PAO.SplitNorm   0.15
PAO.BasisSize   SZ

#Real space grid
MeshCutoff 125.0 Ry

# Convergence of SCF
MaxSCFIterations 50
DM.MixingWeight 0.4
DM.NumberPulay  2

# Type of solution
SolutionMethod diagon

The specifics of these will be explored in other tutorials; but to provide a brief overview:

The first three variables are a rough definition the quality of the basis set. The number of orbitals per atom is defined by the parameter PAO.BasisSize, and we have set it to select a minimal basis (SZ) for a quick, cheap calculation disregarding quality.
The parameter MeshCutoff controls the fineness of the real-space grid used to compute the integrals for the matrix elements of the Hamiltonian. A higher value corresponds to a finer grid.
The next three options control the performance of the self-consistent field (SCF) cycle.
SolutionMethod indicates the kind of solver that we are using.

Note

Options for PAO.BasisSize follow the usual convention of amount of functions per valence orbital. For C, a SZ basis set has 4 functions (s+p) and a DZP basis set has 13 functions (2*s, 2*p, and a d polarization shell).

While the parameters specifying the system are mandatory, all other parameters have some default values and, in principle, it is not necessary to explicitly include them in the input file. However, it is important to note that the default values do not always guarantee a fast or a good quality calculation.

Running SIESTA¶

To run SIESTA via command line, you just need to do

siesta < ch4.fdf > ch4.out           # traditional way

or simply:

siesta ch4.fdf > ch4.out

You will see that a lot of output files are generated in the simulation directory. We will begin by having a look at the main output file, ch4.out.

A look at the output¶

Take a look at ch4.out once the program has finished. The file contains a human-readable account of the doings of SIESTA for this calculation:

A header with the version and compilation options, and a copy (in the first mode above) or a mention (in the second mode) of the input file:

Siesta Version  : 5.0
Architecture    : ----
Compiler version: GNU-11.4.0
Compiler flags  : -fallow-argument-mismatch;-O3 -march=native
PP flags        : ----
Libraries       : ----
Parallelisations: MPI

[...]

************************** Dump of input data file ****************************
#General system specifications
SystemName          CH4 molecule
SystemLabel         ch4
NumberOfAtoms       5
NumberOfSpecies     2

[...]

# Type of solution
SolutionMethod diagon
************************** End of input data file *****************************

A summary of existing species and pseudopotential information:

initatom: Reading input for the pseudopotentials and atomic orbitals ----------
Species number:   1 Atomic number:    6 Label: C
Species number:   2 Atomic number:    1 Label: H

[...]

Valence configuration for ps generation: (assumed as above)
For C, standard SIESTA heuristics set lmxkb to 2
(one more than the basis l, including polarization orbitals).
Use PS.lmax or PS.KBprojectors blocks to override.
For H, standard SIESTA heuristics set lmxkb to 1
(one more than the basis l, including polarization orbitals).
Use PS.lmax or PS.KBprojectors blocks to override.

Details on the basis set generation:

<basis_specs>
===============================================================================
C                    Z=   6    Mass=  12.011        Charge= 0.17977+309
Lmxo=1 Lmxkb= 2    BasisType=split      Semic=F
L=0  Nsemic=0  Cnfigmx=2
          i=1  nzeta=1  polorb=0  (2s)

[...]

                  izeta = 1
                 lambda =    1.000000
                     rc =    4.810269
                 energy =   -0.447136
                kinetic =    0.945150
    potential(screened) =   -1.392287
       potential(ionic) =   -1.930221
atom: Total number of Sankey-type orbitals:  1

atm_pop: Valence configuration (for local Pseudopot. screening):
1s( 1.00)
Vna: chval, zval:    1.00000   1.00000

Vna:  Cut-off radius for the neutral-atom potential:   4.810269

A summary of the values used in the calculation for the most important parameters:

siesta: ******************** Simulation parameters ****************************
siesta:
siesta: The following are some of the parameters of the simulation.
siesta: A complete list of the parameters used, including default values,
siesta: can be found in file out.fdf
siesta:
redata: Spin configuration                          = none
redata: Number of spin components                   = 1

[...]

ts: **************************************************************
ts: Save H and S matrices                           =    F
ts: Save DM and EDM matrices                        =    F
ts: Only save the overlap matrix S                  =    F
ts: **************************************************************

Information of the geometry and matrix sparsity for a given geometry:

outcell: Unit cell vectors (Ang):
       15.000000    0.000000    0.000000
        0.000000   15.000000    0.000000
        0.000000    0.000000   15.000000

outcell: Cell vector modules (Ang)   :   15.000000   15.000000   15.000000
outcell: Cell angles (23,13,12) (deg):     90.0000     90.0000     90.0000
outcell: Cell volume (Ang**3)        :   3375.0000
<dSpData1D:S at geom step 0
  <sparsity:sparsity for geom step 0
    nrows_g=8 nrows=8 sparsity=1.0000 nnzs=64, refcount: 7>
  <dData1D:(new from dSpData1D) n=64, refcount: 1>
refcount: 1>
new_DM -- step:     1

[...]

InitMesh: MESH = 108 x 108 x 108 = 1259712
InitMesh: Mesh cutoff (required, used) =   125.000   143.274 Ry
New grid distribution [1]: sub = 2

stepf: Fermi-Dirac step function

Followed by a full breakdown of the initial, non-converged energy:

siesta: Program's energy decomposition (eV):
siesta: Ebs     =       -86.615317
siesta: Eions   =       380.812454
[...]
siesta: Etot    =      -214.688379
siesta: FreeEng =      -214.688379

A log of the self-consistent-field (SCF) cycle:

       iscf     Eharris(eV)        E_KS(eV)     FreeEng(eV)     dDmax    Ef(eV) dHmax(eV)
  scf:    1     -224.477055     -214.688379     -214.688379  1.130398 -6.803327  1.039357
timer: Routine,Calls,Time,% = IterSCF        1       0.314  28.15
  scf:    2     -214.708050     -214.704182     -214.704182  0.022332 -6.481242  0.243330
  scf:    3     -214.704223     -214.704283     -214.704283  0.002871 -6.378875  0.148294
  scf:    4     -214.704298     -214.704314     -214.704314  0.001596 -6.231017  0.052566
  scf:    5     -214.704324     -214.704321     -214.704321  0.000517 -6.269207  0.001236
  scf:    6     -214.704321     -214.704321     -214.704321  0.000031 -6.268437  0.000383

SCF Convergence by DM+H criterion
max |DM_out - DM_in|         :     0.0000307918
max |H_out - H_in|      (eV) :     0.0003831179
SCF cycle converged after 6 iterations

A full breakdown of the total energy decomposition and forces, and stress:

siesta: Program's energy decomposition (eV):
siesta: Ebs     =       -88.492716
siesta: Eions   =       380.812454
[...]
siesta:         Total =    -214.704321
siesta:         Fermi =      -6.268437

[...]
siesta: Atomic forces (eV/Ang):
siesta:      1    0.140447    0.140447   -1.039696
siesta:      2   -2.496202    0.517116    0.806177
siesta:      3    0.517116   -2.496202    0.806177
siesta:      4    0.363575    0.363575   -1.130990
siesta:      5    1.475788    1.475788    0.558249
siesta: ----------------------------------------
siesta:    Tot    0.000724    0.000724   -0.000084

siesta: Stress tensor (static) (eV/Ang**3):
siesta:     0.001324   -0.000018   -0.000133
siesta:    -0.000018    0.001324   -0.000133
siesta:    -0.000133   -0.000133    0.000671

The last part of the output file includes information on the cell pressure and the total electric dipole of the system:

siesta: Cell volume =       3375.000000 Ang**3

siesta: Pressure (static):
siesta:                Solid            Molecule  Units
siesta:          -0.00001205          0.00000000  Ry/Bohr**3
siesta:          -0.00110663          0.00000026  eV/Ang**3
siesta:          -1.77300959          0.00042191  kBar
(Free)E+ p_basis*V_orbitals  =        -214.210260
(Free)Eharris+ p_basis*V_orbitals  =        -214.210260

siesta: Electric dipole (a.u.)  =   -0.011736   -0.011736    0.009580
siesta: Electric dipole (Debye) =   -0.029830   -0.029830    0.024350

And to finish, literature regarding the simulation done:

cite: Please see "ch4.bib" for an exhaustive BiBTeX file.
cite: Please clearly indicate Siesta version in published work: 5.1-MaX
cite: This calculation has made use of the following articles
cite: which are encouraged to be cited in a published work.
    Primary SIESTA paper
      DOI: www.doi.org/10.1088/0953-8984/14/11/302

Other Output files¶

As you can see, SIESTA generates a ton of output files besides the main log/out file. Each of these might interest you depending on what you intend to do with your simulation, but some of the most commonly relevant files are:

*.DM files (ch4.DM) contain a restart of the Density Matrix. Useful to restart already converged or close-to-convergence calculations without doing so from scratch.
*.XV files (ch4.XV) contain a restart of the current geometry and cell size. This is particularly useful to restart or continue molecular dynamics runs.
*.FA files (ch4.FA) contain total forces over all atoms. Each line corresponds to the cartesian force (x, y and z) over each atom (in eV/Ang).
*.EIG files (ch4.EIG) contain the eigenvalues of the Hamiltonian for all k-points involved. k-points are explored in more detail in Sampling of the BZ with k-points.
*.times and TIMES files (ch4.times and TIMES) contain timing information, which is invaluable when optimizing parameters for fast performance.

Let’s play!¶

Now that you had a brief overview of how running SIESTA works, let’s play a bit with your CH4 inputs. See what happens when:

You change basis-set size: Set PAO.BasisSize to any of: DZ, SZP, DZP, TZP. See what happens with the total energy and the runtime. More on this topic in this tutorial.

You change the fineness of the real-space grid: Try an unreasonably low value for the the parameter MeshCutoff (around 10 Ry) and check the resulting total energy and forces (you can also find the forces in the file ch4.FA). Then experiment with a much higher one (300 Ry). More on this topic in this tutorial.

You can play with the size of the lattice parameters to go from ‘interacting’ molecules to effectively isolated ones. Look at the variation in the total energy as a function of the cell size, to see how the interaction between molecules decreases with increasing distance between images. For this non-polar molecule, the interaction should be very small.