A First Encounter - Part 1: Running SIESTA

In this exercise we will get a first acquaintance with SIESTA by studying two simple molecules, CH4 and CH3. We will cover quite a lot of features and concepts, without worrying too much about issues of convergence.

The input files - fdf

Hint

Enter the directory ‘first-encounter’

You will find an input file named ch4.fdf, along with the files C.psf and H.psf which contain the information about the pseudopotentials. The FDF (Flexible Data Format) file is the core input in SIESTA, since it provides information about the system and sets the value of several parameters regarding the quality of the calculation.

System information

We find first the inputs that specify the system:

SystemName          CH4 molecule
SystemLabel         ch4
NumberOfAtoms       5
NumberOfSpecies     2

%block ChemicalSpeciesLabel
 1  6 C   # Species index, atomic number, species label
 2  1 H   # Species index, atomic number, species label
%endblock ChemicalSpeciesLabel

You will notice several lines starting with #. These are comments on the file and will not be read by SIESTA.

The SystemName is only used for tagging purposes, while the SystemLabel will determine the root name for all output files (i.e., they will all begin by ch4.* in our case). NumberOfAtoms indicates the total number of atoms in our simulation box, and NumberOfSpecies indicates the different kinds of atoms present in the system.

Pay special attention to the %block ChemicalSpeciesLabel. In this block, you assign an index, an atomic number, and a label to each atomic species. The atomic number is, as usual, the nuclear charge of the atom. The label will allow to recognize the files containing the information about the pseudopotential and the basis set (when provided); this means that the name of the label must coincide with the pseudopotential file. The amount of species in this block must be the one specified by the NumberOfSpecies input.

Note

Options in the fdf file are case-insensitive. This means that NumberOfAtoms is the same as numberofatoms or NUMBEROFATOMS.

A few lines below, we find the coordinates for the atoms in the system:

AtomicCoordinatesFormat  Ang
%block AtomicCoordinatesAndAtomicSpecies
  0.000      0.000      0.000   1
  1.219     -0.284     -0.377   2
 -0.284      1.219     -0.377   2
 -0.140     -0.140      1.219   2
 -0.833     -0.833     -0.503   2
%endblock AtomicCoordinatesAndAtomicSpecies

In this block, each of the lines corresponds to an atom. The first three numbers are the X, Y and Z coordinates for each atom, and the last (integer) number indicates the species index in the ChemicalSpeciesLabel block. In this case, there is a Carbon atom (species index = 1) in the position (0.0,0.0,0.0), and the rest are the four Hydrogens (species index = 2).

The AtomicCoordinatesFormat option can also be set to Bohr (which is the default value), or to Fractional if one desires to set the coordinates as a fraction of the lattice vectors (see below).

Simulation box and Periodic Boundary Condtions

You might have wondered about the appearance of this block in the input file:

#Unit cell for the calculation
LatticeConstant 15 Ang
%block LatticeVectors
1.000 0.000 0.000
0.000 1.000 0.000
0.000 0.000 1.000
%endblock LatticeVectors

which does not seem to make sense for a ‘molecule’ calculation. In fact, SIESTA always uses periodic boundary conditions (PBC); this means that we are doing a calculation for an infinite collection of regularly spaced molecules. If we want to simulate an isolated molecule it is important to have enough distance between the molecule and its neighboring images. At the very minimum, there should not be overlap between the orbitals on different image molecules; however, this size should be bigger for molecules with a dipole, which will have a long-range interaction with their images in PBC.

The input option LatticeConstant is mandatory and multiplies the lattice vectors by that constant. In this case, we will end up with a cubic simulation box of 15 x 15 x 15 Angstrom.

Further Options

In the rest of the file you will find options regarding the quality of the calculation and other performance related variables:

# Basis set definition
PAO.EnergyShift 250 meV
PAO.SplitNorm   0.15
PAO.BasisSize   SZ

#Real space grid
MeshCutoff 125.0 Ry

# Convergence of SCF
MaxSCFIterations 50
DM.MixingWeight 0.4
DM.NumberPulay  2

# Type of solution
SolutionMethod diagon

The specifics of these will be explored in other tutorials; but to provide a brief overview:

  • The first three variables are a rough definition the quality of the basis set. The number of orbitals per atom is defined by the parameter PAO.BasisSize, and we have set it to select a minimal basis (SZ) for a quick, cheap calculation disregarding quality.

  • The parameter MeshCutoff controls the fineness of the real-space grid used to compute the integrals for the matrix elements of the Hamiltonian. A higher value corresponds to a finer grid.

  • The next three options control the performance of the self-consistent field (SCF) cycle.

  • SolutionMethod indicates the kind of solver that we are using.

Note

Options for PAO.BasisSize follow the usual convention of amount of functions per valence orbital. For C, a SZ basis set has 4 functions (s+p) and a DZP basis set has 13 functions (2*s, 2*p, and a d polarization shell).

While the parameters specifying the system are mandatory, all other parameters have some default values and, in principle, it is not necessary to explicitly include them in the input file. However, it is important to note that the default values do not always guarantee a fast or a good quality calculation.


Running SIESTA

To run SIESTA via command line, you just need to do

siesta < ch4.fdf > ch4.out           # traditional way

or simply:

siesta ch4.fdf > ch4.out

You will see that a lot of output files are generated in the simulation directory. We will begin by having a look at the main output file, ch4.out.


A look at the output

Take a look at ch4.out once the program has finished. The file contains a human-readable account of the doings of SIESTA for this calculation:

  • A header with the version and compilation options, and a copy (in the first mode above) or a mention (in the second mode) of the input file:

    Siesta Version  : 5.0
    Architecture    : ----
    Compiler version: GNU-11.4.0
    Compiler flags  : -fallow-argument-mismatch;-O3 -march=native
    PP flags        : ----
    Libraries       : ----
    Parallelisations: MPI
    
    [...]
    
    ************************** Dump of input data file ****************************
    #General system specifications
    SystemName          CH4 molecule
    SystemLabel         ch4
    NumberOfAtoms       5
    NumberOfSpecies     2
    
    [...]
    
    # Type of solution
    SolutionMethod diagon
    ************************** End of input data file *****************************
    
  • A summary of existing species and pseudopotential information:

    initatom: Reading input for the pseudopotentials and atomic orbitals ----------
    Species number:   1 Atomic number:    6 Label: C
    Species number:   2 Atomic number:    1 Label: H
    
    [...]
    
    Valence configuration for ps generation: (assumed as above)
    For C, standard SIESTA heuristics set lmxkb to 2
    (one more than the basis l, including polarization orbitals).
    Use PS.lmax or PS.KBprojectors blocks to override.
    For H, standard SIESTA heuristics set lmxkb to 1
    (one more than the basis l, including polarization orbitals).
    Use PS.lmax or PS.KBprojectors blocks to override.
    
  • Details on the basis set generation:

    <basis_specs>
    ===============================================================================
    C                    Z=   6    Mass=  12.011        Charge= 0.17977+309
    Lmxo=1 Lmxkb= 2    BasisType=split      Semic=F
    L=0  Nsemic=0  Cnfigmx=2
              i=1  nzeta=1  polorb=0  (2s)
    
    [...]
    
                      izeta = 1
                     lambda =    1.000000
                         rc =    4.810269
                     energy =   -0.447136
                    kinetic =    0.945150
        potential(screened) =   -1.392287
           potential(ionic) =   -1.930221
    atom: Total number of Sankey-type orbitals:  1
    
    atm_pop: Valence configuration (for local Pseudopot. screening):
    1s( 1.00)
    Vna: chval, zval:    1.00000   1.00000
    
    Vna:  Cut-off radius for the neutral-atom potential:   4.810269
    
  • A summary of the values used in the calculation for the most important parameters:

    siesta: ******************** Simulation parameters ****************************
    siesta:
    siesta: The following are some of the parameters of the simulation.
    siesta: A complete list of the parameters used, including default values,
    siesta: can be found in file out.fdf
    siesta:
    redata: Spin configuration                          = none
    redata: Number of spin components                   = 1
    
    [...]
    
    ts: **************************************************************
    ts: Save H and S matrices                           =    F
    ts: Save DM and EDM matrices                        =    F
    ts: Only save the overlap matrix S                  =    F
    ts: **************************************************************
    
  • Information of the geometry and matrix sparsity for a given geometry:

    outcell: Unit cell vectors (Ang):
           15.000000    0.000000    0.000000
            0.000000   15.000000    0.000000
            0.000000    0.000000   15.000000
    
    outcell: Cell vector modules (Ang)   :   15.000000   15.000000   15.000000
    outcell: Cell angles (23,13,12) (deg):     90.0000     90.0000     90.0000
    outcell: Cell volume (Ang**3)        :   3375.0000
    <dSpData1D:S at geom step 0
      <sparsity:sparsity for geom step 0
        nrows_g=8 nrows=8 sparsity=1.0000 nnzs=64, refcount: 7>
      <dData1D:(new from dSpData1D) n=64, refcount: 1>
    refcount: 1>
    new_DM -- step:     1
    
    [...]
    
    InitMesh: MESH = 108 x 108 x 108 = 1259712
    InitMesh: Mesh cutoff (required, used) =   125.000   143.274 Ry
    New grid distribution [1]: sub = 2
    
    stepf: Fermi-Dirac step function
    
  • Followed by a full breakdown of the initial, non-converged energy:

    siesta: Program's energy decomposition (eV):
    siesta: Ebs     =       -86.615317
    siesta: Eions   =       380.812454
    [...]
    siesta: Etot    =      -214.688379
    siesta: FreeEng =      -214.688379
    
  • A log of the self-consistent-field (SCF) cycle:

           iscf     Eharris(eV)        E_KS(eV)     FreeEng(eV)     dDmax    Ef(eV) dHmax(eV)
      scf:    1     -224.477055     -214.688379     -214.688379  1.130398 -6.803327  1.039357
    timer: Routine,Calls,Time,% = IterSCF        1       0.314  28.15
      scf:    2     -214.708050     -214.704182     -214.704182  0.022332 -6.481242  0.243330
      scf:    3     -214.704223     -214.704283     -214.704283  0.002871 -6.378875  0.148294
      scf:    4     -214.704298     -214.704314     -214.704314  0.001596 -6.231017  0.052566
      scf:    5     -214.704324     -214.704321     -214.704321  0.000517 -6.269207  0.001236
      scf:    6     -214.704321     -214.704321     -214.704321  0.000031 -6.268437  0.000383
    
    SCF Convergence by DM+H criterion
    max |DM_out - DM_in|         :     0.0000307918
    max |H_out - H_in|      (eV) :     0.0003831179
    SCF cycle converged after 6 iterations
    
  • A full breakdown of the total energy decomposition and forces, and stress:

    siesta: Program's energy decomposition (eV):
    siesta: Ebs     =       -88.492716
    siesta: Eions   =       380.812454
    [...]
    siesta:         Total =    -214.704321
    siesta:         Fermi =      -6.268437
    
    [...]
    siesta: Atomic forces (eV/Ang):
    siesta:      1    0.140447    0.140447   -1.039696
    siesta:      2   -2.496202    0.517116    0.806177
    siesta:      3    0.517116   -2.496202    0.806177
    siesta:      4    0.363575    0.363575   -1.130990
    siesta:      5    1.475788    1.475788    0.558249
    siesta: ----------------------------------------
    siesta:    Tot    0.000724    0.000724   -0.000084
    
    siesta: Stress tensor (static) (eV/Ang**3):
    siesta:     0.001324   -0.000018   -0.000133
    siesta:    -0.000018    0.001324   -0.000133
    siesta:    -0.000133   -0.000133    0.000671
    
  • The last part of the output file includes information on the cell pressure and the total electric dipole of the system:

    siesta: Cell volume =       3375.000000 Ang**3
    
    siesta: Pressure (static):
    siesta:                Solid            Molecule  Units
    siesta:          -0.00001205          0.00000000  Ry/Bohr**3
    siesta:          -0.00110663          0.00000026  eV/Ang**3
    siesta:          -1.77300959          0.00042191  kBar
    (Free)E+ p_basis*V_orbitals  =        -214.210260
    (Free)Eharris+ p_basis*V_orbitals  =        -214.210260
    
    siesta: Electric dipole (a.u.)  =   -0.011736   -0.011736    0.009580
    siesta: Electric dipole (Debye) =   -0.029830   -0.029830    0.024350
    
  • And to finish, literature regarding the simulation done:

    cite: Please see "ch4.bib" for an exhaustive BiBTeX file.
    cite: Please clearly indicate Siesta version in published work: 5.1-MaX
    cite: This calculation has made use of the following articles
    cite: which are encouraged to be cited in a published work.
        Primary SIESTA paper
          DOI: www.doi.org/10.1088/0953-8984/14/11/302
    

Other Output files

As you can see, SIESTA generates a ton of output files besides the main log/out file. Each of these might interest you depending on what you intend to do with your simulation, but some of the most commonly relevant files are:

  • *.DM files (ch4.DM) contain a restart of the Density Matrix. Useful to restart already converged or close-to-convergence calculations without doing so from scratch.

  • *.XV files (ch4.XV) contain a restart of the current geometry and cell size. This is particularly useful to restart or continue molecular dynamics runs.

  • *.FA files (ch4.FA) contain total forces over all atoms. Each line corresponds to the cartesian force (x, y and z) over each atom (in eV/Ang).

  • *.EIG files (ch4.EIG) contain the eigenvalues of the Hamiltonian for all k-points involved. k-points are explored in more detail in Sampling of the BZ with k-points.

  • *.times and TIMES files (ch4.times and TIMES) contain timing information, which is invaluable when optimizing parameters for fast performance.


Let’s play!

Now that you had a brief overview of how running SIESTA works, let’s play a bit with your CH4 inputs. See what happens when:

  • You change basis-set size: Set PAO.BasisSize to any of: DZ, SZP, DZP, TZP. See what happens with the total energy and the runtime. More on this topic in this tutorial.

  • You change the fineness of the real-space grid: Try an unreasonably low value for the the parameter MeshCutoff (around 10 Ry) and check the resulting total energy and forces (you can also find the forces in the file ch4.FA). Then experiment with a much higher one (300 Ry). More on this topic in this tutorial.

  • You can play with the size of the lattice parameters to go from ‘interacting’ molecules to effectively isolated ones. Look at the variation in the total energy as a function of the cell size, to see how the interaction between molecules decreases with increasing distance between images. For this non-polar molecule, the interaction should be very small.