Basis set optimization

Note

Before doing this tutorial, we encourage you to do and review the more general tutorial on the level of theory.

Note

As a general background for this topic, you can watch the 2021 talks by Emilio Artacho:

Using the default basis sets generated by SIESTA might be enough for some applications, but doing some degree of manual optimization of the basis sets may help to achieve better results with similar computational costs. This is especially advisable when dealing with extremely large systems, where going for high basis set cardinality (triple-zeta, quadruple-zeta) is not really an option.

Note

This tutorial shows just one possible method to optimize your basis sets. Different people tend to use different techniques, there is no “one true way”. This method does, however, work reliably across different kinds of systems.

General Concepts

When optimizing basis sets, the concept of transferability is one of the most important ideas to keep in mind. You might get the ultimate basis set that gives the perfect results for a given set of atoms in a given set of coordinates; however, if even a slight re-arrangement of those atoms produces terrible results, that means your basis set is not transferable. Ideally, one should aim for highly transferable basis sets that also produce reasonable results.

This ties in with a key aspect of basis set optimization: defining your optimization function, i.e., the magnitude that you will be using to monitor how well your basis is performing. One good magnitude to look at is the total energy of a given system, but even better is to use the concept of basis enthalpy (see below).

While it might be tempting to optimize basis sets using bond lengths, interaction energies, or even band structures, these typically result in poorly transferable basis sets. The use of these magnitudes as optimization functions is not encouraged; however, these can be used after the optimization is done in order to benchmark the behaviour of the newly optimized basis set.

Having defined the optimization function, one needs to define which are the optimization variables, i.e. the things that we will modify manually to improve our basis sets. While SIESTA has a ton of possibilities to modify your basis set, the most important targets for optimization are the orbital radii: the cut-off radii for first-zeta orbitals, and the matching radii for multiple-zeta. Polarization orbitals can also be optimized separately, but the way these are generated in SIESTA is generally good and they usually do not require further optimization.

Note

To summarize:
  • We aim for good-quality, highly-transferable basis sets.

  • Typically we will be modifying the cut-off radii of first-zeta and the matching radii of multiple-zeta orbitals.

  • We will take the basis enthalpy (or the total energy) of a system as our main way of optimizing the basis sets.

Basis Enthalpy

The basis enthalpy is a fictitious magnitude that adds an extra term to the energy, which depends on the volume of the orbitals; essentially, this prevents the basis orbitals from getting unnecessarily large. This extra term is essentially P*V, where V is the orbital volume and P is a fictitious “pressure”. This pressure is an input we will use in SIESTA.

For bulk materials such as metals (Cu, Fe) or semiconductors (Si, Ge), the default value of 0.2 GPa for the basis pressure produces reasonable results. However, for small atoms in isolated molecules (C, O, H, etc.) this pressure results in extremely short cut-off radii. One typically reduces the basis pressure to a tenth of the default value, i.e. 0.02 GPa.

How do you know whether your orbitals are large enough? A good rule of thumb is to consider that orbitals with a cut-off radius of less than 5.0 Bohr are very small, while a cut-off radius larger than 10.0 Bohr is unnecessarily large. Of course, there are a few exceptions: 1s and 2s orbitals of the first two rows of the periodic table (specially H) can be slightly smaller than 5.0 Bohr, but going below 4.0 Bohr should be avoided. In the oposite case, when dealing with surfaces or 2D materials one might need larger, diffuse orbitals in order to better describe the area around the exposed atoms: these diffuse orbitals might easily reach 10 to 12 Bohr. More on this topic is covered in Basis sets - Tips and tricks.

The Water molecule

Molecular dynamics of water (or aqueous systems in general) is still one of the most common types of simulations. In this tutorial, we will optimize the basis set for a single water molecule, and we will use the energy of a water dimer as a way to benchmark the quality of our basis functions.

Take a look at the wat.fdf file, in particular, the following lines:

PAO.SoftDefault     T
PAO.OldStylePolOrbs F
PAO.EnergyShift     0.001 Ry

All three are covered in A First Encounter - Part 2: Choosing your level of theory. The value for PAO.EnergyShift was chosen in a similar manner to that showed in said tutorial; however, since we will be optimizing the orbital radii, the value of the energy shift will become irrelevant once we are done. Choosing an appropriate value for it just provides a good starting point for the optimization.

Now instead of defining the cardinality just as PAO.BasisSize DZP, we will make use of the PAO.Basis block, which has more information:

%block PAO.Basis
  H 1
    n=1 0 2 P 1     # n, l (=0, s-orbital), number of zetas, include 1 set of polarization orbitals
        0.0 0.0     # rcut for first zeta, rmatch for second zeta
  O 2               # Number of different l
    n=2 0 2         # n, l (=0, s-orbital), number of zetas
        0.0 0.0     # rcut for first zeta, rmatch for second zeta
    n=2 1 2 P 1     # n, l (=1, p-orbital), number of zetas, include 1 set of polarization orbitals
        0.0 0.0     # rcut for first zeta, rmatch for second zeta
%endblock PAO.Basis

Notice how the H atom has a single 1s level (n=1/l=0) with 2 functions (zetas) and a single set of polarization orbitals (P 1). Meanwhile, Oxygen has two levels, 2s and 2p (n=2/l=0 and n=2/l=1), with an additional set polarization orbitals on the 2p level (P 1).

In this example, the input is the same as specifying the flag PAO.BasisSize DZP, since we have a double-zeta basis set for all valence orbitals, and we are including a single polarization orbital set for each atom.

The values for rcut and rmatch are all set to 0.0; this tells SIESTA that it should automatically find appropriate values for each of those radii using the energy shift. By the end of this tutorial, we will have set each of these radii manually. It is worth noting that all values for radii here should be in Bohr.

Since we are using small atoms, the basis pressure will be set to 0.02 GPa, as you can see in the following line:

BasisPressure  0.02 GPa

It is also important to note that we have set up a relatively big simulation box. Since water is an isolated molecule and not a bulk metal, we want to avoid orbital overlaps between one molecule and its periodic images

LatticeConstant 15.0 Ang
%block LatticeVectors
1.0 0.0 0.0
0.0 1.0 0.0
0.0 0.0 1.0
%endblock LatticeVectors

Optimizing the first-zeta cut-off radii

Note

Go to the subdirectory 1-BasisOptimization.

We will first start by optimizing the cut-off radii for the first zeta orbitals. This means we will be changing the first 0.0 we find for each of the orbitals (the * * *are for emphasis and should not be included in the actual input file):

%block PAO.Basis
H 1
  n=1 0 2 P 1     # n, l (=0, s-orbital), number of zetas, include 1 set of polarization orbitals\\
    *0.0* 0.0     # rcut for first zeta, rmatch for second zeta
O 2               # Number of different l
  n=2 0 2         # n, l (=0, s-orbital), number of zetas
    *0.0* 0.0     # rcut for first zeta, rmatch for second zeta
  n=2 1 2 P 1     # n, l (=1, p-orbital), number of zetas, include 1 set of polarization orbitals
    *0.0* 0.0     # rcut for first zeta, rmatch for second zeta
%endblock PAO.Basis

Modify the value for rcut in the 1s orbital of Hydrogen, going from 4.0 Bohr to 6.0 Bohr by 0.5 Bohr increases. Take note of the basis enthalpy of the system for each value of rcut, printed in the first line of the BASIS_ENTHALPY file.

Now we choose the cut-off radii for which the basis enthalpy is minimized. Assuming the value is 4.5 Bohr, your basis block will look somewhat similar to this:

%block PAO.Basis
  H 1
    n=1 0 2 P 1     # n, l (=0, s-orbital), number of zetas, include 1 set of polarization orbitals
        *4.5* 0.0   # rcut for first zeta, rmatch for second zeta
  O 2               # Number of different l
    n=2 0 2         # n, l (=0, s-orbital), number of zetas
        0.0 0.0     # rcut for first zeta, rmatch for second zeta
    n=2 1 2 P 1     # n, l (=1, p-orbital), number of zetas, include 1 set of polarization orbitals
        0.0 0.0     # rcut for first zeta, rmatch for second zeta
%endblock PAO.Basis

After choosing a value for the Hydrogen 1s orbital, do the same procedure for the 2s orbital of Oxygen, and then for the 2p orbital. Explore values from 4.5 Bohr to 7.5 Bohr for both orbitals.

Do the values for the optimized cut-off radii coincide with what you would expect from chemical intuition? Which is the largest orbital and which one is the shortest?

For the purposes of this tutorial, we will stop here the optimization of rcut. In a more general case, it would be better to re-optimize all of the orbitals iteratively to guarantee that the rcut values are truly converged.

Optimizing the second-zeta matching radii

We will now expand the previous block to the one we had at the beginning for the DZP basis set, but instead of zeroes in the rcut, we will place our optimized values for the single zeta basis (i.e. replace RCUT1, RCUT2 and RCUT3 for your own optimized values):

%block PAO.Basis
  O 2                  # Number of different l
    n=2 0 2            # n, l (=0, s-orbital), number of zetas
        *RCUT1 0.0*    # rcut for first zeta, rmatch for second zeta
    n=2 1 2 P 1        # n, l (=1, p-orbital), number of zetas, include 1 set of polarization orbitals
        *RCUT2 0.0*    # rcut for first zeta, rmatch for second zeta
  H 1
    n=1 0 2 P 1        # n, l (=0, s-orbital), number of zetas, include 1 set of polarization orbitals
        *RCUT3 0.0*    # rcut for first zeta, rmatch for second zeta
%endblock PAO.Basis

We will do the same as before, but now changing the value for the matching radii (which are still 0.0 in the block).

Using a similar procedure as before, find the value of rmatch for the 1s orbital of Hydrogen that minimizes the basis enthalpy, evaluating it from 1.5 Bohr to 3.0 Bohr. Do the same for Oxygen 2s and then 2p, changing the values from 2.5 Bohr to 4.0 Bohr for each of them.

Calculating the binding energy of a water dimer

Note

Go to the subdirectory 2-WaterDimer.

Now is the time to put our optimized basis set to test. We will use as an example the binding energy of a water dimer, i.e.:

E~binding~ = E~dimer~ - 2*E~monomer~

Copy your optimized basis block into both *monomer.fdf* and *dimer.fdf*, and run a single point calculation for both systems. Keep the total energies of both systems. Now do the same but instead of the optimized basis set, use the default DZP basis set with different values for the energy shift (for example, 0.1 Ry and 0.0001 Ry).

How well does our newly optimized basis set perform? Look at both the total energies for each system, and the total binding energy. Take into account that the binding energy for the water dimer using planewaves is 228.6 meV ([Corsetti et al.: “Optimal finite-range atomic basis sets for liquid water and ice”, J. Phys. Condens. Matter 2013](https://arxiv.org/abs/1307.3187)).

Optional: Optimizing the polarization orbitals

Note

Go to the subdirectory Optional-PolarizationOrbitals.

One last step in the orbital optimization could be to optimize the polarization orbitals. This is conceptually a bit different from what we have been doing up to now, even if the procedure is somewhat similar. Here, we will not be changing the cut-off radius of the orbital as above; instead, we will be altering the overall shape of the polarization orbital. For that purpose, we will expand the basis block to add one more orbital set per atom:

%block PAO.Basis
  H *2*
    n=1 0 2              # n, l (=0, s-orbital), number of zetas
        *RCUT1 RMATCH1*  # rcut for first zeta, rmatch for second zeta
    n=2 1 1 *Q 0.0*      # n, l (=1, p-orbital), number of zetas, charge value for confinement
        *RCUT1*          # rcut for first zeta, rmatch for second zeta
  O *3*
    n=2 0 2              # n, l (=0, s-orbital), number of zetas
        *RCUT2 RMATCH2*  # rcut for first zeta, rmatch for second zeta
    n=2 1 2              # n, l (=1, p-orbital), number of zetas
        *RCUT3 RMATCH3*  # rcut for first zeta, rmatch for second zeta
    n=3 2 1 *Q 0.0*      # n, l (=2, d-orbital), number of zetas, charge value for confinement
        *RCUT3*          # rcut for first zeta
%endblock PAO.Basis

Note that, in both cases, the polarization orbital belongs to the following principal number (n=3 for Oxygen, n=2 for Hydrogen). In addition, all polarization orbitals have the same cut-off radius as the orbital they are polarizing.

One of the recommended methods to generate appropriate polarization orbitals is the charge confinement method (more details in the manual), but suffice to say that this method relies on seeing how the atomic wavefunctions change in the presence of a charge confinement potential, \({ V(r;Q) = Q / r^{2}}\). In order to optimize our polarization orbitals, we just need to change the value of the charge Q.

Modify the charge used in the Hydrogen 2p polarization orbital (the number following Q) from 3.5 to 6.5, and choose a value that minimizes the basis enthalpy. Do the same for the Oxygen 3d polarization orbital, modifying the charge from 5.0 to 8.0.

Repeat the calculations for the water dimer using your newly optimized basis set. How do results change?

Note

In addition to the charge value, one can also change a decay parameter \({\lambda}\), changing the charge confinement potential to \({ V(r;Q, \lambda) = Q * e^{-\lambda r} / r^{2}}\). This is simply added as an additional number after Q. For example, n=2 1 1 Q 3.0 0.4 has a value of 0.4 for \({\lambda}\) and 3.0 for the charge.

The decay parameter is not independent from the value of the charge, therefore both values need to be optimized at the same time (or iteratively) for a given orbital. Refer to the manual for more information on the charge confinement potential.