(sec:wavefunction.detailed)=
# Choice of Wavefunction and Integral Handling

(sec:wavefunction.type.detailed)=
## Choice of Wavefunction Type

The basic variable that controls the type of wavefunction to be computed
is the variable `HFTyp` in the `%scf` block. If nothing is specified for
`HFTyp`, the program will check the multiplicity given in the input: for
closed-shell molecules with multiplicity 1, RHF/RKS is assumed; for open
shell molecules with multiplicity larger than 1, UHF/UKS is invoked.
`RHF` will lead to a spin restricted closed-shell type computation
{cite}`szabo1989modern`. For DFT calculations, `RKS`, `UKS` and `ROKS` can be
used as synonyms for `RHF`, `UHF` and `ROHF`. The restricted open-shell
DFT method (`ROKS`) is only operative for high-spin states that have $n$
unpaired electrons and $S=n/2$. `UKS` wavefunctions will not be
spin-purified.

```orca
%scf
  HFTyp  RHF     # closed-shell (RKS for DFT)
         UHF     # unrestricted open-shell (UKS for DFT)
         ROHF    # restricted open-shell (ROKS for DFT)
         CASSCF  # complete active space SCF
end
```

In certain cases you may want to run open-shell molecules with RHF/RKS
to get a "half-electron" type wavefunction {cite}`dewar1968am`. The total
energy is not corrected! Sometimes these half-electron computations lead
to acceptable convergence, and the resulting orbitals may be used as
input for ROHF, UHF or MRCI calculations. Especially for transition
metal complexes the orbitals are quite different from ROHF or UHF
orbitals, so that it is not recommended to over-interpret the
wavefunctions from such calculations. The calculation is set up in the
following way:

```orca
%method AllowRHF true end
# or simply: ! AllowRHF
```

(sec:wavefunction.rohf.detailed)=
## ROHF Options

For ROHF
calculations{cite}`mcweeny1974mol,brobowicz1977methods,carbo1978general,binkley1974mol,edwards1987theor,muller1994chem,kollmar1996chem,kollmar1997int,bofill1998comp`
the program will try to figure out what type of open-shell situation is
present on the basis of the initial guess orbitals and their energies.
Most "simple" cases are well recognized, but sometimes a little help from
the user is needed.

The simplest ROHF case is the `HIGHSPIN` case, where all unpaired electrons
in the open-shell are coupled parallel to each other, resulting in the highest
multiplicity possible. The user can request this case as follow:

```orca
%scf
  HFTyp       ROHF
  ROHF_case   HIGHSPIN
  ROHF_NEl[1] 4          # number of electrons in the open-shell
end
```

The ROHF code also has a very powerful feature that goes
back to insights of Mike Zerner {cite}`stavrev1997int,zerner1989int`. It
can average over either *all* states of a given configuration (`CAHF`)
or *all states of a given spin* for a given configuration (`SAHF`).
Especially the `SAHF` feature gives you easy access to most degenerate
high symmetry situations and the orbitals resulting from such
calculations will be very convenient as input for CI calculations.

```orca
%scf
  HFTyp         ROHF
  ROHF_case     CAHF    # configuration averaged HF
                SAHF    # spin averaged HF
  ROHF_NumOp    3       # number of operators (3, 2 or 1)
  ROHF_NOrb[1]  2,1     # number of orbitals in each open-shell
  ROHF_NEl[1]   1,1     # number of electrons in each open-shell
end
```

The hypothetical example below could represent an excited state of an
octahedral d$^{3}$ transition metal complex. In this case there are five
open-shell orbitals. The first three open-shell orbitals contain two
electrons and the last two one electron. The input for a SAHF
calculation is identical, just replace CAHF with SAHF.

```orca
%scf
  HFTyp         ROHF
  ROHF_case     CAHF   # configuration averaged HF
  ROHF_NumOp    3      # 3 operators in this case: closed, open1, open2
  ROHF_NOrb[1]  3,2    # 3 orbitals in first open shell, 2 in the second
  ROHF_NEl[1]   2,1    # 2 electrons in first open shell, 1 in the second
end
```

Another feature of the ROHF code is the ability to converge the SCF to
a given Configuration State Function (`CSF-ROHF`) {cite}`gouveia2024JPCA`.
In this way one can approach results from MCSCF calculations. This can be
requested in two ways.

The user can give a specific coupling situation.
```orca
%scf
  HFTyp      ROHF
  ROHF_CASE  USER_CSF          # User defined CSF
  ROHF_REF   {1 1 -1 -1} end   # CSF to be converged to
end
```

Or the user can give how many orbitals per shell. Where each open-shell
will couple with antiparallel spin with the previous one.
```orca
%scf
  HFTyp       ROHF
  ROHF_CASE   AF_CSF          # User defined CSF
  ROHF_AFORBS 2,2             # Coupling Situation
end
```

As an example, one can think of a Fe(III) dimer, where each center is locally
high spin, but they couple antiferromagnetically to each other. In order to get
the ROHF solution for this system, first one need a set of guess orbitals. The 
guess orbitals can be obtained either from the QROs of an UHF calculation, or a
high spin ROHF calculation, or even a SAHF or CAHF. Independently of the method used,
the orbitals need to be localized and ordered in a way that the 5 3d orbitals of each iron
are grouped together in sequence. From this, one can run a CSF-ROHF calculation for
the antiferromagnetic CSF as shown bellow:

```orca
%scf
  HFTyp      ROHF
  ROHF_CASE  USER_CSF                         # User defined CSF
  ROHF_REF   {1 1 1 1 1 -1 -1 -1 -1 -1} end   # 2 open shells coupling antiparallel
end

or

%scf
  HFTyp       ROHF
  ROHF_CASE   AF_CSF                          # User defined CSF
  ROHF_AFORBS 5,5                             # 2 open shells coupling antiparallel
end
```

The CSF-ROHF procedure can recognize doubly occupied and virtual orbitals in the definition of
the CSF when the `USER_CSF` case is invoked. When detected, these orbitals will be rotated out of 
the open-shells defined in the ROHF method and the calculation will run normally:

```orca
%scf
  HFTyp      ROHF
  ROHF_CASE  USER_CSF                   # User defined CSF
  ROHF_REF   {1 1 2 1  -1 -1 0 -1} end  # the DOMO will be rotated to the closed-shell and
                                        # the VMO will be rotated to the virtual space. 
end
```

The user can also directly input the ROHF variables by means of the
`ROHFOPT Case User` keyword. For example
for the high spin case with three electrons in three orbitals gives two
operators with vector coupling coefficients $a=1$ and $b=2$ (Zerner
convention).

```orca
%scf
  HFTyp      ROHF
  ROHFOP  Case    User  # manual input of ROHF variables
    Nop     2           # number of operators
    Norb[1] 3           # number of open-shell orbitals
    Nel[1]  3           # number of open-shell electrons
    A[1,1]  1           # Coulomb vector in the open shell
    B[1,1]  2           # Exchange vector in the open shell
  end
end
```

One awkward feature of the ROHF theory is that the Fock operator is
somewhat arbitrarily defined. Different choices lead to the same
wavefunction, but have different convergence properties that may vary
from system to system. ORCA thus lets the user choose the desired
variant. Playing around with these choices may turn a divergent or
slowly converging ROHF calculation into a successful calculation!

The `ROHF_Restrict` feature is another feature that may be useful. If
you suspect that the ROHF calculation does not converge because an
open-shell and a closed-shell orbital are flipping back and forth, you
can try to avoid this behavior by choosing `ROHF_Restrict true`. Of
course there is no guarantee that it will work, and no guarantee that
the system stays in the desired state. However, it decreases the chances
of large, uncontrolled steps.

```orca
%scf
  ROHF_Mode 0   # construct F according to Pulay (default)
            1   # construct F as in the Gamess program
            2   # construct F according to Kollmar

  ROHF_Restrict false # restrict orbital interchanges and off-diagonal elements
                      # (default=false)
                      
  # a complete list of ROHF variables
  ROHFOP 
    Case      User   # manual input of ROHF variables
    Nop       2      # number of operators
    Norb[1]   3      # number of open-shell orbitals
    Nel[1]    3      # number of open-shell electrons
    A[1,1]    1      # Coulomb vector in the open shell
    B[1,1]    2      # Exchange vector in the open shell
    Mode      2      # use the Kollmar operator
    Restrict  false  # do not restrict
  end
end
```

(sec:wavefunction.uhfno.detailed)=
## UHF Natural Orbitals

The program can produce the UHF natural orbitals (UNOs). With these, the
open-shell wavefunction can be pictured conveniently. The syntax is
simple:

```orca
%scf
  UHFNO true
end
# or simply: ! UNO
```

There are various printing options for UNOs described in the output
section {ref}`sec:pop.detailed`. The UNOs can also be plotted as described
in the plots section
{ref}`sec:plots.detailed`. In general the program stores a file
`BaseName.uno`, where `BaseName` is by default the name of you input
file with `.inp` stripped off. Accordingly, the gbw file is named
`BaseName.gbw`. The `.uno` file is a normal gbw file that contains the
geometry, basis set and the UNO orbitals. It could be used, for example,
to start a ROHF calculation.

(sec:wavefunction.integrals.detailed)=
## Integral Handling (Conventional and Direct) 

As the number of nonzero integrals grows very rapidly and reaches easily
hundreds of millions even with medium sized basis sets in medium sized
molecules, storage of all integrals is not generally feasible. This
desperate situation prevented SCF calculations on larger molecules for
quite some time, so that Almlöf
{cite}`almloef1982comp,almloef1984advanced,almloef1995modern` made the
insightful suggestion to repeat the integral calculation, which was
already the dominant step, in *every SCF cycle* to solve the storage
problem. Naively, one would think that this raises the effort for the
calculation to $n_{\text{iter} }t_{\text{integrals} }$ (where
$n_{\text{iter} }$ is the number of iterations and
$t_{\text{integrals} }$ is the time needed to generate the nonzero
integrals). However, this is not the case because only the change in the
Fock matrix is required from one iteration to the next, but not the Fock
matrix itself. As the calculations starts to converge, more and more
integrals can be skipped. The integral calculation time will still
dominate the calculation quite strongly, so that ways to reduce this
burden are clearly called for. As integrals are calculated in
*batches*[^1] the cost of evaluating the given batch of shells
$p, q, r, s$ may be estimated as:

$$\text{cost}\approx n_{p} n_{q} n_{r} n_{s} \left({ 2l_{p} +1} \right)\left({ 2l_{q}
+1} \right)\left({ 2l_{r} +1} \right)\left({ 2l_{s} +1} \right)$$ (eqn:51)

Here, $n_{p}$ is the number of primitives involved in shell $p$, and
$l_{p}$ is the angular momentum for this shell. *Large* integrals are
also good candidates for storage, because small changes in the density
that multiply large integrals are likely to give a nonzero contribution
to the changes in the Fock matrix.

ORCA thus features two possibilities for integral handling, which are
controlled by the variable `SCFMode`. In the mode `Conventional`, all
integrals above a given threshold are stored on disk (in a compressed
format that saves much disk space). In the mode `Direct`, all
two-electron integrals are recomputed in each iteration.

Two further variables are of importance: In the `Conventional` mode the
program may write enormous amounts of data to disk. To ensure this stays
within bounds, the program first performs a so-called "statistics run"
that gives a pessimistic estimate of how large the integral files will
be. Oftentimes, the program will overestimate the amount of disk space
required by a factor of two or more. The maximum amount of disk space
that is allowed for the integral files is given by `MaxDisk` (in
Megabytes).

On the other hand, if the integral files in `Conventional` run are small
enough to fit into the central memory, it is faster to do this since it
avoids I/O bottlenecks. The maximum amount of memory allocated for
integrals in this way is specified by `MaxIntMem` (in Megabytes). If the
integral files are larger than `MaxIntMem`, no integrals will be read
into memory.

```orca
%scf
  MaxIter   100    # Max. no. of SCF iterations
  SCFMode Direct   # default, other choice: Conventional
  Thresh  1e-8     # Threshold for neglecting integrals / Fock matrix contributions
                   # Depends on the chosen convergence tolerance (in Eh).
  TCut    1e-10    # Threshold for neglecting primitive batches. If the prefactor
                   # in the integral is smaller than TCut, the contribution of the
                   # primitive batch to the total batch is neglected.
  UseCheapInts false   # default: false
  DirectResetFreq 20   # default: 15
  MaxDisk  2500        # Max. amount of disk for 2 el. ints. (MB)
  MaxIntMem 400        # Max. amount of RAM for 2 el. ints. (MB)
end
```

The flag `UseCheapInts` has the following meaning: In a `Direct` SCF
calculation, the oscillations in the total energy and density are
initially quite large. High accuracy in the integrals is therefore not
crucial. If `UseCheapInts` is switched on, the program loosens the
threshold for the integrals and thus saves a lot of computational time.
After having obtained a reasonable initial convergence, the thresholds
are tightened to the target accuracy. One pitfall with this method is
that the number of cycles required to reach convergence may be larger
relative to a calculation with full integral accuracy throughout.[^2]
When restarting calculations that are close to convergence, it is
recommended to switch `UseCheapInts` off. `UseCheapInts` has no meaning
in a conventional SCF.

The value of `DirectResetFreq` sets the number of incremental Fock
matrix builds after which the program should perform a full Fock matrix
build in a `Direct` SCF calculation. To prevent numerical instabilities
that arise from accumulated errors in the recursively build Fock matrix,
the value should not be too large, since this will adversely affect the
SCF convergence. If the value is too small, the program will update more
frequently, but the calculation will take considerably longer, since a
full Fock matrix build is more expensive than a recursive one.

The thresholds `TCut` and `Thresh` also deserve a closer explanation.
`Thresh` is a threshold that determines when to neglect two-electron
integrals. If a given integral is smaller than `Thresh` Eh, it will not
be stored or used in Fock matrix construction. Additionally,
contributions to the Fock matrix that are smaller than `Thresh` Eh will
not be calculated in a `Direct` SCF.

Clearly, it would be wasteful to calculate an integral, then find out it
is good for nothing and thus discard it. A useful feature would be an
efficient way to estimate the size of the integral *before it is even
calculated*, or even have an estimate that is a *rigorous upper bound*
on the value of the integral. Häser and Ahlrichs {cite}`haeser1989comp` were
the first to recognize that such an upper bound is actually rather easy
to calculate. They showed that:

$$\left|{ \left\langle { ij\left|{ kl} \right.} \right\rangle}
\right|\leqslant \sqrt{ \left\langle { ij\left|{ ij} \right.} \right\rangle}
\sqrt{ \left\langle { kl\left|{ kl} \right.} \right\rangle}
$$ (eqn:52)

where:

$$\left\langle { ij\left|{ kl} \right.} \right\rangle=\int{ \int{ \phi_{i}
\left({ \vec{{r} }_{1} } \right)\phi_{j} \left({ \vec{{r} }_{1} }
\right)r_{12}^{-1} \phi_{k} \left({ \vec{{r} }_{2} } \right)\phi_{l} \left({\vec{{r} }_{2} } \right)d\vec{{r} }_{1} d\vec{{r} }_{2} } }
$$ (eqn:53)

Thus, in order to compute an upper bound for the integral only the right
hand side of this equation must be known. This involves only two index
quantities, namely the matrix of two center exchange integrals
$\left\langle { ij\left|{ ij} \right.} \right\rangle$. These integrals
are easy and quick to calculate and they are all
$\geqslant$0 so that there is no trouble with the
square root. Thus, one has a powerful device to avoid computation of
small integrals. In an actual calculation, the Schwartz prescreening is
not used on the level of individual basis functions but on the level of
shell batches because integrals are always calculated in batches. To
realize this, the largest exchange integral of a given exchange integral
block is looked for and its square root is stored in the so called
*pre-screening* matrix **K** (that is stored on disk in ORCA). In a
`Direct` SCF this matrix is not recalculated in every cycle, but simply
read from disk whenever it is needed. The matrix of exchange integrals
on the level of individual basis function is used in `Conventional`
calculations to estimate the disk requirements (the "statistics" run).

Once it has been determined that a given integral batch survives it may
be calculated as:

$$\left\langle { ij\left|{ kl} \right.} \right\rangle=\sum\limits_p { d_{pi}
\sum\limits_q { d_{qj} \sum\limits_r { d_{kr} \sum\limits_s { d_{sl}
\left\langle { i_{p} j_{q} \left|{ k_{r} l_{s} } \right.} \right\rangle} } }
}
$$ (eqn:54)

where the sums $p,q,r,s$ run over the primitive Gaussians in each basis
function $i,j,k,l$ and the $d$'s are the contraction coefficients. There
are more powerful algorithms than this one and they are also used in
ORCA. However, if many terms in the sum can be skipped and the total
angular momentum is low, it is still worthwhile to compute contracted
integrals in this straightforward way. In equation
{eq}`eqn:54`, each
primitive integral batch
$\left\langle { i_{p} j_{q} \left|{ k_{r} l_{s} } \right.} \right\rangle$
contains a prefactor $I_{Gaussians}$ that depends on the position of the
four Gaussians and their orbital exponents. Since a contracted Gaussian
usually has orbital exponents over a rather wide range, it is clear that
many of these primitive integral batches will contribute negligibly to
the final integral values. In order to reduce the overhead, the
parameter `TCut` is introduced. If the common prefactor $I_{pqrs}$ is
smaller than `TCut`, the primitive integral batch is skipped. However,
$I_{pqrs}$ is *not* a rigorous upper bound to the true value of the
primitive integral. Thus, one has to be more conservative with `TCut`
than with `Thresh`. In practice it appears that choosing
`TCut=0.01*Thresh` provides sufficient accuracy, but the user is
encouraged to determine the influence of `TCut` if it is suspected that
the accuracy reached in the integrals is not sufficient.

:::{Hint}

-   If the direct SCF calculation is close to convergence but fails to
    finally converge, this maybe related to a numerical problem with the
    Fock matrix update procedure -- the accumulated numerical noise from
    the update procedure prevents sharp convergence. In this case, set
    `Thresh` and `TCut` lower and/or let the calculation more frequently
    reset the Fock matrix (`DirectResetFreq`).
:::

:::{Note}

-   For a `Direct` calculation, there is no way to have `Thresh` larger
    than `TolE`. If the errors in the Fock matrix are larger than the
    requested convergence of the energy, the change in energy can never
    reach `TolE`. The program checks for that.

-   The actual disk space used for *all* temporary files may easily be
    larger than `MaxDisk`. `MaxDisk` only pertains to the two-electron
    integral files. Other disk requirements are not currently checked by
    the program and appear to be uncritical.
:::

[^1]: A *batch* is a set of integrals that arises from all components of
    the shells involved in the integral. For example a
    $\left\langle pp|pp \right\rangle$ batch gives rise to
    $3 \times 3\times 3 \times 3 = 81$ integrals due to all possible
    combinations of $p_{x}$, $p_{y}$ and $p_{z}$ functions in the four
    shells. Computations based on batches lead to great computational
    advantages because the 81 integrals involved in the
    $\left\langle pp|pp \right\rangle$ batch share many common
    intermediate quantities.

[^2]: This might be an undesirable feature of the current
    implementation.