7.1. The SHARK Integral Package and Task Driver¶
7.1.1. Preface¶
Starting with ORCA 5.0 very large changes have taken place in the way that the program handles integrals and integral related tasks like building Fock matrices. SHARK is a powerful and efficient infrastructure that greatly facilitates the handling of these tasks. This allows developers to write highly streamlined code with optimal performance and a high degree of reliability. Compared to the way ORCA handled integrals before ORCA 5.0, tens of thousands of lines of codes, often duplicated or nearly duplicated from closely related parts of the program could be eliminated. From the perspective of the user, the visible changes to the input and output of the program compared to ORCA 4.2.1 and earlier are relatively limited. However, under the hood, the changes are vast and massive and will ensure that ORCA’s infrastructure is modern and very well suited for the future of scientific computing.
The benefits of SHARK for the users of ORCA are:
Improved code efficiency that is consistent through all program tasks. In particular, complicated two-electron integrals, for example in the context of GIAOs, two-electron spin-orbit coupling and two-electron spin-spin coupling integrals are handled with vastly improved efficiency. Also, integral digestion has been vastly improved with very large benefits for calculations that build many Fock matrices at a time, for example in CIS/TD-DFT, analytic Hessians or response property calculations.
Improved code reliability, since all integrals now run through a well debugged, common interface
Shorter development times. The new infrastructure is so user friendly to programmers that writing new code that makes use of SHARK is much faster than in the past.
SHARK handles basis sets much better than the old infrastructure. Whether the basis sets used follow a segmented contraction, general contraction or partial general contraction is immaterial since the algorithms have been optimized carefully for each kind of basis throughout.
7.1.2. The SHARK integral algorithm¶
One cornerstone of SHARK is a new integral algorithm that allows for highly efficient evaluation of molecular integrals. The algorithm is based on the beautiful McMurchie-Davidson algorithm which leads to the following equation for a given two-electron integral:
Here
and the primitive Cartesian Gaussian basis functions
By means of the Gaussian product theorem, the two exponentials are straightforwardly rewritten as:
With
at which the new Gaussian is centered. The ingenious invention of
McMurchie and Davidson was to realize that the complicated polynomial
that arises from multiplying the two primitive Cartesian Gaussians can
be nicely written in terms of Hermite polynomials
And hence:
With
This means that the original four center integral is reduced to a sum of two-center integrals over Hermite Gaussian functions. These integrals are denoted as
With these definitions one understands the McMurchie Davidson algorithm as consisting of three steps:
Transformation of the Bra function product into the Hermite Gaussian Basis
Transformation of the Ket function product into the Hermite Gaussian Basis
Calculation of the Hermite Gaussian electron repulsion integral
SHARK is the realization that these three steps can be efficiently executed by a triple matrix product:
Here
There are many benefits to this formulation:
The integral is factorized allowing steps to be performed independent of each other. For example, the E matrices can be calculated at the beginning of the calculation and reused whenever needed. Their storage is unproblematic
Matrix multiplications lead to extremely efficient formation of the target integrals and drive the hardware at peak performance
Steps like contraction of primitive integrals and transformation from the Cartesian to the spherical Harmonics basis can be folded into the definition of the E matrices thus leading to extremely efficient code with next to no overhead creates by short loops.
Programming integrals becomes very easy and efficient. Other types of integrals as well as derivative integrals are readily approached in the same way. Also, two- and three-index repulsion integrals, as needed for the RI approximation are also readily formulated in this way.
One-electron integrals are equally readily done with this approach.
There is a very large number of technicalities that we will not describe in this manual which is only intended to provide the gist of the algorithm.
7.1.4. Basis set types¶
One significant aspect of molecular integral evaluation is the type of
contraction that is present in a Gaussian basis set. The most general
type of basis set is met in the “general contraction” scheme. Here all
primitive Gaussian basis functions of a given angular momentum are
collected in a vector
Where
The opposite of general contraction is met with segmented contraction. Here each basis function involves a number of primitives:
Quite typically, none of the
The third class of basis sets is met, when general contraction is combined with segmented contraction. Basis sets of this type are, for example, the correlation consistent (cc) basis sets. We call such basis sets “partially generally contracted”. In such basis sets, part of the basis functions are generally contracted (for example, the s- and p-functions in main group elements), while other basis functions (e. g. polarization functions, diffuse functions, core correlation functions) are not generally contracted. It is difficult to take full advantage of such basis sets given their complicated structure. In ORCA 5, special code has been provided that transforms the basis set into an intermediate basis set that does not contain any redundancies and hence drives SHARK or libint at peak performance.
In assessing the efficiency vs the accuracy of different integral algorithms, it is clear that segmented basis sets lead to the highest possible efficiency if they are well constructed. For such basis set the pre-screening that is an essential step of any integral direct algorithm performs best. The highest possible accuracy (per basis functions) is met with generally contracted basis sets. However, here the pre-screening becomes rather inefficient since it can only be performed at the level of atom/angular momentum combinations rather than individual shell quadruples. Thus, as soon as a given atom/angular momentum combination leads to any non-negligible integral, all integrals for this combination need to be calculated. This created a sizeable overhead. Consequently, SCF calculations can never be as efficient as with segmented basis sets. If this is immaterial, for example, because a subsequent coupled cluster or MRCI calculation is dominating the calculation time, general contraction is very worthwhile to be explored. For partial general contraction, our algorithm performs very nearly as efficiently as for segmented contraction in SCF calculations. However, since the intermediate basis set is larger than the original orbital basis, certain limited performance penalties can arise in some job types.
7.1.5. Task drivers¶
In traditional algorithms, quantum chemical programs frequently contain many instances of nested loops over basis function shells, the integral package is called and the integrals are “digested” for a given task. While these steps are inevitable, programming them repeatedly is laborious and error prone. In addition, improvements, say in the handling of contractions or symmetry, need to implemented in many different places. In the SHARK infrastructure all of this is unnecessary since it is programmed in an object-oriented fashion, where the programmer does not need to take care of any detail. Hence, developers only need to write short code sections that distribute the generated integrals into whatever data structure they need, while the SHARK interface takes care of all technical aspects and triggers the sophisticated and efficient machinery that underlies it.
Given this situation, the future of ORCA will involve SHARK taking care of nearly of the compute intensive, laborious tasks, while ORCA will organize and trigger all of these tasks. ORCA and SHARK communicate via a lean and well-defined interface to exchange the necessary data. In this way, a modern, efficient, easy to use and readily maintainable development environment is created.