Gaussian 03 Online Manual
Last update: 4 April 2003

## Efficiency Considerations
Before proceeding, however, let us emphasize two very important points. The default algorithms selected by the program give good performance for all but very large jobs. Note that some defaults have changed with *Gaussian 03* to reflect current typical problem sizes: defaults used in earlier versions of the program were designed for small jobs of under 100 basis functions, while the default algorithms in *Gaussian 03* are generally designed for longer jobs. For users or sites who routinely run very large jobs, the following defaults placed in the *Default.Route* file will produce good general performance:
-M- where the amount of available memory and disk are specified as indicated. The default units for each are 8-byte words, and either value may be followed by **KB**, **MB**, **GB**, **KW**, **MW** or **GW** (without intervening spaces) to specify units of kilo-, mega-, or gigabytes or words. Once the *Default.Route* file is set up, no other special actions are required at many sites for overall efficient program use. The default memory size is 6 MW.
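As a quick sanity check on these units, the following sketch converts such a memory specification into 8-byte words (the helper name and parsing behavior are illustrative, not part of Gaussian):

```python
# Hypothetical helper: convert a Gaussian-style memory specification
# (e.g. "6MW", "256MB") into 8-byte words, the program's default unit.
WORD_BYTES = 8

def spec_to_words(spec: str) -> int:
    """Return the number of 8-byte words for a memory spec string."""
    # Multiplier values are in bytes; 1 word = 8 bytes.
    multipliers = {"KB": 2**10, "MB": 2**20, "GB": 2**30,
                   "KW": 2**10 * WORD_BYTES, "MW": 2**20 * WORD_BYTES,
                   "GW": 2**30 * WORD_BYTES}
    spec = spec.strip().upper()
    for suffix, byte_count in multipliers.items():
        if spec.endswith(suffix):
            return int(spec[:-len(suffix)]) * byte_count // WORD_BYTES
    return int(spec)  # bare number: already in 8-byte words

# The default memory size of 6 MW corresponds to 48 MB:
assert spec_to_words("6MW") == 6 * 2**20
assert spec_to_words("48MB") == 6 * 2**20
```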
## Estimating Calculation Memory Requirements

The following formula can be used to estimate the memory requirement, in 8-byte words, of various types of calculations:

M + 2(NBasis)^2

where NBasis is the number of basis functions and M is a base amount, listed in the accompanying table, that depends on the job type and on the highest angular momentum function in the basis set.
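The arithmetic behind this estimate can be sketched as follows; the base value M used here is an assumption back-inferred from the worked example below, not a value taken from the table:

```python
MW = 2**20            # 1 MW = 1,048,576 words
WORD_BYTES = 8        # one word = 8 bytes, so 1 MW = 8,388,608 bytes

def estimate_words(nbasis: int, m_base_words: int) -> int:
    """Estimated job memory in 8-byte words: M + 2*NBasis**2."""
    return m_base_words + 2 * nbasis**2

# Assumed base value M for a job using g functions on a 32-bit system,
# chosen so the estimate reproduces the manual's quoted ~5.2 MW figure;
# the actual value should be read from the table.
M_G_32BIT = 5 * MW

words = estimate_words(300, M_G_32BIT)
print(f"{words / MW:.1f} MW = {words * WORD_BYTES / 2**20:.0f} MB")
```

With 300 basis functions, the 2(NBasis)^2 term contributes only 180,000 words (~0.17 MW); the base amount M dominates for jobs of this size.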
For example, on a 32-bit system, a 300 basis function HF geometry optimization using g functions would require about 5.2 MW (~42 MB) of memory. Note that 1 MW = 1,048,576 words (= 8,388,608 bytes).

The remainder of this chapter is designed for users who wish to understand more about the tradeoffs inherent in the various choices in order to obtain optimal performance for an individual job, not just good overall performance. Techniques for both very large and small jobs will be covered. Additional, related information may be found in reference [572].

## Memory Requirements for Parallel Calculations

When using multiple processors with shared memory, a good estimate of the memory required is the amount of memory from the preceding table multiplied by the number of processors. Thus, if the value from the table is 10 MW and you want to use four shared-memory processors, set **%Mem=40MW**. For distributed memory calculations (i.e., those performed via Linda), the amount of memory specified in **%Mem** is allocated by each Linda worker separately; in the example above, a job whose workers each run two shared-memory processors would use **%Mem=20MW**.

## Storage, Transformation, and Recomputation of Integrals

One of the most important performance-related choices is the way in which the program processes the numerous electron repulsion integrals. There are five possible approaches to handling two-electron repulsion integrals implemented in *Gaussian 03*:
At least two of these approaches are available for all methods in *Gaussian 03*.

## SCF Energies and Gradients

The performance issues that arise for SCF calculations include how the integrals are to be handled, and which alternative calculation method to select in the event that the default procedure fails to converge.

## Integral Storage

By default, SCF calculations use the direct algorithm. It might seem that direct SCF would be preferred only when disk space is insufficient. However, this is not the case in practice. Because of the use of cutoffs, the cost of direct SCF scales with molecular size as approximately N^2.7 or less, so it overtakes conventional SCF (which also carries heavy I/O costs) beyond a modest number of basis functions. The change to direct SCF as the default algorithm in *Gaussian* reflects this behavior for typical problem sizes.

In-core SCF is also available. Direct SCF calculations that have enough memory to store the integrals are automatically converted to in-core runs.

GVB and MCSCF calculations can also be done using direct or in-core algorithms [405]. Memory requirements are similar to the open-shell Hartree-Fock case described above. The primary difference is that many Fock operators must be formed in each iteration: for GVB, there are 2N_pair+1 operators, where N_pair is the number of GVB pairs. Cutoffs are less effective than for Hartree-Fock, so the crossover in efficiency is at a larger number of basis functions. The number of operators can be quite large for larger MCSCF active spaces, so performance can be improved by ensuring that enough memory is available to hold all the density and operator matrices at once. Otherwise, the integrals will be evaluated more than once per iteration.
## Direct SCF Procedure

In order to speed up direct HF calculations, the iterations are done in two phases:

- The density is converged to about 10^-5 using integrals accurate to six digits and, for DFT calculations, a modest integration grid. This step is terminated after 21 iterations even if it is not fully converged, and it is omitted by default if any transition metal atoms are present.
- The density is then converged to 10^-8 using integrals accurate to ten digits, allowing up to 64 cycles in total for the two steps.
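The two-phase idea can be illustrated with a toy fixed-point iteration. This is only an analogy, not Gaussian's implementation: the cheap and accurate update functions stand in for six- and ten-digit integral evaluation, and the iteration caps mirror the ones quoted above.

```python
import math

def two_phase_converge(update_cheap, update_accurate, x0):
    """Converge a fixed-point iteration in two phases: a loose tolerance
    with a cheap update (at most 21 cycles), then a tight tolerance with
    an accurate update, allowing 64 cycles total for both phases."""
    x, cycles = x0, 0
    # Phase 1: converge to ~1e-5 using the cheap (low-accuracy) update.
    for _ in range(21):
        x_new = update_cheap(x)
        cycles += 1
        converged = abs(x_new - x) < 1e-5
        x = x_new
        if converged:
            break
    # Phase 2: converge to 1e-8 using the accurate update.
    while cycles < 64:
        x_new = update_accurate(x)
        cycles += 1
        if abs(x_new - x) < 1e-8:
            return x_new
        x = x_new
    raise RuntimeError("not converged within 64 cycles")

# Toy stand-ins: the map x -> cos(x) at reduced and full precision.
cheap = lambda x: round(math.cos(x), 6)   # "six-digit" update
x = two_phase_converge(cheap, math.cos, 1.0)
```

The payoff is the same as described in the text: most of the iterations are spent on the cheap update, and only the final refinement pays full cost.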
This approach is substantially faster than using full integral accuracy throughout, and it has not slowed convergence in any cases tested so far. In the event of difficulties, full integral accuracy throughout can be requested with **SCF=NoVarAcc**.

## Single-Point Direct SCF Convergence

In order to improve performance for single-point direct and in-core SCF calculations, a modification of the default SCF approach is used:

- The integrals are evaluated to only 10^-6 accuracy, except for all-electron (non-ECP) calculations on molecules containing atoms heavier than argon.
- The SCF is converged to either 10^-4 on both the energy and the density, or to 10^-5 on the energy, whichever comes first.
This is sufficient accuracy for the usual uses of single-point SCF calculations, including relative energies, population analysis, multipole moments, electrostatic potentials, and electrostatic potential-derived charges. Conventional SCF single points, and all jobs other than single points, use tight convergence of 10^-8 on the density.

## Problem Convergence Cases

The default SCF algorithm now uses a combination of two Direct Inversion in the Iterative Subspace (DIIS) extrapolation methods: EDIIS and CDIIS. EDIIS [559] uses energies for extrapolation, and it dominates the early iterations of the SCF convergence process. CDIIS, which performs extrapolation based on the commutators of the Fock and density matrices, handles the latter phases of SCF convergence. This new algorithm is very reliable, and previously troublesome SCF convergence cases now almost always converge with the default algorithm. For the few remaining pathological cases, these are the available alternatives if the default approach fails to converge (labeled by their corresponding keyword):
These approaches all tend to force convergence to the closest stationary point in the orbital space, which may not be a minimum with respect to orbital rotations. A stability calculation can be used to verify that a proper SCF solution has been obtained (see the **Stable** keyword).

## SCF Frequencies

Four alternatives for integral processing are available for Hartree-Fock second derivatives:
By default, during in-core frequency calculations the integrals are computed once by each link that needs them. This keeps the disk storage down to the same modest O(N^2) amount as for direct SCF.

HF frequency calculations include prediction of the infrared and Raman vibrational intensities by default. The IR intensities add negligible overhead to the calculation, but the Raman intensities add 10-20%. If the Raman intensities are not of interest, they can be suppressed by specifying **Freq=NoRaman**.
While frequency calculations can be done using very modest amounts of memory, performance on very large jobs will be considerably better if enough memory is available to complete the major steps in one pass. Link 1110 must form a "skeleton derivative Fock matrix" for every degree of freedom (i.e., 3 x the number of atoms), and if only some of the matrices can be held in memory, it will compute the integral derivatives more than once. Similarly, in every iteration of the CPHF solution, link 1002 must form updates to all the derivative Fock matrices; link 1110 therefore needs on the order of 3 x N_atoms x N^2 words to hold all of these matrices at once. The **freqmem** utility can be used to estimate the optimal memory size for a frequency job.

## MP2 Energies

Four algorithms are available for MP2, but most of the decision-making is done automatically by the program. The critical element of this decision-making is the value of **MaxDisk**. The algorithms available for MP2 energies are:
In addition, when the direct, semi-direct, and in-core MP2 algorithms are used, the SCF phase can be either conventional, direct, or in-core. The default is direct or in-core SCF.

## MP2 Gradients

The choices for MP2 gradients are much the same as for MP2 energies, except:

- The conventional algorithm requires the storage of the two-particle density matrix and therefore uses considerably more disk than if only energies are needed. The newer methods require no more disk space for gradients than for the corresponding energies.
- The newer methods compute the integral derivatives at least twice, once in the E^2 phase and once after the CPHF step. As a result, for small systems (50 basis functions and below) on scalar machines, the conventional algorithm is somewhat faster.
- The integral derivative evaluation during E^2 in the newer algorithms requires extra main memory if functions higher than f are used.
As for the MP2 energy, the default is to do a direct or in-core SCF and then dynamically choose between the semi-direct, direct, and in-core E^2 algorithms.

## MP2 Frequencies

Only semi-direct methods are available for analytic MP2 second derivatives. These reduce the disk storage required below what a conventional algorithm would require. MP2 frequency jobs also require significant amounts of memory. The default of six million words should be increased for larger jobs. If f functions are used, eight million words should be provided on computer systems using 64-bit integers.

## Higher Correlated Methods

The correlation methods beyond MP2 (MP3, MP4, CCSD, CISD, QCISD, etc.) all require that some transformed (MO) integrals be stored on disk and thus, unlike MP2 energies and gradients, have disk space requirements that rise quartically with the size of the molecule. There are, however, several alternatives as to how the transformed integrals are generated, how many are stored, and how the remaining terms are computed:

- The default in *Gaussian 03* is a semi-direct algorithm. The AO integrals may be written out for use in the SCF phase of the calculation, or the SCF may be done directly or in-core. The transformation recomputes the AO integrals as needed and leaves only the minimum number of MO integrals on disk (see below). The remaining terms are computed by recomputing AO integrals.
- A full transformation is performed if **MaxDisk** supplies sufficient disk for doing so. This will be faster than other approaches unless the computer system's I/O is very slow.
- The conventional algorithm, which was the default in *Gaussian 90*, involves storing the AO integrals on disk, reading them back during the transformation, and forming all of the MO two-electron integrals except those involving four virtual orbitals; the four-virtual terms are computed by reading the AO integrals. This procedure can be requested in *Gaussian 03* by specifying **Tran=Conven** in the route section. However, it is appropriate only on very slow machines such as legacy PCs.
If a post-SCF calculation can be done using a full integral transformation while keeping disk usage under **MaxDisk**, this is done; otherwise, the semi-direct algorithm is used. The following points summarize the effect of **MaxDisk** on the post-SCF methods:

- CID, CISD, CCD, BD, and QCISD energies have a fixed storage requirement proportional to O^2N^2, with a large factor, but obey **MaxDisk** in avoiding larger storage requirements.
- CCSD, CCSD(T), QCISD(T), and BD(T) energies have fixed disk requirements proportional to ON^3 which cannot be limited by **MaxDisk**.
- CID, CISD, CCD, and QCISD densities and CCSD gradients have fixed disk requirements of about N^4/2 for closed-shell and 3N^4/4 for open-shell calculations.
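To get a feel for how these fixed disk terms compare, the following sketch evaluates the quoted scalings for hypothetical values of O (occupied orbitals) and N (basis functions); the prefactors are set to 1, since the manual gives only the scaling, so the numbers are relative rather than actual disk sizes:

```python
def disk_terms(n_occ: int, n_basis: int) -> dict:
    """Fixed disk terms quoted above, in arbitrary units (prefactor 1)."""
    return {
        "O^2 N^2 (CID/CISD/CCD/BD/QCISD energies)": n_occ**2 * n_basis**2,
        "O N^3 (CCSD(T)/QCISD(T)/BD(T) energies)": n_occ * n_basis**3,
        "N^4/2 (closed-shell densities)": n_basis**4 // 2,
    }

# Example: 30 occupied orbitals, 300 basis functions (hypothetical values).
terms = disk_terms(30, 300)
# Since O << N here, the N^4/2 density term dominates the fixed disk usage.
```

This makes the practical point concrete: density and gradient calculations carry the N^4-type disk costs that **MaxDisk** cannot reduce, so they deserve the most attention when planning disk usage.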
## Excited State Energies and Gradients

In addition to integral storage selection, the judicious use of the restart facilities can improve the economy of CIS and TD calculations.

## Integral Storage

Excited states using CI with single excitations can be done using five methods (labeled by their corresponding option to the **CIS** keyword):
## Restarting Jobs and Reuse of Wavefunctions

CIS and TD jobs can be restarted from a checkpoint file.

## CIS Excited State Densities

If only density analysis is desired, and the excited states have already been found, the CIS density can be recovered from the checkpoint file. Separate calculations are required to produce the generalized density for several states, since a CPHF calculation must be performed for each state. To do this, first solve for all the states and the density for the first excited state:

# CIS=(Root=1,NStates=N) Density=Current

then recover the density for each of the higher states with:

# CIS=(Read,Root=M,NStates=N) Density=Current

for states M = 2 through N.

## Pitfalls for Open-Shell Excited States

Since the UHF reference state is not an eigenfunction of S^2, neither are the excited states built upon it, and spin contamination can affect the results.

## Stability Calculations

Tests of triplet and singlet instabilities of RHF and UHF wavefunctions and of restricted and unrestricted DFT wavefunctions can be requested using the **Stable** keyword.

## CASSCF Efficiency

The primary challenge in using the CASSCF method is selecting appropriate active space orbitals. There are several possible tactics:

- Use the standard delocalized initial guess orbitals. This is sometimes sufficient, e.g., if the active space consists of all π electrons. Use **Guess=Only** to inspect the orbitals and determine whether any alterations are required before running the actual calculation.
- Use localized initial guess orbitals. This is useful if specific bond pairs are to be included, since localization separates electron pairs.
- Use the natural orbitals from the total density of a UHF calculation (CAS-UNO) [415,416]. For singlets, this requires that one has coaxed the UHF run into converging to a broken-symmetry wavefunction (normally with **Guess=Mix**). It is most useful for complex systems in which it is not clear which electrons are most poorly described by doubly-occupied orbitals.
In all cases, a single-point calculation should be performed before any optimization, so that the converged active space can be checked to ensure that the desired electrons have been correlated before proceeding. There are additional considerations in solving for CASSCF wavefunctions for excited states (see the discussion of the **CASSCF** keyword).

## CASSCF Frequencies

CASSCF frequencies require large amounts of memory. Increasing the amount of available memory will always improve performance for CASSCF frequency jobs (the same is not true of frequency calculations performed with other methods). These calculations also require large amounts of scratch disk space.