Are dispersion corrections accurate outside equilibrium? A case study on benzene

Modern approaches to modelling dispersion forces are becoming increasingly accurate, and can predict accurate binding distances and energies. However, it is possible that these successes reflect a fortuitous cancellation of errors at equilibrium. Thus, in this work we investigate whether a selection of modern dispersion methods agree with benchmark calculations across several potential-energy curves of the benzene dimer to determine if they are capable of describing forces and energies outside equilibrium. We find the exchange-hole dipole moment (XDM) model describes most cases with the highest overall agreement with reference data for energies and forces, with many-body dispersion (MBD) and its fractionally ionic (FI) variant performing essentially as well. Popular approaches, such as Grimme-D and van der Waals density functional approximations (vdW-DFAs) underperform on our tests. The meta-GGA M06-L is surprisingly good for a method without explicit dispersion corrections. Some problems with SCAN+rVV10 are uncovered and briefly discussed.


Introduction
The past decade has seen an increasing awareness of the role played by van der Waals dispersion forces in chemistry and materials science [1][2][3][4][5][6]. It has consequently become firmly established that including dispersion forces can be vital for understanding and predicting the behaviour and structure of molecules, materials and surfaces [7][8][9][10][11][12].
The increased attention being paid to dispersion forces has paralleled, and been driven by, an increased interest in how to accurately model them. Multiple families of approaches for including dispersion forces in quantum chemical simulations now exist, mostly based around the principle of improving density functional theory (DFT) calculations (see, e.g., some key and recent summaries [3,5,13,14]) through a dispersion correction. The latest variants of these approaches have been highly successful in predicting key properties of a wide range of molecules and materials, such as binding energies and molecular/material structures [4,5,15]. Methods are increasingly converging towards accurate prediction of these properties [16].
What is less well known, however, is how well methods predict the properties of systems outside of internal equilibrium, i.e., whether they can predict energies and forces when a system has not relaxed to its lowest energy geometry. This question is important as it is feasible that methods benefit from a cancellation of errors at equilibrium, which may give false expectations about their general accuracy. We therefore need to understand the limitations of approaches in dealing with dispersion forces generally, and not just for systems at their equilibrium geometries. This is especially important for predicting how a system (or sub-system) behaves when subject to external forces, or when dispersion forces compete with other weak forces within molecules or structures. It is particularly relevant as recent work has shown that modern approaches can often provide accurate binding distances or binding energies in layered materials [17], but not both, suggesting limits to their accuracy.
The work by Řezáč et al. goes some way to resolving this question, by providing benchmark values (the S66x8 set) for eight equilibrium and non-equilibrium geometries of different molecular pairs [18]. This set has been used to test various dispersion methods [19]. However, while S66x8 certainly improves on tests only at optimized geometries, it may still fail to expose issues with forces or other energetic differences.
In this work we thus test the accuracy of modern dispersion approaches in reproducing the energetics of the benzene dimer, an important model system, in different geometries. The coupled-cluster with singles, doubles and perturbative triples [CCSD(T)] benchmark set of Sinnokrot et al. [20] is used, with the aim of establishing which approximations can best reproduce the full potential energy curves. This simple test is not designed to be comprehensive, but rather to interrogate the predictive ability of different approaches. Note that the benchmark set, while slightly inaccurate by modern standards, is predicted to be within 0.1 kcal/mol of more recent benchmarks [21], which is similar to methodological errors caused by using modern pseudopotential methods [22]. We thus feel that its range more than makes up for any limitations it may have.
Moreover, interactions between ring structures feature widely across organic chemistry [23][24][25]. Recently, there has been increasing interest in utilizing non-covalent π-stacking for synthetic catalysis -and it is notable that most structures shown in a recent review on the topic feature rings that interact at distances greater than the potential minimum [26]. Benzene dimers also feature in the S22 benchmark set [21,27] that is often used to semi-empirically optimize dispersion corrections, and is almost always used as a test of such methods. They are thus an excellent test of the quality of dispersion models on a system where failures may have chemical relevance.

Results and Discussion
The origin of dispersion forces Dispersion forces are a semi-classical effect coming from quantum fluctuations. Most simply, they can be viewed from the perspective of pairs of interacting fluctuating dipoles, in which a temporary dipole in one (sub)system induces a dipole in the other, and consequently lowers the total energy. Since the field from each temporary dipole falls off as the inverse cube of the distance D between the systems, and the contributions come in pairs, this leads to an interaction with the asymptotic form U vdW = −C 6 D −6 , where C 6 is a coefficient that depends on the properties of the independent systems. A similar semi-classical analysis can also be applied to more general multipoles, such as quadropoles, which give rise to terms proportional to D −8 (quadropole with dipole), D −10 (quadrupole-quadropole) etc.
In addition to direct coupling between pairs of multipoles, more general forms of coupling can also lead to higher order contributions, including 3rd-order effects between three multipoles, 4th-order etc., as detailed in, e.g., Dobson and Gould [2]. In certain cases, including graphene [28][29][30][31], this can lead to fundamental deviations from the simple model outlined above [18,[31][32][33][34][35][36]. However, the most extreme deviations from the pairwise model do not affect the benzene dimer system. Note, the importance of higher-order (many-body) dispersion terms in "typical" systems has been the subject of some debate. It is critical to differentiate between non-additive many-body electronic interactions [34,35] and non-additive C 9 or Axilrod-Teller-Muto (ATM) dispersion interactions here. The former cause large differences in the effective pairwise C 6 and higherorder dispersion coefficients, relative to corresponding values for free atoms [33,37,38] (these are known as Type-B non-additivity effects in the classification scheme of Dobson [33]). This is a particularly significant effect for metals and can alter the C 6 coefficients by more than an order of magnitude in some cases [39,40]. In contrast, the 3-body C 9 contribution to the dispersion energy is typically smaller in magnitude than the pairwise C 10 contribution [41] and consequently is negligible for most applications compared to 3-body electronic effects [42][43][44].
Mathematically, the ATM treatment is most applicable when the energy contribution from 3rd and higher order terms converge rapidly as a function of inverse distance and may thus be truncated after the 2nd-order or 3rd-order contributions. Many-body effects arise from a divergence or slow convergence in the same series due to Dobson-B or -C effects, so that the contributions must be treated as a formal power series and rewritten as an explicit function of the polarisability tensor.

Summary of modern dispersion methods
Over the last decade, a number of new approaches have been developed that explicitly introduce dispersion forces into electronic structure theory methods -typically density functional theory (DFT). These approaches seek to overcome the fundamental lack of dispersion forces in DFT and Hartree-Fock theory by introducing an explicit long-range correction, giving a total energy E = E DFT + ΔE vdW for the system. Typically, E DFT is taken from a standard density functional approximation (DFA), such as PBE [45] or B3LYP [37], while ΔE vdW is one of a range of dispersion correction models. Note, this is different to seamless approaches like MP2, RPA or other quantum chemistry methods which include dispersion forces automatically.
Common van der Waals corrections can be broadly divided into three categories, as will be detailed below. Substantial effort has seen steady improvements in the quality of approaches in all three categories. In this paper we focus only on recent (or older, but still very popular) iterations within each category, to reflect how the methods are designed to be used in practice. The three classes of approaches considered are: 1) Purely empirical corrections based only on semi-classical models of the nuclei, and their neighbours, without drawing from the electronic density [15,[46][47][48][49][50]. Of these we include Grimme's D2, D3 and D3-BJ functionals, as corrections on PBE and, in the case of D3-BJ, on B3LYP. Here and henceforth "on X" means that the correction is taken on top of the X (hybrid) density functional approximation, which we also denote as X-Y (e.g., PBE-D3-BJ); 2) atomic-dipole with density methods, which correct first-principles or empirical models of atomic dipoles (and sometimes multipoles) using properties of the electronic density. Of these we include XDM [51,52] 55] and FI [56,57] include dispersion contributions to all orders using the many-body dispersion method of Tkatchenko et al.
[54], but involve different treatment of polarisabilities and screening; and 3) first principles density functionals, in which dispersion forces depend only on the density in a totally seamless fashion [58][59][60] and in which the base DFA forms part of the functional itself. We include the functional of Dion et al. [59], vdW-DF2 [61], optPBEvdW [62] and optB88vdW [62]. We also include SCAN+rVV10, based on the strongly-constrained and normalised (SCAN) meta-GGA, which has been shown to be very successful in initial testing [63]. We refer collectively to these as vdW-DFAs.
In addition to the van der Waals functionals, we also show energies from the generalized gradient approximations (GGAs) PBE and B3LYP, and the meta-GGAs M06L [64] and SCAN [65] without dispersion corrections. Although both GGAs are known to neglect dispersion physics, the meta-GGAs M06-L and SCAN are expected to capture some dispersion-binding contributions through the large-gradient behaviour of their exchange functionals. Note, the interplay between exchange-correlation functionals and dispersion corrections has been the topic of some discussion [66,67]. Finally, we note that we include only general functionals, and avoid approaches that are designed to address one type of system (e.g., molecules, bulks or layered structures) only.

Calculation details
We performed most calculations using VASP 5.4 [68,69] where the valence electrons are separated from the core by use of projector-augmented wave pseudopotentials (PAW) [70]. The energy tolerance for the electronic structure determinations was set at 10 −7 eV. Calculations used only the Γ k-point. ENCUT was set to 400 eV in all calculations, which were carried out in a 15 × 15 × 25 Å 3 supercell. SCAN(+rVV10) required ENCUT = 700 eV as results showed significant noise with the standard energy cutoff, which led us to reduce the box dimensions to 12 × 12 × 20 Å 3 . We will discuss issues with SCAN later. Both MBD approaches (TS-MBD and FI-MBD) used the reciprocal space implementation [71], the latter in a custom version of VASP 5.4.1 [57]. The vdW-DFs use the implementation of Klimeš [72] of the Román-Pérez and Soler [73] method.
Some methods are not implemented in VASP and in these cases, the calculations were performed using other codes. XDM results were obtained using Gaussian09 (PBE, B3LYP, and LC-wPBE) or Psi4 [74] (B86bPBE) and the postg application. We include XDM results on several base functionals due to its broad success. M06-L results were calculated with Gaussian16 using the aug-cc-pVTZ basis set due to convergence difficulties with the plane-waves/pseudopotential approach.
To put these settings in context, we purposefully employed the methods as they are intended to be used, i.e., using more-or-less standard convergence parameters and recommended settings.

Results
Now that we have established the background methodology, let us summarise the shared features of Figures 1-3 to aid in  showing results for selected groupings of methods. Each subfigure shows as solid lines the benchmark potential energy curve U bench and the potential energy curves from the selected methods U method . They also show the benchmark force F bench , and the forces for different methods F method , all in dashes. All energies and forces are reported as functions of distance, either between the centre of dimers (Figure 1 and Figure 2) or the sliding distance (Figure 3).
We adopt some steps to ensure all energies and forces are calculated in the same way, so as to reduce uncontrolled errors from, e.g., basis set superpositions or pseudopotentials. Firstly, we use the electronic structure codes to calculate energies E(R) directly. We then fit E(R) = E ∞ -C 6 /R 6 to the last five points of the parallel configuration data to find E ∞ , the extrapolated energy of two monomers, which lets us determine interaction energies U(R) = E(R) − E ∞ . We plot U(R) in Figure 1 and Figure 2, and use the minimum-energy values directly in Table 1 - Figure 3 shows U(R) = E(R) − E(0). Secondly, we obtain all forces by fitting cubic splines through the energy data, and taking the derivative of the splines. Figure 1 shows the interaction energy for the parallel configuration of the benzene dimer (labelled P -with D 6h symmetry). Despite having a minimum as a function of distance between the two centres, this arrangement is unstable as the dimers wish to slide apart sideways (see later discussion on Figure 3) to reduce electrostatic effects, such as overlap of the densities of the monomers and static quadrupole-quadropole interactions, which make metastable AA graphite ≈0.23 kcal/mol/C less energetically favourable compared to AB graphite [75]. This configuration thus involves competition between dispersion forces, repulsive electrostatic forces, and other exchange and correlation effects, making it a good test of dispersion corrections. It is clear from the figure that D2, XDM (all variants), MBD and FI all give reliable energies across the entire curve. Their forces are slightly worse, but still within 0.5 kcal/mol/Å of the reference data at reasonable intermolecular distances. The more modern Grimme variants fare worse than their older cousin, and none of the two-point vdW-DFAs work very well at all, for energies or forces, except near the minima. Indeed, most of the tested vdW-DFAs give force errors outside equilibrium that are similar in magnitude to the force itself. A notable exception is SCAN+rVV10 which is broadly on a par with XDM and TS/FI-MBD. Somewhat surprisingly, the semi-local meta-GGA M06L gives an energy curve which is also in broad agreement with the benchmark, but which fluctuates [76] making the spline-derived forces less reliable. Other dispersion-free methods are less successful, as expected. Figure 2 then reports the energies for T configuration as a function of distance (T -with C 2v symmetry), which includes the global minimum for benzene dimer interactions, or at least a local minimum that is energetically very close to it. Here the balance of energetic contributions is more strongly skewed to dispersion, and it is expected that vdW dispersion corrections should work better than for the parallel configuration shown in Figure 1.
Indeed, the successful methods for the parallel geometry (XDM, MBD, FI) seem to work very well here, reproducing the reference energies and forces to within methodological error of ≈0.1 kcal/mol. The vdW-DFAs perform slightly better than in the parallel case, as one would hope. D2, however, is quite poor despite its success in the parallel case and its more modern cousins are conversely much better. Again, M06L works well. Here, however, we notice that SCAN shows significant oscillations around the true curve, which SCAN+rVV10 inherits (to be discussed later).
Next, Figure 3 reports the potential energy curves for sliding of parallel benzene molecules relative to one another at a fixed inter-planar distance D, known as the slipped-parallel configuration (SP -with C 2h symmetry). We show results for Here XDM is a stand-out, giving almost perfect agreement with the benchmarks, thus indicating its ability to simultaneously capture competing energy contributions. All other methods are much more successful here than in the previous tests, reflecting their consistency in reproducing electrostatic effects compared to dispersion interactions which are more-or-less constant across the curves. These results are replicated in other tests (not shown) at D = 3.2, 3.4 and 3.8 Å. Again, the SCAN and SCAN+rVV10 curves display oscillations.
The strange behaviour of SCAN and SCAN+rVV10 warrants special attention. Previous tests of meta-GGAs using Gaussiantype orbital codes suggest this issue might be related to the density of the real-space grid [76]. In Figure 4 we thus show results for SCAN and SCAN+rVV10 for all four intermolecular distances (D = 3.2, 3.4, 3.6 and 3.8 Å) and for both the large energy cutoff 700 eV used in previous calculations, and also a smaller cutoff of 450 eV.
It is obvious from these curves that the oscillations seem to be smallest when the overlap between the orbitals is largest, as in the parallel case and the two closest (D = 3.2, 3.4) sliding cases, versus the T case and the more distant sliding cases. Furthermore, the oscillations seem to hold consistently for the smaller and larger energy cutoffs, a result we find somewhat perplexing as, if they were sensitive to the grid, we would expect them to decrease with a larger cutoff (and consequently finer grid).
Finally, in Table 1 we quantify how the different methods perform in prediction of the relative energies of the various local minima, U 0 (T), U 0 (P) and U 0 (SP), for the T, parallel (P) and slipped-parallel (SP) configurations, respectively. We thus show the energy differences, ΔU(P/SP) = U 0 (P/SP) − U 0 (T), between the local minima for the parallel and slipped-parallel configurations, and the (presumed) global minimum for the T configuration. In all cases we fit quadratic curves to data to obtain a value as close to the true minimum as possible. We also take advantage of revised benchmark values from Takatani et al. Table 1: Relative energy differences, ΔU(P/SP) = U 0 (P/SP) − U 0 (T) [in kcal/mol], between lowest energies U 0 (T/P/SP) for the T, parallel (P) and slipped-parallel (SP) configurations of the benzene dimer, with respect to the minimum-energy T configuration. Here we use the revised benchmarks from Takatani et al. [21] for references, and to quantify the error in our main source of benchmark data [20]. Solid lines separate the different groupings of functionals used in this paper, which are ranked within each section according to |Error|. |Error| = 1/2[|ΔU(P) method − ΔU(P) revbench | + |ΔU(SP) method − ΔU(SP) revbench |]. [21] to establish errors in the main reference data used for the full potential curves.

ΔU
Here, FI and MBD are the best-performing methods, with absolute errors smaller than those for the older benchmark data set. Variants of XDM (LC-wPBE-XDM, B3LYP-XDM) and some Grimme methods are a little poorer, but are still very good. The vdW-DFAs and PBE-TS method can be quite poor, however, further reflecting their poorer treatment of dispersion energies and forces away from equilibrium.

Conclusion
In this work we used benchmark results for several configurations of the benzene dimer to test the ability of dispersioncorrected density functional theory to obtain accurate energies and forces away from equilibrium, and thus to understand their predictive capabilities. All the methods tested here are backed by previously reported successes on a wide range of chemical and/or materials systems. We have shown that many of them do not match these successes at equilibrium by guaranteed success outside it. In the worst cases, some methods have errors in the predicted forces as large as the force itself.
The exchange-hole dipole moment (XDM) model, which we tested with several DFAs, performs very well in general. PBE-MBD and PBE-FI (which incorporates an improved treatment of polarisabilities into an MBD-like calculation) both perform similarly well. We suspect any of these methods can be reliably trusted for predictions in systems involving benzene ring structures Grimme's various methods, TS theory, and various two-point van der Waals density functionals are less successful in our tests, however. M06L is surprisingly accurate for a meta-GGA without explicit dispersion corrections, but is numerically noisy [76]. We thus advise caution when using any of these methods for systems where interactions between ring structures might be important.
The results for SCAN-rVV10 are troubling. We suspect that the oscillations in the potential energy curves reflect previously reported problems with the integration grid [76]. We could, however, not remove them even with a large energy cutoff of 700 eV, just shy of the value used by its developers for rare gas solids [63]. Also, the results were very similar when a smaller energy cutoff was employed, hinting at a deeper underlying problem. This convergence issue is certainly something that should be investigated before dispersion-corrected SCAN is applied widely to weak-bonding problems.
The results reported here also strongly support the importance of using good polarizabilities (dipole and higher) in dispersion models. XDM, MBD and its FI variant include contributions from both the local density and geometry, and thus can capture type-A and -B non-additivity (the latter semi-locally in the case of XDM), in the classification scheme of Dobson [33]. By contrast, the other methods tested involve more simplistic treatment of environmental contributions to polarisabilities and dispersion coefficients.
Finally, we note that we have only tested one type of molecular dimer which means our conclusions are necessarily limited, despite the benzene dimer being an important and difficult example involving competing contributions to the interaction energies and forces. The results uncovered here are interesting enough, we feel, to establish an impetus to carry out further testing of dispersion forces away from equilibrium and to establish the role of effects beyond dipole pairs (including Axilrod-Teller-Muto terms, quadropole and higher multipole terms, and full many-body terms). We hope this work will drive development and use of new benchmarking sets, along these lines, which can test a wider range of physics and chemistry at and outside of equilibrium, and be used to improve future dispersion models.

Supporting Information
We provide all data generated through this project to allow other researchers to carry out their own analyses. This file contains all reference data used for this paper, stored in comma separated variables format with "#" used to indicate comments.

Supporting Information File 1
All reference data.