The question of D3 versus D4 comes up often in methods discussions. It tends to generate more debate than the accuracy difference warrants, and occasionally too little when the difference actually matters. This post tries to put the comparison on a quantitative footing for the specific use cases that arise in catalyst binding energy prediction and reaction energy calculation.
Both corrections are post-SCF additions: they do not modify the self-consistent field procedure. The dispersion energy E_disp is added to the DFT electronic energy after the SCF has converged. The difference is in how the C₆ coefficients (the leading-order dispersion coefficients) are computed and how they respond to the chemical environment.
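For orientation, the two-body part of the correction in its Becke-Johnson damped form can be written as below. This is the standard D3(BJ) expression from the literature rather than anything specific to this post; s_n, a_1, and a_2 are the functional-specific damping parameters covered in the damping section further down, and R_0^{AB} is a cutoff radius derived from the coefficients themselves.

```latex
E_\text{total} = E_\text{DFT} + E_\text{disp},
\qquad
E_\text{disp}^{(2)} = -\sum_{A<B} \sum_{n=6,8}
  s_n \, \frac{C_n^{AB}}{R_{AB}^{\,n} + \left(a_1 R_0^{AB} + a_2\right)^{n}},
\qquad
R_0^{AB} = \sqrt{\frac{C_8^{AB}}{C_6^{AB}}}
```

The damping term in the denominator keeps the correction finite as R_{AB} goes to zero, which is where the density functional already describes correlation on its own.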
D3 and D4: What Actually Changed Between Versions
Grimme’s D3 (2010; the Becke-Johnson damped variant D3(BJ) followed in 2011) uses precomputed reference C₆ coefficients that are interpolated by element and by a geometry-based fractional coordination number. The coordination number captures some environmental dependence (a carbon in benzene has a different C₆ than a carbon in methane), but the coefficients do not respond to partial charge or oxidation state.
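To make the coordination number concrete, here is a minimal sketch of the D3-style counting function applied to methane. The constants k1 = 16 and k2 = 4/3 follow the original D3 definition as I understand it; the covalent radii are illustrative values and the function name is mine, so treat the numbers as a demonstration rather than a reference implementation.

```python
import math

# Single-bond covalent radii in angstrom. Illustrative values: check them
# against the table your D3 implementation actually uses.
COVALENT_RADIUS = {"C": 0.75, "H": 0.32}

K1 = 16.0        # steepness of the counting function
K2 = 4.0 / 3.0   # scaling applied to the sum of covalent radii


def coordination_number(index, symbols, coords):
    """Fractional coordination number of atom `index` (D3-style counting)."""
    cn = 0.0
    for j, (sym, xyz) in enumerate(zip(symbols, coords)):
        if j == index:
            continue
        r_ab = math.dist(coords[index], xyz)
        r_cov = K2 * (COVALENT_RADIUS[symbols[index]] + COVALENT_RADIUS[sym])
        # Smooth switch: close to 1 for bonded pairs, close to 0 for distant ones.
        cn += 1.0 / (1.0 + math.exp(-K1 * (r_cov / r_ab - 1.0)))
    return cn


# Methane: carbon at the origin, four hydrogens at 1.09 angstrom.
d = 1.09 / math.sqrt(3.0)
symbols = ["C", "H", "H", "H", "H"]
coords = [
    (0.0, 0.0, 0.0),
    (d, d, d),
    (d, -d, -d),
    (-d, d, -d),
    (-d, -d, d),
]

cn_carbon = coordination_number(0, symbols, coords)
cn_hydrogen = coordination_number(1, symbols, coords)
print(f"CN(C) = {cn_carbon:.2f}, CN(H) = {cn_hydrogen:.2f}")
```

The carbon comes out just under 4 and each hydrogen just under 1. Nothing in this function sees the charge on the atom, which is the gap D4 addresses.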
D4 (2019) introduces two principal changes:
- Charge-dependent C₆ coefficients: D4 computes partial atomic charges via a classical electronegativity equilibration (EEQ) scheme and uses them to scale the reference atomic polarizabilities from which the C₆ coefficients are built. A Pd(0) and a Pd(II) atom with the same coordination number therefore receive different dispersion coefficients in D4; in D3, they receive identical values.
- Three-body Axilrod–Teller–Muto (ATM) term: D4 includes the leading-order three-body dispersion term by default. For most molecular systems at practical densities, the ATM contribution is small (typically <1% of total dispersion energy), but for very dense packing — stacked aromatic systems, cage compounds, supramolecular assemblies — it can contribute 0.5–2 kcal/mol to binding energies.
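The charge dependence in the first bullet can be sketched with a small scaling function. The functional form below follows my reading of the D4 papers; the constant BETA_1, the effective nuclear charge, and the hardness value are placeholders chosen for illustration, not fitted element parameters, and the function name is hypothetical.

```python
import math

BETA_1 = 3.0  # global scaling constant (assumed value, see lead-in)


def charge_scaling(z_eff, q, q_ref, hardness):
    """Scale factor applied to a reference polarizability.

    z_eff     effective nuclear charge of the element
    q         partial charge of the atom in the actual molecule
    q_ref     partial charge of the atom in the reference system
    hardness  element-specific steepness parameter
    """
    z = z_eff + q
    z_ref = z_eff + q_ref
    return math.exp(BETA_1 * (1.0 - math.exp(hardness * (1.0 - z_ref / z))))


# Placeholder inputs: same element, three different partial charges.
neutral = charge_scaling(z_eff=18.0, q=0.0, q_ref=0.0, hardness=0.8)
cation = charge_scaling(z_eff=18.0, q=0.5, q_ref=0.0, hardness=0.8)
anion = charge_scaling(z_eff=18.0, q=-0.5, q_ref=0.0, hardness=0.8)

print(f"neutral {neutral:.3f}, cation {cation:.3f}, anion {anion:.3f}")
```

The factor is exactly 1 when the atom carries the same charge as its reference, drops below 1 for a more positive atom, and rises above 1 for a more negative one. That matches the physical expectation that cations are less polarizable than the corresponding neutral atoms.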
Systematic Comparison on Catalyst-Relevant Systems
We evaluated D3(BJ) and D4 against coupled-cluster reference values on three benchmark categories directly relevant to transition metal catalyst binding energy prediction:
Pd-Phosphine Ligand Dissociation Energies (12 complexes)
Pd(0)L₂ → Pd(0)L + L, with L = phosphine or NHC ligand, computed at the PBE0 level with the def2-TZVP basis set. Reference: DLPNO-CCSD(T)/aug-cc-pVTZ.
- PBE0-D3(BJ): MAE 1.6 kcal/mol
- PBE0-D4: MAE 1.1 kcal/mol
- Largest improvement: bulky NHC ligands (IPr, SIPr), where D4’s charge-dependent coefficients better capture the polarized N–C bonds in the NHC carbon donor. Improvement: 0.8–1.4 kcal/mol per case.
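For readers who want to run the same kind of comparison on their own data, the MAE figures are plain mean absolute deviations from the reference energies. A minimal sketch follows; the three energies per method are made-up placeholders, not values from this benchmark.

```python
def mean_absolute_error(predicted, reference):
    """Mean absolute deviation between two equally long lists of energies."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(reference)


# Placeholder dissociation energies in kcal/mol (NOT the benchmark data).
reference = [30.0, 35.0, 42.0]
method_a = [28.5, 36.0, 40.0]
method_b = [29.5, 35.5, 41.0]

mae_a = mean_absolute_error(method_a, reference)
mae_b = mean_absolute_error(method_b, reference)
print(f"MAE(method A) = {mae_a:.2f} kcal/mol")
print(f"MAE(method B) = {mae_b:.2f} kcal/mol")
```

Because MAE averages over the whole set, it can hide a few large per-case improvements such as the NHC cases noted above, so it is worth looking at the individual deviations as well.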
S30L Host-Guest Binding Energies (30 complexes)
S30L is a benchmark set of supramolecular host-guest complexes where London dispersion dominates binding. Energies were computed at the TPSS level; TPSS relies heavily on the dispersion correction for non-covalent interactions.
- TPSS-D3(BJ): MAE 1.84 kcal/mol
- TPSS-D4: MAE 1.31 kcal/mol
- The ATM three-body term alone accounts for 0.3–0.5 kcal/mol of the improvement for large, concave hosts (cyclodextrin-type complexes)
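The ATM term depends on the shape of each atom triple: it is repulsive for near-equilateral arrangements and attractive for near-linear ones. The sketch below evaluates the undamped expression for a single triple, approximating C₉ as the square root of the product of the three pair C₆ values. Production codes also apply a damping function, which is left out here, and the unit values are arbitrary.

```python
import math


def atm_energy(c6_ab, c6_bc, c6_ca, r_ab, r_bc, r_ca):
    """Undamped Axilrod-Teller-Muto energy for one atom triple.

    C9 is approximated as sqrt(C6_AB * C6_BC * C6_CA). Damping is omitted.
    """
    # Interior angles of the triangle from the law of cosines.
    cos_a = (r_ab**2 + r_ca**2 - r_bc**2) / (2.0 * r_ab * r_ca)
    cos_b = (r_ab**2 + r_bc**2 - r_ca**2) / (2.0 * r_ab * r_bc)
    cos_c = (r_bc**2 + r_ca**2 - r_ab**2) / (2.0 * r_bc * r_ca)

    c9 = math.sqrt(c6_ab * c6_bc * c6_ca)
    angular = 3.0 * cos_a * cos_b * cos_c + 1.0
    return c9 * angular / (r_ab * r_bc * r_ca) ** 3


# Arbitrary units: unit C6 values, unit nearest-neighbour distance.
equilateral = atm_energy(1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
linear = atm_energy(1.0, 1.0, 1.0, 1.0, 1.0, 2.0)
print(f"equilateral: {equilateral:+.3f}, linear: {linear:+.3f}")
```

The equilateral triple gives a positive (repulsive) contribution and the linear one a negative (attractive) contribution, so the net three-body effect in a real complex depends on how its atom triples are distributed between these limits.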
BH76 Reaction Barrier Heights (76 barriers)
BH76 covers hydrogen transfer and heavy-atom transfer barriers: organic systems, no transition metals. With the ωB97X family of range-separated hybrids:
- ωB97X-D3: MAE 1.62 kcal/mol (note: ωB97X-D3 is a reparameterization of ωB97X-D in which the D3 term is fitted together with the functional; the original ωB97X-D carries an older, D2-type correction)
- ωB97X-D4: MAE 1.59 kcal/mol
- Difference: statistically indistinguishable. For barrier heights of neutral organic reactions, D3 and D4 give essentially the same result.
Where the Difference Is and Is Not Meaningful
The pattern across these and other benchmarks is consistent:
D4 shows meaningful improvement (>0.3 kcal/mol MAE reduction) in: charged species, large polarizable systems (extended π-systems, bulky organometallics), and noncovalent complexes where specific interactions create substantial charge asymmetry. The charge-dependent C₆ in D4 captures the physical reality that the polarizability of an atom changes with its charge state, which D3 ignores entirely.
D3 and D4 give equivalent results for: neutral, localized-charge organic reaction barriers; atomization and reaction energies for small molecules; organometallic bond energies involving first-row TMs with minimal formal charge variation across the reaction coordinate.
The Practical Cost of D4
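Both D3 and D4 are computed post-SCF and add negligible wall time (<0.1% for typical single-point calculations). ORCA and Turbomole support both natively. Gaussian 16 offers D3 and D3(BJ) through its EmpiricalDispersion keyword; to use D4 with Gaussian geometries and energies, the correction can be evaluated separately with the standalone dftd4 program. For most workflows, there is no computational cost argument for choosing D3 over D4.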
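In ORCA, for instance, switching between the two corrections is a one-keyword change. The sketch below builds two otherwise identical single-point inputs. The helper name orca_input and the water geometry are placeholders of mine; D3BJ and D4 are ORCA's simple-input keywords for the two corrections.

```python
def orca_input(functional, dispersion, basis, charge, multiplicity, xyz_block):
    """Return the text of a minimal ORCA single-point input file."""
    allowed = {"D3BJ", "D4"}
    if dispersion not in allowed:
        raise ValueError(f"dispersion must be one of {sorted(allowed)}")
    return (
        f"! {functional} {dispersion} {basis}\n"
        f"* xyz {charge} {multiplicity}\n"
        f"{xyz_block.strip()}\n"
        "*\n"
    )


# Placeholder geometry (water), in angstrom.
geometry = """
O   0.000000   0.000000   0.117300
H   0.000000   0.757200  -0.469200
H   0.000000  -0.757200  -0.469200
"""

input_d3 = orca_input("PBE0", "D3BJ", "def2-TZVP", 0, 1, geometry)
input_d4 = orca_input("PBE0", "D4", "def2-TZVP", 0, 1, geometry)
print(input_d4)
```

Generating both inputs from one function keeps everything except the dispersion keyword identical, which is what a clean D3 versus D4 comparison needs.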
The one practical reason to stay with D3(BJ) is literature comparability: if you’re reproducing or extending a study that used a specific functional+D3(BJ) parameterization, changing to D4 introduces a methodological inconsistency. In that case, matching the literature level is the right choice, and D3(BJ) is explicitly appropriate.
We’re not saying D3 is wrong — it’s a well-validated method that performs well for a wide range of systems. We’re saying that when you’re starting a new study and choosing between the two, D4 is the better default for catalyst-relevant applications without any cost penalty.
Functional-Specific Damping Parameters
Both D3 and D4 require functional-specific damping parameters to avoid double-counting of short-range correlation. These parameters are fitted separately for each functional and cannot be freely mixed. Using D3(BJ) parameters from PBE0 with the B3LYP functional, or applying D4 without its specifically fitted parameters for a given functional, produces worse results than either correctly parameterized version alone.
Check that your software is applying the correct parameter set for your chosen functional+correction combination. ORCA’s output explicitly lists the D3/D4 parameters used; Gaussian’s dispersion output requires more careful inspection. A common error in calculations posted to supporting information files is running B3LYP with D3 parameters from a different functional — the error shows up as ~0.5–1.0 kcal/mol systematic offset in binding energies but can be hard to trace if you’re not looking for it.
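One way to automate this check is to compare the parameters your program reports against the expected set for the functional. The sketch below assumes you have already extracted the reported values into a dictionary; that parsing step is program-specific and left out. The D3(BJ) values for B3LYP and PBE0 are quoted from the published parameter tables as I recall them, so verify them against the official source before relying on this, and note that the function name is hypothetical.

```python
# D3(BJ) damping parameters (assumed values; confirm against the official tables).
EXPECTED_D3BJ = {
    "B3LYP": {"s6": 1.0, "a1": 0.3981, "s8": 1.9889, "a2": 4.4211},
    "PBE0": {"s6": 1.0, "a1": 0.4145, "s8": 1.2177, "a2": 4.8593},
}


def mismatched_parameters(functional, reported, tol=1e-4):
    """Return {name: (reported, expected)} for every parameter that disagrees."""
    expected = EXPECTED_D3BJ[functional]
    problems = {}
    for name, value in expected.items():
        got = reported.get(name)
        if got is None or abs(got - value) > tol:
            problems[name] = (got, value)
    return problems


# Example: a B3LYP job that was accidentally run with the PBE0 parameter set.
reported_by_program = {"s6": 1.0, "a1": 0.4145, "s8": 1.2177, "a2": 4.8593}
problems = mismatched_parameters("B3LYP", reported_by_program)
for name, (got, want) in problems.items():
    print(f"{name}: program used {got}, expected {want} for B3LYP")
```

An empty result means the reported set matches the table entry. Anything else is worth tracing before the energies go into a paper or a supporting information file.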
Recommendation for Catalyst Binding Energy Screening
For Qchemvyx workflows involving Pd, Ir, and Ru catalyst binding energies, we use D4 as the default dispersion treatment for hybrid functionals such as B3LYP and PBE0. The exception is the ωB97X family: there we keep the native D3 variant (ωB97X-D3) unchanged, because the functional was fitted together with its dispersion term and the two cannot be separated from its exchange-correlation design.
For screening applications where you’re comparing relative binding energies within a structurally homogeneous series (all the same ligand class, same metal), the D3/D4 difference will typically be smaller than the functional error and smaller than the basis set incompleteness error. In those cases, consistency within the series matters more than which version you choose.