In silico rationalisation of selectivity and reactivity in Pd-catalysed C–H activation reactions

A computational approach has been developed to automatically generate and analyse the structures of the intermediates of palladium-catalysed carbon–hydrogen (C–H) activation reactions and to predict the final products. Implemented as a high-performance computing cluster tool, it has been shown to choose the correct mechanism and rationalise the regioselectivity of selected examples from the open literature. The methodology can predict the reactivity of various substrates by differentiating between two major mechanisms: proton abstraction and electrophilic aromatic substitution. An attempt has been made to predict new C–H activation reactions. The methodology can also be used for automated reaction planning and as a starting point for microkinetic modelling.


Evaluation of exchange-correlation functionals for DFT calculations
As mentioned in the previous chapter, the choice of functional and basis set changes the calculated thermodynamic properties of molecules. To choose functionals suitable for calculating transition-metal-catalysed intermediates, palladium(II)-catalysed intermediates in particular, a cross-check of different functionals needs to be conducted.
Four exchange-correlation functionals were chosen for testing: ωB97X-D, B3LYP, TPSSh and M06-2X (Table 1). Ten examples from the Reaxys database were chosen to test these functionals at the density functional theory level. For those ten examples [1], the starting molecules, possible intermediates and final products were calculated with each of the four functionals.
ωB97X-D: The latest functional from Head-Gordon and co-workers, which includes a version of Grimme's D2 dispersion.
B3LYP [ii]: The hybrid functional using Becke's three parameters and the Lee-Yang-Parr functional.
These four functionals were applied in the DFT calculations and the results were compared. As can be seen in Table 2, for this palladium-catalysed system the choice of exchange-correlation functional has only a minor influence on the results. The most popular and well-developed functional, B3LYP [v], was therefore selected for further calculations in this project.

Geometry optimization and electronic structure determination at DFT level
Once the possible intermediates were generated based on the different mechanisms, a two-step optimization procedure was carried out in order to balance the computational time and the accuracy of the results.
First, the geometry of each generated intermediate was optimized for 5 iterations [2]. Structures lying more than 10 kcal·mol⁻¹ above the lowest-energy intermediate were then discarded, and the remaining structures were fully optimized and thermochemical corrections applied [3]. A geometry was considered stable and correct only if the calculation of force constants and vibrational frequencies returned no negative (imaginary) frequencies. For unstable intermediates, the algorithm automatically generated [4] another structure, using the previous calculation as the starting point for further optimization.
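The screening logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual code: the `Intermediate` class, the function names and the assumption that energies are stored in hartree are all hypothetical, and only the 10 kcal·mol⁻¹ window and the no-negative-frequency criterion come from the text.

```python
from dataclasses import dataclass, field

HARTREE_TO_KCAL = 627.509  # conversion factor, kcal/mol per hartree


@dataclass
class Intermediate:
    """Hypothetical container for one candidate intermediate."""
    name: str
    energy_hartree: float  # energy after the short pre-optimization
    frequencies_cm1: list = field(default_factory=list)  # set after full optimization


def prescreen(intermediates, window_kcal=10.0):
    """Keep only structures within window_kcal of the lowest-energy one."""
    e_min = min(i.energy_hartree for i in intermediates)
    return [
        i for i in intermediates
        if (i.energy_hartree - e_min) * HARTREE_TO_KCAL <= window_kcal
    ]


def is_minimum(intermediate):
    """True if frequencies are available and none of them is negative."""
    freqs = intermediate.frequencies_cm1
    return bool(freqs) and all(f >= 0.0 for f in freqs)
```

Structures that pass `prescreen` would go on to full optimization; those that then fail `is_minimum` would be regenerated from the previous geometry, as described in the text.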
All calculations were performed with the open-source NWChem software package [vi] using Becke's three-parameter hybrid B3LYP functional, with the molecular orbitals expanded in the all-electron split-valence double-zeta 6-31G basis set with added polarization functions [6-31G(d,p)] [vii]. The B3LYP functional has been shown to give an accurate description of geometries, frequencies, relative stabilities of different conformers and energy profiles, not only in the previous literature [viii] but also in the benchmarking studies done previously using the Gaussian 09 software.
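For illustration, an input deck for such a calculation might be generated as follows. This is a sketch under stated assumptions rather than the input actually used: the helper name `nwchem_input` and its parameters are hypothetical, the directives should be checked against the NWChem documentation, and the source does not state how palladium is treated, so the sketch simply applies one library basis to all atoms. The optional `max_iter` argument corresponds to the short pre-optimization step.

```python
def nwchem_input(name, atoms, max_iter=None, with_freq=False):
    """Return NWChem input text for a B3LYP/6-31G(d,p) geometry optimization.

    atoms: list of (symbol, x, y, z) tuples, coordinates in angstroms.
    max_iter: if given, cap the number of optimization steps.
    with_freq: if True, append a frequency (thermochemistry) task.
    """
    lines = [f"start {name}", f'title "{name}"', "geometry units angstroms"]
    lines += [f"  {s} {x:.6f} {y:.6f} {z:.6f}" for s, x, y, z in atoms]
    lines += [
        "end",
        "basis",
        "  * library 6-31G**",  # 6-31G(d,p) in the NWChem basis library
        "end",
        "dft",
        "  xc b3lyp",
        "end",
    ]
    if max_iter is not None:
        # Limit the geometry optimizer to a fixed number of steps
        lines += ["driver", f"  maxiter {max_iter}", "end"]
    lines.append("task dft optimize")
    if with_freq:
        lines.append("task dft freq")
    return "\n".join(lines) + "\n"
```

Calling `nwchem_input("h2o", atoms, max_iter=5)` would give the deck for a short pre-optimization, and `nwchem_input("h2o", atoms, with_freq=True)` the deck for a full optimization followed by a frequency calculation.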

Selectivity analysis
In the results-analysis step, a text-parsing and geometry-operation module, coded in Python, was developed to read and analyse the data from NWChem plain-text outputs. By comparing the Gibbs free energies of the intermediates, the most reactive site of a given structure can be predicted; it corresponds to the most stable intermediate. Moreover, the expected regioselectivity can be calculated from the relative energies of the intermediates using the Boltzmann distribution equation.
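In its standard form, the Boltzmann distribution gives the population of intermediate i as p_i = exp(−ΔG_i/RT) / Σ_j exp(−ΔG_j/RT), where ΔG_i is its Gibbs free energy relative to the most stable intermediate. A minimal sketch of this calculation is shown below. The function name, the site labels and the default temperature of 298.15 K are assumptions made for illustration, since the text does not state the temperature used.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def boltzmann_populations(gibbs_kcal, temperature=298.15):
    """Map {site: Gibbs free energy in kcal/mol} to {site: population}.

    Energies may be absolute or relative; they are shifted so that the
    most stable intermediate sits at zero before exponentiation.
    """
    g_min = min(gibbs_kcal.values())
    weights = {
        site: math.exp(-(g - g_min) / (R_KCAL * temperature))
        for site, g in gibbs_kcal.items()
    }
    total = sum(weights.values())
    return {site: w / total for site, w in weights.items()}
```

For two hypothetical sites separated by 1 kcal·mol⁻¹, `boltzmann_populations({"C2": 0.0, "C3": 1.0})` gives a ratio of roughly 84:16 in favour of the more stable intermediate at 298.15 K.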