Comparative ligand structural analytics illustrated on variably glycosylated MUC1 antigen–antibody binding

When investigating the preferential binding of a series of ligands to a known target, the answer is not always evident from the analysis of a single structure. An ensemble of structures generated from computer simulations is valuable; however, visual analysis of the extensive structural data can be overwhelming. Rapid analysis of trajectory data, with tools available in the Galaxy platform, can be used to understand key features and compare differences that indicate which ligand structure favors binding. We illustrate this informatics approach by investigating the in silico binding of a peptide and a glycopeptide epitope of the glycoprotein Mucin 1 (MUC1) to the antibody AR20.5. To study the binding, we performed molecular dynamics simulations using OpenMM and then used the Galaxy platform for data analysis. The same analysis tools were applied to each of the simulation trajectories, and this process was streamlined by using Galaxy workflows. The conformations of the antigens were analyzed using root-mean-square deviation (RMSD), end-to-end distance, Ramachandran plots, and hydrogen-bonding analysis. Additionally, root-mean-square fluctuation (RMSF) and clustering analyses were carried out. These analyses were used to rapidly assess key features of the system, interrogate the dynamic structure of the ligand, and determine the role of glycosylation in the conformational equilibrium. The glycopeptide conformations in solution change relative to those of the peptide; thus, partial pre-structuring is seen prior to binding. Although the bound conformations of the peptide and glycopeptide are similar, the glycopeptide fluctuates less and resides in specific conformers for more extended periods. This structural analysis, which gives a high-level view of the features in the system under observation, could be readily applied to other binding problems as part of a general strategy in drug design or mechanistic analysis.
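As a rough illustration of two of the per-frame metrics named above, the following sketch computes an end-to-end distance and an RMSD series for a toy trajectory. It is not the Galaxy tooling used in this work: the function names and the three-atom coordinates are hypothetical, and the RMSD here skips the superposition step that a real analysis would normally perform first.

```python
import math

def end_to_end_distance(frame, first=0, last=-1):
    """Distance between two chosen atoms (e.g. the terminal atoms) in one frame."""
    return math.dist(frame[first], frame[last])

def rmsd(frame, reference):
    """RMSD between two frames with identical atom ordering.

    Assumption: no rotational/translational fitting is done here.
    """
    if len(frame) != len(reference):
        raise ValueError("frames must contain the same number of atoms")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(frame, reference))
    return math.sqrt(squared / len(frame))

# Toy trajectory (hypothetical coordinates): 3 frames of 3 atoms each.
trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]

# One value per frame; the first frame serves as the RMSD reference.
distances = [end_to_end_distance(f) for f in trajectory]
rmsds = [rmsd(f, trajectory[0]) for f in trajectory]
```

Plotting such series for the peptide and the glycopeptide side by side is the kind of comparison the workflow is meant to make quick.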


The inputs, workflows and data for these simulations are available at https://github.com/chrisbarnettster/bjoc-paper-2020-sm.

Figure S1: A comparison of all Ramachandran analyses for all residues of the antigen (left column, A, B) and Tn-antigen (right column, C, D) in solution. The first row illustrates the φ-ψ angles for amino acid 2 of the peptide, proline, with a scatter plot showing the allowed φ-ψ regions highlighted in blue (A), and a probability density Ramachandran plot (B) for the antigen, and a scatter plot (C) and probability density Ramachandran plot (D) for the Tn-antigen. The last row is for amino acid 7, alanine. By definition, the φ-ψ of residue 1 and residue 8 cannot be calculated.

Figure S2: A comparison of Ramachandran analyses for the antigen (left column, A, B) and Tn-antigen (right column, C, D) bound to the antibody. The first row illustrates the φ-ψ angles for amino acid 2 of the peptide, proline, with a scatter plot showing the allowed φ-ψ regions highlighted in blue (A), and a probability density Ramachandran plot (B) for the antigen and a scatter plot (C) and probability density Ramachandran plot (D) for the Tn-antigen. The last row is for amino acid 7, alanine. By definition, the φ-ψ of residue 1 and residue 8 cannot be calculated.
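The reason the terminal residues have no φ-ψ pair can be seen from how the angles are defined: φ of residue i needs the carbonyl carbon of residue i−1, and ψ needs the amide nitrogen of residue i+1. The sketch below, which is only an illustration and not the tool used to produce the figures, computes backbone dihedrals from N, CA, and C coordinates. The helper names and the toy coordinates are hypothetical.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def phi_psi_pairs(backbone):
    """Return (phi, psi) per residue; None marks an angle that is undefined."""
    pairs = []
    for i, res in enumerate(backbone):
        # phi needs C of the previous residue, so it is undefined for residue 1.
        phi = (dihedral(backbone[i - 1]["C"], res["N"], res["CA"], res["C"])
               if i > 0 else None)
        # psi needs N of the next residue, so it is undefined for the last residue.
        psi = (dihedral(res["N"], res["CA"], res["C"], backbone[i + 1]["N"])
               if i < len(backbone) - 1 else None)
        pairs.append((phi, psi))
    return pairs

# Toy three-residue backbone (hypothetical coordinates, not simulation data).
backbone = [
    {"N": (0.0, 0.0, 0.0), "CA": (1.4, 0.0, 0.0), "C": (2.0, 1.3, 0.0)},
    {"N": (3.3, 1.4, 0.3), "CA": (4.0, 2.6, 0.5), "C": (5.4, 2.4, 0.1)},
    {"N": (6.2, 3.4, 0.6), "CA": (7.6, 3.4, 0.3), "C": (8.2, 4.7, 0.9)},
]
pairs = phi_psi_pairs(backbone)
```

Only residues with both angles defined can be placed on a Ramachandran plot, which for an eight-residue peptide leaves residues 2 to 7.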

Ramachandran analysis for antigens bound to the antibody
The Ramachandran plots for each amino acid (Figure S2) show unimodal distributions with a single preference in φ and ψ. The second and seventh residues are exceptions: Pro minimally samples the negative ψ region, and this is observed less for the Tn-antigen, while Ala explores the top left of the β region (−160, 160). The specific φ-ψ preferences of the antigen when bound to the antibody can be compared to the solution distributions (compare Figures S1 and S2). In some cases, the preference stays the same and reduced flexibility is observed, for example, Pro2. In other cases, the conformational preferences shift on binding, but this shows no correlation to the effect of glycosylation, for example, Pro6 and Ala7. Finally, in some cases the conformational preference seen for glycosylation in solution aligns with the preference seen for both bound antigens, for example, Asp3 and

Hydrogen bonding
Tables S1-S7 contain hydrogen-bonding results from the MDAnalysis hydrogen-bond analysis tool. This tool conveniently summarizes the hydrogen-bond occupancy as a percentage but is not very specific about the hydrogen-bond contact (VMD hydrogen-bond analysis was used for further analysis, and some specifics are detailed in the main text). 'Main' represents the main chain (i.e., carbonyl and amino group), while 'Side' represents the side chain. For proline, which is cyclic, 'Side' refers to the amine group, which has cyclized and is viewed as part of the side chain.
AGA1 is the sugar (GalNAc). AGA1Main and AGA1Side are not specific as there are multiple hydrogen-bonding donor and acceptor sites in the sugar.

Table S7: Hydrogen-bonding interactions between the Tn-peptide antigen and chain B of the AR20.5 antibody.
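The occupancy percentages discussed in this hydrogen-bonding section can be read as the share of trajectory frames in which a given donor-acceptor pair is detected. The sketch below shows one simple way to compute such a figure from per-frame contact lists. It is an assumption about the bookkeeping, not the MDAnalysis implementation, and the function name and group labels are placeholders rather than results from these simulations.

```python
from collections import Counter

def hbond_occupancy(frames):
    """Percentage of frames in which each labelled pair forms a hydrogen bond.

    `frames` is a list with one entry per frame; each entry is a list of
    (donor_label, acceptor_label) tuples detected in that frame.
    """
    n_frames = len(frames)
    if n_frames == 0:
        return {}
    counts = Counter()
    for pairs in frames:
        # A pair is counted once per frame, even if several atoms within the
        # same labelled group (e.g. multiple sugar hydroxyls) form contacts.
        counts.update(set(pairs))
    return {pair: 100.0 * c / n_frames for pair, c in counts.items()}

# Toy input with placeholder labels (not data from the paper).
frames = [
    [("GroupA-Side", "GroupX-Main")],
    [("GroupA-Side", "GroupX-Main"), ("GroupB-Main", "GroupY-Side")],
    [],
    [("GroupA-Side", "GroupX-Main"), ("GroupA-Side", "GroupX-Main")],
]
occupancy = hbond_occupancy(frames)
```

Because contacts are collapsed to group labels such as 'Main' and 'Side', this kind of summary cannot say which atom of the sugar is involved, which is why a more specific follow-up analysis is useful.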