Information Content in Ribonuclease A: An Investigation of Protein Structure and Identification of New QSARs
This dissertation presents a newly identified quantitative relationship between the statistical information expressed by protein structure and the resulting enzymatic function. All molecules carry information in their structure of atoms and covalent bonds, and it is this structural information that confers the physical and chemical properties of the molecule. The goal of this project was to formulate a method for quantifying the structural information of protein molecules. A computational model was developed for this quantification, with Ribonuclease A (RNase A) as the model system.
In solution, molecules are subjected to numerous collisions powered by the thermal energy of Brownian motion. It is upon each Brownian collision that structural information is momentarily trapped and therefore communicated between molecules. In most cases, the result of a collision is not a reaction. However, the sequences of collisions resulting in no chemical reaction are the ones that communicate details of molecular structure. In a liquid environment, every molecule, enzyme or otherwise, offers a complex set of collision possibilities, which differentiates its chemical nature.
The computational model mimics collisions between a folded protein and an inert, colliding random walker. Every collision between the random walker and the protein molecule expresses information about a small fraction of the protein structure. The deviations from randomness in the random walk are imposed by the molecular lattice of the protein molecule. The model tracks the path of the random walker as it traverses the structure of the folded protein, and the structural data expressed by the protein at each collision are recorded. The collision sequences are then probed for Shannon and mutual information quantities.
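The procedure described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the cubic lattice, the periodic box, the rule that a blocked move counts as a collision, the use of residue labels as symbols, and the names `collision_sequence`, `shannon_entropy`, and `mutual_information` are all assumptions made for illustration. The two information measures are the standard empirical estimates computed from symbol counts.

```python
import math
import random
from collections import Counter

# The six unit moves on a cubic lattice (an assumed geometry).
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def collision_sequence(occupied, start, n_steps, box, seed=0):
    """Walk randomly in a periodic box and record a label at each collision.

    occupied: dict mapping lattice site (x, y, z) -> label, e.g. a residue name.
    A move into an occupied site is rejected and its label is recorded
    (a hypothetical collision rule).
    """
    rng = random.Random(seed)
    pos = start
    collisions = []
    for _ in range(n_steps):
        dx, dy, dz = rng.choice(STEPS)
        nxt = ((pos[0] + dx) % box, (pos[1] + dy) % box, (pos[2] + dz) % box)
        if nxt in occupied:
            collisions.append(occupied[nxt])  # walker stays in place
        else:
            pos = nxt
    return collisions


def shannon_entropy(symbols):
    """Empirical Shannon entropy, in bits, of a symbol sequence."""
    n = len(symbols)
    counts = Counter(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(symbols, lag=1):
    """Empirical mutual information, in bits, between symbols `lag` apart."""
    pairs = list(zip(symbols, symbols[lag:]))
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    return sum(
        (c / n) * math.log2(c * n / (left[a] * right[b]))
        for (a, b), c in joint.items()
    )
```

Given a dictionary of occupied sites, `collision_sequence` returns the ordered list of labels the walker struck. Passing that list to `shannon_entropy` and `mutual_information` yields the kind of quantities that could then be compared between a wild-type structure and a variant.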
This model is inexpensive in both computational time and power. In fact, all of the programs implemented in this model are written in the high-level languages PASCAL and BASIC. The only input data needed to model a protein by this method are the results of structure determination by X-ray experiment, as deposited in the Protein Data Bank (PDB).
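To illustrate how little input is required, the sketch below reads atomic coordinates from the fixed-column `ATOM` records of a PDB-format file. It is written in Python for brevity rather than in the PASCAL or BASIC of the original programs. The `Atom` type, the `parse_atoms` and `lattice_sites` helpers, and the rounding of coordinates onto a lattice with an assumed spacing are hypothetical choices, not the dissertation's procedure.

```python
from typing import Dict, List, NamedTuple, Tuple


class Atom(NamedTuple):
    name: str       # atom name, PDB columns 13-16
    res_name: str   # residue name, columns 18-20
    chain: str      # chain identifier, column 22
    res_seq: int    # residue sequence number, columns 23-26
    x: float        # orthogonal coordinates in angstroms,
    y: float        # columns 31-38, 39-46, 47-54
    z: float


def parse_atoms(pdb_text: str) -> List[Atom]:
    """Extract ATOM records from the text of a PDB-format file."""
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue  # skip headers, HETATM records, and so on
        atoms.append(Atom(
            name=line[12:16].strip(),
            res_name=line[17:20].strip(),
            chain=line[21],
            res_seq=int(line[22:26]),
            x=float(line[30:38]),
            y=float(line[38:46]),
            z=float(line[46:54]),
        ))
    return atoms


def lattice_sites(atoms: List[Atom],
                  spacing: float = 1.0) -> Dict[Tuple[int, int, int], str]:
    """Map each atom to its nearest lattice site (assumed spacing, in angstroms).

    If several atoms round to the same site, the last one read wins.
    """
    return {
        (round(a.x / spacing), round(a.y / spacing), round(a.z / spacing)): a.res_name
        for a in atoms
    }
```

The resulting dictionary of occupied sites, keyed by integer lattice coordinates and labelled by residue name, is one possible representation of the molecular lattice that a random walker could be run against.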
This investigation has yielded several results. Information properties were established for wild type (WT) Ribonuclease A, for site-substituted variants, and for enzyme-inhibitor complexes of the same enzyme. Comparison of wild type Ribonuclease A with the variants and inhibited complexes revealed clear distinctions on informational grounds. The unique information distribution of the wild type sample is altered in a signature way upon mutation or inhibitor binding. Furthermore, comparisons between the wild type and various mutants elucidated a pattern of change in information properties that distinguishes critical site residues from those where the chemical function is maintained upon mutation. Likewise, complexes with potent inhibitors were found to perturb the wild type information signature in a way that is distinct from the perturbation caused by weaker inhibitors.
The results of this project identify a new method for predicting critical site residues and potent inhibitors of enzymatic proteins. The model requires no a priori knowledge about the protein sample other than structure data from X-ray diffraction experiments. It can be used, for example, with no knowledge of the active site constituency. In fact, the model is itself an effective tool for identifying active site residues, which is useful because that identification is necessary to elucidate protein function.