Applications of Large-Scale Computers and Computer Graphics



11

Applications of Large-Scale Computers and Computer Graphics to Investigations of Biological Macromolecular Structure, Function, and Evolution

ARTHUR M. LESK

Downloaded by EAST CAROLINA UNIV on April 11, 2018 | https://pubs.acs.org Publication Date: November 6, 1981 | doi: 10.1021/bk-1981-0173.ch011

Fairleigh Dickinson University, Teaneck, NJ 07666

KARL D. HARDMAN

IBM Corporation, Thomas J. Watson Research Center, Yorktown Heights, NY 10598

Crystal-structure determinations provide atomic coordinates of proteins, nucleic acids, and viruses. Computational studies of these data, using both purely numerical techniques and interactive graphics, seek the principles of structure, dynamics, function and evolution of living systems at the molecular level. How can the new generation of computers and the new generation of molecular biologists interact most effectively?

Certain algorithms and software now in use are mature, and will be applicable to a library of structures that is increasing progressively in scope and quality. We anticipate two effects of the introduction of very large, fast computers: Certain tasks, in which simple computational power is the limiting resource, will become feasible. Other tasks, which must now be run in "batch" mode, will achieve sufficiently fast execution times to make it possible to run them interactively.

To achieve the optimal division of labor between human and computer, it will be necessary to improve the channels of communication between them. This will require careful design of:
(1) The structure of the data base. It must have the flexibility to assimilate the results of investigations in progress.
(2) Interactive graphics systems. The increased power of host computers and display devices can easily overwhelm the human participant in the interactive execution of a program.
Current computational investigations of proteins treat:
(1) Structures. The identification of paradigms of conformation and the study of their evolution.
(2) Thermodynamic stability. A protein as a three-dimensional jigsaw puzzle; its sidechains fit together snugly, excluding water.
(3) The pathway by which proteins fold spontaneously.
(4) Flexibilities of conformations.
(5) Interactions with small molecules, other proteins, and other macromolecules.
We shall discuss the effects of increasing size and power of computers on our ability to address these problems.

0097-6156/81/0173-0143$05.00/0 © 1981 American Chemical Society

Lykos and Shavitt; Supercomputers in Chemistry ACS Symposium Series; American Chemical Society: Washington, DC, 1981.


Background

X-ray crystallographers have now determined the structures of approximately one hundred biological macromolecules (proteins, nucleic acids, and viruses) to atomic resolution. These investigations have demonstrated that, unlike synthetic polymers, the biological molecules have specific three-dimensional conformations. Indeed, all information required to specify the structure of a protein is contained in the sequence of amino acids, and therefore the structure is also implicit in the sequence of nucleotides in the DNA or RNA genome. Analysis of the structures has provided explanations of their biological functions, and has revealed that there are recurrent architectural themes in their design (1, 2).

It is worth emphasizing the thermodynamic dilemma that nature has faced in generating, in the proteins, a set of molecules such that (1) each one will take up a specific conformation (under appropriate conditions of solvent and temperature), so that it will have reliable and reproducible functional properties, but (2) the same basic chemical structure must be compatible with the spontaneous formation of a great variety of molecular structures and functions, so that the molecules can evolve by means of small changes. The potential for variety requires a flexible chain that can fold in many possible patterns. Each individual molecule is thereby forced to pay a high thermodynamic price for the fixation of its degrees of internal rotational freedom in the active structure.
The thermodynamic stability of the functional states of biopolymers is achieved through subtle interactions among subunits and between the polymer and the solvent. Much of the succeeding discussion will emphasize the case of globular proteins, because many more structures of this class are available, and because the quantity and variety of computational studies of proteins are especially large.

The sets of atomic coordinates of protein structures provide the raw material for a number of investigations aimed at elucidating the principles of protein architecture, the mechanism of folding, the dynamics of the structures (including the mechanism of function, which may be thought of as the dynamics of the interaction among proteins, substrates and cofactors) and the mechanism of protein evolution.

The purpose of this article is to analyze and classify the kinds of studies now in progress in our own and other laboratories, and the kinds of questions that people would like to ask but are currently unable to answer. Our basic question will be: What computational tools will produce the most effective progress of molecular biology? We feel that what is required is not only increased power, but increased sophistication in the design of the channels of access to this increased power.

Most current studies are largely descriptive: What do proteins look like? In which respects and to what extent do two structures resemble each other? Other studies are analytic or predictive: Can we calculate the forces that stabilize the structure? If so, can we use our knowledge of these forces to predict the thermodynamically stable state of a protein from its amino acid sequence? (The spontaneity of the formation of the native state proves that nature has an algorithm for this process.)

All these investigations depend on both high computer power and sophisticated software. For example, conformational energy calculations have the [ultimate and as yet unrealizable] goal of determining the global minima of nonlinear functions of large numbers of variables: the values of the thousands of atomic coordinates that correspond to the minimum conformational free energy (3, 4). It is unsurprising that no general solution is achievable even with the expenditure of large amounts of computer time. More restricted simulations, in which the molecule stays in a local region of its phase space, have produced interesting results, but even these require heavy calculations (5, 6).

The descriptive, classificatory and comparative approach to analysis of structures also depends on computing. It is no longer feasible to pursue these studies with physical models. Beyond the obvious mechanical problems, there is the logistical catastrophe: the amount of space and materiel increases linearly with the number of structures to be examined. In addition, there is no way to save and restore structures. It is therefore necessary to apply computer graphics to draw representations of structures.
The importance of supercomputers in studies of this kind will lie not only in the feasibility of large calculations that are not possible at present, but in the conversion of many tasks from batch mode to interactive.

Although the distinction between analytic and predictive studies is useful, we do not suggest that it is possible to divide the field into graphics problems on the one hand and "dark" number-crunching problems on the other. Graphics is necessary to report the results of dynamics calculations: the computer-generated movies of R. Feldmann, in collaboration with M. Levitt and M. Karplus, show how difficult it would be to extract the information they contain from a program in any other way. And, of course, graphics is useful in preparing and checking the initial state of a system prior to such a calculation, and in monitoring its progress. Conversely, the analysis of a static structure can involve extensive numerical calculations, especially those involving the description of the occlusion of internal surfaces and the extent of accessibility of different residues to solvent (7, 8).

There emerges from these considerations a kind of triangular structure of the activities: at one corner is the scientist, at a second the "brute" force of a large and powerful computer, and at the third are graphics devices. One theme of this article is that a properly designed system must pay attention to the special talents of each member of this partnership, dividing the labor in an optimal way, and must provide channels of communication of adequate capacity among them.
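The solvent-accessibility calculations mentioned above give a feel for why the analysis of even a static structure is numerically heavy. The following is a minimal sketch, not the algorithm of the cited studies: it uses a numerical sphere-sampling approach in the style of Shrake and Rupley, in which test points are placed on an expanded sphere around each atom and counted as accessible if no neighboring atom covers them. The function names, the 1.4 Å probe radius, the number of test points, and the atomic radii in the usage example are all illustrative assumptions.

```python
import math


def sphere_points(n):
    """Return n roughly evenly spaced points on the unit sphere (golden-spiral layout)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(max(0.0, 1.0 - z * z))
        theta = golden_angle * i
        points.append((r * math.cos(theta), r * math.sin(theta), z))
    return points


def accessible_areas(atoms, probe=1.4, n_points=200):
    """Estimate the solvent-accessible area of each atom, in square angstroms.

    atoms is a list of (x, y, z, radius) tuples in angstroms. The probe radius
    (assumed 1.4 A, a conventional value for water) is added to every atomic radius.
    """
    unit = sphere_points(n_points)
    areas = []
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        expanded = ri + probe
        exposed = 0
        for ux, uy, uz in unit:
            px = xi + expanded * ux
            py = yi + expanded * uy
            pz = zi + expanded * uz
            buried = False
            for j, (xj, yj, zj, rj) in enumerate(atoms):
                if j == i:
                    continue
                cutoff = rj + probe
                # A test point is buried if it lies inside a neighbor's expanded sphere.
                if (px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2 < cutoff * cutoff:
                    buried = True
                    break
            if not buried:
                exposed += 1
        areas.append(4.0 * math.pi * expanded * expanded * exposed / n_points)
    return areas


if __name__ == "__main__":
    # Two hypothetical carbon-like atoms (radius 1.7 A assumed) 1.5 A apart.
    print(accessible_areas([(0.0, 0.0, 0.0, 1.7), (1.5, 0.0, 0.0, 1.7)]))
```

The triple loop costs on the order of (number of atoms)² × (number of test points) operations, which is the kind of compute-bound work the text has in mind; practical programs reduce it with neighbor lists.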


Supercomputers offer potential for progress in the field of computational molecular biology, in that they can permit more extensive experiments in the compute-bound areas such as molecular dynamics, and quicker response times for more complicated tasks in interactive work. But the increased power they promise creates challenges for the system designer, to enable the scientist to utilize the power most effectively. With many tasks, the larger computational power generates a larger quantity of results. If these are to remain comprehensible, they must be presented to the scientist in an intelligible form, and in a manageable size. This will generally require pictorial output (drawings of structures, or graphs or charts) rather than pages of numbers.

Plan of This Presentation

After a brief review of some basic vocabulary useful in describing protein structure, we should like to discuss the following:
(1) Computational aspects of structure determination of biological macromolecules. This has important implications about the expected quality of the final results.
(2) What kinds of questions do we and others want to study? What kinds of tools do we need to help us answer them?
(3) A survey of existing software for analysis of structure, function and dynamics.
(4) Design considerations for systems based on supercomputers.

Protein Conformation

Chemically, proteins contain linear chains of amino acids, linked by peptide bonds. The twenty side chains that can occur in globular proteins differ in size, shape, charge, and polarity. A protein is in some respects like a jigsaw puzzle, in which the pieces of the molecule fit together in specific ways to create the native structure.

The peptide linkages between amino acids form the primary structure of the protein. The primary structure is all that the nucleotide sequence of the genetic material determines, and therefore the primary structure contains all information necessary to specify the complete three-dimensional conformation.

Because the peptide group tends to be planar, the backbone of the protein has two angles of internal rotation per monomer unit. Side chains contribute other degrees of conformational freedom. Steric interactions limit the allowed ranges of the conformational angles of the backbone. Within the allowed regions of conformation space, certain structures are stabilized by hydrogen bonding and other interactions. These include the α-helix and β-sheet. These types of structures are therefore utilizable as standard subunits in protein structures. In some proteins, disulfide bonds between cysteine residues hold two regions of the chain together. The hydrogen-bonded structural units, plus the disulfide bonds, constitute the secondary structure of the protein.

In the complete structure, secondary-structured regions are assembled into compact units. The interactions between secondary structural units seem to be stereochemically specific. This is called the tertiary structure of the protein. (Certain common patterns of interaction between secondary structural units are known as supersecondary structures.)

The assembly of two or more independently-folding polypeptide chains into a complete protein is known as its quaternary structure.

There is some reason to suspect, although this assumption has not been proved, that proteins fold through the formation of secondary structural units, which assemble by accretion, each combination of structured units lending additional stability to the nascent globule (9, 10). This suggests that a possible approach to the prediction of a protein structure might be to attempt to predict secondary structure first, and then search for favorable interactions between units (11, 12). It should be emphasized, however, that any failure of such calculations does not disprove the model for folding, and any success would not confirm it.
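The two backbone angles of internal rotation per residue described above are dihedral (torsion) angles, each defined by four consecutive backbone atoms. As an illustration only, the sketch below computes a signed dihedral angle from four points using the standard atan2 formula. The function name and the test coordinates are hypothetical; the identification of the two angles with the atom sequences C(i-1), N, Cα, C (usually called φ) and N, Cα, C, N(i+1) (usually called ψ) is the conventional definition and is not stated in the text.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond.

    Convention: 0 degrees when p0 and p3 eclipse (cis), 180 degrees when they
    are opposite (trans). Raises ValueError if p1 and p2 coincide.
    """
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    length_b2 = math.sqrt(_dot(b2, b2))
    if length_b2 == 0.0:
        raise ValueError("central bond has zero length")
    n1 = _cross(b1, b2)  # normal to the plane of p0, p1, p2
    n2 = _cross(b2, b3)  # normal to the plane of p1, p2, p3
    y = length_b2 * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Hypothetical coordinates with the central bond along the z axis.
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Applied to every residue of a set of atomic coordinates, a function of this kind yields the pair of backbone angles whose allowed ranges are limited by the steric interactions discussed above.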
Sources of the Stability of Proteins Under Physiological Conditions of Solvent and Temperature

Several factors compensate the protein-solvent system for the loss of chain entropy required by folding:
(1) Hydrogen bonding. Because peptide units form hydrogen bonds to water in the unfolded state, peptide-peptide hydrogen bonds must form in the folded state to avoid an otherwise intolerable loss of enthalpy (13).
(2) The globular structures of proteins reduce the amount of surface area of the residues that can come into contact with solvent. Although water itself is a relatively highly ordered liquid, water molecules surrounding a solute molecule can be even more highly ordered than they are in pure water. Therefore the release of water upon burying residues in the interior of proteins contributes a positive entropy change to the folding process (14).
(3) A fairly dense packing of the atoms in the protein interior enhances the contribution of van der Waals interactions to the stability (15).

Structure Determination

We are concerned primarily with investigations that begin where structure determination leaves off. However, we must consider the quality of the results of structure determinations, which prescribes the kinds of computations we can credibly undertake.

The basic steps in a contemporary protein structure determination are these (16):
(1) The material is isolated in its "native" form.
(2) Crystals of the material are grown, and isomorphous derivatives are prepared. (The derivatives differ from the parent structure by the addition of a small number of heavy atoms at fixed positions in each, or at least most, unit cells. The size and shape of the unit cells of the parent crystal and the derivatives must be the same, and the derivatization must not appreciably disturb the structure of the protein.) The relationship between the X-ray diffraction patterns of the native crystal and its derivatives provides information used to solve the phase problem.
(3) Sets of X-ray diffraction data are collected on native crystals and derivatives.
(4) By combining the intensity patterns of diffraction from native crystals and derivatives, it is possible to generate a rough map of the electron density distribution. This will not be expected to have well-resolved atomic peaks. However, in favorable cases it will be possible to trace most of the chain; and certain prominent sidechains should be visible, for example, tyrosine "lollypops".
(5) A model of the structure must be fit to the map.
This used to be done with the "Richards box", a device containing a half-silvered mirror to give the illusion of superposition of a physical model and the electron density map (conventionally contoured in serial sections, traced onto transparent film, and stacked) (17). Recently, some protein structures have been determined using interactive computer graphics to fit a stick model of a structure to an electron density map (18, 19).
(6) When a correspondence has been established between peaks in the electron density and atomic positions of the model, with the r.m.s. deviation of the α-carbon atoms from their true positions below about 0.5 Å, it is feasible to begin some kind of refinement procedure. A refinement procedure is a method for calculating and checking adjustments in the atomic coordinates in the model, to minimize some measure of its inaccuracy (20). The classic measure of accuracy is the residual, or "R-factor":

$$R = \frac{\sum \bigl|\, |F_o| - |F_c| \,\bigr|}{\sum |F_o|}$$

in which $|F_o|$ and $|F_c|$ are corresponding observed structure factor magnitudes and those calculated from the model, and the summation extends over that portion of Fourier space for which data were collected. (The structure factors are the Fourier coefficients of the electron density distribution. Although both the magnitude and the phase of the structure factors $F_c$ of the model are calculable, only the magnitudes of the observed structure factors $F_o$ are measurable. Therefore the comparison between the model and the experimental data can involve only the structure factor magnitudes.) Some refinement procedures include, at certain stages, criteria for stereochemical regularity in addition to the agreement between observed and calculated structure factor magnitudes.
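The residual defined above is simple to evaluate once the observed magnitudes and the calculated structure factors are paired up. The following minimal sketch assumes that pairing has already been done; the function name and the sample numbers are hypothetical. Calculated structure factors may be passed as complex numbers, since only their magnitudes enter the comparison.

```python
def r_factor(f_obs, f_calc):
    """Crystallographic residual R = sum(| |Fo| - |Fc| |) / sum(|Fo|).

    f_obs holds observed structure factor magnitudes; f_calc holds the
    corresponding calculated structure factors (real or complex). Both
    sequences must be in the same reflection order.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("observed and calculated lists differ in length")
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed magnitudes is zero")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / denominator


if __name__ == "__main__":
    # Made-up magnitudes for four reflections, for illustration only.
    observed = [10.0, 20.0, 30.0, 40.0]
    calculated = [9.0, 22.0, 27.0, 40.0]
    print(r_factor(observed, calculated))
```

A perfect model gives R = 0; the phases of the calculated structure factors are discarded by `abs`, which mirrors the point made above that only magnitudes can be compared with experiment.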


Recent Advances in Refinement Techniques for Macromolecules

One of the primary goals of crystallographic studies of biological macromolecules is to extend the methods far enough for reliable determinations of atomic bond lengths, particularly in active site regions. In the past, for the determination of models of proteins as large as 100 amino acids, it has been necessary to retain standard stereochemistry (fixed bond lengths and angles) throughout the entire procedure (21). It is now foreseeable that in the final stages of structure determination these geometric constraints may be relaxed or even eliminated entirely, to give an objective determination of structure.

Some important recent advances are experimental: low-temperature methods, and the development of area detectors for diffraction methods. Others are computational: in particular, advances in the power of crystallographic refinement techniques for macromolecules. For example, Agarwal has developed a novel approach to least-squares refinement which provides extremely fast computation times and remains practical for larger macromolecules (over 2500 nonhydrogen atoms) at very high resolution (better than 1.2 Å) (22). The method utilizes algorithms developed for digital signal processing and optimization techniques, with fast-Fourier transform procedures for calculation of structure factors and gradients for predicting the three-dimensional atomic shifts. Computation times are approximately proportional to n log n per cycle, where n is the number of structure factors.
This is an important advance over the n² dependence of computation time on number of parameters that characterizes older methods.

The 2-Zn form of insulin has been refined to 1.5 Å resolution (more than 1100 nonhydrogen atoms and approximately 12000 structure factors) to an R factor of 0.113 (23). Other computational experiments demonstrate the utility of this method at both higher resolutions (the antibiotic beauvericin at 1.2 Å (24)) and lower ones (sperm whale myoglobin at 2.0 Å (25)).

It is evident that improved methods of data collection will make very high resolution studies of biological macromolecules possible. The speed and size of current computers are already adequate to refine atomic models of medium-sized proteins to higher resolution than most protein crystals diffract, without the need to impose stereochemical constraints at the final stages of the refinement. For example, with the Agarwal refinement program, myoglobin can be refined using a complete set of crystallographic data at 1.0 Å nominal resolution (approximately 70,000 independent structure factors) in less than 4 minutes of computer time per cycle, on an IBM 370/3033 (25). Improvements currently being implemented by R.C. Agarwal and others will reduce this time to between 1 and 2 minutes (26). Looking to the next generation of computer hardware, it is apparent that even these calculations will require less than 10 seconds per cycle. Thus the accuracy of the models for macromolecules obtainable will depend only on the quality of the crystals and of the measurements of their diffraction patterns.

These results imply that it is already possible to provide a variety of statistics for evaluation of the precision of the atomic coordinates. These statistics can be used to judge whether the structure determination has achieved an acceptable level for credible interpretation of function. Realistic goals for precision of individual bond lengths are overall r.m.s. deviations of approximately 0.1 Å for proteins of moderate size with data to 1.5 Å (and better for data nearer 1.2 Å). In some cases, the "interesting" part of the protein is more rigid in structure than the average (for example, organometallic complexes within proteins, or tightly-bound substrates) and in these cases the precision of individual bond lengths and angles might well be higher than the average. The accuracy of the model is a more difficult question, and criteria for accuracy are yet to be established.
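As a rough aid to the timing figures quoted above for least-squares refinement, the sketch below compares quadratic growth with n log n growth at the two problem sizes mentioned (about 12,000 and 70,000 structure factors), and works out the speedup implied by going from 4 minutes to 10 seconds per cycle. It is an order-of-magnitude illustration under loud assumptions: constant factors are ignored, base-2 logarithms are an arbitrary choice, and a single size parameter n is used for both methods, whereas the text counts structure factors for the newer method and parameters for the older ones. The function names are hypothetical.

```python
import math


def quadratic_over_nlogn(n):
    """Ratio of an n**2 operation count to an n*log2(n) count, constants ignored."""
    return (n * n) / (n * math.log2(n))


def speedup(old_seconds, new_seconds):
    """Factor by which the time per refinement cycle shrinks."""
    return old_seconds / new_seconds


if __name__ == "__main__":
    for n in (12_000, 70_000):
        ratio = quadratic_over_nlogn(n)
        print(f"n = {n:>6}: n^2 / (n log2 n) is about {ratio:,.0f}")
    # 4 minutes per cycle today versus 10 seconds projected in the text.
    print(f"projected speedup per cycle: {speedup(4 * 60, 10):.0f}x")
```

The ratio grows with n, which is why the advantage of the n log n method widens as data sets approach 1.0 Å resolution.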
At some point, at very high resolution, it will be necessary to prove that atypical stereochemical features are indeed real, by refining the models with relaxed geometrical constraints or none at all. This must be a major goal of crystallographic studies if correlations of structure and function are to be meaningful.

The results of structure determinations provide the core of a data base of biological macromolecular structure applicable to further investigations. Currently, the Protein Data Bank at the Chemistry Department of Brookhaven National Laboratories has responsibility for archival storage and distribution of these results in the United States of America (27). Later in this paper we shall discuss how systems might be designed to facilitate access to such a data base.

What kinds of questions are we asking? What computational tools do we have? (To be followed by: What kinds of questions would we like to ask? What kinds of tools do we need?)

It is obvious that we cannot review exhaustively a field as active as computational protein chemistry is today. All that we can hope to do is to classify some of the approaches to problems currently recognized as (or at least deemed by consensus to be) interesting ones. One possible approach to such a classification is based on the range of conformations treated in the investigation. Thus we might be led to distinguish:


(1) Studies of static conformations, including all studies of protein architecture per se, and some studies of the interactions of small molecules with proteins.
(2) Studies of local conformational deformations, including the response of the macromolecule to the binding of substrate in descriptions of function, or the interconversion of conformations in allosteric proteins.
(3) Studies of transitions from the unfolded to the folded state, involving global conformational rearrangements.

Investigations of static structures include architectural descriptions, comparisons and classifications, and identifications of recurrent patterns such as supersecondary structures. Examples include the classification of types of proteins by Levitt and Chothia (28), the hierarchical analysis of protein structures by Rose (29), or the comparison of globin structures by Lesk and Chothia (30).

Other computational studies of static structures seek to analyze the contributions to the thermodynamic stability of the native state. These studies include the detection of hydrogen bonds, packing patterns, and analysis of buried surface area.

Examples of studies of local conformational dynamics include the films made by Richard Feldmann, in collaboration with M. Levitt and with M. Karplus, which show the dynamics of pancreatic trypsin inhibitor and its interaction with solvent, and the study by Case and Karplus of the pathway by which an oxygen molecule can enter and leave the binding pocket of myoglobin (31).
(In the static structure, there is no stereochemically feasible path for binding oxygen; the process requires a distortion of the protein structure.)

No attempt to determine the folding pathway of a protein from the denatured to the native state, either by energy minimization or by molecular dynamics, has yet been successful. Many people believe that it would be more effective to approach the problem through a study of the formation and interaction of secondary structural units. A conservative appraisal of the situation is that this problem is not one that can be solved by increase in computer power alone.

Software tools for analysis of static structures include:

(1) Graphics programs that permit scrutiny, dissection and manipulation of structures.

Computer-generated representations of macromolecules fall into three broad categories:
(a) Line drawings (line segments represent chemical bonds), including the familiar "ball-and-stick" models and ORTEP drawings.
(b) Drawings of space-filling models, such as simulations of CPK models, or representations of molecular surfaces (Figure 1).
(c) Schematic diagrams, or cartoons, in the most common variation of which cylinders represent helices and arrows represent strands of sheet (Figure 2). These originated as artists' drawings by A. Rossmann; computer programs now exist to create them.

Lykos and Shavitt; Supercomputers in Chemistry ACS Symposium Series; American Chemical Society: Washington, DC, 1981.


Figure 1. A space-filling model of sperm whale myoglobin.


Figure 2. A schematic diagram of sperm whale myoglobin. Cylinders represent α-helices. Figures 1 and 2 show the molecule in the same orientation, looking into the heme pocket.


Each representation of a protein or nucleic acid conveys to the viewer different aspects of its structure: line drawings give the bones, space-filling models the flesh, and schematic diagrams the gestalt of the design. No single representation of a protein or nucleic acid is adequate for all purposes, but the combination of several is more powerful than the total of all taken independently.

Line drawings are effective at showing the chain folding, and can indicate spatial relationships between a few selected groups such as the sidechains interacting with a prosthetic group or substrate. The entire molecule is visible. Space-filling representations give a better idea of the packing of the atoms, but special techniques are required to delve beneath the surface, such as "cheese-wire" sections, or rendering of the atoms on surface layers as translucent rather than opaque. The main disadvantage of simulated CPK models is the difficulty of analyzing a picture of an entire protein. It is useful to have a schematic diagram alongside a space-filling representation, to aid in its interpretation.

(2) Analysis of structural patterns

Given the standard geometries of hydrogen-bonded atoms, or of helices and sheets, it is possible to apply pattern-recognition techniques to analyze the conformation of a protein. This has been done (many times) and is a useful approach to the extent that the problem is well-defined. However, in many cases helices deviate from standard geometry, especially at the ends where they can tighten up or unravel. Any investigation that requires analysis of this kind of detail is better pursued interactively.

(3) Estimates of stabilization energies

Conformational energies are analyzable both in terms of pairwise interactions among atoms (the Coulomb and van der Waals contributions) and by the calculation of surface area accessible to solvent, based on methods pioneered by Lee and Richards (32). Wodak and Janin have recently improved the computational technique for estimating accessible surface areas (33).

Programs that explore conformation space, either locally or globally, require some estimate of conformational energy. Generally this is done by deriving parameters for atom-atom interactions which reproduce the thermodynamic and structural parameters of crystals of small organic molecules. These parameters define the conformational energy surface in the vicinity of any state, if interactions with solvent are neglected (34).

In molecular dynamics calculations, a program determines solutions of the classical equations of motion, starting from some set of initial velocities (that define the temperature of the system, assuming it to be isolated).
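The statement above can be illustrated with a small sketch. It is not the program of any of the cited studies: it assumes a Lennard-Jones pair potential in reduced units (mass, epsilon, sigma and the Boltzmann constant all set to 1) as a stand-in for the fitted atom-atom parameters, and it uses the velocity Verlet scheme as one common way to integrate the classical equations of motion. The function names (`lj_forces`, `initial_velocities`, `velocity_verlet`) are hypothetical.

```python
import math
import random


def lj_forces(pos, epsilon=1.0, sigma=1.0):
    """Return (forces, potential energy) for an assumed Lennard-Jones potential."""
    n = len(pos)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    energy = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d = [pos[i][k] - pos[j][k] for k in range(3)]
            r2 = sum(c * c for c in d)
            s6 = (sigma * sigma / r2) ** 3
            energy += 4.0 * epsilon * (s6 * s6 - s6)
            # (-dU/dr) / r, so the force vector on atom i is f_over_r * d
            f_over_r = 24.0 * epsilon * (2.0 * s6 * s6 - s6) / r2
            for k in range(3):
                forces[i][k] += f_over_r * d[k]
                forces[j][k] -= f_over_r * d[k]
    return forces, energy


def kinetic_energy(vel, mass=1.0):
    return 0.5 * mass * sum(c * c for v in vel for c in v)


def initial_velocities(n, temperature, mass=1.0, seed=0):
    """Random velocities with zero net momentum, scaled so that the
    kinetic energy equals (3/2) * n * temperature (reduced units)."""
    rng = random.Random(seed)
    vel = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(n)]
    for k in range(3):
        mean = sum(v[k] for v in vel) / n
        for v in vel:
            v[k] -= mean
    scale = math.sqrt(1.5 * n * temperature / kinetic_energy(vel, mass))
    return [[c * scale for c in v] for v in vel]


def velocity_verlet(pos, vel, dt, n_steps, mass=1.0):
    """Integrate Newton's equations; return final state and total-energy history."""
    pos = [list(p) for p in pos]
    vel = [list(v) for v in vel]
    forces, potential = lj_forces(pos)
    history = [potential + kinetic_energy(vel, mass)]
    for _ in range(n_steps):
        for i in range(len(pos)):
            for k in range(3):
                vel[i][k] += 0.5 * dt * forces[i][k] / mass  # half kick
                pos[i][k] += dt * vel[i][k]                  # drift
        forces, potential = lj_forces(pos)
        for i in range(len(pos)):
            for k in range(3):
                vel[i][k] += 0.5 * dt * forces[i][k] / mass  # second half kick
        history.append(potential + kinetic_energy(vel, mass))
    return pos, vel, history


if __name__ == "__main__":
    # Three atoms on a line near the pair-potential minimum (toy coordinates).
    start = [[0.0, 0.0, 0.0], [1.12, 0.0, 0.0], [2.24, 0.0, 0.0]]
    v0 = initial_velocities(len(start), temperature=0.05, seed=1)
    _, _, total = velocity_verlet(start, v0, dt=0.002, n_steps=500)
    print("largest drift in total energy:", max(abs(e - total[0]) for e in total))
```

Because the system is treated as isolated, the total energy should stay nearly constant along the trajectory; monitoring its drift is a simple check on the time step. The double loop over atom pairs is the part whose cost grows quadratically with molecule size.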
By following the motion of the system over a period of time, starting from a nonequilibrium state, it is possible to study the approach to equilibrium. By following the motion of the system in an equilibrium state, it is possible to calculate equilibrium properties, and to study fluctuations. Simple energy minimization can be thought of as molecular dynamics at 0 K, in a medium of infinite viscosity.

We have described some of the software that already exists and is already applicable to investigations of protein structure. But if these calculations can already be accomplished, why do we need supercomputers? In some cases, such as molecular dynamics, the question is one of brute number-crunching power: a faster computer will permit simulations of the motions of larger molecules over longer time intervals. But in other cases the question is not one of feasibility vs. infeasibility of the calculation, but of the speed with which the calculation can be completed relative to the time scale of interactive computing. Supercomputers may, in some cases, be able to convert certain tasks that must currently be run in batch mode to tasks that can be run interactively. This opens new doors.

What kinds of questions would we like to be able to ask? What kinds of tools do we need?

The increased power of the next generation of computers will permit many tasks that must now be executed as batch runs to be performed interactively. This implies that it will be necessary to design an interface between a human operator and the set of atomic coordinates that permits the interaction to be effective. Although many algorithms currently in use are adequate for this purpose, much of the actual software is not supple enough to be easily adaptable to a variety of tasks.

We envisage a system in which the point of view of the user has shifted to the idea of retrieving information from the data base, rather than performing calculations on one or more proteins. We realize that if we could suggest a complete set of "primitive questions", we would be performing a useful service. However, we feel that we cannot anticipate all the questions that we or anyone else will want to ask. (Conversely, we will not concede that anyone else can anticipate all the questions that we will want to ask.) Any generally useful system will therefore have to be adaptable to the needs of different users.

Certain categories of questions are fairly obvious, however. These include:

(1) Identification of structural units. This involves the search for secondary structural units (more precisely, for units that are superposable on standard units to within a specified error), or for combinations of them. The operator will want to specify the range of the search: within a certain molecule, or over a class of proteins, or over the entire available set of coordinates. He or she will also need to be able to qualify the object of the search, specifying perhaps a range of lengths of helix, or perhaps to ask for two helices in contact, with interaxial distances and angles in specified ranges.

(2) Analysis of any selected unit. What is close to what? (Here what = atom or residue or secondary-structural unit.) Where are the hydrogen bonds? What surface is accessible to solvent (both in an excised fragment of a molecule per se, and also the value that would characterize the fragment within the entire protein)?

(3) Conformational energy estimates. What, approximately, is the cohesive energy of a selected unit?

(4) The geometry of interfaces, including the binding of substrates, effectors and drugs. What is the nature of the packing at an interface? Is there room for another methyl group? How does a substrate fit into a pocket? How would a known drug, or a molecule that someone is trying to fashion into a drug, fit into a pocket?

(5) Manipulation of conformations. This must include combinations of (a) manual changes to individual sidechains and large-scale movements of helices (for example) and (b) energy minimization and molecular dynamics. Clearly this kind of question can easily go beyond the limits of interactive computing even with a supercomputer.

Two examples of difficult problems that have not been solved yet because of computational limitations are:

(a) Given a protein, such as human hemoglobin, that undergoes an allosteric change upon binding of a small molecule: take the molecule in the unliganded form, insert the ligand without changing the protein conformation, and follow the natural course of the allosteric change.
(b) Given two secondary-structural units such as helices or sheets: determine all possible complexes with estimated cohesive energies greater than some threshold.

Design Considerations For Data Bases

It is clear from these examples that both numerical/textual processing, and versatile and high-quality interactive graphics, both for input and for output, must be components of any system for interactive studies of biological macromolecules that can take advantage of the power of supercomputers. It is also clear that it would be futile to attempt to design a system that would satisfy every investigator. It is our hope that the structure of the data base could be designed so that the system is adaptive: moldable by any user to his or her individual needs at the moment. The data base should have the potential for assimilating the results of working sessions, to facilitate continuing the study. Indeed, ideally one could provide the user with a "virtual data base": one that is configurable within a work session to satisfy the special needs of any job of any user.
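One way to picture such an adaptive data base is sketched below. This is an illustration only, not a description of any existing system: the class `CoordinateBase`, its `register` and `ask` methods, and the `close_to` query are hypothetical names, and the coordinates in the usage example are toy values. The sketch shows two of the properties asked for above: users can add their own questions without modifying the data base, and answers can be stored so that a later query in the same session builds on them.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Atom:
    residue: int  # residue sequence number
    name: str     # atom name, e.g. "CA"
    x: float
    y: float
    z: float


def distance(a: Atom, b: Atom) -> float:
    return math.sqrt((a.x - b.x) ** 2 + (a.y - b.y) ** 2 + (a.z - b.z) ** 2)


@dataclass
class CoordinateBase:
    """Coordinates plus user-registered questions and stored session results."""
    atoms: List[Atom]
    queries: Dict[str, Callable] = field(default_factory=dict)
    results: Dict[str, object] = field(default_factory=dict)

    def register(self, name: str, query: Callable) -> None:
        # Any user can add a new kind of question at any time.
        self.queries[name] = query

    def ask(self, name: str, save_as: Optional[str] = None, **params):
        answer = self.queries[name](self, **params)
        if save_as is not None:
            # Assimilate the result so later work can build on it.
            self.results[save_as] = answer
        return answer


def close_to(base: CoordinateBase, residue: int, cutoff: float) -> List[int]:
    """'What is close to what?' at the residue level: residues having any
    atom within `cutoff` of any atom of the chosen residue."""
    target = [a for a in base.atoms if a.residue == residue]
    near = {a.residue for a in base.atoms
            if a.residue != residue
            and any(distance(a, t) <= cutoff for t in target)}
    return sorted(near)


if __name__ == "__main__":
    base = CoordinateBase([
        Atom(1, "CA", 0.0, 0.0, 0.0),
        Atom(2, "CA", 3.8, 0.0, 0.0),
        Atom(3, "CA", 7.6, 0.0, 0.0),
        Atom(4, "CA", 3.8, 3.0, 0.0),
    ])
    base.register("close_to", close_to)
    print(base.ask("close_to", save_as="near_2", residue=2, cutoff=4.0))
    # A user-defined follow-up question that reuses a stored result.
    base.register("count", lambda b, key: len(b.results[key]))
    print(base.ask("count", key="near_2"))
```

The all-pairs distance search is the simplest possible choice; a real system would presumably need spatial indexing to answer such questions at interactive speed over a whole library of structures.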


The goal is to help supercomputers to permit the achievement of excellent results by ordinary scientists.


The Human Component (35)

We have emphasized the role of a human being as a participant in the execution of a computer program (as distinct from his or her other roles as programmer or operator of a computer or a terminal (36)). There are a number of considerations governing the optimal design of an interactive system involving a supercomputer which relate specifically to the needs of the human being. Although these problems occur to some extent in all interactive computing, we mention them here because (1) in general they have received inadequate attention and (2) in particular there is reason to fear that they will become more severe with a supercomputer.

It is not uncommon to find that a human being working at a computer terminal is under physiological and psychological stress. Certain factors contributing to the stress are obvious (these may include: crowded and noisy conditions, extremes of temperature, uncertainty over response time, limited duration of access to equipment, deadline for finishing work, lengthy travel required to reach the site of special facilities). Other factors are more subtle, and are not well-understood. [A peculiar type of impatience afflicts some scientists when attached to computers, preventing optimal response (by the scientist) to even simple problems when they arise unexpectedly. There seems to be a difficulty in switching from the implementation of solutions to problems (which can be performed effectively in partnership with a computer), to the planning of the solutions (for which the computer often not only fails to be a help but is a serious distraction). The analogy to "... two spent swimmers, who do cling together and choke their art", is apt and will be recognized, ruefully, by numerous readers.]

Some of the factors causing stress arise from "managerial" decisions concerning allocation of space, choice of peripherals, operation of the monitor system, priorities; the usual justification being the cost of improvements. Other stress factors arise from sloppy programming techniques (for example, rigid and overly complicated input conventions, or unclear or absent error diagnostic messages). Within the community of scientist-programmers, these are the factors over which we can exercise the greatest control.

Stress degrades the performance of tasks involving creative thought. If a human being is to contribute effectively to a partnership with a computer, he or she should be allowed to function without handicaps. Therefore the causes of stress should be identified and, as far as possible, removed. Effective partnership with a supercomputer will require even higher levels of intellectual performance from the human. Therefore it will more imperatively require the reduction of stress.


To the extent that pleasant and relaxed working conditions are luxuries, simplistic financial arguments justify refusing them (or reserving them as rewards), and complaints of their absence are dismissable as hedonistic. But what if they are not luxuries? This is the question we wish to raise.

Work supported in part by National Science Foundation grant PCM80-12007.


Literature Cited

1. Schulz, G.E.; Schirmer, R.H. "Principles of Protein Structure"; Springer-Verlag: New York, 1979.
2. Dickerson, R.E.; Geis, I. "The Structure and Action of Proteins"; Harper & Row: New York, 1969.
3. Némethy, G.; Scheraga, H.A. Quart. Revs. Biophysics 1977, 10, 239.
4. Levitt, M.; Warshel, A. Nature 1975, 253, 694.
5. McCammon, J.A.; Gelin, B.R.; Karplus, M. Nature 1977, 267, 585.
6. Levitt, M. in "Protein Folding"; R. Jaenicke, ed.; Elsevier/North-Holland Biomedical Press: Amsterdam, 1980; p. 17.
7. Richards, F.M. Ann. Rev. Biophys. Bioeng. 1977, 6, 151.
8. Richards, F.M. Carlsberg Res. Commun. 1979, 44, 47.
9. Karplus, M.; Weaver, D.L. Nature 1976, 260, 404.
10. Baldwin, R.L. Trends Biochem. Sci. 1978, 3, 66.
11. Richmond, T.J.; Richards, F.M. J. Mol. Biol. 1978, 119, 537.
12. Lesk, A.M.; Chothia, C. Biophys. J. 1980, 32, 35.
13. Pauling, L.; Corey, R.B.; Branson, H.R. Proc. Nat. Acad. Sci. 1951, 37, 205.
14. Kauzmann, W. Adv. Prot. Chem. 1959, XIV, 1.
15. Chothia, C. Nature 1975, 254, 304.
16. Blundell, T.L.; Johnson, L.N. "Molecular Biology: Protein Crystallography"; Academic Press: London, 1976.
17. Richards, F.M. J. Mol. Biol. 1968, 37, 225.
18. Diamond, R. "Bilder User's Manual"; Laboratory of Molecular Biology: Cambridge, England, 1978.
19. Jones, T.A. J. Appl. Cryst. 1978, 11, 268.
20. Ahmed, F.R., ed. "Crystallographic Computing"; Munksgaard: Copenhagen, 1976; section B4.
21. Diamond, R. Acta Cryst. 1966, 21, 253.
22. Agarwal, R.C. Acta Cryst. 1978, A34, 791.
23. Isaacs, N.W.; Agarwal, R.C. Acta Cryst. 1978, A34, 782.
24. Geddes, A.; Hardman, K. MS. in preparation.
25. Hardman, K. in "Interaction Between Iron and Proteins in Oxygen and Electron Transport"; Chien Ho, ed.; Elsevier/North-Holland Biomedical Press, 1981.
26. Agarwal, R.C. Personal communication.
27. Bernstein, F.C.; Koetzle, T.S.; Williams, G.J.B.; Meyer, E.F. Jr.; Brice, M.D.; Rodgers, J.R.; Kennard, O.; Shimanouchi, T.; Tasumi, M. J. Mol. Biol. 1977, 112, 535.


28. Levitt, M.; Chothia, C. Nature 1976, 261, 552.
29. Rose, G.D. J. Mol. Biol. 1979, 134, 447.
30. Lesk, A.M.; Chothia, C. J. Mol. Biol. 1980, 136, 225.
31. Case, D.A.; Karplus, M. J. Mol. Biol. 1979, 132, 343.
32. Lee, B.; Richards, F.M. J. Mol. Biol. 1971, 55, 379.
33. Janin, J.; Wodak, S. Proc. Nat. Acad. Sci. 1980, 1736.
34. Levitt, M. J. Mol. Biol. 1974, 82, 393.
35. Brenner, A. Datamation 1977, 23, 283.
36. Lesk, A.M. Comp. Biol. Med. 1977, 7, 113.


RECEIVED July 20, 1981.
