Anal. Chem. 2000, 72, 91R-97R
Chemometrics

Barry K. Lavine
Department of Chemistry, Clarkson University, Potsdam, New York 13699

Review Contents
Multivariate Curve Resolution
Multivariate Calibration
Pattern Recognition
Structure-Property Relationships
Multiway Analyses
Literature Cited
Chemometrics is an approach to analytical and measurement science that uses mathematical, statistical, and other methods of formal logic to determine (often by indirect means) the properties of substances that otherwise would be very difficult to measure directly (A1). Measurements related to the chemical composition of a substance are usually taken, and the value of some property of interest is inferred from these measurements through an appropriate mathematical relation. Chemometrics works because the properties of many substances are uniquely determined by their chemical composition. Indirect observation of a property is the goal, for good reasons such as speed and economy.

This review, the thirteenth of the series and the eleventh with the title “Chemometrics”, covers the most significant developments in the field from November 1997 to November 1999. As in the previous review (A2), breakthroughs and advances in the field are highlighted, trends are evaluated, and challenges that must be met to ensure continued progress in the field are delineated. The current review is limited to approximately 100 references. Hence, there is a shift away from applications and toward methodology. This change in format poses a challenge, since the number of publications concerned with the development of new methods or new ideas has actually declined, while the number of literature citations on chemometrics in general has shown steady growth. The decline can be attributed in large measure to a decrease over the past few years in the number of chemometric researchers actively engaged in the development and publication of novel chemometric methods. Brown (A3) reviewed basic trends in the field of chemometrics from the perspective of soft modeling. Wold (A4) posed the pertinent question: How successful has chemometrics been?
He observed that chemometrics has been most successful in developing solutions for problems in multivariate calibration, pattern recognition, and multivariate process monitoring. Wold believes that chemometrics is successful when chemical problem solving, not methods development, is the focus. Hasegawa’s (A5) application of principal component analysis to detect minute bands in IR spectra of Langmuir-Blodgett films is an example of the type of chemical problem solving that should be pursued by chemometricians, since principal component analysis gave investigators crucial insight into events occurring in the film.

10.1021/a1000016x CCC: $19.00 © 2000 American Chemical Society. Published on Web 04/29/2000

However, it is not always possible to focus on chemical problem
solving due to the limitations of data analysis techniques used by chemometricians. Principal component analysis is central to many of the more popular chemometric methods. As data analysis problems become more complex, e.g., missing values, correlated measurement errors, or large numbers of data points with high leverage, it becomes ever more important for chemometricians to modify their methods to handle these types of data structures. Undoubtedly, this will require changes to be made in the implementation of principal component analysis. However, most chemists do not understand the mechanics that underlie principal component analysis. In order for the field to experience further growth, this problem must be addressed. Simply borrowing algorithms from other fields, which has been the modus operandi of previous chemometricians, may not prove effective here. Fortunately, some chemometricians have been working on this problem. Wentzell (A6) has developed a principal component analysis routine that utilizes maximum likelihood theory to compensate for correlated measurement error among the variables in the data matrix. Manne (A7) has developed a singular value decomposition algorithm to compute missing values when performing principal component analysis on messy data sets. Paatero and Hopke (A8) have developed a principal component analysis routine that individually weights data points. Clearly, these developments are a step in the right direction.

Much of the growth in the field of chemometrics has been and continues to be driven by demands that more information be extracted from data. The press of too much data, which is a direct result of the dramatic increase in the number and sophistication of chemical instruments, has triggered interest in the development of new data analysis techniques that can extract information from the large arrays of chemical data routinely being generated.
As a result, the focus of this field continues to shift away from indirect observation and toward analytical measurement science. Placing chemical instrumentation under computer control has paved the way for development of new algorithms to optimize instrument performance. Resolving overlapping peaks, improving instrumental calibration, and classifying materials according to source are currently areas of active research in the field. Uncovering hidden relationships in large and complex multivariate data sets has been and will always be the raison d’être for this field.

This review focuses on five sub-areas judged to be hot and/or crucial. This judgment is based in part on the number of literature citations uncovered during the search and in part on the perceived impact that developments in these particular areas will have on chemometrics and analytical chemistry. Although methodology is emphasized, applications that are unusual or will be important to the field are also cited. Because chemometrics is an application-driven field, a review of the field cannot be formulated without including novel or exciting applications.

Analytical Chemistry, Vol. 72, No. 12, June 15, 2000 91R
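Because principal component analysis underlies so many of the methods surveyed here, a minimal sketch of its mechanics may be useful. The Python/NumPy example below computes scores and loadings from the singular value decomposition of a column-centered data matrix. The function name `pca_svd` and the simulated data are assumptions made purely for illustration. This is ordinary PCA, not the maximum likelihood, missing-value, or weighted variants of refs A6-A8.

```python
import numpy as np

def pca_svd(X, n_components):
    """Ordinary PCA of a samples-by-variables matrix via the SVD."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # column-center the data
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components].T               # orthonormal columns
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]
    return scores, loadings, explained

# Hypothetical data: 20 samples, 6 variables, two underlying factors plus noise.
rng = np.random.default_rng(0)
T = rng.normal(size=(20, 2))
P = rng.normal(size=(2, 6))
X = T @ P + 0.01 * rng.normal(size=(20, 6))

scores, loadings, explained = pca_svd(X, 2)
```

In this sketch the product `scores @ loadings.T` approximates the centered data, and `explained` gives the fraction of variance carried by each retained component, which is one common basis for judging how many abstract factors to keep.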
MULTIVARIATE CURVE RESOLUTION

This section is concerned with the mathematical resolution of mixtures. A mathematical resolution usually takes far less time than a physical or chemical separation and can be more accurate and more precise. It is generally achieved using methods based on principal component analysis.

During this reporting period, some new methods to resolve multicomponent systems appeared in the literature. All of these methods require the user to determine by principal component analysis the number of abstract factors in the data, which can be a daunting task. Shao (B1) was able to use wavelets to determine the number of components in an overlapping chromatographic peak. Chen (B2) was able to incorporate chemical information into the principal component analysis to determine the number of components in an overlapping chromatographic band. He developed a novel index based on the ratio of the eigenvalues calculated by smoothed principal component analysis to those calculated by ordinary principal component analysis. The index is capable of handling situations involving minor components, severe collinearity in spectra, or heteroscedastic noise. Wang (B3) has developed a morphological index to differentiate primary from secondary eigenvectors. Elbergali (B4) has integrated several statistical methods into the NIPALS algorithm to predict the number of components in an automated fashion.

During the reporting period, several new methods were also published on the resolution of overlapping chromatographic or spectral responses. Positive matrix factorization (B5) appears promising since it individually weights data points, thereby simultaneously addressing two issues: high leverage and heteroscedasticity.
Subwindow factor analysis (B6, B7), based on identifying regions in the chromatogram where only one component elutes, was developed to extract spectral responses from overlapping components in hyphenated chromatography without resolving the concentration profiles. Using wavelets, Shao (B8) showed that improvements in the performance of window factor analysis could be realized, thereby demonstrating the importance of denoising data prior to peak deconvolution. Wentzell (B9) developed a technique called window target testing factor analysis that can help confirm the presence or absence of an analyte in a severely overlapped chromatogram. The algorithm attempts to determine if the response profile of a target analyte lies within the response subspace of a time window containing unresolved chromatographic peaks. Because the window moves sequentially through the chromatogram, window target testing factor analysis allows a relative assessment of match quality. Ahlberg (B10) describes the extension of self-modeling curve resolution to four or more components, implemented by a method called SAFER. By imposing certain natural constraints, it is possible to determine the basic feasible region in the eigenspace that holds the solution. Additional problem-dependent constraints, however, must be imposed to determine the actual solution. Once the location of one component has been determined, the possible locations of the other components are limited. Liang (B11) developed a penalty function to smooth noisy two-way data in order to improve both signal detection and resolution of
overlapping chromatographic responses. Liang (B12) also demonstrated that window factor analysis and orthogonal projections are equivalent methods, suggesting that many of the techniques reported in the chemometric literature for peak deconvolution may be more closely related than previously thought.

Most applications of multivariate curve resolution reported in the literature during this period employ an iterative procedure involving alternating least squares, which was developed by Tauler. Kudrev (B13) employed Tauler’s procedure to study acid-base equilibria and conformational changes of double-stranded polyadenylic-polyuridylic acid from individual absorbance spectra at different pHs. Mendieta (B14) used alternating least squares to detect intermediate structures in protein folding from CD spectra. Gargallo (B15) used multivariate curve resolution with alternating least squares to study protonic equilibrium and conformational changes of a mixture of valine tRNA from spectroscopically monitored acid-base titrations. De Juan (B16) used alternating least squares to analyze data from several experiments in order to identify pH-dependent thermodynamic and conformational transitions of polynucleotides. Tauler (B17) investigated temperature-dependent conformational multiequilibria processes by analyzing UV spectral-monitoring data of the melting behavior of heteropolynucleotides. Furosjoe (B18) used alternating least squares to analyze the data from a chemical reaction that was monitored using FTIR spectrometry. Although the reaction was run in an unbuffered aqueous medium, it was still possible to obtain information about the reaction using multivariate curve resolution with alternating least squares. Multivariate curve resolution was also used to deconvolute peaks in electroanalytical experiments. Esteban (B19) used alternating least squares to deconvolute overlapping waves in linear sweep voltammetry.
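The alternating least squares idea behind many of these applications can be sketched in a few lines. The Python/NumPy example below factors a simulated two-component data matrix into non-negative concentration profiles and spectra. It is only an illustration under stated assumptions: the names `mcr_als` and `gaussian`, the simulated bands, and the use of simple clipping to enforce non-negativity are all choices made here, and the sketch should not be read as Tauler's published procedure, which supports further constraints.

```python
import numpy as np

def gaussian(x, center, width):
    """Unit-height Gaussian band used to simulate profiles and spectra."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)

def mcr_als(D, C0, n_iter=50):
    """Factor D (times x channels) as C @ S.T with non-negative C and S."""
    C = np.array(C0, dtype=float)
    for _ in range(n_iter):
        # Spectra step: least-squares solution of C @ St = D, negatives clipped.
        St = np.clip(np.linalg.lstsq(C, D, rcond=None)[0], 0.0, None)
        # Concentration step: least-squares solution of St.T @ C.T = D.T, clipped.
        C = np.clip(np.linalg.lstsq(St.T, D.T, rcond=None)[0].T, 0.0, None)
    return C, St.T

t = np.arange(50.0)   # hypothetical elution-time axis
w = np.arange(40.0)   # hypothetical wavelength axis
C_true = np.column_stack([gaussian(t, 20, 5), gaussian(t, 28, 5)])
S_true = np.column_stack([gaussian(w, 12, 6), gaussian(w, 25, 6)])
D = C_true @ S_true.T  # noise-free bilinear data

# Crude initial guess: broader bands at roughly the right positions.
C0 = np.column_stack([gaussian(t, 18, 8), gaussian(t, 30, 8)])
C_est, S_est = mcr_als(D, C0)
rel_err = np.linalg.norm(D - C_est @ S_est.T) / np.linalg.norm(D)
```

A small `rel_err` shows only that the data are reproduced. Because of rotational ambiguity, the recovered profiles need not match the true ones unless additional constraints (selectivity, unimodality, closure) are imposed.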
Diaz-Cruz (B20) used multivariate curve resolution with alternating least squares to improve resolution in cyclic voltammetry. Keesey (B21) used evolving factor analysis to deconvolute spectro-electrochemical data obtained in kinetic experiments. Bijlsma (B22) also used multivariate curve resolution to estimate reaction rate constants from spectral data. Windig (B23-B25) demonstrated that exponential curves from NMR diffusion experiments could be resolved into their individual constituents through suitable modification of the generalized rank annihilation method. Andrew (B26) showed that Raman images could be resolved into their constituents using a Gram-Schmidt modified orthogonal projection approach with alternating least squares or principal components with varimax rotations followed by alternating least squares optimization.

MULTIVARIATE CALIBRATION

Multivariate calibration refers to the process of relating analyte concentration or the measured value of a physical or chemical property to a measured response, e.g., near-IR spectra of multicomponent mixtures or gas chromatographic profiles of complex biological samples. It remains by far the fastest growing area of chemometrics, as evidenced by the tremendous number of papers that have appeared in the last two years on partial least squares (PLS). PLS has become the de facto standard for multivariate calibration because of the quality of the calibration models produced and the ease of their implementation due to the availability of software. The latent variables in PLS are developed
simultaneously along with the calibration model, so that each latent variable is a linear combination of the original measurement variables rotated to ensure maximum correlation with the information provided by the response variable. Hence, confounding of the desired signal by interferants is usually less of a problem in PLS, since PLS utilizes both the response and measurement variables to iteratively determine the latent variables in the data.

Gil (C1) reported the development of a robust and more efficient PLS algorithm. He stabilized the covariance matrix using the well-known Stahel-Donoho estimator. Robertsson (C2) addressed the problem of nonlinearities in the relationship between the X and Y blocks, which can adversely affect the performance of PLS, for nonlinear absorbance spectra. Absorbance below a certain threshold value was described as linear and above this limit as nonlinear. The regressor variables were extended with the squared absorbances above the linear range. Geladi (C3) showed that a mixed approach, PLS and neural nets, may be the best solution to model nonlinear and noisy calibration systems. He showed that certain aspects of PLS could be transferred to neural network models. Direct orthogonalization (C4), a method for removing factors from the data that describe irrelevant phenomena, can improve the performance of PLS and has advantages over multiplicative scatter correction and second derivatives. Kowalski (C5) demonstrates the utility of sample weighting for lowering prediction error. Schemes that employ leverage-based criteria for selecting weights and new calibration samples are described. Thus, fewer samples describing a new source of variation will be needed to update a model. Massart (C6) tackled the problem of detecting outliers and inliers by proposing several new statistical tests to identify samples that are not similar to those in the calibration model.
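To make concrete how PLS builds its latent variables from both the measurement and response variables, a minimal sketch of a NIPALS-style PLS1 fit for a single response is given below in Python/NumPy. The function name `pls1_fit` and the simulated data are assumptions for illustration. The sketch is the textbook algorithm, not the robust or nonlinear variants of refs C1-C3.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Return regression coefficients and intercept from a NIPALS-style PLS1 fit."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean            # centered residual blocks
    W, P, q = [], [], []
    for _ in range(n_components):
        w = E.T @ f
        w /= np.linalg.norm(w)               # weight vector, proportional to E'f
        t = E @ w                            # scores of the latent variable
        tt = t @ t
        p = E.T @ t / tt                     # X loadings
        qa = f @ t / tt                      # y loading
        E = E - np.outer(t, p)               # deflate X
        f = f - qa * t                       # deflate y
        W.append(w)
        P.append(p)
        q.append(qa)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)      # coefficients for the original variables
    return b, y_mean - x_mean @ b

# Hypothetical calibration set: 30 samples, 4 measurement variables.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 4))
y = X @ np.array([1.0, -2.0, 0.5, 3.0]) + 2.0 + 0.05 * rng.normal(size=30)

b2, b0_2 = pls1_fit(X, y, 2)        # two latent variables
b4, b0_4 = pls1_fit(X, y, 4)        # as many latent variables as X variables
b_ols = np.linalg.lstsq(X - X.mean(axis=0), y - y.mean(), rcond=None)[0]
```

When the number of latent variables equals the number of (full-rank) measurement variables, the PLS coefficients coincide with ordinary least squares, which is a convenient sanity check. In practice far fewer latent variables are retained, and the number is usually chosen by cross-validation.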
Conlin (C7) showed that adding Gaussian noise to a sparse data set could sometimes lower the prediction error. Wentzell (C8) demonstrated the hazards of using symmetrical digital smoothers as preprocessing tools in multivariate calibration, while Faber (C9) took a closer look at the bias-variance tradeoff in PLS. He concluded that PLS would not necessarily produce biased predictions. Haaland (C10) observed that examination of PLS loadings can sometimes yield erroneous interpretations about the information contained in the model.

The prediction error in PLS can be minimized through judicious wavelength selection. Spiegelman (C11) developed a theoretical justification for wavelength selection in PLS. Walmsley (C12) demonstrated that multiple linear regression analysis could perform as well as PLS when improved variable selection procedures are used. McShane (C13) used a multiple ranking chain approach to identify spectral regions correlated with the signal of interest. Forina (C14) described an iterative method for elimination of useless predictors in PLS. However, heteroscedasticity can significantly reduce the efficiency of variable selection procedures. Woodward (C15) showed that variable selection procedures for heteroscedastic data can produce sparse models, but significant modeling will take place purely on the noise components.

Achievement of a satisfactory calibration model is usually not the final step in the practical application of PLS or any other multivariate calibration method. Once a calibration model is developed, it must be transferred to other instruments, so the calibration can be used at the point of application. Hoffmann (C16)
used a Shenk-Westerhaus correction to take into account changes in sample temperature and the field of view of the instrument for PLS models to predict various properties of hydro-treated gas oils. Brown (C17) used a different approach to standardize multivariate calibration models for near-IR FTIR spectrometers equipped with fiber-optic probes. Calibration transfer across instruments and probes was studied by employing calibration models built on one instrument to predict properties from spectra measured on the other. A far-IR filter was used to transform the instrument response function of one instrument to match that of another. The transformation was performed over a moving processing window without the use of transfer standards. Lin (C18) used piecewise direct standardization for calibration transfer between NIR spectra measured at different temperatures. Wold (C19, C20) and Sjostrom (C21) have developed a method for calibration transfer called orthogonal signal correction, in which the X matrix is corrected by a subtraction of variation that is orthogonal to the Y matrix. The resulting spectra are less dependent on instrument variation. Hopke (C22) examined calibration transfer as a data reconstruction problem. Calibration transfer can be posed as a missing data problem in which the spectra on a secondary instrument are missing. This approach takes advantage of the capability of positive matrix factorization to estimate missing values and requires a set of standards that have been measured on both instruments. Swierenga (C23) used feature selection to identify a subset of spectral wavelengths or features that were less sensitive to differences between instruments. Use of PLS in biological and industrial analyses is fast becoming commonplace. Arnold (C24) challenged the validity of published reports claiming to have successfully measured in vivo blood glucose from noninvasive near-IR spectra. 
He developed an in vitro model to simulate noninvasive human near-IR spectra. The phantom glucose data set was created by purposely omitting glucose spectra in these modeled samples. Glucose values were then assigned to successive phantom glucose spectra, and multivariate calibration models were developed for glucose using PLS. Apparent functional models could be obtained when glucose assignments were made in a nonrandom, time-dependent manner. Schenkman (C25) used PLS to develop a method to determine oxygen saturation of myoglobin in the presence of Hb in vitro from transmission or reflectance near-IR spectra. Noninvasive reflectance NIR was also used to determine deep-tissue pH. Zhang (C26) used partial least squares to develop suitable calibration models. Muscle pH was varied by controlling the blood supply to the muscle. Alam (C27) developed a method to measure blood pH using near-IR and PLS. The PLS model was shown to be able to predict pH from other data sets when red blood cell size and oxygen saturation were varied orthogonally to the pH variation in the training set. Maeda (C28) used NIR and PLS to study hydrogen bonding in alcohols in CCl4 as a function of temperature. Zhou (C29) used NIR and PLS to develop a method to determine moisture in freeze-dried drug products. Second derivative spectra were used, and the number of PLS factors was optimized for the lowest standard error of prediction. Niemczyk (C30) used IR emission for the quantitation of borophosphosilicate glass films on silicon monitor wafers. Bak (C31) used principal component regression to simulate spectra at varying temperatures using the Hitran database. The approach is based on the scores and loadings from three principal component temperature calibration models. Frenich (C32) demonstrated the potential of a cross-section technique in combination with partial least squares to determine the concentration of individual pesticides from overlapped chromatographic bands obtained in a diode array HPLC analysis. Herrero (C33) showed the utility of PLS to resolve overlapping bands in stripping voltammetry.

During the past two years, other calibration methods were also reported in the literature. Kalivas (C34) investigated cyclic subspace regression, a technique that encompasses PLS, PCR, and least squares. He showed that PLS and PCR produce essentially the same results, with minor differences stemming from overfitting by PLS. This conclusion is (to some degree) at variance with previously published reports on the differences between PLS and PCR. He also showed that cyclic subspace regression could be used for wavelength selection (C35). Myrick (C36) describes the development of a set of optical filters for a spectrometer equivalent to the regression vector developed by PLS or PCR for a spectral property calibration. The advantage of this approach stems from the development of an inexpensive spectrometer for the prediction of a specific chemical or physical property of a substance from its spectrum. Depczynski (C37) demonstrated the advantages of using the coefficients from the wavelet transform as variables for a calibration developed for near-infrared spectra using multiple linear regression analysis. Harrington (C38) described a novel neural network for nonlinear calibration. The advantages of temperature-constrained cascade correlation neural networks include ease of use, stability, and faster training times.

PATTERN RECOGNITION

The overall goal of pattern recognition is classification.
Developing a classifier from spectral or chromatographic data may be desirable for any number of reasons, including source identification, detection of odorants, presence or absence of disease in an animal or person from which the sample was taken, and food quality testing.

During the past two years, some new classification methods were reported in the literature. Li (D1) developed a robust linear discriminant analysis routine, which has a high breakdown value for outliers. Carrieri (D2) developed an artificial neural network for detecting amino acids, sugars, and other solid organic matter by analyzing the polarized light scattering signatures of these compounds as a Mueller matrix. The product of the training is a weight matrix that, when applied as a filter, discerns the presence of the analytes from their cured susceptive Mueller matrix difference elements. Lavine (D3-D5) developed a genetic algorithm (GA) for pattern recognition analysis of chromatographic and spectroscopic data. The GA selects features that optimize the separation of the classes in a plot of the two largest principal components (PCs) of the data. Because the largest PCs capture the bulk of the variance in the data, the features chosen by the GA convey information primarily about differences between the classes in the data set. Hence, the principal component analysis routine, embedded in the fitness function of the GA, acts as an information filter, significantly reducing the size of the search space, since it restricts the search to feature subsets whose PC plots show clustering on the basis of class. In addition, the algorithm focuses on those classes and/or samples that are difficult to classify as it trains using a form of boosting. Samples that consistently classify correctly are not as heavily weighted in the analysis as samples that are difficult to classify. Over time, the algorithm learns its optimal parameters in a manner similar to a neural network. The proposed algorithm integrates aspects of artificial intelligence and evolutionary computations to yield a “smart” one-pass procedure for pattern recognition.

Much of the literature on pattern recognition during this reporting period focused on novel and not so novel applications. Only the novel applications are referenced in this section. Furthermore, the bulk of the references in this section are organized according to type of application, namely applications to art, biological analyses, chromatography, sensors, and spectral interpretation. The majority of these studies involved the use of one or several techniques that are fairly well established, e.g., neural networks, linear discriminant analysis, and fuzzy clustering. Classification of data remains an important subject, as evidenced by the fact that pattern recognition had the second largest number of citations in the Chemical Abstracts database during this reporting period.

One of the most interesting applications of pattern recognition techniques involved near-infrared reflectance imaging. Mantsch (D6) utilized linear discriminant analysis (LDA) and cluster analysis to analyze near-infrared spectral data for the purposes of authenticating artwork. Neural nets were used to interpret two-dimensional gel electrophoresis spot patterns (D7) obtained from Streptomyces coelicolor and controls. Self-organizing maps were shown to be useful in the classification of human blood plasma lipoprotein lipid profiles from 1H NMR data (D8). Sarmini (D9) used cluster analysis to assess the similarity of solvents used for buffering background electrolytes in capillary zone electrophoresis.
Each solvent was described by the electrophoretic mobilities of 26 aromatic solutes. Zellers (D10) used pattern recognition techniques to investigate vapor recognition by SAW devices as a function of the number of sensors in the array, the polymer sensor coatings employed, and the number and concentration of the vapors being analyzed. Barko (D11) used fuzzy clustering to discriminate organic compounds on the basis of their piezoelectric chemical sensor data. Principal component analysis was used to select the coating materials for the sensors. McAlernon (D12) used piezoelectric crystal sensor arrays and Kohonen self-organizing maps to discriminate among different organic vapors. The coatings on each crystal were selected to promote a range of different solvation interactions with the sample test set. Artificial neural networks (D13) also proved to be useful in classifying modified starches from their IR spectra and in clustering chemical compounds based on their spectral characteristics (D14). Finally, neural networks were used to identify structural features of organic compounds (D15). The training algorithms that had been reported to give good performance with hypothetical problems did not necessarily give the best results with real-world problems, suggesting that selecting the best learning algorithm is problem-dependent.

STRUCTURE-PROPERTY RELATIONSHIPS

In this section, the use of multivariate methods to build linear or nonlinear models that relate chemical structure to a physical
or chemically measurable property is reviewed. Most studies in this area focus on representing the structure of a compound by a set of molecular descriptors and applying soft modeling methods to discover the relationship between structure and property. Researchers have also attempted to develop models that predict some type of spectroscopic response from chemical structure and vice versa.

Harrington (E1) trained a neural network to predict the toxicity of organic phosphorus pesticides by identifying the active substructure and then using the network to screen GC/MS data for environmentally hazardous compounds. Heravi (E2) used a neural network to model the flame ionization response factors of a variety of organic compounds. Barden (E3) trained neural nets to predict the location of a characteristic IR peak from 1116 spectra. Liu (E4) trained a neural net to predict the association constant of mono- and disubstituted aromatics for α-cyclodextrin inclusion complexes. The descriptors found useful conveyed information about the hydrophobicity and size of the substituents bound to the aromatic ring. Tetteh (E5) trained a radial basis neural network to simultaneously model the flash point and boiling point of organic compounds using topological descriptors. Hall (E6) utilized a neural network and topological indices, including a new electro-topological state index, to model boiling point for a set of 372 saturated compounds including alkanes, alcohols, and chloroalkanes. Liang (E7) predicted vapor pressure from computationally derived descriptors for a set of 469 compounds. Multiple linear regression was found to produce the best model. A method for predicting the aqueous solubility of 211 drugs was developed using topological indices and artificial neural network modeling (E8). Wold (E9) developed a strategy for cluster analysis of organic compounds for combinatorial chemistry and applied it to a set of 627 alcohols.
Each compound was represented by 50 computed molecular descriptors. PLS models of each cluster confirmed the chemical relevance of the clusters formed. Gasteiger (E10) used principal component analysis and self-organizing neural networks to focus on the changes of electronic features of oxygen atoms at the reaction site. Good correlations were found between the similarities in the changes of the electronic features of oxygen atoms of the reaction sites and the similarities in the substructural transformations at the reaction site as well as with the known reaction types. Jurs (E11) predicted clearing temperature for a series of liquid crystals from molecular structures, which were encoded by numerical descriptors conveying information regarding size, shape, and the ability to participate in intermolecular interactions. Qin (E12) used Raman spectra to develop a PLS model to predict the degree of crystallinity of isotropically crystalline and amorphous polylactide. The crystallinity of the training set samples was assessed using differential scanning calorimetry. Finally, Meusinger (E13) developed a quantitative structure-property relationship (QSPR) to model the knocking behavior of 241 organic compounds using topological descriptors and a nonbinary genetic algorithm.

MULTIWAY ANALYSES

Many chemical problems involve three-way arrays of data. For example, liquid chromatography coupled to a fluorescence spectrometer produces a three-way matrix consisting of layers of two-way excitation-emission matrices that vary as a function of elution time. The fluorescent intensities depend on the excitation wavelength, emission wavelength, and elution time, the variables representing the three modes. Thus, three-way techniques offer exciting possibilities for research.

During this reporting period, there were many papers published on three-way methodology. DeJong (F1) demonstrates that Bro’s three-way PLS is equivalent to Stahle’s linear three-way decomposition. MacGregor (F2) compares and discusses the different ways of unfolding, mean centering, and scaling the three-way matrix with respect to their effects on the analysis of batch data. Bro (F3) proposes a procedure for fitting a PARAFAC 2 model to a set of data matrices. Using this approach, he was able to model chromatographic data with retention time shifts (F4). Smilde (F5) investigates cross-validation as an approach for choosing the number of components to use in the Tucker3 model. Henrion (F6) develops a new criterion for simple structure transformations of core arrays based on the maximization of the variance of squared core entries. Smilde (F7) proposes an algorithm for fitting a range of constrained three-mode factor models to process data. Paatero and Hopke (F8-F10) describe a weighted nonnegative least squares algorithm for three-way factor analysis, which they call PMF3. The algorithm imposes a logarithmic penalty function to achieve nonnegativity, and it can dynamically reweight the data, permitting a robust analysis of outlier-containing data. Jiang (F11) describes a new calibration procedure for three-way data, which identifies a few vectors that minimize a least-squares criterion.

Applications of three-way techniques are too numerous to be cited in their entirety. Nevertheless, some of the more novel applications are listed. Wise (F12) investigated the use of three-way methods for fault detection in a semiconductor etch process.
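The structure of such a three-way array, and what unfolding does to it, can be illustrated with a short Python/NumPy sketch. It builds a trilinear (PARAFAC-type) array from hypothetical elution profiles, excitation spectra, and emission spectra, then unfolds it along the elution-time mode. All names, dimensions, and random profiles are assumptions for illustration. No fitting algorithm is implemented here.

```python
import numpy as np

def khatri_rao(B, C):
    """Column-wise Kronecker product: column f is kron(B[:, f], C[:, f])."""
    return np.column_stack([np.kron(B[:, f], C[:, f]) for f in range(B.shape[1])])

rng = np.random.default_rng(2)
n_time, n_ex, n_em, n_comp = 8, 5, 6, 2    # hypothetical dimensions
A = rng.random((n_time, n_comp))           # elution profiles
B = rng.random((n_ex, n_comp))             # excitation spectra
C = rng.random((n_em, n_comp))             # emission spectra

# Trilinear model: X[i, j, k] = sum over f of A[i, f] * B[j, f] * C[k, f]
X = np.einsum("if,jf,kf->ijk", A, B, C)

# Unfold along the elution-time mode: each row is one flattened
# excitation-emission matrix.
X_time = X.reshape(n_time, n_ex * n_em)

# The unfolded array obeys a bilinear model built from the same factors.
X_model = A @ khatri_rao(B, C).T
```

The unfolded matrix can be analyzed by ordinary two-way methods, but unfolding discards the constraint that every row shares the same excitation and emission factors. Methods such as PARAFAC and Tucker3 keep that constraint.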
Booksh (F13) accurately quantitated trace pesticides and PAHs by coupling excitation-emission matrix fluorescence spectroscopy with three-way analysis. Martins (F14) studied piroxicam in cyclodextrin-containing hydrophilic solvents by total fluorescence using PARAFAC to resolve its spectra. Beltran (F15, F16) used three-way PLS to determine the concentration of PAHs in water samples via HPLC coupled with fast-scanning fluorescence spectroscopy. Windig (F17) used three-way methods to resolve NMR data of complex mixtures. Hopke (F18, F19) investigated arctic haze using three-way techniques. Smilde (F20) used three-way techniques to estimate rate constants and pure UV-visible spectra of intermediates and products in a two-step reaction.

Barry K. Lavine is an Associate Professor of Chemistry at Clarkson University in Potsdam, NY. He has published more than 60 papers in chemometrics and is on the editorial board of several journals. He is also Assistant Editor of Chemometrics for Analytical Letters and teaches a short course, entitled “Winning at Chemometrics”, for The American Chemical Society. Lavine’s research interests encompass many aspects of the application of computers to chemical analysis, including pattern recognition and multivariate calibration using genetic algorithms and other evolutionary techniques.
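As a minimal illustration of the three-way data layout described under Multiway Analyses (excitation wavelength by emission wavelength by elution time), the sketch below builds a synthetic trilinear array and unfolds it into a two-way matrix before mean centering. It is not taken from any of the cited papers: the NumPy code, the array sizes, the random profiles, and the helper names `trilinear_array` and `unfold` are all assumptions chosen for illustration.

```python
import numpy as np


def trilinear_array(A, B, C):
    """Build X[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r] (PARAFAC-type structure)."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)


def unfold(X, mode):
    """Unfold a three-way array into a matrix whose rows index the chosen mode."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)


# Hypothetical sizes: 8 excitation wavelengths, 10 emission wavelengths,
# 15 elution times, 2 fluorescent components.
rng = np.random.default_rng(0)
n_ex, n_em, n_time, n_comp = 8, 10, 15, 2
A = rng.random((n_ex, n_comp))    # assumed excitation profiles
B = rng.random((n_em, n_comp))    # assumed emission profiles
C = rng.random((n_time, n_comp))  # assumed elution profiles

X = trilinear_array(A, B, C)      # shape (8, 10, 15)

# Unfold so that each row holds one excitation-emission matrix (one elution time).
X_time = unfold(X, mode=2)        # shape (15, 80)

# Column mean centering of the unfolded matrix, one of several possible choices.
X_centered = X_time - X_time.mean(axis=0)

# With noise-free trilinear data, the unfolded matrix has as many
# independent directions as there are components.
rank = np.linalg.matrix_rank(X_time)
```

Unfolding along a different mode (for example `unfold(X, mode=0)`) gives a different two-way matrix from the same data, which is why the choice of unfolding, centering, and scaling can affect the subsequent analysis.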
LITERATURE CITED

INTRODUCTION
(A1) Lavine, B. K.; Brown, S. D. Managing Mod. Lab. 1998, 3(1), 9-14.
(A2) Lavine, B. K. Anal. Chem. 1998, 70(12), 209R-228R.
(A3) Brown, S. D. Comput. Chem. Eng. 1998, 23(2), 203-216.
(A4) Wold, S.; Sjostrom, M. Chemom. Intell. Lab. Syst. 1998, 44(1,2), 3-14.
(A5) Hasegawa, T. Anal. Chem. 1999, 71(15), 3085-3091.
Analytical Chemistry, Vol. 72, No. 12, June 15, 2000
(A6) Wentzell, P. D.; Lohnes, M. T. Chemom. Intell. Lab. Syst. 1999, 45(1,2), 65-85.
(A7) Grung, B.; Manne, R. Chemom. Intell. Lab. Syst. 1998, 42(1,2), 125-139.
(A8) Hopke, P. K.; Xie, Y.; Paatero, P. J. Chemom. 1999, 13(3-4), 343-352.

MULTIVARIATE CURVE RESOLUTION
(B1) Shao, X.; Cai, W.; Sun, P. Chemom. Intell. Lab. Syst. 1998, 43(1,2), 147-155.
(B2) Chen, Z.-P.; Liang, Y.-Z.; Jiang, J.-H.; Li, Y.; Qian, J.-Y.; Yu, R.-Q. J. Chemom. 1999, 13(1), 15-30.
(B3) Wang, J.-H.; Jiang, J.-H.; Xiong, J.-F.; Li, Y.; Liang, Y.-Z.; Yu, R.-Q. J. Chemom. 1998, 12(2), 95-104.
(B4) Elbergali, A.; Nygren, J.; Kubista, M. Anal. Chim. Acta 1999, 379(1-2), 143-158.
(B5) Xie, Y.-L.; Hopke, P. K.; Paatero, P. J. Chemom. 1998, 12(6), 357-364.
(B6) Manne, R.; Shen, H.; Liang, Y. Chemom. Intell. Lab. Syst. 1999, 45(1,2), 171-176.
(B7) Shen, H.; Manne, R.; Xu, Q.; Chen, D.; Liang, Y. Chemom. Intell. Lab. Syst. 1999, 45(1,2), 323-328.
(B8) Shao, X.; Cai, W. J. Chemom. 1998, 12(2), 85-93.
(B9) Lohnes, M. T.; Guy, R. D.; Wentzell, P. D. Anal. Chim. Acta 1999, 389(1-3), 95-113.
(B10) Ahlberg, C. Drug Discovery Today 1999, 4(8), 370-376.
(B11) Liang, Y.-Z.; Leung, A. K.-M.; Chau, F.-T. J. Chemom. 1999, 13(5), 511-524.
(B12) Xu, Q.; Liang, Y. Chemom. Intell. Lab. Syst. 1999, 45(1,2), 335-338.
(B13) Kudrev, A.; Gargallo, R.; Izquierdo-Ridorsa, A.; Tauler, R.; Casassas, E. Anal. Chim. Acta 1998, 363(2-3), 119-132.
(B14) Mendieta, J.; Diaz-Cruz, M. S.; Esteban, M.; Tauler, R. Biophys. J. 1998, 74(6), 2876-2888.
(B15) Gargallo, R.; Tauler, R.; Izquierdo-Ridorsa, A. Spectrosc. Biol. Mol.: Mod. Trends, [Eur. Conf.], 7th, 1997; Carmona, P., Navarro, R., Hernanz, A., Eds.; Kluwer: Dordrecht, Neth., 1997; pp 249-250.
(B16) DeJuan, A.; Izquierdo, A.; Tauler, R. Spectrosc. Biol. Mol.: Mod. Trends, [Eur. Conf.], 7th, 1997; Carmona, P., Navarro, R., Hernanz, A., Eds.; Kluwer: Dordrecht, Neth., 1997; pp 247-248.
(B17) Tauler, R.; Gargallo, R.; Vives, M.; Izquierdo-Ridorsa, A. Chemom. Intell. Lab. Syst. 1999, 46(2), 275-295.
(B18) Furusjoe, E.; Danielsson, L.-G.; Koenberg, E.; Rentsch-Jonas, M.; Skagerberg, B. Anal. Chem. 1998, 70(9), 1726-1734.
(B19) Esteban, M.; Harlyk, C.; Rodriguez, A. R. J. Electroanal. Chem. 1999, 468(2), 202-212.
(B20) Diaz-Cruz, M. S.; Mendieta, J.; Tauler, R.; Esteban, M. Anal. Chem. 1999, 71(20), 4629-4636.
(B21) Keesey, R. L.; Ryan, M. D. Anal. Chem. 1999, 71(9), 1744-1752.
(B22) Bijlsma, S.; Smilde, A. K. Anal. Chim. Acta 1999, 396(2-3), 231-240.
(B23) Windig, W.; Antalek, B.; Sorriero, L. J.; Bijlsma, S.; Louwerse, D. J.; Smilde, A. K. J. Chemom. 1999, 13(2), 95-110.
(B24) Windig, W.; Hornak, J. P.; Antalek, B. J. Magn. Reson. 1998, 132(2), 298-306.
(B25) Antalek, B.; Hornak, J. P.; Windig, W. J. Magn. Reson. 1998, 132(2), 307-315.
(B26) Andrew, J. J.; Hancewicz, T. M. Appl. Spectrosc. 1998, 52(6), 797-807.

MULTIVARIATE CALIBRATION
(C1) Gil, J. A.; Romera, R. J. Chemom. 1998, 12(6), 365-378.
(C2) Robertsson, G. Chemom. Intell. Lab. Syst. 1999, 47(1), 99-106.
(C3) Hadjiiski, L.; Geladi, P.; Hopke, P. Chemom. Intell. Lab. Syst. 1999, 49(1), 91-103.
(C4) Andersson, C. A. Chemom. Intell. Lab. Syst. 1999, 47(1), 51-63.
(C5) Stork, C. L.; Kowalski, B. R. Chemom. Intell. Lab. Syst. 1999, 48(2), 151-166.
(C6) Jouan-Rimbaud, D.; Bouveresse, E.; Massart, D. L.; de Noord, O. E. Anal. Chim. Acta 1999, 388(3), 283-301.
(C7) Conlin, A. K.; Martin, E. B.; Morris, A. J. Chemom. Intell. Lab. Syst. 1998, 44(1,2), 161-173.
(C8) Brown, C. D.; Wentzell, P. D. J. Chemom. 1999, 13(2), 133-152.
(C9) Faber, N. M. J. Chemom. 1999, 13(2), 185-192.
(C10) Haaland, D. M.; Han, L.; Niemczyk, T. M. Appl. Spectrosc. 1999, 53(4), 390-395.
(C11) Spiegelman, C. H.; McShane, M. J.; Cote, G. L.; Goetz, M. J.; Motamedi, M.; Yue, Q. L. Anal. Chem. 1998, 70(1), 35-44.
(C12) Walmsley, A. D. Anal. Chim. Acta 1997, 354(1-3), 225-232.
(C13) McShane, M. J.; Cameron, B. D.; Cote, G. L.; Motamedi, M.; Spiegelman, C. H. Anal. Chim. Acta 1999, 388(3), 251-264.
(C14) Forina, M.; Casolino, C.; Millan, C. P. J. Chemom. 1999, 13(2), 165-184.
(C15) Woodward, A. M.; Alsberg, B. K.; Kell, D. B. Chemom. Intell. Lab. Syst. 1998, 40(1), 101-107.
(C16) Hoffmann, U.; Zanier-Szydlowski, N. J. Near Infrared Spectrosc. 1999, 7(1), 33-45.
(C17) Sum, S. T.; Brown, S. D. Appl. Spectrosc. 1998, 52(6), 869-877.
(C18) Lin, J. Appl. Spectrosc. 1998, 52(12), 1591-1596.
(C19) Wold, S.; Antti, H.; Lindgren, F.; Ohman, J. Chemom. Intell. Lab. Syst. 1998, 44(1,2), 175-185.
(C20) Sjoblom, J.; Svensson, O.; Josefson, M.; Kullberg, H.; Wold, S. Chemom. Intell. Lab. Syst. 1998, 44(1,2), 229-244.
(C21) Marklund, A.; Hauksson, J. B.; Edlund, U.; Sjostrom, M. Nord. Pulp Pap. Res. J. 1999, 14(2), 140-148.
(C22) Xie, Y.; Hopke, P. K. Anal. Chim. Acta 1999, 384(2), 193-205.
(C23) Swierenga, H.; De Weijer, A. P.; Buydens, L. M. C. J. Chemom. 1999, 13(3-4), 237-249.
(C24) Arnold, M. A.; Burmeister, J. J.; Small, G. W. Anal. Chem. 1998, 70(9), 1773-1781.
(C25) Schenkman, K. A.; Marble, D. R.; Feigl, E. O.; Burns, D. H. Appl. Spectrosc. 1999, 53(3), 325-331.
(C26) Zhang, S.; Soller, B. R.; Micheels, R. H. Appl. Spectrosc. 1998, 52(3), 401-406.
(C27) Alam, M. K.; Rohrscheib, M. R.; Franke, J. E.; Niemczyk, T. M.; Maynard, J. D.; Robinson, M. R. Appl. Spectrosc. 1999, 53(3), 316-324.
(C28) Maeda, H.; Wang, Y.; Ozaki, Y.; Suzuki, M.; Czarnecki, M. A.; Iwahashi, M. Chemom. Intell. Lab. Syst. 1999, 45(1,2), 121-130.
(C29) Zhou, X.; Hines, P. A.; White, K. C.; Borer, M. W. Anal. Chem. 1998, 70(2), 390-394.
(C30) Niemczyk, T. M.; Zhang, S.; Franke, J. E.; Haaland, D. M. Appl. Spectrosc. 1999, 53(7), 822-828.
(C31) Bak, J. Appl. Spectrosc. 1999, 53(11), 1375-1381.
(C32) Frenich, A. G.; Vidal, J. L. M.; Galera, M. M. Anal. Chem. 1999, 71(21), 4844-4850.
(C33) Herrero, A.; Ortiz, M. C. Talanta 1998, 46(1), 129-138.
(C34) Kalivas, J. H. Chemom. Intell. Lab. Syst. 1999, 45(1,2), 215-224.
(C35) Bakken, G. A.; Houghton, T. P.; Kalivas, J. H. Chemom. Intell. Lab. Syst. 1999, 45(1,2), 225-239.
(C36) Nelson, M. P.; Aust, J. F.; Dobrowolski, J. A.; Verly, P. G.; Myrick, M. L. Proc. - ASC Tech. Conf. Compos. Mater., 13th, 1998; Vizzini, A. J., Uleck, K. R., Eds.; American Society for Composites: Los Angeles, California, 1998; pp 687-703.
(C37) Depczynski, U.; Jetter, K.; Molt, K.; Niemoller, A. Chemom. Intell. Lab. Syst. 1999, 47(2), 179-187.
(C38) Harrington, P. de B. Anal. Chem. 1998, 70(7), 1297-1306.

PATTERN RECOGNITION
(D1) Li, Y.; Jiang, J.-H.; Chen, Z.-P.; Xu, C.-J.; Yu, R.-Q. J. Chemom. 1999, 13(1), 3-13.
(D2) Carrieri, A. H. Appl. Opt. 1999, 38(17), 3759-3766.
(D3) Lavine, B. K.; Moores, A.; Helfend, L. K. J. Anal. Appl. Pyrolysis 1999, 50(1), 47-62.
(D4) Lavine, B. K.; Moores, A. J.; Mayfield, H. T. Anal. Lett. 1998, 31(15), 2805-2822.
(D5) Lavine, B. K.; Moores, A. J.; Mayfield, H.; Faruque, A. Microchem. J. 1999, 61(1), 69-78.
(D6) Mansfield, J. R.; Sowa, M. G.; Majzels, C.; Collins, C.; Cloutis, E.; Mantsch, H. H. Vib. Spectrosc. 1999, 19(1), 33-45.
(D7) Vohradsky, J. Electrophoresis 1997, 18(15), 2749-2754.
(D8) Kaartinen, J.; Hiltunen, Y.; Kovanen, P. T.; Ala-Korpela, M. NMR Biomed. 1998, 11(4/5), 168-176.
(D9) Sarmini, K.; Reich, G.; Kenndler, E. J. Microcolumn Sep. 1999, 11(8), 576-581.
(D10) Park, J.; Groves, W. A.; Zellers, E. T. Anal. Chem. 1999, 71(17), 3877-3886.
(D11) Barko, G.; Abonyi, J.; Hlavay, J. Anal. Chim. Acta 1999, 398(2-3), 219-226.
(D12) McAlernon, P.; Slater, J. M.; Lau, K.-T. Analyst (Cambridge, U.K.) 1999, 124(6), 851-857.
(D13) Dolmatova, L.; Ruckebusch, C.; Dupuy, N.; Huvenne, J.-P.; Legrand, P. Appl. Spectrosc. 1998, 52(3), 329-338.
(D14) Cleva, C.; Cachet, C.; Cabrol-Bass, D. Analusis 1999, 27(1), 81-90.
(D15) Eghbaldar, A.; Forrest, T. P.; Cabrol-Bass, D. Anal. Chim. Acta 1998, 359(3), 283-301.

STRUCTURE-PROPERTY RELATIONSHIPS
(E1) Cai, C.; Harrington, P. de B. Anal. Chem. 1999, 71(19), 4134-4141.
(E2) Jalali-Heravi, M.; Fatemi, M. H. J. Chromatogr., A 1998, 825(2), 161-169.
(E3) Barden, C. J.; Boysworth, M. K.; Palocsay, F. A. J. Chem. Inf. Comput. Sci. 1998, 38(3), 483-488.
(E4) Liu, L.; Li, W.-G.; Guo, Q.-X. J. Inclusion Phenom. Macrocyclic Chem. 1999, 34(3), 291-298.
(E5) Tetteh, J.; Suzuki, T.; Metcalfe, E.; Howells, S. J. Chem. Inf. Comput. Sci. 1999, 39(3), 491-507.
(E6) Hall, L. H.; Story, C. T. SAR QSAR Environ. Res. 1997, 6(3-4), 139-161.
(E7) Liang, C.; Gallagher, D. A. J. Chem. Inf. Comput. Sci. 1998, 38(2), 321-324.
(E8) Huuskonen, J.; Salo, M.; Taskinen, J. J. Chem. Inf. Comput. Sci. 1998, 38(3), 450-456.
(E9) Linusson, A.; Wold, S.; Norden, B. Chemom. Intell. Lab. Syst. 1998, 44(1,2), 213-227.
(E10) Satoh, H.; Sacher, O.; Nakata, T.; Chen, L.; Gasteiger, J.; Funatsu, K. J. Chem. Inf. Comput. Sci. 1998, 38(2), 210-219.
(E11) Johnson, S. R.; Jurs, P. C. Chem. Mater. 1999, 11(4), 1007-1023.
(E12) Qin, D.; Kean, R. T. Appl. Spectrosc. 1998, 52(4), 488-495.
(E13) Meusinger, R.; Moros, R. Chemom. Intell. Lab. Syst. 1999, 46(1), 67-78.

MULTIWAY ANALYSES
(F1) De Jong, S. J. Chemom. 1998, 12(1), 77-81.
(F2) Westerhuis, J. A.; Kourti, T.; MacGregor, J. F. J. Chemom. 1999, 13(3-4), 397-413.
(F3) Kiers, H. A. L.; Ten Berge, J. M. F.; Bro, R. J. Chemom. 1999, 13(3-4), 275-294.
(F4) Bro, R.; Andersson, C. A.; Kiers, H. A. L. J. Chemom. 1999, 13(3-4), 295-309.
(F5) Louwerse, D. J.; Smilde, A. K.; Kiers, H. A. L. J. Chemom. 1999, 13(5), 491-510.
(F6) Henrion, R.; Andersson, C. A. Chemom. Intell. Lab. Syst. 1999, 47(2), 189-204.
(F7) Kiers, H. A. L.; Smilde, A. K. J. Chemom. 1998, 12(2), 125-147.
(F8) Paatero, P. Chemom. Intell. Lab. Syst. 1997, 38(2), 223-242.
(F9) Hopke, P. K.; Paatero, P.; Jia, H.; Ross, R. T.; Harshman, R. A. Chemom. Intell. Lab. Syst. 1998, 43(1,2), 25-42.
(F10) Geladi, P.; Xie, Y.-L.; Polissar, A.; Hopke, P. J. Chemom. 1998, 12(5), 337-354.
(F11) Jiang, J.-H.; Wu, H.-L.; Chen, Z.-P.; Yu, R.-Q. Anal. Chem. 1999, 71(19), 4254-4262.
(F12) Wise, B. M.; Gallagher, N. B.; Butler, S. W.; White, D. D., Jr.; Barna, G. G. J. Chemom. 1999, 13(3-4), 379-396.
(F13) JiJi, R. D.; Cooper, G. A.; Booksh, K. S. Anal. Chim. Acta 1999, 397(1-3), 61-72.
(F14) Martins, J. A.; Sena, M. M.; Poppi, R. J.; Pessine, F. B. T. Appl. Spectrosc. 1999, 53(5), 510-522.
(F15) Beltran, J. L.; Guiteras, J.; Ferrer, R. Anal. Chem. 1998, 70(9), 1949-1955.
(F16) Beltran, J. L.; Guiteras, J.; Ferrer, R. J. Chromatogr., A 1998, 802(2), 263-275.
(F17) Windig, W.; Antalek, B. Chemom. Intell. Lab. Syst. 1999, 46(2), 207-219.
(F18) Xie, Y.-L.; Hopke, P. K.; Paatero, P.; Barrie, L. A.; Li, S.-M. Atmos. Environ. 1999, 33(16), 2549-2562.
(F19) Polissar, A. V.; Hopke, P. K.; Paatero, P.; Malm, W. C.; Sisler, J. F. J. Geophys. Res., [Atmos.] 1998, 103(D15), 19045-19057.
(F20) Bijlsma, S.; Louwerse, D. J.; Smilde, A. K. J. Chemom. 1999, 13(3-4), 311-329.