Anal. Chem. 2010, 82, 4699–4711
Chemometrics
Barry Lavine*,† and Jerry Workman‡
Department of Chemistry, Oklahoma State University, Stillwater, Oklahoma 74078, and Technology Business Associates, 3031 Rivoli, Newport Beach, California 92660
Review Contents: Pattern Recognition; Multivariate Calibration; Multivariate Curve Resolution; Literature Cited
This review, the eighteenth of this series and the sixteenth with the title of “Chemometrics,” covers the most significant developments in the field from January 2008 through December 2009. As in the previous review covering the period 2006-2007 (1), new developments in the field are highlighted and recent trends within the field are discussed. The current review format limits the number of citations, so only the most novel or “groundbreaking” papers are cited. We apologize in advance to any authors who feel slighted, but the citation and space limitations restrict the potential for broader coverage. This is particularly difficult for reviewers because the number of publications within the field of chemometrics continues to increase dramatically. If one includes the peripheral applications of chemometrics in such areas as image analysis, biomedical physics, genetic screening, drug discovery, and process technology, then the papers published during this period are too numerous to evaluate comprehensively. There were thousands of citations across these fields using chemometric-like data processing methods during 2008 and 2009. Multivariate methods are increasingly used across a broad range of scientific disciplines, and these tools have become rather standard training in most graduate level scientific disciplines. In earlier days, one had to code these algorithms in order to use them; however, their use has now become elementary because more powerful software packages are available today that provide either a standard form of the algorithm or the ability to customize pre- and postprocessing algorithms. Since multivariate analysis methods are now standard for basic and applied research, the number of publications in each review period is burgeoning.
The delineation of chemometric nomenclature is essential in such a growing and creative field.
Such terminology discussions both clarify and unify the field and provide a basis for cooperative understanding and research. IUPAC has undertaken a project to create a glossary of terms and concepts used in chemometrics (2). This will be accomplished by consultation with the community through a wiki, a Web site that can be modified by users (see http://www.iupacterms.eigenvector.com/index.php?title=Main_Page). Over time new terms can be added and consensus definitions developed. The definitions will be published as IUPAC recommendations.
† Oklahoma State University.
‡ Technology Business Associates.
10.1021/ac101202z © 2010 American Chemical Society. Published on Web 05/19/2010
Nonspecialists have discovered the advantages offered by chemometrics in their own research, which include the efficient extraction of information from chemical data and the design of better and more informative experiments. Several texts on this subject were published during the 2008-2009 review period, including those covering chemometrics as it pertains to methods development in capillary zone electrophoresis (3), practical statistics as it applies to selecting statistical tests for specific data analysis needs (4), and pattern recognition (5), which is one of the most important and rapidly advancing areas of chemometrics, driven in large measure by analysis problems in such areas as LC/MS, GC/MS, and NMR. The application of chemometrics in such diverse fields as environmental analysis (6, 7) and atomic spectroscopy (8) has been highlighted in several texts. A four volume set entitled “Comprehensive Chemometrics,” which is designed to be the preeminent technical reference for chemometrics, was published in 2009 (9). An introductory text on multivariate statistical analysis (10) and one on Kalman filtering (11) were also published during this period.
There are a number of chemometrics software packages that are commercially available. Many packages offer the flexibility for users to code their own versions of algorithms or to combine or modify programs to suit their specific application requirements. We make no claims or recommendations for software choices; users must select on the basis of problem solving requirements, experience, and the familiarity of their working group with the various tool sets. We list the most common packages below in alphabetical order based on company name.
Applied Chemometrics offers a variety of software for chemometric problem solving; their Web site is located at http://www.Chemometrics.com.
They offer Multi-Quant for Windows and other platforms, Multi-Qual for Windows and other platforms, a MATLAB Chemometric Toolbox (http://www.chemometrics.com/software/chemometrics.html), and a MATLAB Factor Analysis Toolbox (http://www.chemometrics.com/software/fatb.html).
Camo is one of the original chemometric software vendors. Their UNSCRAMBLER software package has been enhanced throughout the years to expand its capabilities. Their Web site is located at http://www.camo.com/.
Eigenvector’s PLS Toolbox offers a set of powerful tools for basic to advanced chemometric problem solving. They also provide training and workshops to assist users who are interested in all types of chemistry related quantitative or qualitative multivariate analysis. Eigenvector offers a variety of products and services specifically related to chemometric problem solving. Their Web site is located at http://www.eigenvector.com/.
Analytical Chemistry, Vol. 82, No. 12, June 15, 2010
Infometrix was one of the first and foremost chemometrics software companies. Since 1978 they have been active in assisting a wide variety of customers with their software and data analysis problems. Their main software product is called Pirouette, and they have expanded it to metabolomics and other cutting edge applications. Their Web site is located at http://www.infometrix.com/.
Multivariate analysis in LabVIEW can be performed using its G programming language. With LabVIEW, real-time chemometrics is combined with instrument control and data acquisition on a single software platform. Various user applications and other helpful suggestions for G language chemometric programming can be found at http://www.ni.com/labview/.
MathCAD offers a large number of mathematical functions useful for multivariate problem solving. There are a number of sites that provide programming advice and sharing of MathCAD scripts. One such site is found at the following URL: http://www.ptc.com/appserver/search/results.jsp?q=multivariate.
The Math Works MATLAB product (http://www.mathworks.com/products/) has been one of the principal development tools for chemometric research for the past 25 years. Many users and researchers exchange scripts (MATLAB programs) for a variety of chemometric applications; MATLAB is a powerful PC-based matrix laboratory tool. The MATLAB Chemometrics Toolbox (third-party software) can be found at http://www.mathworks.com/products/connections/product_detail/product_35297.html. Information about Math Works and tools for chemometrics can be found at http://www.mathtools.net/MATLAB/Chemometrics/index.html.
Microsoft Excel has commercially available add-ons for chemometrics. One can also browse the Web for helpful suggestions on creating one's own set of tools using the Excel mathematical functions.
A commercial source for help is Chemetrica, which is available from RSD Associates at http://www.rsd-associates.com/chemetrica.htm. We recommend searching the Web for commercial, university, and other “power user” sites to locate add-ons as required for your particular data analysis problems using Excel.
MultiSimplex, a software tool for experimental design and analysis, is offered at http://www.multisimplex.com/.
R is a statistical computing and graphics environment. It is a GNU project (http://www.gnu.org/) resembling the earlier S language developed at Bell Laboratories. R is Free Software in source code form that is compatible with UNIX, Linux, FreeBSD, MacOS, and Windows platforms. The URL is http://www.r-project.org/.
Thermo Fisher GRAMS is considered to be one of the better packages for analysis of spectroscopic data because of its functionality, which includes a utility for converting data formats that is one of its outstanding features. The Web site http://www.thermo.com/com/cda/landingpage/0,,585,00.html contains further information about GRAMS.
A number of chemometric conferences were held during 2008 and 2009 covering a variety of topics, with many having international venues. Those of note are listed here in chronological order with their corresponding URLs:
(1) WSC-6 (sixth Winter Symposium on Chemometrics), February 18-22, 2008, Kazan, Russia, http://rcs.chph.ras.ru/WSC6/index.html
(2) CMA4CH 08 (Multivariate Analysis and Chemometrics applied to Cultural Heritage and Environment), June 1-4, 2008, Ventotene Island, Italy, http://w3.uniroma1.it/cma4ch/
(3) CAC-2008 (11th Chemometrics in Analytical Chemistry Conference), June 30-July 4, 2008, Montpellier, France, http://www.cac2008.org/
(4) Analytics & Analysts (2nd International Forum), September 22-26, 2008, Voronezh, Russia, http://www.vgta.vrn.ru/
(5) FACSS 2008, September 28-October 2, Reno, NV, http://www.facss.org
(6) SSC-11 (Scandinavian Symposium on Chemometrics), June 8-11, 2009, Loen/Stryn, Norway, http://org.uib.no/ssc11/index.htm
(7) Euroanalysis 2009, September 6-10, Innsbruck, Austria, http://www.euroanalysis2009.at/
(8) Conferentia Chemometrica 2009, September 27-30, Siofok, Lake Balaton, Hungary, http://www.cc2009.mke.org.hu/
(9) International Workshop on Multivariate Image Analysis, September 28-29, Valencia, Spain, http://mseg.webs.upv.es/MIA_Workshop/index.html
(10) FACSS 2009, October 18-22, Louisville, KY, http://www.facss.org
There are many national and local chemometric societies. Many groups are quite active while others are inactive. The groups that have greater visibility are listed here. One of the main links to multiple chemometric Web sites, which has been mentioned in previous reviews, is located at http://www.namics.nysaes.cornell.edu/. This site has links to the Australian, Belgian, British, Czech, Danish, Dutch, German, Finnish, French, Italian, North American, Norwegian, Russian, South African, Spanish, and Swedish chemometric societies. Umea University is the original home of chemometrics society activity worldwide, and the group continues to be very active in research. Their Analytical Chemistry Springboard includes a comprehensive set of chemometrics links and is found at http://www.anachem.umu.se/cgi-bin/jumpstation.exe?Chemometrics.
Other chemometric URLs such as http://cheminformatics.org/ and Cheminformatics Links contain 637 links in 90 categories, including multiple data sets in several distinct categories. The Wiley Chemometrics & Informatics site contains news events, conferences, and recent publications pertaining to chemometrics. Its link is found by going to http://www.spectroscopynow.com/ and clicking on the Chemometrics and Informatics tab. The homepage of Chemometrics URL is located at http://www.chemometrics.se/editorial/index.html.
Between 2008 and 2010, several review articles on applications of chemometrics and chemometric methods appeared in the literature. Workman (12) discussed recent developments in chemometrics, e.g., batch modeling and multivariate statistical process control, pharmaceutical chemometrics, pharmaceutical process analytical technology (PAT), and artificial intelligence (AI), as specifically applied to process analysis. Dorman (13) focused on applications of chemometrics to gas chromatography, and Will (14) focused on applications in photochemistry. Metabolic processes in complex biological systems can be studied using techniques that combine data rich analytical chemical measurements with chemometrics; such a process provides detailed chemical information with which to compare and contrast control subjects with test subjects. Challenges in applying chemometrics to LC/MS-based global metabolite profile data have been reviewed by Want (15). Trygg (16) and Lindon (17) have reviewed chemometric methods used for metabolomic disease diagnosis. All three reviews provided recommendations for improving data analysis for disease diagnosis. Metabolic fingerprinting by capillary zone electrophoresis has been the subject of a review by Garcia-Perez (18), and the application of advanced multivariate data analysis techniques in the field of mid-infrared spectroscopic biomedical diagnosis has been reviewed by Mizaikoff (19). Both reviews anticipate that pattern recognition and multivariate calibration techniques will soon be applied more widely to chemical fingerprinting as a screening or diagnostic tool in biomedical research and clinical studies. Integration of metabolomics and proteomics using a high throughput shotgun proteomic approach implemented through LC/MS requires pattern recognition methods based on dimensionality reduction for biomarker identification. This integrative approach to increase the information content of LC/MS data has been reviewed by Weckwerth (20). The use of proton NMR to investigate diabetes, which requires the use of statistical pattern recognition to reveal novel insights into the biochemical consequences of this disease, has been reviewed by Nicholson (21). Proteomic studies produce large quantities of data characterized by few samples and a large number of measurement variables. Data analysis strategies for the discovery of biomarkers in proteomic studies, which include the use of pattern recognition and multivariate curve resolution methods, have recently been reviewed by Smilde (22). Mining of data in chemometrics using both hypothesis driven methods and data driven models has recently been reviewed by Mutihac (23).
Chemometric techniques based on factorial design and response surface methodologies have many advantages over one-variable-at-a-time optimization for analytical applications, including a reduced number of experiments and the opportunity to assess interactions among variables. These techniques also enable the selection of optimum experimental conditions. Experimental design approaches to evaluate and optimize chemical analysis have been reviewed for electroanalytical chemistry (24), microextractions (25), and capillary zone electrophoresis (26). The advantages and disadvantages of using artificial neural networks to solve classification and calibration problems, as compared to more traditional chemometric methods, have been reviewed by Marini (27). Genetic algorithms (28) have been reviewed with emphasis on the opportunities that they afford to simultaneously develop accurate calibration models and to regulate model complexity and predictive ability within a considered validation framework.
A collection of representative papers of notable stature is summarized for this review. The development of more complex analytical instrumentation and the need to analyze larger data sets have demanded newer and better approaches for data analysis. Chemometric analysis of the results from analytical laboratory methods is providing more insight for understanding complex chemical and biological systems. Chemometrics applications covered in this review include metabolomics, imaging, improvements in analytical modeling methods, quantitative structure-activity relationships, and fingerprinting.
Chemometrics is a discipline concerned with the application of statistical and mathematical methods, as well as those methods based on formal mathematical logic, to chemistry. Publications concerned with development of new chemometric methods experienced only modest growth during this period. We attribute this to the fact that the number of researchers actively engaged in development and publication of novel chemometrics methods has remained constant. On the other hand, the number of researchers applying chemometrics continues to grow, and the number of publications concerned with applications of chemometric methods also grew substantially. The extraction of information from chemical data continues to drive research in the field of chemometrics. Development of new methods in chemometrics and novel or important applications of these methods occurred in three major areas: pattern recognition, resolution, and calibration, which are summarized below. Topics such as feature selection, data preprocessing, signal processing, library searching, parameter estimation, and optimization are also covered in this literature survey and are treated in the context of the three major application areas that are the focus of this review.
PATTERN RECOGNITION
The overall goal of pattern recognition is classification. Developing a classifier from spectral, chromatographic, or compositional data may be desirable for any number of purposes, including source identification, detection of the presence or absence of disease in the patient or animal from which the sample has been taken, and food quality testing, to name just a few. The classification step is often accomplished using one of several techniques that are now fairly well established, including principal component analysis, hierarchical clustering, k-nearest neighbor, statistical discriminant analysis, and regularized discriminant analysis. Few novel pattern recognition methods were reported in the literature during the past 2 years. Instead the chemical literature on pattern recognition focused on novel and not so novel applications.
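Of the established techniques listed above, k-nearest neighbor is the simplest to illustrate. The following is a minimal sketch using only the Python standard library; the function name `knn_predict`, the two-variable measurements, and the class labels "A" and "B" are hypothetical and are not taken from any of the cited studies.

```python
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, sample, k=3):
    """Classify `sample` by majority vote among its k nearest training points."""
    # Order the training points by Euclidean distance to the unknown sample.
    neighbors = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], sample))
    # Count the class labels of the k closest points and return the most common.
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-variable measurements for samples from two sources.
train_X = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
train_y = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(train_X, train_y, (1.1, 0.9)))
print(knn_predict(train_X, train_y, (2.9, 3.1)))
```

In practice the variables would usually be scaled before distances are computed, since Euclidean distance is dominated by the variables with the largest numerical range.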
However, classification of data remains an important subject in chemometrics, as evidenced by the large number of citations in the Chemical Abstracts database on pattern recognition applications during 2008-2009, which were rivaled only by multivariate curve resolution. Hence, most of the references in this section are organized according to the type of application. Nevertheless, some research groups published papers in this period that focused on improvements in the methods used for classification. Fundamental work on classification continues to be refined in a number of aspects, with special focus given to new methodology and data preprocessing.
Improving the methods used for pattern recognition continues to be an active area of research in chemometrics. Several groups offered new algorithms for visualization, clustering, and classification of multivariate chemical data. Ivosev and Burton (29) have introduced principal component variable grouping, an unsupervised method that assigns variables to sample clusters that can be visualized and more readily understood when exploiting knowledge of the source or the nature of the variables associated with the cluster. Using kernel methods and introducing differential penalties, Yamamoto (30) has developed a nonlinear extension of principal component analysis (PCA), partial least-squares (PLS), orthonormalized partial least-squares (OPLS), and regularized Fisher discriminant analysis (RFDA) to data where each observation varies with time. These methods were successfully applied to fermentation data. Hancock (31) reported that changes in peptide/protein abundance in samples indicative of potential disease biomarkers can be identified from LC/MS data using CLUE-TIPS, clustering using Euclidean distance in Tanimoto interpoint space. CLUE-TIPS can also be used to assess the quality of data from different LC/MS runs. Calvacanti (32) has proposed a new clustering algorithm called Adaptive Mean-Linkage with Penalty, which applies a penalty to the Euclidean distances that define the similarity among samples when clustering data. Brown (33) reported a new classification method for spectral data called wavelet orthogonal signal correction (WOSC), which combines a wavelet prism decomposition of a spectral response and orthogonal signal correction to improve classification and reduce model complexity. WOSC compared favorably to OPLS-discriminant analysis. Zhao (34) described a diagonal discriminant analysis method that employed both shrinkage and regularization of the variances. The proposed method performed better than support vector machines and k-nearest neighbor for both simulated and real data obtained from a microarray.
The combination of Adaboost and decision stumps was investigated as a potential classification method (35). With the use of a data set consisting of 9 features (trace elements) and 122 samples (urine), Adaboost performed better than Fisher discriminant analysis, and overtraining was not reported as a problem. The application of bagging in classification was also investigated. In one study (36), K-NN with a nested bootstrapped scheme was used to classify a data set consisting of chromatograms of different wines. In the other study (37), bagging was used to improve the performance of CART, an unstable and overfitting classifier. In both studies, bagging did improve the performance of the classifier, but the decrease in the error rate was not significant. A cotraining algorithm (38) with applications to data fusion has been developed to utilize data from diverse sources to predict a common class variable. An R package for performing the proposed approach is available.
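The combination of Adaboost and decision stumps mentioned above can be sketched in a few lines. This is a generic textbook form of discrete AdaBoost written with the standard library only; it is not the implementation used in ref 35, and the function names and the one-variable toy data are hypothetical. The toy problem is chosen so that no single stump can separate the classes, while three boosted stumps can.

```python
import math


def fit_stump(X, y, w):
    """Return (weighted error, feature, threshold, polarity) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for pol in (1, -1):
                pred = [pol if row[j] <= thr else -pol for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best


def stump_predict(stump, row):
    _, j, thr, pol = stump
    return pol if row[j] <= thr else -pol


def adaboost(X, y, rounds=3):
    """Train `rounds` weighted stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        # Clamp the error away from 0 and 1 so the logarithm stays finite.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical one-variable data: class -1 sits between two groups of class +1.
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y = [1, 1, -1, -1, 1, 1]

model = adaboost(X, y, rounds=3)
print([predict(model, row) for row in X])
```

Each stump is a one-variable threshold rule, so the weights `alpha` and the selected variables also give a rough indication of which measurements drive the classification.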
A robust classification algorithm (39) has been proposed for support vector machines that identifies outliers within the machine framework and appropriately deweights them during the analysis of the data. A major drawback of support vector machines is the requirement to optimize the regularization and kernel parameters to control the risk of overfitting. An approach (40) has been proposed to guide the choice of these parameters; it is based on a grid search that minimizes classification error but also utilizes visualization of the support vectors. Support vector machine discriminant models that are developed using the grid search provide good classification performance but are more parsimonious and easier to interpret.
Scaling and preprocessing of data are often crucial in the development of a classifier. However, the procedures that should be employed in a particular study are highly dependent upon the nature of the problem under investigation and the goals of the analysis. Jorgenson (41) investigated the effects of scaling and prefiltering by univariate nonparametric statistics on the selection of spots in 2D gel electrophoresis. A modified autoscaling of the entire data set using within group standard deviations was shown to be advantageous in revealing potential group dependent protein patterns. Tutz (42) reported that scaling of variables using pooled variances instead of overall variances improves prediction accuracy when using nearest neighbor methods for classification. Substantial variability between spectra of replicates is often a problem in diffuse reflectance spectroscopy. Fearn (43) has shown that this effect can be ameliorated by using PCA to model the variability of replicates and then subtracting the modeled effects through projection of the spectra onto the subspace orthogonal to the factors derived from PCA. The presence of missing data represents a real challenge if an objective statistical analysis of multivariate chemical data is desired. Nicolati (44) has shown that missing values in gel based proteomic data sets are better estimated using Bayesian principal component analysis than by the NIPALS algorithm.
A large number of publications have appeared in the chemical literature on the practical aspects and implications of preprocessing chromatographic and spectroscopic data to correct for undesirable time-shifts. For a successful pattern recognition study, it is essential that features encode the same information for all samples or objects in the data set. If variable 3 is the area of a gas chromatographic peak for acetaldehyde in sample 1, it must also be the area of the GC peak for acetaldehyde in the other samples that comprise the data set. Hence peak matching is crucial when chromatograms or spectra are translated into data vectors or data matrices. Daszykowski (45) represented individual data tables obtained from capillary electrophoresis-mass spectrometry or high-performance liquid chromatography with diode array detection as Gram matrices where the summation is taken over the time dimension. This eliminated the variation in time scales while the time information was preserved by the correlation structure between time channels. Nicholson (46) developed a recursive segment-wise peak alignment algorithm to reduce variability in all peak positions across multiple proton NMR spectra. The proposed algorithm is suitable for many other types of data sets such as chromatographic profiling data or mass spectral data. Binning is often used to peak match proton NMR spectra.
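The uniform binning referred to above can be illustrated with a minimal sketch. The 12-point spectra, the bin width of 6 points, and the function name `uniform_bin` are hypothetical choices made for illustration. The example shows that a small shift inside a bin leaves the binned values unchanged, whereas a peak that drifts across a bin boundary is split between two bins.

```python
def uniform_bin(intensities, bin_width):
    """Sum consecutive points of a spectrum into fixed-width bins."""
    return [sum(intensities[i:i + bin_width])
            for i in range(0, len(intensities), bin_width)]


# Hypothetical spectra containing the same peak (total area 6) at three positions.
reference      = [0, 0, 1, 4, 1, 0, 0, 0, 0, 0, 0, 0]
shifted_inside = [0, 1, 4, 1, 0, 0, 0, 0, 0, 0, 0, 0]  # moved by one point, same bin
shifted_across = [0, 0, 0, 0, 1, 4, 1, 0, 0, 0, 0, 0]  # straddles the bin boundary

for name, spectrum in [("reference", reference),
                       ("shifted inside bin", shifted_inside),
                       ("shifted across boundary", shifted_across)]:
    print(name, uniform_bin(spectrum, 6))
```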
Anderson (47) addressed the problem of peaks near boundaries in uniform binning by developing a Gaussian binning method that utilizes overlapping bins to minimize effects associated with shifts in peaks near boundaries. Janssen (48) developed a set of control standards to evaluate the quality of an alignment result for chromatography-based metabolomic data. It is anticipated that determining the optimum set of parameters for an alignment algorithm will become straightforward when these standards are used to assess the quality of a proposed alignment. Horvatovich (49) used a two-dimensional correlation function for LC/MS data to provide a reliable alignment scoring function that was insensitive to both spurious peaks and background noise.
Several papers have appeared in the literature on feature selection. Feature selection is crucial in the development of a classifier. Irrelevant features can introduce so much noise that a good classification of the data cannot be obtained. When these irrelevant features are removed, a clear and well-separated class structure can be found. For averaging techniques such as partial least-squares and linear discriminant analysis, feature selection is vital because signal is averaged with noise over a large number of variables, with a loss of discernible signal amplitude when noisy features are not removed from the data. Feature selection improves the reliability of a classifier because noisy variables increase the chances of false classification and decrease classification success rates on new data. It is important to identify and delete features from the data set that contain information about experimental artifacts or other systematic variations in the data not related to legitimate chemical differences between the classes represented in the study. Feature selection can also lead to an understanding of the essential features that play an important role in governing the behavior of the system under investigation. It can identify those measurements that are informative and those that are not.
Hamprecht (50) used feature selection to improve the performance of regularized classification methods. When recursive feature elimination, as implemented by random forest classification, was used together with regularized classification methods on spectral data sets from chemotaxonomy, biomedical analysis, and food science, discriminant partial least-squares regression performed better than a filtering of the variables based on several univariate tests. A method (51) that couples data dimensionality reduction and variable selection, by compressing windows of the original data and retaining only the scores of the significant principal components of local models developed within these data windows, yielded classification models that performed better than discriminants using features developed from the entire chromatographic profile. Toulhoat (52) proposed an algorithm for feature selection in NMR based on the landscape of the covariance/correlation ratio of consecutive variables along the chemical shift axis to restore the spectral dependency and recouple variables in clusters. Variables are divided into clusters, with each cluster considered as an individual object for feature selection. With the use of partial least-squares discriminant analysis to obtain class models and the bootstrap (20 000 iterations) for both model optimization and random splits into training and test sets, features can be selected by ranking the variables based on their PLS regression coefficients (53). Dougherty (54) has evaluated several well established feature selection methods from two fundamental perspectives: classification accuracy and the optimum number of features that should be used.
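As a simple illustration of the univariate filtering mentioned above, the sketch below ranks variables by a two-class Fisher ratio (squared difference of the class means divided by the sum of the class variances). The choice of this statistic is an assumption made for illustration; the cited studies do not specify it here, and the data, class labels, and function names are hypothetical.

```python
from statistics import mean, pvariance


def fisher_ratio(values, labels):
    """Squared difference of the two class means over the summed class variances."""
    first, second = sorted(set(labels))
    a = [v for v, lab in zip(values, labels) if lab == first]
    b = [v for v, lab in zip(values, labels) if lab == second]
    within = pvariance(a) + pvariance(b)
    return (mean(a) - mean(b)) ** 2 / within if within > 0 else float("inf")


def rank_features(X, labels):
    """Return variable indices ordered from most to least discriminating."""
    columns = list(zip(*X))  # one tuple of values per variable
    scores = [fisher_ratio(col, labels) for col in columns]
    return sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)


# Hypothetical data: variable 0 is noise, variable 1 separates the classes
# strongly, and variable 2 separates them weakly.
X = [
    [5.1, 1.0, 2.0],
    [4.9, 1.2, 2.6],
    [5.3, 0.9, 2.2],
    [5.0, 3.1, 2.9],
    [5.2, 2.9, 2.5],
    [4.8, 3.0, 3.1],
]
labels = ["control"] * 3 + ["disease"] * 3

print(rank_features(X, labels))
```

A filter of this kind evaluates each variable in isolation, so it cannot detect variables that are informative only in combination, which is one reason multivariate selection schemes are studied alongside it.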
Applications of pattern recognition methods dominated the literature. Proton NMR and pattern recognition analysis were used for metabolic profiling of human blood serum from a control group (n = 25) and patients with bipolar disorder (n = 25). The study (55) was conducted to identify molecular changes due to the disorder and to different drug treatments. N-Glycan oligosaccharides of human serum samples isolated from three groups (healthy individuals, patients with lymphoma, and patients with ovarian cancer) were analyzed by MALDI-TOF and linear discriminant analysis (56). There was good separation between the three groups, and cross validation indicated good predictive power for the classifier. Principal component analysis and an advanced computer based bucketing approach have been applied to LC/MS data to identify novel putative compounds from secondary metabolites produced by bacteria (57). Metabolic profiling studies generate complex data sets that are difficult to summarize and visualize. To address this problem, Trygg and co-workers (58) proposed the S-plot, which visualizes both the covariance and the correlation between the metabolites and the class designations to help identify statistically significant and potentially biochemically significant metabolites. Improved visualization and discrimination of interesting metabolites could be demonstrated by using the S-plot with OPLS. The application of FCV clustering and PLS to LC/MS metabolomic data (59) revealed phenotype changes and individual characters of three genes of Escherichia coli. The output of the membership matrix from the FCV clustering algorithm for a specified number of clusters is used for the Y-matrix in the PLS model. Large multivariate data sets of complex protein mixture samples generated by TOF-SIMS, which could provide insights into cellular processes and disease diagnosis, were analyzed using five multivariate analysis methods, including PCA, LDA, OPLS-DA, and decision trees. LDA and OPLS-DA performed the best, whereas the decision tree was the least successful for the tested samples (60). Bayesian wavelet based functional mixed models (61) were used to analyze MALDI-TOF mass spectra. With the mass spectra modeled as functions, the problem of peak matching is obviated, and the systematic block and batch effects that characterize these data can be taken into account. Principal component and linear discriminant analysis of attenuated total reflection FT-IR spectra were able to differentiate among oligonucleotides 15 bases in length that differed by a single base pair (62). These results suggest that the combination of mid-IR spectroscopy and pattern recognition analysis can be used to differentiate polymorphic forms of a DNA template.
There were several publications that focused on the application of pattern recognition techniques to forensics. The combination of near IR chemical imaging and pattern recognition analysis was used to characterize 55 counterfeit Heptodin tablets and 11 authentic Heptodin tablets (63). Counterfeit tablets could easily be distinguished from the authentic ones. Furthermore, k-means clustering and PCA clustered the counterfeit tablets into 13 groups, indicative of their origin. Both near IR and Raman spectroscopy were used to develop a potential method to discriminate between genuine and counterfeit Lipitor (64). Classification by PLS discriminant analysis models was successful for both the near IR and Raman data. The PLS models were sufficiently robust to allow for reliable discrimination despite the strong spectroscopic activity of the excipients.
With the use of proton NMR and SIMCA pattern recognition, an efficient method was developed to detect malicious and accidental contamination of carbonated soft drinks (65). The advantages of using feature selection to extract contaminant NMR frequencies from the data were also investigated. Detection limits for prospective impurities were estimated. The combination of an array of micelle-solubilized fluorophores and statistical discriminant analysis was used to detect explosives, e.g., TNT, tetryl, RDX, and HMX (66). The quenching patterns generated from the array were used in linear discriminant analysis. Heroin and cocaine gas chromatographic data were analyzed using clustering techniques to assess the potential of these chemical signatures in the investigative process (67). Each clustering algorithm showed different behavior for the two data sets investigated. Gas chromatography and pattern recognition analysis were used to identify fuel samples obtained from the wreckage of the Prestige tanker (68). PCA, cluster analysis, and Kohonen neural networks were used to characterize and differentiate the Prestige fuel oil from other illegal discharges. Human gender classification using Raman spectroscopy of fingernail clippings has been demonstrated in forensic applications (69). PCA and support vector machines were implemented to perform the data classification. Pattern recognition analysis of inductively coupled argon plasma emission spectroscopy was used to develop a potential method to differentiate between healthy and hepatitis C patients (70). PCA was used to visualize the separation between the groups
and show relationships between the six trace elements and the diseased state. PLS was used to develop a classification model from the data.

A few papers focused on the application of pattern recognition techniques in the medical sciences. Goodacre (71) investigated the discrimination of three strains of Bacillus bacteria using pyrolysis gas liquid chromatography-differential mobility spectrometry and PLS discriminant analysis. The data were generated in both positive and negative modes, and the data in each mode were analyzed independently. Discriminant PLS was able to differentiate all three strains whereas PCA could only differentiate two of the three strains. Menezes (72) compared PCA, PLS discriminant analysis, and SIMCA for bacteria discrimination based on FT-IR spectra. Bias and uncertainty for each discrimination model were determined using jackknifing and bootstrapping. Bacteria in either their sporulated or vegetative physiological states have previously been discriminated using FT-IR spectroscopy. However, discrimination between bacteria in the same physiological state requires pattern recognition methods. Johnson (73) demonstrated a successful identification to the species level for Bacillus spores using PCA and a classification method based upon similarity measurements. NIR spectroscopy has been investigated as a potential method to identify bacteria at the genus and species level (74). Unmodeled spectra were transformed to the first and second derivative spectra, and PCA, PLS2 discriminant analysis, and SIMCA were used in the analysis of the data. PLS2 was able to classify the bacterial spectra at the species level whereas SIMCA produced high classification success rates for the bacteria at the genus level. An electronic nose was used by Thaler (75) to differentiate between biofilm and nonbiofilm producing bacteria.
Logistic regression was used to develop a classification model from the volatiles emitted by Staphylococcus and Pseudomonas bacterial species.

Pattern recognition methods have become an integral part of image analysis. FT-IR spectral microimaging and pattern recognition methods were used to develop a potential method to diagnose cutaneous carcinomas (76). Classification models were developed using linear discriminant analysis with wavenumbers identified by statistical tests or genetic algorithms or scores from PCA used as features for the model. The feasibility of using magnetic resonance imaging and pattern recognition analysis to determine renovascular diseases was investigated by Michoux (77). Segmentation using PCA to separate out the cortex from the rest of the kidney was undertaken. SIMCA and PLS discriminant analysis were tested for various types of data pretreatment. PLS classifiers outperformed SIMCA pattern recognition. OPLS discriminant analysis was used to differentiate between two different cell types in mouse liver samples, hepatocytes and erythrocytes, using high spatial resolution FT-IR microspectroscopy (78). The principal advantage of OPLS in this application is its ability to isolate predictive variation between cell types from variation uncorrelated to cell type in the classification model, which facilitates understanding of the different sources of variation in the FT-IR data. The use of NIR hyperspectral imaging and pattern recognition methods for distinguishing between hard, intermediate, and soft maize kernels from inbred lines was evaluated by Geladi and co-workers (79). PCA was used to remove background, bad pixels, and shading from absorbance images. PLS discriminant analysis was used to
successfully develop classification models from the “cleaned” images. Micro-Raman spectral images of the adhesive/dentin interface were studied using PCA and fuzzy clustering, which were able to present the entire hyperspectral information from each specimen in a concise way (80). The pattern recognition methods used were also able to highlight minute and often important variations in the spectra that were previously difficult to observe using univariate methods.

Finally, some novel, unusual, or interesting applications of pattern recognition were reported in the recent literature. Rajan (81) used principal component analysis to assess the statistical interdependency of electronic descriptors and crystal structure parameters of spinel nitrides. Furthermore, classical versions of structure maps from the early work of Hill based on heuristic observations for this class of crystals can be reproduced using PCA. This multivariate approach also provides an alternative method to visualize the structure-activity relationships of the spinel nitrides showing data clustering associated with site occupancy. Circular dichroism (CD) and pattern recognition methods were used to differentiate among various DNA structures, e.g., random coil, duplex, hairpin, reversed and normal triplex, and parallel and antiparallel G-quadruplex (82). Hierarchical clustering, PCA, and PLS discriminant analysis allowed for efficient classification of the CD data. PCA, discriminant function analysis, and cluster analysis were applied to data generated by an electronic nose used to characterize the volatile organics emitted by cucumber, pepper, and tomato leaves (83). Furthermore, pattern recognition analysis of the data was able to identify plants infested with mites, mildew, or hornworm.
This study suggests that an electronic nose and pattern recognition analysis can serve as a real time pest and disease monitoring system for plants since VOC signatures characteristic of the stress experienced by the plant could be identified from the data. Pattern recognition methods have been used to construct classifiers that can specify the date of an unknown sherd using elemental data obtained from neutron activation analysis (84). Discriminant functions developed from elemental analysis data classified unknown sherds recovered in an excavation in Beijing as Ming Dynasty. This prediction is consistent with the fact that these sherds are similar to other Longquan Ming celadon artifacts that were previously recovered. Pattern recognition methods have been used to assist in library searching of mass spectra (85). A classification model that exploits the neutral loss to identify differences between mass spectra of alcohols or ethers was developed by selecting the appropriate features in the mass spectra. The proposed classifier performed better than one developed directly from the mass spectral data using either linear discriminant analysis or partial least-squares with features directly selected from the spectra for each of these classifiers using a genetic algorithm.

MULTIVARIATE CALIBRATION

Calibration involves relating, correlating, or modeling a measured response based on the amounts, concentrations, or other physical or chemical properties of a set of analytes. Multivariate calibration refers to the process of relating the analyte concentration or the measured value of the physical or chemical property to a measured response, e.g., near IR spectra of multicomponent mixtures. There were a large number of citations found in the Chemical Abstract Database during this review period. PLS has
come to dominate the practice of multivariate calibration because of the quality of the calibration models produced and the ease of their implementation due to the availability of PLS software. Latent variables in PLS are developed simultaneously along with the calibration model so that each latent variable is a linear combination of the original measurement variables rotated to ensure maximum correlation with the information provided by the property variable. Nine PLS-1 algorithms were compared in terms of their numerical stability and speed (86). These included NIPALS by Wold, the nonorthogonalized scores algorithm by Martens, improved kernel PLS by Dayal, and direct scores PLS-1, which is based on a new recurrent formula for the calculation of basis vectors yielding scores directly from X and y. Improved kernel PLS and direct scores PLS, which are numerically stable, are also much faster than NIPALS and the nonorthogonalized scores algorithm by Martens. The improved interpretation provided by OPLS, a new multivariate analysis method, was demonstrated by Trygg (87) using four NIR data sets. A robust PLS algorithm that is insensitive to outliers and overcomes the deficiencies of previous PLS algorithms is discussed (88). A new nonlinear PLS algorithm (89), quadratic fuzzy PLS, that uses the Takagi-Sugeno-Kang (TSK) fuzzy inference system to model the outer relationship and the quadratic TSK fuzzy inference system to model the inner relationship was developed during this period. Quadratic fuzzy PLS was compared to linear PLS, quadratic PLS, linear fuzzy PLS, and neural network PLS using randomly generated test data. Quadratic fuzzy PLS outperformed the other PLS algorithms. A new PLS algorithm has been developed (90) that shrinks the regression coefficients of the calibration model, not only toward the concentrations of the training set samples but also toward the spectra that comprise the validation set.
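The way PLS builds its latent variables alongside the calibration model can be illustrated with a minimal PLS-1 sketch in plain Python that uses NIPALS-style deflation. It is not a reproduction of any of the algorithms compared in (86); the function names `pls1_fit` and `pls1_predict` and the dictionary layout are assumptions made purely for illustration.

```python
def pls1_fit(X, y, n_components):
    """Fit a PLS-1 model by sequential deflation (illustrative sketch only)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                 # centered y
    weights, loadings, coefs = [], [], []
    for _ in range(n_components):
        # Weight vector: the direction in X with maximum covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = sum(v * v for v in w) ** 0.5
        if norm < 1e-12:      # nothing left in y for X to explain
            break
        w = [v / norm for v in w]
        # Scores: projection of each sample onto the weight vector.
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        if tt < 1e-12:
            break
        p_load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y so the next latent variable models what is left.
        for i in range(n):
            for j in range(p):
                E[i][j] -= t[i] * p_load[j]
            f[i] -= q * t[i]
        weights.append(w)
        loadings.append(p_load)
        coefs.append(q)
    return {"x_mean": x_mean, "y_mean": y_mean,
            "weights": weights, "loadings": loadings, "coefs": coefs}


def pls1_predict(model, x):
    """Predict y for one sample by repeating the deflation steps on it."""
    e = [x[j] - model["x_mean"][j] for j in range(len(x))]
    y_hat = model["y_mean"]
    for w, p_load, q in zip(model["weights"], model["loadings"], model["coefs"]):
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += q * t
        e = [ej - t * pj for ej, pj in zip(e, p_load)]
    return y_hat
```

In practice the number of components would be chosen by validation rather than fixed in advance, and a tested library implementation would normally be preferred over a hand-written loop such as this one.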
PLS has been combined with canonical correlation analysis to develop a new dimensionality reduction method for estimating latent variables in multivariate classification and regression problems, where there is more than one response variable (91). Only simple modifications of existing PLS algorithms are necessary to develop the proposed method. A new PLS2 algorithm has been developed for large data sets (92). A PLS-2 model is computed between the score matrices obtained by PCA on the original X and Y block. After running PLS-2 on the scores, there is a back transformation to the original measurement space, leading to results that are identical to those obtained with the original data.

A number of papers exploring calibration methods other than PLS appeared. Small et al. (93) developed multivariate calibration models using Gaussian basis functions to extract information from single beam spectra which can be implemented in the spectrometer hardware. The proposed calibration methodology was demonstrated through the development of near-IR regression models for the determination of the physiological levels of glucose in two synthetic biological matrixes. The models were validated using data collected outside the time frame of the calibration data used to develop the models. The calibration models developed using the Gaussian basis functions performed better than PLS models, particularly with respect to calibration stability over time. Relevance vector machines as a nonlinear multivariate calibration method capable of tackling ill-posed regression problems were investigated by Hernandez and co-workers (94). Although relevance vector machines did not outperform linear least-squares support vector machines for the three data sets investigated, the relevance vector machine models were sparser. Bayesian linear regression models, optimized using Bayesian evidence approximation to establish the model hyperparameters and with variable selection employed within the Bayesian framework to improve model performance, were evaluated using two spectroscopic data sets (95). The Bayesian linear regression models outperformed PLS. An iterative principal component regression model previously developed to establish the error variances in the wavelength or mixture domain has been extended to address measurement error in both domains (96). With the use of simulated and experimental data, the quality of calibration models developed using this iterative approach to principal component regression was shown to be superior to calibration models obtained through principal component regression.

Determining the correct number of latent components for a soft calibration model continues to be a challenge. Although selection of training and prediction sets using experimental design techniques is the best approach for dealing with this problem, many laboratories cannot afford to take such an approach since samples are often obtained over time in an undesigned manner. Kritchman (97) proposed a method to determine the number of components from a limited number of high-dimensional noisy samples based on the eigenvalues of the sample covariance matrix. The method combines a matrix perturbation approach to describe the interaction of signal and noise eigenvalues with results from random matrix theory regarding the behavior of noise eigenvalues. Repeated double cross validation was investigated as a method for optimizing the complexity of regression models (98).
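The loop structure of repeated double cross validation can be sketched as follows. This is a generic illustration, not the procedure of (98): a k-nearest-neighbour regressor on one-dimensional data stands in for PLS only to keep the example short, with k playing the role of the number of components, and all function names and default segment counts are assumptions.

```python
import random


def knn_predict(train, x, k):
    """Average the responses of the k training points closest to x."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(yv for _, yv in nearest) / len(nearest)


def mse(train, test, k):
    """Mean squared prediction error on `test` for a model built on `train`."""
    return sum((knn_predict(train, x, k) - yv) ** 2 for x, yv in test) / len(test)


def segments(items, n_seg):
    """Split a list into n_seg interleaved segments."""
    return [items[i::n_seg] for i in range(n_seg)]


def repeated_double_cv(data, complexities, n_outer=4, n_inner=3, n_rep=5, seed=0):
    """Return the complexity chosen and the test error for every outer segment."""
    rng = random.Random(seed)
    chosen, errors = [], []
    for _ in range(n_rep):                      # repetitions with new random splits
        shuffled = data[:]
        rng.shuffle(shuffled)
        outer = segments(shuffled, n_outer)
        for i, test in enumerate(outer):        # outer loop: estimates prediction error
            calib = [p for j, seg in enumerate(outer) if j != i for p in seg]
            inner = segments(calib, n_inner)

            def inner_error(k):                 # inner loop: selects model complexity
                total = 0.0
                for m, val in enumerate(inner):
                    train = [p for j, seg in enumerate(inner) if j != m for p in seg]
                    total += mse(train, val, k)
                return total / n_inner

            best = min(complexities, key=inner_error)
            chosen.append(best)
            errors.append(mse(calib, test, best))  # test segment never used for tuning
    return chosen, errors
```

The distribution of the chosen complexities across repetitions gives an indication of how stable the selection is, while the outer-loop errors estimate prediction performance on samples that played no part in the tuning.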
The effect of the number of segments in repeated double cross validation and the number of repetitions was investigated for estimating the optimum number of PLS components in a calibration model. Models using all of the original measurement variables and models using a small subset of the measurement variables selected by a genetic algorithm were compared using two NIR data sets. Wold (99) determined the number of significant PLS and OPLS components for calibration models of spectroscopic data using an analysis of variance (ANOVA) of the cross validated residuals. The CV-ANOVA diagnostics work well in cases where PLS and OPLS work well.

Feature selection has become an important topic in multivariate calibration. A backward step wrapper method has been proposed for PLS (100). The RMSEP for an independent test set was used as the selection criterion to quantify the gain obtained using a selected set of variables. The proposed method was evaluated using several data sets, and in many instances one could improve the prediction performance of the PLS models as compared to using full spectrum models. A new procedure to enhance the prediction of PLS calibration models was investigated using several NIR data sets (101). The variables are sorted using informative vectors, and the PLS regression model is then investigated systematically by comparing the cross validated parameters of the model to identify the most relevant features. The informative vectors used were the regression vector, correlation vector, residual vector, variable influence on projection vector, net analyte signal, covariance procedure vector, signal-to-noise ratio vector, and their combinations. A genetic algorithm for selection
of an optimum combination of wavelengths for multicomponent spectral data using competitive adaptive reweighted sampling, which compared the absolute values of the regression coefficients of PLS models developed during each generation as an index for evaluating the importance of each wavelength, was investigated using a simulated and real NIR data set (102). An advantage of the proposed method is the interpretability of the PLS models developed by this feature selection procedure. A new cutoff criterion has been proposed for uninformative variable elimination (UVE) which utilizes Student's t-distribution (103). The procedure was used to develop PLS models for the determination of heroin in illicit street drugs.

Achievement of a satisfactory multivariate calibration model is often not the final step in many practical applications. Once it is developed, it is often necessary to transfer the calibration model to other instruments or update the calibration model to ensure that the calibration can be used at the point of measurement. One way to achieve transfer of a calibration is to standardize either the instrumentation used or the calibration itself. Two multiplicative signal correction strategies, window multiplicative scatter correction (W-MSC) and moving window multiplicative scatter correction (MW-MSC), for calibration transfer were investigated by Rose-Pehrsson (104). Data from one instrument were standardized to match data from a second instrument. For W-MSC, user defined windows were selected whereas for MW-MSC the window size was optimized using a two-step procedure based on sample leverage. A statistical significance test was developed to select the appropriate window size. This approach has the advantage of not requiring standards for the calibration transfer. A method based on canonical correlation analysis was used to solve the calibration transfer problem (105).
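The idea behind the windowed scatter correction mentioned above can be sketched in a few lines: within each user-defined window, the spectrum is regressed on a reference spectrum and the fitted offset and slope are removed. This is a minimal illustration of multiplicative scatter correction applied window by window, not the published W-MSC procedure of (104); the function names and the half-open `(start, stop)` window convention are assumptions.

```python
def msc_segment(spectrum, reference):
    """Fit spectrum ~ a + b * reference by least squares, return (spectrum - a) / b."""
    n = len(spectrum)
    mean_r = sum(reference) / n
    mean_s = sum(spectrum) / n
    s_rr = sum((r - mean_r) ** 2 for r in reference)
    s_rs = sum((r - mean_r) * (s - mean_s) for r, s in zip(reference, spectrum))
    if s_rr == 0.0 or s_rs == 0.0:
        raise ValueError("window is flat or uncorrelated with the reference")
    b = s_rs / s_rr              # multiplicative (slope) term
    a = mean_s - b * mean_r      # additive (offset) term
    return [(s - a) / b for s in spectrum]


def windowed_msc(spectrum, reference, windows):
    """Apply the correction separately inside each (start, stop) window."""
    corrected = list(spectrum)
    for start, stop in windows:
        corrected[start:stop] = msc_segment(spectrum[start:stop],
                                            reference[start:stop])
    return corrected
```

Here `reference` would typically be a mean spectrum from the instrument being matched, so that spectra from the second instrument are brought onto its scale one window at a time.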
When compared to piecewise direct standardization using two NIR data sets, the transfer obtained with the canonical correlation method was better than the transfer obtained by piecewise direct standardization.

Instrumental, process, and operational drift can render a multivariate calibration model unsuitable for prediction of key chemical components. Drift correction methods are either implicit or explicit. Implicit correction methods (ICM) incorporate drift in the training set data whereas explicit correction methods (ECM) model the drift using online reference measurements and make the space of the calibration model orthogonal or invariant to the space spanned by the drift. Monte Carlo simulation studies undertaken as part of a larger investigation of drift in process calibrations show that ECM performs better than ICM (106). Calibration transfer has also been a problem for calibrations involving high-resolution NMR data. Recently, direct standardization, piecewise direct standardization, and double-window piecewise direct standardization techniques were explored, and PLS calibration models were developed from proton NMR data for glucose, glycine, and citrate metabolite concentrations in model mixtures (107). Results from these studies suggest that direct and piecewise standardization techniques provide similar improvement in error prediction and provide significant improvement over standard preprocessing techniques such as reference deconvolution and spectral binning.

Applications of PLS regression dominated the literature. Use of PLS has become commonplace in analytical chemistry, and applications are appearing in very distant fields. A major part of the increase in the use of PLS can be attributed to improving
commercial software for chemometrics, but better education of chemists in the use and application of multivariate calibration also appears to have a role. Vibrational spectroscopy has long been an area where chemometric methods are embraced, and it is no surprise that many of the applications appeared in analyses using near-IR spectroscopy.

Analysis of glucose in various aqueous media has been the subject of attention from many groups. The low levels of glucose found in physiological samples and the high background present a special challenge in both spectroscopy and calibration. Small has developed a spectral simulation protocol to extend the lifetime of near-infrared multivariate PLS calibrations for glucose in the presence of alanine, ascorbate, lactate, urea, and triacetin (108). A calibration model developed from 25 synthetic calibration standards measured 23 times over a period of 325 days performed better than a PLS model developed from 64 calibration samples measured on the same day. The key to the success of this approach is the use of a set of spectra of a phosphate buffer collected on each prediction day to construct synthetic calibration spectra that are specific to that day. FT-NIR transmission spectra have been used to quantify glucose for prediction data sampled as much as 6 months outside the time frame of the corresponding calibration data (109). Spectra of the matrix acquired during the instrumental warm up period on the prediction day are used to update the calibration model and determine the optimum frequency for an infinite impulse response digital filter. By tuning the filter and the calibration model to the specific instrumental response associated with the prediction day, the performance of the calibration model is enhanced. The selectivity of PLS calibration models for glucose, glucose-6-phosphate, and pyruvate has been investigated using ternary mixtures of these compounds.
Selectivity, which is demonstrated for each analyte, originates from the unique spectral features associated with each compound (110). Saccharin and aspartame in commercial noncaloric sweeteners were quantified from UV-visible absorbance data using a PLS-2 model (111). A full factorial design for the standards comprising the training set was used, and an internal standard was added to the real samples to facilitate their adjustment to the PLS-2 model.

As noted above, there were too many applications of multivariate calibration to cite here. However, representative publications are summarized in the following paragraphs. These were selected on the basis of unusual calibration or unusual measurement systems. These articles did not fit in any of the categories or divisions within this section but are noteworthy for their methodology. The surface area of silica gel particles was quantified using NIR and PLS (112). The area of the silica gel particles in the slurries that were used as standards was determined by the BET method. Error estimates in the surface area predicted for the unknowns lie in the same range as the error in the BET surface area determination. The feasibility of quantifying enantiomeric excess in mixtures of two pharmacologically important active principles was investigated using PLS models of IR data (113). Because the spectra were taken with KBr, different background correction and data pretreatment methods were used. Internal and external validation of the PLS models demonstrated their accuracy. IR spectroscopy in combination with PLS was used to simultaneously determine sulfamethoxazole and trimethoprim in raw material powder mixtures used for manufacturing commercial pharmaceutical products (114). Interval PLS and synergy PLS were applied to select a spectral range that provided the lowest prediction error. The proposed procedure compared favorably to HPLC, which is the primary reference method. PLS-1 was compared to classical linear least-squares regression and a single wavenumber calibration model to generate accurate chemical images of the active pharmaceutical ingredient and its excipients in a tablet (115). The accuracy of the generated chemical images was evaluated by the ability to predict the concentration of the constituents comprising the tablet, with the most accurate predictions generated using PLS-1. The quantitative performance of ion mobility mass spectrometry for explosive analyses was enhanced using PLS (116). The superior performance of PLS-1 over traditional peak area calibrations can be attributed to signal averaging and the maximization of the correlation between the entire span of the ion mobility mass spectra and the known TNT and RDX concentrations. Three fluoroquinolone antibiotics in urine were studied using synchronous fluorescence spectroscopy measured in a flow injection system with a double pH gradient. This arrangement achieves the second order advantage, which was necessary because of the fluorescent background of the urine (117). Several multivariate calibration algorithms, including parallel factor analysis and unfolded and multiway partial least-squares with residual bilinearization, were used to analyze the data. PLS in a specific unfolded version with residual bilinearization achieved the best results.

Calibration to properties also continues to be an active area of investigation. The breadth and ingenuity of the applications appearing during the last 2 years were impressive (118). PLS-1, support vector machines, and multiple linear regression analysis were used to select descriptors to predict the pKa of organic compounds.
The support vector machine model was better than the other QSPR models developed from the data. Mid-infrared spectroscopy and partial least-squares analysis were used to predict sorption coefficients of pesticides in soil (119). The PLS model predicted sorption coefficients more accurately than models developed from soil organic carbon content. The lower flammability limits of organic compounds were predicted using support vector machines and molecular descriptors that contained information about the size, shape, and electronic properties of the molecules (120). The molecular descriptors were selected using a genetic algorithm for feature selection. The resulting model showed high prediction ability. A total of 170 chemical shift values for 10 benzaldehydes were predicted using partial least-squares and molecular descriptors optimized at the semiempirical PM3 level (121). Various electronic and steric descriptors accounting for properties of the benzene ring and aldehyde group were computed. The QSPR models were comparable to the literature empirical model and DFT calculations. A PLS model to predict the surface energies of polymers within and outside of the training set using TOF-SIMS data was developed (122). Surface energy predictions for polymers that were synthesized from monomers not used in the training set were poor. When the polymers were synthesized from monomers present in the training set, the surface energy predictions were reasonably accurate, confirming the importance of the design of the training set.
MULTIVARIATE CURVE RESOLUTION

This section is concerned with methods for the resolution and recovery of pure-component spectra from the overlapped spectra of mixtures. A review on the use of factor analysis and principal component analysis for resolution in Raman spectroscopic imaging was published by Ozaki (123), and a review on the use of preprocessing, multivariate regression, and image segmentation in image analysis for component resolution was published by de Juan (124). Rius (125) reviewed the application of multivariate curve resolution to spectroscopic data acquired by monitoring chemical reactions and other processes. The application of alternating least-squares to environmental monitoring data tables for identification of the constituents responsible for the contamination patterns affecting a particular geographic region during a period of time was reported by Tauler (126). Grung (127) reported the application of rank annihilation to reduce scan effects (i.e., significant differences between spectra on the left and right side of the chromatographic peak), thereby ensuring bilinearity in the data matrix, which is a necessary condition for the successful application of multivariate curve resolution methods.

A large number of publications have appeared in the chemical literature on the practical aspects and potential limitations of multivariate curve resolution techniques. Only the most significant of these publications will be summarized here. Malinowski (128) reported on determining the number of pure chemical components in a mixture using maximum likelihood to estimate the number of significant principal components in the data. He observed that computations previously attributed to Malinowski’s F-test are incorrect because they were based on determining the equality of the secondary eigenvalues, not the equality of the reduced eigenvalues.
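The distinction between raw and reduced eigenvalues can be made concrete with a short sketch. It assumes the definition of the reduced eigenvalue as it is usually stated for an r × c data matrix, namely the j-th eigenvalue divided by (r − j + 1)(c − j + 1), and the corresponding F ratio against the pooled smaller eigenvalues; it is not taken from (128), and the function names are hypothetical.

```python
def reduced_eigenvalues(eigenvalues, n_rows, n_cols):
    """Divide the j-th eigenvalue (1-based j) by (r - j + 1) * (c - j + 1)."""
    # enumerate() is zero-based, so (n_rows - j) equals (r - j + 1) for 1-based j.
    return [lam / ((n_rows - j) * (n_cols - j))
            for j, lam in enumerate(eigenvalues)]


def f_ratio(eigenvalues, n_rows, n_cols, n):
    """Ratio of the n-th reduced eigenvalue to the pooled reduced eigenvalue
    of the smaller factors n+1..s, with eigenvalues sorted in descending order."""
    rev = reduced_eigenvalues(eigenvalues, n_rows, n_cols)
    tail = range(n, len(eigenvalues))          # zero-based indices of n+1..s
    pooled = (sum(eigenvalues[j] for j in tail)
              / sum((n_rows - j) * (n_cols - j) for j in tail))
    return rev[n - 1] / pooled
```

A large ratio suggests that the n-th factor carries more than noise, whereas reduced eigenvalues belonging to pure noise are expected to be statistically equal, which is the property the test relies on.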
Using Borgen plots, Rajko (129) has reported that concentration and spectral profiles computed by multivariate curve resolution-alternating least-squares for mixtures of three components may lie outside of the range of the data matrix. He recommends adoption of the band solution if only the minimal constraint of nonnegativity can be applied. The application of additional constraints to modeling and the existence of regions in the data corresponding to only pure components ensure that multivariate curve resolution algorithms will work properly. Tauler (130) investigated the problems of rotational and scale ambiguities in multivariate curve resolution. His studies were limited to two-component systems. He reported that multivariate curve resolution yields boundaries for the band of feasible solutions that were in agreement with the results obtained from a systematic grid search of all feasible solutions which was undertaken to reveal these boundaries. The effect of rotational and scale ambiguities on the boundaries of feasible solutions in multivariate curve resolution has also been investigated by Rajko (131).

Several groups have proposed new algorithms for multivariate curve resolution. Hong (132) described an algorithm called warped factor analysis which is a nonlinear generalization of two-mode principal component analysis and three-mode PARAFAC models. Sparse component analysis (133), a method to extract pure component spectra from mixture spectra wherein the number of mixtures is less than the number of pure components, is proposed and demonstrated using mass spectral data. A variant of the band target entropy minimization (BTEM) algorithm called AutoBTEM has been developed to achieve self-modeling curve
resolution (134, 135). AutoBTEM combines a novel automatic band targeting numerical strategy with exhaustive BTEM curve resolution and hierarchical clustering analysis in a blind search approach. AutoBTEM has been successfully demonstrated using FT-IR, Raman, and infrared imaging data. Local rank based spatial information can be used to resolve spectroscopic images. Hancewicz (136) reported that local rank information when combined with reference spectral information in the form of constraints increases component mapping integrity and decreases ambiguity for the reconstructed images. Berman (137) discusses a new algorithm called iterated constrained end members, which combines alternating least-squares with a so-called unmixing algorithm used in remote sensing to resolve hyperspectral images. The algorithm is based on a convex geometric model, and the estimation is performed in a subspace of the data. Van Benthem (138) describes a faster algorithm for performing trilinear decomposition. The algorithm, which is a modification of the PARAFAC-ALS algorithm, decomposes the data matrix into a core matrix and three loading matrices based on the Tucker1 model. Gemperline (139) discusses an extension to the penalty least-squares method called multiway penalty alternating least-squares which applies either hard constraints or soft constraints through the application of a row-wise penalty least-squares function. The advantages of this approach are demonstrated using multibatched NIR data from a base-catalyzed esterification reaction to resolve the concentration and spectral profiles of both reactants and products.

A large number of papers have appeared in the last 2 years on the application of multivariate curve resolution methods. In almost all of these applications, the authors employed well established methods.
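The alternating least-squares scheme that underlies many of these methods can be sketched for the simplest case of two components with a nonnegativity constraint. The example below is an illustration only: negative values are simply clipped to zero after each unconstrained least-squares step, which is a crude stand-in for a proper nonnegative least-squares solver, and the function names and the restriction to two components are assumptions.

```python
def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def inv2(M):
    """Inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("profiles are collinear; cannot resolve two components")
    return [[d / det, -b / det], [-c / det, a / det]]


def clip(A):
    """Crude nonnegativity constraint: set negative entries to zero."""
    return [[max(v, 0.0) for v in row] for row in A]


def mcr_als_two(D, C0, n_iter=50):
    """Resolve D (samples x channels) into C (samples x 2) and S (2 x channels)."""
    C = clip(C0)
    for _ in range(n_iter):
        Ct = transpose(C)
        # Spectra step: least-squares S for fixed C, then enforce S >= 0.
        S = clip(matmul(inv2(matmul(Ct, C)), matmul(Ct, D)))
        St = transpose(S)
        # Concentration step: least-squares C for fixed S, then enforce C >= 0.
        C = clip(matmul(matmul(D, St), inv2(matmul(S, St))))
    return C, S
```

A good fit to D does not imply that the recovered profiles equal the true ones: without further constraints, a whole band of nonnegative solutions reproduces the data equally well, which is the rotational ambiguity discussed above.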
The great diversity of these applications indicates that multivariate curve resolution methods are being increasingly adopted by research groups far removed from chemometrics, a consequence of the interest that chemometrics is attracting from fields outside analytical chemistry. Only the more interesting and novel applications are cited here. Multivariate curve resolution techniques continue to be exploited in chemical analysis. Multivariate curve resolution coupled to high-performance liquid chromatography with fast-scanning fluorescence spectroscopy was used to determine 10 selected polycyclic aromatic hydrocarbons (140). Second-order data matrices were obtained for each sample from a chromatographic system operating in the isocratic mode. The performance of alternating least-squares was compared to that of two parallel factor analysis algorithms. Alternating least-squares was able to resolve all of the components in this complex system despite the presence of interferences. Three fluoroquinolone antibiotics were determined in human serum in the presence of salicylate by applying alternating least-squares to lanthanide-sensitized excitation-time decay data (141). The orthogonal projection approach was used to assess peak purity in data from complex biological samples obtained with capillary zone electrophoresis-diode array detection (142). Alternating least-squares was used to resolve comigrating peaks of metabolites and to recover qualitative and quantitative information about comigrating components of urine extract. A direct spectrophotometric method utilizing alternating least-squares for the determination of the compounds that are responsible for the three artificial colors used in beverages was proposed (143). The second-order data consisted
of the spectra of the samples in a matrix that was augmented using the spectra of the three dyes and synthetic mixtures of the dyes. Evolving factor analysis and alternating least-squares were successfully used to deconvolute overlapping peaks from a miniature gas chromatograph equipped with a microsensor array for detection (144). Alternating least-squares was used to deconvolute and quantify ascorbic acid and acetylsalicylic acid in four pharmaceutical preparations using flow injection analysis with pH gradient flow and diode array detection (145). The Schlieren effect was resolved using an additional component in the modeling scheme, and a further component was necessary to model the presence of possible interferences. Malonaldehyde was determined in olive oil using four-way data obtained by recording the kinetic evolution of excitation-emission fluorescence matrices for the product of the Hantzsch reaction between methylamine and the targeted analyte malonaldehyde (146). The second-order advantage in these data was achieved by processing the four-way data using parallel factor analysis combined with nonlinear pseudoinverse regression based on a quadratic polynomial fit and a neural net. Profiling of a chemical weapon precursor for possible forensic signatures of 29 analyte impurities in 6 samples was performed using two-dimensional gas chromatography/mass spectrometry and PARAFAC for the resolution of overlapped GC × GC peaks, which yielded clean spectra for compound library matching (147). A new PARAFAC method for the resolution of peaks across large sections or entire GC × GC-TOFMS chromatograms was developed by Synovec (148) and validated using three different data sets. As noted above, there were too many applications of multivariate curve resolution to cite here. Representative publications are, however, cited in the following paragraph.
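Before turning to those applications, it may help to write out the two models that the alternating least-squares and PARAFAC studies above rely on. The notation is assumed here for illustration and is not taken from any cited paper: $\mathbf{D}_k$ is the data matrix measured for sample $k$, $\mathbf{C}_k$ holds its concentration (or elution) profiles, $\mathbf{S}$ holds the spectra shared by all samples, $\mathbf{E}_k$ and $e_{ijk}$ are residuals, and $F$ is the number of components.

```latex
% Bilinear model with column-wise augmentation (as fitted by alternating least-squares)
\[
\begin{bmatrix} \mathbf{D}_1 \\ \vdots \\ \mathbf{D}_K \end{bmatrix}
=
\begin{bmatrix} \mathbf{C}_1 \\ \vdots \\ \mathbf{C}_K \end{bmatrix}
\mathbf{S}^{\mathsf{T}}
+
\begin{bmatrix} \mathbf{E}_1 \\ \vdots \\ \mathbf{E}_K \end{bmatrix}
\]

% Trilinear PARAFAC model for a three-way array with elements x_{ijk}
\[
x_{ijk} = \sum_{f=1}^{F} a_{if}\, b_{jf}\, c_{kf} + e_{ijk}
\]
```

In the augmented bilinear form each sample keeps its own profiles $\mathbf{C}_k$ while sharing $\mathbf{S}$, so standards and unknowns can be stacked in one analysis. The trilinear form is stricter: the same profiles $a_{if}$ and $b_{jf}$ must describe every sample, differing only by the scale factor $c_{kf}$.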
The sorption process of water in a biocompatible film was monitored by time-resolved in situ attenuated total reflectance spectroscopy, and the noisy and heavily overlapped O-H vibrational bands were deconvolved using alternating least-squares to yield meaningful pure component spectra and detailed kinetic sorption profiles of each component (149). Raman spectra depicting the reorganization of solvent molecules around the solute have been obtained using multivariate curve resolution for acetonitrile, acetone, and pyridine in water (150). The effect of pH on the excitation-emission matrices of fluorescence of CdTe quantum dots capped with mercaptopropionic acid was analyzed using parallel factor analysis and alternating least-squares (151). Only a single component of the parallel factor analysis or alternating least-squares model was needed to describe the pH-induced variation; the other components modeled scattering and background noise. A novel approach to compensating for scatter in reflectance spectroscopy, which supplies a separate scattering spectral fingerprint and an absorption fingerprint to the alternating least-squares algorithm as a constraint, improved the performance of multivariate curve resolution (152). Multivariate curve resolution using alternating least-squares was used to resolve visible CD spectra from the Cu2+ titration of the metal-binding region of a prion protein and determine the number of binding modes used in the uptake of copper by the full binding region of the protein (153). FT-IR and alternating least-squares were used to resolve the fraction and spectral profiles of the different structural motifs (154). Ultraviolet resonance Raman spectroscopy
and multivariate curve resolution using alternating least-squares with a chemically relevant shape constraint were used to resolve secondary structure changes, a task complicated by aromatic side chains, which introduce overlapping vibrational modes (155). Multivariate curve resolution with alternating least-squares was used to study protein folding (156). The data matrix analyzed by alternating least-squares comprised pH-induced transformations in myoglobin monitored by UV-visible absorbance and circular dichroism, together with data from steady-state and stopped-flow experiments. Multivariate curve resolution techniques continue to be exploited in spectroscopic imaging. The application of multivariate curve resolution techniques to hyperspectral confocal fluorescence microscopy continues to be an active area of research with a large and growing literature. Haaland and co-workers (157–159) applied multivariate curve resolution techniques to both real and simulated hyperspectral images weighted to compensate for the two major sources of noise in the data, Poisson-distributed noise and detector read noise, and greatly improved the extraction of pure emission spectra and their concentrations for spherical beads, autofluorescence from fixed lung epithelial cells, and fluorescence of quantum dots in aqueous solutions. Near-infrared chemical imaging is another attractive technique because of its capability to record a great amount of spectral information in a short time period. Multivariate curve resolution methods have been used to provide quantitative and spatial information on all ingredients in a complex tablet matrix composed of five components without prior development of a calibration model (160). The performances of several multivariate curve resolution algorithms (principal component analysis, positive matrix factorization, nonnegative matrix factorization, and alternating least-squares) for near-infrared imaging have been compared (161).
Principal component analysis extracts information that is difficult to interpret, whereas positive matrix factorization gives accurate results and provides a good match with reference spectra but is an algorithm that requires accurate tuning.
Barry K. Lavine is an Associate Professor of Chemistry at Oklahoma State University in Stillwater, OK. He has published approximately 100 papers in chemometrics and is on the editorial board of several journals including the Journal of Chemometrics and the Microchemical Journal. He is the Assistant Editor of Chemometrics for Analytical Letters. Lavine’s research interests encompass many aspects of the applications of computers in chemical analysis including pattern recognition, multivariate curve resolution, and multivariate calibration using genetic algorithms and other evolutionary techniques.
Jerome (Jerry) Workman, Jr. is consulting with Technology Business Associates and has held many technical positions involving chemical analysis and chemometrics. This Chemometrics review article is the fourth in this series that he has coauthored. In his career, Workman has focused on molecular and electronic spectroscopy, process analysis, and chemometrics and has received many key awards for his work. Over the past 25 years he has published widely, including numerous tutorials, scientific papers, book chapters, individual text volumes, software programs, and inventions.
LITERATURE CITED
(1) Lavine, B.; Workman, J. Anal. Chem. 2008, 80 (12), 4519–4531. (2) Hibbert, D. B.; Minkkinen, P.; Faber, N. M.; Wise, B. M. Anal. Chim. Acta 2009, 642 (1-2), 3–5. (3) Hanrahan, G.; Gomez, F. A. Chemometric Methods in Capillary Electrophoresis; John Wiley & Sons: New York, 2009. (4) Ellison, S. L. R.; Farrant, T. J. D.; Barwick, V. J. Practical Statistics for the Analytical Scientist: A Bench Guide; Royal Society of Chemistry: London, U.K., 2009.
(5) Brereton, R. Chemometrics for Pattern Recognition; John Wiley & Sons: New York, 2009. (6) Namiesnik, J., Szefer, P., Eds. Analytical Measurements in Aquatic Environments; Taylor & Francis: Boca Raton, FL, 2009. (7) Hanrahan, G., Ed. Environmental Chemometrics: Principles and Modern Applications; Taylor & Francis: Boca Raton, FL, 2008. (8) Manuel, J.; Andrade-Garda, J. M. Basic Chemometric Techniques in Atomic Spectroscopy; The Royal Society of Chemistry: London, U.K., 2009. (9) Brown, S. D., Tauler, R., Walczak, B., Eds. Comprehensive Chemometrics and Biochemical Data Analysis; Elsevier: Amsterdam, The Netherlands, 2009. (10) Varmuza, K.; Filzmoser, P. Introduction to Multivariate Statistical Analysis in Chemometrics; Taylor & Francis: Boca Raton, FL, 2009. (11) Thijssen, P. State Estimation in Chemometrics: The Kalman Filter and Beyond; Woodhead Press Limited: Cambridge, U.K., 2009. (12) Workman, J., Jr.; Koch, M.; Lavine, B.; Chrisman, R. Anal. Chem. 2009, 81 (12), 4623–4643. (13) Dorman, F. L.; Overton, E. B.; Whiting, J. J.; Cochran, J. W.; Gardea-Torresdey, J. Anal. Chem. 2008, 80 (12), 4487–4497. (14) Liu, S.; Kokot, S.; Will, G. J. Photochem. Photobiol., C: Photochem. Rev. 2009, 10 (4), 159–172. (15) Want, E. Bioanalysis 2009, 1 (4), 805–819. (16) Madsen, R.; Lundstedt, T.; Trygg, J. Anal. Chim. Acta 2010, 659 (1-2), 23–33. (17) Waterman, D. S.; Bonner, F. W.; Lindon, J. C. Bioanalysis 2009, 1 (9), 1559–1578. (18) Garcia-Perez, I.; Vallejo, M.; Garcia, A.; Legido-Quigley, C.; Barbas, C. J. Chromatogr., A 2008, 1204 (2), 130–139. (19) Wang, L.; Mizaikoff, B. Anal. Bioanal. Chem. 2008, 39 (15), 1641–1654. (20) Weckwerth, W. Physiol. Plant. 2008, 132 (2), 176–189. (21) Maher, A. D.; Lindon, J. C.; Nicholson, J. K. Future Med. Chem. 2009, 1 (4), 737–747. (22) Smit, S.; Hoefsloot, H. C. J.; Smilde, A. K. J. Chromatogr., B 2008, 866 (1-2), 77–88. (23) Mutihac, L.; Mutihac, R. Anal. Chim. Acta 2008, 612 (1), 1–18. (24) Tarley, C. R. 
T.; Silveira, G.; Lopes dos Santos, W.; Matos, G. D.; da Silva, P.; Galvao, E.; Bezerra, M. A.; Miro, M.; Ferreira, S. L. C. Microchem. J. 2009, 92 (1), 58–67. (25) Stalikas, C.; Fiamegos, Y.; Sakkas, V.; Albanis, T. J. Chromatogr., A 2009, 1216 (2), 175–189. (26) Hanrahan, G.; Montes, R.; Gomez, F. A. Anal. Bioanal. Chem. 2008, 390 (1), 169–179. (27) Marini, F.; Bucci, R.; Margri, A. L.; Magri, A. D. Microchem. J. 2008, 88 (2), 178–185. (28) Koljonen, J.; Nordling, T. E. M.; Alander, J. T. J. Near Infrared Spectrosc. 2008, 16 (3), 189–197. (29) Ivosev, G.; Burton, L.; Bonner, R. Anal. Chem. 2008, 80 (13), 4933–4944. (30) Yamamoto, H.; Yamaji, H.; Abe, Y.; Harada, K.; Waluyo, D.; Fukusaki, E.; Kondo, A.; Ohno, H.; Fukuda, H. ChemoLab 2009, 98 (2), 136–142. (31) Akella, L. M.; Tomas, O.; Orazine, C.; Hincapie, M.; Hancock, W. S. J. Proteome Res. 2009, 8 (10), 4732–4742. (32) Dorsea, M.; Silva, L.; Silva, M. A.; Cavalcanti, S. ChemoLab 2008, 94 (1), 1–8. (33) Ni, W.; Brown, S. D.; Man, R. Anal. Chem. 2009, 81 (21), 8962–8967. (34) Pang, H.; Tong, T.; Zhao, H. Biometrics 2009, 65 (4), 1021–1029. (35) Tan, C.; Chen, H.; Xia, C. J. Pharm Biomed. Anal. 2009, 49 (3), 746–752. (36) Villa Medina, J. L.; Boque, R.; Ferre, J. Anal. Chim. Acta 2009, 646 (12), 62–68. (37) Vu, T.; Braga-Neto, U. M. Bioinform. Syst. Biol. 2009, 4, 580–589. (38) Culp, M.; Michailidis, G. J. Chem. 2009, 23 (6), 294–303. (39) Debruyne, M.; Serneels, S.; Verdonck, T. J. Chem. 2009, 23 (9), 479– 486. (40) Devos, O.; Ruckebusch, C.; Durand, A.; Duponchel, L.; Huvene, J.-P. ChemoLab 2009, 96 (1), 27–33. (41) Nedenskov, J. K.; Jessen, F.; Jorgenson, B. M. J. Proteome Res. 2008, 7 (3), 1288–1296. (42) Gertheiss, J.; Tutz, G. J. Chem. 2009, 23 (3), 149–151. (43) Zhu, Y.; Fearn, T.; Samuel, D.; Dhar, A.; Hameed, O.; Bown, S. G.; Lovat, L. B. J. Chem. 2008, 22 (2), 130–134. (44) Pedreschi, R.; Hertog Marten, L. A. T. M.; Carpentier, S. C.; Lammertyn, J.; Robben, J.; Noben, J. 
P.; Panis, B.; Swennen, R.; Nicolai, B. M. Proteomics 2008, 8 (7), 1371–1383. (45) Daszykowski, M.; Danielsson, R.; Walczak, B. J. Chromatogr., A 2008, 1192 (1), 157–165. (46) Veselkov, K. A.; Lindon, J. C.; Ebbels, J. C.; Crockford, D.; Volynkin, V. V.; Holmes, E.; Davies, D. B.; Nicholson, J. K. Anal. Chem. 2009, 81 (1), 56–66.
(47) Anderson, P. E.; Reo, N. V.; DelRaso, N. J.; Doom, T.; Raymer, M. L. Metabolomics 2008, 4 (3), 261–272. (48) Peters, S.; van Velzen, E.; Janssen, H.-G. Anal. Bioanal. Chem. 2009, 394 (5), 1273–1281. (49) Suits, F.; Lepre, J.; Du, P.; Bischoff, R.; Horvatovich, P. Anal. Chem. 2008, 80 (9), 3095–3104. (50) Menze, B. H.; Kelm, M.; Masuch, R.; Himmerreich, U.; Bachert, P.; Petrich, W.; Hamprecht, F. A. BMC Bioinf. 2009, 10, 261–265. (51) Ballabio, D.; Skov, T.; Leardi, R.; Bro, R. J. Chem. 2008, 22 (8), 457–463. (52) Blaise, B. J.; Shintu, L.; Laetitia; Elena, B.; Emsley, L.; Dumas, M.-E.; Toulhout, P. Anal. Chem. 2009, 81 (15), 6242–6251. (53) Wongravee, K.; Heinrich, N.; Holmboe, M.; Schaefer, M. L.; Reed, R. L.; Trevejo, J.; Brereton, R. G. Anal. Chem. 2009, 81 (13), 5204–5217. (54) Dougherty, E. R.; Hua, J.; Sima, C. Curr. Genomics 2009, 10 (6), 365– 374. (55) Sussulini, A.; Prando, A.; Maretto, D. A.; Poppi, R. J.; Tasic, L.; Banzato, C. E. M.; Arruda, M. A. Z. Anal. Chem. 2009, 81 (23), 9755–9763. (56) Timea, I.; Kremmer, T.; Heberger, K.; Moinar-Szollosi, E.; Ludanyi, K.; Pocsfavi, G.; Malorni, A.; Drahos, L.; Vekey, K. J. Proteomics 2008, 71 (2), 186–197. (57) Krug, D.; Zurek, G.; Schneider, B.; Garcia, R.; Mueller, R. Anal. Chim. Acta 2008, 624 (1), 97–106. (58) Wiklund, S.; Johansson, E.; Sjoestroem, L.; Mellerowicz, E. J.; Edlund, U.; Shockcor, J. P.; Gottfries, J.; Moritz, T.; Trygg, J. Anal. Chem. 2008, 80 (1), 115–122. (59) Li, X.; Lu, X.; Tian, J.; Gao, P.; Kong, H.; Xu, G. Anal. Chem. 2009, 81 (11), 4468–4475. (60) Berman, E. S.F.; Wu, L.; Fortson, S. L.; Kulp, K. S.; Nelson, D. O.; Wu, K. J. Surf. Interface Anal. 2009, 41 (2), 97–104. (61) Morris, J. S.; Brown, P. J.; Herrick, R. C.; Baggerly, K. A.; Coombes, K. R. Biometrics 2008, 64 (2), 479–489. (62) Kelly, J.; Martin-Hirsch, P. L.; Martin, F. L. Anal. Chem. 2009, 81 (13), 5314–5319. (63) Lopes, M. B.; Wolff, J.-C. Anal. Chim. Acta 2009, 633 (1), 149–155. (64) de Peiner, P.; Vredenbregt, M. 
J.; Visser, T.; de Kaste, D. J. Pharm. Biomed. Anal. 2008, 47 (4-5), 688–694. (65) Charlton, A. J.; Robb, P.; Donarski, J. A.; Godward, J. Anal. Chim. Acta 2008, 618 (2), 196–203. (66) Hughes, A. D.; Glenn, I. C.; Patrick, A. D.; Ellington, A.; Anslyn, E. V. Chem. Eur. J. 2008, 14 (6), 1822–1827. (67) Ratie, F.; Gagne, C.; Terrettaz-Zufferey, A.-L.; Kanevski, M.; Esseiva, P.; Ribaux, O. ChemoLab 2008, 90 (2), 123–131. (68) Fernandez-Varela, R.; Andrade, J. M.; Muniategui, S.; Prada, D.; Ramirez-Villalobos, F. Mar. Pollut. Bull. 2008, 56 (2), 335–347. (69) Widjaja, E.; Lim, G. H.; An, A. Analyst 2008, 133 (4), 493–498. (70) Lloyd, G. R.; Ahmad, S.; Wasim, M.; Brereton, R. G. Anal. Chim. Acta 2009, 649 (1), 33–42. (71) Cheung, W.; Xu, Y.; Thomas, C. L.; Goodacre, R. Analyst 2009, 134 (3), 557–563. (72) Preisner, O.; Loes, J. A.; Menezes, J. C. ChemoLab 2008, 94 (1), 33–42. (73) Forrester, J. B.; Valentine, N. B.; Su, Y.-F.; Johnson, T. J. Anal. Chim. Acta 2009, 651 (1), 24–30. (74) Alexandrakis, D.; Downey, G.; Scannell, A. G. M. J. Agric. Food Chem. 2008, 56 (10), 3431–3437. (75) Thaler, E. R.; Huang, D.; Giebeig, L.; Palmer, J.; Dan, L.; Hanson, C. W.; Cohen, N. Am. J. Rhin. 2008, 22 (1), 29–33. (76) Ly, E.; Piot, O.; Durlach, A.; Bernard, P.; Manfait, M. Analyst 2009, 134 (6), 1208–1214. (77) Gujral, P.; Amrhein, M.; Bonvin, D.; Vallee, J.-P.; Montet, X.; Michoux, N. ChemoLab 2009, 98 (2), 173–181. (78) Stenlund, H.; Gorzsas, A.; Persson, P.; Sundberg, B.; Trygg, J. Anal. Chem. 2008, 80 (18), 6898–6906. (79) Williams, P.; Geladi, P.; Fox, G.; Marena, M. Anal. Chim. Acta 2009, 653 (2), 121–130. (80) Parthasarathy, R.; Thiagarajan, G.; Yao, X.; Wang, Y.-P.; Spencer, P.; Wang, Y. J. Biomed. Opt. 2008, 13 (1), 014020/1–014020/9. (81) Suh, C.; Rajan, K. Mater. Sci. Technol. 2009, 25 (4), 466–471. (82) Jaumot, J.; Eritja, R.; Navea, S.; Gargallo, R. Anal. Chim. Acta 2009, 642 (1-2), 117–126. (83) Laothawornkitkul, J.; Moore, J. P.; Taylor, J. E.; Possell, M.; Gibson, T. 
D.; Hewitt, C. N.; Paul, N. D. Environ. Sci. Technol. 2008, 42 (22), 8433– 8439. (84) Xie, G.; Feng, S.; Feng, X.; Li, Y.; Han, H.; Wang, Y.; Zhu, J.; Yan, L.; Li, L. Archaeometry 2009, 51 (4), 682–699. (85) Zhang, L.; Liang, Y.; Chen, A. Analyst 2009, 134 (8), 1717–1724. (86) Andersson, M. J. Chem. 2009, 23 (10), 518–529. (87) Stenlund, H.; Johansson, E.; Gottfries, J.; Trygg, J. Anal. Chem. 2009, 81 (1), 203–209.
(88) Kruger, U.; Zhou, Y.; Wang, X.; Rooney, D.; Thompson, J. J. Chem. 2008, 22 (1), 1–13. (89) Abdel-Rahman, A. I.; Lim, G. J. J. Chem. 2009, 23 (10), 530–537. (90) Xu, L.; Yu, X.-P.; Lu, X.-L.; Wu, Y.-H.; Wu, H.-L.; Jiang, J.-H.; Shen, G.-L.; Yu, R.-Q. Anal. Chim. Acta 2009, 644 (1-2), 25–29. (91) Indahl, U. G.; Liland, K. H.; Naes, T. J. Chem. 2009, 23 (9), 495–504. (92) Bouveresse, J.-R.; Rutledge, D. N. Anal. Chim. Acta 2009, 642 (1-2), 37–44. (93) Tarumi, T.; Wu, Y.; Small, G. W. Anal. Chem. 2009, 81 (6), 2199–2207. (94) Hernandez, N.; Talavera, I.; Dago, A.; Biscay, R. J.; Ferreira, M. M. C.; Porro, D. J. Chem. 2008, 22 (11-12), 686–694. (95) Chen, T.; Martin, E. Anal. Chim. Acta 2009, 631 (1), 13–21. (96) Bhatt, N.; Narasimhan, S. ChemoLab 2009, 98 (2), 182–194. (97) Kritchman, S.; Nadler, B. ChemoLab 2008, 94 (1), 19–32. (98) Fitzmoser, P.; Liebmann, B.; Varmuza, K. J. Chem. 2009, 23 (4), 160– 171. (99) Eriksson, L.; Trygg, J.; Wold, S. J. Chem. 2008, 22 (11-12), 594–600. (100) Fernandez Pierna, J. A.; Abbas, Q.; Baeten, V.; Dardenne, P. Anal. Chim. Acta 2009, 642 (1-2), 89–93. (101) Teofilo, R. F.; Martens, J. P. A.; Ferreira, M. M. C. J. Chem. 2009, 23 (1-2), 32–48. (102) Li, H.; Liang, Y.; Xu, Q.; Cao, D. Anal. Chim. Acta 2009, 648 (1), 77–84. (103) Moors, J.; Kuligowski, J.; Quintas, G.; Garrigues, S.; de la Guardia, M. Anal. Chim. Acta 2008, 630 (2), 150–160. (104) Kramer, K. E.; Morris, R. E.; Rose-Pehrsson, S. L. ChemoLab 2008, 92 (1), 33–43. (105) Fan, W.; Liang, Y.; Yuan, D.; Wang, J. Anal. Chim. Acta 2008, 623 (1), 22–29. (106) Gujral, P.; Amrhein, M.; Bonvin, D. Anal. Chim. Acta 2009, 642 (1-2), 27–36. (107) Alam, T. M.; Alam, M. K.; McIntyre, S. K.; Volk, D. E. Anal. Chem. 2009, 81 (11), 4433–4443. (108) Sulub, Y.; Small, G. W. Anal. Chem. 2009, 81 (3), 1208–1216. (109) Kramer, K. E.; Small, G. W. Appl. Spectrosc. 2009, 63 (2), 246–255. (110) Liu, L.; Arnold, M. A. Anal. Bioanal. Chem. 2009, 393 (2), 669–677. (111) Cantarelli, M. 
A.; Pellerano, R. G.; Marchevsky, E. J.; Carmina, J. M. J. Agric. Food Chem. 2008, 56 (20), 9345–9349. (112) Christy, A. A. Colloids Surf. 2008, 322 (1-3), 248–252. (113) Marini, F.; Bucci, R.; Ginevro, I.; Magri, A. L. ChemoLab 2009, 97 (1), 52–63. (114) Silva, F. E. B.; Ferraro, M. F.; Parisotto, G.; Muller, E. I.; Flores, E. M. M. J. Pharm. Biomed. Anal. 2009, 49 (3), 800–805. (115) Raven, C.; Skibsted, E.; Bro, R. J. Pharm. Biomed. Anal. 2008, 48 (3), 554–561. (116) Kerr, D. R.; Atkinson, D. A. Analyst 2009, 134 (11), 2329–2337. (117) Borraccetti, M. D.; Damiani, P. C.; Olivieri, A. C. Analyst 2009, 134 (8), 1682–1691. (118) Goudarzi, N.; Goodarzi, M. Mol. Phys. 2009, 107 (14), 1495–1503. (119) Kookana, R. S.; Janik, L. J.; Forouzangohar, M.; Forrester, S. T. J. Agric. Food Chem. 2008, 56 (9), 3208–3213. (120) Pan, Y.; Jiang, J.; Wang, R.; Cao, H.; Cui, Y. J. Hazard. Mater. 2009, 168 (2-3), 962–969. (121) Kiralj, R.; Ferreira, M. M. C. J. Phys. Chem. A 2008, 112 (27), 6134– 6149. (122) Taylor, M.; Urquhart, A. J.; Anderson, D. G.; Langer, R.; Davies, M. C.; Alexander, M. R. Surf. Interface Anal. 2009, 41 (2), 127–135. (123) Shinzawa, H.; Awa, K.; Kanematsu, W.; Ozaki, Y. J. Raman Spectrosc. 2009, 40 (12), 1720–1725. (124) deJuan, A.; Maeder, M.; Hancewicz, T.; Duponchel, L.; Tauler, R. In Infrared and Raman Spectroscopic Imaging; Salzer, R., Siesler, H. W., Eds.; Wiley-VCH: Weinheim, Germany, 2009; pp 65-109. (125) Garrido, M.; Rius, F. X.; Larrechi, M. S. Anal. Bioanal. Chem. 2008, 390 (8), 2059–2066. (126) Terrado, M.; Barcelo, D.; Tauler, R. Environ. Sci. Technol. 2009, 43 (14), 5321–5326. (127) Mjos, S. A.; Grung, B. Anal. Chim. Acta 2009, 640 (1-2), 33–39. (128) Malinowski, E. R. J. Chem. 2009, 23 (1-2), 64–65. (129) Rajko, R. J. Chem. 2009, 23 (4), 172–178. (130) Abdollahi, H.; Maeder, M.; Tauler, R. Anal. Chem. 2009, 81 (6), 2115– 2122. (131) Rajko, R. Anal. Chim. Acta 2009, 645 (1-2), 18–24. (132) Hong, S. J. Chem. 2009, 23 (7-8), 371–384. 
(133) Kopriva, I.; Jeric, I. J. Mass Spectrom. 2009, 44 (9), 1378–1388. (134) Tan, S.-T.; Zhu, H.; Chew, W. Anal. Chim. Acta 2009, 639 (1-2), 29–41. (135) Xu, W.; Chen, K.; Liang, D.; Chew, W. Anal. Biochem. 2009, 387 (1), 42–53. (136) de Juan, A.; Maeder, M.; Hancewicz, T.; Tauler, R. J. Chem. 2008, 22, 291–298.
(137) Berman, M.; Phatak, A.; Langstrom, R. J. Chem. 2009, 23 (1/2), 101– 116. (138) Van Benthem, M. H.; Keenan, M. R. J. Chem. 2008, 22 (5), 345–354. (139) Richard, S.; Miller, R.; Gemperline, P. Appl. Spectrosc. 2008, 62 (2), 197– 206. (140) Bortolato Santiago, A.; Arancibia, J. A.; Escandar, G. M. Anal. Chem. 2009, 81 (19), 8074–84. (141) Loszano, V. A.; Tauler, R.; Ibanez, G. A.; Olivieri, A. C. Talanta 2009, 77 (5), 1715–1723. (142) Szymanska, E.; Makuszewski, M. J.; Vander Heyden, Y.; Kaliszan, R. Electrophoresis 2009, 30 (20), 3573–3581. (143) Llamas, N. E.; Garrido, M.; Di Nezio, M. S.; Band, B.; Fernandez, S. Anal. Chim. Acta 2009, 655 (1-2), 38–42. (144) Chunguang, J.; Zellers, E. T. Sens. Actuators, B 2009, B139 (2), 548– 556. (145) Carneiro, R. L.; Braga, J. W. B.; Poppi, R. J.; Tauler, R. Analyst 2008, 133 (6), 774–783. (146) Garcia-Reiriz, A.; Damiani, P. C.; Olivieri, A. C.; Canada-Canada, F.; Munoz de la Pena, A. Anal. Chem. 2008, 80 (19), 7248–7256. (147) Hoggard, J. C.; Wahl, J. H.; Synovec, R. E.; Mong, G. M.; Fraga, C. G. Anal. Chem. 2010, 82 (2), 689–698. (148) Hoggard, J. C.; Siegler, W. C.; Synovec, R. E. J. Chem. 2009, 23 (7-8), 421–431. (149) Tanabe, A.; Morita, S.; Tanaka, M.; Ozaki, Y. Appl. Spectrosc. 2008, 62 (1), 46–50.
(150) Perera, P.; Wyche, M.; Loethen, Y.; Ben-Amotz, D. J. Am. Chem. Soc. 2008, 130 (14), 4576–7. (151) Leitao, J. M. M.; Goncalves, H. M. C.; Esteves, S.; Joaquim, C. G. Anal. Chim. Acta 2008, 628 (2), 143–154. (152) Kessler, W.; Oelkrug, D.; Kessler, R. Anal. Chim. Acta 2009, 642 (1-2), 127–134. (153) Pollock, J. B.; Cutler, P. J.; Kenney, J. M.; Gemperline, P. J.; Burns, C. S. Anal. Biochem. 2008, 377 (2), 223–233. (154) Shariati-Rad, M.; Hasani, M. Biochimie 2009, 91 (7), 850–856. (155) Simpson, J. V.; Balakrishnan, G.; Jiji, R. D. Analyst 2009, 134 (1), 138–147. (156) Cutler, P.; Gemperline, P. J.; de Juan, A. Anal. Chim. Acta 2009, 632 (1), 52–62. (157) Jones, H. D. T.; Haaland, D. M.; Sinclair, M. B.; Melgaard, D. K.; Van Benthem, M. H.; Pedroso, M. C. J. Chem. 2008, 22 (9), 482–490. (158) Haaland, D. M.; Jones, H. D. T.; Van Benthem, M. H.; Sinclair, M. B.; Melgaard, D. K.; Stork, C. L.; Pedroso, M. C.; Liu; Andrews, N. L.; Lidke, D. S. Appl. Spectrosc. 2008, 22 (9), 482–490. (159) Vermaas, W. F. J.; Timlin, J. A.; Jones, H. D. T.; Sinclair, M. B.; Nieman, L. T.; Hamad, S. W.; Melgaard, D. K.; Haaland, D. M. Proc. Natl. Acad. Sci. U.S.A. 2008, 105 (10), 4050–4055. (160) Amigo, J. M.; Ravn, C. Eur. J. Pharm. Sci. 2009, 37 (2), 76–82. (161) Gendrin, C.; Roggo, Y.; Collet, C. J. Near Infrared Spectrosc. 2008, 16 (3), 151–157.
Analytical Chemistry, Vol. 82, No. 12, June 15, 2010