
Comment

Comment on “Mass-Remainder Analysis (MARA): a New Data Mining Tool for Copolymer Characterization”: an example of multiple discovery
Thierry Nicolas Jean Fouquet
Anal. Chem., Just Accepted Manuscript • DOI: 10.1021/acs.analchem.8b01628 • Publication Date (Web): 12 Jun 2018


Analytical Chemistry

Comment on “Mass-Remainder Analysis (MARA): a New Data Mining Tool for Copolymer Characterization”: an example of multiple discovery
Thierry Fouquet
Research Institute for Sustainable Chemistry, National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, Japan.

This comment highlights the similarity between the mathematical background proposed by T. Nagy and co-workers in their article entitled “Mass-Remainder Analysis (MARA): a New Data Mining Tool for Copolymer Characterization” and the formulas developed by our group in a letter to the editor (ref 2) entitled “First Gut Instincts Are Always Right: The Resolution Required for a Mass Defect Analysis of Polymer Ions Can Be as Low as Oligomeric”. The similarity may not be obvious at first sight, owing to the processing of different types of datasets and the use of two different names, the latter revealing two different approaches to producing this new post-acquisition data processing tool. The letter to the editor published by our group is the logical continuation of a series of papers aiming at rejuvenating the Kendrick mass defect (KMD) analysis of mass spectra from polymers, which justifies our choice to keep the term “Kendrick” in the name of our method as part of an “advanced KMD analysis”. Conversely, the MARA tool proposed by Nagy et al. has been specifically developed to be free of any change of mass scale during the computation, as stated in their article, which justifies the use of “mass-remainder” only. As the very first step of a longer procedure, the “mass-remainder values” (noted MRs) of MARA (ref 1) are calculated as follows:

\[ \mathrm{MR} = m/z - \mathrm{floor}\left(\frac{m/z}{R}\right) \cdot R \tag{1} \]

where R is the exact mass of the repeating unit of the polymer backbone on the IUPAC mass scale and floor(x) is the greatest integer less than or equal to x (this corresponds to the function “int()” used by Nagy et al., but “floor()” allows an easier comparison of the formulas, vide infra). Only the term “MR” has been used, although a slightly more accurate notation would be MR(R), since the “mass-remainder values” are a function of the chosen repeating unit R, which acts as a variable. Turning to our methodology, based on the calculation of the so-called “remainders of Kendrick mass” (noted RKM) (ref 2), the preliminary definition of a Kendrick mass (noted KM(R)) extended to polymer ions (ref 3), using the repeating unit R as the base unit (hence the variable), is:

\[ \mathrm{KM}(R) = m/z \cdot \frac{\mathrm{round}(R)}{R} \tag{2} \]
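Equations (1) and (2) can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used by either group: the function names, the monoisotopic mass taken for the propylene oxide unit (C3H6O, about 58.04186) and the sample m/z value are assumptions made for the example.

```python
import math

# Assumption: monoisotopic mass of the propylene oxide repeating unit (C3H6O).
R_PO = 58.04186


def mass_remainder(mz: float, r: float) -> float:
    """Equation (1): MR = m/z - floor((m/z) / R) * R."""
    return mz - math.floor(mz / r) * r


def kendrick_mass(mz: float, r: float) -> float:
    """Equation (2): KM(R) = m/z * round(R) / R."""
    return mz * round(r) / r


if __name__ == "__main__":
    mz = 1000.0  # hypothetical peak position, for illustration only
    print("MR(R) =", mass_remainder(mz, R_PO))
    print("KM(R) =", kendrick_mass(mz, R_PO))
```

By construction, the mass remainder always falls between 0 (included) and R (excluded), whereas the Kendrick mass is merely a rescaled m/z value.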

As a complementary quantity to the more traditional KMDs (refs 3-6), defined as the difference between the rounded and the exact KM (KMD = round(KM) − KM), and specifically developed for mass spectral data of low accuracy (ref 2), the notion of “remainders of KM” has been defined as follows:

\[ \mathrm{RKM}(R) = \frac{\mathrm{KM}(R)}{\mathrm{round}(R)} - \mathrm{floor}\left(\frac{\mathrm{KM}(R)}{\mathrm{round}(R)}\right) \tag{3} \]

Both KMs and RKMs are functions of the repeating unit R, which justifies the notations KM(R) and RKM(R). Replacing KM(R) by its definition from Equation (2), Equation (3) becomes:

\[ \mathrm{RKM}(R) = \frac{m/z}{R} - \mathrm{floor}\left(\frac{m/z}{R}\right) \tag{4} \]

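Comparing Equation (4) (RKMs) with Equation (1) (MRs of MARA), dividing Equation (1) by R gives exactly the right-hand side of Equation (4), so that:

\[ \underbrace{\mathrm{RKM}(R)}_{\text{Remainders of KM analysis (ref 2)}} = \frac{\overbrace{\mathrm{MR}(R)}^{\text{MARA (ref 1)}}}{R} \tag{5} \]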
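Equation (5) can be checked numerically with a short Python sketch. The m/z values below are hypothetical and the mass of the propylene oxide unit is an assumed monoisotopic value; the function names are chosen for this example only.

```python
import math

# Assumption: monoisotopic mass of the propylene oxide repeating unit (C3H6O).
R_PO = 58.04186

# Hypothetical peak positions, for illustration only.
SAMPLE_MZ = [500.3, 1023.71, 1987.25]


def mass_remainder(mz: float, r: float) -> float:
    """Equation (1): MR = m/z - floor((m/z) / R) * R."""
    return mz - math.floor(mz / r) * r


def rkm(mz: float, r: float) -> float:
    """Equations (2) and (3): remainder of the Kendrick mass."""
    km = mz * round(r) / r            # Equation (2)
    quotient = km / round(r)
    return quotient - math.floor(quotient)  # Equation (3)


if __name__ == "__main__":
    for mz in SAMPLE_MZ:
        # Equation (5): both columns should agree within rounding error.
        print(mz, rkm(mz, R_PO), mass_remainder(mz, R_PO) / R_PO)
```

The two printed columns are expected to agree to within floating-point error, the RKM values lying between 0 and 1 while the MR values lie between 0 and R.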

The “remainders of KM” proposed in the framework of an “advanced KMD analysis” and the “mass-remainder values” proposed for the MARA technique are thus the same computation up to a division by the mass of the repeating unit, whose only effect is that MRs range from 0 to R whereas RKMs range from 0 to 1. This is briefly exemplified with the data processing of the MALDI-TOF mass spectrum of a poly(ethylene oxide-block-propylene oxide-block-ethylene oxide) 1900 g mol⁻¹ 50 wt% EO (noted P(EO-b-PO-b-EO), Figure 1A, 435414 from Sigma-Aldrich, St. Louis, US), deliberately chosen to mimic the sample used by Nagy et al. (ref 1). Beyond the regular KMD plot computed with PO as the base unit (Figure 1B), the RKM plot computed with the “remainders of KM” (Equations (2) and (3)) and depicted in Figure 1C displays a strictly identical shape in terms of point clustering (four main groups, obliquely oriented) and separation capabilities (each group is composed of several horizontal lines assigned to EO-PO co-oligomers and isotopes) to the “MR vs. m/z plot” proposed by Nagy et al. (Figure 2 in their article, ref 1), further supporting the validity of Equation (5).

Figure 1. (A) MALDI-TOF mass spectrum of a P(EO-b-PO-b-EO) triblock copolymer recorded with a JMS S3000 spiralTOF mass spectrometer (JEOL, Tokyo, Japan). (B) Regular KMD plot (base unit: PO). (C) Remainders of KM (RKM) plot (base unit: PO) to be compared to the “MR vs. m/z plot” (ref 1). (D) Resolution-enhanced KMD plot (base unit: PO/57).

As part of our “advanced KMD analysis” toolkit, the resolution-enhanced KMD plot (refs 4-6) computed with a fractional PO/57 base unit is depicted in Figure 1D. Apart from some point misalignments, its appearance is highly similar to the shape of the RKM plot and, a fortiori, to the shape of the “MR vs. m/z plot” proposed by Nagy et al. (ref 1). The intimate link between KMDs and RKMs has very recently been demonstrated for the case of multiply charged polymer ions (ref 7). Considering the similarity of RKMs and MRs, it may be worth reporting on the same link between MRs and KMDs for the simpler case of singly charged ions. The resolution-enhanced KMs (refs 4, 5) used in Figure 1D have been developed from the regular KMs (ref 3, Equation (2)) but using a new fractional base unit R/X, with X being an integer (i.e., the chemical moiety R divided by an integer to form a mathematical moiety R/X). Following several refinements of the technique (ref 6), it has been found that the optimal divisors lie between X = round(2R/3) and X = round(2R), with the recommended values defined as X = round(R) + n, n being a positive or negative integer (n = ±1, ±2, ±3, …). This translates mathematically into:

\[ \mathrm{KM}(R, n) = m/z \cdot \frac{\mathrm{round}\left(\dfrac{R}{\mathrm{round}(R) + n}\right)}{\dfrac{R}{\mathrm{round}(R) + n}} \tag{6} \]

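The resolution-enhanced KMs become a function of R and n (hence noted KM(R, n)), as opposed to the regular KMs, noted KM(R), which are a function of R only. The regular and resolution-enhanced KMDs are the differences between the rounded and exact values of the regular and resolution-enhanced KMs, respectively:

\[ \begin{aligned} \mathrm{KMD}(R) &= \mathrm{round}\left(\mathrm{KM}(R)\right) - \mathrm{KM}(R) \\ \mathrm{KMD}(R, n) &= \mathrm{round}\left(\mathrm{KM}(R, n)\right) - \mathrm{KM}(R, n) \end{aligned} \tag{7} \]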

While exploring the applicability of the remainders of KM to multiply charged ions (ref 7), the definition of RKM has also been generalized as follows:

\[ \mathrm{RKM}(R, n) = \frac{\mathrm{KM}(R)}{\mathrm{round}(R)/n} - \mathrm{floor}\left(\frac{\mathrm{KM}(R)}{\mathrm{round}(R)/n}\right) \tag{8} \]

with n being an integer (typically a charge state), so that RKM becomes a function of R and n. Replacing KM(R) by its definition (Equation (2)), Equation (8) can be rewritten as follows:

\[ \mathrm{RKM}(R, n) = n \cdot \frac{m/z}{R} - \mathrm{floor}\left(n \cdot \frac{m/z}{R}\right) \tag{9} \]

Such a definition of RKM is fully compatible with the singly charged polymer ions observed in MALDI, as it simply produces a different clustering of points in the associated RKM plot depending on the value of n. With KMD(R) the regular Kendrick mass defect computed using R as the base unit, KMD(R, n) its resolution-enhanced counterpart computed using the fraction R/(round(R) + n) as the base unit, and RKM(R, n) the remainders of Kendrick mass computed with R as the base unit and round(R)/n as the divisor, it has been demonstrated (ref 8) that these three outputs of an “advanced KMD analysis” are linked via a unifying equation:

\[ \underbrace{\mathrm{KMD}(R)}_{\text{Regular KMD analysis (ref 3)}} - \underbrace{\mathrm{KMD}(R, n)}_{\text{Resolution-enhanced KMD analysis (refs 4-6)}} = \underbrace{\mathrm{RKM}(R, n)}_{\text{Resolution-enhanced remainders of KM analysis (ref 2)}} \tag{10} \]
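The three quantities linked by Equation (10) can be computed side by side, as in the following Python sketch. It is a minimal illustration under stated assumptions: the function names, the assumed monoisotopic mass of the propylene oxide unit and the m/z values are hypothetical choices for the example. As a precaution, the two sides of Equation (10) are compared modulo 1, because the rounding steps of Equation (7) can shift the difference of the two KMDs by a whole number.

```python
import math

# Assumption: monoisotopic mass of the propylene oxide repeating unit (C3H6O).
R_PO = 58.04186

# Hypothetical peak positions, for illustration only.
SAMPLE_MZ = [500.3, 1023.71, 1987.25]


def km_regular(mz: float, r: float) -> float:
    """Equation (2): regular Kendrick mass."""
    return mz * round(r) / r


def km_enhanced(mz: float, r: float, n: int) -> float:
    """Equation (6): Kendrick mass with the fractional base unit R/(round(R)+n)."""
    base = r / (round(r) + n)
    return mz * round(base) / base


def kmd(km: float) -> float:
    """Equation (7): rounded minus exact Kendrick mass."""
    return round(km) - km


def rkm_general(mz: float, r: float, n: int) -> float:
    """Equation (9): generalized remainder of the Kendrick mass."""
    quotient = n * mz / r
    return quotient - math.floor(quotient)


def unifying_offset(mz: float, r: float, n: int) -> float:
    """Left-hand side minus right-hand side of Equation (10)."""
    lhs = kmd(km_regular(mz, r)) - kmd(km_enhanced(mz, r, n))
    return lhs - rkm_general(mz, r, n)


if __name__ == "__main__":
    for mz in SAMPLE_MZ:
        for n in (-1, 1, 2):
            # The offset is expected to be a whole number (often 0 or -1).
            print(mz, n, round(unifying_offset(mz, R_PO, n), 9))
```

With n = −1 and the propylene oxide unit, the fractional base unit is PO/57, the divisor used for the resolution-enhanced plot of Figure 1D.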

Since RKM(R) of the KMD analysis and MR(R) of MARA have been found to be similar quantities in the previous section (Equation (5)), a generalized definition of MR can also be proposed, with an additional integer parameter n, as follows:

\[ \mathrm{MR}(R, n) = n \cdot m/z - \mathrm{floor}\left(n \cdot \frac{m/z}{R}\right) \cdot R \tag{11} \]

It is strictly equivalent to the original definition proposed by Nagy et al. (ref 1) when n = 1, but it would also produce informative charge-dependent and/or resolution-enhanced “MR vs. m/z plots” by varying the value of n. From Equations (9) and (11), it follows immediately that:

\[ \mathrm{RKM}(R, n) = \frac{\mathrm{MR}(R, n)}{R} \tag{12} \]

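which is a generalization of Equation (5) as a function of R and n. Combined with Equation (10), it eventually means that:

\[ \underbrace{\mathrm{KMD}(R)}_{\text{Regular KMD analysis (ref 3)}} - \underbrace{\mathrm{KMD}(R, n)}_{\text{Resolution-enhanced KMD analysis (refs 4-6)}} = \frac{\overbrace{\mathrm{MR}(R, n)}^{\text{Resolution-enhanced MARA (ref 1)}}}{R} \tag{13} \]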
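The generalized mass remainder of Equation (11) and its relation to RKM(R, n) in Equation (12) can be sketched as follows. As before, this is an illustrative sketch only: the function names, the assumed monoisotopic mass of the propylene oxide unit and the m/z values are hypothetical.

```python
import math

# Assumption: monoisotopic mass of the propylene oxide repeating unit (C3H6O).
R_PO = 58.04186

# Hypothetical peak positions, for illustration only.
SAMPLE_MZ = [500.3, 1023.71, 1987.25]


def mr_general(mz: float, r: float, n: int = 1) -> float:
    """Equation (11): MR(R, n) = n * m/z - floor(n * (m/z) / R) * R."""
    return n * mz - math.floor(n * mz / r) * r


def rkm_general(mz: float, r: float, n: int = 1) -> float:
    """Equation (9): RKM(R, n) = n * (m/z) / R - floor(n * (m/z) / R)."""
    quotient = n * mz / r
    return quotient - math.floor(quotient)


if __name__ == "__main__":
    for mz in SAMPLE_MZ:
        for n in (1, 2, 3):
            # Equation (12): the last two columns should agree.
            print(mz, n, rkm_general(mz, R_PO, n), mr_general(mz, R_PO, n) / R_PO)
```

Setting n = 1 (the default here) recovers the original mass remainder of Equation (1), so a single function covers both the original and the generalized definitions.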

Displaying the MRs (or RKMs) takes advantage of the expanded y-axis of the resolution-enhanced KMDs, while the errors of mass measurement nearly cancel out in the subtraction from the regular KMDs. This accounts for the strikingly clear horizontal alignments and the separation capabilities of the “MR vs. m/z” and RKM plots. Incidentally, it also means that all the plots (KMD-based or remainder-based) can be computed simultaneously from a single set of inputs, consisting of the mass of the repeating unit R as the only indispensable input and an integer n in the case of resolution enhancement for either KMDs or MRs/RKMs. The rise of MARA and RKM analysis nicely illustrates the concept of “multiple discovery”, in which two or more inventors come up with the same idea simultaneously and independently, but via different pathways or for different applications. In the present case, it ultimately broadens the scope of what is possible with a simple one-step calculation for single-stage MS of copolymers (ref 1), single-stage MS of high molecular weight homopolymers (ref 2) and MALDI-TOF/TOF data (ref 2), among others to come.

Acknowledgments
I am sincerely grateful to T. Nagy, A. Kuki, M. Zsuga and S. Kéki for their nice article and related research work. Our two groups had a good idea which was definitely worth being published.


References
(1) Nagy, T.; Kuki, A.; Zsuga, M.; Kéki, S. Anal. Chem. 2018, 90, 3892-3897.
(2) Fouquet, T.; Satoh, T.; Sato, H. Anal. Chem. 2018, 90, 2404-2408.
(3) Sato, H.; Nakamura, S.; Teramoto, K.; Sato, T. J. Am. Soc. Mass Spectrom. 2014, 25, 1346-1355.
(4) Fouquet, T.; Sato, H. Mass Spectrom. (Tokyo) 2017, 6, A0055.
(5) Fouquet, T.; Sato, H. Anal. Chem. 2017, 89, 2682-2686.
(6) Fouquet, T.; Sato, H. Rapid Commun. Mass Spectrom. 2017, 31, 1067-1072.
(7) Fouquet, T.; Cody, R. B.; Ozeki, Y.; Kitagawa, S.; Ohtani, H.; Sato, H. J. Am. Soc. Mass Spectrom. 2018, DOI: 10.1007/s13361-018-1972-4.
