Standardization of complex biologically derived spectrochemical datasets

Spectroscopic techniques such as Fourier-transform infrared (FTIR) spectroscopy are used to study interactions of light with biological materials. This interaction forms the basis of many analytical assays used in disease screening/diagnosis, microbiological studies, and forensic/environmental investigations. Advantages of spectrochemical analysis are its low cost, minimal sample preparation, non-destructive nature and substantially accurate results. However, an urgent need exists for repetition and validation of these methods in large-scale studies and across different research groups, which would bring the method closer to clinical and/or industrial implementation. For this to succeed, it is important to understand and reduce the effect of random spectral alterations caused by inter-individual, inter-instrument and/or inter-laboratory variations, such as variations in air humidity and CO2 levels, and aging of instrument parts. Spectral standardization is therefore critical to the widespread adoption of these spectrochemical technologies. Calibration transfer procedures standardize the spectral response of a secondary instrument so that it resembles the spectral response of a primary instrument. With computational methods such as direct standardization (DS) and piecewise direct standardization (PDS), different sources of variation can be normalized into a single model, so that measurements performed under different conditions generate the same result without the need for a full recalibration. Here, we have constructed a protocol for model standardization using different transfer technologies described for FTIR spectrochemical applications. This is a critical step toward the construction of a practical spectrochemical analysis model for daily routine analysis, where uncertain and random variations are present.
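To make the idea of direct standardization concrete, the sketch below estimates a transfer matrix that maps spectra from a secondary instrument onto the response of a primary instrument. It is only an illustration under stated assumptions, not the protocol's implementation: the data are synthetic, the function names (`fit_direct_standardization`, `apply_direct_standardization`, `simulate`) are hypothetical, and the mean-centred minimum-norm least-squares formulation is one common way of writing DS.

```python
import numpy as np


def fit_direct_standardization(secondary, primary):
    """Estimate F and an offset so that secondary @ F + offset approximates primary.

    Both inputs are (n_transfer_samples, n_wavenumbers) arrays holding the same
    transfer samples measured on the two instruments.
    """
    mean_s = secondary.mean(axis=0)
    mean_p = primary.mean(axis=0)
    # Minimum-norm least-squares solution of (secondary - mean_s) @ F = (primary - mean_p).
    F, *_ = np.linalg.lstsq(secondary - mean_s, primary - mean_p, rcond=None)
    offset = mean_p - mean_s @ F
    return F, offset


def apply_direct_standardization(spectra, F, offset):
    """Map secondary-instrument spectra into the primary-instrument domain."""
    return spectra @ F + offset


# Synthetic example (assumption): spectra are mixtures of three Gaussian bands.
rng = np.random.default_rng(0)
n_wavenumbers = 50
axis = np.linspace(0.0, 1.0, n_wavenumbers)
bands = np.stack([np.exp(-((axis - c) ** 2) / 0.005) for c in (0.25, 0.5, 0.75)])


def simulate(n_samples):
    """Return (primary, secondary) spectra; the secondary instrument has a
    gain change, a baseline offset and a little noise (all assumed values)."""
    conc = rng.uniform(0.5, 1.5, size=(n_samples, 3))
    primary = conc @ bands
    secondary = 0.8 * primary + 0.05 + rng.normal(0.0, 1e-4, primary.shape)
    return primary, secondary


primary_transfer, secondary_transfer = simulate(20)  # transfer set
primary_test, secondary_test = simulate(10)          # held-out samples

F, offset = fit_direct_standardization(secondary_transfer, primary_transfer)
standardized_test = apply_direct_standardization(secondary_test, F, offset)

rmse_before = np.sqrt(np.mean((secondary_test - primary_test) ** 2))
rmse_after = np.sqrt(np.mean((standardized_test - primary_test) ** 2))
print(f"RMSE before: {rmse_before:.4f}, after: {rmse_after:.4f}")
```

The error is evaluated on held-out samples because, with fewer transfer samples than wavenumbers, the fit on the transfer set itself is exact and therefore uninformative. PDS follows the same principle but fits a separate local model for each wavenumber using a small window of neighbouring channels on the secondary instrument.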


The datasets generated and/or analyzed during the current study are available from the corresponding authors on reasonable request.


Outlier detection algorithm: https://doi.org/10.6084/m9.figshare.7066613.v1


C.L.M.M. thanks Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) - Brazil (grant 88881.128982/2016-01) for financial support. The work in the laboratory of F.L.M. was supported in part by The Engineering and Physical Sciences Research Council (EPSRC; grant nos: EP/K023349/1 and EP/K023373/1). M.P. acknowledges the Rosemere Cancer Foundation for funding.


These authors contributed equally: Camilo L.M. Morais, Maria Paraskevaidi.


School of Pharmacy and Biomedical Sciences, University of Central Lancashire, Preston, UK


Key Lab of Urban Environment and Health, Institute of Urban Environment, Chinese Academy of Sciences, Xiamen, China


Division of Biomedical and Life Sciences, Faculty of Health and Medicine, Lancaster University, Lancaster, UK


Spectroscopy Products Division, Renishaw plc., New Mills, Wotton-under-Edge, UK


Institute of Chemistry, Biological Chemistry and Chemometrics, Federal University of Rio Grande do Norte, Natal, Brazil


Department of Obstetrics and Gynaecology, Lancashire Teaching Hospitals NHS Foundation, Preston, UK


Department of Pathology, University of Illinois at Chicago, Chicago, IL, USA


Institute of Astronomy, Geophysics and Atmospheric Sciences, University of São Paulo, São Paulo, Brazil


F.L.M. is the principal investigator who conceived and developed the idea for the article; C.L.M.M. and M.P. wrote the manuscript. L.C., N.J.F., M.I., K.M.G.L., P.L.M.-H., H.S., J.T., M.J.W., D.Z. and Y.-G.Z. contributed recommendations and provided feedback and changes to the manuscript, and C.L.M.M., M.P. and F.L.M. brought together the text and finalized the manuscript.


Journal peer review information: Nature Protocols thanks Åsmund Rinnan and other anonymous reviewer(s) for their contribution to the peer review of this work.


Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


a–d, Average (a) raw and (b) preprocessed spectra for healthy control samples, and average (c) raw and (d) preprocessed spectra for cancer samples across three different instruments (A, B and C).


a, PCA scores for healthy control samples according to the instrument used for spectra acquisition (A, B and C). b, PCA scores for cancer samples according to the instrument used for spectra acquisition (A, B and C). c, Hotelling’s T² versus Q residuals test for healthy control samples according to the instrument used for spectra acquisition (A, B and C) based on a PCA using 5 PCs (94.77% cumulative variance). d, Hotelling’s T² versus Q residuals test for cancer samples according to the instrument used for spectra acquisition (A, B and C) based on a PCA using 5 PCs (92.96% cumulative variance). Circled samples in c and d indicate outliers removed. Confidence ellipse was 95%, depicted in blue in a and b.
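The Hotelling’s T² versus Q residuals test mentioned in this caption can be sketched as follows. This is a minimal illustration rather than the authors' outlier detection algorithm: the data are synthetic, the function name `hotelling_t2_and_q` is hypothetical, and the empirical 95th-percentile cut-offs are an assumption made for simplicity (statistical limits based on F or chi-squared distributions are often used instead).

```python
import numpy as np


def hotelling_t2_and_q(X, n_components):
    """Return Hotelling's T2, Q residuals and cumulative explained variance
    for a PCA model with n_components fitted to the rows of X."""
    Xc = X - X.mean(axis=0)                      # mean-centre each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components].T
    score_var = s[:n_components] ** 2 / (X.shape[0] - 1)
    # T2: squared distance from the model centre inside the PC space.
    t2 = np.sum(scores ** 2 / score_var, axis=1)
    # Q: squared reconstruction error left outside the PC space.
    residuals = Xc - scores @ loadings.T
    q = np.sum(residuals ** 2, axis=1)
    explained = np.sum(s[:n_components] ** 2) / np.sum(s ** 2)
    return t2, q, explained


# Synthetic data (assumption): two underlying profiles plus noise,
# with an artificial spike added to sample 0 to act as an outlier.
rng = np.random.default_rng(1)
n_samples, n_vars = 40, 30
grid = 2 * np.pi * np.arange(n_vars) / n_vars
profiles = np.stack([np.sin(grid), np.cos(grid)])
X = rng.normal(size=(n_samples, 2)) @ profiles
X += rng.normal(0.0, 0.05, size=(n_samples, n_vars))
X[0, 10] += 5.0

t2, q, explained = hotelling_t2_and_q(X, n_components=2)

# Assumed cut-offs: empirical 95th percentiles of each statistic.
t2_limit = np.percentile(t2, 95)
q_limit = np.percentile(q, 95)
flagged = (t2 > t2_limit) | (q > q_limit)
print("cumulative variance:", round(float(explained), 4))
print("flagged samples:", np.flatnonzero(flagged))
```

Samples with a high T² are extreme within the model, whereas samples with a high Q are poorly described by it, so plotting one statistic against the other separates the two kinds of unusual spectra.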


a, PCA loadings for healthy control samples measured in different instruments (A, B and C). b, PCA loadings for cancer samples measured in different instruments (A, B and C).


a,b, Average (a) raw and (b) preprocessed spectra for healthy control samples acquired with instrument A depending on the operator. c,d, Average (c) raw and (d) preprocessed spectra for healthy control samples acquired with instrument B depending on the operator. e,f, Average (e) raw and (f) preprocessed spectra for healthy control samples acquired with instrument C depending on the operator.


a,b, Average (a) raw and (b) preprocessed spectra for cancer samples acquired with instrument A depending on the operator. c,d, Average (c) raw and (d) preprocessed spectra for cancer samples acquired with instrument B depending on the operator. e,f, Average (e) raw and (f) preprocessed spectra for cancer samples acquired with instrument C depending on the operator.


a,b, PCA scores for (a) healthy control and (b) cancer samples acquired with instrument A depending on the operator. c,d, PCA scores for (c) healthy control and (d) cancer samples acquired with instrument B depending on the operator. e,f, PCA scores for (e) healthy control and (f) cancer samples acquired with instrument C depending on the operator. Confidence ellipse was 95%, depicted in blue.


a, Hotelling’s T² versus Q residuals test based on a PCA using 8 PCs (99.07% cumulative variance) for healthy control samples depending on the instrument for spectra acquisition (A, B and C) used by operator 2. b, Hotelling’s T² versus Q residuals test based on a PCA using 5 PCs (96.92% cumulative variance) for cancer samples depending on the instrument for spectra acquisition (A, B and C) used by operator 2. Circled sample in a indicates an outlier removed. The Hotelling’s T² versus Q residuals test for operator 1 is depicted in Supplementary Fig. 2c,d. Confidence ellipse at a 95% confidence level is depicted in blue.

