Predictive modelling of colossal ATR-FTIR spectral data using PLS-DA: empirical differences between PLS1-DA and PLS2-DA algorithms
Article information
Loong Chuen Lee, Abdul Aziz Jemain
In response to our review paper [L. C. Lee et al., Analyst, 2018, 143, 3526–3539], we present a study that compares empirical differences between PLS1-DA and PLS2-DA algorithms in modelling a colossal ATR-FTIR spectral dataset. Over the past two decades, partial least squares-discriminant analysis (PLS-DA) has gained wide acceptance and popularity in applied research, partly due to its dimensionality reduction capability and its ability to handle multicollinear and correlated variables. To solve a K-class problem (K > 2) using PLS-DA and high-dimensional data such as infrared spectra, one can construct either K one-versus-all PLS1-DA models or a single PLS2-DA model. The aim of this work is to explore empirical differences between the two PLS-DA algorithms in modelling a colossal ATR-FTIR spectral dataset. The practical task is to build a prediction model using the imbalanced, high-dimensional, colossal and multi-class ATR-FTIR spectra of blue gel pen inks. Four sub-datasets were prepared from the principal dataset by considering the raw and asymmetric least squares (AsLS) preprocessed forms: (a) Raw-global region; (b) Raw-local region; (c) AsLS-global region; and (d) AsLS-local region. For each of the four sub-datasets, a series of 50 models was constructed by incrementally including the first 50 PLS components. Each model was evaluated using six different variants of v-fold cross-validation, autoprediction and external testing methods. As a result, each PLS-DA algorithm was represented by a number of figures of merit. The differences between the PLS1-DA and PLS2-DA algorithms were assessed using hypothesis tests with respect to model accuracy, stability and fitting. In addition, confusion matrices of the two PLS-DA algorithms were inspected carefully to assess model parsimony. Overall, both algorithms presented satisfactory model accuracy and stability. Nonetheless, PLS1-DA models showed significantly higher accuracy rates than PLS2-DA models, whereas PLS2-DA models appeared to be much more stable than PLS1-DA models. PLS2-DA also proved to be less prone to overfitting and more parsimonious than PLS1-DA. In conclusion, the relatively high accuracy of the PLS1-DA algorithm is achieved at the cost of rather low parsimony and stability, and with an increased risk of overfitting.
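The contrast between K one-versus-all PLS1-DA models and a single PLS2-DA model can be illustrated with a small sketch. This is not the authors' implementation: it assumes NumPy, a basic NIPALS-style PLS regression, an argmax decision rule on the predicted responses, and synthetic imbalanced three-class data standing in for the spectra. All function and variable names (`pls_fit`, `pls1_da_predict`, `pls2_da_predict`, `simulate`) are hypothetical.

```python
import numpy as np

def pls_fit(X, Y, n_components):
    """NIPALS-style PLS regression (sketch).
    Returns (x_mean, y_mean, B) with Y_hat = (X - x_mean) @ B + y_mean."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if Y.ndim == 1:                      # PLS1: a single response column
        Y = Y[:, None]
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    E, F = X - x_mean, Y - y_mean        # centered (deflated) blocks
    W, P, Q = [], [], []
    for _ in range(n_components):
        # X weight vector: dominant left singular vector of E^T F
        u, _, _ = np.linalg.svd(E.T @ F, full_matrices=False)
        w = u[:, 0]
        t = E @ w                        # X scores
        tt = t @ t
        p = E.T @ t / tt                 # X loadings
        q = F.T @ t / tt                 # Y loadings
        E = E - np.outer(t, p)           # deflate both blocks
        F = F - np.outer(t, q)
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q).T
    B = W @ np.linalg.solve(P.T @ W, Q.T)   # regression coefficients
    return x_mean, y_mean, B

def pls2_da_predict(X_train, y_train, X_test, n_components):
    """One PLS2 model fitted to the full one-hot class matrix."""
    classes = np.unique(y_train)
    Y = (y_train[:, None] == classes[None, :]).astype(float)
    xm, ym, B = pls_fit(X_train, Y, n_components)
    scores = (X_test - xm) @ B + ym
    return classes[np.argmax(scores, axis=1)]   # assumed decision rule

def pls1_da_predict(X_train, y_train, X_test, n_components):
    """K one-versus-all PLS1 models, one binary response per class."""
    classes = np.unique(y_train)
    scores = []
    for c in classes:
        xm, ym, B = pls_fit(X_train, (y_train == c).astype(float), n_components)
        scores.append(((X_test - xm) @ B + ym)[:, 0])
    return classes[np.argmax(np.column_stack(scores), axis=1)]

# Synthetic, imbalanced three-class data (an assumption, not the ink spectra)
rng = np.random.default_rng(0)
n_features, sizes = 40, (30, 20, 10)
centers = rng.normal(size=(len(sizes), n_features))

def simulate(sizes):
    y = np.repeat(np.arange(len(sizes)), sizes)
    X = centers[y] + 0.1 * rng.normal(size=(y.size, n_features))
    return X, y

X_train, y_train = simulate(sizes)
X_test, y_test = simulate(sizes)

pred1 = pls1_da_predict(X_train, y_train, X_test, n_components=3)
pred2 = pls2_da_predict(X_train, y_train, X_test, n_components=3)
acc1 = np.mean(pred1 == y_test)
acc2 = np.mean(pred2 == y_test)
print(f"PLS1-DA accuracy: {acc1:.2f}, PLS2-DA accuracy: {acc2:.2f}")
```

In this sketch the only structural difference is the response block: PLS2-DA extracts one shared set of components for all classes, while PLS1-DA extracts a separate set for each class, so it fits K times as many components in total. Repeating the fit for `n_components` from 1 to 50 under cross-validation would mirror the model series described above. The toy data are far easier to separate than real spectra, so the accuracies printed here say nothing about the study's findings.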
Source journal
Analyst

Analyst publishes analytical and bioanalytical research that reports premier fundamental discoveries and inventions, and the applications of those discoveries, unconfined by traditional discipline barriers.