Publication

Machine learning-based spectral analyses for Camellia japonica cultivar identification

dc.contributor.author: Rodrigues, Pedro Miguel
dc.contributor.author: Sousa, Clara
dc.date.accessioned: 2025-02-06T11:24:41Z
dc.date.available: 2025-02-06T11:24:41Z
dc.date.issued: 2025-01
dc.description.abstract: Camellia japonica is a plant species with high cultural and biological relevance. Besides being used as an ornamental plant species, C. japonica has relevant biological properties. Due to hybridization, thousands of cultivars are known, and their accurate identification is mandatory. Infrared spectroscopy is currently recognized as an accurate and rapid technique for species and/or subspecies identification, including in plants. However, the choice of analysis tools (spectral pre-processing, feature selection, and chemometric models) strongly affects the accuracy of such identifications. This study tests the impact of two distinct machine learning-based approaches for discriminating C. japonica cultivars using near-infrared (NIR) and Fourier transform infrared (FTIR) spectroscopies. Leaf infrared spectra (NIR, obtained in a previous study; FTIR, obtained herein) of 15 different C. japonica cultivars (38 plants) were modeled and analyzed via two machine learning-based approaches (Approach 1 and Approach 2), each combining a feature selection method with a classifier. With Approach 1, NIR spectroscopy was the most effective technique for predicting C. japonica cultivars, achieving 81.3% correct cultivar assignments. Approach 2, however, obtained its best results with FTIR spectroscopy data, achieving 100.0% accuracy in cultivar assignments. Approach 2 also improved the results for NIR data, increasing the correct cultivar predictions by nearly 13%. The results obtained in this study highlight the importance of chemometric tools in analyzing infrared data. The choice of a specific data analysis approach significantly affects the accuracy of the technique. Moreover, the same approach can have varying impacts on different techniques. Therefore, it is not feasible to establish a universal data analysis approach, even for very similar datasets from comparable analytical techniques. [pt_PT]
dc.description.version: info:eu-repo/semantics/publishedVersion [pt_PT]
dc.identifier.doi: 10.3390/molecules30030546 [pt_PT]
dc.identifier.eid: 85217566108
dc.identifier.issn: 1420-3049
dc.identifier.pmid: 39942650
dc.identifier.uri: http://hdl.handle.net/10400.14/48081
dc.identifier.wos: 001418642800001
dc.language.iso: eng [pt_PT]
dc.peerreviewed: yes [pt_PT]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ [pt_PT]
dc.subject: Chemometrics [pt_PT]
dc.subject: Feature selection [pt_PT]
dc.subject: Machine learning [pt_PT]
dc.subject: Infrared spectroscopy [pt_PT]
dc.subject: Plant typing [pt_PT]
dc.title: Machine learning-based spectral analyses for Camellia japonica cultivar identification [pt_PT]
dc.type: journal article
dspace.entity.type: Publication
oaire.citation.issue: 3 [pt_PT]
oaire.citation.title: Molecules [pt_PT]
oaire.citation.volume: 30 [pt_PT]
rcaap.rights: openAccess [pt_PT]
rcaap.type: article [pt_PT]

Files

Original bundle
Name: 114382050.pdf
Size: 1.18 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 3.44 KB
Format: Item-specific license agreed upon to submission