Classification of Similarly Colored Medicinal Berries using Hyperspectral Images and Machine Learning Models

Authors
Kim, Min Chae; Yoon, Hyo In; Lee, Hyein; Park, So Jin; Yang, Jung Seok; Jung, Dae-Hyun; Park, Soo Hyun
Issue Date
2024-06
Publisher
Korean Society for Horticultural Science (한국원예학회)
Citation
Horticultural Science & Technology, v.42, no.3, pp.249-263
Abstract
As the misuse of medicinal plants increases due to misclassifications caused by similarities in the external characteristics (color, size, shape) of plants and their fruits, accurate identification techniques must be developed. Spectral information can be used to identify various characteristics of medicinal plants in wavelength ranges that cannot be seen by the naked eye. This study develops a non-destructive identification and classification technology for medicinal plants that combines hyperspectral imaging with machine learning models, with the aim of eliminating the misidentification of medicinal berries that are very similar in size, shape, and color. Four models were used to classify the plant species: logistic regression (LR), K-nearest neighbor (KNN), decision tree (DT), and random forest (RF). The optimal classification model was selected based on classification performance indicators. The dried fruits of four medicinal plant species were used: Cornus officinalis, Lycium chinense, Lycium barbarum, and Schisandra chinensis. Hyperspectral images of the samples were acquired in 150 wavelength bands in the 400-1000 nm range. For the training dataset, the average reflectance spectrum of each berry was extracted. The accuracy, F1 score, confusion matrix, and receiver operating characteristic (ROC) curve were used to evaluate the performance of each classification model. The LR model performed best, with an accuracy of 0.99 and an area under the curve (AUC) value of 1 for all samples. The LR model produces very accurate results, and the classification system based on it is fast and non-destructive. The machine-learning-based hyperspectral imaging classification system can be applied and scaled up to the industrial level, effectively eliminating the misuse of medicinal plants through their accurate identification.
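The workflow summarized in the abstract (average the reflectance spectrum of each berry, train a classifier, then score it with a confusion matrix, accuracy, and F1) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' code: the spectra are synthetic curves generated by a hypothetical `synthetic_berry` helper, the peak positions and noise level are invented, and only a plain KNN classifier (one of the four model types named above) is implemented, using the Python standard library. ROC/AUC analysis and the LR, DT, and RF models are not reproduced here.

```python
import math
import random
from collections import Counter

SPECIES = ["Cornus officinalis", "Lycium chinense",
           "Lycium barbarum", "Schisandra chinensis"]
N_BANDS = 150  # band count stated in the abstract (400-1000 nm range)


def mean_spectrum(pixels):
    """Average the reflectance of one berry's pixels, band by band."""
    n = len(pixels)
    return [sum(p[b] for p in pixels) / n for b in range(N_BANDS)]


def synthetic_berry(class_idx, rng, n_pixels=20):
    """Hypothetical stand-in for a segmented berry: noisy pixel spectra
    around a class-specific curve. Not real reflectance data."""
    centre = 30 + 30 * class_idx  # invented band index of a reflectance peak
    return [
        [0.3 + 0.4 * math.exp(-((b - centre) / 20.0) ** 2) + rng.gauss(0, 0.05)
         for b in range(N_BANDS)]
        for _ in range(n_pixels)
    ]


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training spectra closest to x (Euclidean)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def accuracy(cm):
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def macro_f1(cm):
    """Unweighted mean of per-class F1 = 2TP / (2TP + FP + FN)."""
    scores = []
    for i in range(len(cm)):
        tp = cm[i][i]
        fp = sum(cm[r][i] for r in range(len(cm))) - tp
        fn = sum(cm[i]) - tp
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def run_demo(seed=0, n_train=10, n_test=5):
    """Build synthetic per-berry mean spectra, classify the test berries
    with KNN, and return (confusion matrix, accuracy, macro F1)."""
    rng = random.Random(seed)
    train_X, train_y, test_X, test_y = [], [], [], []
    for c in range(len(SPECIES)):
        for i in range(n_train + n_test):
            spectrum = mean_spectrum(synthetic_berry(c, rng))
            if i < n_train:
                train_X.append(spectrum)
                train_y.append(c)
            else:
                test_X.append(spectrum)
                test_y.append(c)
    y_pred = [knn_predict(train_X, train_y, x) for x in test_X]
    cm = confusion_matrix(test_y, y_pred, len(SPECIES))
    return cm, accuracy(cm), macro_f1(cm)


if __name__ == "__main__":
    cm, acc, f1 = run_demo()
    for name, row in zip(SPECIES, cm):
        print(f"{name:22s} {row}")
    print(f"accuracy={acc:.2f}  macro-F1={f1:.2f}")
```

Because the synthetic classes are deliberately well separated, the scores printed by this sketch say nothing about the performance reported in the study; the example only shows how the per-berry averaging and the evaluation metrics fit together.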
Keywords
random forest; texture; classification model; evaluation metrics; logistic regression; medicinal plant; red fruits
ISSN
1226-8763
URI
https://pubs.kist.re.kr/handle/201004/150188
DOI
10.7235/HORT.20240022
Appears in Collections:
KIST Article > 2024