Classification of Similarly Colored Medicinal Berries using Hyperspectral Images and Machine Learning Models
- Authors
- Kim, Min Chae; Yoon, Hyo In; Lee, Hyein; Park, So Jin; Yang, Jung Seok; Jung, Dae-Hyun; Park, Soo Hyun
- Issue Date
- 2024-06
- Publisher
- Korean Society for Horticultural Science (한국원예학회)
- Citation
- Horticultural Science & Technology, v.42, no.3, pp.249-263
- Abstract
- As the misuse of medicinal plants increases due to misclassifications brought about by similarities in the external characteristics (color, size, shape) of plants and their fruits, accurate identification techniques must be developed. Spectral information can be used to identify characteristics of medicinal plants in wavelength ranges that cannot be seen by the naked eye. This study develops a non-destructive identification and classification technology for medicinal plants that combines hyperspectral imaging with machine learning models, with the aim of eliminating the misidentification of medicinal berries that are very similar in size, shape, and color. Four models were used to classify the plant species: logistic regression (LR), K-nearest neighbor (KNN), decision tree (DT), and random forest (RF). The optimal classification model was selected based on classification performance indicators. Dried fruits of four medicinal plant species were used: Cornus officinalis, Lycium chinense, Lycium barbarum, and Schisandra chinensis. Hyperspectral images of the samples were acquired in 150 wavelength bands in the 400-1000 nm range. For the training dataset, the average reflectance spectrum per berry was extracted. Accuracy, F1 score, the confusion matrix, and the receiver operating characteristic (ROC) curve were used to evaluate the performance of each classification model. The LR model performed best, with an accuracy of 0.99 and an area under the curve (AUC) value of 1 for all samples. The LR model produces very accurate results, and the classification system based on it is fast and non-destructive. The machine-learning-based hyperspectral imaging classification system can be applied and scaled up to the industrial level, effectively eliminating the misuse of medicinal plants through accurate identification of these plants.
- Keywords
- random forest; texture; classification model; evaluation metrics; logistic regression; medicinal plant; red fruits
- ISSN
- 1226-8763
- URI
- https://pubs.kist.re.kr/handle/201004/150188
- DOI
- 10.7235/HORT.20240022
- Appears in Collections:
- KIST Article > 2024
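The workflow summarized in the abstract (one average reflectance spectrum per berry, four classifiers, then accuracy, F1 score, confusion matrix, and ROC AUC) can be sketched roughly as follows. This is a minimal illustration using NumPy and scikit-learn, not the authors' implementation: only the 150-band count, the 400-1000 nm range, the four species, and the four model types come from the abstract. The synthetic spectra, the pixel masks, the train/test split, the feature scaling, all hyperparameters, and the helper names `mean_spectrum`, `synthetic_berry`, `build_dataset`, and `evaluate_models` are assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

N_BANDS = 150  # band count from the abstract; everything else below is assumed
SPECIES = ["Cornus officinalis", "Lycium chinense", "Lycium barbarum", "Schisandra chinensis"]


def mean_spectrum(cube, mask):
    """Average reflectance over one berry's pixels.

    cube: (rows, cols, bands) reflectance array; mask: (rows, cols) boolean array.
    """
    return cube[mask].mean(axis=0)


def synthetic_berry(rng, species_idx):
    """Hypothetical stand-in for one segmented berry: a noisy cube plus a pixel mask."""
    wavelengths = np.linspace(400, 1000, N_BANDS)
    # Placeholder spectral shape: species differ only by a shifted sigmoid edge.
    base = 1.0 / (1.0 + np.exp(-(wavelengths - (600 + 25 * species_idx)) / 40.0))
    cube = base + rng.normal(0.0, 0.05, size=(12, 12, N_BANDS))
    mask = np.zeros((12, 12), dtype=bool)
    mask[2:10, 2:10] = True  # assumed berry region
    return cube, mask


def build_dataset(n_per_species=40, seed=0):
    """Build one mean spectrum per synthetic berry, labeled by species index."""
    rng = np.random.default_rng(seed)
    X, y = [], []
    for idx in range(len(SPECIES)):
        for _ in range(n_per_species):
            cube, mask = synthetic_berry(rng, idx)
            X.append(mean_spectrum(cube, mask))
            y.append(idx)
    return np.asarray(X), np.asarray(y)


def evaluate_models(X, y, seed=0):
    """Fit LR, KNN, DT, and RF on a stratified split and collect the metrics."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    models = {
        "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
        "DT": DecisionTreeClassifier(random_state=seed),
        "RF": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        pred = model.predict(X_te)
        proba = model.predict_proba(X_te)
        results[name] = {
            "accuracy": accuracy_score(y_te, pred),
            "f1_macro": f1_score(y_te, pred, average="macro"),
            # One-vs-rest AUC, macro-averaged over the four species.
            "auc_ovr": roc_auc_score(y_te, proba, multi_class="ovr"),
            "confusion": confusion_matrix(y_te, pred),
        }
    return results


if __name__ == "__main__":
    X, y = build_dataset()
    for name, metrics in evaluate_models(X, y).items():
        print(name, round(metrics["accuracy"], 3), round(metrics["auc_ovr"], 3))
```

Scores obtained on this synthetic data say nothing about the figures reported in the abstract. With real data, the cube and mask would come from the hyperspectral camera output and a berry segmentation step, neither of which is described in this record.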