DSpace at KIST: Text mining in MOF research: from manual curation to large language model-based automation

Full metadata record

DC Field	Value	Language
dc.contributor.author	Bae, Suyeon	-
dc.contributor.author	Jeon, Mingyu	-
dc.contributor.author	Moon, Hoi Ri	-
dc.date.accessioned	2025-07-29T05:00:11Z	-
dc.date.available	2025-07-29T05:00:11Z	-
dc.date.created	2025-07-28	-
dc.date.issued	2025-07	-
dc.identifier.issn	1359-7345	-
dc.identifier.uri	https://pubs.kist.re.kr/handle/201004/152862	-
dc.description.abstract	The rapid expansion of metal-organic framework (MOF) literature presents both a rich resource and a significant challenge for knowledge extraction. Text mining, which enables the conversion of unstructured scientific texts into structured, machine-readable data, has emerged as a key tool for accelerating data-driven research in the MOF domain. This review traces the development of text mining approaches in MOF research, from early manual curation and rule-based methods to recent breakthroughs powered by large language model (LLM)-based automation. We discuss the foundational role of natural language processing (NLP) and machine learning (ML) techniques such as named entity recognition and vector embedding models, followed by an in-depth analysis of LLM-based frameworks that enable flexible, scalable, and context-aware information extraction. Additionally, we introduce and compare their accuracy, and explore their diverse applications, including prediction of synthesizability, materials properties, and thermal stability. We conclude with a perspective on future directions for text mining in MOF research, including its integration into interactive graphical user interfaces, autonomous laboratories, multi-agent AI systems, and multi-modal LLM frameworks that can process textual, visual, and structural information in a unified way. This review aims to provide a foundational understanding for both experimental and computational researchers interested in adopting or advancing text mining methods in the MOF field.	-
dc.language	English	-
dc.publisher	Royal Society of Chemistry	-
dc.title	Text mining in MOF research: from manual curation to large language model-based automation	-
dc.type	Article	-
dc.identifier.doi	10.1039/d5cc02511g	-
dc.description.journalClass	1	-
dc.identifier.bibliographicCitation	Chemical Communications, v.61, no.60, pp.11083 - 11094	-
dc.citation.title	Chemical Communications	-
dc.citation.volume	61	-
dc.citation.number	60	-
dc.citation.startPage	11083	-
dc.citation.endPage	11094	-
dc.description.isOpenAccess	Y	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.identifier.wosid	001522367100001	-
dc.identifier.scopusid	2-s2.0-105009932119	-
dc.relation.journalWebOfScienceCategory	Chemistry, Multidisciplinary	-
dc.relation.journalResearchArea	Chemistry	-
dc.type.docType	Review	-
dc.subject.keywordPlus	ORGANIC FRAMEWORK SYNTHESIS	-

Appears in Collections: KIST Article > Others


KIST Library Institutional Repository
