Text mining in MOF research: from manual curation to large language model-based automation
- Authors
- Bae, Suyeon; Jeon, Mingyu; Moon, Hoi Ri
- Issue Date
- 2025-07
- Publisher
- Royal Society of Chemistry
- Citation
- Chemical Communications, v.61, no.60, pp.11083 - 11094
- Abstract
The rapid expansion of metal-organic framework (MOF) literature presents both a rich resource and a significant challenge for knowledge extraction. Text mining, which enables the conversion of unstructured scientific texts into structured, machine-readable data, has emerged as a key tool for accelerating data-driven research in the MOF domain. This review traces the development of text mining approaches in MOF research, from early manual curation and rule-based methods to recent breakthroughs powered by large language model (LLM)-based automation. We discuss the foundational role of natural language processing (NLP) and machine learning (ML) techniques such as named entity recognition and vector embedding models, followed by an in-depth analysis of LLM-based frameworks that enable flexible, scalable, and context-aware information extraction. We also compare the accuracy of these frameworks and explore their diverse applications, including the prediction of synthesizability, materials properties, and thermal stability. We conclude with a perspective on future directions for text mining in MOF research, including its integration into interactive graphical user interfaces, autonomous laboratories, multi-agent AI systems, and multi-modal LLM frameworks that can process textual, visual, and structural information in a unified way. This review aims to provide a foundational understanding for both experimental and computational researchers interested in adopting or advancing text mining methods in the MOF field.
- Keywords
- ORGANIC FRAMEWORK SYNTHESIS
- ISSN
- 1359-7345
- URI
- https://pubs.kist.re.kr/handle/201004/152862
- DOI
- 10.1039/d5cc02511g