Building an end-to-end battery recipe knowledge base via transformer-based text mining
- Authors
- Lee, Daeun; Mizuseki, Hiroshi; Choi, Jaewoong; Lee, Byungju
- Issue Date
- 2025-05
- Publisher
- Springer Nature
- Citation
- Communications Materials, v.6, no.1
- Abstract
- Recent studies have increasingly applied natural language processing to automatically extract experimental information from battery materials literature. Despite the complexity of battery manufacturing—from material synthesis to cell assembly—no comprehensive study has systematically organized this information. Here we present a language modeling-based protocol for extracting complete battery recipes from scientific papers. Using machine learning-based filtering and topic modeling, we identified 2174 relevant papers and extracted over 5800 paragraphs describing synthesis and assembly procedures. Deep learning-based named entity recognition models were trained to extract 30 entities with F1-scores of 88.18% and 94.61%. We also evaluated large language models, including GPT-4, using few-shot learning and fine-tuning. These results enabled the structured construction of 165 end-to-end recipes and identification of trends such as precursor–method associations. The resulting knowledge base supports flexible recipe retrieval and provides a scalable framework for organizing protocols across large volumes of publications, thereby accelerating literature review and data-driven battery design.
- ISSN
- 2662-4443
- URI
- https://pubs.kist.re.kr/handle/201004/153513
- DOI
- 10.1038/s43246-025-00825-z
- Appears in Collections:
- KIST Article > 2025
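The abstract reports named entity recognition quality as F1-scores. As a minimal sketch of how an entity-level F1 can be computed from BIO-tagged sequences, the Python below may help readers interpret those figures. It assumes exact-match scoring of (label, start, end) spans, which is a common convention but not necessarily the evaluation scheme used in the paper. The label names (`PRECURSOR`, `TEMPERATURE`, `TIME`) and the toy sequences are hypothetical and are not taken from the authors' annotation schema or data.

```python
from typing import List, Set, Tuple

# (label, start index, end index exclusive)
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Convert a BIO tag sequence into a set of entity spans."""
    spans: Set[Span] = set()
    label, start = None, 0
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if label is not None and not continues:
            spans.add((label, start, i))
            label = None
        if tag.startswith("B-"):
            label, start = tag[2:], i
        # A stray "I-" tag with no open span of that label is ignored.
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Exact-match entity-level precision, recall, and F1."""
    gold_spans, pred_spans = bio_to_spans(gold), bio_to_spans(pred)
    true_pos = len(gold_spans & pred_spans)
    precision = true_pos / len(pred_spans) if pred_spans else 0.0
    recall = true_pos / len(gold_spans) if gold_spans else 0.0
    f1 = 2 * precision * recall / (precision + recall) if true_pos else 0.0
    return precision, recall, f1


# Hypothetical labels and toy data, for illustration only.
gold_tags = ["B-PRECURSOR", "I-PRECURSOR", "O",
             "B-TEMPERATURE", "I-TEMPERATURE", "O", "B-TIME"]
pred_tags = ["B-PRECURSOR", "I-PRECURSOR", "O",
             "B-TEMPERATURE", "O", "O", "B-TIME"]

precision, recall, f1 = entity_f1(gold_tags, pred_tags)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In this toy case the predicted temperature span is one token too short, so it counts as both a false positive and a false negative under exact matching: two of three predicted spans and two of three gold spans agree.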