Building an end-to-end battery recipe knowledge base via transformer-based text mining

Authors
Lee, Daeun; Mizuseki, Hiroshi; Choi, Jaewoong; Lee, Byungju
Issue Date
2025-05
Publisher
Springer Nature
Citation
Communications Materials, v.6, no.1
Abstract
Recent studies have increasingly applied natural language processing to automatically extract experimental information from battery materials literature. Despite the complexity of battery manufacturing—from material synthesis to cell assembly—no comprehensive study has systematically organized this information. Here we present a language modeling-based protocol for extracting complete battery recipes from scientific papers. Using machine learning-based filtering and topic modeling, we identified 2174 relevant papers and extracted over 5800 paragraphs describing synthesis and assembly procedures. Deep learning-based named entity recognition models were trained to extract 30 entities with F1-scores of 88.18% and 94.61%. We also evaluated large language models, including GPT-4, using few-shot learning and fine-tuning. These results enabled the structured construction of 165 end-to-end recipes and identification of trends such as precursor–method associations. The resulting knowledge base supports flexible recipe retrieval and provides a scalable framework for organizing protocols across large volumes of publications, thereby accelerating literature review and data-driven battery design.
ISSN
2662-4443
URI
https://pubs.kist.re.kr/handle/201004/153513
DOI
10.1038/s43246-025-00825-z
Appears in Collections:
KIST Article > 2025