GPT4Battery: An LLM-Driven Framework for Adaptive State of Health Estimation of Raw Li-Ion Batteries

Abstract
[…] cross-manufacturer, and cross-capacity. Hence, this paper utilizes the strong generalization capability of large language models (LLMs) to propose a novel framework for adaptable SOH estimation across diverse batteries. To match the real scenario in which unlabeled data arrives sequentially during use, with distribution shifts, the proposed model is modified by a test-time training technique to ensure estimation accuracy even at the battery's end of life. The validation results demonstrate that the proposed framework achieves state-of-the-art accuracy on four widely recognized datasets collected from 62 batteries. Furthermore, we analyze the theoretical challenges of cross-battery estimation and provide a quantitative explanation of the effectiveness […]

1 Introduction
[…] learning-based methods for SOH estimation have emerged recently [Shu et al., 2021; Tian et al., 2021]. For instance, [Tan and Zhao, 2020] developed an LSTM-FC network with a fine-tuning strategy to predict SOH using only the first 25% of the target LIB dataset. Besides, [Lu et al., 2023] integrated a swarm of DNNs with domain adaptation to enable SOH estimation without target LIB labels. While these approaches alleviate the data collection difficulties to some extent, they still cannot follow the pace of battery upgrading and pose a barrier to developing battery technologies. Another issue is that existing transfer learning methods are effective on well-collected datasets for fine-tuning or feature alignment, but they fail to match the real-world scenario where a raw battery continues to age, bringing unlabeled data incrementally over months. Hence, developing a large model that can handle diverse types […]
• We fully utilize the strong generalization capability of the LLM to establish a framework for cross-battery SOH estimation, lightening the burden of months-to-years degradation experiments for data collection.

• We employ a test-time training strategy to match the real-world scenario where a raw battery continues to age, bringing unlabeled data incrementally but often involving temporal distribution shifts. This strategy ensures estimation accuracy even at the battery's end of life.

• The validation results demonstrate that the proposed framework achieves state-of-the-art zero-shot accuracy on four widely recognized LIB datasets; on two of them it is even comparable to the latest domain adaptation methods.

2 Related Work

2.1 Data-driven battery SOH estimation
Data-driven approaches for battery SOH estimation display greater benefits in accuracy and online computation efficiency […] [Roman et al., 2021] proposed a machine learning pipeline for battery state of health estimation involving four algorithms: Bayesian Ridge Regression (BRR), Gaussian process regression (GPR), Random Forest (RF), and a deep ensemble of neural networks (dNNe). While accurate health state estimation methods have progressed significantly, the time- and resource-consuming degradation experiments needed to generate lifelong training data make target-battery-agnostic approaches attractive [Ye and Yu, 2021; Han et al., 2022]. For instance, [Tan and Zhao, 2020] developed an LSTM-FC network to predict SOH by fine-tuning on only the first 25% of the target dataset. Similarly, [Wang et al., 2023] retrained an LSTM using only two target battery cells during the transfer learning process. Besides, [Lu et al., 2023] integrated a swarm of deep neural networks with domain adaptation to enable SOH estimation in the absence of target battery labels, which inspired our work.

2.2 Capability of LLM
In-modality generalization capability. Extensive research has verified the in-modality generalization capability of large language models. Bert [Devlin et al., 2018] used transformer encoders and employed a masked language modeling task to recover randomly masked tokens within a text. Furthermore, OpenAI proposed GPT [Radford et al., 2018] and […] [Kao and Lee, 2021] demonstrated the cross-modality power of Bert in transferring from natural language to amino acids, DNA, and music. Additionally, [Lu et al., 2022] investigated the capability of a transformer pre-trained on natural language to generalize to other modalities with minimal fine-tuning. Their so-called Frozen Pretrained Transformer (FPT) outperforms the same architecture initialized randomly, improving performance and compute efficiency on cross-modality downstream tasks spanning numerical computation, vision, and protein fold prediction. Furthermore, [Zhou et al., 2023] leveraged a modified GPT2 for general time series analysis and achieved results comparable to the SOTA. [Pang et al., 2023] reveal that LLMs trained solely on textual data are surprisingly strong encoders for purely visual tasks.

Inter-modality migration of generalization capability. Relatedly, exploring whether the powerful generalization capability of LLMs can be migrated to another modality is an ongoing research direction. Note that [Zhou et al., 2023] discovered experimentally the superior few-shot/zero-shot capability of LLMs on general time series analysis, despite a slight lack of explainability.

2.3 Test-time training
Test-time training (TTT) aims at improving the performance of predictive models when there are distribution shifts between training and testing data [Sun et al., 2020]. A typical TTT model has a supervised main-task head for labeled training data and a self-supervised head for unlabeled testing data. When testing data arrives in an online stream, the online version of TTT also updates incrementally via self-supervision. This method operates well when the self-supervised task propagates gradients that correlate with those of the main task, demonstrating improved generalization under distribution shifts on many visual benchmarks [Gandelsman et al., 2022].
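To make the two-headed TTT design concrete, here is a minimal PyTorch sketch (our illustration, not the exact architecture of [Sun et al., 2020] or of GPT4Battery; all layer sizes and the reconstruction objective are assumptions):

```python
# Minimal test-time training (TTT) sketch: a shared encoder, a supervised
# main head, and a self-supervised head used to adapt on unlabeled test data.
import torch
import torch.nn as nn

class TTTModel(nn.Module):
    def __init__(self, in_dim=128, hidden=256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.main_head = nn.Linear(hidden, 1)      # supervised task (e.g., regression)
        self.ssl_head = nn.Linear(hidden, in_dim)  # self-supervised reconstruction

    def forward(self, x):
        z = self.encoder(x)
        return self.main_head(z), self.ssl_head(z)

def online_ttt_step(model, x, optimizer, ssl_loss=nn.MSELoss()):
    """One online TTT update: adapt on the unlabeled input via the
    self-supervised head, then predict with the adapted weights."""
    model.train()
    optimizer.zero_grad()
    _, recon = model(x)
    loss = ssl_loss(recon, x)   # e.g., reconstruct the (masked) input
    loss.backward()
    optimizer.step()            # incremental update; weights persist online
    model.eval()
    with torch.no_grad():
        pred, _ = model(x)
    return pred
```

In the online variant, the optimizer state and the adapted weights persist from one test sample to the next, which is what lets the model track slowly drifting distributions.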
3 Methodology
In this section, we introduce our proposed framework, GPT4Battery, which leverages the LLM's generalization capability for cross-battery state of health (SOH) estimation, together with a test-time training strategy that matches a more practical setting, as shown in Figure 1.

3.1 Problem Formulation
Using the battery's raw voltage-time charging curve is a popular and easy-to-implement approach to accurately estimate multiple states over the battery's life [Tian et al., 2022; Roman et al., 2021]. Determining SOH can be considered a […]
Figure 1: Proposed GPT4Battery architecture (a pre-trained LLM backbone with a battery-specific adaptor, a regression head for SOH, and an SSL head for charging-curve reconstruction). We highlight how the framework leverages the LLM's generalization capability for cross-battery SOH estimation and the scenario-fit setting of the test-time learning strategy.
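Reading Figure 1 together with the GPT2-based approach of [Zhou et al., 2023], one plausible realization of the architecture is a frozen pre-trained backbone with a lightweight battery-specific adaptor and two heads. The sketch below is a hedged reconstruction: the backbone choice (`gpt2` from HuggingFace Transformers), layer sizes, and adaptor placement are all our assumptions, not the paper's specification.

```python
# Hedged sketch of the Figure 1 components: frozen GPT2 backbone,
# battery-specific adaptor, SOH regression head, SSL reconstruction head.
import torch
import torch.nn as nn
from transformers import GPT2Model

class GPT4Battery(nn.Module):
    def __init__(self, d_model=768):
        super().__init__()
        self.backbone = GPT2Model.from_pretrained("gpt2")
        for p in self.backbone.parameters():   # keep the LLM weights frozen
            p.requires_grad = False
        self.embed = nn.Linear(1, d_model)     # lift scalar voltages to tokens
        self.adaptor = nn.Sequential(          # battery-specific adaptor
            nn.Linear(d_model, 64), nn.ReLU(), nn.Linear(64, d_model))
        self.reg_head = nn.Linear(d_model, 1)  # SOH regression
        self.ssl_head = nn.Linear(d_model, 1)  # charging-curve reconstruction

    def forward(self, curve):                  # curve: (batch, curve_len)
        tok = self.embed(curve.unsqueeze(-1))  # (batch, len, d_model)
        h = self.backbone(inputs_embeds=tok).last_hidden_state
        h = h + self.adaptor(h)                # residual adaptation
        soh = self.reg_head(h[:, -1])          # predict from the last token
        recon = self.ssl_head(h).squeeze(-1)   # reconstruct the input curve
        return soh, recon
```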
Table 1: Summary of the five LIB datasets used in this work.

Dataset      Electrode material (cathode)  Nominal capacity  Voltage range  Samples  Collector
CALCE        LCO                           1.1 Ah            2.7-4.2 V      2807     University of Maryland
SANYO        NMC                           1.85 Ah           3.0-4.1 V      415      RWTH Aachen University
KOKAM        LCO/NCO                       0.74 Ah           2.7-4.2 V      503      University of Oxford
PANASONIC    NCA                           3.03 Ah           2.5-4.29 V     2770     Beijing Institute of Technology
GOTION       LFP                           27 Ah             2.0-3.65 V     4262     Beijing Institute of Technology
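Since the five datasets span different voltage windows, some per-dataset normalization is needed before a single model can ingest them all. A simple min-max scaling using the Table 1 ranges is sketched below; whether GPT4Battery uses exactly this scheme (it adopts [Lu et al., 2023] for pre-processing) is an assumption.

```python
# Per-dataset min-max scaling of charging-curve voltages, using the
# voltage windows listed in Table 1.
import numpy as np

VOLTAGE_RANGE = {
    "CALCE": (2.7, 4.2), "SANYO": (3.0, 4.1), "KOKAM": (2.7, 4.2),
    "PANASONIC": (2.5, 4.29), "GOTION": (2.0, 3.65),
}

def normalize_voltage(v: np.ndarray, dataset: str) -> np.ndarray:
    lo, hi = VOLTAGE_RANGE[dataset]
    return (v - lo) / (hi - lo)   # maps each curve into [0, 1]
```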
[…] pre-trained model's performance on downstream tasks with minimal effort, PEFT methods, such as LoRA [Hu et al., 2021], VPT [Jia et al., 2022], and Prefix Tuning [Li and Liang, 2021], have been widely employed in CV […]

In a real-world scenario, a raw battery undergoes continuous aging with incrementally acquired unlabeled data, sampled over months. This renders existing transfer-learning-based methods unadaptable. In this section, we introduce the […]
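As a concrete instance of the PEFT family cited above, a minimal LoRA-style linear layer in the spirit of [Hu et al., 2021] looks as follows; this is a generic illustration of the technique, not GPT4Battery's tuning recipe, and the rank `r` and scaling `alpha` are illustrative defaults.

```python
# LoRA-style layer: the frozen base weight W is augmented with a
# trainable low-rank update B @ A, so only O(r * (in + out)) parameters train.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_f, out_f, r=4, alpha=8):
        super().__init__()
        self.base = nn.Linear(in_f, out_f)
        for p in self.base.parameters():       # freeze the pre-trained weight
            p.requires_grad = False
        self.A = nn.Parameter(torch.randn(r, in_f) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_f, r))  # zero-init: no-op at start
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```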
Figure 3: Visual results of adaptation to four different kinds of LIBs under the zero-shot setting using GPT4Battery, pre-trained on the GOTION dataset.
Table 2: Comparison of methods under the zero-shot setting. We calculate the MAE (as %) for each dataset and the average over the four datasets, as well as the inference time (ms) for three benchmarks. A lower MAE indicates better performance. Red: best, Black: second best. Note that swarm is tested under the domain adaptation setting, where target data is accessible.
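For reference, the metric behind Tables 2 and 3 is the mean absolute error expressed as a percentage of SOH; a minimal NumPy version:

```python
# MAE between predicted and true SOH, reported in percent.
import numpy as np

def mae_percent(soh_pred: np.ndarray, soh_true: np.ndarray) -> float:
    return float(np.mean(np.abs(soh_pred - soh_true)) * 100.0)
```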
4 Experiments
This section evaluates the proposed GPT4Battery on five publicly recognized datasets collected from 65 battery cells to demonstrate our framework's effectiveness on this challenging problem.

4.1 Experiment Settings
Datasets. The experiments employ datasets manufactured by CALCE [He et al., 2011], SANYO [Li et al., 2021], KOKAM [Birkl, 2017], PANASONIC, and GOTION HIGH-TECH [Lu et al., 2023]. These datasets cover widely used cathode active materials, capacities ranging from 0.74 Ah to 27 Ah, and five different manufacturers, emphasizing our method's potential to work directly on new-generation batteries under a zero-shot setting. This paper adopts [Lu et al., 2023] for data pre-processing. The differences between […]

Baselines. […]
• Four popular data-driven SOH estimation methods: Gaussian process regression (GPR), support vector regression (SVR), Random Forest (RF) [Roman et al., 2021], and a CNN [Tian et al., 2022].
• The latest domain-adaptation-based transfer learning method, which estimates SOH without labels from the target LIBs [Lu et al., 2023].
• We created Benchmark1 by substituting the PLM with a regular CNN architecture and Benchmark2 by disabling the test-time training technique.

4.2 Results

Evaluation under zero-shot setting
Under the zero-shot setting, we employed GOTION [Lu et al., 2023] as the pre-training dataset for joint training because it contains relatively more samples. After obtaining a foundation model, test-time training is employed to adapt this model to four different types of batteries. Fig. 3 illustrates the estimation accuracy visually in the absence of any data from the four LIBs, except for the 1st cycle, whose SOH label is taken to be 1.

Table 2 reports the comparative results: mean absolute error (MAE) for the baselines and inference time for three benchmarks. Without target training data, the existing methods fail to provide reliable estimates, with MAEs over 10% except for SVR. In contrast, the proposed GPT4Battery framework achieves accurate SOH estimation with an average MAE of 2.17% across the four datasets. On SANYO and PANASONIC, our method even outperforms the latest domain adaptation method, in which the target LIB features are accessible, by MAEs of 0.34% and 0.28%, respectively. The expense of a little inference time (within 1 second per cycle) is certainly acceptable considering that one charging cycle can take hours. The least desirable performance is observed on KOKAM, mainly attributable to the high linearity of that specific dataset.

Evaluation under regular setting
Under the regular setting, we disabled TTT and compared this LLM-driven model (Benchmark2) to the existing methods. LIBs within one dataset are divided into training and testing sets to fit this traditional machine learning setting. Unsurprisingly, our model shows no superior performance over the baseline methods, as a model with strong generalization capability may not converge as effectively on one specific dataset.

Method        MAE (%) per dataset
GPR           6.42   5.84   8.06   7.63   6.39
RF            0.48   0.13   0.11   0.52   0.29
SVR           4.23   5.87   6.36   5.50   4.66
CNN           0.72   1.90   2.65   0.72   0.28
Benchmark2    2.22   0.61   1.13   1.40   0.25

Table 3: Comparison of methods under the regular setting. We calculate the MAE (as %) for each dataset. Black: best.

4.3 Analysis
The challenge for zero-shot estimation originates from the spatial domain shifts across batteries with different chemistries, capacities, or manufacturers. Meanwhile, since the degradation process often spans months to years, variations in the mechanisms within the battery (such as side reactions and the stability of the solid electrolyte interphase) [Broussely et al., 2005] are more likely to induce temporal distribution shifts, making a fixed model fail in the mid-to-late life cycle. The excellent performance of our framework can be attributed to the powerful generalization capability migrated from the LLM and the test-time learning strategy, both of which work effectively against this out-of-distribution problem.

Figure 4: Quantitative comparison of GPT4Battery (left) and Benchmark2 (right, with TTT disabled) on CALCE under the zero-shot setting. While both methods estimate accurately early in the life cycle, GPT4Battery with TTT effectively reduces the errors accumulated due to temporal distribution shifts, ensuring accuracy in the mid-to-late life cycle.
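One way to formalize the two kinds of shifts discussed above (our notation, not the paper's) is:

```latex
% Spatial (cross-battery) shift: source and target batteries induce
% different joint distributions over charging curves x and SOH labels y.
\[
  P_{\mathrm{src}}(x,y) \neq P_{\mathrm{tgt}}(x,y)
\]
% Temporal shift: within a single battery, the distribution drifts with
% the cycle index t, so a model fit around t_0 degrades for t >> t_0.
\[
  P_{t}(x,y) \neq P_{t_0}(x,y) \quad \text{for } t \gg t_0
\]
```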
Here, we provide a qualitative explanation of a crucial strategy in our proposed framework, i.e., test-time training. We present two estimation and error analysis maps (Figure 4) to visualize the contribution of TTT to SOH estimation. Compared with Benchmark2, where TTT is disabled, GPT4Battery achieves a significantly lower Mean Absolute Error (MAE) throughout the life cycle. This stands in contrast to the accumulating estimation error observed in Benchmark2, confirming the assumption of temporal distribution shifts.

Theoretically, an intuitive explanation for TTT is that the self-supervised task happens to propagate gradients that correlate with those of the main task [Sun et al., 2020]. In our framework, we believe that the charging-curve reconstruction task finds a better bias-variance trade-off under the temporally accumulated distribution shifts. The fixed model is biased because it is based entirely on biased training data, and the generalization ability migrated from language models is not powerful enough to handle this non-trivial change. The other extreme is to discard the pre-trained knowledge completely and train a new model from scratch on each test input. However, this is undesirable because of the high variance of each input, and, according to our trial and error, the reconstruction task does not always contribute to the main regression task.
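Under this bias-variance reading, the online TTT step can be written compactly. With encoder f, reconstruction head g, regression head h, and learning rate η (our notation; the exact loss and schedule are assumptions), each unlabeled curve x_t triggers one self-supervised update before prediction:

```latex
\[
  \theta_t \;=\; \theta_{t-1} \;-\; \eta\,
    \nabla_{\theta}\,\ell_{\mathrm{rec}}\!\bigl(g(f(x_t;\theta)),\,x_t\bigr)
    \Big|_{\theta=\theta_{t-1}},
  \qquad
  \hat{y}_t \;=\; h\bigl(f(x_t;\theta_t)\bigr)
\]
```

Setting η = 0 recovers the fixed (biased) model, while re-initializing θ for every input gives the high-variance extreme; online TTT sits between the two.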
5 Conclusion
We proposed an LLM-driven framework equipped with test-time learning for cross-battery state of health (SOH) estimation, addressing the real-world scenario where raw battery data arrives sequentially without labels. Theoretically, we analyzed how the challenges of cross-battery tasks arise from the spatial distribution shifts between diverse LIBs and the temporal distribution shifts during the months-to-years aging process, emphasizing the necessity of LLM-driven and test-time learning. To our knowledge, we are the first to explore the inter-modal migratability of the large language model's generalization capability and to deploy it successfully. Practically, our proposed GPT4Battery achieved state-of-the-art results in zero-shot SOH estimation across various LIBs.
References

[Birkl, 2017] Christoph Birkl. Oxford battery degradation dataset 1. 2017.

[Broussely et al., 2005] Michel Broussely, Ph. Biensan, F. Bonhomme, Ph. Blanchard, S. Herreyre, K. Nechev, and R.J. Staniewicz. Main aging mechanisms in Li ion batteries. Journal of Power Sources, 146(1-2):90–96, 2005.

[Brown et al., 2020] Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D. Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are few-shot learners. Advances in Neural Information Processing Systems, 33:1877–1901, 2020.

[Caruana, 1997] Rich Caruana. Multitask learning. Machine Learning, 28:41–75, 1997.

[Devlin et al., 2018] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.

[Fleischmann et al., 2023] Jakob Fleischmann, Mikael Hanicke, Evan Horetsky, Dina Ibrahim, Sören Jautelat, Martin Linder, Patrick Schaufuss, Lukas Torscht, and Alexandre van de Rijt. Battery 2030: Resilient, sustainable, and circular. January 16, 2023.

[Gandelsman et al., 2022] Yossi Gandelsman, Yu Sun, Xinlei Chen, and Alexei Efros. Test-time training with masked autoencoders. Advances in Neural Information Processing Systems, 35:29374–29385, 2022.

[Genikomsakis et al., 2021] Konstantinos N. Genikomsakis, Nikolaos-Fivos Galatoulas, and Christos S. Ioakimidis. Towards the development of a hotel-based e-bike rental service: Results from a stated preference survey and techno-economic analysis. Energy, 215:119052, 2021.

[Han et al., 2022] Te Han, Zhe Wang, and Huixing Meng. End-to-end capacity estimation of lithium-ion batteries with an enhanced long short-term memory network considering domain adaptation. Journal of Power Sources, 520:230823, 2022.

[He et al., 2011] Wei He, Nicholas Williard, Michael Osterman, and Michael Pecht. Prognostics of lithium-ion batteries based on Dempster–Shafer theory and the Bayesian Monte Carlo method. Journal of Power Sources, 196(23):10314–10321, 2011.

[Hu et al., 2021] Edward J. Hu, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen, et al. LoRA: Low-rank adaptation of large language models. In International Conference on Learning Representations, 2021.

[Jia et al., 2022] Menglin Jia, Luming Tang, Bor-Chun Chen, Claire Cardie, Serge Belongie, Bharath Hariharan, and Ser-Nam Lim. Visual prompt tuning. In European Conference on Computer Vision, pages 709–727. Springer, 2022.

[Kao and Lee, 2021] Wei-Tsung Kao and Hung-yi Lee. Is Bert a cross-disciplinary knowledge learner? A surprising finding of pre-trained models' transferability. arXiv preprint arXiv:2103.07162, 2021.

[Li and Liang, 2021] Xiang Lisa Li and Percy Liang. Prefix-tuning: Optimizing continuous prompts for generation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 4582–4597, 2021.

[Li et al., 2021] Weihan Li, Neil Sengupta, Philipp Dechent, David Howey, Anuradha Annaswamy, and Dirk Uwe Sauer. One-shot battery degradation trajectory prediction with deep learning. Journal of Power Sources, 506:230024, 2021.

[Lu et al., 2022] Kevin Lu, Aditya Grover, Pieter Abbeel, and Igor Mordatch. Frozen pretrained transformers as universal computation engines. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, pages 7628–7636, 2022.

[Lu et al., 2023] Jiahuan Lu, Rui Xiong, Jinpeng Tian, Chenxu Wang, and Fengchun Sun. Deep learning to estimate lithium-ion battery state of health without additional degradation experiments. Nature Communications, 14(1):2760, 2023.

[Ng et al., 2020] Man-Fai Ng, Jin Zhao, Qingyu Yan, Gareth J. Conduit, and Zhi Wei Seh. Predicting the state of charge and health of batteries using data-driven machine learning. Nature Machine Intelligence, 2(3):161–170, 2020.

[Pang et al., 2023] Ziqi Pang, Ziyang Xie, Yunze Man, and Yu-Xiong Wang. Frozen transformers in language models are effective visual encoder layers. arXiv preprint arXiv:2310.12973, 2023.

[Radford et al., 2018] Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever, et al. Improving language understanding by generative pre-training. 2018.

[Radford et al., 2019] Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever, et al. Language models are unsupervised multitask learners. OpenAI Blog, 1(8):9, 2019.

[Roman et al., 2021] Darius Roman, Saurabh Saxena, Valentin Robu, Michael Pecht, and David Flynn. Machine learning pipeline for battery state-of-health estimation. Nature Machine Intelligence, 3(5):447–456, 2021.

[Severson et al., 2019] Kristen A. Severson, Peter M. Attia, Norman Jin, Nicholas Perkins, Benben Jiang, Zi Yang, Michael H. Chen, Muratahan Aykol, Patrick K. Herring, Dimitrios Fraggedakis, et al. Data-driven prediction of battery cycle life before capacity degradation. Nature Energy, 4(5):383–391, 2019.

[Shu et al., 2021] Xing Shu, Jiangwei Shen, Guang Li, Yuanjian Zhang, Zheng Chen, and Yonggang Liu. A flexible state-of-health prediction scheme for lithium-ion battery packs with long short-term memory network and transfer learning. IEEE Transactions on Transportation Electrification, 7(4):2238–2248, 2021.

[Sun et al., 2020] Yu Sun, Xiaolong Wang, Zhuang Liu, John Miller, Alexei Efros, and Moritz Hardt. Test-time training with self-supervision for generalization under distribution shifts. In International Conference on Machine Learning, pages 9229–9248. PMLR, 2020.

[Tan and Zhao, 2020] Yandan Tan and Guangcai Zhao. Transfer learning with long short-term memory network for state-of-health prediction of lithium-ion batteries. IEEE Transactions on Industrial Electronics, 67(10):8723–8731, 2020.

[Tian et al., 2021] Jinpeng Tian, Rui Xiong, Weixiang Shen, Jiahuan Lu, and Xiao-Guang Yang. Deep neural network battery charging curve prediction using 30 points collected in 10 min. Joule, 5(6):1521–1534, 2021.

[Tian et al., 2022] Jinpeng Tian, Rui Xiong, Weixiang Shen, Jiahuan Lu, and Fengchun Sun. Flexible battery state of health and state of charge estimation using partial charging data and deep learning. Energy Storage Materials, 51:372–381, 2022.

[Wang et al., 2023] Yixiu Wang, Jiangong Zhu, Liang Cao, Bhushan Gopaluni, and Yankai Cao. Long short-term memory network with transfer learning for lithium-ion battery capacity fade and cycle life prediction. Applied Energy, 350:121660, 2023.

[Ye and Yu, 2021] Zhuang Ye and Jianbo Yu. State-of-health estimation for lithium-ion batteries using domain adversarial transfer learning. IEEE Transactions on Power Electronics, 37(3):3528–3543, 2021.

[Zhou et al., 2023] Tian Zhou, Peisong Niu, Xue Wang, Liang Sun, and Rong Jin. One fits all: Power general time series analysis by pretrained LM. arXiv preprint arXiv:2302.11939, 2023.