
GPT4Battery: An LLM-driven Framework for Adaptive State of Health Estimation of Raw Li-ion Batteries

Paper ID: 5751

Abstract

State of health (SOH) is a crucial indicator for assessing the degradation level of batteries; it cannot be measured directly but requires estimation. Accurate SOH estimation enhances detection, control, and feedback for Li-ion batteries, allowing safe and efficient energy management and guiding the development of new-generation batteries. Despite significant progress in data-driven SOH estimation, the time- and resource-consuming degradation experiments required to generate lifelong training data pose a challenge to establishing one large model capable of handling diverse types of Li-ion batteries, e.g., cross-chemistry, cross-manufacturer, and cross-capacity. Hence, this paper utilizes the strong generalization capability of a large language model (LLM) to propose a novel framework for adaptive SOH estimation across diverse batteries. To match the real-world scenario in which unlabeled data arrives sequentially during use, with distribution shifts, the proposed model is modified by a test-time training technique to ensure estimation accuracy even at the battery's end of life. The validation results demonstrate that the proposed framework achieves state-of-the-art accuracy on four widely recognized datasets collected from 62 batteries. Furthermore, we analyze the theoretical challenges of cross-battery estimation and provide a quantitative explanation of the effectiveness of our method.

1 Introduction

Rechargeable Li-ion batteries (LIBs) are crucial in many modern-day applications ranging from portable electronics and medical devices to renewable energy integration in power grids and electric vehicles [Genikomsakis et al., 2021]. The entire LIB chain could reach a value of more than $400 billion and a market size of 4.7 TWh by 2030 [Fleischmann et al., 2023].

State of health (SOH) is a critical state evaluating the degradation level of batteries, which cannot be measured directly but requires estimation [Lu et al., 2023]. Obtaining an accurate SOH is vital to ensure safe and efficient battery management and to guide the design and manufacturing of new-generation batteries [Severson et al., 2019]. As a multidisciplinary issue, battery state estimation presents a challenge for researchers in both the battery and machine learning fields. Existing data-driven SOH estimation studies require lifelong data with precise SOH labels to establish the mapping relationship for every LIB [Roman et al., 2021]. Obtaining the training data necessitates degradation experiments that often require many months to years. However, with the rapid development of new-generation batteries, waiting this long to develop separate models for each LIB greatly hinders the deployment of battery management systems (BMS) [Ng et al., 2020; Roman et al., 2021].

To lighten the burden of data collection, several transfer learning-based methods for SOH estimation have emerged recently [Shu et al., 2021; Tian et al., 2021]. For instance, [Tan and Zhao, 2020] developed an LSTM-FC network with a fine-tuning strategy to predict SOH using only the first 25% of the target LIB dataset. Besides, [Lu et al., 2023] integrated a swarm of DNNs with domain adaptation to enable SOH estimation without target LIB labels. While these approaches alleviate the data collection difficulties to some extent, they still cannot follow the pace of battery upgrading and pose a barrier to developing battery technologies. Another issue is that existing transfer learning methods are effective on well-collected datasets for fine-tuning or feature alignment, but they fail to match the real-world scenario in which a raw battery continues to age, producing unlabeled data incrementally over months. Hence, developing a large model that can handle diverse types of LIBs (e.g., cross-chemistry, cross-manufacturer, and cross-capacity) while matching raw, new-generation batteries in real life is crucial.

This paper first proposes an LLM-driven framework for SOH estimation that does not require training on target data in advance. Since creating massive datasets and training a large LIB model is rarely sustainable, we explore the cross-modal migratability of the language model's powerful generalization capability. We convert GPT2's modality [Radford et al., 2019] from language to battery data by retraining the input and output layers and plugging in 'battery-specific adaptors' to create our backbone. Secondly, a test-time training strategy is employed to fully leverage the incrementally acquired unlabeled data, adapting to real-world scenarios where a raw battery undergoes temporal aging through charging and discharging. This strategy effectively mitigates cumulative errors resulting from temporal distribution shifts of LIBs in use over months. The proposed framework demonstrates comparable accuracy to extensive baseline methods under regular settings and achieves state-of-the-art accuracy under zero-shot settings. This work also presents a preliminary analysis of the theoretical challenges associated with cross-battery estimation and elucidates our framework's underlying principles.

In summary, our main contributions are as follows:

• We fully utilize the strong generalization capability of LLMs to establish a framework for cross-battery SOH estimation, lightening the burden of months-to-years degradation experiments for data collection.

• We employ a test-time training strategy to match the real-world scenario where a raw battery continues to age, producing unlabeled data incrementally, often with temporal distribution shifts. This strategy ensures estimation accuracy even at the battery's end of life.

• The validation results demonstrate that the proposed framework achieves state-of-the-art zero-shot accuracy on four widely recognized LIB datasets, and on two of them it is even comparable to the latest domain adaptation methods.

2 Related Work

2.1 Data-driven battery SOH estimation

Data-driven approaches for battery SOH estimation offer greater benefits in accuracy and online computation efficiency than traditional mechanism-based models such as equivalent circuit models (ECMs) and physics-based models (PBMs) [Ng et al., 2020]. In [Roman et al., 2021], the authors proposed a machine learning pipeline for battery state of health estimation involving four algorithms: Bayesian Ridge Regression (BRR), Gaussian process regression (GPR), Random Forest (RF), and a deep ensemble of neural networks (dNNe). While accurate health state estimation methods have progressed significantly, the time- and resource-consuming degradation experiments needed to generate lifelong training data make target-battery-agnostic approaches attractive [Ye and Yu, 2021; Han et al., 2022]. For instance, [Tan and Zhao, 2020] developed an LSTM-FC network to predict SOH by fine-tuning on only the first 25% of the target dataset. Similarly, [Wang et al., 2023] retrained an LSTM using only two target battery cells during the transfer learning process. Besides, [Lu et al., 2023] integrated a swarm of deep neural networks with domain adaptation to enable SOH estimation in the absence of target battery labels, which inspired our work.

2.2 Capability of LLMs

In-modality generalization capability. Extensive research has verified the in-modality generalization capability of large language models. BERT [Devlin et al., 2018] used transformer encoders and employed a masked language modeling task to recover randomly masked tokens within a text. Furthermore, OpenAI proposed GPT [Radford et al., 2018] and GPT2 [Radford et al., 2019], verifying that scaling up language models significantly improves zero-shot performance on various downstream language tasks. Via few-shot instruction tuning, GPT3 [Brown et al., 2020] even reached competitiveness with prior state-of-the-art fine-tuning approaches on NLP tasks that required on-the-fly reasoning or domain adaptation.

Cross-modality capability. Pre-trained knowledge of language models is also used for downstream tasks in a different modality. Indeed, [Kao and Lee, 2021] verified the cross-modality power of BERT in transferring from natural language to amino acids, DNA, and music. Additionally, [Lu et al., 2022] investigated the capability of a transformer pre-trained on natural language to generalize to other modalities with minimal fine-tuning. Their so-called Frozen Pretrained Transformer (FPT) shows superiority over the same architecture randomly initialized, improving performance and compute efficiency on cross-modality downstream tasks spanning numerical computation, vision, and protein fold prediction. Furthermore, [Zhou et al., 2023] leveraged a modified GPT2 for general time series analysis and achieved results comparable to the state of the art. [Pang et al., 2023] reveals that LLMs trained solely on textual data are surprisingly strong encoders for purely visual tasks.

Inter-modality migration of generalization capability. By comparison, exploring whether the powerful generalization capability of an LLM can be migrated to another modality is an ongoing research direction. Note that [Zhou et al., 2023] experimentally discovered the superior few-shot/zero-shot capability of LLMs on general time series analysis, despite a certain lack of explainability.

2.3 Test-Time Training

Test-Time Training (TTT) is a general approach for improving the performance of predictive models when there are distribution shifts between training and testing data [Sun et al., 2020]. A typical TTT model has a supervised main task head for labeled training data and a self-supervised head for unlabeled testing data. When testing data arrives in an online stream, the online version of TTT also updates incrementally via self-supervision. This method operates well when the self-supervised task propagates gradients that correlate with those of the main task, demonstrating improved generalization under distribution shifts on many visual benchmarks [Gandelsman et al., 2022].

3 Methodology

In this section, we introduce our proposed framework, GPT4Battery, which leverages an LLM's generalization capability for cross-battery state of health (SOH) estimation, along with a more practical setting, the test-time training strategy, as shown in Figure 1.

3.1 Problem Formulation

Using the battery's raw voltage-time charging curve is a popular and easy-to-implement approach to accurately estimating multiple states over battery life [Tian et al., 2022; Roman et al., 2021].
[Figure 1: two panels, "Modeling from PLM" and "Test-time Learning", showing the GPT4Battery backbone with battery-specific adaptors feeding a Regression Head and an SSL Head for Reconstruction.]

Figure 1: Proposed GPT4Battery architecture. We highlight the leverage of the LLM's generalization capability for cross-battery SOH estimation and the scenario-fit setting of the test-time learning strategy.

Dataset     Electrode Material (Cathode)  Nominal capacity  Voltage range  Samples  Collector
CALCE       LCO                           1.1 Ah            2.7-4.2 V      2807     University of Maryland
SANYO       NMC                           1.85 Ah           3.0-4.1 V      415      RWTH Aachen University
KOKAM       LCO/NCO                       0.74 Ah           2.7-4.2 V      503      University of Oxford
PANASONIC   NCA                           3.03 Ah           2.5-4.29 V     2770     Beijing Institute of Technology
GOTION      LFP                           27 Ah             2.0-3.65 V     4262     Beijing Institute of Technology

Table 1: Main specifications of selected LIB datasets.

Determining SOH can be considered a univariate supervised regression problem in this case. In addition to traditional SOH estimation, which requires training data over the battery's full life cycle, the proposed framework is designed to perform well under a zero-shot scenario. In this scenario, the pre-trained model is tested on a raw battery with different chemistry and manufacturing without training in advance. Since developing a battery degradation dataset takes 644-8473 hours of degradation experiments [Lu et al., 2023], models with such adaptive capability are of great value, as described above.

Problem 1: Regular SOH Estimation
We use partial charging data to determine the battery state of health (SOH) [Tian et al., 2021]. Battery SOH is generally defined as

    \mathrm{SOH}_s = Q_s / Q    (1)

where Q is the initial maximum battery capacity and Q_s is the maximum battery capacity at the s-th cycle.

In a constant-current charging process, the voltage V(t) and current I(t) are stored by the BMS at every time step, and a voltage sampling window is applied to capture the partial charging window at the s-th cycle:

    q_i^s(V) = \int_{V_{\min}}^{V_{\min} + i \Delta V} |I(t)| \, dt, \quad i \in \{0, 1, \cdots, K\}    (2)

The cycle's charging feature q_i(V) is obtained by gridding the voltage sampling window with step \Delta V from V_{\min} to V_{\min} + K \Delta V. Then we expand it into a 1-d vector and normalize it by the initial capacity Q to make it adaptive to different LIBs:

    q^s(V) = [q_0(V), q_1(V), \ldots, q_K(V)] / Q    (3)

Next, the sampled data sequence is mapped to the SOH:

    \mathrm{SOH}_s = f_{\mathrm{DNN}}(q^s(V))    (4)
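To make Eqs. (2)-(3) concrete, the following minimal NumPy sketch derives the normalized charging feature vector from one sampled constant-current charging curve. It is an illustrative implementation, not the authors' code; the window parameters v_min, delta_v, and K are placeholder choices left to the practitioner.

```python
import numpy as np

def charging_feature(t, v, i, v_min, delta_v, K, Q):
    """Normalized charging feature q^s(V) of Eqs. (2)-(3).

    t, v, i: 1-d arrays of time (s), voltage (V), and current (A)
             logged by the BMS during constant-current charging.
    Returns a (K+1,)-vector: charge accumulated from v_min up to
    v_min + k*delta_v for k = 0..K, normalized by initial capacity Q (Ah).
    """
    q = np.zeros(K + 1)
    for k in range(K + 1):
        mask = (v >= v_min) & (v <= v_min + k * delta_v)
        if mask.sum() > 1:
            # Eq. (2): integrate |I(t)| dt over the voltage window; /3600 gives Ah
            q[k] = np.trapz(np.abs(i[mask]), t[mask]) / 3600.0
    return q / Q  # Eq. (3): normalize by the initial capacity
```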
Problem 2: Zero-shot Adaptivity to a Raw Battery
In this work, we propose a new problem setting: predicting SOH values for unseen batteries without prior training. More specifically, we are given a set of L labeled samples L = {(x_1, y_1), (x_2, y_2), ..., (x_L, y_L)} for pre-training, where x denotes the partial charging curve q(V) and y is the target SOH label from the experimental lifelong battery dataset.

When a raw battery arrives in the real scenario, the fine-tuning dataset comprises only U unlabeled samples U = {x_1, x_2, ..., x_U} that arrive sequentially, since the degradation process during usage typically spans months to years. Meanwhile, the labels are inaccessible due to incomplete discharging under real-world usage, making it challenging to adapt the model by supervised fine-tuning. Nevertheless, obtaining the partial charging curves of the first cycle (i.e., in a fresh state) is easily achievable (e.g., through LIB formation or factory testing), and their state of health can be considered 100%.

Algorithm 1 GPT4Battery Pipeline
Input: A set of L labeled samples L = {(x_1, y_1), (x_2, y_2), ..., (x_L, y_L)} from an experimental dataset, and an unlabeled target dataset U with {(x_1, 1), x_2, x_3, ..., x_U} arriving sequentially.
Parameters: Frozen pre-trained GPT2 parameters f, main task head parameters h, and self-supervised head g.
Output: The label for every x in U, produced sequentially.
1: Stage 1: Joint pre-training
2: for (x_i, y_i) in L do
3:     f_0 ← \min_f l_m(x_i, y_i; f, h) + l_s(x_i; f, g)
4:     h_0 ← \min_h l_m(x_i, y_i; f, h)
5:     g_0 ← \min_g l_s(x_i; f, g)
6: end for
7: Stage 2: Joint fine-tuning with the 1st-cycle label
8: Assign a stopping loss;
9: while l_m(x_1, 1; f, h) > stopping loss do
10:    f_1 ← \min_f l_m(x_1, 1; f_0, h_0) + l_s(x_1; f_0, g_0)
11:    h_1 ← \min_h l_m(x_1, 1; f_0, h_0)
12:    g_1 ← \min_g l_s(x_1; f_0, g_0)
13: end while
14: Stage 3: Self-supervised test-time training
15: for x_i in U do
16:    f_x ← \min_f l_s(x_i; f_1, g_1)
17:    g_x ← \min_g l_s(x_i; f_1, g_1)
18:    Predict: f_x ∘ h_1
19: end for

3.2 Pre-trained language model for cross-battery SOH estimation

Since it is rarely sustainable to create massive datasets and train a large LIB model, we explore the migratability of the language model's generalization capability. A few modifications are made to suit the pre-trained language model (PLM) to battery data. We aim to preserve the inherent internal computation within the language model by freezing its main components. As a result, the following three adjustments are made to a pre-trained GPT-2 model [Radford et al., 2019].

Frozen Pre-trained Language Model. As the self-attention layers and FFNs (feedforward neural networks) contain most of the knowledge learned by pre-trained language models [Lu et al., 2022], freezing these components is a fair way to deploy the migration of the PLM's generalization capability from language to downstream tasks. The positional embeddings and layer normalization layers are left trainable, as is standard practice.
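As an illustration of this freezing recipe, the sketch below loads GPT-2 via Hugging Face Transformers and re-enables gradients only for the layer norms and positional embeddings, then attaches a fresh input projection. The per-grid-point linear input layer and its shape are our illustrative assumptions, since the paper does not spell out the exact tokenization of the charging curve.

```python
import torch.nn as nn
from transformers import GPT2Model

plm = GPT2Model.from_pretrained("gpt2")

# Freeze the self-attention and FFN blocks that carry the pre-trained
# knowledge; keep layer norms ("ln_1", "ln_2", "ln_f") and positional
# embeddings ("wpe") trainable, as described above.
for name, param in plm.named_parameters():
    param.requires_grad = ("ln" in name) or ("wpe" in name)

d = plm.config.n_embd          # hidden size (768 for GPT-2 small)
input_layer = nn.Linear(1, d)  # new input layer: one token per grid
                               # point q_k of the charging feature
```

The frozen encoder is then queried with plm(inputs_embeds=input_layer(q.unsqueeze(-1))) rather than with token ids, so all learning outside the layer norms happens in the cheap linear probe.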
Trainable input layers. Re-initializing a new input layer to query the transformer is important since the model operates in a new modality. We use linear probing to minimize the amount of computation outside the transformer.

Knowledgeable battery-specific adaptor. To enhance a pre-trained model's performance on downstream tasks with minimal effort, PEFT methods such as LoRA [Hu et al., 2021], VPT [Jia et al., 2022], and Prefix Tuning [Li and Liang, 2021] have been widely employed in CV and NLP. This work elaborates on our knowledgeable battery adaptors, incorporating each LIB cell's vectorized text specification for battery-specific fine-tuning. For the l-th adapter layer, l ∈ [1, L], the input H_A^l ∈ R^{(m+n)×d} is formed by vertically concatenating the hidden features \tilde{H}_P^l ∈ R^{m×d} from the l-th PLM layer and the vectorized text specification of the LIB cell \bar{H}_A^l ∈ R^{n×d} from knowledge. Here m and n denote the lengths of the PLM input sequence and the knowledge piece, respectively, and d is the hidden size. A learnable gate function is used to obtain crucial query information by filtering the hidden features of the PLM. Specifically,

    H_A^l = [\tilde{H}_P^l \odot \sigma(G); \bar{H}_A^l]    (5)

Given the input H_A^l, the adaptor layer projects it down to dimension r with a linear down-projection layer. Then a self-attention layer is employed to fuse the battery knowledge and the query information from the PLM. After that, another linear projection layer projects it back up to the original dimension d, and finally this output is fed into the PLM through a residual connection. For each different LIB, a separate battery-specific adaptor is employed to enhance the fine-tuning performance.

Figure 2: Knowledgeable 'battery-specific adaptor' with plug-in cell text specification.
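A minimal PyTorch sketch of this adaptor layer is given below. Treating the gate G of Eq. (5) as a learnable per-channel vector, using a single attention head, and routing only the PLM-token part of the output through the residual are our assumptions where the paper leaves details open.

```python
import torch
import torch.nn as nn

class BatteryAdaptor(nn.Module):
    """Sketch of the knowledgeable battery-specific adaptor (Eq. 5).

    Fuses the l-th PLM hidden states (m x d) with a vectorized cell
    text specification (n x d) through a down/attention/up bottleneck
    of width r, then adds the result back into the PLM stream.
    """
    def __init__(self, d, r):
        super().__init__()
        self.gate = nn.Parameter(torch.zeros(d))  # learnable gate G (assumed per-channel)
        self.down = nn.Linear(d, r)               # project down to dimension r
        self.attn = nn.MultiheadAttention(r, num_heads=1, batch_first=True)
        self.up = nn.Linear(r, d)                 # project back up to dimension d

    def forward(self, h_plm, h_spec):
        # h_plm: (B, m, d) PLM hidden features; h_spec: (B, n, d) cell knowledge
        gated = h_plm * torch.sigmoid(self.gate)  # filter crucial query information
        x = torch.cat([gated, h_spec], dim=1)     # Eq. (5): concat to (B, m+n, d)
        z = self.down(x)
        z, _ = self.attn(z, z, z)                 # fuse battery knowledge and query
        out = self.up(z)[:, : h_plm.size(1)]      # keep the PLM-token part (assumed)
        return h_plm + out                        # residual connection into the PLM
```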
3.3 Adaptation to a Raw Battery with Test-time Training

In a real-world scenario, a raw battery undergoes continuous aging with incrementally acquired unlabeled data, sampled over months. This renders existing transfer learning-based methods unadaptable. In this section, we introduce the theory of test-time training, which aligns more closely with practical settings.
Figure 3: Visual results of adaptation to four different kinds of LIBs under the zero-shot setting using GPT4Battery, pre-trained on the GOTION dataset.

Two-Head Structure
We consider a Y-shaped structure with two heads, similar to [Gandelsman et al., 2022]: a feature extractor f simultaneously followed by a self-supervised head g and a main task head h. For the main task (SOH estimation) head h and the self-supervised (charging curve reconstruction) head g, we use a linear down-projection and a linear up-projection, respectively, for clear comparison. Here the self-supervised task is designed to recover the input charging curve from the hidden output features of the modified PLM. It should be noted that f is exactly the modified PLM, as presented in the last section.

Pipeline
Joint Pre-Training. An experimental labeled dataset L containing {(x_1, y_1), ..., (x_n, y_n)} is used to train the main task network f ∘ h. Meanwhile, the reconstruction task f ∘ g is trained, conducting joint training that adopts a multi-task learning strategy [Caruana, 1997]. The losses for both tasks are added together, and gradients are taken with respect to all parameters. The joint training problem is therefore formulated as follows:

    f_0, h_0, g_0 = \min_{f,h,g} \frac{1}{n} \sum_{i=1}^{n} l_m(x_i, y_i; f, h) + l_s(x_i; f, g)    (6)

where l_m denotes the main SOH regression loss, defined by MSE, and l_s is the reconstruction loss of the charging curve x.
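The two-head structure and the joint objective of Eq. (6) can be sketched as follows. Mean pooling over token positions and the purely linear heads are illustrative simplifications; plm and input_layer are the frozen GPT-2 and input probe from the sketch in Section 3.2.

```python
import torch.nn as nn
import torch.nn.functional as F

class TwoHeadModel(nn.Module):
    """Y-shaped model: shared encoder f, main SOH head h, SSL head g."""
    def __init__(self, plm, input_layer, d, K):
        super().__init__()
        self.plm, self.input_layer = plm, input_layer  # modified PLM = f
        self.h = nn.Linear(d, 1)       # main task head: SOH regression
        self.g = nn.Linear(d, K + 1)   # SSL head: curve reconstruction

    def forward(self, q):
        emb = self.input_layer(q.unsqueeze(-1))        # (B, K+1, d)
        z = self.plm(inputs_embeds=emb).last_hidden_state.mean(dim=1)
        return self.h(z).squeeze(-1), self.g(z)

def joint_loss(model, q, y):
    # Eq. (6): main MSE loss l_m plus reconstruction loss l_s
    soh, rec = model(q)
    return F.mse_loss(soh, y) + F.mse_loss(rec, q)
```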

Fine-tuning using the 'labeled' 1st cycle. The partial charging curves of the first cycle (i.e., in fresh status) are easily obtained by LIB formation or factory testing, and their SOH labels can be treated as 1. We take full advantage of this sample to cover the distribution gap caused by the battery mechanism. An early-stopping technique is applied to prevent overfitting on this label, as presented in Algorithm 1.

    f_1, h_1, g_1 = \min_{f,h,g} l_m(x_1, 1; f_0, h_0) + l_s(x_1; f_0, g_0)    (7)

Test-Time Training. At test time, we start from the pre-trained PLM encoder f_1 as well as the two task heads g_1, h_1 fine-tuned in the last step. The main task head h_1 is frozen, and the following loss is optimized as each test sample x_i arrives:

    f_x, g_x = \min_{f,g} l_s(x_i; f_1, g_1)    (8)

After TTT, we make a prediction on each x_i as f_x ∘ h_1.
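Stages 2 and 3 of Algorithm 1 (Eqs. (7)-(8)) then amount to a short adaptation loop. The sketch below is ours, not the authors' code: the optimizer, learning rate, and stopping loss are placeholder choices, model is the two-head sketch above, and each stream element is assumed to be a (1, K+1) feature tensor.

```python
import torch
import torch.nn.functional as F

def adapt_and_predict(model, stream, stop_loss=1e-4, lr=1e-4):
    """stream[0] is the fresh 1st-cycle curve (SOH label treated as 1);
    the remaining curves arrive sequentially without labels."""
    opt = torch.optim.Adam(
        [p for p in model.parameters() if p.requires_grad], lr=lr)

    # Stage 2 (Eq. 7): joint fine-tuning on the 'labeled' 1st cycle,
    # early-stopped to avoid overfitting the single label.
    q1, y1 = stream[0], torch.ones(1)
    while True:
        soh, rec = model(q1)
        l_m = F.mse_loss(soh, y1)
        if l_m.item() <= stop_loss:       # early stopping on the label
            break
        (l_m + F.mse_loss(rec, q1)).backward()
        opt.step(); opt.zero_grad()

    # Stage 3 (Eq. 8): self-supervised TTT with the main head h frozen.
    for p in model.h.parameters():
        p.requires_grad_(False)
    preds = []
    for q in stream[1:]:
        soh, rec = model(q)
        F.mse_loss(rec, q).backward()     # reconstruction loss l_s only
        opt.step(); opt.zero_grad()
        with torch.no_grad():
            preds.append(model(q)[0].item())  # predict with f_x ∘ h_1
    return preds
```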
             CALCE           SANYO           KOKAM           PANASONIC       Average
Method       MAE(%)  Time    MAE(%)  Time    MAE(%)  Time    MAE(%)  Time    MAE(%)  Time
GPR          23.58   -       21.00   -       31.83   -       30.70   -       26.78   -
RD           8.74    -       13.33   -       9.02    -       10.52   -       10.40   -
SVR          4.27    -       5.62    -       6.44    -       5.53    -       5.47    -
CNN          10.31   -       17.90   -       14.64   -       25.46   -       17.08   -
Benchmark1   4.13    4.86    2.78    4.25    4.00    4.14    4.57    5.5     3.87    4.69
Benchmark2   3.35    11.35   0.88    25.3    6.21    13.18   1.44    11.16   2.97    15.25
GPT4Battery  1.43    43.73   0.87    68.72   5.56    255.24  0.81    61.17   2.17    107.22
swarm        1.12    -       1.21    -       1.76    -       2.09    -       1.55    -

Table 2: Comparison of methods under the zero-shot setting. We report the MAE (%) for each dataset and the average over the four datasets, as well as the inference time (ms) for the three benchmarks. A lower MAE indicates better performance. Red: best; black: second best. Note that swarm is tested under the domain adaptation setting, where target data is accessible.

4 Experiments

This section evaluates the proposed GPT4Battery on five publicly recognized datasets collected from 65 battery cells to demonstrate our framework's effectiveness on this challenging problem.

4.1 Experiment Settings

Datasets. The experiments employ datasets from CALCE [He et al., 2011], SANYO [Li et al., 2021], KOKAM [Birkl, 2017], PANASONIC, and GOTION HIGH-TECH [Lu et al., 2023]. These datasets cover widely used cathode active materials, capacities ranging from 0.74 Ah to 27 Ah, and five different manufacturers, emphasizing our method's potential to work directly on new-generation batteries under a zero-shot setting. This paper adopts the data pre-processing of [Lu et al., 2023]. The differences between these five LIBs are compared in Table 1.

Baselines. GPT4Battery is compared against three categories of methods:

• Four popular data-driven SOH estimation methods: Gaussian process regression (GPR), support vector regression (SVR), Random Forest (RD) [Roman et al., 2021], and a CNN [Tian et al., 2022].

• The latest domain adaptation based transfer learning method for estimation without labels of the target LIBs [Lu et al., 2023].

• Benchmark1, created by substituting the PLM with a regular CNN architecture, and Benchmark2, created by disabling the test-time training technique.

4.2 Results

Evaluation under the zero-shot setting
Under the zero-shot setting, we employed GOTION [Lu et al., 2023] as the pre-training dataset for joint training because it contains relatively more samples. After obtaining a foundation model, test-time training is employed to adapt this model to four different types of batteries. Fig. 3 illustrates the visual accuracy performance in the absence of any data from the four LIBs except the 1st cycle, whose SOH label is considered 1.

Table 2 reports the comparative results of the mean absolute error (MAE) on the baselines and the inference time of the three benchmarks. Without target training data, the existing methods fail to provide reliable estimation, with MAEs over 10% except for SVR. In contrast, the proposed GPT4Battery framework achieves accurate SOH estimation with an average MAE of 2.17% across the four datasets. On SANYO and PANASONIC, our method even outperforms the latest domain adaptation method, where the target LIB features are accessible, by MAEs of 0.34% and 0.28%, respectively.

The expense of a little inference time (within 1 second per cycle) is certainly acceptable considering that one charging cycle can take hours. It should be noted that the least desirable performance is observed on KOKAM, mainly attributed to the high linearity of this specific dataset.

Evaluation under the regular setting

Method       #1     #2     #3     #4     #5
GPR          6.42   5.84   8.06   7.63   6.39
RD           0.48   0.13   0.11   0.52   0.29
SVR          4.23   5.87   6.36   5.50   4.66
CNN          0.72   1.90   2.65   0.72   0.28
Benchmark2   2.22   0.61   1.13   1.40   0.25

Table 3: Comparison of methods under the regular setting. We report the MAE (%) for each dataset. Black: best.

Under the regular setting, we disabled TTT and compared this LLM-driven model (Benchmark2) to the existing methods. LIBs within one dataset are divided into training and testing sets to fit this traditional machine learning setting. Unsurprisingly, our model shows no superior performance over the baseline methods, as a model with strong generalization capability may not converge as effectively on one specific dataset.

4.3 Analysis

The challenge of zero-shot estimation originates from the spatial domain shifts across batteries with different chemistries, capacities, or manufacturers. Meanwhile, since the degradation process often spans months to years, variations in
Figure 4: Quantitative comparison of GPT4Battery (left) and Benchmark2 (right, with TTT disabled) on CALCE under the zero-shot setting. Despite both methods demonstrating accurate estimation early in the life cycles, GPT4Battery with TTT effectively reduces errors accumulated due to temporal distribution shifts, ensuring accuracy in the mid-to-late life cycles.

the mechanisms within the battery (such as side reactions and the stability of the solid electrolyte interface) [Broussely et al., 2005] are more likely to induce temporal distribution shifts, making a fixed model fail in the mid-to-late life cycle. The excellent performance of our framework can be attributed to the powerful generalization capability migrated from the LLM and to the test-time learning strategy, both of which effectively deal with this out-of-distribution problem.

Here, we provide a qualitative explanation of a crucial strategy in our proposed framework, i.e., test-time training. We present two estimation and error analysis maps to visualize the contributions of TTT to SOH estimation. Compared with Benchmark2, where TTT is disabled, GPT4Battery achieves significantly lower mean absolute error (MAE) throughout the life cycle. This stands in contrast to the accumulating estimation error observed in Benchmark2, confirming the assumption of temporal distribution shifts.

Theoretically, an intuitive explanation for TTT is that the self-supervised task happens to propagate gradients that correlate with those of the main task [Sun et al., 2020]. In our framework, we believe that the charging curve reconstruction task finds a better bias-variance trade-off under the temporally accumulated distribution shifts. A fixed model is biased because it is based entirely on biased training data, and the generalization ability migrated from language models is not powerful enough to handle this non-trivial change. The other extreme is completely discarding the pre-trained knowledge and training a new model from scratch on each test input. However, this is undesirable because of the high variance of each input, and the reconstruction task does not always contribute to the main regression task according to our trial and error.

5 Conclusion

We proposed an LLM-driven framework equipped with test-time learning for cross-battery state of health (SOH) estimation, addressing the real-world scenario where raw battery data arrives sequentially without labels. Theoretically, we analyzed that the challenges of cross-battery tasks come from the spatial distribution shifts between diverse LIBs and the temporal distribution shifts during the months-to-years aging process, emphasizing the necessity of the LLM-driven design and test-time learning. To our knowledge, we are the first to explore the inter-modal migratability of a large language model's generalization capability and deploy it successfully. Practically, our proposed GPT4Battery achieves state-of-the-art results in zero-shot SOH estimation across various LIBs.
References

[Birkl, 2017] Christoph Birkl. Oxford battery degradation dataset 1. 2017.

[Broussely et al., 2005] Michel Broussely, Ph Biensan, F Bonhomme, Ph Blanchard, S Herreyre, K Nechev, and RJ Staniewicz. Main aging mechanisms in Li-ion batteries. Journal of Power Sources, 146(1-2):90-96, 2005.

[Brown et al., 2020] Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are few-shot learners. Advances in Neural Information Processing Systems, 33:1877-1901, 2020.

[Caruana, 1997] Rich Caruana. Multitask learning. Machine Learning, 28:41-75, 1997.

[Devlin et al., 2018] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.

[Fleischmann et al., 2023] Jakob Fleischmann, Martin Linder, Mikael Hanicke, Evan Horetsky, Dina Ibrahim, Sören Jautelat, Lukas Torscht, Alexandre van de Rijt, and Patrick Schaufuss. Battery 2030: Resilient, sustainable, and circular. January 16, 2023.

[Gandelsman et al., 2022] Yossi Gandelsman, Yu Sun, Xinlei Chen, and Alexei Efros. Test-time training with masked autoencoders. Advances in Neural Information Processing Systems, 35:29374-29385, 2022.

[Genikomsakis et al., 2021] Konstantinos N. Genikomsakis, Nikolaos-Fivos Galatoulas, and Christos S. Ioakimidis. Towards the development of a hotel-based e-bike rental service: Results from a stated preference survey and techno-economic analysis. Energy, 215:119052, 2021.

[Han et al., 2022] Te Han, Zhe Wang, and Huixing Meng. End-to-end capacity estimation of lithium-ion batteries with an enhanced long short-term memory network considering domain adaptation. Journal of Power Sources, 520:230823, 2022.

[He et al., 2011] Wei He, Nicholas Williard, Michael Osterman, and Michael Pecht. Prognostics of lithium-ion batteries based on Dempster-Shafer theory and the Bayesian Monte Carlo method. Journal of Power Sources, 196(23):10314-10321, 2011.

[Hu et al., 2021] Edward J Hu, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen, et al. LoRA: Low-rank adaptation of large language models. In International Conference on Learning Representations, 2021.

[Jia et al., 2022] Menglin Jia, Luming Tang, Bor-Chun Chen, Claire Cardie, Serge Belongie, Bharath Hariharan, and Ser-Nam Lim. Visual prompt tuning. In European Conference on Computer Vision, pages 709-727. Springer, 2022.

[Kao and Lee, 2021] Wei-Tsung Kao and Hung-yi Lee. Is BERT a cross-disciplinary knowledge learner? A surprising finding of pre-trained models' transferability. arXiv preprint arXiv:2103.07162, 2021.

[Li and Liang, 2021] Xiang Lisa Li and Percy Liang. Prefix-tuning: Optimizing continuous prompts for generation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 4582-4597, 2021.

[Li et al., 2021] Weihan Li, Neil Sengupta, Philipp Dechent, David Howey, Anuradha Annaswamy, and Dirk Uwe Sauer. One-shot battery degradation trajectory prediction with deep learning. Journal of Power Sources, 506:230024, 2021.

[Lu et al., 2022] Kevin Lu, Aditya Grover, Pieter Abbeel, and Igor Mordatch. Frozen pretrained transformers as universal computation engines. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, pages 7628-7636, 2022.

[Lu et al., 2023] Jiahuan Lu, Rui Xiong, Jinpeng Tian, Chenxu Wang, and Fengchun Sun. Deep learning to estimate lithium-ion battery state of health without additional degradation experiments. Nature Communications, 14(1):2760, 2023.

[Ng et al., 2020] Man-Fai Ng, Jin Zhao, Qingyu Yan, Gareth J Conduit, and Zhi Wei Seh. Predicting the state of charge and health of batteries using data-driven machine learning. Nature Machine Intelligence, 2(3):161-170, 2020.

[Pang et al., 2023] Ziqi Pang, Ziyang Xie, Yunze Man, and Yu-Xiong Wang. Frozen transformers in language models are effective visual encoder layers. arXiv preprint arXiv:2310.12973, 2023.

[Radford et al., 2018] Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever, et al. Improving language understanding by generative pre-training. 2018.

[Radford et al., 2019] Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever, et al. Language models are unsupervised multitask learners. OpenAI Blog, 1(8):9, 2019.

[Roman et al., 2021] Darius Roman, Saurabh Saxena, Valentin Robu, Michael Pecht, and David Flynn. Machine learning pipeline for battery state-of-health estimation. Nature Machine Intelligence, 3(5):447-456, 2021.

[Severson et al., 2019] Kristen A Severson, Peter M Attia, Norman Jin, Nicholas Perkins, Benben Jiang, Zi Yang, Michael H Chen, Muratahan Aykol, Patrick K Herring, Dimitrios Fraggedakis, et al. Data-driven prediction of battery cycle life before capacity degradation. Nature Energy, 4(5):383-391, 2019.

[Shu et al., 2021] Xing Shu, Jiangwei Shen, Guang Li, Yuanjian Zhang, Zheng Chen, and Yonggang Liu. A flexible state-of-health prediction scheme for lithium-ion battery packs with long short-term memory network and transfer learning. IEEE Transactions on Transportation Electrification, 7(4):2238-2248, 2021.

[Sun et al., 2020] Yu Sun, Xiaolong Wang, Zhuang Liu, John Miller, Alexei Efros, and Moritz Hardt. Test-time training with self-supervision for generalization under distribution shifts. In International Conference on Machine Learning, pages 9229-9248. PMLR, 2020.

[Tan and Zhao, 2020] Yandan Tan and Guangcai Zhao. Transfer learning with long short-term memory network for state-of-health prediction of lithium-ion batteries. IEEE Transactions on Industrial Electronics, 67(10):8723-8731, 2020.

[Tian et al., 2021] Jinpeng Tian, Rui Xiong, Weixiang Shen, Jiahuan Lu, and Xiao-Guang Yang. Deep neural network battery charging curve prediction using 30 points collected in 10 min. Joule, 5(6):1521-1534, 2021.

[Tian et al., 2022] Jinpeng Tian, Rui Xiong, Weixiang Shen, Jiahuan Lu, and Fengchun Sun. Flexible battery state of health and state of charge estimation using partial charging data and deep learning. Energy Storage Materials, 51:372-381, 2022.

[Wang et al., 2023] Yixiu Wang, Jiangong Zhu, Liang Cao, Bhushan Gopaluni, and Yankai Cao. Long short-term memory network with transfer learning for lithium-ion battery capacity fade and cycle life prediction. Applied Energy, 350:121660, 2023.

[Ye and Yu, 2021] Zhuang Ye and Jianbo Yu. State-of-health estimation for lithium-ion batteries using domain adversarial transfer learning. IEEE Transactions on Power Electronics, 37(3):3528-3543, 2021.

[Zhou et al., 2023] Tian Zhou, Peisong Niu, Xue Wang, Liang Sun, and Rong Jin. One fits all: Power general time series analysis by pretrained LM. arXiv preprint arXiv:2302.11939, 2023.
