Statistical Papers (2025) 66:33 https://doi.org/10.1007/s00362-024-01656-9
REGULAR ARTICLE
Testing for changes in the error distribution in functional linear models
Natalie Neumeyer1 · Leonie Selk1
Received: 3 April 2024 / Revised: 6 November 2024
© The Author(s) 2025
Abstract
We consider linear models with scalar responses and covariates from a separable Hilbert space. The aim is to detect change
points in the error distribution, based on sequential residual empirical distribution functions. Expansions for those estimated
functions are more challenging in models with infinite-dimensional covariates than in regression models with scalar or vector-
valued covariates due to a slower rate of convergence of the parameter estimators. Yet the suggested change point test is
asymptotically distribution-free and consistent for one-change point alternatives. In the latter case we also show consistency of a
change point estimator.
Keywords Change-points · Functional data analysis · Regularized function estimators · Regression · Residual processes
Mathematics Subject Classification Primary 62R10; Secondary 62G10 · 62G30
1 Introduction
We consider a functional linear model $Y = \alpha + \langle X, \beta\rangle + \varepsilon$ with scalar response $Y$ and covariates $X$ from a separable Hilbert space, e.g. $L^2([0,1])$. Structural changes in the distribution can appear, even when the parameters $\alpha$ and $\beta$ do not change. For this reason we focus on detecting changes in the error distribution. If the errors were observable one could use the classical test (and change point estimators) based on the difference of the sequential empirical distribution functions of the first $\lfloor nt\rfloor$ and the last $n - \lfloor nt\rfloor$ error terms from a sample of $n$ observations, see Csörgö et al. (1997), Picard (1985), Carlstein (1988), Dümbgen (1991), Hariz et al. (2005) and Hariz et al. (2007). In a regression model those tests have to be based on estimated residuals $\hat\varepsilon = Y - \hat\alpha - \langle X, \hat\beta\rangle$. Similar tests have been considered by Bai (1994) in
1 Fachbereich Mathematik, Universität Hamburg, Hamburg, Germany
the context of ARMA-models, by Koul (1996) in the context of nonlinear time series, by Ling (1998) for nonstationary
autoregressive models, and by Neumeyer and Van Keilegom (2009) and Selk and Neumeyer (2013) for nonparametric independent
and time series regression models. Typically the asymptotic distribution is derived using asymptotic expansions of residual-based
empirical distribution functions. For models with functional covariates those expansions can be problematic because inner products
$\langle X, \hat\beta - \beta\rangle$ appear and those can have a slow rate of convergence [see Cardot et al. (2007), Shang and Cheng (2015), Yeon et al.
(2023)]. However, we show that under very simple non-restrictive assumptions those terms cancel for the suggested change point
test statistic and thus the asymptotic distribution is the same as based on true (unobserved) errors.
Change point testing and estimation for functional data, and for the parameter in
functional linear models have been considered in the literature, but not for the error distribution. Tests for changes in the functional
mean and in the parameter function of autoregressive models are considered in chapters 6 and 14 in Horváth and Kokoszka (2012).
Berkes et al. (2009) propose a CUSUM testing procedure to detect a change in the mean of functional observations. They apply
projections on principal components of the data to estimate the mean. Aue et al. (2009) extend this result and introduce an estimator
for the change point in this model and derive its limit distribution. Aston and Kirch (2012) consider the same type of model with
epidemic changes and dependent data. Aue et al. (2018) consider how to detect and date structural breaks in the mean of functional
observations without the application of dimension reduction techniques (such as functional principal component analysis). Aue et al. (2014) propose a monitoring procedure to detect structural changes in functional linear models with functional response, allowing for dependence in the data, including functional autoregressive processes. They test for a change in the regression operator, which is the analogue to our $\beta$, based on functional principal component analysis. A linear regression model with scalar response is considered in Horváth et al. (2024), who propose tests for the detection of multiple change points in the regression parameter. The
regressors in their model can be functional and can include lagged values of the response.
The paper is organized as follows. In Sect. 2 we define the test statistic and present
model assumptions to obtain the asymptotically distribution-free hypothesis test. In Sect. 3 we discuss the assumptions on the
parameter estimators and some examples. In Sect. 4 consistency of the test as well as of a change point estimator is considered in
the context of one change point. Finite sample properties are shown in Sect. 5. Section 6 concludes the paper, in particular with
an outlook on goodness-of-fit testing. The proofs are given in the appendix.
2 Model, test statistic and main result under the null
Let $H$ be a separable Hilbert space with inner product $\langle\cdot,\cdot\rangle$, corresponding norm $\|\cdot\|$ and Borel-sigma field. Let $(X_i, Y_i)$, $i = 1,\dots,n$, be an independent sample of $(H\times\mathbb{R})$-valued random variables defined on the same probability space with probability measure $P$. The data are modeled as functional linear model
$$Y_i = \alpha + \langle X_i, \beta\rangle + \varepsilon_i, \quad i = 1,\dots,n,$$
with scalar response $Y_i$ and $H$-valued covariate $X_i$, and with parameters $\alpha\in\mathbb{R}$, $\beta\in H$. The covariates $X_1,\dots,X_n$ are assumed to be iid with $E\|X_i\| < \infty$, and the errors $\varepsilon_1,\dots,\varepsilon_n$ are independent, centered, and independent of the covariates. Our aim is to test for change-points in the error distribution. In this section we consider the test statistic under the null hypothesis, where the errors are identically distributed.
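Let $\hat\alpha$ and $\hat\beta$ denote estimators for the parameters $\alpha\in\mathbb{R}$ and $\beta\in H$. We build residuals $\hat\varepsilon_i = Y_i - \hat\alpha - \langle X_i, \hat\beta\rangle$, $i = 1,\dots,n$. The test statistic
$$T_n = \sup_{t\in[0,1]}\,\sup_{z\in\mathbb{R}} |\hat G_n(t,z)|$$
based on the process
$$\hat G_n(t,z) = \frac{\lfloor nt\rfloor\,(n-\lfloor nt\rfloor)}{n^{3/2}}\left(\hat F_{\lfloor nt\rfloor}(z) - \tilde F_{\lfloor nt\rfloor}(z)\right)$$
compares for each $k = 1,\dots,n-1$ the empirical distribution functions
$$\hat F_k(z) = \frac{1}{k}\sum_{i=1}^{k} I\{\hat\varepsilon_i \le z\}, \qquad \tilde F_k(z) = \frac{1}{n-k}\sum_{i=k+1}^{n} I\{\hat\varepsilon_i \le z\}$$
of the first $k$ and last $n-k$ residuals, respectively. Note that one can write
$$\hat G_n(t,z) = \frac{\lfloor nt\rfloor}{n^{1/2}}\left(\hat F_{\lfloor nt\rfloor}(z) - \hat F_n(z)\right).$$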
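The definitions above translate directly into a few lines of array code. The following is a minimal sketch, assuming the residuals are already available as a one-dimensional array; the function names `sequential_residual_process` and `ks_changepoint_statistic` are hypothetical and not taken from the paper. It uses the second representation of $\hat G_n$ and the fact that both suprema are attained on the finite grid $k \in \{1,\dots,n\}$, $z \in \{\hat\varepsilon_1,\dots,\hat\varepsilon_n\}$, because the empirical distribution functions are right-continuous step functions.

```python
import numpy as np


def sequential_residual_process(residuals):
    """Return the (n x n) array with entries
    G[k-1, j] = k / sqrt(n) * (F_hat_k(z_j) - F_hat_n(z_j)),
    where z_j runs over the sorted residuals."""
    eps = np.asarray(residuals, dtype=float)
    n = eps.size
    z = np.sort(eps)
    indicators = (eps[:, None] <= z[None, :]).astype(float)  # I{eps_i <= z_j}
    partial_sums = np.cumsum(indicators, axis=0)             # k * F_hat_k(z_j)
    k = np.arange(1, n + 1)[:, None]
    f_n = partial_sums[-1] / n                               # F_hat_n(z_j)
    return (partial_sums - k * f_n) / np.sqrt(n)


def ks_changepoint_statistic(residuals):
    """Kolmogorov-Smirnov type statistic T_n and the smallest maximizing index k."""
    g = np.abs(sequential_residual_process(residuals))
    row_sup = g.max(axis=1)               # sup over z for each k
    k_hat = int(np.argmax(row_sup)) + 1   # np.argmax returns the first maximizer
    return float(row_sup.max()), k_hat
```

The quadratic memory cost of the indicator matrix is acceptable for the sample sizes considered in Sect. 5; for much larger $n$ a rank-based update would be preferable.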
For the asymptotic distribution of the test statistic under the null hypothesis we assume the following conditions. Let $P$ denote the distribution of $(X_1, \varepsilon_1)$.
(a.1) $|\hat\alpha - \alpha| = o_P(1)$, $\|\hat\beta - \beta\| = o_P(1)$.
(a.2) Let $\varepsilon_1,\dots,\varepsilon_n$ be independent and identically distributed with cdf $F$ that is Hölder-continuous of order $\gamma\in(0,1]$ with Hölder-constant $c$.
(a.3) $P(\hat\beta - \beta \in \mathcal{B}) \to 1$ as $n\to\infty$ for a class $\mathcal{B}\subset H$ such that the function class
$$\mathcal{F} = \{(x,e)\mapsto I\{e \le v + \langle x, b\rangle\} \mid v\in\mathbb{R},\ b\in\mathcal{B}\}$$
is $P$-Donsker.
Remark 2.1 The assumptions are very mild and in particular less restrictive than typical assumptions for the asymptotic distribution of residual-based empirical processes, even for finite-dimensional covariates. In assumption (a.1) only consistency is needed, no rates of convergence. Typically in the literature about residual-based procedures a bounded error density is assumed, see e.g. Akritas and Van Keilegom (2001). Then (a.2) is fulfilled for $\gamma = 1$, but (a.2) is less restrictive in the cases $\gamma\in(0,1)$. Suitable conditions for the general assumption (a.3) are discussed in Sect. 3. One possibility for $H = L^2([0,1])$ is to assume smoothness of $\beta$, which is a typical assumption. If $\gamma\in(\tfrac12, 1]$ in assumption (a.2), and $\beta$ is in a Sobolev-space with third derivatives, (a.3) holds for the estimator $\hat\beta$ from Yuan and Cai (2010). This estimator can also be applied for smaller $\gamma$ in (a.2) if higher smoothness of $\beta$ is assumed.
Define the process $G_n$ as $\hat G_n$, but based on the true errors instead of residuals, i.e.
$$G_n(t,z) = \frac{\lfloor nt\rfloor}{n^{1/2}}\left(F_{\lfloor nt\rfloor}(z) - F_n(z)\right)$$
with
$$F_{\lfloor nt\rfloor}(z) = \frac{1}{\lfloor nt\rfloor}\sum_{i=1}^{\lfloor nt\rfloor} I\{\varepsilon_i \le z\}. \tag{2.1}$$
Further let $G$ be a completely tucked Brownian sheet, i.e. a centered Gaussian process on $[0,1]^2$ with covariance structure
$$\mathrm{Cov}(G(s,u), G(t,v)) = (s\wedge t - st)(u\wedge v - uv).$$
Theorem 2.2 Under the assumptions (a.1)–(a.3),
$$\sup_{t\in[0,1]}\,\sup_{z\in\mathbb{R}} |\hat G_n(t,z) - G_n(t,z)| = o_P(1), \tag{2.2}$$
and thus the process $(\hat G_n(t,z))_{t\in[0,1],\,z\in\mathbb{R}}$ converges weakly to $(G(t,F(z)))_{t\in[0,1],\,z\in\mathbb{R}}$.
The proof of (2.2) in the theorem is given in the appendix. The weak convergence of $G_n$ is a classical result, see Bickel and Wichura (1971), Shorack and Wellner (1986). With the continuous mapping theorem one obtains the asymptotic distribution of the test statistic $T_n$ under the null hypothesis of no change-point, which is the distribution of $T = \sup_{t,u\in[0,1]} |G(t,u)|$ because $F$ is continuous. The test statistic is asymptotically distribution-free with the same limit distribution as for corresponding change-point tests based on iid observations (not residuals). Let $\bar\alpha\in(0,1)$ and $q$ be the $(1-\bar\alpha)$-quantile of $T$. Then the test that rejects the null hypothesis if $T_n > q$ has asymptotic level $\bar\alpha$. Consistency is considered in Sect. 4.
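Because the limit distribution does not depend on $F$, the quantile $q$ can be approximated by simulation. The sketch below is an illustration under stated assumptions, not the procedure used in the paper (which relies on tabulated critical values): it evaluates the statistic on iid uniform samples of a fixed size $n$ and returns the empirical $(1-\bar\alpha)$-quantile. The function names, the default number of repetitions and the seed are hypothetical choices, and the result is only a finite-sample Monte Carlo approximation of $q$.

```python
import numpy as np


def ks_statistic(sample):
    """T_n computed from an observed iid sample (true errors, not residuals)."""
    x = np.asarray(sample, dtype=float)
    n = x.size
    ind = (x[:, None] <= np.sort(x)[None, :]).astype(float)
    csum = np.cumsum(ind, axis=0)                 # k * F_k(z_j)
    k = np.arange(1, n + 1)[:, None]
    return float(np.abs(csum - k * csum[-1] / n).max() / np.sqrt(n))


def simulate_critical_value(n=100, level=0.05, reps=500, seed=0):
    """Monte Carlo approximation of the (1 - level)-quantile of T_n under the null."""
    rng = np.random.default_rng(seed)
    stats = [ks_statistic(rng.uniform(size=n)) for _ in range(reps)]
    return float(np.quantile(stats, 1.0 - level))
```

A test at level $\bar\alpha$ would then reject when the statistic computed from the residuals exceeds the simulated value; increasing `reps` reduces the Monte Carlo error.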
Remark 2.3 The choice of $T_n$ as a Kolmogorov–Smirnov type test statistic is not mandatory. In principle, any continuous functional of the process $\hat G_n$ can be considered. The most common ones, besides $T_n$, are of Cramér–von Mises type, e.g.
$$T_{n,2} = \sup_{t\in[0,1]}\int_{\mathbb{R}} |\hat G_n(t,z)|^2\, dF(z) \quad\text{or}\quad T_{n,3} = \int_0^1\!\!\int_{\mathbb{R}} |\hat G_n(t,z)|^2\, dF(z)\, dt.$$
The asymptotic distribution of these test statistics under the null hypothesis also follows from Theorem 2.2 together with
$$\int_{\mathbb{R}} |\hat G_n(t,z)|^2\, dF(z) \to \int_{\mathbb{R}} |G(t,F(z))|^2\, dF(z) = \int_0^1 |G(t,x)|^2\, dx,$$
and thus these test statistics are asymptotically distribution-free as well. However, $T_{n,2}$ and $T_{n,3}$ contain the unknown quantity $F$ and must therefore be modified in order to be applied. This can be done by replacing the integral with the sample mean:
$$\tilde T_{n,2} = \sup_{t\in[0,1]}\frac{1}{n}\sum_{i=1}^{n} |\hat G_n(t,\hat\varepsilon_i)|^2 \quad\text{and}\quad \tilde T_{n,3} = \int_0^1 \frac{1}{n}\sum_{i=1}^{n} |\hat G_n(t,\hat\varepsilon_i)|^2\, dt.$$
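The two feasible Cramér–von Mises type statistics can be computed from the same array of process values as $T_n$. The following sketch assumes residuals given as a one-dimensional array; the function name `cvm_changepoint_statistics` is hypothetical. Since $t\mapsto\hat G_n(t,z)$ is constant on each interval $[k/n,(k+1)/n)$, the integral over $t$ reduces to an average over $k$.

```python
import numpy as np


def cvm_changepoint_statistics(residuals):
    """Return (T2, T3): the sup over t and the integral over t of
    (1/n) * sum_i |G_hat_n(t, eps_hat_i)|^2."""
    eps = np.asarray(residuals, dtype=float)
    n = eps.size
    ind = (eps[:, None] <= eps[None, :]).astype(float)   # I{eps_i <= eps_j}
    csum = np.cumsum(ind, axis=0)                        # k * F_hat_k(eps_j)
    k = np.arange(1, n + 1)[:, None]
    g = (csum - k * csum[-1] / n) / np.sqrt(n)           # G_hat_n(k/n, eps_j)
    mean_sq = (g ** 2).mean(axis=1)                      # sample mean over z = eps_j
    t2 = float(mean_sq.max())
    t3 = float(mean_sq.sum() / n)                        # step function in t, width 1/n
    return t2, t3
```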
3 Discussion of assumptions and examples
To show validity of the Donsker-class assumption (a.3) there are sufficient conditions on covering numbers or bracketing numbers.
We discuss some specific conditions on the class B, examples for Hilbert spaces H, and estimators for the parameter function β
that fulfill the conditions.
3.1 VC-class condition
Assumption (a.3) can be derived from a VC-function class condition formulated as follows. Assume that $P(\hat\beta - \beta\in\mathcal{B}) \to 1$ as $n\to\infty$ for a class $\mathcal{B}\subset H$ such that the class of maps
$$\{H\to\mathbb{R},\ x\mapsto \langle x, b\rangle + v \mid b\in\mathcal{B},\ v\in\mathbb{R}\} \tag{3.1}$$
is a VC-subgraph class. By definition then
$$\{\{(x,e)\in H\times\mathbb{R} \mid e \le \langle x, b\rangle + v\} \mid b\in\mathcal{B},\ v\in\mathbb{R}\}$$
is a VC-class of sets. The class $\mathcal{F}$ from (a.3) is the class of the corresponding indicator functions and (a.3) is fulfilled by Theorems 8.19 and 9.2 in Kosorok (2008).
Example 3.1 We consider the Hilbert space $H = L^2([0,1])$ with inner product $\langle g, h\rangle = \int_0^1 g(t)h(t)\,dt$ and norm $\|g\| = (\int_0^1 g^2(t)\,dt)^{1/2}$. For the parameter function $\beta$ we assume sparsity as in Lee and Park (2012). Let $(\phi_j)_{j\in\mathbb{N}}$ be a basis of $H$ and assume $\beta = \sum_{j\in J}\beta_j\phi_j$ for some finite, but unknown index set $J$. Lee and Park (2012) consider the estimator $\hat\beta = \sum_{j=1}^{k}\hat\beta_j\phi_j$ with
$$(\hat\beta_1,\dots,\hat\beta_k) = \arg\min_{b_1,\dots,b_k\in\mathbb{R}} \left\{\frac{1}{n}\sum_{i=1}^{n}\Big(Y_i - \bar Y_n - \sum_{j=1}^{k} b_j\langle X_i - \bar X_n, \phi_j\rangle\Big)^2 + \sum_{j=1}^{k}\hat w_j |b_j|\right\},$$
where $k$ is a chosen dimension-cut-off, $\hat w_j$ are suitable weights based on initial estimators, and $\bar Y_n = \frac{1}{n}\sum_{i=1}^{n} Y_i$, $\bar X_n = \frac{1}{n}\sum_{i=1}^{n} X_i$. Further, $\hat\alpha = \bar Y_n - \langle\hat\beta, \bar X_n\rangle$. Under suitable assumptions, in particular $E\|X\|^2 < \infty$ and $k$ larger than the largest index in $J$, Lee and Park (2012) show in their Theorem 2 that $P(\hat\beta_j = 0 \text{ for } j\notin J) \to 1$ for $n\to\infty$. Thus we can set
$$\mathcal{B} = \Big\{\sum_{j\in J} b_j\phi_j \;\Big|\; b_j\in\mathbb{R}\ \forall j\in J\Big\}$$
and obtain $P(\hat\beta - \beta\in\mathcal{B}) \to 1$ for $n\to\infty$. Further, the class of maps in (3.1), i.e.
$$\Big\{H\to\mathbb{R},\ x\mapsto \sum_{j\in J} b_j\langle x,\phi_j\rangle + v \;\Big|\; b_j\in\mathbb{R}\ \forall j\in J,\ v\in\mathbb{R}\Big\},$$
is a finite dimensional vector space and thus a VC-class, see Lemma 2.6.15 in van der Vaart and Wellner (1996). Then, as discussed above, validity of (a.3) follows. Furthermore, from Theorem 2 in Lee and Park (2012) it also follows that our assumption (a.1) is fulfilled, and thus under assumption (a.2) the assertion of Theorem 2.2 holds.
3.2 Bracketing number condition
In this subsection we assume that $H$ is a separable Hilbert space of real-valued functions (or vectors with real components) and the inner product is increasing in the sense that from $h\le g$ (pointwise for functions; componentwise for vectors) it follows that $\langle h, x\rangle \le \langle g, x\rangle$ for all $x\in H$ with $x\ge 0$. Then assumption (a.3) can be replaced by the condition in the next lemma.
Lemma 3.2 Assume (a.1), (a.2) and $P(\hat\beta - \beta\in\mathcal{B}) \to 1$ as $n\to\infty$ for a function class $\mathcal{B}\subset H$ such that the bracketing number fulfills $\log N_{[\,]}(\mathcal{B}, \epsilon, \|\cdot\|) \le K/\epsilon^{1/k}$ for some $k > 1/\gamma$. Here $\gamma$ is the Hölder-order from assumption (a.2). Then assumption (a.3) holds.
The proof is given in the appendix.
Example 3.3 We consider the Hilbert space $H = L^2([0,1])$ with inner product $\langle g, h\rangle = \int_0^1 g(t)h(t)\,dt$ and norm $\|g\| = (\int_0^1 g^2(t)\,dt)^{1/2}$. We assume $\beta\in W_2^m([0,1])$ for some $m > 2$ and the Sobolev-space
$$W_2^m([0,1]) = \big\{b:[0,1]\to\mathbb{R} \;\big|\; b^{(j)} \text{ is absolutely continuous for } j = 0,\dots,m-1, \text{ and } \|b^{(m)}\| < \infty\big\},$$
where $b^{(0)} = b$, and $b^{(j)}$ denotes the $j$-th derivative of $b$, $j\ge 1$. We consider the regularized estimators in Yuan and Cai (2010), i.e.
$$(\hat\alpha, \hat\beta) = \arg\min_{a\in\mathbb{R},\, b\in W_2^m([0,1])} \left\{\frac{1}{n}\sum_{i=1}^{n}\big(Y_i - (a + \langle X_i, b\rangle)\big)^2 + \lambda_n\|b^{(m)}\|^2\right\}$$
for a suitable positive sequence $\lambda_n$ converging to zero. Convergence rates of $\hat\beta$ and its derivatives can be found in Corollaries 10 and 11 in Yuan and Cai (2010). Under suitable assumptions one obtains $\|\hat\beta^{(j)} - \beta^{(j)}\| = o_P(1)$ for $j = 0, 1, 2$, and thus $P(\hat\beta - \beta\in\mathcal{B}) \to 1$ for the function class
$$\mathcal{B} = \big\{b\in W_2^2[0,1] : \|b\| + \|b^{(2)}\| \le 1\big\}.$$
By Corollary 4.3.38 in Giné and Nickl (2021) and Lemma 9.21 in Kosorok (2008) the class $\mathcal{B}$ fulfills the bracketing number condition in Lemma 3.2 for $k = 2$. Thus the assumptions (a.1)–(a.3) are fulfilled if $F$ is Hölder-continuous of order $\gamma\in(\tfrac12, 1]$. Less restrictive assumptions on $F$, i.e. $\gamma\le\tfrac12$, require for this concept higher smoothness of $\beta$.
4 Fixed one-change point alternative: consistency of the test and change point estimator
In this section we consider fixed alternatives with one change point at index $k_n^* = \lfloor n\vartheta^*\rfloor$ with $\vartheta^*\in(0,1)$. We write the functional linear model as in Sect. 2 under the following assumption.
(a.2)' Assume $\varepsilon_1,\dots,\varepsilon_{k_n^*}$ are iid with cdf $F_1$, and $\varepsilon_{k_n^*+1},\dots,\varepsilon_n$ are iid with cdf $F_2\ne F_1$. Let $F_1$ and $F_2$ be Hölder-continuous of order $\gamma_1,\gamma_2\in(0,1]$ with Hölder-constant $c_1$, $c_2$, respectively.
Let further $P_1$ denote the distribution of $(X_1,\varepsilon_1)$ (before the change) and $P_2$ denote the distribution of $(X_n,\varepsilon_n)$ (after the change). For the empirical distribution functions $\hat F_k$ and $\tilde F_k$ as in Sect. 2 we obtain the following asymptotic result.
Lemma 4.1 Under assumptions (a.1) and (a.2)' and if (a.3) is valid for $P = P_1$ and $P = P_2$, it holds that
$$\sup_{z\in\mathbb{R}} |\hat F_{k_n^*}(z) - F_1(z)| = o_P(1) \quad\text{and}\quad \sup_{z\in\mathbb{R}} |\tilde F_{k_n^*}(z) - F_2(z)| = o_P(1).$$
The proof is given in the appendix. Now note that
$$\frac{T_n}{n^{1/2}} \ge \frac{k_n^*(n - k_n^*)}{n^2}\,\sup_{z\in\mathbb{R}}\big|\hat F_{k_n^*}(z) - \tilde F_{k_n^*}(z)\big|,$$
and by Lemma 4.1 the right hand side converges in probability to the positive constant
$$\vartheta^*(1-\vartheta^*)\,\sup_{z\in\mathbb{R}} |F_1(z) - F_2(z)|.$$
From this it follows that tests that reject the null hypothesis of no change-point if $T_n > q$ for some $q > 0$ (see Sect. 2) are consistent.
The estimator for the change point $\vartheta^*$ is based on the process $\hat G_n$ and is defined as
$$\hat\vartheta_n = \min\Big\{t : \sup_{z\in\mathbb{R}} |\hat G_n(t,z)| = \sup_{t'\in[0,1]}\,\sup_{z\in\mathbb{R}} |\hat G_n(t',z)|\Big\}.$$
Lemma 4.2 Under assumptions (a.1), (a.2)' and if (a.3) holds for $P = P_1$ and $P = P_2$, the change point estimator is consistent, i.e. $|\hat\vartheta_n - \vartheta^*| = o_P(1)$.
The proof is given in the appendix.
5 Finite sample properties
We consider the Hilbert space $H = L^2([0,1])$. For $i = 1,\dots,n$ the functional observations $X_i(t)$, $t\in[0,1]$, are generated according to
$$X_i(t) = \frac{1}{2}\sum_{l=1}^{5}\Big(B_{i,l}\sin\big(t(5-B_{i,l})2\pi - M_{i,l}\big) - E\big[B_{i,l}\sin\big(t(5-B_{i,l})2\pi - M_{i,l}\big)\big]\Big),$$
where $B_{i,l}\sim U[0,5]$ and $M_{i,l}\sim U[0,2\pi]$ for $l = 1,\dots,5$, $i = 1,\dots,n$. $U$ stands for the (continuous) uniform distribution. The functional linear model is built as
$$Y_i = \int X_i(t)\,\gamma_{3,1/3}(t)\,dt + \varepsilon_i,$$
where the coefficient function $\gamma_{a,b}(t) = \frac{b^a}{\Gamma(a)}\,t^{a-1}e^{-bt}\,I\{t>0\}$ is the density of the Gamma distribution. Furthermore, we assume that each $X_i$ is observed on a dense, equidistant grid of 300 evaluation points.
The parameter estimators are the regularized estimators described in Example 3.3 with $m = 3$ and a data-driven tuning parameter $\lambda_n$ chosen by generalized cross-validation as described in Yuan and Cai (2010).
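The data-generating process can be sketched in a few lines. The code below is an illustration under several assumptions that are not stated in the text: $B_{i,l}$ and $M_{i,l}$ are taken to be independent (so that the centering expectation vanishes, because the sine of a uniformly distributed phase on $[0,2\pi]$ has mean zero), the coefficient function is read as the Gamma density with $a = 3$ and $b = 1/3$, the integral is taken over $[0,1]$ and approximated by the trapezoidal rule on the observation grid, and the errors are standard normal as under the null hypothesis. The function names are hypothetical. The regularized estimation step of Yuan and Cai (2010) is not reproduced here.

```python
import math

import numpy as np


def gamma_density(t, a=3.0, b=1.0 / 3.0):
    """Gamma density b^a / Gamma(a) * t^(a-1) * exp(-b t) for t > 0, else 0."""
    t = np.asarray(t, dtype=float)
    return np.where(t > 0, b ** a / math.gamma(a) * t ** (a - 1) * np.exp(-b * t), 0.0)


def simulate_sample(n, grid_size=300, seed=0):
    """Generate curves X_i on an equidistant grid and scalar responses Y_i."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, grid_size)
    b = rng.uniform(0.0, 5.0, size=(n, 5, 1))
    m = rng.uniform(0.0, 2.0 * np.pi, size=(n, 5, 1))
    # Centering term omitted: it is zero when M is uniform on [0, 2*pi]
    # and independent of B (an assumption of this sketch).
    x = 0.5 * (b * np.sin(t * (5.0 - b) * 2.0 * np.pi - m)).sum(axis=1)
    integrand = x * gamma_density(t)
    dt = t[1] - t[0]
    # Trapezoidal rule on the equidistant grid.
    signal = dt * (integrand.sum(axis=1) - 0.5 * (integrand[:, 0] + integrand[:, -1]))
    eps = rng.standard_normal(n)
    return t, x, signal + eps
```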
We model three similar types of change points, such that
$$\varepsilon_1,\dots,\varepsilon_{\lfloor n/2\rfloor}\sim N(0,1), \qquad \varepsilon_{\lfloor n/2\rfloor+1},\dots,\varepsilon_n\sim\tilde F_{1,\delta}\ \ (\text{respectively } \tilde F_{2,\delta},\ \tilde F_{3,\delta}),$$
where $\tilde F_{1,\delta}$, $\tilde F_{2,\delta}$, $\tilde F_{3,\delta}$ have in common that the mean remains zero and the variance remains one. In particular
• $\tilde F_{1,\delta}$ is the distribution function of a random variable that is $N(-2\delta, 1)$ distributed with probability 0.5 and $N(2\delta, 1)$ distributed with probability 0.5.
• $\tilde F_{2,\delta}$ is the distribution function of a random variable that is $N(0, (1-\delta)^2)$ distributed with probability 0.5 and $N(0, 2-(1-\delta)^2)$ distributed with probability 0.5.
• $\tilde F_{3,\delta}$ is the "skew-normal"-distribution
$$SN\left(-\sqrt{\frac{2\pi\big((10\delta)^2 + (10\delta)^4\big)}{\pi^2 + (2\pi^2 - 2\pi)(10\delta)^2 + (\pi^2 - 2\pi)(10\delta)^4}},\ \sqrt{\frac{\pi\big(1 + (10\delta)^2\big)}{\pi + (\pi-2)(10\delta)^2}},\ 10\delta\right).$$
A random variable $Z$ is distributed $SN(\lambda_1,\lambda_2,\lambda_3)$ if $Z = \lambda_1 + \lambda_2 Z_0$ and $Z_0$ has the density $2\varphi(x)\Phi(\lambda_3 x)$, where $\varphi$ is the density and $\Phi$ is the distribution function of the standard normal distribution [see Azzalini and Capitanio (1999)]. The expected value of such a random variable $Z$ is calculated as $\lambda_1 + \lambda_2\sqrt{2/\pi}\,\lambda_3/\sqrt{1+\lambda_3^2}$ and the variance as $\lambda_2^2\big(1 - \frac{2}{\pi}\cdot\frac{\lambda_3^2}{1+\lambda_3^2}\big)$. This results in the parameters for the distribution after the change point, such that the expected value of the errors remains 0 and the variance remains 1.
Fig. 1 Rejection probabilities (in percent, plotted against δ for n = 100 and n = 200) with a change in the error distribution from $N(0,1)$ to $\tilde F_{1,\delta}$ (left), to $\tilde F_{2,\delta}$ (middle) and to $\tilde F_{3,\delta}$ (right). The dotted line marks the 5% level
So δ = 0 represents the null hypothesis of no change point, and the difference between the distribution before and after the change
point grows with δ in all three cases.
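For the skew-normal case, the standardization can be checked numerically from the moment formulas. The sketch below assumes $\delta\ge 0$, location and scale parameters chosen as $\lambda_1 = -\sqrt{2\pi(\lambda_3^2+\lambda_3^4)/(\pi^2+(2\pi^2-2\pi)\lambda_3^2+(\pi^2-2\pi)\lambda_3^4)}$ and $\lambda_2 = \sqrt{\pi(1+\lambda_3^2)/(\pi+(\pi-2)\lambda_3^2)}$ with $\lambda_3 = 10\delta$, and the standard skew-normal mean and variance; the function names are hypothetical.

```python
import math


def skew_normal_parameters(delta):
    """Location, scale and shape of SN(lam1, lam2, lam3) with lam3 = 10 * delta (delta >= 0)."""
    lam3 = 10.0 * delta
    s2, s4 = lam3 ** 2, lam3 ** 4
    pi = math.pi
    lam1 = -math.sqrt(
        2 * pi * (s2 + s4) / (pi ** 2 + (2 * pi ** 2 - 2 * pi) * s2 + (pi ** 2 - 2 * pi) * s4)
    )
    lam2 = math.sqrt(pi * (1 + s2) / (pi + (pi - 2) * s2))
    return lam1, lam2, lam3


def skew_normal_moments(lam1, lam2, lam3):
    """Mean and variance of Z = lam1 + lam2 * Z0, Z0 with density 2 phi(x) Phi(lam3 x)."""
    shrink = lam3 / math.sqrt(1 + lam3 ** 2)
    mean = lam1 + lam2 * math.sqrt(2 / math.pi) * shrink
    var = lam2 ** 2 * (1 - 2 / math.pi * shrink ** 2)
    return mean, var
```

For $\delta = 0$ the parameters reduce to $(0, 1, 0)$, i.e. the standard normal distribution, which matches the null hypothesis.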
In Fig. 1 the rejection probabilities for 500 repetitions, level 5% [critical value tabled in Picard (1985)] and sample sizes $n\in\{100, 200\}$ are shown. In all three cases it can be seen that the level is approximated well and the power increases for increasing parameter δ as well as for increasing sample size n. In the case of a change in skewness, the increase with δ is not as pronounced as in the other two cases. This is because the distributions for different values of δ become more similar as δ increases. The same types of changes (from $N(0,1)$ to $\tilde F_{1,\delta}$ and to $\tilde F_{2,\delta}$) were also simulated in Selk and Neumeyer (2013) for a real-valued nonparametric autoregression model with lag 1. The results are comparable with an even higher power in the paper at hand.
In addition, we model a more distinct change, that is
$$\varepsilon_1,\dots,\varepsilon_{\lfloor n/2\rfloor}\sim N(0, 0.5^2), \qquad \varepsilon_{\lfloor n/2\rfloor+1},\dots,\varepsilon_n\sim N(0, (0.5+\delta)^2).$$
As expected for a change in the variance the power grows faster with increasing δ than in the other three cases, especially for small δ. The results are shown in Fig. 2. This kind of change point was also simulated in Neumeyer and Van Keilegom (2009) for a nonparametric regression model with one-dimensional regressor. The results are similar.
Fig. 2 Rejection probabilities (in percent, plotted against δ for n = 100 and n = 200) with a change in the error distribution from $N(0, 0.5^2)$ to $N(0, (0.5+\delta)^2)$. The dotted line marks the 5% level
Next to the Kolmogorov–Smirnov type test with test statistic $T_n$, we also applied Cramér–von Mises type tests with test statistics $T_{n,2}$ and $T_{n,3}$. The results are very similar and are not presented here for the sake of brevity.
6 Concluding remarks
To detect structural changes in functional linear models, we considered the classical test by Bickel and Wichura (1971) for a change in the distribution, but based on estimated errors. We gave simple assumptions under which the asymptotic distribution of the test statistic under the null is the same as for iid data. The test as well as the corresponding change point estimators are consistent in one-change point models. The same test can be considered in more complex regression models with functional covariates, e.g. a quadratic model as in Boente and Parada (2023) or nonparametric models, see Ferraty and Vieu (2006). We only consider independent data, but testing for change points in the innovation distribution in time series models that include functional covariates is a very interesting topic. However, the proofs for asymptotic distributions will be more complicated. In future work we are planning to consider a time series model $Y_t = m(X_t) + \varepsilon_t$, where $Y_t$ and $\varepsilon_t$ are real-valued and $X_t$ contains a functional part, but can also contain past values $Y_{t-1},\dots,Y_{t-p}$. We presume that the proofs as in Selk and Neumeyer (2013) (for nonparametric autoregression time series with independent errors) and of Sect. 4.2 in Neumeyer and Omelka (2025) (for linear models with finite-dimensional covariates and beta-mixing errors) can be combined with the proofs in the paper at hand to consider change-point tests for the innovation distribution under the assumption that $(X_t,\varepsilon_t)$, $t\in\mathbb{Z}$, is a strictly stationary beta-mixing time series.
In the proof of Theorem 2.2 we derive an expansion for the sequential residual-based empirical distribution function,
$$\hat F_{\lfloor nt\rfloor}(z) = F_{\lfloor nt\rfloor}(z) + R_n(z) + o_P(n^{-1/2})$$
uniformly in $t\in[0,1]$, $z\in\mathbb{R}$, where $F_{\lfloor nt\rfloor}$ is defined in (2.1), and the term
$$R_n(z) = E_X\big[F(z + \hat\alpha - \alpha + \langle X, \hat\beta - \beta\rangle)\big] - F(z)$$
appears from estimating the parameters [see (A.1) in the appendix]. Here $E_X$ denotes the expectation with respect to $X$, which has the same distribution as $X_i$, but is independent of $\hat\alpha$, $\hat\beta$. For change-point testing the remainder term $R_n$ cancels when considering the test statistic $T_n$. For other testing procedures, e.g. goodness-of-fit tests for the error distribution, this typically nonnegligible term is of relevance, see Koul (2002) and Neumeyer et al. (2006) for linear models and Akritas and Van Keilegom (2001) for nonparametric regression. Under more restrictive assumptions one can further expand the remainder term as follows. Assume that $F$ is twice differentiable with density $F' = f$ and bounded $f'$, and further $|\hat\alpha - \alpha| + \|\hat\beta - \beta\| = o_P(n^{-1/4})$ and $E\|X\|^2 < \infty$. Then by Taylor's expansion one obtains
$$\hat F_{\lfloor nt\rfloor}(z) = F_{\lfloor nt\rfloor}(z) + f(z)\big(\hat\alpha - \alpha + \langle E[X], \hat\beta - \beta\rangle\big) + o_P(n^{-1/2}).$$
In models with intercept $\alpha$, where the estimator for $\alpha$ is chosen as $\hat\alpha = \bar Y_n - \langle\bar X_n, \hat\beta\rangle$, the remainder term is
$$R_n(z) = f(z)\big(\bar\varepsilon_n - \langle\bar X_n - E[X], \hat\beta - \beta\rangle\big) + o_P(n^{-1/2}).$$
(Here, $\bar X_n = n^{-1}\sum_{i=1}^{n} X_i$ and analogously for $\bar Y_n$ and $\bar\varepsilon_n$.) By the Cauchy–Schwarz inequality and the central limit theorem for $n^{1/2}(\bar X_n - E[X])$ one obtains that the dominating part of the remainder term is $f(z)\bar\varepsilon_n$. This is the same as in homoscedastic finite-dimensional linear models with intercept and nonparametric regression models. Note that $\alpha$ and $\beta$ are identifiable if the kernel of the covariance operator of the covariate $X$ is $\{0\}$. But often functional linear models without intercept are considered in the literature. So in our model assume $\alpha = \hat\alpha = 0$. Then the remainder term is
$$R_n(z) = f(z)\langle E[X], \hat\beta - \beta\rangle + o_P(n^{-1/2}),$$
and $\langle x, \hat\beta - \beta\rangle$ for fixed $x\in H$ typically has a slower rate than $n^{-1/2}$, see Cardot et al. (2007), Shang and Cheng (2015), Yeon et al. (2023). If one assumes $E[X] = 0$, then this problematic term cancels [similar as e.g. for centered ARMA-processes, see Bai (1994)], but otherwise $f(z)\langle E[X], \hat\beta - \beta\rangle$ will dominate the asymptotic distribution of the process $(\hat F_{\lfloor nt\rfloor}(z) - F(z))_{t\in[0,1],\,z\in\mathbb{R}}$. For our change-point test this dominating term vanishes. The same holds when estimating the conditional copula of the response in multidimensional functional linear models, given the covariate, see Theorem 5 in Neumeyer and Omelka (2025). But e.g. for goodness-of-fit tests it would be of relevance. We consider goodness-of-fit tests for the error distribution in the different cases explained above in future work. With the derived expansion for residual empirical distribution functions one can also develop other tests for the error distribution as e.g. for symmetry, or equality of error distributions in different models, see e.g. Pierce and Kopecky (1979), Neumeyer et al. (2005), Pardo Fernandez (2007), among many others, in the cases of regression models with finite-dimensional covariates.
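The step from the general Taylor expansion to the intercept case can be spelled out as follows. This is a short sketch using only the model equation, the choice $\hat\alpha = \bar Y_n - \langle\bar X_n,\hat\beta\rangle$, and the rate assumptions stated above; $\bar Y_n$, $\bar X_n$, $\bar\varepsilon_n$ denote sample means.

```latex
\begin{align*}
\bar Y_n &= \alpha + \langle \bar X_n, \beta\rangle + \bar\varepsilon_n,\\
\hat\alpha - \alpha &= \bar Y_n - \langle \bar X_n, \hat\beta\rangle - \alpha
  = \bar\varepsilon_n - \langle \bar X_n, \hat\beta - \beta\rangle,\\
\hat\alpha - \alpha + \langle E[X], \hat\beta - \beta\rangle
  &= \bar\varepsilon_n - \langle \bar X_n - E[X], \hat\beta - \beta\rangle,\\
\big|\langle \bar X_n - E[X], \hat\beta - \beta\rangle\big|
  &\le \|\bar X_n - E[X]\|\,\|\hat\beta - \beta\|
  = O_P(n^{-1/2})\, o_P(n^{-1/4}) = o_P(n^{-1/2}).
\end{align*}
```

Hence only $f(z)\bar\varepsilon_n$, which is of exact order $n^{-1/2}$, survives in the leading term, regardless of the slow rate of $\hat\beta$.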
A Proofs
For ease of notation let (X, Y, ε) be some generic random variable with the same distribution as (X1 , Y1, ε1) under the null, but
independent from the sample (Xi , Yi ), i = 1,..., n. Let P denote the distribution of (X, ε). Further let EX denote the
expectation with respect to X , which in the context below is the conditional expectation given (Xi , Yi ), i = 1,..., n.
The proofs of Theorem 2.2 and Lemma 3.2 are similar to a part of the proof of Theorem 5 in Neumeyer and Omelka (2025), but hold under less restrictive assumptions.
A.1 Proof of Theorem 2.2
From the Donsker-property in assumption (a.3) and Corollary 9.31 in Kosorok (2008) it follows that
$$\{(x,e)\mapsto I\{e\le z + a + \langle x, b\rangle\} - I\{e\le z\} \mid z\in\mathbb{R},\ a\in\mathbb{R},\ b\in\mathcal{B}\}$$
is also $P$-Donsker. From Theorem 2.12.1 in van der Vaart and Wellner (1996) it follows that also the centered sequential process
$$H_n(t,z,a,b) = \frac{1}{\sqrt{n}}\sum_{i=1}^{\lfloor nt\rfloor}\Big(I\{\varepsilon_i\le z + a + \langle X_i, b\rangle\} - I\{\varepsilon_i\le z\} - E[F(z + a + \langle X, b\rangle)] + F(z)\Big),$$
indexed in $t\in[0,1]$, $z\in\mathbb{R}$, $a\in\mathbb{R}$, $b\in\mathcal{B}$, converges weakly to a centered Gaussian process. Thus the process $H_n$ is asymptotically stochastic equicontinuous with respect to the semi-metric
$$\rho((t_1,z_1,a_1,b_1),(t_2,z_2,a_2,b_2)) = |t_1 - t_2| + \mathrm{Var}\big(I\{\varepsilon\le z_1 + a_1 + \langle X, b_1\rangle\} - I\{\varepsilon\le z_2 + a_2 + \langle X, b_2\rangle\}\big),$$
see van der Vaart and Wellner (1996), problem 2, p. 93, and Sect. 2.12. In particular we need
$$\mathrm{Var}\big(I\{\varepsilon\le z + a + \langle X, b\rangle\} - I\{\varepsilon\le z\}\big) \le \big|E[F(z + a + \langle X, b\rangle) - F(z)]\big| \le c\,E\big[|a + \langle X, b\rangle|^{\gamma}\big] \le c\big(|a| + \|b\|\,E\|X\|\big)^{\gamma}$$
by Cauchy–Schwarz and Jensen's inequality. Now setting $a = \hat\alpha - \alpha$ and $b = \hat\beta - \beta$ we obtain convergence to zero in probability by assumption (a.1). Thus from asymptotic stochastic equicontinuity of the process $H_n$, and $H_n(t,z,0,0) = 0$, we obtain that
$$\sup_{t\in[0,1],\,z\in\mathbb{R}} |H_n(t,z,\hat\alpha - \alpha,\hat\beta - \beta)| = o_P(1),$$
which means that
$$\frac{\lfloor nt\rfloor}{\sqrt{n}}\hat F_{\lfloor nt\rfloor}(z) = \frac{\lfloor nt\rfloor}{\sqrt{n}}F_{\lfloor nt\rfloor}(z) + \frac{\lfloor nt\rfloor}{\sqrt{n}}\Big(E_X\big[F(z + \hat\alpha - \alpha + \langle X,\hat\beta - \beta\rangle)\big] - F(z)\Big) + o_P(1) \tag{A.1}$$
uniformly in $t\in[0,1]$, $z\in\mathbb{R}$, where $F_{\lfloor nt\rfloor}$ was defined in (2.1) and is based on the true errors. In particular for $t = 1$ we have
$$\hat F_n(z) = F_n(z) + E_X\big[F(z + \hat\alpha - \alpha + \langle X,\hat\beta - \beta\rangle)\big] - F(z) + o_P(n^{-1/2})$$
uniformly in $z\in\mathbb{R}$. From those expansions we obtain
$$\hat G_n(t,z) = \frac{\lfloor nt\rfloor}{\sqrt{n}}\hat F_{\lfloor nt\rfloor}(z) - \frac{\lfloor nt\rfloor}{\sqrt{n}}\hat F_n(z) = \frac{\lfloor nt\rfloor}{\sqrt{n}}F_{\lfloor nt\rfloor}(z) - \frac{\lfloor nt\rfloor}{\sqrt{n}}F_n(z) + o_P(1) = G_n(t,z) + o_P(1) \tag{A.2}$$
uniformly in $t\in[0,1]$, $z\in\mathbb{R}$. □
A.2 Proof of Lemma 3.2
Let $\epsilon > 0$ and let
$$[b_i^L, b_i^U], \quad i = 1,\dots,N(\epsilon) = O\big(\exp(\epsilon^{-2/(k\gamma)})\big)$$
be brackets for $\mathcal{B}$ of $\|\cdot\|$-length $\epsilon^{2/\gamma}$ [see assumption (a.3)]. Now for $b\in[b_i^L, b_i^U]$ the indicator function $I\{e\le v + \langle x, b\rangle\}$ is contained in the bracket
$$\Big[I\{e\le v + \langle xI\{x\ge 0\}, b_i^L\rangle + \langle xI\{x<0\}, b_i^U\rangle\},\ I\{e\le v + \langle xI\{x\ge 0\}, b_i^U\rangle + \langle xI\{x<0\}, b_i^L\rangle\}\Big]$$
for each $v\in\mathbb{R}$. Further the above bracket has $L_2(P)$-length
$$\begin{aligned}
&\Big(E\Big[\big(I\{\varepsilon\le v + \langle XI\{X\ge 0\}, b_i^U\rangle + \langle XI\{X<0\}, b_i^L\rangle\} - I\{\varepsilon\le v + \langle XI\{X\ge 0\}, b_i^L\rangle + \langle XI\{X<0\}, b_i^U\rangle\}\big)^2\Big]\Big)^{1/2}\\
&\quad\le \Big(E\big[F(v + \langle X, b_i^U\rangle) - F(v + \langle X, b_i^L\rangle)\big]\Big)^{1/2}\\
&\quad\le \Big(E\big[c\,|\langle X, b_i^U - b_i^L\rangle|^{\gamma}\big]\Big)^{1/2}\\
&\quad\le c^{1/2}\,(E\|X\|)^{\gamma/2}\,\|b_i^U - b_i^L\|^{\gamma/2}\\
&\quad= O(\epsilon),
\end{aligned}$$
by assumption (a.2), Cauchy–Schwarz and Jensen's inequality. Similar to the proof of Lemma 1 in Akritas and Van Keilegom (2001) one obtains an upper bound $O(\epsilon^{-2}\exp(\epsilon^{-2/k}))$ for the $L_2(P)$-bracketing number of the class $\mathcal{F}$. Thus $\mathcal{F}$ is $P$-Donsker by the bracketing integral condition in Theorem 19.5 of van der Vaart (1998). □
A.3 Proof of Lemma 4.1
To show the assertion for $\hat F_{k_n^*}$ we use the arguments as in the proof of Theorem 2.2 for the process $H_n$, but based on the iid sample $(X_1,Y_1),\dots,(X_{k_n^*},Y_{k_n^*})$ before the change. Then as in the proof of Theorem 2.2 asymptotic stochastic equicontinuity of the process $H_{k_n^*}$ holds and thus
$$\sup_{t\in[0,1],\,z\in\mathbb{R}} |H_{k_n^*}(t,z,\hat\alpha - \alpha,\hat\beta - \beta)| = o_P(1).$$
Here, $\hat\alpha$ and $\hat\beta$ depend on the whole sample and assumption (a.1) is used. Thus as in Eq. (A.1) we obtain
$$\hat F_{k_n^*}(z) = F_{k_n^*}(z) + E_{X_1}\big[F_1(z + \hat\alpha - \alpha + \langle X_1,\hat\beta - \beta\rangle)\big] - F_1(z) + o_P\Big(\frac{1}{\sqrt{n}}\Big),$$
uniformly in $z\in\mathbb{R}$. By the classical Glivenko–Cantelli result $F_{k_n^*}$ converges uniformly almost surely to $F_1$. By assumptions (a.1), (a.2)' and $E\|X_1\| < \infty$ the remainder term is $o_P(1)$ and the assertion for $\hat F_{k_n^*}$ follows. The assertion for $\tilde F_{k_n^*}$ can be shown analogously. □
A.4 Proof of Lemma 4.2
First note that
zeR
ˆ ˆ
n n
t e[0,1] t e[0,1]
ϑ e arg max sup |G (t, z)| = arg max sup
#
zeR
ˆ
G (t, z)
n1/2
n
$
.
Further it holds
Gˆ n (t, z)
n1/2
=
n2
Lnt] (n — Lnt]) 1
Lnt ]
Lnt ]
Σ
i =1
1
I {εˆi ≤ z}—
n — Lnt ]
n
Σ
i =Lnt ]+1
I {εˆi ≤ z}
= Lnt](n
n2
#
— Lnt]) 1
Lnt ]
∗
Lnt ] L
∧ nϑ ]
Σ
i =1
1
I {εˆi ≤ z}+ I {t > ϑ∗}
Lnt ]
Lnt ]
Σ
i =Lnϑ∗]+1
I {εˆi ≤ z}
1
—
n — Lnt ]
n
Σ
i =Lnt ] L
∨ nϑ∗]+1
1
I {εˆi ≤ z}— I {t < ϑ∗
}
n — Lnt ]
∗
Lnϑ ]
Σ
i =Lnt ]+1
I {εˆi ≤ z}
$
= Lnt] (n — L
n2
#
nt]) L ∗
nt ]∧ Lnϑ ]
Lnt ] 1 ∗
F (z) + I {t > ϑ }
∗
Lnt ]— Lnϑ ]
Lnt ] 2
F (z)
—
∗
n — Lnt ]∨ Lnϑ ]
n — Lnt ] 2 ∗
F (z) — I {t < ϑ }
∗
Lnϑ ]— Lnt ]
n — Lnt ]
$
F1(z) + oP(1),
since we have
sup sup
t e[0,ϑ∗] zeR
Lnt] 1 Lnt ]
Σ
n Lnt ] i =1
i 1
I {εˆ ≤ z}— F (z)
≤ sup sup
t e[0,ϑ ] zeR ∗
Lnt] 1
∗ Lnϑ ] Lnt ]
Lnt ]
Σ
i =1
I {εˆi ≤ z}—
1
∗
Lnϑ ]
∗
Lnϑ ]
Σ
i =1
i
I {εˆ ≤ z}
c ˛' I
= 1
Lnϑ∗]
˜
1/2 Lnϑ ]
G ∗ (t,z)
(A.3)
+ Lnt] 1
∗ ∗
Lnϑ ] Lnϑ ]
∗
Lnϑ ]
Σ
i =1
i 1
I {εˆ ≤ z}— F (z) (A.4)
= oP(1).
Here we have used Lemma 4.1 for the term (A.4). Further G˜ Lnϑ∗] is defined as Gˆ n
based on the iid-sample (X1 , Y1), . . . , (Xkn
∗ , Ykn
∗ ), but where the residuals are built
with αˆ , βˆ based on the whole sample. With the same argument as in the proof of Theorem 2.2 it holds that
G˜ Lnϑ∗](t, z) = GLnϑ∗](t, z) + oP(1) 1 3
33 Page 16 of 17 N. Neumeyer
, L. Selk
Analogously one can show that sup
∗ sup
t e[ϑ ,1] zeR
n—Lnt] 1
n n—Lnt ]
Σ n
i =Lnt ]+ 1 I {εˆi ≤
2 P
z}— F (z) = o (1 ).
Thus, it holds uniformly in t e [0, 1]
ˆ n
G (t , z)
n1/2
∗
= I {t > ϑ } ∗
Lnϑ ](n — Lnt ])
n2
1 2
(F (z) — F (z))
∗
+ I {t ≤ ϑ }
∗
Lnt ](n — Lnϑ ] )
n2
1 2 P
(F (z) — F (z)) + o (1 )
= I {t > ϑ∗}ϑ∗(1 — t) + I {t ≤ ϑ∗}t(1 — ϑ∗) (F1(z) — F2(z)) + oP(1).
The assertion then follows by Theorem 2.12 in Kosorok (2008), as $\vartheta^*$ is a well-separated maximum of $t \mapsto I\{t>\vartheta^*\}\,\vartheta^*(1-t) + I\{t\le\vartheta^*\}\,t(1-\vartheta^*)$. □
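The estimator analysed in this proof can be evaluated directly once residuals are available. The following Python sketch is not taken from the paper. It assumes the residuals are given as a one-dimensional array (in the demo, simulated errors stand in for them, so no functional regression is fitted), it uses the representation $\hat G_n(t,z) = \lfloor nt\rfloor\, n^{-1/2}\,(\hat F_{\lfloor nt\rfloor}(z) - \hat F_n(z))$ from Sect. 2, and the function name `change_point_estimate` is hypothetical.

```python
import numpy as np


def change_point_estimate(residuals):
    """Return (theta_hat, T_n) computed from a 1-d array of residuals.

    theta_hat is the smallest t maximising sup_z |G_n(t, z)| and T_n is the
    Kolmogorov-Smirnov type statistic sup_t sup_z |G_n(t, z)|.
    """
    eps = np.asarray(residuals, dtype=float)
    n = eps.size
    if n < 2:
        raise ValueError("need at least two residuals")

    # G_n(t, .) is a step function in z that jumps only at residual values,
    # so the supremum over z can be taken over the observed residuals.
    z = np.sort(eps)

    # ind[i, j] = 1{eps_i <= z_j}; cum[k-1, j] = sum_{i <= k} 1{eps_i <= z_j}
    ind = (eps[:, None] <= z[None, :]).astype(float)
    cum = np.cumsum(ind, axis=0)

    k = np.arange(1, n)[:, None]        # candidate indices k = 1, ..., n-1
    F_first_k = cum[:-1] / k            # empirical cdf of the first k residuals
    F_all = cum[-1] / n                 # empirical cdf of all n residuals
    G = k / np.sqrt(n) * (F_first_k - F_all)   # G_n(k/n, z_j)

    sup_z = np.abs(G).max(axis=1)       # sup over z for each k
    k_hat = int(np.argmax(sup_z)) + 1   # argmax returns the first maximiser
    return k_hat / n, float(sup_z.max())


if __name__ == "__main__":
    # Illustration only: N(0, 1) errors before and N(0, 3^2) errors after the
    # change at theta* = 0.5 (values chosen for this demo, not from the paper).
    rng = np.random.default_rng(0)
    n, k_star = 400, 200
    eps = np.concatenate([rng.normal(0.0, 1.0, k_star),
                          rng.normal(0.0, 3.0, n - k_star)])
    theta_hat, T_n = change_point_estimate(eps)
    print(f"theta_hat = {theta_hat:.3f}, T_n = {T_n:.3f}")
```

The sketch stores an n × n indicator matrix, which is convenient for moderate sample sizes such as those in Sect. 5 but would need a more economical implementation for large n.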
Acknowledgements The authors are grateful to the Editors and Guest Editors for the organization of the Special Issue “Goodness-of-Fit, Change-Point, and
Related Problems”, and to the referees, the Associate Editor and the Guest Editor Simos Meintanis for their constructive comments and interesting ideas to expand
the topic.
Funding Open Access funding enabled and organized by Projekt DEAL.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and
reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons
licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence,
unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not
permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this
licence, visit http://creativecommons.org/licenses/by/4.0/.
References
Akritas MG, Van Keilegom I (2001) Non-parametric estimation of the residual distribution. Scand J Stat 28(3):549–567
Aston JAD, Kirch C (2012) Detecting and estimating changes in dependent functional data. J Multivar Anal 109:204–220
Aue A, Gabrys R, Horváth L, Kokoszka P (2009) Estimation of a change-point in the mean function of functional data. J Multivar Anal 100(10):2254–2269
Aue A, Hörmann S, Horváth L, Hušková M (2014) Dependent functional linear models with applications to monitoring structural change. Stat Sin 24:1043–1073
Aue A, Rice G, Sönmez O (2018) Detecting and dating structural breaks in functional data without dimension reduction. J R Stat Soc Ser B Stat Methodol 80(3):509–529
Azzalini A, Capitanio A (1999) Statistical applications of the multivariate skew-normal distribution. J R Stat Soc Ser B Stat Methodol 61:579–602
Cardot H, Mas A, Sarda P (2007) CLT in functional linear regression models. Probab Theory Relat Fields 138:325–361
Carlstein E (1988) Nonparametric change-point estimation. Ann Stat 16(1):188–197
Csörgö M, Horváth L, Szyszkowicz B (1997) Integral tests for suprema of Kiefer processes with application. Stat Decis 15:365–377
Dümbgen L (1991) The asymptotic behavior of some nonparametric change-point estimators. Ann Stat 19(3):1471–1495
Ferraty F, Vieu P (2006) Nonparametric functional data analysis, Springer series in statistics. Springer, New York
Giné E, Nickl R (2021) Mathematical foundations of infinite-dimensional statistical models. Cambridge series in statistical and probabilistic mathematics. Cambridge University Press
Hariz SB, Wylie JJ, Zhang Q (2005) Nonparametric change-point estimation for dependent sequences. CR Math 341(10):627–630
Hariz SB, Wylie JJ, Zhang Q (2007) Optimal rate of convergence for nonparametric change-point estimators for nonstationary sequences. Ann Stat 35(4):1802–1826
Horváth L, Kokoszka P (2012) Inference for functional data with applications, Springer series in statistics. Springer, New York
Horvath et al (2024) Variable selection based testing for parameter changes in regression with autoregressive dependence. J Bus Econ Stat 42(4):1331–1343
Kosorok MR (2008) Introduction to empirical processes and semiparametric inference. Springer, New York
Koul HL (1996) Asymptotics of some estimators and sequential residual empiricals in nonlinear time series. Ann Stat 24:380–404
Koul HL (2002) Weighted empirical processes in dynamic nonlinear models, Lecture notes in statistics. Springer, New York
Lee ER, Park BU (2012) Sparse estimation in functional linear regression. J Multivar Anal 105(1):1–17
Ling S (1998) Weak convergence of the sequential empirical processes of residuals in nonstationary autoregressive models. Ann Stat 26:741–754
Neumeyer N, Van Keilegom I (2009) Change-point tests for the error distribution in non-parametric regression. Scand J Stat 36(3):518–541
Neumeyer N, Dette H, Nagel E-R (2005) A note on testing symmetry of the error distribution in linear regression models. J Nonparametr Stat 17(6):697–715
Neumeyer N, Dette H, Nagel E-R (2006) Bootstrap tests for the error distribution in linear and nonparametric regression models. Austr New Zealand J Stat 48(2):129–156
Neumeyer N, Omelka M (2025) Generalized Hadamard differentiability of the copula mapping and its applications. Bernoulli (to appear)
Pardo Fernandez JC (2007) Comparison of error distributions in nonparametric regression. Stat Probab Lett 77:350–356
Picard D (1985) Testing and estimating change-points in time series. Adv Appl Probab 17:841–867
Pierce DA, Kopecky KJ (1979) Testing goodness of fit for the distribution of errors in regression models. Biometrika 66(1):1–5
Selk L, Neumeyer N (2013) Testing for a change of the innovation distribution in nonparametric autoregression: the sequential empirical process approach. Scand J Stat 40(4):770–788
Shang Z, Cheng G (2015) Nonparametric inference in generalized functional linear models. Ann Stat 43(4):1742–1773
Shorack GR, Wellner JA (1986) Empirical processes with applications to statistics. Wiley
van der Vaart A, Wellner JA (1996) Weak convergence and empirical processes. Springer, New York
van der Vaart A (1998) Asymptotic statistics. Cambridge series in statistical and probabilistic mathematics.

  • 1. Statistical Papers (2025) 66:33 https://doi.org/10.1007/s00362- 024-01656-9 REGULAR ARTICLE Testing for changes in the error distribution in functional linear models Natalie Neumeyer1 · Leonie Selk1 Received: 3 April 2024 / Revised: 6 November 2024 © The Author(s) 2025 Abstract We consider linear models with scalar responses and covariates from a separable Hilbert space. The aim is to detect change points in the error distribution, based on sequential residual empirical distribution functions. Expansions for those estimated functions are more challenging in models with infinite-dimensional covariates than in regression models with scalar or vector- valued covariates due to a slower rate of convergence of the parameter estimators. Yet the suggested change point test is asymptotically distribution-free and consistent for one-change point alternatives. In the latter case we also show consistency of a change point estimator. Keywords Change-points · Functional data analysis · Regularized function estimators · Regression · Residual processes Mathematics Subject Classification Primary 62R10; Secondary 62G10 · 62G30 1 Introduction We consider a functional linear model Y = α + (X,β) 2 + ε with scalar response Y and covariates X from a separable Hilbert space, e.g. L ([0, 1]). Structural changes in the distribution can appear, even when the parameters α and β do not change. For this reason we focus on detecting changes in the error distribution. If the errors were observable one could use the classical test (and change point estimators) based on the difference of the sequential empirical distribution functions of the first Lnt ] and the last n — Lnt ] error terms from a sample of n observations, see Csörgö et al. (1997), Picard (1985), Carlstein (1988), Dümbgen (1991), Hariz et al. (2005) and Hariz et al. (2007). In a regression model those tests have to be based on estimated residuals εˆ = Y — αˆ — (X, βˆ). 
Similar tests have been considered by Bai (1994) in 1 Fachbereich Mathematik, Universität Hamburg, Hamburg, Germany 1 3
  • 2. 33 Page 2 of 17 N. Neumeyer , L. Selk the context of ARMA-models, by Koul (1996) in the context of nonlinear time series, by Ling (1998) for nonstationary autoregressive models, and by Neumeyer and Van Keilegom (2009) and Selk and Neumeyer (2013) for nonparametric independent and time series regression models. Typically the asymptotic distribution is derived using asymptotic expansions of residual-based empirical distribution functions. For models with functional covariates those expansions can be problematic because inner products (X, βˆ — β) appear and those can have a slow rate of convergence [see Cardot et al. (2007), Shang and Cheng (2015), Yeon et al. (2023)]. However, we show that under very simple non-restrictive assumptions those terms cancel for the suggested change point test statistic and thus the asymptotic distribution is the same as based on true (unobserved) errors. Change point testing and estimation for functional data, and for the parameter in functional linear models have been considered in the literature, but not for the error distribution. Tests for changes in the functional mean and in the parameter function of autoregressive models are considered in chapters 6 and 14 in Horváth and Kokoszka (2012). Berkes et al. (2009) propose a CUSUM testing procedure to detect a change in the mean of functional observations. They apply projections on principal components of the data to estimate the mean. Aue et al. (2009) extend this result and introduce an estimator for the change point in this model and derive its limit distribution. Aston and Kirch (2012) consider the same type of model with epidemic changes and dependent data. Aue et al. (2018) consider how to detect and date structural breaks in the mean of functional observations without the application of dimension reduction techniques (as functional principal component analysis). Aue et al. 
( 2014) propose a monitoring procedure to detect structural changes in functional linear models with functional response, allowing for dependence in the data, including functional autoregressive processes. They test for a change in the regression operator, which is the analogue to our β, based on functional principal component analysis. A linear regression model with scalar response is considered in Horváth et al. (2024) who propose a tests for the detection of multiple change points in the regression parameter. The regressors in their model can be functional and can include lagged values of the response. The paper is organized as follows. In Sect. 2 we define the test statistic and present model assumptions to obtain the asymptotically distribution-free hypothesis test. In Sect. 3 we discuss the assumptions on the parameter estimators and some examples. In Sect. 4 consistency of the test as well as of a change point estimator is considered in the context of one change point. Finite sample properties are shown in Sect. 5. Section 6 concludes the paper, in particular with an outlook on goodness-of-fit testing. The proofs are given in the appendix. 2 Model, test statistic and main result under the null Let H be a separable Hilbert space with inner product (·, ·), corresponding norm ·  and Borel-sigma field. Let (Xi , Yi ), i = 1,..., n, be an independent sample of (H × R)-valued random variables defined on the same probability space with probability 1 3
  • 3. Testing for changes in the error distribution… Page 3 of 17 33 measure P. The data are modeled as functional linear model Yi = α + (Xi ,β)+ εi , i = 1,..., n, with scalar response Yi and H-valued covariate Xi , and with parameters α e R, βin H . The covariates X1 ,..., Xn are assumed to be iid with E Xi < ∞, and the errors ε1 ,..., εn are independent, centered, and independent of the covariates. Our aim is to test for change-points in the error distribution. In this section we consider the test statistic under the null hypothesis, where the errors are identically distributed. Let αˆ and βˆ denote estimators for the parameters α e R and β e H. We build residuals εˆi = Yi — αˆ — (Xi , βˆ), i = 1 . . . , n. The test statistic Tn = sup sup |Gˆ n (t, z)| t e[0,1] zeR based on the process ˆ n G (t, z) = Lnt] (n — Lnt]) n3/2 ˆ F ˜ Lnt ] Lnt ] (z) — F (z) , compares for each k = 1,..., n — 1 the empirical distribution functions ˆk F (z) = 1 Σ k i =1 ˜k I {εˆi ≤ z}, F (z) = 1 k n Σ n — k i =k+1 i I {εˆ ≤ z} of the first k and last n — k residuals, respectively. Note that one can write ˆ n G (t, z) = Lnt] n1/2 F ˆ ˆ Lnt ] n (z) — F (z) . For the asymptotic distribution of the test statistic under the null hypothesis we assume the following conditions. Let P denote the distribution of (X1 , ε1). (a.1) |αˆ — α|= oP(1), βˆ — β = oP(1) (a.2) Let ε1 ,..., εn be independent and identically distributed with cdf F that is Hölder-continuous of order γ e (0, 1] with Hölder-constant c. (a.3) P βˆ — β e B → 1 as n → ∞ for a class B ⊂ H such that the function class F = {(x, e) '→ I {e ≤ v + (x, b)} | v e R, b e B} is P-Donsker. Remark 2.1 The assumptions are very mild and in particular less restrictive than typical assumptions for asymptotic distribution of residual-based empirical processes, even for finite-dimensional covariates. In assumption (a.1) only consistency is needed, no rates of convergence. 
Typically in the literature about residual-based procedures a bounded error density is assumed, see e.g. Akritas and Van Keilegom (2001). Then 1 3
  • 4. 33 Page 4 of 17 N. Neumeyer , L. Selk (a.2) is fulfilled for γ = 1, but (a.2) is less restrictive in the cases γ e (0, 1). Suitable conditions for the general assumption (a.3) are discussed in Sect. 3. One possibility for H = L2([0, 1]) is to assume smoothness of β which is a typical assumption. If 1 2 γ e ( , 1 ] in assumption (a.2), and β is in a Sobolev-space with third derivatives, (a.2) holds for the estimator βˆ from Yuan and Cai (2010). This estimator can also be applied for smaller γ in (a.2) if higher smoothness of β is assumed. Define the process Gn as Gˆ n , but based on the true errors instead of residuals, i.e. n G (t, z) = L nt] n 1/2 L ] F nt (z) — Fn(z) with Lnt ] F (z) = 1 Lnt ] Lnt ] Σ i =1 i I {ε ≤ z}. (2.1) Further let G be a completely tucked Brownian sheet, i.e. a centered Gaussian process on [0, 1]2 with covariance structure Cov(G(s, u), G(t, v)) = (s ∧ t — st)(u ∧ v — uv). Theorem 2.2 Under the assumptions (a.1)–(a.3), sup sup |Gˆ n (t, z) — Gn (t, z)|= oP(1), (2.2) t e[0,1] zeR and thus the process (Gˆ n (t, z))t e[0,1],zeR converges weakly to (G(t, F(z)))t e[0,1],zeR. The proof of (2.2) in the theorem is given in the appendix. The weak convergence of Gn is a classical result, see Bickel and Wichura (1971), Shorack and Wellner (1986). With the continuous mapping theorem one obtains the asymptotic distribution of the test statistic Tn under the null hypothesis of no change-point, which is the distribution of T = supt,ue[0,1] |G(t, u)| because F is continuous. The test statistic is asymptotically distribution-free with the same limit distribution as for corresponding changepoint tests based on iid observations (not residuals). Let α¯ e (0, 1) and q be the (1 — α¯ )- quantile of T . Then the test that rejects the null hypothesis if Tn > q has asymptotic level α¯ . Consistency is considered in Sect. 4. Remark 2.3 The choice of Tn as a Kolmogorov–Smirnov type test statistic is not mandatory. 
In principle, any continuous functional of the process Gˆ n can be con- sidered. The most common ones, besides Tn , are of Cramér-von–Mises type, e. g. Tn,2 = supt . ˆ 2 |G (t, z)| dF 1 . . e[0,1] R 0 R ˆ n n,3 n 2 (z) or T = |G (t, z)| dF(z)dt . The asymptotic distribution of these test statistics under the null hypothesis also follows from Theorem 2.2 and with ˆ n |G (t, z)| dF (z) → R R 2 2 ∫ ∫ ∫ 1 0 2 |G(t, F(z))| dF(z) = |G(t, x)| dx, 1 3
  • 5. Testing for changes in the error distribution… Page 5 of 17 33 and thus these test statistics are asymptotically distribution-free as well. However, Tn,2 and Tn,3 contain the unknown quantity F and must therefore be modified in order to be applied. This can be done by replacing the integral with the sample mean: T˜n,2 = supt e[ , ] n 1 n 0 1 i =1 ˆ |G n i 2 ˜ (t, εˆ )| and T n,3 = 1 0 n Σ . Σ 1 n i =1 ˆ |G n i 2 (t, εˆ )| dt . 3 Discussion of assumptions and examples To show validity of the Donsker-class assumption (a.3) there are sufficient conditions on covering numbers or bracketing numbers. We discuss some specific conditions on the class B, examples for Hilbert spaces H, and estimators for the parameter function β that fulfill the conditions. 1. VC-class condition Assumption (a.3) can be derived from a VC-function class condition formulated as ˆ follows. Assume that P β — β e B → 1 as n → ∞ for a class B ⊂ H such that the class of maps {H → R, x '→ (x, b)+ v | b e B,v e R} is a VC-subgraph class. By definition then (3.1) {{(x, e) e H × R | e ≤ (x, b)+ v}| b e B,v e R} is a VC-class of sets. The class F from (a.3) is the class of the corresponding indicator functions and (a.3) is fulfilled by Theorems 8.19 and 9.2 in Kosorok (2008). Example 3.1 We consider the Hilbert space H = L2([0, 1]) with inner product . . 1 1 0 0 2 1/2 (g, h)= g(t)h(t) dt and norm g = ( g (t) dt) . For the parameter function β we assume sparsity as in Lee and Park (2012). Let (φ j ) j eN be a basis of H and assume β = Σ j eJ j j β φ for some finite, but unknown index set J . Lee and Park ˆ (2012) consider the estimator β = Σ k j =1 βˆj φ j with (βˆ1,..., βˆk ) = arg b min , , 1 1,...,bk eR n n Σ i =1 k Σ j =1 2 k Σ Yi — Yn — b j (Xi — Xn ,φj ) + wˆ j |b j | , , j =1 , 2 where k is a chosen dimension-cut-off, wˆ j are suitable weights based on initial esti- n mators, and Y = n i =1 i n Y , X = 1 n Σ Σ 1 n n Xi . Further, αˆ = Yn — (βˆ, Xn ). 
Under i =1 2 suitable assumptions, in particular E X < ∞, and k is larger than the largest index in J , Lee and Park (2012) show in their Theorem 2 that P(βˆj = 0 for j e/ J) → 1 for 1 3
  • 6. 33 Page 6 of 17 N. Neumeyer , L. Selk n → ∞. Thus we can set B = ⎧ ⎨ Σ ⎩ j eJ j j j b φ b e R ∀ j e J ⎫ ⎬ ⎭ ⎨ ⎩ Σ j eJ j j H → R, x '→ b (x,φ )+ j v b e R ∀ j e J,v e R and obtain P(βˆ — β e B) → 1 for n → ∞. Further, the class of maps in (3.1), i.e. ⎧ ⎫ ⎬ ⎭ is a finite dimensional vector space and thus a VC-class, see Lemma 2.6.15 in van der Vaart and Wellner (1996). Then as discussed above validity of (a.3) follows. Fur- thermore, from Theorem 2 in Lee and Park (2012) it also follows that our assumption (a.1) is fulfilled, and thus under assumption (a.2) the assertion of Theorem 2.2 holds. 3.2 Bracketing number condition In this subsection we assume that H is a separable Hilbert space of real-valued func- tions (or vectors with real components) and the inner product is increasing in the sense that from h ≤ g (pointwise for functions; componentwise for vectors) it follows that (h, x )≤ (g, x ) for all x e H with x ≥ 0. Then assumption (a.3) can be replaced by the condition in the next lemma. ˆ Lemma 3.2 Assume (a.1), (a.2) and P β — β e B → 1 as n → ∞ for a function class B ⊂ H such that the bracketing number fulfills log N[ ] (B,‹, ·  ) ≤ K /‹1/k for some k > 1/γ . Here γ is the Hölder-order from assumption (a.2). Then assumption (a.3) holds. The proof is given in the appendix. Example 3.3 We consider the Hilbert space H = L2([0, 1]) with inner product . . 1 1 0 0 2 1/2 (g, h) = g(t)h(t) dt and norm g = ( g (t) dt) . We assume β e m 2 W ([0, 1 ]) for some m > 2 and the Sobolev-space m 2 W ([0, 1 ]) = b : [ ( j) 0, 1]→ R | b is absolutely continuous for j = 0,..., m — 1, b(m) and < ∞ } , where b(0) = b, and b( j) denotes the j -th derivative of b, j ≥ 1. We consider the regularized estimators in Yuan and Cai (2010 ), i.e. ˆ αˆ , β = arg min m 1 aeR,beW2 ([0,1]) n n Σ i =1 n Yi — a + (Xi , b) + λ b (m) 2 ¨ ¨ ¨ ¨ 2 1 3
  • 7. Testing for changes in the error distribution… Page 7 of 17 33 for a suitable positive sequence λn converging to zero. Convergence rates of βˆ and its derivatives can be found in Corollaries 10 and 11 in Yuan and Cai (2010). Under ( j) ( j) ¨ ˆ ¨ P suitable assumptions one obtains β — β = o (1 ) for j = 0, 1, 2, and thus P(βˆ — β e B) → 1 for the function class 2 2 B = b e W [0, b(2) 1] :b +  ≤ 1 } . By Corollary 4.3.38 in Giné and Nickl (2021) and Lemma 9.21 in Kosorok (2008) the class B fulfills the bracketing number condition in Lemma 3.2 for k = 2. Thus the 1 2 assumptions (a.1)–(a.3) are fulfilled if F is Hölder-continuous of order γ e ( , 1 ]. Less 1 2 restrictive assumptions on F , i.e. γ ≤ , require for this concept higher smoothness of β. 4 Fixed one-change point alternative: consistency of the test and change point estimator In this section we consider fixed alternatives with one change point at index kn ∗ = Lnϑ∗] with ϑ∗ e (0, 1). We write the functional linear model as in Sect. 2 under the following assumption. (a.2)’ Assume ε1 ,..., εkn ∗ are iid with cdf F1, and εkn ∗+1,..., εn are iid with cdf F2 /= F1. Let F1 and F2 be Hölder-continuous of order γ1, γ2 e (0, 1] with Hölder-constant c1, c2, respectively. Let further P1 denote the distribution of (X1 , ε1) (before the change) and P2 denote the distribution of (Xn , εn ) (after the change). For the empirical distribution functions Fˆk and F˜k as in Sect. 2 we obtain the following asymptotic result. Lemma 4.1 Under assumptions (a.1) and (a.2)’ and if (a.3) is valid for P = P1 and P = P2, it holds that sup |Fˆkn ∗ (z) — F1(z)|= oP(1) and sup |F˜kn ∗ (z) — F2(z)|= oP(1). zeR zeR The proof is given in the appendix. Now note that ≥ n ∗ ∗ n Tn k (n — k ) n1/2 n2 zeR sup F k ∗ n ˆ ˜ k ∗ n (z) — F (z) , and by Lemma 4.1 the right hand side converges in probability to the positive constant ϑ∗(1 — ϑ∗) sup |F1(z) — F2(z)| . 
zeR From this it follows that tests that reject the null hypothesis of no change-point if Tn > q for some q > 0 (see Sect. 2) are consistent. 1 3
  • 8. n ˆ ˆ ϑ = min t : sup |G ˆ (t, z)|= sup sup |G n n r 33 Page 8 of 17 N. Neumeyer , L. Selk The estimator for the change point ϑ∗ is based on the process Gˆ n and is defined as } (t , z)| . zeR t re[0,1] zeR Lemma 4.2 Under assumptions (a.1), (a.2)’ and if (a.3) holds for P = P1 and P = P2, the change point estimator is consistent, i. e. |ϑˆn — ϑ∗ |= oP(1). The proof is given in the appendix. 5 Finite sample properties We consider the Hilbert space H = L2([0, 1]). For i = 1 ,..., n the functional i X (t) = 1 2 observations Xi (t), t e [0, 1], are generated according to 5 Σ l=1 i ,l i ,l i ,l i ,l i ,l i ,l B sin t(5 — B )2π — M — E [B sin (5 — B )2π — M ] , where Bi,l ∼ U [0, 5] and Mi,l ∼ U [0, 2π ] for l = 1,..., 5, i = 1 ,..., n. U stands for the (continuous) uniform distribution. The functional linear model is built as ∫ i i 3, 3 i Y = X (t)γ 1 (t)dt + ε , where the coefficient function γa,b(t) = ba/ Г(a)ta—1e—bt I {t > 0} is the density of the Gamma distribution. Furthermore, we assume that each Xi is observed on a dense, equidistant grid of 300 evaluation points. The parameter estimators are the regularized estimators described in Example 3.3 with m = 3 and a data-driven tuning parameter λn chosen by generalized cross- validation as described in Yuan and Cai (2010). We model three similar types of change points, such that 1 L n 2 ε , . . . , ε ∼ N(0, 1), ε n 2 ] L ]+ ˜ ˜ ˜ 1 n 1,δ 2,δ 3,δ , . . . , ε ∼ F (respectively F , F ), where F˜1, δ , F˜2, δ , F˜3, δ have in common that the mean remains zero and the variance remains one. In particular • F˜1, δ is the distribution function of a random variable that is N(—2δ, 1) distributed with probability 0.5 and N(2δ, 1) distributed with probability 0.5. • F˜2, δ is the distribution function of a random variable that is N(0,(1 — δ)2) dis- tributed with probability 0.5 and N(0, 2 — (1 — δ)2) distributed with probability 0.5. 1 3
  • 9. Testing for changes in the error distribution… Page 9 of 17 33 0 2 0 4 0 6 0 8 0 1 0 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 delta Rejection in percent n=200 n=100 0 2 0 4 0 6 0 8 0 1 0 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 delta Rejection in percent n=200 n=100 0 2 0 4 0 6 0 8 0 1 0 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 delta Rejection in percent n=200 n=100 Fig. 1 Rejection probabilities with a change in the error distribution from N(0, 1) to F˜1, δ (left), to F˜2,δ (middle) and to F˜3, δ (right). The dotted line marks the 5% level • F˜3, δ is the “skew-normal”-distribution SN — , , , , 2π (10δ)2 + (10δ)4 π 1 + (10δ) 2 , , 10δ . π 2 + 2π 2 — 2π · (10δ)2 + π 2 — 2π · (10δ)4 π + (π — 2) · (10δ)2 , , A random variable Z is distributed SN(λ1 , λ2 , λ3) if Z = λ1 + λ2 · Z0 and Z0 has the density 2φ(x)Ф(λ3 x), where φ is the density and Ф is the distribution function of the standard normal distribution [see Azzalini and Capitanio (1999)]. 1 2 , π 2 λ 3 The expected value of such a random variable Z is calculated as λ +λ ·, 1+λ3 2 2 π and the variance as λ 1 — · 2 2 λ3 2 1+λ3 2 . This results in the parameters for the distribution after the change point, such that the the expected value of the errors remains 0 and the variance remains 1. So δ = 0 represents the null hypothesis of no change point, and the difference between the distribution before and after the change point grows with δ in all three cases. In Fig. 1 the rejection probabilities for 500 repetitions, level 5% [critical value tabled in Picard (1985)] and sample sizes n e {100, 200} are shown. In all three cases it can be seen that the level is approximated well and the power increases for increasing parameter δ as well as for increasing sample size n. In the case of a change in skewness, the increase with δ is not as pronounced as in the other two cases. This is because the distributions for different values of δ become more similar as δ increases. 
The same types of changes (from $N(0,1)$ to $\tilde F_{1,\delta}$ and to $\tilde F_{2,\delta}$) were also simulated in Selk and Neumeyer (2013) for a real-valued nonparametric autoregression model with lag 1. The results are comparable, with an even higher power in the paper at hand. In addition, we model a more distinct change, that is

$$\varepsilon_1,\dots,\varepsilon_{\lfloor n/2\rfloor}\sim N(0, 0.5^2),\qquad \varepsilon_{\lfloor n/2\rfloor+1},\dots,\varepsilon_n\sim N(0,(0.5+\delta)^2).$$

As expected for a change in the variance, the power grows faster with increasing $\delta$ than in the other three cases, especially for small $\delta$. The results are shown in Fig. 2. This kind of change point was also simulated in Neumeyer and Van Keilegom (2009)
for a nonparametric regression model with one-dimensional regressor. The results are similar.

Fig. 2 Rejection probabilities with a change in the error distribution from $N(0, 0.5^2)$ to $N(0,(0.5+\delta)^2)$. The dotted line marks the 5% level. [Plot: rejection in percent against delta, for n = 100 and n = 200]

Next to the Kolmogorov–Smirnov type test with test statistic $T_n$, we also applied Cramér–von Mises type tests with test statistics $T_{n,2}$ and $T_{n,3}$. The results are very similar and are not presented here for the sake of brevity.

6 Concluding remarks

To detect structural changes in functional linear models, we considered the classical test by Bickel and Wichura (1971) for a change in the distribution, but based on estimated errors. We gave simple assumptions under which the asymptotic distribution of the test statistic under the null is the same as for iid data. The test as well as the corresponding change point estimators are consistent in one-change point models.

The same test can be considered in more complex regression models with functional covariates, e.g. a quadratic model as in Boente and Parada (2023) or nonparametric models, see Ferraty and Vieu (2006). We only consider independent data, but testing for change points in the innovation distribution in time series models that include functional covariates is a very interesting topic. However, the proofs for asymptotic distributions will be more complicated. In future work we are planning to consider a time series model $Y_t = m(X_t) + \varepsilon_t$, where $Y_t$ and $\varepsilon_t$ are real-valued and $X_t$ contains a functional part, but can also contain past values $Y_{t-1},\dots,Y_{t-p}$. We presume that the proofs as in Selk and Neumeyer (2013) (for nonparametric autoregression time series with independent errors) and of Sect. 4.2 in Neumeyer and Omelka (2025) (for linear models with finite-dimensional covariates and beta-mixing errors) can be combined with the proofs in the paper at hand to consider change-point tests for the innovation distribution under the assumption that $(X_t, \varepsilon_t)$, $t\in\mathbb{Z}$, is a strictly stationary beta-mixing time series.
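For a concrete view of the procedure summarized in these remarks, the process $\hat G_n(t,z) = \frac{\lfloor nt\rfloor}{\sqrt n}\big(\hat F_{\lfloor nt\rfloor}(z) - \hat F_n(z)\big)$, the Kolmogorov–Smirnov type statistic $\sup_t\sup_z|\hat G_n(t,z)|$ and the change point estimator (the smallest maximizer in $t$) can be computed from a vector of residuals. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, NumPy is assumed, and the demonstration plugs in simulated true errors in place of residuals from a fitted functional linear model. Critical values are not computed here.

```python
import numpy as np


def change_point_process(residuals):
    """Return G with G[k-1, j] = (k / sqrt(n)) * (F_k(z_j) - F_n(z_j)),
    where F_k is the empirical distribution function of the first k residuals
    and z_j runs through the sorted residuals. Uses O(n^2) memory."""
    eps = np.asarray(residuals, dtype=float)
    n = eps.size
    z = np.sort(eps)
    ind = (eps[:, None] <= z[None, :]).astype(float)  # ind[i, j] = I{eps_i <= z_j}
    cum = np.cumsum(ind, axis=0)                       # cum[k-1, j] = k * F_k(z_j)
    k = np.arange(1, n + 1)[:, None]
    f_n = cum[-1] / n                                  # F_n(z_j)
    return (cum - k * f_n) / np.sqrt(n)


def ks_statistic_and_estimator(residuals):
    """Sup-type statistic and the smallest t = k/n attaining the maximum."""
    g = change_point_process(residuals)
    sup_z = np.abs(g).max(axis=1)       # sup over z for each k = 1, ..., n
    k_hat = int(np.argmax(sup_z)) + 1   # argmax returns the first maximizer
    return float(sup_z.max()), k_hat / len(residuals)


def simulate_variance_change(n, delta, rng):
    """Errors N(0, 0.5^2) up to floor(n/2), then N(0, (0.5 + delta)^2)."""
    half = n // 2
    first = rng.normal(0.0, 0.5, size=half)
    second = rng.normal(0.0, 0.5 + delta, size=n - half)
    return np.concatenate([first, second])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    errors = simulate_variance_change(200, delta=1.0, rng=rng)
    stat, t_hat = ks_statistic_and_estimator(errors)
    print("statistic:", stat, "estimated change point:", t_hat)
```

Because the process is a step function in $z$ that only jumps at residual values, evaluating it on the sorted residuals is enough to obtain the supremum over $z$.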
In the proof of Theorem 2.2 we derive an expansion for the sequential residual-based empirical distribution function,

$$\hat F_{\lfloor nt\rfloor}(z) = F_{\lfloor nt\rfloor}(z) + R_n(z) + o_P(n^{-1/2})$$

uniformly in $t\in[0,1]$, $z\in\mathbb{R}$, where $F_{\lfloor nt\rfloor}$ is defined in (2.1), and the term $R_n(z) = E_X[F(z+\hat\alpha-\alpha+\langle X,\hat\beta-\beta\rangle)] - F(z)$ appears from estimating the parameters [see (A.1) in the appendix]. Here $E_X$ denotes the expectation with respect to $X$, which has the same distribution as $X_i$, but is independent of $\hat\alpha$, $\hat\beta$. For change-point testing the remainder term $R_n$ cancels when considering the test statistic $T_n$. For other testing procedures, e.g. goodness-of-fit tests for the error distribution, this typically nonnegligible term is of relevance, see Koul (2002) and Neumeyer et al. (2006) for linear models and Akritas and Van Keilegom (2001) for nonparametric regression.

Under more restrictive assumptions one can further expand the remainder term as follows. Assume that $F$ is twice differentiable with density $F' = f$ and bounded $f'$, and further $|\hat\alpha-\alpha| + \|\hat\beta-\beta\| = o_P(n^{-1/4})$ and $E\|X\|^2 < \infty$. Then by Taylor's expansion one obtains

$$\hat F_{\lfloor nt\rfloor}(z) = F_{\lfloor nt\rfloor}(z) + f(z)\big(\hat\alpha-\alpha+\langle E[X],\hat\beta-\beta\rangle\big) + o_P(n^{-1/2}).$$

In models with intercept $\alpha$, where the estimator for $\alpha$ is chosen as $\hat\alpha = \bar Y_n - \langle\bar X_n,\hat\beta\rangle$, one has $\hat\alpha-\alpha = \bar\varepsilon_n - \langle\bar X_n,\hat\beta-\beta\rangle$, and the remainder term is

$$R_n(z) = f(z)\big(\bar\varepsilon_n - \langle\bar X_n - E[X],\hat\beta-\beta\rangle\big) + o_P(n^{-1/2}).$$

(Here, $\bar X_n = n^{-1}\sum_{i=1}^n X_i$ and analogously for $\bar Y_n$ and $\bar\varepsilon_n$.) By the Cauchy–Schwarz inequality and the central limit theorem for $n^{1/2}(\bar X_n - E[X])$ one obtains that the dominating part of the remainder term is $f(z)\bar\varepsilon_n$. This is the same as in homoscedastic finite-dimensional linear models with intercept and nonparametric regression models. Note that $\alpha$ and $\beta$ are identifiable if the kernel of the covariance operator of the covariate $X$ is $\{0\}$.

But often functional linear models without intercept are considered in the literature. So in our model assume $\alpha = \hat\alpha = 0$.
Then the remainder term is $R_n(z) = f(z)\langle E[X],\hat\beta-\beta\rangle + o_P(n^{-1/2})$, and $\langle x,\hat\beta-\beta\rangle$ for fixed $x\in H$ typically has a slower rate than $n^{-1/2}$, see Cardot et al. (2007), Shang and Cheng (2015), Yeon et al. (2023). If one assumes $E[X] = 0$, then this problematic term cancels [similarly as e.g. for centered ARMA processes, see Bai (1994)], but otherwise $f(z)\langle E[X],\hat\beta-\beta\rangle$ will dominate the asymptotic distribution of the process $(\hat F_{\lfloor nt\rfloor}(z) - F(z))_{t\in[0,1],\,z\in\mathbb{R}}$. For our change-point test this dominating term vanishes. The same holds when estimating the conditional copula of the response in multidimensional functional linear models, given the covariate, see Theorem 5 in Neumeyer and Omelka (2025). But e.g. for
would be of relevance. We consider goodness-of-fit tests for the error distribution in the different cases explained above in future work. With the derived expansion for residual empirical distribution functions one can also develop other tests for the error distribution as e.g. for symmetry, or equality of error distributions in different models, see e.g. Pierce and Kopecky (1979), Neumeyer et al. (2005), Pardo Fernandez (2007), among many others, in the cases of regression models with finite-dimensional covariates.

A Proofs

For ease of notation let $(X, Y, \varepsilon)$ be some generic random variable with the same distribution as $(X_1, Y_1, \varepsilon_1)$ under the null, but independent from the sample $(X_i, Y_i)$, $i = 1,\dots,n$. Let $P$ denote the distribution of $(X,\varepsilon)$. Further let $E_X$ denote the expectation with respect to $X$, which in the context below is the conditional expectation given $(X_i, Y_i)$, $i = 1,\dots,n$. The proofs of Theorem 2.2 and Lemma 3.2 are similar to a part of the proof of Theorem 5 in Neumeyer and Omelka (2025), but under less restrictive assumptions.

A.1 Proof of Theorem 2.2

From the Donsker property in assumption (a.3) and Corollary 9.31 in Kosorok (2008) it follows that

$$\big\{(x,e)\mapsto I\{e\le z+a+\langle x,b\rangle\} - I\{e\le z\}\ \big|\ z\in\mathbb{R},\ a\in\mathbb{R},\ b\in\mathcal{B}\big\}$$

is also $P$-Donsker. From Theorem 2.12.1 in van der Vaart and Wellner (1996) it follows that also the centered sequential process

$$H_n(t,z,a,b) = \frac{1}{\sqrt n}\sum_{i=1}^{\lfloor nt\rfloor}\Big(I\{\varepsilon_i\le z+a+\langle X_i,b\rangle\} - I\{\varepsilon_i\le z\} - E[F(z+a+\langle X,b\rangle)] + F(z)\Big),$$

indexed in $t\in[0,1]$, $z\in\mathbb{R}$, $a\in\mathbb{R}$, $b\in\mathcal{B}$, converges weakly to a centered Gaussian process. Thus the process $H_n$ is asymptotically stochastic equicontinuous with respect to the semi-metric

$$\rho\big((t_1,z_1,a_1,b_1),(t_2,z_2,a_2,b_2)\big) = |t_1-t_2| + \mathrm{Var}\big(I\{\varepsilon\le z_1+a_1+\langle X,b_1\rangle\} - I\{\varepsilon\le z_2+a_2+\langle X,b_2\rangle\}\big),$$

see van der Vaart and Wellner (1996), problem 2, p. 93, and Sect. 2.12. In particular we need
$$\le \big|E[F(z+a+\langle X,b\rangle) - F(z)]\big| \le c\,E\big[|a+\langle X,b\rangle|^\gamma\big] \le c\,\big(|a| + \|b\|\,E\|X\|\big)^\gamma$$

by Cauchy–Schwarz and Jensen's inequality. Now setting $a = \hat\alpha-\alpha$ and $b = \hat\beta-\beta$ we obtain convergence to zero in probability by assumption (a.1). Thus from asymptotic stochastic equicontinuity of the process $H_n$, and $H_n(t,z,0,0) = 0$ we obtain that

$$\sup_{t\in[0,1],\,z\in\mathbb{R}} |H_n(t,z,\hat\alpha-\alpha,\hat\beta-\beta)| = o_P(1),$$

which means that

$$\frac{\lfloor nt\rfloor}{\sqrt n}\hat F_{\lfloor nt\rfloor}(z) = \frac{\lfloor nt\rfloor}{\sqrt n}F_{\lfloor nt\rfloor}(z) + \frac{\lfloor nt\rfloor}{\sqrt n}\Big(E_X[F(z+\hat\alpha-\alpha+\langle X,\hat\beta-\beta\rangle)] - F(z)\Big) + o_P(1) \qquad \text{(A.1)}$$

uniformly in $t\in[0,1]$, $z\in\mathbb{R}$, where $F_{\lfloor nt\rfloor}$ was defined in (2.1) and is based on the true errors. In particular for $t = 1$ we have

$$\hat F_n(z) = F_n(z) + E_X[F(z+\hat\alpha-\alpha+\langle X,\hat\beta-\beta\rangle)] - F(z) + o_P(n^{-1/2})$$

uniformly in $z\in\mathbb{R}$. From those expansions we obtain

$$\hat G_n(t,z) = \frac{\lfloor nt\rfloor}{\sqrt n}\hat F_{\lfloor nt\rfloor}(z) - \frac{\lfloor nt\rfloor}{\sqrt n}\hat F_n(z) = \frac{\lfloor nt\rfloor}{\sqrt n}F_{\lfloor nt\rfloor}(z) - \frac{\lfloor nt\rfloor}{\sqrt n}F_n(z) + o_P(1) = G_n(t,z) + o_P(1) \qquad \text{(A.2)}$$

uniformly in $t\in[0,1]$, $z\in\mathbb{R}$. □

A.2 Proof of Lemma 3.2

Let $\epsilon > 0$ and let $[b_i^L, b_i^U]$, $i = 1,\dots,N(\epsilon) = O\big(\exp(\epsilon^{-2/(k\gamma)})\big)$, be brackets for $\mathcal{B}$ of $\|\cdot\|$-length $\epsilon^{2/\gamma}$ [see assumption (a.3)]. Now for $b\in[b_i^L, b_i^U]$ the indicator function $I\{e\le v+\langle x,b\rangle\}$ is contained in the bracket

$$\Big[\,I\{e\le v+\langle xI\{x\ge 0\},b_i^L\rangle + \langle xI\{x<0\},b_i^U\rangle\},\ I\{e\le v+\langle xI\{x\ge 0\},b_i^U\rangle + \langle xI\{x<0\},b_i^L\rangle\}\,\Big]$$
for each $v\in\mathbb{R}$. Further the above bracket has $L^2(P)$-length

$$\Big(E\Big[\big(I\{\varepsilon\le v+\langle XI\{X\ge 0\},b_i^U\rangle + \langle XI\{X<0\},b_i^L\rangle\} - I\{\varepsilon\le v+\langle XI\{X\ge 0\},b_i^L\rangle + \langle XI\{X<0\},b_i^U\rangle\}\big)^2\Big]\Big)^{1/2}$$
$$\le \Big(E\big[F(v+\langle X,b_i^U\rangle) - F(v+\langle X,b_i^L\rangle)\big]\Big)^{1/2} \le \Big(E\big[c\,|\langle X,b_i^U-b_i^L\rangle|^\gamma\big]\Big)^{1/2} \le c^{1/2}\,(E\|X\|)^{\gamma/2}\,\|b_i^U-b_i^L\|^{\gamma/2} = O(\epsilon),$$

by assumption (a.2), Cauchy–Schwarz and Jensen's inequality. Similar to the proof of Lemma 1 in Akritas and Van Keilegom (2001) one obtains an upper bound $O\big(\epsilon^{-2}\exp(\epsilon^{-2/k})\big)$ for the $L^2(P)$-bracketing number of the class $\mathcal{F}$. Thus $\mathcal{F}$ is $P$-Donsker by the bracketing integral condition in Theorem 19.5 of van der Vaart (1998). □

A.3 Proof of Lemma 4.1

To show the assertion for $\hat F_{k_n^*}$ we use the arguments as in the proof of Theorem 2.2 for the process $H_n$, but based on the iid sample $(X_1,Y_1),\dots,(X_{k_n^*},Y_{k_n^*})$ before the change. Then as in the proof of Theorem 2.2 asymptotic stochastic equicontinuity of the process $H_{k_n^*}$ holds and thus

$$\sup_{t\in[0,1],\,z\in\mathbb{R}} |H_{k_n^*}(t,z,\hat\alpha-\alpha,\hat\beta-\beta)| = o_P(1).$$

Here, $\hat\alpha$ and $\hat\beta$ depend on the whole sample and assumption (a.1) is used. Thus as in Eq. (A.1) we obtain

$$\hat F_{k_n^*}(z) = F_{k_n^*}(z) + E_{X_1}[F_1(z+\hat\alpha-\alpha+\langle X_1,\hat\beta-\beta\rangle)] - F_1(z) + o_P\Big(\frac{1}{\sqrt n}\Big),$$

uniformly in $z\in\mathbb{R}$. By the classical Glivenko–Cantelli result $F_{k_n^*}$ converges uniformly almost surely to $F_1$. By assumptions (a.1), (a.2)' and $E\|X_1\| < \infty$ the remainder term is $o_P(1)$ and the assertion for $\hat F_{k_n^*}$ follows. The assertion for $\tilde F_{k_n^*}$ can be shown analogously. □
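A.4 Proof of Lemma 4.2

First note that

$$\hat\vartheta_n \in \arg\max_{t\in[0,1]}\ \sup_{z\in\mathbb{R}} |\hat G_n(t,z)| = \arg\max_{t\in[0,1]}\ \sup_{z\in\mathbb{R}} \Big|\frac{\hat G_n(t,z)}{n^{1/2}}\Big|.$$

Further it holds

$$\frac{\hat G_n(t,z)}{n^{1/2}} = \frac{\lfloor nt\rfloor(n-\lfloor nt\rfloor)}{n^2}\Big(\frac{1}{\lfloor nt\rfloor}\sum_{i=1}^{\lfloor nt\rfloor} I\{\hat\varepsilon_i\le z\} - \frac{1}{n-\lfloor nt\rfloor}\sum_{i=\lfloor nt\rfloor+1}^{n} I\{\hat\varepsilon_i\le z\}\Big)$$
$$= \frac{\lfloor nt\rfloor(n-\lfloor nt\rfloor)}{n^2}\Big(\frac{1}{\lfloor nt\rfloor}\sum_{i=1}^{\lfloor nt\rfloor\wedge\lfloor n\vartheta^*\rfloor} I\{\hat\varepsilon_i\le z\} + I\{t>\vartheta^*\}\frac{1}{\lfloor nt\rfloor}\sum_{i=\lfloor n\vartheta^*\rfloor+1}^{\lfloor nt\rfloor} I\{\hat\varepsilon_i\le z\}$$
$$\qquad - \frac{1}{n-\lfloor nt\rfloor}\sum_{i=\lfloor nt\rfloor\vee\lfloor n\vartheta^*\rfloor+1}^{n} I\{\hat\varepsilon_i\le z\} - I\{t<\vartheta^*\}\frac{1}{n-\lfloor nt\rfloor}\sum_{i=\lfloor nt\rfloor+1}^{\lfloor n\vartheta^*\rfloor} I\{\hat\varepsilon_i\le z\}\Big)$$
$$= \frac{\lfloor nt\rfloor(n-\lfloor nt\rfloor)}{n^2}\Big(\frac{\lfloor nt\rfloor\wedge\lfloor n\vartheta^*\rfloor}{\lfloor nt\rfloor}F_1(z) + I\{t>\vartheta^*\}\frac{\lfloor nt\rfloor-\lfloor n\vartheta^*\rfloor}{\lfloor nt\rfloor}F_2(z)$$
$$\qquad - \frac{n-\lfloor nt\rfloor\vee\lfloor n\vartheta^*\rfloor}{n-\lfloor nt\rfloor}F_2(z) - I\{t<\vartheta^*\}\frac{\lfloor n\vartheta^*\rfloor-\lfloor nt\rfloor}{n-\lfloor nt\rfloor}F_1(z)\Big) + o_P(1),$$

since we have

$$\sup_{t\in[0,\vartheta^*]}\sup_{z\in\mathbb{R}}\Big|\frac{\lfloor nt\rfloor}{n}\Big(\frac{1}{\lfloor nt\rfloor}\sum_{i=1}^{\lfloor nt\rfloor} I\{\hat\varepsilon_i\le z\} - F_1(z)\Big)\Big|$$
$$\le \sup_{t\in[0,\vartheta^*]}\sup_{z\in\mathbb{R}}\Big(\Big|\underbrace{\frac{\lfloor nt\rfloor}{\lfloor n\vartheta^*\rfloor}\Big(\frac{1}{\lfloor nt\rfloor}\sum_{i=1}^{\lfloor nt\rfloor} I\{\hat\varepsilon_i\le z\} - \frac{1}{\lfloor n\vartheta^*\rfloor}\sum_{i=1}^{\lfloor n\vartheta^*\rfloor} I\{\hat\varepsilon_i\le z\}\Big)}_{=\ \lfloor n\vartheta^*\rfloor^{-1/2}\,\tilde G_{\lfloor n\vartheta^*\rfloor}(t,z)}\Big| \qquad \text{(A.3)}$$
$$\qquad + \Big|\frac{\lfloor nt\rfloor}{\lfloor n\vartheta^*\rfloor}\Big(\frac{1}{\lfloor n\vartheta^*\rfloor}\sum_{i=1}^{\lfloor n\vartheta^*\rfloor} I\{\hat\varepsilon_i\le z\} - F_1(z)\Big)\Big|\Big) \qquad \text{(A.4)}$$
$$= o_P(1).$$

Here we have used Lemma 4.1 for the term (A.4). Further $\tilde G_{\lfloor n\vartheta^*\rfloor}$ is defined as $\hat G_n$ based on the iid sample $(X_1,Y_1),\dots,(X_{k_n^*},Y_{k_n^*})$, but where the residuals are built with $\hat\alpha$, $\hat\beta$ based on the whole sample. With the same argument as in the proof of Theorem 2.2 it holds that

$$\tilde G_{\lfloor n\vartheta^*\rfloor}(t,z) = G_{\lfloor n\vartheta^*\rfloor}(t,z) + o_P(1).$$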
Analogously one can show that

$$\sup_{t\in[\vartheta^*,1]}\sup_{z\in\mathbb{R}}\Big|\frac{n-\lfloor nt\rfloor}{n}\Big(\frac{1}{n-\lfloor nt\rfloor}\sum_{i=\lfloor nt\rfloor+1}^{n} I\{\hat\varepsilon_i\le z\} - F_2(z)\Big)\Big| = o_P(1).$$

Thus, it holds uniformly in $t\in[0,1]$

$$\frac{\hat G_n(t,z)}{n^{1/2}} = I\{t>\vartheta^*\}\frac{\lfloor n\vartheta^*\rfloor(n-\lfloor nt\rfloor)}{n^2}\big(F_1(z)-F_2(z)\big) + I\{t\le\vartheta^*\}\frac{\lfloor nt\rfloor(n-\lfloor n\vartheta^*\rfloor)}{n^2}\big(F_1(z)-F_2(z)\big) + o_P(1)$$
$$= \big(I\{t>\vartheta^*\}\vartheta^*(1-t) + I\{t\le\vartheta^*\}t(1-\vartheta^*)\big)\big(F_1(z)-F_2(z)\big) + o_P(1).$$

The assertion then follows by Theorem 2.12 in Kosorok (2008) as $\vartheta^*$ is a well-separated maximum of $t\mapsto I\{t>\vartheta^*\}\vartheta^*(1-t) + I\{t\le\vartheta^*\}t(1-\vartheta^*)$. □

Acknowledgements The authors are grateful to the Editors and Guest Editors for the organization of the Special Issue "Goodness-of-Fit, Change-Point, and Related Problems", and to the referees, the Associate Editor and the Guest Editor Simos Meintanis for their constructive comments and interesting ideas to expand the topic.

Funding Open Access funding enabled and organized by Projekt DEAL.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

References

Akritas MG, Van Keilegom I (2001) Non-parametric estimation of the residual distribution.
Scand J Stat 28(3):549–567
Aston JAD, Kirch C (2012) Detecting and estimating changes in dependent functional data. J Multivar Anal 109:204–220
Aue A, Gabrys R, Horváth L, Kokoszka P (2009) Estimation of a change-point in the mean function of functional data. J Multivar Anal 100(10):2254–2269
Aue A, Hörmann S, Horváth L, Hušková M (2014) Dependent functional linear models with applications to monitoring structural change. Stat Sin 24:1043–1073
Aue A, Rice G, Sönmez O (2018) Detecting and dating structural breaks in functional data without dimension reduction. J R Stat Soc Ser B Stat Methodol 80(3):509–529
Azzalini A, Capitanio A (1999) Statistical applications of the multivariate skew-normal distribution. J R Stat Soc Ser B Stat Methodol 61:579–602
Cardot H, Mas A, Sarda P (2007) CLT in functional linear regression models. Probab Theory Relat Fields 138:325–361
Carlstein E (1988) Nonparametric change-point estimation. Ann Stat 16(1):188–197
Csörgö M, Horváth L, Szyszkowicz B (1997) Integral tests for suprema of Kiefer processes with application. Stat Decis 15:365–377
Dümbgen L (1991) The asymptotic behavior of some nonparametric change-point estimators. Ann Stat 19(3):1471–1495
Ferraty F, Vieu P (2006) Nonparametric functional data analysis, Springer series in statistics. Springer, New York
Giné E, Nickl R (2021) Mathematical foundations of infinite-dimensional statistical models. Cambridge series in statistical and probabilistic mathematics. Cambridge University Press
Hariz SB, Wylie JJ, Zhang Q (2005) Nonparametric change-point estimation for dependent sequences. CR Math 341(10):627–630
Hariz SB, Wylie JJ, Zhang Q (2007) Optimal rate of convergence for nonparametric change-point estimators for nonstationary sequences. Ann Stat 35(4):1802–1826
Horváth L, Kokoszka P (2012) Inference for functional data with applications, Springer series in statistics. Springer, New York
Horvath et al (2024) Variable selection based testing for parameter changes in regression with autoregressive dependence. J Bus Econ Stat 42(4):1331–1343
Kosorok MR (2008) Introduction to empirical processes and semiparametric inference. Springer, New York
Koul HL (1996) Asymptotics of some estimators and sequential residual empiricals in nonlinear time series. Ann Stat 24:380–404
Koul HL (2002) Weighted empirical processes in dynamic nonlinear models, Lecture notes in statistics. Springer, New York
Lee ER, Park BU (2012) Sparse estimation in functional linear regression. J Multivar Anal 105(1):1–17
Ling S (1998) Weak convergence of the sequential empirical processes of residuals in nonstationary autoregressive models.
Ann Stat 26:741–754
Neumeyer N, Van Keilegom I (2009) Change-point tests for the error distribution in non-parametric regression. Scand J Stat 36(3):518–541
Neumeyer N, Dette H, Nagel E-R (2005) A note on testing symmetry of the error distribution in linear regression models. J Nonparametr Stat 17(6):697–715
Neumeyer N, Dette H, Nagel E-R (2006) Bootstrap tests for the error distribution in linear and nonparametric regression models. Austr New Zealand J Stat 48(2):129–156
Neumeyer N, Omelka M (2025) Generalized Hadamard differentiability of the copula mapping and its applications. Bernoulli (to appear)
Pardo Fernandez JC (2007) Comparison of error distributions in nonparametric regression. Stat Probab Lett 77:350–356
Picard D (1985) Testing and estimating change-points in time series. Adv Appl Probab 17:841–867
Pierce DA, Kopecky KJ (1979) Testing goodness of fit for the distribution of errors in regression models. Biometrika 66(1):1–5
Selk L, Neumeyer N (2013) Testing for a change of the innovation distribution in nonparametric autoregression: the sequential empirical process approach. Scand J Stat 40(4):770–788
Shang Z, Cheng G (2015) Nonparametric inference in generalized functional linear models. Ann Stat 43(4):1742–1773
Shorack GR, Wellner JA (1986) Empirical processes with applications to statistics. Wiley
van der Vaart A, Wellner JA (1996) Weak convergence and empirical processes. Springer, New York
van der Vaart A (1998) Asymptotic statistics. Cambridge series in statistical and probabilistic mathematics.