
Towards the Resistance of Neural Network Watermarking to Fine-tuning

Ling Tang¹  Yuefeng Chen²  Hui Xue²  Quanshi Zhang¹*

¹Shanghai Jiao Tong University  ²Alibaba Group

arXiv:2505.01007v1 [cs.LG] 2 May 2025

Abstract

This paper proposes a new watermarking method to embed the ownership information into a deep neural network (DNN), which is robust to fine-tuning. Specifically, we prove that when the input feature of a convolutional layer only contains low-frequency components, specific frequency components of the convolutional filter will not be changed by gradient descent during the fine-tuning process, where we propose a revised Fourier transform to extract frequency components from the convolutional filter. Additionally, we also prove that these frequency components are equivariant to weight scaling and weight permutations. In this way, we design a watermark module to encode the watermark information into specific frequency components of a convolutional filter. Preliminary experiments demonstrate the effectiveness of our method.

1. Introduction

Watermarking techniques have long been used to protect the copyright of digital content, including images, videos, and audio [Nematollahi et al., 2017]. Recently, these techniques have been extended to protect the intellectual property of neural networks. Watermarking a neural network is usually conducted to implicitly embed the ownership information into the neural network. In this way, if a neural network is stolen and further optimized, the ownership information embedded in the network can be used to verify its true origin. Previous studies embedded the ownership information in different ways. For example, Wang et al. [2020] directly embedded the watermark into the network parameters. Lukas et al. [2021] used the classification results on a particular type of adversarial examples as the backdoor watermark. Kirchenbauer et al. [2023] added a soft watermark to the generation result.

However, one of the core challenges of neural network watermarking is that most watermarking techniques are not resistant to the fine-tuning of the DNN. When network parameters are changed during the fine-tuning process, the watermark implicitly embedded in the parameters may also be overwritten.

Although many studies [Uchida et al., 2017; Adi et al., 2018; Liu et al., 2021; Tan et al., 2023; Zeng et al., 2023] have realized this problem and have tested the resistance of their watermarks to fine-tuning, there is, to the best of our knowledge, no theory designed towards the resistance of watermarking w.r.t. fine-tuning.

The core challenge towards resistance to fine-tuning is to find a term in the neural network that is invariant to fine-tuning, e.g., certain network parameters or some properties of network parameters that are least affected during the fine-tuning process. Although Zeng et al. [2023], Zhang et al. [2024], and Fernandez et al. [2024] have explored invariant terms w.r.t. weight scaling and weight permutations for watermarking, a theoretically guaranteed invariant term w.r.t. fine-tuning remains unsolved.

In this study, we aim to discover and prove such an invariant term to fine-tuning. Specifically, as Figure 2 shows, Tang et al. [2023] have found that the forward propagation through a convolutional layer, $Y = W \otimes X + b \cdot \mathbf{1}_{M \times N}$, can be reformulated as a vector multiplication between frequency components, $F_Y^{(uv)} = F_W^{(uv)} \cdot F_X^{(uv)} + \delta_{uv} M N b$, in the frequency domain, where $F_X^{(uv)}$ denotes the frequency component of the input feature $X$ at frequency $(u, v)$, which is extracted by conducting a discrete Fourier transform; $F_W^{(uv)}$ denotes the frequency component of the convolutional filter $W$; and $b$ is the bias term.

Based on this, we prove that if the input feature $X$ only contains low-frequency components, then specific frequency components $F_W^{(uv)}$ of a convolutional filter are stable w.r.t. network fine-tuning. Additionally, these specific frequency components¹ also exhibit equivariance to weight scaling and weight permutations.

* Quanshi Zhang is the corresponding author. He is with the Department of Computer Science and Engineering, the John Hopcroft Center, at the Shanghai Jiao Tong University, China.

¹ The frequency component $F_W^{(uv)}$ of the convolutional filter is defined in Equation (6), which is extracted by applying a revised discrete Fourier transform to the convolutional filter $W$. According to Theorem 3.1, the frequency component $F_W^{(uv)}$ at frequency $(u, v)$ represents the influence of the convolutional filter $W$ on the corresponding frequency component $F_X^{(uv)}$ extracted from the input feature $X$.

Figure 1. The framework of the proposed watermark. We prove that the specific frequency components¹ $F_W^{(uv)}$, which are obtained by conducting a revised discrete Fourier transform $T(\cdot)$ on the convolutional filter $W$, keep stable in the training process. Thus, these specific frequency components $F_W^{(uv)}$ are used as the watermark that is robust to fine-tuning. For clarity, we move low frequencies to the center of the spectrum map, and move high frequencies to the corners of the spectrum map. Unless otherwise stated, in this paper, we visualize the frequency spectrum map in this manner.

Therefore, we propose to use the frequency components $F_W^{(uv)}$ as the robust watermark. Besides, the overwriting attack is another important issue for watermarking. To defend the watermark against the overwriting attack, we introduce an additional loss to train the model, which ensures that overwriting the watermark will significantly hurt the model's performance.

The contributions of this study can be summarized as follows. (1) We discover and theoretically prove that specific frequency components of a convolutional filter remain invariant during training and are equivariant to weight scaling and weight permutations. (2) Based on this theory, we propose to encode the watermark information into these frequency components, so as to ensure that the watermark is robust to fine-tuning, weight scaling, and weight permutations. (3) Preliminary experiments have demonstrated the effectiveness of the proposed method.

2. Related Work

The robustness of watermarks has always been a key issue in the field of neural network watermarking. In this paper, we limit our discussion to watermarks embedded in network parameters for the protection of the DNN's ownership information. Fine-tuning, weight scaling, weight permutations, pruning, and distillation may all hurt or remove the watermark from the DNN.

Weight scaling and weight permutations are typical attacking methods, which change the watermark by rescaling or rearranging the network's parameters. Therefore, Zeng et al. [2023] found that the multiplication of specific weight matrices was invariant to weight scaling and weight permutations, thereby embedding the watermark information in such multiplications of matrices. Zhang et al. [2024] measured the CKA similarity [Kornblith et al., 2019] between the features of different layers in a DNN as a watermark robust to weight scaling and weight permutations.

Compared to the robustness to weight scaling and weight permutations, the robustness to fine-tuning presents a more significant challenge. Up to now, there is, to the best of our knowledge, no theoretically guaranteed watermark that is robust to fine-tuning. Thus, many watermark techniques were implemented in an engineering manner to defend against the fine-tuning attack. Liu et al. [2021] selected network parameters that did not change much during fine-tuning to encode the watermark information. Tan et al. [2023] used the classification accuracy on a particular type of adversarial examples, termed a trigger set, as the watermark. To enhance robustness, they optimized the trigger set to ensure that the watermarked network could maintain high accuracy on the trigger set, even under the fine-tuning attack. Zeng et al. [2023] found that the direction of the vector formed by all parameters was relatively stable during fine-tuning, and thus used it to encode the watermark information.

However, Aiken et al. [2021], Shafieinejad et al. [2021], and Xu et al. [2024] showed that, despite various engineering defense methods, most watermarks could still be effectively removed from the neural network under certain fine-tuning settings. Therefore, a theoretically certified robust watermark is of considerable value in both theory and practice. To this end, Bansal et al. [2022] and Ren et al. [2023] proposed to use the classification accuracy on a trigger set as the watermark, and proved that the classification accuracy was lower bounded when the attacker did not change the network's parameters by more than a given distance in terms of the $l_p$-norm ($p > 1$). These methods proved a safe range of parameter changes during fine-tuning, but they did not boost the robustness of the watermark or propose an intrinsically robust watermark.
In contrast, we have proved that the convolutional filter's specific frequency components¹ keep stable during fine-tuning. Thus, we embed the watermark information into these frequency components as a theoretically certified robust watermark.

3. Method

3.1. Preliminaries: reformulating the convolution in the frequency domain

In this subsection, we reformulate the forward propagation through a convolutional filter in the frequency domain, i.e., when we apply a discrete Fourier transform (DFT) to the input feature and a revised discrete Fourier transform to the convolutional filter, we obtain the frequency component vectors at different frequencies for the input feature and the convolutional filter, respectively. As a preliminary, Tang et al. [2023] have proven that the forward propagation through a convolutional filter can be reformulated as the vector multiplication between the frequency component vectors of the input feature and the convolutional filter at corresponding frequencies.

Figure 2. Forward propagation in the frequency domain (a) and forward propagation in the spatial domain (b). The convolution operation with a convolutional filter on the input feature $X$ is essentially equivalent to a vector multiplication on the frequency components of the input.

Specifically, let us focus on a convolutional filter with $C$ channels and a kernel size of $K \times K$. The convolutional filter is parameterized by weights $W \in \mathbb{R}^{C \times K \times K}$ and the bias term $b \in \mathbb{R}$. Accordingly, we apply this filter to an input feature $X \in \mathbb{R}^{C \times M \times N}$, and obtain an output feature map $Y \in \mathbb{R}^{M' \times N'}$.

$$Y = W \otimes X + b \cdot \mathbf{1}_{M' \times N'}, \tag{1}$$

where $\otimes$ denotes the convolution operation, and $\mathbf{1}_{M' \times N'}$ is an $M' \times N'$ matrix in which all elements are ones.

Frequency components of the input feature and the output feature. Tang et al. [2023] have proven Theorem 3.1, showing that the forward propagation in Equation (1) can be reformulated as the vector multiplication in the frequency domain shown in Figure 2. Before that, let us first introduce the notation of frequency components.

Given the input feature $X \in \mathbb{R}^{C \times M \times N}$ and the output feature $Y \in \mathbb{R}^{M \times N}$, we conduct the two-dimensional DFT on each $c$-th channel $X^{(c)} \in \mathbb{R}^{M \times N}$ of $X$ and on the matrix $Y$ to obtain the frequency elements $G^{(c)}_{uv} \in \mathbb{C}$ and $H_{uv} \in \mathbb{C}$ at frequency $(u, v)$ as follows, where $\mathbb{C}$ denotes the set of complex numbers.

$$G^{(c)}_{uv} = \sum_{m=0}^{M-1}\sum_{n=0}^{N-1} X^{(c)}_{mn}\, e^{-i(\frac{um}{M} + \frac{vn}{N})2\pi}, \qquad H_{uv} = \sum_{m=0}^{M-1}\sum_{n=0}^{N-1} Y_{mn}\, e^{-i(\frac{um}{M} + \frac{vn}{N})2\pi}, \tag{2}$$

where $X^{(c)}_{mn} \in \mathbb{R}$ denotes the element of $X^{(c)} \in \mathbb{R}^{M \times N}$ at position $(m, n)$, and $Y_{mn} \in \mathbb{R}$ denotes the element of $Y$.

For clarity, we can organize all frequency elements belonging to the same $c$-th channel into a frequency spectrum matrix $G^{(c)} \in \mathbb{C}^{M \times N}$. Alternatively, we can also re-organize the frequency elements at the same frequency $(u, v)$ into a frequency component vector $F_X^{(uv)} \in \mathbb{C}^{C}$:

$$\forall c,\ G^{(c)} = \begin{bmatrix} G^{(c)}_{00} & \cdots \\ \vdots & \ddots \end{bmatrix} \in \mathbb{C}^{M \times N}; \qquad \forall u, v,\ F_X^{(uv)} = \left[G^{(1)}_{uv}, G^{(2)}_{uv}, \dots, G^{(C)}_{uv}\right]^{\top} \in \mathbb{C}^{C}. \tag{3}$$

In this way, we have

$$F_X = [G^{(1)}, G^{(2)}, \dots, G^{(C)}] = \begin{bmatrix} F_X^{(00)} & \cdots \\ \vdots & \ddots \end{bmatrix} \in \mathbb{C}^{C \times M \times N}. \tag{4}$$

Similarly, $F_Y^{(uv)} = H_{uv} \in \mathbb{C}$ represents the frequency component of the output feature $Y$ at the frequency $(u, v)$.

For all frequency components $F_X^{(uv)}$ and $F_Y^{(uv)}$, a frequency $(u, v)$ close to $(0, 0)$, $(0, N-1)$, $(M-1, 0)$, or $(M-1, N-1)$ represents a low frequency, while a frequency $(u, v)$ close to $(M/2, N/2)$ is considered a high frequency.

Theorem 3.1. (Forward propagation in the frequency domain) Based on the above notation, Tang et al. [2023] have proven that the forward propagation of the convolution operation in Equation (1) can be reformulated as a vector multiplication in the frequency domain as follows.

$$Y = W \otimes X + b \cdot \mathbf{1}_{M \times N}\ \text{(spatial domain)} \iff F_Y^{(uv)} = F_W^{(uv)} \cdot F_X^{(uv)} + \delta_{uv} M N b\ \text{(frequency domain)}, \tag{5}$$

where $\cdot$ denotes the scalar product of two vectors; $\delta_{uv}$ is defined as $\delta_{uv} = 1$ if and only if $u = v = 0$, and $\delta_{uv} = 0$ otherwise. In particular, the convolution operation $\otimes$ is conducted with circular padding [Jain, 1989] and a stride size of 1, which avoids changing the size of the output feature (i.e., ensuring $M' = M$ and $N' = N$).
Frequency components of the convolutional filter. In Theorem 3.1, $F_W^{(uv)}$ represents the frequency component¹ of the convolutional filter $W \in \mathbb{R}^{C \times K \times K}$ at frequency $(u, v)$, and can be obtained by conducting the revised discrete Fourier transform at frequency $(u, v)$, $T_{uv}(\cdot)$, on $W$ as follows.

$$F_W^{(uv)} = T_{uv}(W), \tag{6}$$

where $T_{uv}(W) = [Q^{(1)}_{uv}, Q^{(2)}_{uv}, \dots, Q^{(C)}_{uv}]^{\top}$, $Q^{(c)}_{uv} = \sum_{t=0}^{K-1}\sum_{s=0}^{K-1} W^{(c)}_{ts}\, e^{i(\frac{ut}{M} + \frac{vs}{N})2\pi}$, and $W^{(c)}_{ts}$ denotes the element at position $(t, s)$ of the $c$-th channel $W^{(c)} \in \mathbb{R}^{K \times K}$ of $W$.

Similar to Equation (4), we can define the revised discrete Fourier transform over all frequencies, $T(\cdot)$, by organizing the filter's frequency components into a frequency tensor: $F_W = T(W)$, where

$$T(W) = \begin{bmatrix} F_W^{(00)} & \cdots \\ \vdots & \ddots \end{bmatrix} \in \mathbb{C}^{C \times M \times N}.$$
(uv)
. . frequency component, i.e., ∀(u, v) ̸= (0, 0), FX = 0, then
(uv)
frequency components FW at the following frequencies in
3.2. Invariant frequency components of the convo- the set S keep invariant.
lutional filter
(uv)
∀(u, v) ∈ S, ∆FW = 0, (8)
In this subsection, we aim to prove that frequency compo-
(uv)
nents of the convolutional filter FW at certain frequencies where S is the set of the specific frequencies, defined as S =
(u, v) are relatively stable during training. Additionally, these {(u, v) | u = iM/K or v = jN/K; i, j ∈ {1, 2, . . . , K − 1}}.
frequency components are also equivariant to other attacks
like weight scaling and weight permutations. In this way, we Corollary 3.3 indicates an ideal case where the input
can embed the watermarks into these components to enhance feature only contains the fundamental frequency component,
(uv)
their resistance to fine-tuning, weight scaling, and weight and u, v can take non-integer values. In this case, ∆FW at
permutations. frequencies in the set S are strictly zero vectors.
Specific frequency components of the filter are in- However, in real applications, the input feature usually
variant towards fine-tuning. Specifically, based on the contains low-frequency components, and frequencies must
forward propagation in the frequency domain formulated be integers when conducting the DFT. Under these con-
(uv)
in Equation (5), we prove that if the input feature X con- ditions, each element in ∆FW is nearly zero at integer
tains only the fundamental frequency components, i.e., frequencies that are close to the frequencies in the set S .
(uv)
∀(u, v) ̸= (0, 0), FX = 0, then frequency components Proposition 3.4. In the training process, if the input fea-
(uv)
FW at specific frequencies will not change over the training ture X only contains the low-frequency components, i.e.,
process. (uv)
/ Srlow , FX = 0, where Srlow = {(u, v)|u ∈ [0, r] ∪
∀(u, v) ∈
To prove the invariance of the frequency components to- [M − r, M ), v ∈ [0, r] ∪ [N − r, N )} and r is a positive integer
wards fine-tuning, we decompose the entire training process (uv)
(r ≤ 2), then frequency components FW at the following
into massive steps of gradient descent optimization. Each ′
frequencies in the set S keep relative stable.
step of gradient descent optimization w.r.t. the loss function
(uv)
can be formulated as W′ = W−η ∂Loss ∂W
. Let FW(uv)
= Tuv (W) ∀(u, v) ∈ S ′ , ∆FW ≈ 0, (9)

4
where $S'$ is the set of the specific integer frequencies, defined as $S' = \{(u, v) \mid u = \lfloor iM/K \rceil \text{ or } v = \lfloor jN/K \rceil;\ i, j \in \{1, 2, \dots, K-1\}\}$, and $\lfloor x \rceil$ rounds the real number $x$ to the nearest integer.
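To illustrate Corollary 3.3 in code, the sketch below performs one gradient step on a toy convolution whose input contains only the fundamental (DC) frequency, and confirms that $T_{uv}(W)$ does not move at frequencies with $u$ or $v$ in $\{3, 6\}$, i.e., multiples of $M/K$ for $M = N = 9$, $K = 3$. The quadratic loss and all names are our own assumptions, not the paper's code.

```python
import numpy as np

C, K, M, N, eta = 2, 3, 9, 9, 0.1
rng = np.random.default_rng(1)
W = rng.standard_normal((C, K, K))
X = np.ones((C, M, N)) * rng.standard_normal((C, 1, 1))   # DC-only input

def conv_circ(W, X):
    # Stride-1 cross-correlation with circular padding (Theorem 3.1 setting).
    Y = np.zeros((M, N))
    for t in range(K):
        for s in range(K):
            Y += np.einsum('c,cmn->mn', W[:, t, s],
                           np.roll(X, (-t, -s), axis=(1, 2)))
    return Y

def T(W):
    # Revised DFT of the filter, Eq. (6).
    Eu = np.exp(2j * np.pi * np.outer(np.arange(M), np.arange(K)) / M)
    Ev = np.exp(2j * np.pi * np.outer(np.arange(N), np.arange(K)) / N)
    return np.einsum('ut,cts,vs->cuv', Eu, W, Ev)

# One gradient-descent step on Loss = 0.5 * sum(Y ** 2).
Y = conv_circ(W, X)
grad = np.stack([[np.einsum('mn,cmn->c', Y, np.roll(X, (-t, -s), axis=(1, 2)))
                  for s in range(K)] for t in range(K)]).transpose(2, 0, 1)
dF = np.abs(T(W - eta * grad) - T(W))                      # (C, M, N)

freqs = sorted({round(i * M / K) for i in range(1, K)})    # u or v in {3, 6}
print(freqs, dF[:, freqs, :].max(), dF[:, :, freqs].max()) # both ~1e-12
print(dF.max())                                            # clearly nonzero
```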
Towards weight scaling. A weight scaling attack scales the weights of a convolutional layer by a constant $a$ and scales the weights of the next convolutional layer by the inverse proportion $1/a$. In this way, the model's performance is not affected, but the watermark embedded in the weights usually changes. Theorem 3.5 shows that the frequency components are equivariant to the weight scaling attack.

Theorem 3.5. (Equivariance towards weight scaling, proven in Appendix A.3) If we scale all weights in the convolutional filter $W$ by a constant $a$ as $W^* = a \cdot W$ ($a > 0$), then the frequency components of $W^*$ are equal to the scaled frequency components of $W$, as follows.

$$\forall a, u, v,\quad F_{W^*}^{(uv)} = a \cdot F_W^{(uv)}. \tag{10}$$

Figure 3. The architecture of the watermark module. The watermark module is connected in parallel to the backbone of the neural network. We extract the specific frequency components from the convolutional filters in the watermark module as the network's watermark.

Towards weight permutations. A permutation attack on the filters permutes the filters and corresponding bias terms of a convolutional layer, and then permutes the channels of every filter of the next convolutional layer in the same order. As a result, the network's outputs remain unaffected, while the watermark embedded in the weights is usually altered.

We further investigate the equivariance of the frequency components when permuting the convolutional filters. Consider a convolutional layer with $D$ convolutional filters and $D$ bias terms, arranged as $W = [W_1, W_2, \cdots, W_D]$ and $b = [b_1, b_2, \cdots, b_D]$.

Theorem 3.6. (Equivariance towards weight permutations, proven in Appendix A.4) If we use a permutation $\pi$ to rearrange the above filters and bias terms as $\pi W = [W_{\pi(1)}, W_{\pi(2)}, \cdots, W_{\pi(D)}]$ and $\pi b = [b_{\pi(1)}, b_{\pi(2)}, \cdots, b_{\pi(D)}]$, where $[\pi(1), \pi(2), \cdots, \pi(D)]$ is a random permutation of the integers from 1 to $D$, then the frequency components of $\pi W$ are equal to the permuted frequency components of $W$, as follows.

$$\forall \pi, u, v,\quad \left[F_{W_{\pi(1)}}^{(uv)}, \cdots, F_{W_{\pi(D)}}^{(uv)}\right] = \pi\left[F_{W_1}^{(uv)}, \cdots, F_{W_D}^{(uv)}\right], \tag{11}$$

where $F_{W_d}^{(uv)} = T_{uv}(W_d) \in \mathbb{C}^{C}$ denotes the frequency components extracted from the $d$-th filter $W_d$ at frequency $(u, v)$.

3.3. Using the invariant frequency components as the neural network's watermark

In the last subsection, we proved that if the input feature only contains low-frequency components, the filter's frequency components $F_W^{(uv)}$ at specific frequencies $(u, v)$ keep stable during training. Furthermore, these components exhibit equivariance to weight scaling and weight permutations.

Watermark module. All the above findings and proofs enable us to use the specific frequency components as the watermark of the neural network. In this way, the watermark will be highly robust to fine-tuning, weight scaling, and weight permutations. Specifically, as Figure 3 shows, we construct the following watermark module $\Phi(X)$ to contain the watermark, which consists of a low-pass filter $\Lambda(\cdot)$ and convolution operations with $D$ convolutional filters $W = [W_1, W_2, \cdots, W_D]$ and $D$ bias terms $b = [b_1, b_2, \cdots, b_D]$.

$$\Phi(X) = [Y_1, Y_2, \cdots, Y_D], \quad \text{s.t.}\ Y_d = W_d \otimes \Lambda(X) + b_d \cdot \mathbf{1}_{M \times N}, \tag{12}$$

where the low-pass filtering operation $\Lambda(\cdot)$ preserves the frequency components of $X$ at the low frequencies in $S_r^{\text{low}} = \{(u, v) \mid u \in [0, r] \cup [M-r, M),\ v \in [0, r] \cup [N-r, N)\}$ ($r \leq 2$) and removes all other frequency components, i.e., setting $\forall (u, v) \notin S_r^{\text{low}},\ F_X^{(uv)} = 0$. $\mathbf{1}_{M \times N}$ is an $M \times N$ matrix in which all elements are ones.
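A minimal PyTorch sketch of the watermark module $\Phi(X)$ in Equation (12) is given below. The class name, the FFT-mask implementation of $\Lambda(\cdot)$, and the default hyperparameters are our assumptions; the paper does not release this code.

```python
import torch

class WatermarkModule(torch.nn.Module):
    """Phi(X) = conv(lowpass(X)), Eq. (12); attached in parallel to a backbone."""

    def __init__(self, in_channels, num_filters=256, kernel_size=3, r=1):
        super().__init__()
        self.r = r
        # Circular padding with stride 1 keeps the spatial size unchanged,
        # matching the convolution assumed by Theorem 3.1.
        self.conv = torch.nn.Conv2d(in_channels, num_filters, kernel_size,
                                    padding=kernel_size // 2,
                                    padding_mode='circular')

    def lowpass(self, x):
        # Lambda(.): zero out all frequencies outside S_r^low, i.e., keep
        # (u, v) with u in [0, r] U [M - r, M) and v in [0, r] U [N - r, N).
        M, N, r = x.shape[-2], x.shape[-1], self.r
        f = torch.fft.fft2(x)
        keep_u = torch.zeros(M, dtype=torch.bool, device=x.device)
        keep_v = torch.zeros(N, dtype=torch.bool, device=x.device)
        keep_u[:r + 1] = True; keep_u[M - r:] = True
        keep_v[:r + 1] = True; keep_v[N - r:] = True
        mask = (keep_u[:, None] & keep_v[None, :]).to(f.dtype)
        return torch.fft.ifft2(f * mask).real

    def forward(self, x):
        return self.conv(self.lowpass(x))
```

In the parallel design of Figure 3, the module branches off an intermediate backbone feature; the exact way its output is fused back is not specified in this paper, so the sketch only covers the branch itself.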
Invariant frequency components as the watermark. In this way, when we extract the frequency components $F_{W_d}^{(uv)}$ from every $d$-th convolutional filter $W_d$ in the watermark module $\Phi(X)$ based on Equation (6), we can use the frequency components at the frequencies in the set $S'$ as the watermark:
$$\left[F_{W_1}^{(uv)}, F_{W_2}^{(uv)}, \cdots, F_{W_D}^{(uv)}\right] \quad \text{s.t.}\ (u, v) \in S', \tag{13}$$

where $S' = \{(u, v) \mid u = \lfloor iM/K \rceil \text{ or } v = \lfloor jN/K \rceil;\ i, j \in \{1, 2, \dots, K-1\}\}$. According to Proposition 3.4, this watermark will keep stable during training.

Implementation details. We notice that the low-pass filter $\Lambda(\cdot)$ in the watermark module may hurt the flexibility of feature representations. Therefore, as Figure 3 shows, the watermark module is connected in parallel to the backbone architecture of the neural network. In this way, this design does not significantly change the network's architecture or seriously hurt its performance. In this paper, unless stated otherwise, we set the integer $r = 1$ and the kernel size $K = 3$.

Visualization of the watermark. Figure 4(a) shows the specific frequencies in the set $S'$ used as the watermark. Figure 4(b) shows the feature maps obtained when we apply the inverse discrete Fourier transform (IDFT) to some unit frequency components used as the watermark. Since our transform of the convolutional filter in Equation (6) is irreversible, we use the feature maps here only to illustrate the characteristics of the frequency components¹ in the spatial domain.

Figure 4. Visualization of the watermark. (a) shows the specific frequencies in the set $S'$ used as the watermark. (b) shows the feature maps obtained when we apply the inverse discrete Fourier transform (IDFT) to some unit frequency components used as the watermark. The frequency components are extracted from a single channel of a $3 \times 3$ convolutional filter in the watermark module, and the input feature map has a width and height of $9 \times 9$, so the set $S' = \{(u, v) \mid u = 3i \text{ or } v = 3j;\ i, j \in \{1, 2\}\}$. For clarity, we move low frequencies to the center of the spectrum map, and move high frequencies to the corners of the spectrum map.
3.4. Detecting the watermark

In this subsection, we introduce how to detect the watermark. Given a source watermarked DNN with a watermark module containing $D$ convolutional filters $[W_1, W_2, \cdots, W_D]$, and a suspicious DNN with a watermark module containing $D$ convolutional filters $[W'_1, W'_2, \cdots, W'_D]$, we aim to detect whether the suspicious DNN is obtained from the source DNN by fine-tuning, weight scaling, or weight permutations.

Considering the permutation attack, the detection should match the frequency components of different convolutional filters between the two DNNs, i.e., we can always find a permutation $[\pi(1), \pi(2), \cdots, \pi(D)]$ that assigns each $d$-th convolutional filter $W_d$ in the source DNN to the $\pi(d)$-th filter $W'_{\pi(d)}$ in the suspicious DNN. Specifically, we use the following watermark detection rate DR between two DNNs to identify the matching quality.

$$\mathrm{DR} = \frac{\sum_{(u,v) \in S',\, d}\ \mathbb{I}\left(\cos\big(F_{W_d}^{(uv)}, F_{W'_{\pi(d)}}^{(uv)}\big) \geq \tau\right)}{Z} \times 100\%, \tag{14}$$

where $\mathbb{I}$ denotes an indicator function that equals 1 if $\cos(F_{W_d}^{(uv)}, F_{W'_{\pi(d)}}^{(uv)}) \geq \tau$ and 0 otherwise; $\cos(F_{W_d}^{(uv)}, F_{W'_{\pi(d)}}^{(uv)})$ denotes the cosine similarity² of the frequency components; $Z$ denotes the total number of frequency components used as the watermark; and $\tau$ is a threshold, which we set to $\tau = 0.995$ in this paper unless otherwise stated. If the watermark detection rate DR is high, we can determine that the suspicious network originates from the source network.

² The cosine similarity $\cos(z_1, z_2)$ between two complex vectors $z_1$ and $z_2$ is defined as $\frac{\mathrm{Re}(\bar{z}_1 \cdot z_2)}{\|z_1\| \|z_2\|}$, where $\bar{z}_1$ denotes the conjugate of $z_1$, $\|z_1\|$ denotes the magnitude of $z_1$, and $\mathrm{Re}(\cdot)$ represents the real part of a complex number. The cosine similarity, which ranges over $[-1, 1]$, measures the directional similarity between two complex vectors. When $\cos(z_1, z_2) = 1$, $z_1$ and $z_2$ have the same direction, while when $\cos(z_1, z_2) = -1$, $z_1$ and $z_2$ have opposite directions.

3.5. Learning towards the overwriting attack

Another challenge is the resistance of the watermark to the overwriting attack. Given the watermark module's architecture, if the attacker has obtained the authority to edit its parameters, then he can overwrite the weights $W$ in the watermark module with entirely new values, so as to change the watermark.

Let us consider a DNN for the classification of $n$ categories. To defend against the overwriting attack, the basic idea is to construct an $(n+1)$-th category as a pseudo category besides the existing $n$ categories. If the neural network is not attacked, it is supposed to classify input samples normally. Otherwise, if the network is under an overwriting attack, it is supposed to classify all samples into the pseudo category.
In this way, overwriting the watermark will significantly hurt the classification performance of the DNN. Therefore, we train the network by adding an additional loss $L_{\text{attack}}$, which pushes the attacked network to classify all samples into the pseudo category, to the standard cross-entropy loss $L_{\text{CE}}$ for multi-category classification.

$$\begin{aligned} L(W, b, \theta \mid x) &= L_{\text{CE}}(W, b, \theta \mid x) + L_{\text{attack}}(W, b, \theta \mid x)\\ &= -\sum_{k=1}^{n} p(y = k \mid x)\, \log q(y = k \mid x;\, W, b, \theta) - \lambda \cdot \log q(y = n+1 \mid x;\, W + \epsilon, b, \theta), \end{aligned} \tag{15}$$

where $x$ denotes an input sample and $y$ denotes its corresponding label; $\theta$ denotes the network's parameters; $q(y = k \mid x;\, W, b, \theta)$ denotes the classification probability predicted by the neural network; and $p(y = k \mid x)$ is the ground-truth probability. The scalar weight $\lambda$ balances the influence of $L_{\text{CE}}$ and $L_{\text{attack}}$.

Figure 5. Heatmaps showing the average norm of the change of the frequency components $\mathbb{E}_d[\|\Delta F_{W_d}^{(uv)}\|]$ before and after fine-tuning at different frequencies $(u, v)$ over all convolutional filters: (a) AlexNet on CIFAR-10; (b) AlexNet on Caltech-101; (c) ResNet18 on CIFAR-10; (d) ResNet18 on Caltech-101. For clarity, we move low frequencies to the center of the spectrum map, and move high frequencies to the corners of the spectrum map.

In the above loss function, we add a random noise³ $\epsilon$ to the parameters $W$ in the watermark module to mimic the state of the neural network with overwritten parameters. To enhance the module's sensitivity to such attacks, we do not completely overwrite the parameters but add random noise.

³ The magnitude and other specific settings of the noise $\epsilon$ are given in the ablation-study paragraph below.
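The following is a minimal sketch of one way to implement the loss in Equation (15) in PyTorch. The function signature, the in-place weight perturbation, and the Gaussian noise are our assumptions; in particular, the paper constructs $\epsilon$ by applying the IDFT to a unit frequency component at a random frequency, whereas this sketch uses Gaussian noise scaled to the same relative $l_2$-norm for brevity.

```python
import torch
import torch.nn.functional as F

def overwriting_loss(model, wm_weight, x, y, lam=5e-4):
    """Eq. (15): cross-entropy over the n real classes plus a term pushing
    the *perturbed* network to predict the pseudo class n + 1. `model` is
    assumed to output n + 1 logits; `wm_weight` is the watermark conv weight W."""
    loss_ce = F.cross_entropy(model(x), y)

    # Noise eps with ||eps||_2 = 0.5 * ||W||_2 (Section 3.5 setting; Gaussian
    # here, whereas the paper uses the IDFT of a unit frequency component).
    eps = torch.randn_like(wm_weight)
    eps = eps * (0.5 * wm_weight.detach().norm() / eps.norm())

    # Temporarily overwrite W with W + eps, score, then restore W.
    wm_weight.data.add_(eps)
    logits_attacked = model(x)
    wm_weight.data.sub_(eps)

    pseudo = torch.full_like(y, logits_attacked.shape[1] - 1)
    loss_attack = F.cross_entropy(logits_attacked, pseudo)
    return loss_ce + lam * loss_attack
```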
Ablation studies. We conducted an ablation experiment to evaluate the effectiveness of the newly added loss term $L_{\text{attack}}$, i.e., examining whether the performance of the neural network was significantly hurt under the overwriting attack when the network was trained with the loss function $L$ in Equation (15). We compared the classification accuracy of the network without the attack and the classification accuracy under the attack, so as to analyze the performance decline of the network under the overwriting attack.

We ran experiments with AlexNet [Krizhevsky et al., 2012] and ResNet18 [He et al., 2016] on Caltech-101, Caltech-256 [Fei-Fei et al., 2006], CIFAR-10, and CIFAR-100 [Krizhevsky et al., 2009] for image classification. For AlexNet, the watermark module containing 256 convolutional filters was connected to the third convolutional layer. For ResNet18, the watermark module containing 256 convolutional filters was connected to the second convolutional layer of the second residual block. The scalar weight $\lambda$ was set to $5 \times 10^{-4}$. The noise $\epsilon$ added to the parameters in the watermark module was obtained by conducting the IDFT on a unit frequency component at a random frequency, and the $l_2$-norm of the noise $\epsilon$ was set to 0.5 times the $l_2$-norm of the weights.

Table 1 shows the experiment results. We observe that if the network is trained with the loss function $L = L_{\text{CE}} + L_{\text{attack}}$ in Equation (15), the classification accuracy significantly drops under the overwriting attack. The results indicate that the newly introduced loss term effectively defends against the overwriting attack.

3.6. Verifying the robustness of the watermark

Verifying the robustness towards fine-tuning. We conducted experiments to verify the invariance of the proposed watermark under fine-tuning. Let us fine-tune a trained DNN whose watermark module contains filters $[W_1, W_2, \cdots, W_D]$, and obtain a fine-tuned DNN with filters $[W'_1, W'_2, \cdots, W'_D]$. We computed the average norm of the change of the frequency components, $\mathbb{E}_d[\|\Delta F_{W_d}^{(uv)}\|]$, to measure the invariance of the proposed watermark, where $\Delta F_{W_d}^{(uv)} = F_{W'_d}^{(uv)} - F_{W_d}^{(uv)}$ denotes the change of the frequency components extracted from the $d$-th convolutional filter.

We trained AlexNet and ResNet18 on CIFAR-100, and then fine-tuned them on CIFAR-10 and Caltech-101. All other experiment settings remained the same as described in Section 3.5. Figure 5 shows that the frequency components used as the watermark keep stable during fine-tuning. Table 2 shows that adding the watermark does not degrade the fine-tuning performance of the network. The results indicate that the watermark is robust to the fine-tuning attack.

Verifying the robustness towards weight scaling. We conducted experiments to verify the robustness of the proposed watermark towards weight scaling. Given a watermarked DNN, we scaled the parameters in the watermark module by a constant $a$ ($a > 0$), and then detected the watermark using the method introduced in Section 3.4. We used the watermark detection rate DR to show the robustness of the watermark towards weight scaling.
We trained AlexNet on CIFAR-10, CIFAR-100, Caltech-101, and Caltech-256. All other experiment settings remained the same as described in Section 3.5. Table 3 shows the experiment results. All the watermark detection rates are 100%, showing that our method is highly robust to the weight scaling attack.

Verifying the robustness towards weight permutations. We conducted experiments to verify the robustness of the proposed watermark towards weight permutations. Given a watermarked DNN, we permuted the filters in the watermark module with a random permutation $\pi$, and then detected the watermark using the method introduced in Section 3.4. We used the watermark detection rate DR to show the robustness of the watermark towards weight permutations. We trained AlexNet on CIFAR-10, CIFAR-100, Caltech-101, and Caltech-256. All other experiment settings remained the same as described in Section 3.5. Table 4 shows the experiment results. All the watermark detection rates are 100%, showing that our method is highly robust to the weight permutation attack.

Table 1. Experiment results on the effectiveness of the newly added loss term $L_{\text{attack}}$. "Baseline" denotes the test accuracy of a neural network normally trained without the watermark. "With $L_{\text{CE}} + L_{\text{attack}}$" denotes the test accuracy of a watermarked network trained with the loss function $L_{\text{CE}} + L_{\text{attack}}$, and "With $L_{\text{CE}}$" denotes the test accuracy of a watermarked network trained without the added loss term $L_{\text{attack}}$. The accuracy outside the parentheses is measured without the overwriting attack, and the accuracy inside the parentheses is measured under the overwriting attack.

Dataset      | Baseline (%)         | With L_CE + L_attack (%)       | With L_CE (%)
             | AlexNet   ResNet-18  | AlexNet        ResNet-18       | AlexNet        ResNet-18
CIFAR-10     | 91.03     94.83      | 90.28 (43.55)  92.17 (72.26)   | 91.12 (91.12)  94.89 (94.89)
CIFAR-100    | 68.10     76.29      | 66.34 (36.93)  75.49 (41.18)   | 67.52 (67.52)  76.53 (76.53)
Caltech-101  | 66.46     70.10      | 62.15 (32.53)  67.14 (41.33)   | 67.84 (67.84)  69.87 (69.87)
Caltech-256  | 40.50     54.61      | 37.97 (15.37)  50.33 (18.13)   | 39.22 (39.22)  53.86 (53.86)

Table 2. Experiment results verifying the robustness towards fine-tuning. "Baseline" denotes the accuracy of a neural network normally trained without the watermark; "Ours" denotes the accuracy of a watermarked network. The rate inside the parentheses is the watermark detection rate of the fine-tuned DNN; the accuracy outside the parentheses is the test accuracy on the target dataset.

Source       | Target       | Baseline (%)         | Ours (%)
             |              | AlexNet   ResNet-18  | AlexNet      ResNet-18
CIFAR-100    | CIFAR-10     | 88.90     93.03      | 89.65 (100)  94.12 (100)
CIFAR-100    | Caltech-101  | 70.02     76.80      | 72.11 (100)  79.98 (100)

Table 3. Experiment results verifying the robustness towards weight scaling. The rate outside the parentheses is the watermark detection rate without the weight scaling attack, and the rate inside the parentheses is the watermark detection rate under the weight scaling attack.

a    | CIFAR-10 (%) | CIFAR-100 (%) | Caltech-101 (%) | Caltech-256 (%)
10   | 100 (100)    | 100 (100)     | 100 (100)       | 100 (100)
100  | 100 (100)    | 100 (100)     | 100 (100)       | 100 (100)

Table 4. Experiment results verifying the robustness towards weight permutations. The rate outside the parentheses is the watermark detection rate without the weight permutation attack, and the rate inside the parentheses is the watermark detection rate under the weight permutation attack.

π    | CIFAR-10 (%) | CIFAR-100 (%) | Caltech-101 (%) | Caltech-256 (%)
π1   | 100 (100)    | 100 (100)     | 100 (100)       | 100 (100)
π2   | 100 (100)    | 100 (100)     | 100 (100)       | 100 (100)

4. Conclusion

In this paper, we discover and theoretically prove that specific frequency components of a convolutional filter keep stable during training and are equivariant to weight scaling and weight permutations. Based on this theory, we propose to use these frequency components as the network's watermark to embed the ownership information. Thus, the proposed watermark technique is theoretically guaranteed to be robust to fine-tuning, weight scaling, and weight permutations. Additionally, to defend against the overwriting attack, we add an additional loss term during training to ensure that the network's performance will drop significantly under the overwriting attack. Preliminary experiments have demonstrated the effectiveness of the proposed method.

5. Impact Statements

This paper presents work whose goal is to advance the field of Machine Learning. There are many potential societal consequences of our work, none of which we feel must be specifically highlighted here.
References

Yossi Adi, Carsten Baum, Moustapha Cisse, Benny Pinkas, and Joseph Keshet. Turning your weakness into a strength: Watermarking deep neural networks by backdooring. In 27th USENIX Security Symposium (USENIX Security 18), pages 1615–1631, 2018.

William Aiken, Hyoungshick Kim, Simon Woo, and Jungwoo Ryoo. Neural network laundering: Removing black-box backdoor watermarks from deep neural networks. Computers & Security, 106:102277, 2021.

Arpit Bansal, Ping-yeh Chiang, Michael J. Curry, Rajiv Jain, Curtis Wigington, Varun Manjunatha, John P. Dickerson, and Tom Goldstein. Certified neural network watermarks with randomized smoothing. In International Conference on Machine Learning, pages 1450–1465. PMLR, 2022.

Li Fei-Fei, Robert Fergus, and Pietro Perona. One-shot learning of object categories. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(4):594–611, 2006.

Pierre Fernandez, Guillaume Couairon, Teddy Furon, and Matthijs Douze. Functional invariants to watermark large transformers. In ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 4815–4819. IEEE, 2024.

Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.

Anil K. Jain. Fundamentals of Digital Image Processing. Prentice-Hall, Inc., 1989.

John Kirchenbauer, Jonas Geiping, Yuxin Wen, Jonathan Katz, Ian Miers, and Tom Goldstein. A watermark for large language models. In International Conference on Machine Learning, pages 17061–17084. PMLR, 2023.

Simon Kornblith, Mohammad Norouzi, Honglak Lee, and Geoffrey Hinton. Similarity of neural network representations revisited. In International Conference on Machine Learning, pages 3519–3529. PMLR, 2019.

Ken Kreutz-Delgado. The complex gradient operator and the CR-calculus. arXiv preprint arXiv:0906.4835, 2009.

Alex Krizhevsky, Geoffrey Hinton, et al. Learning multiple layers of features from tiny images. 2009.

Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 25, 2012.

Hanwen Liu, Zhenyu Weng, and Yuesheng Zhu. Watermarking deep neural networks with greedy residuals. In International Conference on Machine Learning, pages 6978–6988, 2021.

Nils Lukas, Yuxuan Zhang, and Florian Kerschbaum. Deep neural network fingerprinting by conferrable adversarial examples. In International Conference on Learning Representations, 2021.

Mohammad Ali Nematollahi, Chalee Vorakulpipat, and Hamurabi Gamboa Rosales. Digital Watermarking. Springer, 2017.

Jiaxiang Ren, Yang Zhou, Jiayin Jin, Lingjuan Lyu, and Da Yan. Dimension-independent certified neural network watermarks via mollifier smoothing. In International Conference on Machine Learning, pages 28976–29008. PMLR, 2023.

Masoumeh Shafieinejad, Nils Lukas, Jiaqi Wang, Xinda Li, and Florian Kerschbaum. On the robustness of backdoor-based watermarking in deep neural networks. In Proceedings of the 2021 ACM Workshop on Information Hiding and Multimedia Security, pages 177–188, 2021.

Jingxuan Tan, Nan Zhong, Zhenxing Qian, Xinpeng Zhang, and Sheng Li. Deep neural network watermarking against model extraction attack. In Proceedings of the 31st ACM International Conference on Multimedia, pages 1588–1597, 2023.

Ling Tang, Wen Shen, Zhanpeng Zhou, Yuefeng Chen, and Quanshi Zhang. Defects of convolutional decoder networks in frequency representation. In International Conference on Machine Learning, pages 33758–33791. PMLR, 2023.

Yusuke Uchida, Yuki Nagai, Shigeyuki Sakazawa, and Shin'ichi Satoh. Embedding watermarks into deep neural networks. In Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, pages 269–277, 2017.

Jiangfeng Wang, Hanzhou Wu, Xinpeng Zhang, and Yuwei Yao. Watermarking in deep neural networks via error back-propagation. Electronic Imaging, 32:1–9, 2020.

Jiashu Xu, Fei Wang, Mingyu Derek Ma, Pang Wei Koh, Chaowei Xiao, and Muhao Chen. Instructional fingerprinting of large language models. arXiv preprint arXiv:2401.12255, 2024.

Boyi Zeng, Lizheng Wang, Yuncong Hu, Yi Xu, Chenghu Zhou, Xinbing Wang, Yu Yu, and Zhouhan Lin. HuRef: Human-readable fingerprint for large language models. In The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2023.

Jie Zhang, Dongrui Liu, Chen Qian, Linfeng Zhang, Yong Liu, Yu Qiao, and Jing Shao. REEF: Representation encoding fingerprints for large language models. arXiv preprint arXiv:2410.14273, 2024.
A. Proofs of our theoretical findings
We first introduce an important equation, which is widely used in the following proofs.
Lemma A.1. Given $N$ complex numbers $e^{in\theta}$, $n = 0, 1, \dots, N-1$, the sum of these $N$ complex numbers is given as follows.

$$\forall \theta \in \mathbb{R},\quad \sum_{n=0}^{N-1} e^{in\theta} = \frac{\sin(\frac{N\theta}{2})}{\sin(\frac{\theta}{2})}\, e^{i\frac{(N-1)\theta}{2}}. \tag{16}$$

Specifically, when $N\theta = 2k\pi$, $k \in \mathbb{Z}$, $-N < k < N$, we have

$$\sum_{n=0}^{N-1} e^{in\theta} = \frac{\sin(\frac{N\theta}{2})}{\sin(\frac{\theta}{2})}\, e^{i\frac{(N-1)\theta}{2}} = N\delta_{\theta}, \qquad \text{where } \delta_{\theta} = \begin{cases} 1, & \theta = 0,\\ 0, & \text{otherwise.} \end{cases} \tag{17}$$
We prove Lemma A.1 as follows.
Proof. First, let us use the letter $S \in \mathbb{C}$ to denote the term $\sum_{n=0}^{N-1} e^{in\theta}$:

$$S = \sum_{n=0}^{N-1} e^{in\theta}.$$

Therefore, $e^{i\theta} S$ is formulated as follows.

$$e^{i\theta} S = \sum_{n=1}^{N} e^{in\theta} \in \mathbb{C}.$$

Then, $S$ can be computed as $S = \frac{e^{i\theta}S - S}{e^{i\theta} - 1}$. Therefore, we have

$$\begin{aligned}
S &= \frac{e^{i\theta}S - S}{e^{i\theta} - 1}
= \frac{\sum_{n=1}^{N} e^{in\theta} - \sum_{n=0}^{N-1} e^{in\theta}}{e^{i\theta} - 1}
= \frac{e^{iN\theta} - 1}{e^{i\theta} - 1}
= \frac{e^{i\frac{N\theta}{2}} - e^{-i\frac{N\theta}{2}}}{e^{i\frac{\theta}{2}} - e^{-i\frac{\theta}{2}}}\, e^{i\frac{(N-1)\theta}{2}}\\
&= \frac{\big(e^{i\frac{N\theta}{2}} - e^{-i\frac{N\theta}{2}}\big)/2i}{\big(e^{i\frac{\theta}{2}} - e^{-i\frac{\theta}{2}}\big)/2i}\, e^{i\frac{(N-1)\theta}{2}}
= \frac{\sin(\frac{N\theta}{2})}{\sin(\frac{\theta}{2})}\, e^{i\frac{(N-1)\theta}{2}}.
\end{aligned}$$

Therefore, we prove that $\sum_{n=0}^{N-1} e^{in\theta} = \frac{\sin(N\theta/2)}{\sin(\theta/2)}\, e^{i(N-1)\theta/2}$.

Then, we prove the special case: when $N\theta = 2k\pi$, $k \in \mathbb{Z}$, $-N < k < N$, we have $\sum_{n=0}^{N-1} e^{in\theta} = N\delta_{\theta}$, as follows.

When $\theta = 0$, we have

$$\lim_{\theta \to 0} \sum_{n=0}^{N-1} e^{in\theta} = \lim_{\theta \to 0} \frac{\sin(\frac{N\theta}{2})}{\sin(\frac{\theta}{2})}\, e^{i\frac{(N-1)\theta}{2}} = \lim_{\theta \to 0} \frac{\sin(\frac{N\theta}{2})}{\sin(\frac{\theta}{2})} = N.$$

When $\theta \neq 0$ and $N\theta = 2k\pi$, $k \in \mathbb{Z}$, $-N < k < N$, we have

$$\sum_{n=0}^{N-1} e^{in\theta} = \frac{\sin(\frac{N\theta}{2})}{\sin(\frac{\theta}{2})}\, e^{i\frac{(N-1)\theta}{2}} = \frac{\sin(k\pi)}{\sin(\frac{k\pi}{N})}\, e^{i\frac{(N-1)k\pi}{N}} = 0. \qquad \square$$
=0

In the following proofs, the following two equations are widely used, which are derived based on Lemma A.1.
$$\sum_{m=0}^{M-1}\sum_{n=0}^{N-1} e^{-i(\frac{um}{M} + \frac{vn}{N})2\pi}
= \sum_{m=0}^{M-1} e^{im(-\frac{u2\pi}{M})} \sum_{n=0}^{N-1} e^{in(-\frac{v2\pi}{N})}
= \big(M\delta_{-\frac{u2\pi}{M}}\big)\big(N\delta_{-\frac{v2\pi}{N}}\big)
= \begin{cases} MN, & u = v = 0,\\ 0, & \text{otherwise,} \end{cases}$$

according to Equation (17). To simplify the notation, let $\delta_{uv}$ abbreviate $\delta_{-\frac{u2\pi}{M}}\,\delta_{-\frac{v2\pi}{N}}$ in the following proofs. Therefore, we have

$$\sum_{m=0}^{M-1}\sum_{n=0}^{N-1} e^{-i(\frac{um}{M} + \frac{vn}{N})2\pi} = MN\delta_{uv} = \begin{cases} MN, & u = v = 0,\\ 0, & \text{otherwise.} \end{cases} \tag{18}$$

Similarly, we derive the second equation as follows, again according to Equation (17).

$$\sum_{m=0}^{M-1}\sum_{n=0}^{N-1} e^{i(\frac{(u-u')m}{M} + \frac{(v-v')n}{N})2\pi}
= \sum_{m=0}^{M-1} e^{im\frac{(u-u')2\pi}{M}} \sum_{n=0}^{N-1} e^{in\frac{(v-v')2\pi}{N}}
= MN\,\delta_{\frac{(u-u')2\pi}{M}}\,\delta_{\frac{(v-v')2\pi}{N}}
= MN\,\delta_{u-u'}\,\delta_{v-v'}
= \begin{cases} MN, & u' = u,\ v' = v,\\ 0, & \text{otherwise.} \end{cases} \tag{19}$$

A.1. Proof of Theorem 3.2


In this section, we prove Theorem 3.2 in the main paper, as follows.
Proof. According to the DFT and the inverse DFT, we can obtain the mathematical relationship between $G^{(c)}_{uv}$ and $X^{(c)}_{mn}$, and the mathematical relationship between $Q^{(c)}_{uv}$ and $W^{(c)}_{ts}$, as follows.

$$G^{(c)}_{uv} = \sum_{m=0}^{M-1}\sum_{n=0}^{N-1} X^{(c)}_{mn}\, e^{-i(\frac{um}{M} + \frac{vn}{N})2\pi}, \qquad
Q^{(c)}_{uv} = \sum_{t=0}^{K-1}\sum_{s=0}^{K-1} W^{(c)}_{ts}\, e^{i(\frac{ut}{M} + \frac{vs}{N})2\pi},$$
$$X^{(c)}_{mn} = \frac{1}{MN}\sum_{u=0}^{M-1}\sum_{v=0}^{N-1} G^{(c)}_{uv}\, e^{i(\frac{um}{M} + \frac{vn}{N})2\pi}, \qquad
W^{(c)}_{ts} = \frac{1}{MN}\sum_{u=0}^{M-1}\sum_{v=0}^{N-1} Q^{(c)}_{uv}\, e^{-i(\frac{ut}{M} + \frac{vs}{N})2\pi}. \tag{20}$$

Based on Equation (20) and the derivation rule for complex numbers [Kreutz-Delgado, 2009], we can obtain the mathematical relationship between $\frac{\partial Loss}{\partial \bar{G}^{(c)}_{uv}}$ and $\frac{\partial Loss}{\partial \bar{X}^{(c)}_{mn}}$, and the mathematical relationship between $\frac{\partial Loss}{\partial \bar{Q}^{(c)}_{uv}}$ and $\frac{\partial Loss}{\partial \bar{W}^{(c)}_{ts}}$, as follows. Note that when we use gradient descent to optimize a real-valued loss function $Loss$ with complex variables, one usually treats the real and imaginary parts, $a \in \mathbb{R}$ and $b \in \mathbb{R}$, of a complex variable $z = a + bi$ as two separate real-valued variables, and separately updates these two real-valued variables. In this way, the exact optimization step of $z$ computed with this technique is equivalent to $\frac{\partial Loss}{\partial \bar{z}}$. Since $X^{(c)}_{mn}$ and $W^{(c)}_{ts}$ are real numbers, $\frac{\partial Loss}{\partial \bar{X}^{(c)}_{mn}} = \frac{\partial Loss}{\partial X^{(c)}_{mn}}$ and $\frac{\partial Loss}{\partial \bar{W}^{(c)}_{ts}} = \frac{\partial Loss}{\partial W^{(c)}_{ts}}$.

$$\frac{\partial Loss}{\partial \bar{G}^{(c)}_{uv}} = \frac{1}{MN}\sum_{m=0}^{M-1}\sum_{n=0}^{N-1} \frac{\partial Loss}{\partial \bar{X}^{(c)}_{mn}}\, e^{-i(\frac{um}{M} + \frac{vn}{N})2\pi}, \qquad
\frac{\partial Loss}{\partial \bar{Q}^{(c)}_{uv}} = \frac{1}{MN}\sum_{t=0}^{K-1}\sum_{s=0}^{K-1} \frac{\partial Loss}{\partial \bar{W}^{(c)}_{ts}}\, e^{i(\frac{ut}{M} + \frac{vs}{N})2\pi},$$
$$\frac{\partial Loss}{\partial \bar{X}^{(c)}_{mn}} = \sum_{u=0}^{M-1}\sum_{v=0}^{N-1} \frac{\partial Loss}{\partial \bar{G}^{(c)}_{uv}}\, e^{i(\frac{um}{M} + \frac{vn}{N})2\pi}, \qquad
\frac{\partial Loss}{\partial \bar{W}^{(c)}_{ts}} = \sum_{u=0}^{M-1}\sum_{v=0}^{N-1} \frac{\partial Loss}{\partial \bar{Q}^{(c)}_{uv}}\, e^{-i(\frac{ut}{M} + \frac{vs}{N})2\pi}. \tag{21}$$

Let us conduct the convolution operation on the feature map $X = [X^{(1)}, X^{(2)}, \cdots, X^{(C)}] \in \mathbb{R}^{C \times M \times N}$, and obtain the output feature map $Y \in \mathbb{R}^{M \times N}$ as follows.

$$Y_{mn} = b + \sum_{c=1}^{C}\sum_{t=0}^{K-1}\sum_{s=0}^{K-1} W^{(c)}_{ts}\, X^{(c)}_{m+t,\,n+s}. \tag{22}$$

Based on Equations (20) and (21), and the derivation rule for complex numbers [Kreutz-Delgado, 2009], the exact optimization step of $Q^{(c)}_{uv}$ in real implementations can be computed as follows.

$$\begin{aligned}
\frac{\partial Loss}{\partial \bar{Q}^{(c)}_{uv}}
&= \frac{1}{MN}\sum_{t=0}^{K-1}\sum_{s=0}^{K-1} \frac{\partial Loss}{\partial \bar{W}^{(c)}_{ts}}\, e^{i(\frac{ut}{M} + \frac{vs}{N})2\pi}
&& \text{// Equation (21)}\\
&= \frac{1}{MN}\sum_{t=0}^{K-1}\sum_{s=0}^{K-1} \Bigg(\sum_{m=0}^{M-1}\sum_{n=0}^{N-1} \frac{\partial Loss}{\partial \bar{Y}_{mn}} \cdot \bar{X}^{(c)}_{m+t,\,n+s}\Bigg) e^{i(\frac{ut}{M} + \frac{vs}{N})2\pi}
&& \text{// Equation (22)}\\
&= \frac{1}{MN}\sum_{t=0}^{K-1}\sum_{s=0}^{K-1} \sum_{m=0}^{M-1}\sum_{n=0}^{N-1} \frac{\partial Loss}{\partial \bar{Y}_{mn}} \cdot \Bigg(\frac{1}{MN}\sum_{u'=0}^{M-1}\sum_{v'=0}^{N-1} \bar{G}^{(c)}_{u'v'}\, e^{-i(\frac{u'(m+t)}{M} + \frac{v'(n+s)}{N})2\pi}\Bigg) e^{i(\frac{ut}{M} + \frac{vs}{N})2\pi}
&& \text{// Equation (20)}\\
&= \frac{1}{MN}\sum_{t=0}^{K-1}\sum_{s=0}^{K-1} \sum_{u'=0}^{M-1}\sum_{v'=0}^{N-1} \bar{G}^{(c)}_{u'v'}\, e^{-i(\frac{u't}{M} + \frac{v's}{N})2\pi} \Bigg(\frac{1}{MN}\sum_{m=0}^{M-1}\sum_{n=0}^{N-1} \frac{\partial Loss}{\partial \bar{Y}_{mn}}\, e^{-i(\frac{u'm}{M} + \frac{v'n}{N})2\pi}\Bigg) e^{i(\frac{ut}{M} + \frac{vs}{N})2\pi}\\
&= \frac{1}{MN}\sum_{t=0}^{K-1}\sum_{s=0}^{K-1} \sum_{u'=0}^{M-1}\sum_{v'=0}^{N-1} \bar{G}^{(c)}_{u'v'}\, \frac{\partial Loss}{\partial \bar{H}_{u'v'}}\, e^{-i(\frac{u't}{M} + \frac{v's}{N})2\pi}\, e^{i(\frac{ut}{M} + \frac{vs}{N})2\pi}
&& \text{// Equation (21)}\\
&= \frac{1}{MN}\sum_{u'=0}^{M-1}\sum_{v'=0}^{N-1} \bar{G}^{(c)}_{u'v'}\, \frac{\partial Loss}{\partial \bar{H}_{u'v'}} \sum_{t=0}^{K-1}\sum_{s=0}^{K-1} e^{i(\frac{(u-u')t}{M} + \frac{(v-v')s}{N})2\pi}\\
&= \frac{1}{MN}\sum_{u'=0}^{M-1}\sum_{v'=0}^{N-1} A_{uvu'v'}\, \bar{G}^{(c)}_{u'v'}\, \frac{\partial Loss}{\partial \bar{H}_{u'v'}},
&& \text{// Let } A_{uvu'v'} = \sum_{t=0}^{K-1}\sum_{s=0}^{K-1} e^{i(\frac{(u-u')t}{M} + \frac{(v-v')s}{N})2\pi}
\end{aligned}$$

where $A_{uvu'v'}$ can be rewritten as follows, according to Equation (16).

$$A_{uvu'v'} = \sum_{t=0}^{K-1}\sum_{s=0}^{K-1} e^{i(\frac{(u-u')t}{M} + \frac{(v-v')s}{N})2\pi}
= \sum_{t=0}^{K-1} e^{i\frac{(u-u')2\pi}{M}t} \sum_{s=0}^{K-1} e^{i\frac{(v-v')2\pi}{N}s}
= \frac{\sin(\frac{K(u-u')\pi}{M})\,\sin(\frac{K(v-v')\pi}{N})}{\sin(\frac{(u-u')\pi}{M})\,\sin(\frac{(v-v')\pi}{N})}\, e^{i(\frac{(K-1)(u-u')}{M} + \frac{(K-1)(v-v')}{N})\pi}$$

Based on the derived $\frac{\partial Loss}{\partial \bar{Q}^{(c)}_{uv}} \in \mathbb{C}$, we can further compute the gradient w.r.t. the frequency component vector $F_W^{(uv)}$ as follows.

$$\frac{\partial Loss}{\partial \bar{F}_W^{(uv)}} = \frac{1}{MN}\sum_{u'=0}^{M-1}\sum_{v'=0}^{N-1} A_{uvu'v'}\, \frac{\partial Loss}{\partial \bar{F}_Y^{(u'v')}} \cdot \bar{F}_X^{(u'v')}. \tag{23}$$

Let us use the gradient descent algorithm to update the convolutional weights $W^{(c)}_{ts}\big|_n$ at the $n$-th step:

$$\forall t, s,\quad W^{(c)}_{ts}\big|_{n+1} = W^{(c)}_{ts}\big|_{n} - \eta \cdot \frac{\partial Loss}{\partial \bar{W}^{(c)}_{ts}},$$

where $\eta$ is the learning rate. Then, the resulting change of the frequency element $Q^{(c)}_{uv}$ is given as follows.

$$\begin{aligned}
\Delta Q^{(c)}_{uv} &= Q^{(c)}_{uv}\big|_{n+1} - Q^{(c)}_{uv}\big|_{n}\\
&= \sum_{t=0}^{K-1}\sum_{s=0}^{K-1} W^{(c)}_{ts}\big|_{n+1}\, e^{i(\frac{ut}{M} + \frac{vs}{N})2\pi} - Q^{(c)}_{uv}\big|_{n}
&& \text{// Equation (20)}\\
&= \sum_{t=0}^{K-1}\sum_{s=0}^{K-1} \Big(W^{(c)}_{ts}\big|_{n} - \eta \cdot \frac{\partial Loss}{\partial \bar{W}^{(c)}_{ts}}\Big) e^{i(\frac{ut}{M} + \frac{vs}{N})2\pi} - Q^{(c)}_{uv}\big|_{n}\\
&= \Bigg(\sum_{t=0}^{K-1}\sum_{s=0}^{K-1} W^{(c)}_{ts}\big|_{n}\, e^{i(\frac{ut}{M} + \frac{vs}{N})2\pi} - Q^{(c)}_{uv}\big|_{n}\Bigg) - \eta \sum_{t=0}^{K-1}\sum_{s=0}^{K-1} \frac{\partial Loss}{\partial \bar{W}^{(c)}_{ts}}\, e^{i(\frac{ut}{M} + \frac{vs}{N})2\pi}\\
&= -\eta \sum_{t=0}^{K-1}\sum_{s=0}^{K-1} \frac{\partial Loss}{\partial \bar{W}^{(c)}_{ts}}\, e^{i(\frac{ut}{M} + \frac{vs}{N})2\pi}
&& \text{// Equation (20)}\\
&= -\eta M N\, \frac{\partial Loss}{\partial \bar{Q}^{(c)}_{uv}}.
&& \text{// Equation (21)}
\end{aligned}$$

Therefore, we have proven that one gradient step on $W^{(c)}_{ts}$ is equivalent to an $MN$-scaled step on $Q^{(c)}_{uv}$. In this way, the change of the frequency components $F_W^{(uv)}$ can be computed as follows.

$$\Delta F_W^{(uv)} = -\eta \sum_{u'=0}^{M-1}\sum_{v'=0}^{N-1} A_{uvu'v'}\, \frac{\partial Loss}{\partial \bar{F}_Y^{(u'v')}} \cdot \bar{F}_X^{(u'v')}. \tag{24} \qquad \square$$
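The closed form in Equation (24) can be checked numerically. The sketch below (our code and naming) takes one gradient step on the quadratic loss $Loss = \frac{1}{2}\sum_{mn} Y_{mn}^2$, for which $\frac{\partial Loss}{\partial \bar{H}_{u'v'}} = H_{u'v'}/(MN)$ under the paper's convention, and compares the direct change of $T_{uv}(W)$ with the analytic formula.

```python
import numpy as np

C, K, M, N, eta = 2, 3, 8, 8, 0.05
rng = np.random.default_rng(2)
W = rng.standard_normal((C, K, K))
X = rng.standard_normal((C, M, N))

shift = lambda t, s: np.roll(X, (-t, -s), axis=(1, 2))
Y = sum(np.einsum('c,cmn->mn', W[:, t, s], shift(t, s))
        for t in range(K) for s in range(K))
# dLoss/dW[c,t,s] = sum_mn Y_mn * X[c, m+t, n+s]
grad = np.stack([[np.einsum('mn,cmn->c', Y, shift(t, s))
                  for s in range(K)] for t in range(K)]).transpose(2, 0, 1)

def T(W):  # revised DFT of the filter, Eq. (6)
    Eu = np.exp(2j * np.pi * np.outer(np.arange(M), np.arange(K)) / M)
    Ev = np.exp(2j * np.pi * np.outer(np.arange(N), np.arange(K)) / N)
    return np.einsum('ut,cts,vs->cuv', Eu, W, Ev)

dF_direct = T(W - eta * grad) - T(W)

# Analytic side of Eq. (24): A factorizes as A_M(u - u') * A_N(v - v').
G, H = np.fft.fft2(X), np.fft.fft2(Y)
du = np.arange(M)[:, None] - np.arange(M)[None, :]
dv = np.arange(N)[:, None] - np.arange(N)[None, :]
AM = np.exp(2j * np.pi * du[..., None] * np.arange(K) / M).sum(-1)
AN = np.exp(2j * np.pi * dv[..., None] * np.arange(K) / N).sum(-1)
dF_formula = -eta * np.einsum('up,vq,pq,cpq->cuv', AM, AN, H / (M * N), np.conj(G))

print(np.abs(dF_direct - dF_formula).max())   # ~1e-12
```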

A.2. Proof of Corollary 3.3


In this section, we prove Corollary 3.3 in Section 3 of the main paper, as follows.
Proof. According to Theorem 3.2, the change of the frequency components at frequencies $(u, v) \in S$ can be further derived as follows.

$$\begin{aligned}
\Delta F_W^{(uv)} &= -\eta \sum_{u'=0}^{M-1}\sum_{v'=0}^{N-1} A_{uvu'v'}\, \frac{\partial Loss}{\partial \bar{F}_Y^{(u'v')}} \cdot \bar{F}_X^{(u'v')}\\
&= -\eta\, A_{uv00}\, \frac{\partial Loss}{\partial \bar{F}_Y^{(00)}} \cdot \bar{F}_X^{(00)}
&& \text{// } \forall (u, v) \neq (0, 0),\ F_X^{(uv)} = 0\\
&= -\eta\, \frac{\partial Loss}{\partial \bar{F}_Y^{(00)}} \cdot \bar{F}_X^{(00)} \cdot \frac{\sin(\frac{Ku\pi}{M})\,\sin(\frac{Kv\pi}{N})}{\sin(\frac{u\pi}{M})\,\sin(\frac{v\pi}{N})}\, e^{i(\frac{(K-1)u}{M} + \frac{(K-1)v}{N})\pi}\\
&= -\eta\, \frac{\partial Loss}{\partial \bar{F}_Y^{(00)}} \cdot \bar{F}_X^{(00)} \cdot e^{i(\frac{(K-1)i\pi}{K} + \frac{(K-1)j\pi}{K})} \cdot \frac{\sin(i\pi)\,\sin(j\pi)}{\sin(i\pi/K)\,\sin(j\pi/K)}
&& \text{// } S = \{(u, v) \mid u = iM/K \text{ or } v = jN/K;\ i, j \in \{1, 2, \dots, K-1\}\}\\
&= 0.
&& \text{// } \sin(i\pi) = 0
\end{aligned}$$

Therefore, we have proved that the frequency components $F_W^{(uv)}$ at the frequencies in the set $S$ keep invariant. $\square$

A.3. Proof of Theorem 3.5


In this section, we prove Theorem 3.5 in Section 3 of the main paper, as follows.
Proof. The frequency components $F_{W^*}^{(uv)}$ of the scaled filter $W^* = a \cdot W$ are computed as follows.

$$\begin{aligned}
F_{W^*}^{(uv)} &= \left[Q^{*(1)}_{uv}, Q^{*(2)}_{uv}, \dots, Q^{*(C)}_{uv}\right]^{\top}\\
&= \Bigg[\sum_{t=0}^{K-1}\sum_{s=0}^{K-1} W^{*(1)}_{ts}\, e^{i(\frac{ut}{M} + \frac{vs}{N})2\pi},\ \sum_{t=0}^{K-1}\sum_{s=0}^{K-1} W^{*(2)}_{ts}\, e^{i(\frac{ut}{M} + \frac{vs}{N})2\pi},\ \cdots,\ \sum_{t=0}^{K-1}\sum_{s=0}^{K-1} W^{*(C)}_{ts}\, e^{i(\frac{ut}{M} + \frac{vs}{N})2\pi}\Bigg]^{\top}\\
&= \Bigg[\sum_{t=0}^{K-1}\sum_{s=0}^{K-1} a\, W^{(1)}_{ts}\, e^{i(\frac{ut}{M} + \frac{vs}{N})2\pi},\ \cdots,\ \sum_{t=0}^{K-1}\sum_{s=0}^{K-1} a\, W^{(C)}_{ts}\, e^{i(\frac{ut}{M} + \frac{vs}{N})2\pi}\Bigg]^{\top}\\
&= a \cdot \left[Q^{(1)}_{uv}, Q^{(2)}_{uv}, \dots, Q^{(C)}_{uv}\right]^{\top} = a \cdot F_W^{(uv)}.
\end{aligned}$$

Thus, we have proved that the frequency components $F_{W^*}^{(uv)}$ of the scaled filter are equal to the scaled frequency components $a \cdot F_W^{(uv)}$ of the original filter. $\square$

A.4. Proof of Theorem 3.6


In this section, we prove Theorem 3.6 in Section 3 of the main paper, as follows.
Proof. The frequency components $[F_{W_{\pi(1)}}^{(uv)}, \cdots, F_{W_{\pi(D)}}^{(uv)}]$ of the permuted filters $[W_{\pi(1)}, W_{\pi(2)}, \cdots, W_{\pi(D)}]$ are computed as follows.

$$\begin{aligned}
\left[F_{W_{\pi(1)}}^{(uv)}, \cdots, F_{W_{\pi(D)}}^{(uv)}\right]
&= \left[T_{uv}(W_{\pi(1)}), T_{uv}(W_{\pi(2)}), \cdots, T_{uv}(W_{\pi(D)})\right]
&& \text{// Equation (6)}\\
&= \pi\left[T_{uv}(W_1), T_{uv}(W_2), \cdots, T_{uv}(W_D)\right]\\
&= \pi\left[F_{W_1}^{(uv)}, \cdots, F_{W_D}^{(uv)}\right]. \tag{25}
\end{aligned}$$

Thus, the frequency components of the permuted filters are equal to the permuted frequency components $\pi[F_{W_1}^{(uv)}, \cdots, F_{W_D}^{(uv)}]$. $\square$
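Both equivariance properties follow from the linearity and per-filter structure of $T_{uv}(\cdot)$, and can be checked in a few lines (our own sketch, not the authors' code):

```python
import numpy as np

C, K, M, N, D = 2, 3, 8, 8, 4
rng = np.random.default_rng(3)
filters = rng.standard_normal((D, C, K, K))

def T(W):  # revised DFT of one filter, Eq. (6), over all (u, v)
    Eu = np.exp(2j * np.pi * np.outer(np.arange(M), np.arange(K)) / M)
    Ev = np.exp(2j * np.pi * np.outer(np.arange(N), np.arange(K)) / N)
    return np.einsum('ut,cts,vs->cuv', Eu, W, Ev)

F = np.stack([T(Wd) for Wd in filters])                     # (D, C, M, N)
print(np.allclose(T(2.5 * filters[0]), 2.5 * F[0]))         # Theorem 3.5: True
perm = rng.permutation(D)
F_perm = np.stack([T(Wd) for Wd in filters[perm]])
print(np.allclose(F_perm, F[perm]))                         # Theorem 3.6: True
```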
