ABOUT RNN
PartPrime Inc.
Ryan Jeong
RNN
A way of handling time-series data
(a neural network + the notion of time)
LSTM, GRU …
Ordinary training data: a single vector
Time-series data: a sequence of data vectors
A series of inputs x(n): x(0), x(1), x(2), x(3), …, x(t)
e.g., monthly nationwide temperature data for 2016–2018
A single input datum: x(n)
e.g., an MNIST image
Learn from past time-series data so that,
when new, unseen time-series data are given,
the model predicts the future state.
RNN
Examples of time-series data
Basic neural-network structure
[Figure: a feed-forward network — input layer x(t), hidden layer h(t), output layer y(t), with weight matrices (labeled V and W) between consecutive layers]
Recurrent neural network (RNN) structure
[Figure: the input layer x(t) feeds the hidden layer h(t) through U; the previous hidden layer h(t-1) feeds back into h(t) through W; h(t) feeds the output layer y(t) through V]
The input x(t) given at time t,
combined with the stored hidden layer from time t-1,
produces the hidden layer at time t.
h(t) = f(Ux(t) + Wh(t-1) + b)   (hidden layer formula: f is an activation function, b a bias)
y(t) = g(Vh(t) + c)   (output layer formula: g is an activation function, c a bias)
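To make the two formulas concrete, here is a minimal NumPy sketch of one forward step (the layer sizes and the choices f = tanh, g = identity are illustrative assumptions; they match the regression setup used later in this deck):

import numpy as np

n_in, n_hidden, n_out = 1, 20, 1  # illustrative sizes

U = np.random.randn(n_hidden, n_in) * 0.01      # input -> hidden weights
W = np.random.randn(n_hidden, n_hidden) * 0.01  # hidden -> hidden (recurrent) weights
V = np.random.randn(n_out, n_hidden) * 0.01     # hidden -> output weights
b = np.zeros(n_hidden)                          # hidden bias
c = np.zeros(n_out)                             # output bias

def step(x_t, h_prev, f=np.tanh):
    # h(t) = f(Ux(t) + Wh(t-1) + b)
    h_t = f(U @ x_t + W @ h_prev + b)
    # y(t) = g(Vh(t) + c), with g the identity here
    y_t = V @ h_t + c
    return h_t, y_t

h = np.zeros(n_hidden)  # h(0): the initial hidden state
for x_t in [np.array([0.1]), np.array([0.2]), np.array([0.3])]:
    h, y = step(x_t, h)  # the hidden state carries information across time steps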
p(t) = Ux(t) + Wh(t-1) + b   (hidden layer value)
q(t) = Vh(t) + c   (output layer value)
Error function: E = E(U, V, W, b, c)
Let
eh(t) = ∂E/∂p(t)   (error term of the hidden layer)
eo(t) = ∂E/∂q(t)   (error term of the output layer)
Error term of the hidden layer: eh(t) = ∂E/∂p(t)
Error term of the output layer: eo(t) = ∂E/∂q(t)
Applying the chain rule through p(t) and q(t):
∂E/∂U = ∂E/∂p(t) (∂p(t)/∂U)ᵀ = eh(t) x(t)ᵀ
∂E/∂V = ∂E/∂q(t) (∂q(t)/∂V)ᵀ = eo(t) h(t)ᵀ
∂E/∂W = ∂E/∂p(t) (∂p(t)/∂W)ᵀ = eh(t) h(t-1)ᵀ
∂E/∂b = ∂E/∂p(t) ⊙ ∂p(t)/∂b = eh(t)
∂E/∂c = ∂E/∂q(t) ⊙ ∂q(t)/∂c = eo(t)
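Continuing the NumPy sketch above, each of these gradients is just an outer product of an error term with an activity vector (the error-term values below are random stand-ins, not computed from a real loss):

x_t = np.array([0.1])
h_prev = np.zeros(n_hidden)
h_t, _ = step(x_t, h_prev)

eh = np.random.randn(n_hidden)  # stand-in for eh(t) = ∂E/∂p(t)
eo = np.random.randn(n_out)     # stand-in for eo(t) = ∂E/∂q(t)

dU = np.outer(eh, x_t)     # ∂E/∂U = eh(t) x(t)ᵀ,   shape (n_hidden, n_in)
dV = np.outer(eo, h_t)     # ∂E/∂V = eo(t) h(t)ᵀ,   shape (n_out, n_hidden)
dW = np.outer(eh, h_prev)  # ∂E/∂W = eh(t) h(t-1)ᵀ, shape (n_hidden, n_hidden)
db = eh                    # ∂E/∂b = eh(t)
dc = eo                    # ∂E/∂c = eo(t)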
y(t) = g(Vh(t) + c)   (output layer formula)
In a CNN, the output activation g(·) is usually a sigmoid or softmax function, and the output is a probability value.
In this RNN (a regression task), the output activation is the identity function, g(x) = x.
∴ y(t) = Vh(t) + c
Squared error function:
E = 1/2 Σ_{t=1}^{T} ‖y(t) − t(t)‖²   (t(t) is the target at time t)
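In code, the identity output and the squared error are one line each (a sketch with made-up numbers; the targets t(t) are written as ts to avoid clashing with the time index):

ys = np.array([0.5, 0.7, -0.1])   # example predictions y(1), y(2), y(3)
ts = np.array([0.4, 0.9, -0.2])   # example targets t(1), t(2), t(3)
E = 0.5 * np.sum((ys - ts) ** 2)  # E = 1/2 Σ ||y(t) − t(t)||²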
BPTT (Backpropagation Through Time)
[Figure: the RNN unrolled through time — each input x(t-k) enters its hidden layer h(t-k) through U, each hidden layer h(t-k-1) feeds the next hidden layer h(t-k) through W, and the final hidden layer h(t) produces the output y(t) through V]
BPTT (Backpropagation Through Time)
eh(t-1) = ∂E/∂p(t-1)
        = ∂E/∂p(t) ⊙ ∂p(t)/∂p(t-1)
        = eh(t) ⊙ (∂p(t)/∂h(t-1)) (∂h(t-1)/∂p(t-1))
        = eh(t) ⊙ W f′(p(t-1))
In short, eh(t-1) is expressed in terms of eh(t): the error term propagates backward one time step at a time (e.g., for f = tanh, f′(p) = 1 − tanh²(p)).
BPTT (Backpropagation Through Time)
eh(t-z-1) = eh(t-z) ⊙ W f′(p(t-z-1))
Gradient-descent updates with learning rate η, truncated at τ steps back in time:
U(t+1) = U(t) − η Σ_{z=0}^{τ} eh(t-z) x(t-z)ᵀ
V(t+1) = V(t) − η eo(t) h(t)ᵀ
W(t+1) = W(t) − η Σ_{z=0}^{τ} eh(t-z) h(t-z-1)ᵀ
b(t+1) = b(t) − η Σ_{z=0}^{τ} eh(t-z)
c(t+1) = c(t) − η eo(t)
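Putting the recursion and the update rules together: a minimal truncated-BPTT sketch in NumPy, reusing the parameters from the forward-step sketch above. f = tanh is assumed, and the backward step is written with Wᵀ (and Vᵀ) so that the vector shapes line up:

tau = 5      # how many steps to look back (τ)
eta = 0.01   # learning rate η

# forward pass over the window x(t-τ) .. x(t), storing p and h at each step
xs = [np.random.randn(n_in) for _ in range(tau + 1)]
hs, ps = [np.zeros(n_hidden)], []  # hs[0] is the hidden state before the window
for x_t in xs:
    p = U @ x_t + W @ hs[-1] + b
    ps.append(p)
    hs.append(np.tanh(p))

y_t = V @ hs[-1] + c
target = np.random.randn(n_out)

eo = y_t - target                    # eo(t) for squared error with identity output
eh = (V.T @ eo) * (1 - hs[-1] ** 2)  # eh(t); f′(p) = 1 − tanh²(p) = 1 − h²

dU = np.zeros_like(U); dW = np.zeros_like(W); db = np.zeros_like(b)
for z in range(tau + 1):             # accumulate the sums over z = 0 .. τ
    k = tau - z                      # window index of time t-z
    dU += np.outer(eh, xs[k])        # eh(t-z) x(t-z)ᵀ
    dW += np.outer(eh, hs[k])        # eh(t-z) h(t-z-1)ᵀ
    db += eh
    if z < tau:                      # propagate the error term one step back
        eh = (W.T @ eh) * (1 - np.tanh(ps[k - 1]) ** 2)

U -= eta * dU; W -= eta * dW; b -= eta * db
V -= eta * np.outer(eo, hs[-1]); c -= eta * eo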
IMPLEMENT
1. Prepare DATA
import numpy as np

def sin(x, T=100):
    return np.sin(2.0 * np.pi * x / T)

def toy_problem(T=100, ampl=0.05):
    x = np.arange(0, 2 * T + 1)
    noise = ampl * np.random.uniform(low=-1.0, high=1.0, size=len(x))
    return sin(x) + noise

A function that generates sine-wave data with added noise
1. Prepare DATA
Generate the data

from sklearn.model_selection import train_test_split

T = 100
f = toy_problem(T)

length_of_sequences = 2 * T  # length of the whole time series
maxlen = 25                  # length of one time-series sample

data = []
target = []
for i in range(0, length_of_sequences - maxlen + 1):
    data.append(f[i: i + maxlen])
    target.append(f[i + maxlen])

X = np.array(data).reshape(len(data), maxlen, 1)
Y = np.array(target).reshape(len(data), 1)

# split off the validation data
N_train = int(len(data) * 0.9)
N_validation = len(data) - N_train
X_train, X_validation, Y_train, Y_validation = train_test_split(X, Y, test_size=N_validation)
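As a quick sanity check on the shapes (these numbers follow directly from T = 100 and maxlen = 25 above):

print(X.shape)                # (176, 25, 1): 176 sliding windows of 25 points each
print(Y.shape)                # (176, 1): the point immediately following each window
print(N_train, N_validation)  # 158 18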
2. with Tensorflow
Model setup (TensorFlow 1.x API)

import tensorflow as tf

n_in = len(X[0][0])  # 1
n_hidden = 20
n_out = len(Y[0])    # 1

x = tf.placeholder(tf.float32, shape=[None, maxlen, n_in])
t = tf.placeholder(tf.float32, shape=[None, n_out])
n_batch = tf.placeholder(tf.int32)

y = inference(x, n_batch, maxlen=maxlen, n_hidden=n_hidden, n_out=n_out)
loss = loss(y, t)  # rebinds the name 'loss' from the helper function to the loss tensor
train_step = training(loss)

early_stopping = EarlyStopping(patience=10, verbose=1)
history = {
    'val_loss': []
}
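The deck calls early_stopping.validate(val_loss) but never shows the class. Here is a minimal sketch of an EarlyStopping helper with that interface (an assumption, not the author's exact code): it stops once the validation loss has failed to improve for more than `patience` consecutive epochs.

class EarlyStopping:
    def __init__(self, patience=0, verbose=0):
        self._step = 0             # epochs without improvement so far
        self._loss = float('inf')  # best validation loss seen so far
        self.patience = patience
        self.verbose = verbose

    def validate(self, loss):
        if self._loss < loss:      # no improvement this epoch
            self._step += 1
            if self._step > self.patience:
                if self.verbose:
                    print('early stopping')
                return True
        else:                      # improvement: reset the counter
            self._step = 0
            self._loss = loss
        return False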
2. with Tensorflow
Model definition

def inference(x, n_batch, maxlen=None, n_hidden=None, n_out=None):
    def weight_variable(shape):
        initial = tf.truncated_normal(shape, stddev=0.01)
        return tf.Variable(initial)

    def bias_variable(shape):
        initial = tf.zeros(shape, dtype=tf.float32)
        return tf.Variable(initial)

    cell = tf.contrib.rnn.BasicRNNCell(n_hidden)
    initial_state = cell.zero_state(n_batch, tf.float32)

    state = initial_state
    outputs = []  # store the hidden-layer outputs from past steps
    with tf.variable_scope('RNN'):
        for t in range(maxlen):
            if t > 0:
                tf.get_variable_scope().reuse_variables()
            (cell_output, state) = cell(x[:, t, :], state)
            outputs.append(cell_output)

    output = outputs[-1]

    V = weight_variable([n_hidden, n_out])
    c = bias_variable([n_out])
    y = tf.matmul(output, V) + c  # linear activation

    return y

def training(loss):
    optimizer = tf.train.AdamOptimizer(learning_rate=0.001,
                                       beta1=0.9,
                                       beta2=0.999)
    train_step = optimizer.minimize(loss)
    return train_step

def loss(y, t):
    mse = tf.reduce_mean(tf.square(y - t))
    return mse
2. with Tensorflow
Model training

from sklearn.utils import shuffle

epochs = 500
batch_size = 10

init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

n_batches = N_train // batch_size

for epoch in range(epochs):
    X_, Y_ = shuffle(X_train, Y_train)

    for i in range(n_batches):
        start = i * batch_size
        end = start + batch_size

        sess.run(train_step, feed_dict={
            x: X_[start:end],
            t: Y_[start:end],
            n_batch: batch_size
        })

    # evaluate on the validation data
    val_loss = loss.eval(session=sess, feed_dict={
        x: X_validation,
        t: Y_validation,
        n_batch: N_validation
    })

    history['val_loss'].append(val_loss)
    print('epoch:', epoch, ' validation loss:', val_loss)

    # early-stopping check
    if early_stopping.validate(val_loss):
        break
2. with Tensorflow
Prediction

truncate = maxlen
Z = X[:1]  # take only the first window of the original data

original = [f[i] for i in range(maxlen)]
predicted = [None for i in range(maxlen)]

for i in range(length_of_sequences - maxlen + 1):
    # predict the future from the most recent window
    z_ = Z[-1:]
    y_ = y.eval(session=sess, feed_dict={
        x: Z[-1:],
        n_batch: 1
    })

    # use the prediction to build the next window
    sequence_ = np.concatenate(
        (z_.reshape(maxlen, n_in)[1:], y_), axis=0).reshape(1, maxlen, n_in)

    Z = np.append(Z, sequence_, axis=0)
    predicted.append(y_.reshape(-1))
2. with Tensorflow
Plot

import matplotlib.pyplot as plt

plt.rc('font', family='serif')
plt.figure()
plt.ylim([-1.5, 1.5])
plt.plot(toy_problem(T, ampl=0), color='blue')
plt.plot(original, color='red')
plt.plot(predicted, color='black')
plt.show()
3. with keras
Model setup

from keras.models import Sequential
from keras.layers import Dense, Activation, SimpleRNN
from keras.optimizers import Adam
from keras.callbacks import EarlyStopping  # Keras's built-in callback this time

n_in = len(X[0][0])  # 1
n_hidden = 20
n_out = len(Y[0])    # 1

def weight_variable(shape, name=None):
    return np.random.normal(scale=.01, size=shape)

early_stopping = EarlyStopping(monitor='val_loss', patience=10, verbose=1)

model = Sequential()
model.add(SimpleRNN(n_hidden,
                    kernel_initializer=weight_variable,
                    input_shape=(maxlen, n_in)))
model.add(Dense(n_out, kernel_initializer=weight_variable))
model.add(Activation('linear'))

optimizer = Adam(lr=0.001, beta_1=0.9, beta_2=0.999)
model.compile(loss='mean_squared_error',
              optimizer=optimizer)
3. with keras
Model training

epochs = 500
batch_size = 10

model.fit(X_train, Y_train,
          batch_size=batch_size,
          epochs=epochs,
          validation_data=(X_validation, Y_validation),
          callbacks=[early_stopping])
3. with keras
Prediction

truncate = maxlen
Z = X[:1]  # take only the first window of the original data

original = [f[i] for i in range(maxlen)]
predicted = [None for i in range(maxlen)]

for i in range(length_of_sequences - maxlen + 1):
    z_ = Z[-1:]
    y_ = model.predict(z_)
    sequence_ = np.concatenate(
        (z_.reshape(maxlen, n_in)[1:], y_),
        axis=0).reshape(1, maxlen, n_in)

    Z = np.append(Z, sequence_, axis=0)
    predicted.append(y_.reshape(-1))
3. with keras
Plot

plt.rc('font', family='serif')
plt.figure()
plt.ylim([-1.5, 1.5])
plt.plot(toy_problem(T, ampl=0), color='blue')
plt.plot(original, color='red')
plt.plot(predicted, color='black')
plt.show()
Thank you
