ABOUT RNN
PartPrime Inc.
Ryan Jeong
RNN
A way of handling time-series data
(a neural network + the notion of time)
LSTM, GRU …
Ordinary training data: a single vector
Time-series data: a sequence of data vectors
A series of inputs x(n): x(0), x(1), x(2), x(3), …, x(t)
e.g., monthly nationwide temperature data for 2016–2018
A single input datum: x(n)
e.g., an MNIST image
Learn from past time-series data so that,
when new, unseen time-series data are given,
the model predicts the future state.
RNN
Examples of time-series data
Basic neural-network structure
[Figure: a feed-forward network — input layer x(t), hidden layer h(t), output layer y(t), with weight matrices (labeled V and W) between consecutive layers]
Recurrent neural network (RNN) structure
[Figure: the input layer x(t) feeds the hidden layer h(t) through U; the previous hidden layer h(t-1) feeds back into h(t) through W; h(t) feeds the output layer y(t) through V]
The input x(t) given at time t,
combined with the stored hidden layer from time t-1,
produces the hidden layer at time t.
h(t) = f(Ux(t) + Wh(t-1) + b)   (hidden layer formula: f is an activation function, b a bias)
y(t) = g(Vh(t) + c)   (output layer formula: g is an activation function, c a bias)
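To make the two formulas concrete, here is a minimal NumPy sketch of one forward step (the layer sizes and the choices f = tanh, g = identity are illustrative assumptions; they match the regression setup used later in this deck):

import numpy as np

n_in, n_hidden, n_out = 1, 20, 1  # illustrative sizes

U = np.random.randn(n_hidden, n_in) * 0.01      # input -> hidden weights
W = np.random.randn(n_hidden, n_hidden) * 0.01  # hidden -> hidden (recurrent) weights
V = np.random.randn(n_out, n_hidden) * 0.01     # hidden -> output weights
b = np.zeros(n_hidden)                          # hidden bias
c = np.zeros(n_out)                             # output bias

def step(x_t, h_prev, f=np.tanh):
    # h(t) = f(Ux(t) + Wh(t-1) + b)
    h_t = f(U @ x_t + W @ h_prev + b)
    # y(t) = g(Vh(t) + c), with g the identity here
    y_t = V @ h_t + c
    return h_t, y_t

h = np.zeros(n_hidden)  # h(0): the initial hidden state
for x_t in [np.array([0.1]), np.array([0.2]), np.array([0.3])]:
    h, y = step(x_t, h)  # the hidden state carries information across time steps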
p(t) = Ux(t) + Wh(t-1) + b   (hidden layer value)
q(t) = Vh(t) + c   (output layer value)
Error function: E = E(U, V, W, b, c)
Let
eh(t) = ∂E/∂p(t)   (error term of the hidden layer)
eo(t) = ∂E/∂q(t)   (error term of the output layer)
Error term of the hidden layer: eh(t) = ∂E/∂p(t)
Error term of the output layer: eo(t) = ∂E/∂q(t)
Applying the chain rule through p(t) and q(t):
∂E/∂U = ∂E/∂p(t) (∂p(t)/∂U)ᵀ = eh(t) x(t)ᵀ
∂E/∂V = ∂E/∂q(t) (∂q(t)/∂V)ᵀ = eo(t) h(t)ᵀ
∂E/∂W = ∂E/∂p(t) (∂p(t)/∂W)ᵀ = eh(t) h(t-1)ᵀ
∂E/∂b = ∂E/∂p(t) ⊙ ∂p(t)/∂b = eh(t)
∂E/∂c = ∂E/∂q(t) ⊙ ∂q(t)/∂c = eo(t)
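Continuing the NumPy sketch above, each of these gradients is just an outer product of an error term with an activity vector (the error-term values below are random stand-ins, not computed from a real loss):

x_t = np.array([0.1])
h_prev = np.zeros(n_hidden)
h_t, _ = step(x_t, h_prev)

eh = np.random.randn(n_hidden)  # stand-in for eh(t) = ∂E/∂p(t)
eo = np.random.randn(n_out)     # stand-in for eo(t) = ∂E/∂q(t)

dU = np.outer(eh, x_t)     # ∂E/∂U = eh(t) x(t)ᵀ,   shape (n_hidden, n_in)
dV = np.outer(eo, h_t)     # ∂E/∂V = eo(t) h(t)ᵀ,   shape (n_out, n_hidden)
dW = np.outer(eh, h_prev)  # ∂E/∂W = eh(t) h(t-1)ᵀ, shape (n_hidden, n_hidden)
db = eh                    # ∂E/∂b = eh(t)
dc = eo                    # ∂E/∂c = eo(t)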
y(t) = g(Vh(t) + c)   (output layer formula)
In a CNN, the output activation g(·) is usually a sigmoid or softmax function, and the output is a probability value.
In this RNN (a regression task), the output activation is the identity function, g(x) = x.
∴ y(t) = Vh(t) + c
Squared error function:
E = 1/2 Σ_{t=1}^{T} ‖y(t) − t(t)‖²   (t(t) is the target at time t)
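In code, the identity output and the squared error are one line each (a sketch with made-up numbers; the targets t(t) are written as ts to avoid clashing with the time index):

ys = np.array([0.5, 0.7, -0.1])   # example predictions y(1), y(2), y(3)
ts = np.array([0.4, 0.9, -0.2])   # example targets t(1), t(2), t(3)
E = 0.5 * np.sum((ys - ts) ** 2)  # E = 1/2 Σ ||y(t) − t(t)||²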
BPTT (Backpropagation Through Time)
[Figure: the RNN unrolled through time — each input x(t-k) enters its hidden layer h(t-k) through U, each hidden layer h(t-k-1) feeds the next hidden layer h(t-k) through W, and the final hidden layer h(t) produces the output y(t) through V]
BPTT (Backpropagation Through Time)
eh(t-1) = ∂E/∂p(t-1)
        = ∂E/∂p(t) ⊙ ∂p(t)/∂p(t-1)
        = eh(t) ⊙ (∂p(t)/∂h(t-1)) (∂h(t-1)/∂p(t-1))
        = eh(t) ⊙ W f′(p(t-1))
In short, eh(t-1) is expressed in terms of eh(t): the error term propagates backward one time step at a time (e.g., for f = tanh, f′(p) = 1 − tanh²(p)).
BPTT (Backpropagation Through Time)
eh(t-z-1) = eh(t-z) ⊙ W f′(p(t-z-1))
Gradient-descent updates with learning rate η, truncated at τ steps back in time:
U(t+1) = U(t) − η Σ_{z=0}^{τ} eh(t-z) x(t-z)ᵀ
V(t+1) = V(t) − η eo(t) h(t)ᵀ
W(t+1) = W(t) − η Σ_{z=0}^{τ} eh(t-z) h(t-z-1)ᵀ
b(t+1) = b(t) − η Σ_{z=0}^{τ} eh(t-z)
c(t+1) = c(t) − η eo(t)
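Putting the recursion and the update rules together: a minimal truncated-BPTT sketch in NumPy, reusing the parameters from the forward-step sketch above. f = tanh is assumed, and the backward step is written with Wᵀ (and Vᵀ) so that the vector shapes line up:

tau = 5      # how many steps to look back (τ)
eta = 0.01   # learning rate η

# forward pass over the window x(t-τ) .. x(t), storing p and h at each step
xs = [np.random.randn(n_in) for _ in range(tau + 1)]
hs, ps = [np.zeros(n_hidden)], []  # hs[0] is the hidden state before the window
for x_t in xs:
    p = U @ x_t + W @ hs[-1] + b
    ps.append(p)
    hs.append(np.tanh(p))

y_t = V @ hs[-1] + c
target = np.random.randn(n_out)

eo = y_t - target                    # eo(t) for squared error with identity output
eh = (V.T @ eo) * (1 - hs[-1] ** 2)  # eh(t); f′(p) = 1 − tanh²(p) = 1 − h²

dU = np.zeros_like(U); dW = np.zeros_like(W); db = np.zeros_like(b)
for z in range(tau + 1):             # accumulate the sums over z = 0 .. τ
    k = tau - z                      # window index of time t-z
    dU += np.outer(eh, xs[k])        # eh(t-z) x(t-z)ᵀ
    dW += np.outer(eh, hs[k])        # eh(t-z) h(t-z-1)ᵀ
    db += eh
    if z < tau:                      # propagate the error term one step back
        eh = (W.T @ eh) * (1 - np.tanh(ps[k - 1]) ** 2)

U -= eta * dU; W -= eta * dW; b -= eta * db
V -= eta * np.outer(eo, hs[-1]); c -= eta * eo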
IMPLEMENT
1. Prepare DATA
import numpy as np

def sin(x, T=100):
    return np.sin(2.0 * np.pi * x / T)

def toy_problem(T=100, ampl=0.05):
    x = np.arange(0, 2 * T + 1)
    noise = ampl * np.random.uniform(low=-1.0, high=1.0, size=len(x))
    return sin(x) + noise

A function that generates sine-wave data with added noise
1. Prepare DATA
Generate the data

from sklearn.model_selection import train_test_split

T = 100
f = toy_problem(T)

length_of_sequences = 2 * T  # length of the whole time series
maxlen = 25                  # length of one time-series sample

data = []
target = []
for i in range(0, length_of_sequences - maxlen + 1):
    data.append(f[i: i + maxlen])
    target.append(f[i + maxlen])

X = np.array(data).reshape(len(data), maxlen, 1)
Y = np.array(target).reshape(len(data), 1)

# split off the validation data
N_train = int(len(data) * 0.9)
N_validation = len(data) - N_train
X_train, X_validation, Y_train, Y_validation = train_test_split(X, Y, test_size=N_validation)
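As a quick sanity check on the shapes (these numbers follow directly from T = 100 and maxlen = 25 above):

print(X.shape)                # (176, 25, 1): 176 sliding windows of 25 points each
print(Y.shape)                # (176, 1): the point immediately following each window
print(N_train, N_validation)  # 158 18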
2. with Tensorflow
Model setup (TensorFlow 1.x API)

import tensorflow as tf

n_in = len(X[0][0])  # 1
n_hidden = 20
n_out = len(Y[0])    # 1

x = tf.placeholder(tf.float32, shape=[None, maxlen, n_in])
t = tf.placeholder(tf.float32, shape=[None, n_out])
n_batch = tf.placeholder(tf.int32)

y = inference(x, n_batch, maxlen=maxlen, n_hidden=n_hidden, n_out=n_out)
loss = loss(y, t)  # rebinds the name 'loss' from the helper function to the loss tensor
train_step = training(loss)

early_stopping = EarlyStopping(patience=10, verbose=1)
history = {
    'val_loss': []
}
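The deck calls early_stopping.validate(val_loss) but never shows the class. Here is a minimal sketch of an EarlyStopping helper with that interface (an assumption, not the author's exact code): it stops once the validation loss has failed to improve for more than `patience` consecutive epochs.

class EarlyStopping:
    def __init__(self, patience=0, verbose=0):
        self._step = 0             # epochs without improvement so far
        self._loss = float('inf')  # best validation loss seen so far
        self.patience = patience
        self.verbose = verbose

    def validate(self, loss):
        if self._loss < loss:      # no improvement this epoch
            self._step += 1
            if self._step > self.patience:
                if self.verbose:
                    print('early stopping')
                return True
        else:                      # improvement: reset the counter
            self._step = 0
            self._loss = loss
        return False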
2. with Tensorflow
Model definition

def inference(x, n_batch, maxlen=None, n_hidden=None, n_out=None):
    def weight_variable(shape):
        initial = tf.truncated_normal(shape, stddev=0.01)
        return tf.Variable(initial)

    def bias_variable(shape):
        initial = tf.zeros(shape, dtype=tf.float32)
        return tf.Variable(initial)

    cell = tf.contrib.rnn.BasicRNNCell(n_hidden)
    initial_state = cell.zero_state(n_batch, tf.float32)

    state = initial_state
    outputs = []  # store the hidden-layer outputs from past steps
    with tf.variable_scope('RNN'):
        for t in range(maxlen):
            if t > 0:
                tf.get_variable_scope().reuse_variables()
            (cell_output, state) = cell(x[:, t, :], state)
            outputs.append(cell_output)

    output = outputs[-1]

    V = weight_variable([n_hidden, n_out])
    c = bias_variable([n_out])
    y = tf.matmul(output, V) + c  # linear activation

    return y

def training(loss):
    optimizer = tf.train.AdamOptimizer(learning_rate=0.001,
                                       beta1=0.9,
                                       beta2=0.999)
    train_step = optimizer.minimize(loss)
    return train_step

def loss(y, t):
    mse = tf.reduce_mean(tf.square(y - t))
    return mse
2. with Tensorflow
Model training

from sklearn.utils import shuffle

epochs = 500
batch_size = 10

init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

n_batches = N_train // batch_size

for epoch in range(epochs):
    X_, Y_ = shuffle(X_train, Y_train)

    for i in range(n_batches):
        start = i * batch_size
        end = start + batch_size

        sess.run(train_step, feed_dict={
            x: X_[start:end],
            t: Y_[start:end],
            n_batch: batch_size
        })

    # evaluate on the validation data
    val_loss = loss.eval(session=sess, feed_dict={
        x: X_validation,
        t: Y_validation,
        n_batch: N_validation
    })

    history['val_loss'].append(val_loss)
    print('epoch:', epoch, ' validation loss:', val_loss)

    # early-stopping check
    if early_stopping.validate(val_loss):
        break
2. with Tensorflow
Prediction

truncate = maxlen
Z = X[:1]  # take only the first window of the original data

original = [f[i] for i in range(maxlen)]
predicted = [None for i in range(maxlen)]

for i in range(length_of_sequences - maxlen + 1):
    # predict the future from the most recent window
    z_ = Z[-1:]
    y_ = y.eval(session=sess, feed_dict={
        x: Z[-1:],
        n_batch: 1
    })

    # use the prediction to build the next window
    sequence_ = np.concatenate(
        (z_.reshape(maxlen, n_in)[1:], y_), axis=0).reshape(1, maxlen, n_in)

    Z = np.append(Z, sequence_, axis=0)
    predicted.append(y_.reshape(-1))
2. with Tensorflow
Plot

import matplotlib.pyplot as plt

plt.rc('font', family='serif')
plt.figure()
plt.ylim([-1.5, 1.5])
plt.plot(toy_problem(T, ampl=0), color='blue')
plt.plot(original, color='red')
plt.plot(predicted, color='black')
plt.show()
3. with keras
Model setup

from keras.models import Sequential
from keras.layers import Dense, Activation, SimpleRNN
from keras.optimizers import Adam
from keras.callbacks import EarlyStopping  # Keras's built-in callback this time

n_in = len(X[0][0])  # 1
n_hidden = 20
n_out = len(Y[0])    # 1

def weight_variable(shape, name=None):
    return np.random.normal(scale=.01, size=shape)

early_stopping = EarlyStopping(monitor='val_loss', patience=10, verbose=1)

model = Sequential()
model.add(SimpleRNN(n_hidden,
                    kernel_initializer=weight_variable,
                    input_shape=(maxlen, n_in)))
model.add(Dense(n_out, kernel_initializer=weight_variable))
model.add(Activation('linear'))

optimizer = Adam(lr=0.001, beta_1=0.9, beta_2=0.999)
model.compile(loss='mean_squared_error',
              optimizer=optimizer)
3. with keras
Model training

epochs = 500
batch_size = 10

model.fit(X_train, Y_train,
          batch_size=batch_size,
          epochs=epochs,
          validation_data=(X_validation, Y_validation),
          callbacks=[early_stopping])
3. with keras
Prediction

truncate = maxlen
Z = X[:1]  # take only the first window of the original data

original = [f[i] for i in range(maxlen)]
predicted = [None for i in range(maxlen)]

for i in range(length_of_sequences - maxlen + 1):
    z_ = Z[-1:]
    y_ = model.predict(z_)
    sequence_ = np.concatenate(
        (z_.reshape(maxlen, n_in)[1:], y_),
        axis=0).reshape(1, maxlen, n_in)

    Z = np.append(Z, sequence_, axis=0)
    predicted.append(y_.reshape(-1))
3. with keras
Plot

plt.rc('font', family='serif')
plt.figure()
plt.ylim([-1.5, 1.5])
plt.plot(toy_problem(T, ampl=0), color='blue')
plt.plot(original, color='red')
plt.plot(predicted, color='black')
plt.show()
Thank you
