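The previous lecture on CNNs covered sharing across space: the same parameters are applied at different positions, which the lecture described in terms of scale, translation and rotation invariance. This lecture turns to sharing across time: the same parameters are applied at different time steps.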
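Contents
Recurrent cell
Unrolling the recurrent cell over time
Recurrent layers
Describing a recurrent layer in TF
The recurrent computation step by step
Embedding encoding
Stock prediction with an RNN
LSTM
GRU network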
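Recurrent cell
A recurrent cell shares its parameters across time, and a recurrent layer uses it to extract information along the time axis. The state h_t stored in the cell's memory is updated at every time step.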
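The update of the memory can be sketched in NumPy. This is a minimal sketch that assumes the usual simple-RNN form h_t = tanh(x_t W_xh + h_{t-1} W_hh + b_h); the names `rnn_step`, `w_xh`, `w_hh`, `b_h` and all sizes are illustrative and do not come from the lecture.

```python
import numpy as np

def rnn_step(x_t, h_prev, w_xh, w_hh, b_h):
    """One time step of a simple recurrent cell (assumed tanh form)."""
    return np.tanh(x_t @ w_xh + h_prev @ w_hh + b_h)

rng = np.random.default_rng(0)
n_features, n_units = 4, 3                    # illustrative sizes
w_xh = rng.normal(size=(n_features, n_units))
w_hh = rng.normal(size=(n_units, n_units))
b_h = np.zeros(n_units)

h = np.zeros(n_units)                         # the memory starts empty
sequence = rng.normal(size=(5, n_features))   # 5 time steps of toy input
for x_t in sequence:
    # The same w_xh, w_hh and b_h are reused at every step:
    # this is what sharing parameters across time means.
    h = rnn_step(x_t, h, w_xh, w_hh, b_h)

print(h.shape)
```

Only `h` changes inside the loop; the weights stay fixed during the forward pass and are updated by training afterwards.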
Unrolling the recurrent cell over time
Recurrent layers
Describing a recurrent layer in TF
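With return_sequences=True the layer outputs h_t at every time step; with return_sequences=False it outputs h_t only at the last time step.
x_train has the shape [number of samples, number of time steps the recurrent cell is unrolled over, number of input features per time step].
The number of unrolled steps is the number of characters in one input: for the input a it is 1, and for the input AAT it is 3.
The number of input features per time step is how many values represent each character. With one-hot encoding over an alphabet of n symbols, each character becomes a vector of length n, so for the alphabet ATCG each time step has n = 4 features. Here n is the size of the alphabet, not the length of the sequence.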
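The shapes involved can be checked with a small NumPy stand-in for a recurrent layer. This is a sketch, not the Keras implementation: `one_hot`, `simple_rnn`, the alphabet and the random weights are all assumptions made for illustration.

```python
import numpy as np

ALPHABET = "ATCG"   # illustrative 4-symbol alphabet

def one_hot(seq):
    """Encode a string as an array of shape (len(seq), len(ALPHABET))."""
    out = np.zeros((len(seq), len(ALPHABET)))
    for t, ch in enumerate(seq):
        out[t, ALPHABET.index(ch)] = 1.0
    return out

def simple_rnn(x, units, return_sequences, seed=0):
    """NumPy stand-in for a simple recurrent layer; x is (samples, steps, features)."""
    rng = np.random.default_rng(seed)
    w_xh = rng.normal(size=(x.shape[2], units))
    w_hh = rng.normal(size=(units, units))
    h = np.zeros((x.shape[0], units))
    outputs = []
    for t in range(x.shape[1]):
        h = np.tanh(x[:, t, :] @ w_xh + h @ w_hh)
        outputs.append(h)
    # True: h_t for every time step; False: only the last h_t.
    return np.stack(outputs, axis=1) if return_sequences else h

x_train = one_hot("AAT")[np.newaxis, ...]   # 1 sample, 3 steps, 4 features
print(x_train.shape)                        # (1, 3, 4)

seq_out = simple_rnn(x_train, units=2, return_sequences=True)
last_out = simple_rnn(x_train, units=2, return_sequences=False)
print(seq_out.shape, last_out.shape)        # (1, 3, 2) (1, 2)
```

The last time step of the full sequence output equals the output returned when only the final state is requested.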
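The recurrent computation step by step
Embedding encoding
An embedding is a table of trainable parameters. You only need to set the vocabulary size and the number of dimensions used to represent each character.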
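Conceptually an embedding layer is a lookup into a (vocabulary size, embedding dimension) table. The sketch below shows only that lookup in NumPy; the sizes and the random initial values are assumptions, and in Keras the trainable version is `tf.keras.layers.Embedding(input_dim, output_dim)`.

```python
import numpy as np

vocab_size, embed_dim = 5, 2    # illustrative sizes
rng = np.random.default_rng(0)

# One row per vocabulary entry. In a real model these values are learned.
embedding_table = rng.normal(size=(vocab_size, embed_dim))

# Characters are fed in as integer indices, not one-hot vectors.
token_ids = np.array([[0, 3, 3, 1]])    # 1 sample, 4 time steps
embedded = embedding_table[token_ids]   # look up one row per token

print(embedded.shape)                   # (1, 4, 2)
```

The result already has the [samples, time steps, features] layout that a recurrent layer expects, with the embedding dimension as the number of features.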
Stock prediction with an RNN
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dropout, Dense, SimpleRNN
import matplotlib.pyplot as plt
import os
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error
import math

# Load the SH600519 (MaoTai) price history.
maotai = pd.read_csv('./SH600519.csv')

# Column index 2 is the price series used here. The first 2426 - 300 rows
# are the training set; everything after that is the test set.
training_set = maotai.iloc[0:2426 - 300, 2:3].values
test_set = maotai.iloc[2426 - 300:, 2:3].values

# Scale to [0, 1]. The scaler is fitted on the training set only and then
# applied unchanged to the test set.
sc = MinMaxScaler(feature_range=(0, 1))
training_set_scaled = sc.fit_transform(training_set)
test_set = sc.transform(test_set)

x_train = []
y_train = []

x_test = []
y_test = []

# Each sample is 60 consecutive values; the label is the value that follows.
for i in range(60, len(training_set_scaled)):
    x_train.append(training_set_scaled[i - 60:i, 0])
    y_train.append(training_set_scaled[i, 0])

# Shuffle inputs and labels with the same seed so the pairs stay aligned.
np.random.seed(7)
np.random.shuffle(x_train)
np.random.seed(7)
np.random.shuffle(y_train)
tf.random.set_seed(7)

# Reshape to [samples, time steps, features per time step].
x_train, y_train = np.array(x_train), np.array(y_train)
x_train = np.reshape(x_train, (x_train.shape[0], 60, 1))

# Build the test windows the same way (not shuffled).
for i in range(60, len(test_set)):
    x_test.append(test_set[i - 60:i, 0])
    y_test.append(test_set[i, 0])
x_test, y_test = np.array(x_test), np.array(y_test)
x_test = np.reshape(x_test, (x_test.shape[0], 60, 1))

# Two stacked recurrent layers. The first returns h_t at every time step
# so that the second receives a full sequence.
model = tf.keras.Sequential([
    SimpleRNN(80, return_sequences=True),
    Dropout(0.2),
    SimpleRNN(100),
    Dropout(0.2),
    Dense(1)
])

model.compile(optimizer=tf.keras.optimizers.Adam(0.001),
              loss='mean_squared_error')

# Resume from an existing checkpoint if there is one.
checkpoint_save_path = "./checkpoint/rnn_stock.ckpt"

if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)

# Keep only the weights with the best validation loss.
cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True,
                                                 monitor='val_loss')

history = model.fit(x_train, y_train, batch_size=64, epochs=50,
                    validation_data=(x_test, y_test), validation_freq=1,
                    callbacks=[cp_callback])

model.summary()

# Dump the name, shape and values of every trainable variable.
with open('./weights.txt', 'w') as file:
    for v in model.trainable_variables:
        file.write(str(v.name) + '\n')
        file.write(str(v.shape) + '\n')
        file.write(str(v.numpy()) + '\n')

# Plot the training and validation loss curves.
loss = history.history['loss']
val_loss = history.history['val_loss']

plt.plot(loss, label='Training Loss')
plt.plot(val_loss, label='Validation Loss')
plt.title('Training and Validation Loss')
plt.legend()
plt.show()

# Predict on the test set and map predictions and targets from [0, 1]
# back to the original price scale.
predicted_stock_price = model.predict(x_test)
predicted_stock_price = sc.inverse_transform(predicted_stock_price)
real_stock_price = sc.inverse_transform(test_set[60:])

# Plot real against predicted prices.
plt.plot(real_stock_price, color='red', label='MaoTai Stock Price')
plt.plot(predicted_stock_price, color='blue', label='Predicted MaoTai Stock Price')
plt.title('MaoTai Stock Price Prediction')
plt.xlabel('Time')
plt.ylabel('MaoTai Stock Price')
plt.legend()
plt.show()

# Evaluation metrics on the original price scale.
mse = mean_squared_error(real_stock_price, predicted_stock_price)
rmse = math.sqrt(mse)
mae = mean_absolute_error(real_stock_price, predicted_stock_price)
print('Mean squared error: %.6f' % mse)
print('Root mean squared error: %.6f' % rmse)
print('Mean absolute error: %.6f' % mae)
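The windowing used in the script (60 past values as input, the next value as the label) can be checked in isolation on toy data. `make_windows` is a hypothetical helper written for this illustration, and the series below stands in for the scaled prices.

```python
import numpy as np

def make_windows(series, window):
    """Split a 1-D series into inputs of `window` past values and the next value as target."""
    x, y = [], []
    for i in range(window, len(series)):
        x.append(series[i - window:i])
        y.append(series[i])
    x = np.array(x).reshape(-1, window, 1)   # [samples, time steps, features]
    return x, np.array(y)

series = np.arange(10, dtype=float)          # toy data: 0.0, 1.0, ..., 9.0
x, y = make_windows(series, window=3)

print(x.shape, y.shape)    # (7, 3, 1) (7,)
print(x[0, :, 0], y[0])    # [0. 1. 2.] 3.0
```

A series of length N yields N - window samples, which is why the script compares its predictions against `test_set[60:]`.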
LSTM
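An LSTM works like listening to a lecture. What is in your head right now is the long-term memory C_t of slides 1 to 45, and it has two parts.
The first part is the content of slides 1 to 44, that is, the previous long-term memory C_{t-1}. You cannot remember all of it, so it is multiplied by the forget gate; this product is the memory of the past that remains in your head.
The second part is the new knowledge being learned now, the present memory that is about to be stored. It draws on two things: slide 45, which is the current input x_t, and the short-term memory left over from slide 44, h_{t-1}. Your brain combines the current input x_t with the previous short-term memory h_{t-1} into the present memory about to be stored, C̃_t (C tilde).
The present memory multiplied by the input gate is stored together with the memory of the past as the long-term memory.
When you retell the lecture to someone else, you cannot repeat it word for word. What you say is the long-term memory after it has been filtered by the output gate, and that is the output h_t of the memory cell.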
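The analogy maps onto one LSTM time step as follows. This is a minimal NumPy sketch assuming the standard LSTM formulation with sigmoid gates; the parameter names, the sizes and the random weights are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; every gate sees [h_prev, x_t] concatenated."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(params["w_f"] @ z + params["b_f"])       # forget gate
    i_t = sigmoid(params["w_i"] @ z + params["b_i"])       # input gate
    o_t = sigmoid(params["w_o"] @ z + params["b_o"])       # output gate
    c_tilde = np.tanh(params["w_c"] @ z + params["b_c"])   # present memory
    c_t = f_t * c_prev + i_t * c_tilde   # kept past memory + gated present memory
    h_t = o_t * np.tanh(c_t)             # long-term memory filtered by the output gate
    return h_t, c_t

n_features, n_units = 1, 4               # illustrative sizes
rng = np.random.default_rng(0)
params = {}
for gate in "fioc":
    params["w_" + gate] = rng.normal(size=(n_units, n_units + n_features))
    params["b_" + gate] = np.zeros(n_units)

h, c = np.zeros(n_units), np.zeros(n_units)
for x_t in rng.normal(size=(45, n_features)):   # 45 "slides" of toy input
    h, c = lstm_step(x_t, h, c, params)

print(h.shape, c.shape)
```

The two terms of `c_t` are the two parts of long-term memory from the analogy, and `h_t` is what gets retold.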
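When the network has several recurrent layers, the input to the second recurrent layer is the output h_t of the first. In other words, the second layer is fed the essence that the first layer has already extracted.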
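Stacking can be illustrated with two simple recurrent layers in NumPy. This is a sketch under assumptions: `rnn_layer`, the layer sizes and the random weights are made up for the illustration, and a simple tanh cell is used in place of an LSTM to keep the example short.

```python
import numpy as np

def rnn_layer(x, units, return_sequences, rng):
    """Simple recurrent layer in NumPy; x has shape (samples, steps, features)."""
    w_xh = rng.normal(size=(x.shape[2], units)) * 0.1
    w_hh = rng.normal(size=(units, units)) * 0.1
    h = np.zeros((x.shape[0], units))
    states = []
    for t in range(x.shape[1]):
        h = np.tanh(x[:, t, :] @ w_xh + h @ w_hh)
        states.append(h)
    return np.stack(states, axis=1) if return_sequences else h

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 60, 1))   # 2 samples, 60 time steps, 1 feature

# Layer 1 must emit h_t at every step so that layer 2 receives a sequence.
h1 = rnn_layer(x, units=8, return_sequences=True, rng=rng)
# Layer 2 treats the 8 values of h_t as its input features at each step.
h2 = rnn_layer(h1, units=5, return_sequences=False, rng=rng)

print(h1.shape, h2.shape)         # (2, 60, 8) (2, 5)
```

The number of units in the first layer becomes the number of input features per time step of the second layer.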
TF representation
GRU network
TF description
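For reference, one GRU time step can be sketched in NumPy. The lecture text gives no GRU equations, so the sketch assumes one common formulation with an update gate and a reset gate; the parameter names and sizes are illustrative. In Keras the corresponding layer is `tf.keras.layers.GRU`, which takes the number of units and a `return_sequences` flag like the other recurrent layers.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, p):
    """One GRU time step (assumed formulation with update and reset gates)."""
    concat = np.concatenate([h_prev, x_t])
    z_t = sigmoid(p["w_z"] @ concat + p["b_z"])        # update gate
    r_t = sigmoid(p["w_r"] @ concat + p["b_r"])        # reset gate
    cand_in = np.concatenate([r_t * h_prev, x_t])      # reset gate scales the old state
    h_tilde = np.tanh(p["w_h"] @ cand_in + p["b_h"])   # candidate state
    # Blend the old state and the candidate according to the update gate.
    return (1.0 - z_t) * h_prev + z_t * h_tilde

n_features, n_units = 1, 4        # illustrative sizes
rng = np.random.default_rng(0)
p = {}
for name in "zrh":
    p["w_" + name] = rng.normal(size=(n_units, n_units + n_features))
    p["b_" + name] = np.zeros(n_units)

h = np.zeros(n_units)
for x_t in rng.normal(size=(10, n_features)):   # 10 time steps of toy input
    h = gru_step(x_t, h, p)

print(h.shape)
```

Unlike the LSTM, this cell keeps a single state h_t and has no separate long-term memory C_t.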