In the previous blog post, we showed how a custom deep-learning neural network can be built from scratch in Python. In this post, we present some verification and testing results.

Let us consider a simple example: learning the shape of a sinusoidal function with a small network that has one hidden ReLU layer and a linear output layer.

Let us first construct an appropriate training set. Since this is a very simple example, we will not even use a separate test set for validation. The goal is, after all, to demonstrate that the previously derived and implemented gradient learning algorithm based on back-propagation works!
import numpy as np

u_array = np.arange(0, np.pi, 0.01)
y_array = np.sin(u_array)

Next, let us build a network and train it for 200 epochs using 1-sample batches.

for _ in range(200):
    for u, y in zip(u_array, y_array):
        net.train_on([np.asarray(u)], [np.asarray(y)])

Let us check the network performance after training:

y_ = []
for u in u_array:
    y_.append(net.evaluate(np.asarray(u)))
y_ = np.asarray(y_).flatten()

import matplotlib.pyplot as plt

plt.plot(u_array, y_)
plt.plot(u_array, y_array)
plt.title(f'MAE = {np.mean(np.abs(y_array - y_))}', fontsize=16);
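
For readers who do not have the classes from the previous post at hand, the same experiment can be sketched in plain NumPy. This is a minimal illustration, not the `Network` implementation used above: the hidden-layer size, the learning rate, the weight initialization, and the squared-error loss are all assumptions, since the post does not state them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Training set, as in the text
u_array = np.arange(0, np.pi, 0.01)
y_array = np.sin(u_array)

# Hypothetical hyper-parameters (not given in the post)
n_hidden = 16
learning_rate = 0.01

W1 = rng.normal(0.0, 0.5, size=(n_hidden, 1))
b1 = np.zeros((n_hidden, 1))
W2 = rng.normal(0.0, 0.5, size=(1, n_hidden))
b2 = np.zeros((1, 1))


def forward(u):
    """Return hidden pre-activation, hidden activation and network output."""
    z1 = W1 * u + b1            # shape (n_hidden, 1)
    a1 = np.maximum(z1, 0.0)    # ReLU hidden layer
    y_hat = W2 @ a1 + b2        # linear output, shape (1, 1)
    return z1, a1, y_hat


def mean_abs_error():
    preds = np.array([forward(u)[2].item() for u in u_array])
    return float(np.mean(np.abs(preds - y_array)))


mae_before = mean_abs_error()

# 200 epochs of one-sample updates, in the same order as the post's loop
for _ in range(200):
    for u, y in zip(u_array, y_array):
        z1, a1, y_hat = forward(u)
        # Back-propagate the gradient of 0.5 * (y_hat - y)^2
        d_out = y_hat - y                       # shape (1, 1)
        d_hidden = (W2.T @ d_out) * (z1 > 0)    # shape (n_hidden, 1)
        W2 -= learning_rate * d_out @ a1.T
        b2 -= learning_rate * d_out
        W1 -= learning_rate * d_hidden * u
        b1 -= learning_rate * d_hidden

mae_after = mean_abs_error()
print(f'MAE before training: {mae_before:.4f}, after: {mae_after:.4f}')
```

The printed mean absolute error should drop noticeably after training; the exact values depend on the assumed hyper-parameters and the random seed.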


### A More Interesting Example

This example illustrates the basic procedure for one-step-ahead prediction of the S&P500 index from the prices of its constituents, using the custom feed-forward neural network implemented in “pure” Python.

S&P500 is a financial index, essentially a weighted average, computed from the market capitalization of 500 companies whose stock is listed on NYSE and NASDAQ. The problem considered here is to predict the S&P500 value at a future time instant from the known values of its constituents at the current time instant.

Assuming that stock data is sampled at a fixed time interval, the problem can be formally stated as identifying an unknown mapping $$f$$ such that

$SP500_1 = f(c_1, c_2, \ldots, c_{500})$

where $$SP500_1$$ is the value of the S&P500 index $$1$$ step ahead into the future, and $$c_1$$ to $$c_{500}$$ are the values of the individual constituents (market capitalization of the relevant companies) at the current time instant.
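
In code, the one-step-ahead formulation amounts to pairing the constituents at row t with the index value at row t + 1. The sketch below shows one way to do that alignment on a toy frame; the column names `c_1` and `c_2` and the values are made up for illustration. If a data file already stores the index shifted one step ahead, no such shift is needed, and the code in this post uses the columns as stored.

```python
import pandas as pd

# Toy frame: one index column followed by two hypothetical constituents
frame = pd.DataFrame({
    'SP500': [10.0, 11.0, 12.0, 13.0],
    'c_1': [1.0, 2.0, 3.0, 4.0],
    'c_2': [5.0, 6.0, 7.0, 8.0],
})

# Target at row t is the index value at row t + 1
target = frame['SP500'].shift(-1)

# The last row has no future value, so drop it from both sides
features = frame[['c_1', 'c_2']].iloc[:-1]
target = target.iloc[:-1]

print(target.tolist())  # [11.0, 12.0, 13.0]
```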

The predictor uses the feed-forward architecture implemented in the first part of this notebook.

import pandas as pd

First, let us load the data. The data will be downsampled significantly to speed up the training process.

data = pd.read_csv('./data/data_stocks.csv')
data.drop(['DATE'], axis=1, inplace=True)
data = data.iloc[:10000:10, :]

The available data will be split into training and test sets. The training set consists of every tenth sample, so training is performed on 10% of the data, while the remaining 90% is used for testing.

decimation = 10
test_mask = np.arange(len(data)) % decimation != 0  # rows not used for training
data_train = data.iloc[::decimation, :]
data_test = data.iloc[test_mask, :]
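
The split described above can be sanity-checked on a stand-in frame. The sketch below is only an illustration: the 1000-row frame and its column names are invented, and the boolean masks are one possible way to express "every tenth row for training, the rest for testing".

```python
import numpy as np
import pandas as pd

# Stand-in for the downsampled stock data: 1000 rows, 3 columns
data = pd.DataFrame(np.arange(3000).reshape(1000, 3),
                    columns=['SP500', 'c_1', 'c_2'])

decimation = 10
positions = np.arange(len(data))
train_mask = positions % decimation == 0   # rows 0, 10, 20, ...
test_mask = ~train_mask                    # every other row

data_train = data.iloc[train_mask, :]
data_test = data.iloc[test_mask, :]

# The two sets must be disjoint and together cover all rows
overlap = data_train.index.intersection(data_test.index)
print(len(data_train), len(data_test), len(overlap))  # 100 900 0
```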

Before training, the data will be scaled. Scaling is performed using the training set only.

# Compute per-column minimum and maximum of the TRAINING data
m = data_train.min(axis=0)
M = data_train.max(axis=0)

# Scale both the training and the test data
data_train_scaled = (data_train - m) / (M-m)
data_test_scaled = (data_test - m) / (M-m)

# Build training and test regressor and target
X_train = data_train_scaled.iloc[:, 1:].fillna(0)
Y_train = data_train_scaled.iloc[:, 0].fillna(0)
X_test = data_test_scaled.iloc[:, 1:].fillna(0)
Y_test = data_test_scaled.iloc[:, 0].fillna(0)
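
A small worked example may help show what min-max scaling with training-set statistics does. The numbers and the column name `c_1` below are invented for illustration. Note that test values outside the range seen in training map outside the interval [0, 1].

```python
import pandas as pd

train = pd.DataFrame({'SP500': [10.0, 20.0, 30.0], 'c_1': [1.0, 3.0, 5.0]})
test = pd.DataFrame({'SP500': [25.0, 40.0], 'c_1': [0.0, 4.0]})

# Statistics come from the training set only
m = train.min(axis=0)
M = train.max(axis=0)

train_scaled = (train - m) / (M - m)
test_scaled = (test - m) / (M - m)

print(train_scaled['SP500'].tolist())  # [0.0, 0.5, 1.0]
print(test_scaled['SP500'].tolist())   # [0.75, 1.5]
print(test_scaled['c_1'].tolist())     # [-0.25, 0.75]
```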

The following lines simply convert the training and test data from the original Pandas DataFrame format into plain NumPy ndarrays.

X_train = X_train.values
Y_train = Y_train.values
X_test = X_test.values
Y_test = Y_test.values

Once the data is prepared, let us build an appropriate neural network. We will use a simple network architecture with 3 layers, apart from the input one. The two hidden layers are non-linear, with 128 and 64 ReLU units, respectively. The output layer is linear, with only one unit.

net = Network([Layer(500, 128, ReLU_Activation),
               Layer(128, 64, ReLU_Activation),
               Layer(64, 1, Lin_Activation)])
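
To get a feel for the size of this architecture, the number of trainable parameters can be counted. The sketch assumes that each layer holds one weight per input-output pair plus one bias per output unit; the `Layer` class from the previous post may organize its parameters differently.

```python
# (inputs, outputs) for each layer of the network described above
layer_sizes = [(500, 128), (128, 64), (64, 1)]

# Assumption: n_in * n_out weights plus n_out biases per layer
params_per_layer = [n_in * n_out + n_out for n_in, n_out in layer_sizes]
total_params = sum(params_per_layer)

print(params_per_layer)  # [64128, 8256, 65]
print(total_params)      # 72449
```

Under that assumption the network has far more parameters than the training set has rows (at most 100 after the two decimation steps above), which is worth keeping in mind when comparing training and test results.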

The network will be trained using one-sample batches, over 200 epochs. This means that the entire data set will be passed through the network 200 times, and within each pass the network weights will be updated once for every input sample.

for i in range(200):
    if i % 10 == 0:
        print('.', end='')
    for u, y in zip(X_train, Y_train):
        u = u.reshape(500, 1)
        y = np.asarray([y]).reshape(1, 1)
        net.train_on([u], [y])

Let us first evaluate the network performance on the training set. In general, this is not a good indication of how well the network will handle data in the future. However, it is a good indication of whether or not the network managed to learn the input data sufficiently well.

Too large a training error may indicate that the network structure is too simple, or that there are problems related to data pre-processing.

y_ = []
for u in X_train:
    y_.append(net.evaluate(u.reshape(500, 1)))
y_ = np.asarray(y_).flatten()

plt.plot(y_, label='estimation')
plt.plot(Y_train, label='actual data')
plt.legend(fontsize=16);
plt.title('Comparison of the actual and estimated data on the training set');

Finally, let us evaluate the network performance on the test set. This is a much better indicator of how well the network is expected to work in production.

y_ = []
for u in X_test:
    y_.append(net.evaluate(u.reshape(500, 1)))
y_ = np.asarray(y_).flatten()

plt.plot(y_, label='estimation')
plt.plot(Y_test, label='actual data')
plt.legend(fontsize=16);
plt.title('Comparison of the actual and estimated data on the test set');
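
The plots can be complemented with a single number per data set, as was done with the MAE in the sinusoid example. The helper functions below are a hypothetical sketch, not part of the network code from the previous post; the sample values are invented.

```python
import numpy as np


def mean_absolute_error(actual, predicted):
    """Average absolute difference between two equally long sequences."""
    actual = np.asarray(actual, dtype=float).ravel()
    predicted = np.asarray(predicted, dtype=float).ravel()
    return float(np.mean(np.abs(actual - predicted)))


def root_mean_squared_error(actual, predicted):
    """Square root of the average squared difference."""
    actual = np.asarray(actual, dtype=float).ravel()
    predicted = np.asarray(predicted, dtype=float).ravel()
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


# Invented values, in the same scaled units as the targets
actual = [0.0, 0.5, 1.0]
predicted = [0.1, 0.5, 0.7]

print(f'MAE  = {mean_absolute_error(actual, predicted):.4f}')
print(f'RMSE = {root_mean_squared_error(actual, predicted):.4f}')
```

With the arrays from this post, the calls would be `mean_absolute_error(Y_test, y_)` for the test set and `mean_absolute_error(Y_train, y_)` right after the training-set evaluation. Both errors are expressed in scaled units, because the targets were min-max scaled.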
