Machine learning algorithms can be used for market prediction with Zorro's
advise functions.
Due to the low signal-to-noise ratio and to ever-changing market conditions,
analyzing price series is an ambitious task for machine
learning. But since the price curves are not completely random, even simple
machine learning methods, such as in the **DeepLearn** script,
can predict the next price movement with a better than 50% success rate.
Whether the success rate is high enough to overcome transaction costs is
another question.
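Whether a given success rate pays off depends on the average win, the average loss, and the cost per trade. The following sketch computes the expected profit per trade and the break-even success rate. The function names and the numbers (wins and losses of 10 pips each, 1 pip round-trip cost) are illustrative assumptions, not values from Zorro.

```r
# Expected profit per trade for success rate p, average win, average loss,
# and round-trip transaction cost (all in the same unit, e.g. pips).
expected.profit <- function(p, win, loss, cost) {
  p * win - (1 - p) * loss - cost
}

# Success rate at which the expected profit is exactly zero:
# p*win - (1-p)*loss - cost = 0  =>  p = (loss + cost) / (win + loss)
breakeven.rate <- function(win, loss, cost) {
  (loss + cost) / (win + loss)
}

# Illustrative numbers only
win <- 10; loss <- 10; cost <- 1
print(breakeven.rate(win, loss, cost))         # 0.55
print(expected.profit(0.52, win, loss, cost))  # negative: 52% is not enough here
print(expected.profit(0.58, win, loss, cost))  # positive
```

With these assumed numbers, a predictor must be right in more than 55% of all trades before it earns anything.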

Compared with other machine learning algorithms, such as Random Forests or Support Vector Machines, deep learning systems often achieve a high success rate with little programming effort. A linear neural network with 8 inputs driven by indicators and 4 outputs for buying and selling has a structure like this:
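In code, such a network amounts to a chain of weight matrices, one per layer. Below is a minimal base R sketch of the forward pass for 8 inputs and 4 outputs. The single hidden layer of 16 neurons, the sigmoid activation, and the random (untrained) weights are assumptions chosen only for illustration.

```r
set.seed(365)

n.in <- 8; n.hidden <- 16; n.out <- 4   # hidden size is an arbitrary choice

# One weight matrix and one bias vector per layer (random, i.e. untrained)
W1 <- matrix(rnorm(n.in * n.hidden, sd = 0.1), n.in, n.hidden)
b1 <- rep(0, n.hidden)
W2 <- matrix(rnorm(n.hidden * n.out, sd = 0.1), n.hidden, n.out)
b2 <- rep(0, n.out)

sigmoid <- function(z) 1 / (1 + exp(-z))

# Forward pass: each row of X is one sample of 8 input values
forward <- function(X) {
  if (is.vector(X)) X <- t(X)                # single sample -> 1-row matrix
  H <- sigmoid(sweep(X %*% W1, 2, b1, "+"))  # hidden layer
  sigmoid(sweep(H %*% W2, 2, b2, "+"))       # 4 outputs, each between 0 and 1
}

x <- runif(n.in, -1, 1)   # stand-in for 8 normalized indicator values
print(forward(x))         # 1 x 4 matrix
```

Training consists of adjusting the weight matrices so that the outputs match the desired trade signals; the packages described further below do that part.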

Deep learning uses linear or special neural network structures (convolution layers, LSTM) with a large number of neurons and hidden layers. Some parameters are common to most neural networks:

- **Hidden layers** of a linear network are usually defined with a vector, f.i. **c(50,100,50)** defines 3 hidden layers, the first with 50, the second with 100, and the third with 50 neurons.
- The **Activation** function converts the sum of neuron input values to the neuron output. Most often used are a **Rectifier** (RELU = rectified linear unit) that is 0 for negative sums and linear for positive ones, **Sigmoid** that saturates to 0 or 1, **Tanh** that saturates to -1 or +1, or **SoftMax** that scales the outputs to a sum of 1 and emphasizes the highest input.
- An **Epoch** is a training iteration over the entire data set. Training will stop once the number of epochs is reached. More epochs usually mean better prediction, but longer training.
- The **Learning rate** controls the step size for the gradient descent in training; a lower rate means finer steps and possibly more precise prediction, but longer training time. The **Learning rate scale** is a multiplication factor for changing the learning rate after each iteration.
- **Momentum** adds a fraction of the previous step to the current one. It prevents the gradient descent from getting stuck at a tiny local minimum or saddle point.
- The **Batch size** is a number of random samples – a **mini batch** – taken out of the data set for a single training run. Splitting the data into mini batches speeds up training since the weight gradient is then calculated from fewer samples. The larger the batch size, the more accurate the gradient, but the more time each training run will take.
- **Dropout** is a number of randomly selected neurons that are disabled during a mini batch. This way the net learns only with a part of its neurons, which can effectively reduce overfitting.
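Some of these parameters can be tried out in a few lines of base R. The sketch below defines the activation functions listed above and runs a gradient descent with learning rate, learning rate scale, and momentum. The loss function (w - 3)^2 is a toy assumption that stands in for a real network error, and the parameter values are arbitrary.

```r
# Activation functions
relu    <- function(z) pmax(z, 0)          # 0 for z < 0, linear above
sigmoid <- function(z) 1 / (1 + exp(-z))   # saturates to 0 or 1
softmax <- function(z) { e <- exp(z - max(z)); e / sum(e) }  # sums up to 1
# tanh() is built into R and saturates to -1 or +1

z <- c(-2, 0, 3)
print(relu(z))
print(round(sigmoid(z), 3))
print(round(tanh(z), 3))
print(round(softmax(z), 3))

# Gradient descent with learning rate, learning rate scale, and momentum
# on the toy loss f(w) = (w - 3)^2 with the gradient 2*(w - 3)
descend <- function(w = 0, learningrate = 0.1, learningrate_scale = 0.99,
                    momentum = 0.5, numepochs = 100) {
  step <- 0
  for (epoch in 1:numepochs) {
    grad <- 2 * (w - 3)
    step <- momentum * step - learningrate * grad      # keep part of last step
    w <- w + step
    learningrate <- learningrate * learningrate_scale  # shrink the step size
  }
  w
}

print(descend())   # ends close to the minimum at w = 3
```

Setting **momentum** to 0 or **learningrate_scale** to 1 in the call shows how each parameter changes the path to the minimum.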

Here's a short description of the installation and usage of four popular R-based deep learning packages, each with an example of a (not really deep) linear neural net with one hidden layer.

```r
library('deepnet')

neural.train = function(model,XY)
{
  XY <- as.matrix(XY)
  X <- XY[,-ncol(XY)]
  Y <- XY[,ncol(XY)]
  Y <- ifelse(Y > 0,1,0)
  Models[[model]] <<- sae.dnn.train(X,Y,
    hidden = c(30),
    learningrate = 0.5,
    momentum = 0.5,
    learningrate_scale = 1.0,
    output = "sigm",
    sae_output = "linear",
    numepochs = 100,
    batchsize = 100)
}

neural.predict = function(model,X)
{
  if(is.vector(X)) X <- t(X)
  return(nn.predict(Models[[model]],X))
}

neural.save = function(name)
{
  save(Models,file=name)
}

neural.init = function()
{
  set.seed(365)
  Models <<- vector("list")
}
```

```r
library('h2o')

neural.train = function(model,XY)
{
  XY <- as.h2o(XY)
  Models[[model]] <<- h2o.deeplearning(
    -ncol(XY),ncol(XY),XY,
    hidden = c(30), seed = 365)
}

neural.predict = function(model,X)
{
  if(is.vector(X))
    X <- as.h2o(as.data.frame(t(X)))
  else
    X <- as.h2o(X)
  Y <- h2o.predict(Models[[model]],X)
  return(as.vector(Y))
}

neural.save = function(name)
{
  save(Models,file=name)
}

neural.init = function()
{
  h2o.init()
  Models <<- vector("list")
}
```

Keras is available as an R library, but installing it with Tensorflow also requires a Python environment such as Anaconda. The step by step installation for the CPU version:

- Download and install R 64 bit for Windows from cran.r-project.org/bin/windows/.
- Edit **Zorro.ini** and enter the path to the R terminal on your PC, f.i. "C:\Program Files\R\R-3.6.1\bin\x64\Rterm.exe".
- Download and install **RTools** from cran.r-project.org/bin/windows/Rtools/.
- Download and install **Anaconda** for Windows, Python 3.X version, from www.anaconda.com.
- Start the **R x64** console. Enter the command **install.packages("devtools")** and hit <Return>.
- Now you can finally install Keras. Enter these commands one after the other: **devtools::install_github("rstudio/keras")**, **library('keras')**, **install_keras()**.
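After the steps above, a quick way to see whether R can find the installed packages is a check like the following. It is only a sketch: it covers the four package names used on this page and tests nothing but the presence of the R package, not whether the Python backend behind Keras works.

```r
# Report which of the deep learning packages R can find.
# requireNamespace() only checks availability; it does not attach the package.
packages <- c("deepnet", "h2o", "keras", "mxnet")
installed <- vapply(packages,
  function(p) requireNamespace(p, quietly = TRUE), logical(1))
print(installed)

absent <- names(installed)[!installed]
if (length(absent) > 0)
  cat("Not installed:", paste(absent, collapse = ", "), "\n")
```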

The installation procedure can change between Keras versions, so check the Keras website to be sure. Now you're all set to use Keras in your strategy:

```r
library('keras')

neural.train = function(model,XY)
{
  X <- data.matrix(XY[,-ncol(XY)])
  Y <- XY[,ncol(XY)]
  Y <- ifelse(Y > 0,1,0)
  Model <- keras_model_sequential()
  Model %>%
    layer_dense(units=30,activation='relu',input_shape = c(ncol(X))) %>%
    layer_dropout(rate = 0.2) %>%
    layer_dense(units = 1, activation = 'sigmoid')

  Model %>% compile(
    loss = 'binary_crossentropy',
    optimizer = optimizer_rmsprop(),
    metrics = c('accuracy'))

  Model %>% fit(X, Y,
    epochs = 20, batch_size = 20,
    validation_split = 0, shuffle = FALSE)

  Models[[model]] <<- Model
}

neural.predict = function(model,X)
{
  if(is.vector(X)) X <- t(X)
  X <- as.matrix(X)
  # Note: predict_proba() is no longer available in recent Keras versions;
  # there, predict() returns the probabilities of this sigmoid output
  Y <- Models[[model]] %>% predict_proba(X)
  return(ifelse(Y > 0.5,1,0))
}

# Keras models must be serialized before they can be stored with save()
neural.save = function(name)
{
  for(i in c(1:length(Models)))
    Models[[i]] <<- serialize_model(Models[[i]])
  save(Models,file=name)
}

neural.load <- function(name)
{
  load(name,.GlobalEnv)
  for(i in c(1:length(Models)))
    Models[[i]] <<- unserialize_model(Models[[i]])
}

neural.init = function()
{
  set.seed(365)
  Models <<- vector("list")
}
```

```r
cran <- getOption("repos")
cran["dmlc"] <- "https://s3-us-west-2.amazonaws.com/apache-mxnet/R/CRAN/"
options(repos = cran)
install.packages('mxnet')
```

The MxNet R script:

```r
library('mxnet')

neural.train = function(model,XY)
{
  X <- data.matrix(XY[,-ncol(XY)])
  Y <- XY[,ncol(XY)]
  Y <- ifelse(Y > 0,1,0)
  Models[[model]] <<- mx.mlp(X,Y,
    hidden_node = c(30),
    out_node = 2,
    activation = "sigmoid",
    out_activation = "softmax",
    num.round = 20,
    array.batch.size = 20,
    learning.rate = 0.05,
    momentum = 0.9,
    eval.metric = mx.metric.accuracy)
}

neural.predict = function(model,X)
{
  if(is.vector(X)) X <- t(X)
  X <- data.matrix(X)
  Y <- predict(Models[[model]],X)
  return(ifelse(Y[1,] > Y[2,],0,1))
}

neural.save = function(name)
{
  save(Models,file=name)
}

neural.init = function()
{
  mx.set.seed(365)
  Models <<- vector("list")
}
```
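All scripts on this page define the same functions: **neural.init**, **neural.train**, **neural.predict**, and **neural.save**. To try that interface in a plain R console without installing any of the packages, a stand-in like the following can be used. It replaces the neural net with base R's **glm()** logistic regression, which is an assumption for illustration only, and feeds it synthetic data in the layout the scripts expect: signals in the first columns, the target in the last column.

```r
# Stand-in with the same four functions, using base R's glm()
# instead of a deep learning package (assumption, for illustration only)
neural.init = function()
{
  set.seed(365)
  Models <<- vector("list")
}

neural.train = function(model,XY)
{
  XY <- as.data.frame(XY)
  names(XY) <- c(paste0("Sig",1:(ncol(XY)-1)),"Target")
  XY$Target <- ifelse(XY$Target > 0,1,0)  # target value -> class 0 or 1
  Models[[model]] <<- glm(Target ~ .,data = XY,family = binomial)
}

neural.predict = function(model,X)
{
  if(is.vector(X)) X <- t(X)              # single sample -> 1-row matrix
  X <- as.data.frame(X)
  names(X) <- paste0("Sig",1:ncol(X))
  return(as.vector(predict(Models[[model]],X,type = "response")))
}

neural.save = function(name)
{
  save(Models,file = name)
}

# Synthetic data: 8 signals, target loosely driven by the first signal
neural.init()
X <- matrix(rnorm(500*8),500,8)
Y <- X[,1] + rnorm(500)
neural.train(1,cbind(X,Y))

print(neural.predict(1,X[1,]))    # one sample, passed as a vector
print(neural.predict(1,X[1:5,]))  # several samples at once
```

The **is.vector** check matters because a single sample arrives as a plain vector and must be turned into a one-row matrix before prediction.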

- If the deep learning package is missing or not properly installed, the training process will fail with the error message **NEURAL_INIT: failed**.
- Most deep learning packages are optimized for training and predicting a large set of data samples. Trading algorithms, however, normally train with many samples, but predict from only a single sample. This has no large effect on live trading, but can make the backtest very slow. This was especially observed with the H2O package, but to some extent also with other packages. If you can choose, use GPU support only for training, and the CPU for testing and trading.
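The cost of predicting one sample at a time can be measured directly. The sketch below compares 500 single-sample calls with one call for all 500 samples. It uses a base R logistic regression as a stand-in model (an assumption; the deep learning packages are not needed to run it), so the absolute times say nothing about any particular package, only about the per-call overhead.

```r
set.seed(365)

# Stand-in model: base R logistic regression (assumption; used here only
# to compare per-call and batch prediction)
n <- 2000
train <- data.frame(matrix(rnorm(n * 8), n, 8))   # columns X1..X8
train$Target <- ifelse(train$X1 + rnorm(n) > 0, 1, 0)
model <- glm(Target ~ ., data = train, family = binomial)

test <- train[1:500, 1:8]

# One predict() call per sample, as in a bar-by-bar backtest
t.single <- system.time(
  p.single <- vapply(seq_len(nrow(test)), function(i)
    predict(model, test[i, , drop = FALSE], type = "response"), numeric(1))
)

# One predict() call for all samples
t.batch <- system.time(
  p.batch <- as.vector(predict(model, test, type = "response"))
)

print(t.single["elapsed"])
print(t.batch["elapsed"])
print(all.equal(as.vector(p.single), p.batch))  # compares both results
```

Both variants yield the same predictions; only the number of calls differs.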