Method |
PERCEPTRON - train and predict with a simple neural net (20 signals max).
PATTERN - train and predict with a signal pattern analyzer (20 signals max).
+FAST - fast and large pattern finding (for PATTERN).
+FUZZY - fuzzy pattern finding (for PATTERN).
+2 .. +6 - number of pattern groups (for PATTERN).
+BALANCED - enforce the same number of positive and negative target values by replication (for SIGNALS, NEURAL, DTREE, PERCEPTRON).
0 or omitted - use the method and signals of the last advise call. Only when trade returns are used for the training target, not when Objective is used. |

Objective |
The training target value. If 0 or omitted, the next trade return is used as the target. In that case the target is the profit or loss, including trading costs, of the immediately following trade with the same asset, algo, and matching trade direction. A positive value advises to enter a trade when this signal combination occurs; a negative value advises against a trade. The Objective parameter is only used in the training run and has no meaning in test or trade mode. Use the PEEK flag for accessing future prices in training. When using a direct training target, make sure that it is not 0 at the first advise() calls, or else the next trade return will be used. |

Signal0, ... Signal19 |
Up to 20 parameters that are used as features to the machine learning algorithm for training or prediction. Use signals that carry information about the current market situation, for instance candle patterns, price differences, indicators, filters, or statistics functions. All signal values should be in the same range, for instance 0..1, -1..+1, or -100..+100 (see remarks). Signals largely outside that range will generate a warning message. If the signals are omitted, the signals from the last advise call are used. |

Signals |
Alternative input method, a var array of arbitrary length containing the features to the machine learning algorithm. |

NumSignals |
Length of the Signals array. |

In [Test], [Trade] mode: Prediction value returned from the trained algorithm, for triggering trades when exceeding a threshold. The internal methods normally return a value in the

The signals should be normalized roughly to the

The decision tree functions are stored in C source code in the **\Data\*.c** file. The functions are automatically included in the strategy script and used by the **advise** function in test and trade mode. They can also be exported for use in strategy scripts or expert advisors of other platforms. The example below is a typical Zorro-generated decision tree:

    int EURUSD_S(var* sig)
    {
        if(sig[1] <= 12.938) {
            if(sig[3] <= 0.953) return -70;
            else {
                if(sig[2] <= 43) return 25;
                else {
                    if(sig[3] <= 0.962) return -67;
                    else return 15;
                }
            }
        }
        else {
            if(sig[3] <= 0.732) return -71;
            else {
                if(sig[1] > 30.61) return 27;
                else {
                    if(sig[2] > 46) return 80;
                    else return -62;
                }
            }
        }
    }

The **advise()** call used 5 signals, of which the first and the last one - **sig[0]** and **sig[4]** - had no predictive power, and thus were pruned and do not appear in the tree. Unpredictive signals are displayed in the message window.
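
As a minimal sketch of how such an exported tree could be called outside Zorro, the standalone C code below copies the tree from the example and wraps it in a threshold check. The `var` typedef, the `should_enter` helper, and its `threshold` parameter are assumptions made for illustration; they are not part of the generated file.

```c
typedef double var; /* assumption: var maps to double outside Zorro */

/* Decision tree copied from the generated example. */
int EURUSD_S(var* sig)
{
    if(sig[1] <= 12.938) {
        if(sig[3] <= 0.953) return -70;
        else {
            if(sig[2] <= 43) return 25;
            else {
                if(sig[3] <= 0.962) return -67;
                else return 15;
            }
        }
    }
    else {
        if(sig[3] <= 0.732) return -71;
        else {
            if(sig[1] > 30.61) return 27;
            else {
                if(sig[2] > 46) return 80;
                else return -62;
            }
        }
    }
}

/* Hypothetical helper: nonzero when the prediction exceeds the threshold. */
int should_enter(var* sig, int threshold)
{
    return EURUSD_S(sig) > threshold;
}
```

Because **sig[0]** and **sig[4]** were pruned, changing them does not change the result: a signal array such as `{0, 10, 40, 0.96, 0}` and one such as `{99, 10, 40, 0.96, -99}` both end in the same leaf.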

Example of a script for generating a decision tree:

    void run()
    {
        BarPeriod = 60;
        LookBack = 150;
        TradesPerBar = 2;
        if(Train) Hedge = 2;
        set(RULES|TESTNOW);

    // generate price series
        vars H = series(priceHigh()),
             L = series(priceLow()),
             C = series(priceClose());

    // generate some signals from H,L,C in the -100..100 range
        var Signal1 = (LowPass(H,1000)-LowPass(L,1000))/PIP;
        var Signal2 = 100*FisherN(C,100);

    // train and trade the signals
        Stop = 4*ATR(100);
        TakeProfit = 4*ATR(100);
        if(adviseLong(DTREE,0,Signal1,Signal2) > 0)
            enterLong();
        if(adviseShort(DTREE,0,Signal1,Signal2) > 0)
            enterShort();
    }

The perceptron algorithm works best when a weighted sum of the signals has predictive power. It does not work well when the prediction requires a nonlinear signal combination, i.e. when trade successes and failures are not separated by a flat plane in the signal space. A classical example of a function that a perceptron cannot emulate is a logical XOR. Often a perceptron can be used where a decision tree fails, and vice versa.
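
The XOR limitation can be illustrated with a small standalone sketch. It uses the textbook perceptron learning rule with two binary inputs and a bias, which is not necessarily the training procedure Zorro uses internally; the function name `train_perceptron` and the data tables are hypothetical.

```c
/* Trains a single perceptron with the classic learning rule on four
   2-input samples. Returns the epoch in which all four samples were
   classified correctly, or 0 if that never happened. */
int train_perceptron(const int x[4][2], const int target[4], int max_epochs)
{
    int w0 = 0, w1 = 0, bias = 0;
    for(int epoch = 1; epoch <= max_epochs; epoch++) {
        int errors = 0;
        for(int i = 0; i < 4; i++) {
            int sum = w0*x[i][0] + w1*x[i][1] + bias;
            int out = (sum > 0) ? 1 : -1;
            if(out != target[i]) {
                /* move the weights toward the missed target */
                w0 += target[i]*x[i][0];
                w1 += target[i]*x[i][1];
                bias += target[i];
                errors++;
            }
        }
        if(errors == 0) return epoch;
    }
    return 0;
}

static const int INPUTS[4][2]  = { {0,0}, {0,1}, {1,0}, {1,1} };
static const int AND_TARGET[4] = { -1, -1, -1,  1 }; /* linearly separable */
static const int XOR_TARGET[4] = { -1,  1,  1, -1 }; /* not separable */
```

Calling `train_perceptron(INPUTS, AND_TARGET, 100)` returns a positive epoch number, while `train_perceptron(INPUTS, XOR_TARGET, 1000)` returns 0, since no weighted sum of the two inputs separates the XOR outputs.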

The perceptron learning algorithm generates prediction functions in C source code in the **\Data\*.c** file. The functions are automatically included in the strategy script and used by the **advise** function in test and trade mode. They can also be exported for using them in strategy scripts or expert advisors of other platforms. The output is binary: either **>0** for a positive or **<0** for a negative prediction. A generated perceptron function with 3 signals can look like this:

    int EURUSD_S(var* sig)
    {
        if(-27.99*sig[0] + 1.24*sig[1] - 3.54*sig[2] > -21.50)
            return 100;
        else
            return -100;
    }

Signals that do not contain useful market information get a weight of

The signals can be divided into groups with the **PATTERN+2** .. **PATTERN+6** methods. They divide the signals into two to six pattern groups and only compare signals within the same group. This is useful when, for instance, only the first two candles and the last two candles of a 3-candle pattern should be compared with each other, but not the first candle with the third candle. **PATTERN+2** requires an even number of signals, of which the first half belongs to the first group and the second half to the second group. **PATTERN+3** likewise requires a number of signals that is divisible by 3, and so on. Pattern groups can share signals - for instance, the open, high, low, and close of the middle candle can appear in the first as well as in the second group - as long as the total number of signals does not exceed 20.
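
The grouping rule can be sketched as a small standalone helper. It assumes that the groups are equal-sized contiguous blocks in signal order, as described for **PATTERN+2**; the function name `pattern_group` is hypothetical and not part of Zorro.

```c
/* Returns the zero-based pattern group of the signal at `index`,
   or -1 if the input is invalid or the signals cannot be split evenly. */
int pattern_group(int index, int num_signals, int groups)
{
    if(groups < 2 || groups > 6) return -1;              /* PATTERN+2 .. PATTERN+6 */
    if(num_signals <= 0 || num_signals > 20) return -1;  /* 20 signals max */
    if(num_signals % groups != 0) return -1;             /* must be divisible */
    if(index < 0 || index >= num_signals) return -1;
    return index / (num_signals / groups);
}
```

With 12 signals and two groups, signals 0..5 fall into group 0 and signals 6..11 into group 1; with 10 signals and three groups the helper returns -1 because 10 is not divisible by 3.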

Aside from grouping, Zorro makes no assumptions about the signals and their relations. Therefore the pattern analyzer can also be used for signals other than candle prices. All signals within a pattern group should have the same unit for being comparable, but different groups can have different units. For candle patterns, usually the high, low, and close of the last 3 bars are used for the signals - the open is not needed as it's normally identical with the close of the previous candle. More signals, such as the moving average of the price, can be added for improving the prediction (but in most cases won't).

The pattern analyzer generates pattern finding functions in C source code in the **\Data\*.c** file. The functions are automatically included in the strategy script and used by the **advise** function in test and trade mode. They can also be exported for use in strategy scripts or expert advisors of other platforms. They find all patterns that occurred 4 or more times in the training data set and had a positive profit expectancy. They return the pattern's information ratio - the ratio of profit mean to standard deviation - multiplied by 100. The better the information ratio, the more predictive the pattern. A typical pattern finding function with 12 signals looks like this:

    int EURUSD_S(float* sig)
    {
        if(sig[1]<sig[2] && eqF(sig[2]-sig[4]) && sig[4]<sig[0] && sig[0]<sig[5] && sig[5]<sig[3]
            && sig[10]<sig[11] && sig[11]<sig[7] && sig[7]<sig[8] && sig[8]<sig[9] && sig[9]<sig[6])
            return 19;
        if(sig[4]<sig[1] && sig[1]<sig[2] && sig[2]<sig[5] && sig[5]<sig[3] && sig[3]<sig[0]
            && sig[7]<sig[8] && eqF(sig[8]-sig[10]) && sig[10]<sig[6] && sig[6]<sig[11] && sig[11]<sig[9])
            return 170;
        if(sig[1]<sig[4] && eqF(sig[4]-sig[5]) && sig[5]<sig[2] && sig[2]<sig[3] && sig[3]<sig[0]
            && sig[10]<sig[7] && eqF(sig[7]-sig[8]) && sig[8]<sig[6] && sig[6]<sig[11] && sig[11]<sig[9])
            return 74;
        if(sig[1]<sig[4] && sig[4]<sig[5] && sig[5]<sig[2] && sig[2]<sig[0] && sig[0]<sig[3]
            && sig[7]<sig[8] && eqF(sig[8]-sig[10]) && sig[10]<sig[11] && sig[11]<sig[9] && sig[9]<sig[6])
            return 143;
        if(sig[1]<sig[2] && eqF(sig[2]-sig[4]) && sig[4]<sig[5] && sig[5]<sig[3] && sig[3]<sig[0]
            && sig[10]<sig[7] && sig[7]<sig[8] && sig[8]<sig[6] && sig[6]<sig[11] && sig[11]<sig[9])
            return 168;
        ....
        return 0;
    }

The **eqF** function in the code above checks if two signals are almost equal, i.e. differ by less than **FuzzyRange**.
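
The "almost equal" test can be sketched in standalone C as follows. This is only an illustration of the description above, not Zorro's actual **eqF** implementation; the name `almost_equal` and the `FuzzyRange` value of 0.001 are assumptions. Like **eqF** in the generated code, it takes the difference of the two signals as its single argument.

```c
/* Hypothetical tolerance; stands in for the FuzzyRange variable. */
static double FuzzyRange = 0.001;

/* Nonzero when the signal difference is smaller in magnitude than
   FuzzyRange, i.e. when the two signals are "almost equal". */
int almost_equal(double diff)
{
    return diff < FuzzyRange && diff > -FuzzyRange;
}
```

With this tolerance, `almost_equal(1.2345 - 1.2341)` is nonzero, while `almost_equal(0.002)` is 0.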

There are two additional special methods for the pattern analyzer. The **FUZZY** method generates a pattern finding function that also finds patterns that can slightly deviate from the profitable patterns in the training data set. It gives patterns a higher score when they 'match better'. The deviation can be set up with **FuzzyRange**. A typical fuzzy pattern finding function looks like this:

    int EURUSD_S(float* sig)
    {
        double result = 0.;
        result += belowF(sig[1],sig[4]) * belowF(sig[4],sig[2]) * belowF(sig[2],sig[5]) * belowF(sig[5],sig[3]) * belowF(sig[3],sig[0])
            * belowF(sig[10],sig[11]) * belowF(sig[11],sig[7]) * belowF(sig[7],sig[8]) * belowF(sig[8],sig[9]) * belowF(sig[9],sig[6]) * 19;
        result += belowF(sig[4],sig[5]) * belowF(sig[5],sig[1]) * belowF(sig[1],sig[2]) * belowF(sig[2],sig[3]) * belowF(sig[3],sig[0])
            * belowF(sig[10],sig[7]) * belowF(sig[7],sig[11]) * belowF(sig[11],sig[8]) * belowF(sig[8],sig[9]) * belowF(sig[9],sig[6]) * 66;
        result += belowF(sig[4],sig[1]) * belowF(sig[1],sig[2]) * belowF(sig[2],sig[0]) * belowF(sig[0],sig[5]) * belowF(sig[5],sig[3])
            * belowF(sig[10],sig[11]) * belowF(sig[11],sig[7]) * belowF(sig[7],sig[8]) * belowF(sig[8],sig[6]) * belowF(sig[6],sig[9]) * 30;
        result += belowF(sig[1],sig[4]) * belowF(sig[4],sig[2]) * belowF(sig[2],sig[5]) * belowF(sig[5],sig[3]) * belowF(sig[3],sig[0])
            * belowF(sig[7],sig[10]) * belowF(sig[10],sig[11]) * belowF(sig[11],sig[8]) * belowF(sig[8],sig[6]) * belowF(sig[6],sig[9]) * 70;
        result += belowF(sig[4],sig[5]) * belowF(sig[5],sig[1]) * belowF(sig[1],sig[2]) * belowF(sig[2],sig[3]) * belowF(sig[3],sig[0])
            * belowF(sig[7],sig[10]) * belowF(sig[10],sig[8]) * belowF(sig[8],sig[11]) * belowF(sig[11],sig[9]) * belowF(sig[9],sig[6]) * 108;
        ...
        return result;
    }

The **belowF** function is described on the Fuzzy Logic page.

The **FAST** method does not generate C code; instead it generates a list of patterns that are classified with alphanumeric names. For finding a pattern, it is classified and its name is compared with the pattern list. This is about 4 times faster than the pattern finding function in C code, and can also handle bigger and more complex patterns. It can make a remarkable difference in backtest time or when additional parameters have to be trained. A pattern name list looks like this (the numbers after the names are the information ratios):

    /* Pattern list for EURUSD_S
    HIECBFGAD 61
    BEFHCAIGD 152
    EHBCIFGAD 73
    BEFHCIGDA 69
    BHFECAIGD 95
    BHIFECGAD 86
    HBEIFCADG 67
    HEICBFGDA 108 ...*/
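
The information ratios listed with each pattern are described above as the profit mean divided by the standard deviation, multiplied by 100. The standalone sketch below computes such a number from a set of trade returns. The function names, the use of the sample standard deviation (n-1 divisor), and the rounding are assumptions; Zorro's exact computation is not documented here.

```c
/* Square root by Newton iteration, so the sketch needs no math library. */
static double newton_sqrt(double x)
{
    if(x <= 0.) return 0.;
    double r = x > 1. ? x : 1.;
    for(int i = 0; i < 60; i++)
        r = 0.5*(r + x/r);
    return r;
}

/* Information ratio of n trade returns, multiplied by 100 and rounded.
   Assumption: sample standard deviation with n-1 divisor.
   Returns 0 for fewer than 2 returns or for zero deviation. */
int info_ratio_x100(const double* returns, int n)
{
    if(n < 2) return 0;
    double mean = 0., sumsq = 0.;
    for(int i = 0; i < n; i++) mean += returns[i];
    mean /= n;
    for(int i = 0; i < n; i++)
        sumsq += (returns[i]-mean)*(returns[i]-mean);
    double sd = newton_sqrt(sumsq/(n-1));
    if(sd == 0.) return 0;
    return (int)(100.*mean/sd + (mean >= 0. ? 0.5 : -0.5));
}
```

For the three returns 1, 2, 3 the mean is 2 and the sample standard deviation is 1, so the function returns 200.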

The **FAST** method cannot be used in combination with **FUZZY** or with **FuzzyRange**. But the **FAST** as well as the **FUZZY** method can be combined with pattern groups (f.i. **PATTERN+FAST+2**).

The find rate of the pattern analyzer can be adjusted with two variables:

An example of a pattern trading script can be found in Workshop 7.

| mode | Parameters | Description |
| --- | --- | --- |
| NEURAL_INIT | --- | Initialize the machine learning library (f.i. by calling Rstart) before running the simulation. Return 0 if the initialization failed, otherwise 1. The script is aborted if the system cannot be initialized. |
| NEURAL_EXIT | --- | Close the machine learning library (not required for R). |
| NEURAL_LEARN | model, numSignals, Data | Use a single sample of signals contained in the Data double array for training the model identified by the model index. The last element, Data[numSignals], is the Objective parameter or the trade result. The function is triggered by any advise call in [Train] mode, either immediately (Objective != 0) or when the trade is closed (Objective == 0). |
| NEURAL_TRAIN | model, numSignals, Data | Batch training. Alternative to |
| NEURAL_PREDICT | model, numSignals, Data | Return the value predicted by the model with the given model index, using the signals contained in the Data double array. The function is called by an advise call in [Test] and [Trade] mode. The model parameter is the number of the model. |
| NEURAL_SAVE | Data | Save all trained models in the file with the name given by the string Data. The function is called at the end of every WFO cycle in [Train] mode. |
| NEURAL_LOAD | Data | Load all trained models from the file with the name given by the string Data. The function is called at the beginning of every WFO cycle in [Test] mode, and at the beginning of a [Trade] session. It is also called every time the model file is updated by re-training. |

The **model** index is the number of the trained model - for instance a set of decision rules, or a set of weights of a neural network - starting with **0**. When several models are trained for long and short predictions and for different assets or algos, the index selects one of the models. In R, models can be stored in a list of lists and accessed through their index (f.i. **Models[[model+1]]**). Any additional parameters generated in the training process - for instance, a set of normalization factors or selection masks for the signals - can be saved together with the models.

The **numSignals** parameter is the number of signals passed to the **advise** function. It is normally identical to the number of trained features.

The **Data** parameter provides data to the function. The data can be of different type. For **NEURAL_LEARN**/**NEURAL_PREDICT** it's a pointer to a **double** array of length **numSignals+1**, containing the signal values plus the prediction objective or trade result at the end. Note that a plain data array has no "dim names" or other R gimmicks - if they are needed in the R training or predicting function, add them there. For **NEURAL_TRAIN** the **Data** parameter is a text string containing all samples in CSV format. The string can be stored in a temporary CSV file and then read by the machine learning algorithm for training the model. For **NEURAL_SAVE**/**NEURAL_LOAD** the **Data** parameter is the suggested file name for saving or loading the trained models separately for any WFO cycle in the **Data** folder. Use the **slash(string)** function for converting backslashes to slashes when required for R file paths.
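
The array layout used for **NEURAL_LEARN** can be sketched in standalone C. The `Sample` struct, the `MAX_SIGNALS` capacity, and the `unpack_sample` helper are hypothetical names chosen for illustration; a user-supplied **neural** function would receive the buffer through its **Data** pointer cast to **double***.

```c
#include <assert.h>
#include <string.h>

#define MAX_SIGNALS 64  /* hypothetical capacity for this sketch */

typedef struct {
    double features[MAX_SIGNALS]; /* the signal values */
    double target;                /* Objective or trade result */
    int    count;                 /* number of signals */
} Sample;

/* Copies one training buffer into a Sample:
   Data[0..numSignals-1] are the signals, Data[numSignals] is the target. */
Sample unpack_sample(const double* Data, int numSignals)
{
    Sample s;
    assert(numSignals > 0 && numSignals <= MAX_SIGNALS);
    memset(&s, 0, sizeof(s));
    memcpy(s.features, Data, numSignals*sizeof(double));
    s.target = Data[numSignals];
    s.count = numSignals;
    return s;
}
```

For a buffer `{0.1, 0.2, 0.3, -5.0}` with `numSignals = 3`, the three leading values end up in `features` and `-5.0` in `target`.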

This is the default **neural** function in the **r.h** file for using an R machine learning algorithm. It expects 4 R functions named **neural.train**, **neural.predict**, **neural.save**, and **neural.init** in an R script in the **Strategy** folder. The R script has the same name as the strategy script, but extension **.r** instead of **.c**. If required for special purposes, the default **neural** function can be replaced by a user-supplied function.

    var neural(int mode, int model, int numSignals, void* Data)
    {
        if(!wait(0)) return 0;

    // open an R script with the same name as the strategy script
        if(mode == NEURAL_INIT) {
            if(!Rstart(strf("%s.r",Script),2)) return 0;
            Rx("neural.init()");
            return 1;
        }

    // export batch training samples and call the R training function
        if(mode == NEURAL_TRAIN) {
            string name = strf("Data\\signals%i.csv",Core);
            file_write(name,Data,0);
            Rx(strf("XY <- read.csv('%s%s',header = F)",slash(ZorroFolder),slash(name)));
            Rset("AlgoVar",AlgoVar,8);
            if(!Rx(strf("neural.train(%i,XY)",model+1),2))
                return 0;
            return 1;
        }

    // predict the target with the R predict function
        if(mode == NEURAL_PREDICT) {
            Rset("AlgoVar",AlgoVar,8);
            Rset("X",(double*)Data,numSignals);
            Rx(strf("Y <- neural.predict(%i,X)",model+1));
            return Rd("Y[1]");
        }

    // save all trained models
        if(mode == NEURAL_SAVE) {
            print(TO_ANY,"\nStore %s",strrchr(Data,'\\')+1);
            return Rx(strf("neural.save('%s')",slash(Data)),2);
        }

    // load all trained models
        if(mode == NEURAL_LOAD) {
            printf("\nLoad %s",strrchr(Data,'\\')+1);
            return Rx(strf("load('%s')",slash(Data)),2);
        }

        return 1;
    }

The **neural** function is compatible with all Zorro trading, test, and training methods, including walk forward analysis, multi-core parallel training, and multiple assets and algorithms. An example of using **advise(NEURAL,...)** for a short-term deep learning system can be found on Financial Hacker.

- The RULES flag must be set for generating rules or training a machine learning algorithm. See the remarks under Training for special cases when **RULES** and **PARAMETERS** are generated at the same time.
- When trading is disabled, f.i. during the LookBack period, at weekends, due to a **SKIP** flag, or during the inactive bars of a time frame, the **advise** functions won't train or predict and return **0**.
- All **advise** calls for a given asset/algo combination must use the same **Method**, the same signals, and the same training target type (either **Objective** or trade result). However, different methods and signals can be used for different assets and algorithm identifiers, so call algo() for combining multiple **advise** methods in the same script. Ensemble or hybrid methods can also be implemented this way, using different algo identifiers.
- The rules generated by **DTREE**, **PERCEPTRON**, and **PATTERN** are stored in plain C code in **\*.c** files in the **Data** folder. This allows exporting the rules to other platforms, looking into the prediction process, and checking whether the rules make sense. Machine learning models are stored in **\*.ml** files in the **Data** folder. If several models are trained, they are numbered in the order of advise calls and WFO cycles. F.i. for two advise calls, two assets, and 7 WFO cycles, 28 models are generated, 4 per cycle.
- For training trade results, call the **advise** function with **Objective = 0** just before entering a trade; Zorro then uses the next trade's result for learning the rules. When training long and short trades at the same time, set Hedge to make sure that they can be opened simultaneously in training mode; otherwise any trade would close an opposite trade entered just before. Define an exit condition for the trade, such as a timed exit after a number of bars, or a stop/trailing mechanism for predicting a positive or negative excursion. Do not train rules with hedging explicitly disabled or with special modes such as NFA or Virtual Hedging.
- For more than 20 signals, pass a **Signals** array. Only the signals for the **PATTERN** and **PERCEPTRON** methods are always limited to 20.
- Most machine learning algorithms require that training targets are balanced. That means the trades used for training should result in about 50% wins and 50% losses, and negative and positive **Objective** values should be equally distributed. If in doubt, add **+BALANCED** to the method; this will simply copy samples until balance is reached.
- Functions like scale or Normalize can be used to convert the signals to the same range. Some R machine learning algorithms require that signals and targets are in the **0..1** range; in that case negative values or values greater than 1 lead to wrong results. If a signal value is outside the +/-1000 range, a warning is issued.
- When using a future value for prediction, such as the price change of the next bars (f.i. **Objective = priceClose(-5) - priceClose(0)**), make sure to set the PEEK flag in train mode. Alternatively, use the price change of a past bar to the current bar, and pass past signals - f.i. from a series - to the advise function in train mode, f.i.:

  **Objective = priceClose(0) - priceClose(5);**
  **int Offset = ifelse(Train,5,0);**
  **var Prediction = adviseLong(Method,Objective,Signal0[Offset],Signal1[Offset],...);**

- Machine learning functions have a tendency to greatly overfit the strategy. For this reason, strategies that use the **advise** function should always be tested with unseen data, f.i. by using Out-Of-Sample testing or Walk Forward Optimization. Use the OPENEND flag to prevent trades from closing prematurely at the end of a WFO training period and thus reducing the training quality. When WFO training future signals (with the **PEEK** flag), make sure that no data from the subsequent test cycle is used, as this would introduce peeking bias. F.i. when peeking 3 bars into the future, do not call advise anymore within the last 3 bars of the training cycle. The end bar number of a cycle can be retrieved with the **EndBar** variable.
- When WFO training a machine learning algorithm with multiple cores, make sure that the used R libraries are compatible with multiple processes. Loading a library in parallel several times was reported not to work with some packages. The R **neural.init** function is called at the beginning of each process, which can lead to different results in single core and multi core mode when for instance a random seed is set in the **neural.init** function.
- For training several objectives with the **NEURAL** method, pass the further objectives as **Signal** parameters to the **advise** function. For debugging, use the R **sink** function for printing R output to a file.
- The estimated prediction accuracy or the number of found patterns is printed in the message window. The signals should be selected in a way that the prediction accuracy is above 50% or that as many profitable patterns as possible are found.
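
The balancing by replication mentioned in the remarks can be sketched in standalone C. The sketch only handles target values and repeats minority-class values in their original order; how **+BALANCED** selects the samples to copy is not specified here, so the function name `balance_by_replication` and its selection order are assumptions.

```c
/* Writes a balanced copy of `targets` to `out` (capacity `cap`) by
   repeating minority-class values in their original order. Returns the
   new length, or -1 if `out` is too small or one class is empty. */
int balance_by_replication(const double* targets, int n, double* out, int cap)
{
    int pos = 0, neg = 0;
    for(int i = 0; i < n; i++) {
        if(targets[i] > 0.) pos++;
        else if(targets[i] < 0.) neg++;
    }
    if(pos == 0 || neg == 0) return -1;   /* nothing to replicate from */
    int missing = pos > neg ? pos - neg : neg - pos;
    if(n + missing > cap) return -1;

    for(int i = 0; i < n; i++) out[i] = targets[i];
    int len = n;
    int minority_positive = pos < neg;
    /* cycle through the samples and append minority values until balanced */
    for(int i = 0; missing > 0; i = (i+1) % n) {
        int is_pos = targets[i] > 0.;
        int is_neg = targets[i] < 0.;
        if((minority_positive && is_pos) || (!minority_positive && is_neg)) {
            out[len++] = targets[i];
            missing--;
        }
    }
    return len;
}
```

For the targets `{1, -1, -2, -3}` the single positive value is appended twice, giving six values with three wins and three losses.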