Workshop 7: Machine Learning

Zorro's advise functions can be used for applying machine learning to candle patterns and for using the most profitable patterns as trade signals. Here's a simple example (Workshop7.c):

function run()
{
  StartDate = 2005;   // use > 10 years data
  BarPeriod = 1440;   // 1 day
  BarZone = WET;      // West European midnight
  Weekend = 1;        // separate Friday and Sunday bars
  LookBack = 3;       // only 3 bars needed
  NumWFOCycles = 10;   // mandatory for machine learning functions
 
  set(RULES+TESTNOW); // generate rules, test after training
  if(Train) Hedge = 2; // allow long + short
  LifeTime = 5;       // = one week
 
  if(adviseLong(PATTERN+2,0,
    priceHigh(2),priceLow(2),priceClose(2),
    priceHigh(1),priceLow(1),priceClose(1),
    priceHigh(1),priceLow(1),priceClose(1),
    priceHigh(0),priceLow(0),priceClose(0)) > 40)
    reverseLong(1);
  if(adviseShort() > 40)
    reverseShort(1);
}

Many lines in this code should already be familiar, but there are also some new concepts. The adviseLong function looks like a sort of indicator that takes some candle data and generates a signal for a long trade:

if (adviseLong(PATTERN+2,0,
  priceHigh(2),priceLow(2),priceClose(2),
  priceHigh(1),priceLow(1),priceClose(1),
  priceHigh(1),priceLow(1),priceClose(1),
  priceHigh(0),priceLow(0),priceClose(0) ) > 40)
  reverseLong(1);

The function is called with the PATTERN classification method and the High, Low, and Close prices of the last 3 candles. Aside from PATTERN, other and more complex machine learning methods can be used, such as a deep learning neural net as in this article. If adviseLong returns a value above 40, a long trade is entered. reverseLong(1) is a special version of enterLong() that limits the number of open trades to 1.

The advise functions behave differently in training and in test/trade mode. In training mode, they always return 100, so a trade is always entered. The function stores a 'snapshot' of its signal parameters - in this case, 12 signals from the High, Low, and Close prices of the last 3 candles - in an internal list. It then waits for the result of the trade, and stores the profit or loss of the trade together with the signal snapshot. Thus, after the training run Zorro has a large internal list containing all signal snapshots and their corresponding trade profits or losses.
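The bookkeeping described above can be pictured with a small standalone C sketch. This is not Zorro's internal implementation, which the text does not show. The names Sample, SampleList, sample_open, and sample_close, as well as the fixed capacity, are assumptions made purely for illustration.

```c
#include <stddef.h>

#define NUM_SIGNALS 12    /* 3 candles x (High, Low, Close), middle candle passed twice */
#define MAX_SAMPLES 1024  /* hypothetical capacity of the training list */

/* One training sample: the signals seen at trade entry plus the trade result. */
typedef struct {
    float  sig[NUM_SIGNALS];
    double profit;        /* filled in when the trade has been closed */
    int    closed;        /* 0 while the trade is still open */
} Sample;

typedef struct {
    Sample items[MAX_SAMPLES];
    size_t count;
} SampleList;

/* Store the signal snapshot at trade entry.
   Returns the index of the new sample, or -1 if the list is full. */
int sample_open(SampleList *list, const float *sig)
{
    if (list->count >= MAX_SAMPLES) return -1;
    Sample *s = &list->items[list->count];
    for (int i = 0; i < NUM_SIGNALS; i++) s->sig[i] = sig[i];
    s->profit = 0.0;
    s->closed = 0;
    return (int)list->count++;
}

/* Attach the profit or loss once the trade result is known.
   Returns 0 on success, -1 for an invalid index. */
int sample_close(SampleList *list, int index, double profit)
{
    if (index < 0 || (size_t)index >= list->count) return -1;
    list->items[index].profit = profit;
    list->items[index].closed = 1;
    return 0;
}
```

In this sketch, sample_open would be called whenever a training trade is entered and sample_close when it ends, so that every snapshot ends up paired with its result.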

The signals are then classified into patterns. PATTERN+2 tells Zorro's pattern analyzer to split the signals into 2 equal groups. Each group gets 6 of the 12 signals. The first group contains the prices of the first two candles of the 3-candle sequence:

priceHigh(2),priceLow(2),priceClose(2),
priceHigh(1),priceLow(1),priceClose(1)

And the second group contains the prices of the last two candles:

priceHigh(1),priceLow(1),priceClose(1),
priceHigh(0),priceLow(0),priceClose(0)

Note that the middle candle, with offset 1, appears in both groups. The Open price is not used in the signals because currencies are traded 24 hours a day, so the Close of a daily bar is normally identical to the Open of the next bar. Using the Open price would emphasize outliers and weekend patterns, which is not desired.

Within every signal group, Zorro now compares every signal with every other signal. This generates a set of greater, smaller, or equal results, one for each pair of signals (15 pairs for a group of 6 signals). This set of comparison results is a pattern. It does not matter for the PATTERN method if priceHigh(2) is far smaller or only a little smaller than priceHigh(1) - the resulting pattern is the same. The patterns of the two groups are now glued together to form a single pattern. It contains all information about all price comparisons within the first and second candle and within the second and third candle, but it does not contain any information about how the first candle compares with the third. Bert had told Bob that it's best for price action trading to compare only adjacent candles - therefore the two independent pattern groups. If we had looked for 4-candle patterns, we would have used three groups.
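The pairwise comparison can be sketched in plain C as follows. This is a minimal illustration, not Zorro's actual pattern analyzer. The function names encode_group and encode_pattern and the string encoding are assumptions. The sketch also tests for exact equality, whereas the generated rules shown in this workshop use eqF for the equality test.

```c
#define GROUP_SIZE 6
#define PAIRS_PER_GROUP (GROUP_SIZE * (GROUP_SIZE - 1) / 2)  /* 15 pairs */

/* Compare every signal of one group with every other signal of that group.
   Writes one character per pair: '<', '>' or '='.
   out must have room for PAIRS_PER_GROUP + 1 characters. */
void encode_group(const float *sig, char *out)
{
    int k = 0;
    for (int i = 0; i < GROUP_SIZE; i++)
        for (int j = i + 1; j < GROUP_SIZE; j++)
            out[k++] = (sig[i] < sig[j]) ? '<' : (sig[i] > sig[j]) ? '>' : '=';
    out[k] = '\0';
}

/* Encode 12 signals as two independent groups and glue the results together.
   Signals of different groups are never compared with each other.
   out must have room for 2 * PAIRS_PER_GROUP + 1 characters. */
void encode_pattern(const float *sig, char *out)
{
    encode_group(sig, out);
    encode_group(sig + GROUP_SIZE, out + PAIRS_PER_GROUP);
}
```

Two candle sequences whose prices differ in size but not in order produce the same string in this sketch, which mirrors the statement that only the greater, smaller, or equal relations matter.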

Aside from grouping, Zorro makes no assumptions about the signals and their relations. Therefore, instead of candle patterns any other set of signals or indicators could also be used for the advise function. After a pattern is classified, Zorro checks how often it appears in the training data set, and sums up all its profits or losses. If a pattern appears often and with a profit, it is considered a profitable pattern. Zorro removes all unprofitable or insignificant patterns from the list - patterns that don't have a positive profit sum or appear fewer than 4 times. From the remaining patterns, a pattern finding function is generated and stored in the workshop7_EURUSD.c script in the Data folder. Such a machine generated pattern finding function can look similar to this:
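The selection step described in the paragraph before the generated function (positive profit sum, at least 4 occurrences) can be illustrated with a short standalone C sketch. The PatternStats type and the function names are hypothetical, and the sketch covers only the two removal criteria named in the text, not the scoring of the remaining patterns.

```c
#define MIN_OCCURRENCES 4  /* patterns appearing fewer than 4 times are dropped */

/* Statistics collected for one pattern during training. */
typedef struct {
    int    count;       /* how often the pattern appeared in the training data */
    double profit_sum;  /* sum of the profits and losses of its trades */
} PatternStats;

/* A pattern is kept only if its profit sum is positive
   and it appeared at least MIN_OCCURRENCES times. */
int pattern_is_kept(const PatternStats *p)
{
    return p->profit_sum > 0.0 && p->count >= MIN_OCCURRENCES;
}

/* Remove unprofitable or insignificant patterns by compacting the array
   in place. Returns the number of patterns that remain. */
int filter_patterns(PatternStats *stats, int n)
{
    int kept = 0;
    for (int i = 0; i < n; i++)
        if (pattern_is_kept(&stats[i]))
            stats[kept++] = stats[i];
    return kept;
}
```

A pattern with a large profit sum but only 3 occurrences is dropped here just like a frequent pattern with a negative profit sum.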

int EURUSD_L(float* sig)
{
  if(sig[1]<sig[2] && eqF(sig[2]-sig[4]) && sig[4]<sig[0] && sig[0]<sig[5] && sig[5]<sig[3]
    && sig[10]<sig[11] && sig[11]<sig[7] && sig[7]<sig[8] && sig[8]<sig[9] && sig[9]<sig[6])
      return 19;
  if(sig[4]<sig[1] && sig[1]<sig[2] && sig[2]<sig[5] && sig[5]<sig[3] && sig[3]<sig[0] && sig[7]<sig[8]
    && eqF(sig[8]-sig[10]) && sig[10]<sig[6] && sig[6]<sig[11] && sig[11]<sig[9])
      return 70;
  if(sig[1]<sig[4] && eqF(sig[4]-sig[5]) && sig[5]<sig[2] && sig[2]<sig[3] && sig[3]<sig[0]
    && sig[10]<sig[7] && eqF(sig[7]-sig[8]) && sig[8]<sig[6] && sig[6]<sig[11] && sig[11]<sig[9])
      return 74;
  if(sig[1]<sig[4] && sig[4]<sig[5] && sig[5]<sig[2] && sig[2]<sig[0] && sig[0]<sig[3] && sig[7]<sig[8]
    && eqF(sig[8]-sig[10]) && sig[10]<sig[11] && sig[11]<sig[9] && sig[9]<sig[6])
      return 43;
  if(sig[1]<sig[2] && eqF(sig[2]-sig[4]) && sig[4]<sig[5] && sig[5]<sig[3] && sig[3]<sig[0]
    && sig[10]<sig[7] && sig[7]<sig[8] && sig[8]<sig[6] && sig[6]<sig[11] && sig[11]<sig[9])
      return 68;
  ....
  return 0;
}

Machine generated code is automatically called by adviseLong/Short in test or trade mode, or when training other parameters. The functions in the code get their names from their asset, their algo, and whether they are used for long or short trades. This way many different functions can be stored in the same .c file. The function here has the name EURUSD_L. The list of signals is passed to the function as a float array named sig. float is a variable type similar to var, but has lower precision and thus consumes less memory. sig[0] is the first signal passed to the advise function - in this case priceHigh(2). sig[1] is the second signal (priceLow(2)) and so on. The signals are accessed inside the function just like the elements of a series.

We can see that the function contains many if() conditions with many comparisons of signals with other signals. Each if() condition represents a pattern. The comparisons are linked with && (that's the same as the and operator), so the if() condition is true only when all its comparisons are true. In this case a certain value, like 19 in the first if() condition in the above example, is returned by the function. If none of the if() conditions is true, 0 is returned, meaning that no pattern was found. The returned value is the pattern's score - its information ratio multiplied by 100. The higher the score, the more predictive the pattern is. The returned value is compared with 40, meaning that a long trade is entered for any pattern match with a score above 40. Short trading works just the same way:

if(adviseShort() > 40)
  reverseShort(1);

The adviseShort call has no parameters. In that case the function uses the same method and signals as the last advise call, which was the preceding adviseLong. This way lengthy signal lists don't have to be written twice.

Price action set-up

There are still some prerequisites for pattern analysis that haven't been discussed yet:

StartDate = 2005;
BarPeriod = 1440;
NumWFOCycles = 10;

NumWFOCycles or some other out-of-sample test method is mandatory for this type of strategy. All machine learning systems tend to overfit, so any in-sample result from price patterns, decision trees, or perceptrons would be far too optimistic and thus meaningless. The number 10 is a compromise: higher numbers produce more WFO cycles, ergo fewer bars for any cycle to train, so fewer patterns are found and the results become more random. Lower numbers produce more bars per cycle and more patterns are found, but they are from a longer time period - above one year - within which the market can have substantially changed. So the results can become more random, too.

The number of bars from the same time period could theoretically be increased with oversampling. Unfortunately, oversampling is useless for daily bars because the High, Low, and Close prices depend on a certain bar start and end time. Resampled bars would produce very different patterns. So we must use more than 10 years for the simulation period to get enough data for training. (If price data from a certain year is not included in the Zorro program, it can be downloaded either automatically from the broker server, or with the historic price package from the Zorro download page.)

BarZone = WET;

Normally, daily bars begin and end at UTC midnight. But for price patterns the time zone of the bars is critical. A good time for low EUR/USD volatility is midnight in Western Europe. BarZone determines the time zone of a daily bar; WET is Western European Time, the time zone of London, including daylight saving time.

Weekend = 1;

The Weekend variable determines how the simulator deals with weekend bars. Normally, no bar is allowed to start or end within a weekend. This means that for daily bars, the bar starting Friday 00:00 midnight would end Monday 00:00 midnight. This is not desired here because this bar would then contain prices from Friday as well as from Sunday evening, and spoil the candle pattern. Thus, Weekend = 1 forces the Friday bar to end Saturday 00:00 midnight, although due to the weekend no trades can be entered on that bar. The week then consists of 6 instead of 5 bars.

LookBack = 3;

Because the strategy needs only the last 3 candles for trade decisions, we can set the lookback period from its default 80 down to 3 bars. This gives us three months more for training and testing.

set(RULES+TESTNOW);

The RULES flag is required for generating price patterns with the advise function. TESTNOW runs a test automatically after training - this saves a button click when experimenting with different pattern finding methods.

The next code part behaves differently in training and in test or trade mode:

if(Train) Hedge = 2;

Train is true in [Train] mode. In this mode we want to determine the profitability of a trade that follows a certain pattern. Hedge is set to 2, which allows long and short positions at the same time. This is required for training the patterns; otherwise the short trade after adviseShort would immediately close the long position that was just opened after adviseLong, and thus assign a wrong profit/loss value to its candle pattern. Hedge is not set in test and trade mode, where it makes sense that positions are closed when opposite patterns appear.

LifeTime = 5;

LifeTime sets the duration of a trade to 5 bars, equivalent to about one week. If a trade is not closed by an opposite pattern, it is closed after a week. The trade results after one week are also used for training the candle patterns and generating the trade rules.

The result

Click [Train]. Depending on the PC speed, Zorro will need a few seconds for running through the 10 WFO cycles and finding about 50 profitable long or short patterns in every cycle. Click [Result] for the equity curve:
 

The machine learning algorithm with daily candle patterns seems to give us a relatively steadily rising equity curve and symmetric results in long and short trading. But can the same result be achieved with live trading? Or was it just a lucky test? For finding out, you have to do a Reality Check. Run the test many times (use NumTotalCycles) with a randomized price curve (Detrend = SHUFFLE), plot a histogram of the results, and compare it with the result from the real price curve. This is covered in more detail in the Black Book. For experimenting with other, more advanced machine learning methods - for instance, Deep Learning - look into the article series on Financial Hacker.
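One simple way to compare the real result with the randomized results, once they have been collected, is to count how many randomized runs did at least as well as the real price curve. The following standalone C sketch does that. It is not part of Zorro, and the function name and the choice of performance metric are assumptions; the histogram comparison itself is described in the sources named above.

```c
/* Fraction of runs on randomized price curves whose result is at least
   as good as the result from the real price curve.
   shuffled:    results of the runs with randomized price curves
   n:           number of such runs
   real_result: result of the test with the real price curve
   A small fraction suggests the real result is unlikely to be mere luck. */
double reality_check_fraction(const double *shuffled, int n, double real_result)
{
    if (n <= 0) return 1.0;  /* no evidence at all: assume the worst case */
    int at_least_as_good = 0;
    for (int i = 0; i < n; i++)
        if (shuffled[i] >= real_result)
            at_least_as_good++;
    return (double)at_least_as_good / n;
}
```

If, for instance, many of the randomized runs reach or exceed the real result, the equity curve above says little about the predictive power of the patterns.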

We're now at the end of the coding course. For writing your own systems, you can save a lot of time by flipping through this manual and familiarizing yourself with Zorro's math and statistics functions. Often-used code snippets for your own scripts and strategies can be found on the Tips & tricks page. If you worked with a different trade platform before, read the conversion page about how to convert your old scripts or EAs to lite-C.

What have we learned in this workshop?


Further reading: ► advise, RULES