When it comes to trading and investing, most people are familiar with indices and ETFs: both represent a basket of securities as a single price or value. The difference is that you can trade an ETF directly, and many ETFs are created specifically to replicate an index. For example, you can’t trade the S&P 500 index itself, but you can buy and sell shares of SPY, an ETF that tracks it.
The concept introduced in this post is creating your own ‘index’ and trading it by selecting a portfolio of securities and optimizing the weights to minimize volatility. In this case, we’ll be covering a basket of currency rates intended to represent trading the Euro, and we’ll be performing the optimization with SciPy’s minimize() function. The particular strategy we’re looking at is simple price momentum with a rolling optimization window, meaning we’ll regularly update the lookback period used to calculate the index’s ‘momentum’. This is done on a weekly timeframe, but there’s no reason it can’t be done at a higher resolution.
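If you haven’t used minimize() before, here’s a minimal, self-contained sketch of the pattern we’ll rely on later: a made-up two-asset example that minimizes portfolio variance subject to bounds and a sum-to-one constraint (the covariance numbers are purely illustrative):
# Toy minimize() example: find two asset weights that minimize portfolio
# variance, with each weight bounded to [0, 1] and the weights summing to 1.
# The covariance matrix below is made up purely for illustration.
import numpy as np
from scipy.optimize import minimize, Bounds

cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])

def portfolioVariance(weights):
    return weights @ cov @ weights

bounds = Bounds(np.zeros(2), np.ones(2))
con = {'type': 'eq', 'fun': lambda w: 1 - sum(w)}
result = minimize(portfolioVariance, x0=np.array([0.5, 0.5]), bounds=bounds, constraints=con)
print(result.x)  # weights lean toward the lower-variance first asset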
We’ll start by importing all of our needed libraries:
# Import Packages
import numpy as np
import pandas as pd
from scipy.optimize import minimize, Bounds
from datetime import datetime
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore")  # silence pandas FutureWarnings that don't affect this code
The next three cells are helper functions that keep the backtesting code clean. All of the data comes from TradingView, so the processDatetimes() function converts the exported Unix timestamps into a date format we can read.
def processDatetimes(csv_file):
    # read in the csv file
    data = pd.read_csv(csv_file)
    # convert TradingView's Unix timestamps to datetime objects
    dateTimes = [datetime.fromtimestamp(unixTime) for unixTime in data['time']]
    # assign the datetimes as the dataframe index and drop the original column
    data.index = dateTimes
    data = data.drop(['time'], axis=1)
    data = data.dropna()
    return data
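As a quick usage example (the filename is the TradingView export used throughout this post):
# Example usage of the helper above:
data = processDatetimes('OANDA_EURUSD, 1W_594d3.csv')
print(data.head())
# Note: pd.to_datetime(data['time'], unit='s') is a vectorized alternative,
# though it returns UTC timestamps while fromtimestamp() uses local time.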
The CreatelistofIndices() function allows us to perform the rolling window optimization. Since we’re holding all of the price data in a Pandas dataframe, the backtest will iterate through a list object, where each item in the list is a collection of dates that serves as a dataframe index. You can think of it like this:
returnedList = [timePeriod1, timePeriod2, timePeriod3, …, timePeriodN]
where timePeriod1 is an index of dates:
timePeriod1 = index(1/1/2012, 1/8/2012, 1/15/2012, … , 12/29/2015)
timePeriod2 = index(1/1/2013, 1/8/2013, 1/15/2013, … , 12/29/2016)
You’ll notice that the time periods overlap: timePeriod1 spans 2012-2015 and timePeriod2 spans 2013-2016. That’s because we’ll be splitting each time period into a train set and a test set. Using timePeriod1 as an example, 2012-2014 would be the training set that we optimize our momentum length over, and 2015 would be the test set that we actually record backtest results from. The idea is that we’ll be trading on unseen data in real life, so we shouldn’t inflate our expectations by assuming that results from data we optimized over will carry over to unseen data. Even then, there’s still an element of curve-fitting to consider, since we can see how the test data performed and tweak parameters until the test results look good.
def CreatelistofIndices(data, trainingPeriod, testPeriod):
    # assign the index location that you want to start at
    # starting later is useful if you need room for lookback
    # calculations, e.g. a 200-period moving average
    first_row = 200
    last_row = first_row + trainingPeriod + testPeriod
    # build a list where each entry is a DatetimeIndex covering one window
    listofIndices = []
    while last_row < data.shape[0] + 1:
        listofIndices.append(data.iloc[first_row:last_row].index)
        first_row += testPeriod  # remove this line for an anchored (expanding) analysis instead of a rolling one
        last_row += testPeriod
    return listofIndices
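To make the train/test split concrete, here’s a small sketch (assuming data has already been loaded with processDatetimes() as above) showing how each window gets carved up downstream, exactly the way the backtest does it with .iloc:
# Each window's first trainingPeriod rows are the training set and the
# last testPeriod rows are the out-of-sample test set.
windows = CreatelistofIndices(data, trainingPeriod=52*3, testPeriod=26)
for indices in windows[:2]:
    train = data.loc[indices].iloc[:-26]
    test = data.loc[indices].iloc[-26:]
    print('train:', train.index[0].date(), '->', train.index[-1].date(),
          '| test:', test.index[0].date(), '->', test.index[-1].date())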
Here is where we optimize our weights to minimize volatility. We constrain each weight to lie between 0 and 1, require the weights to sum to 1, and start the optimization from an even split across all six currencies.
def optimizeWeights(data):
    # the objective function we're trying to minimize: the volatility
    # of the weighted index built from the six EUR pairs
    def indexVolatility(weights, data):
        index = (weights[0]*data['EURUSD'] + weights[1]*data['EURCHF']
                 + weights[2]*data['EURCAD'] + weights[3]*data['EURAUD']
                 + weights[4]*data['EURGBP'] + weights[5]*data['EURJPY'])
        return index.std()
    # here we introduce the constraint of 0 <= weight <= 1 for each weight
    bounds = Bounds(np.zeros(6), np.ones(6))
    # this constraint states that all the weights must sum to 1
    con = {'type': 'eq', 'fun': lambda weights: 1 - sum(weights)}
    # start from an even 1/6 split so the initial guess already satisfies the constraint
    weights = np.full(6, 1/6)
    results = minimize(indexVolatility, args=(data,), x0=weights, constraints=con, bounds=bounds)
    weight_EURUSD, weight_EURCHF, weight_EURCAD, weight_EURAUD, weight_EURGBP, weight_EURJPY = results['x']
    return weight_EURUSD, weight_EURCHF, weight_EURCAD, weight_EURAUD, weight_EURGBP, weight_EURJPY
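A quick sanity check on the optimizer (assuming data holds the six EUR return columns):
# The returned weights should sum to 1 within solver tolerance.
weights = optimizeWeights(data)
print([round(w, 3) for w in weights])
print(round(sum(weights), 6))  # ~1.0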
Alright, we now have everything we need to run the backtest. A couple of things to note: the backtest assumes 50:1 leverage on every pair, which may not be the case at every broker, and it assumes we trade with 10% of our available equity, which some would consider too much. If you have the patience to go through the backtest code, you’ll notice that we optimize the momentum length according to its resulting Sharpe ratio: the length that yields the highest Sharpe ratio during the train period is the one used in the test period.
def backtest(volatility_optimized):
    # process the csv file that contains all the data we care about
    data = processDatetimes('OANDA_EURUSD, 1W_594d3.csv')
    # calculate the arithmetic returns of all the currencies
    # (shift(-1) means each row holds the *following* week's return)
    for currency in data.columns:
        data[currency] = (data[currency].shift(-1) - data[currency]) / data[currency]
    # select our train size and test size for each window (assuming 52 weeks in a year)
    trainPeriod = 52*3
    testPeriod = 26
    # generate all the windows that we'll be training and testing on
    indicestoTest = CreatelistofIndices(data, trainPeriod, testPeriod)
    # these series accumulate the out-of-sample results from each window
    backtestProfits = pd.Series(dtype=float)
    backtestReturns = pd.Series(dtype=float)
    backtestSignals = pd.Series(dtype=float)
    # this list will capture the best length from each parameter sweep
    optimalLengths = []
    for indices in indicestoTest:
        lengths = []
        trainProfits = []
        if volatility_optimized:
            weight_EURUSD, weight_EURCHF, weight_EURCAD, weight_EURAUD, weight_EURGBP, weight_EURJPY = optimizeWeights(data.loc[indices].iloc[:-testPeriod])
        else:
            weight_EURUSD, weight_EURCHF, weight_EURCAD, weight_EURAUD, weight_EURGBP, weight_EURJPY = (1/6) * np.ones(6)
        data['return'] = weight_EURUSD*data['EURUSD'] + weight_EURCHF*data['EURCHF'] + weight_EURCAD*data['EURCAD'] + weight_EURAUD*data['EURAUD'] + weight_EURGBP*data['EURGBP'] + weight_EURJPY*data['EURJPY']
        for length in range(5, 50, 5):
            trainData = data
            # momentum signal: long if the index rose over the last `length` weeks, short otherwise
            trainData['signal'] = np.where((trainData['return']+1).cumprod().shift(1) - (trainData['return']+1).cumprod().shift(length) > 0, 1, -1)
            trainData = trainData.loc[indices].iloc[:-testPeriod]
            profits = trainData['signal']*trainData['return'] + 1
            cumulativeProfit = profits.cumprod().iloc[-2]
            volatility = profits.std()*np.sqrt(profits.shape[0])
            sharpe = cumulativeProfit/volatility
            lengths.append(length)
            trainProfits.append(sharpe)
        results = pd.DataFrame()
        results['lengths'] = lengths
        results['train profits'] = trainProfits
        optimalLength = results[results['train profits'] == results['train profits'].max()]['lengths'].iloc[0]
        ### Test Section
        testData = data
        testData['signal'] = np.where((testData['return']+1).cumprod().shift(1) - (testData['return']+1).cumprod().shift(optimalLength) > 0, 1, -1)
        # if even the best train-period score is below 1, sit out the test period
        if results['train profits'].max() < 1:
            testData['signal'] = testData['signal']*0
        testData = testData.loc[indices].iloc[-testPeriod:]
        testProfits = testData['signal']*testData['return']
        backtestProfits = pd.concat([backtestProfits, testProfits])
        backtestReturns = pd.concat([backtestReturns, testData['return']])
        backtestSignals = pd.concat([backtestSignals, testData['signal']])
        optimalLengths.append(optimalLength)
    plt.rcParams["figure.figsize"] = (15, 5)
    # 10% of equity at 50:1 leverage -> 5x the weekly index return
    backtestProfitsCumulative = (backtestProfits*50*0.1 + 1).cumprod()
    return backtestProfitsCumulative, backtestProfits
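Before running it, it’s worth spelling out the position-sizing line at the end: committing 10% of equity at 50:1 leverage gives 5x effective exposure to the index, so a 1% weekly move in the index becomes a 5% move in equity. A quick back-of-the-envelope check:
# Effective exposure from the sizing used in the equity curve above:
weekly_index_move = 0.01   # hypothetical +1% weekly index return
equity_committed = 0.1     # 10% of available equity
leverage = 50              # 50:1 broker leverage
print(weekly_index_move * leverage * equity_committed)  # 0.05 -> +5% on equity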
Now that we have our backtest function created, we can see how the momentum strategy is affected by optimizing for volatility. Here we have both the resulting charts and their corresponding statistics.
equalProfitsCumulative, equalProfits = backtest(volatility_optimized=False)
optimizedProfitsCumulative, optimizedProfits = backtest(volatility_optimized=True)
plt.plot(optimizedProfitsCumulative,label='Optimized for Low Volatility')
plt.plot(equalProfitsCumulative,label='Equally Distributed Weighting')
plt.legend()
[Figure: cumulative equity curves, optimized for low volatility vs. equally distributed weighting]
def backtestStatistics(profitsCumulative, profits):
    # walk the equity curve and track the drawdown from the running peak
    drawdowns = []
    past_percents = []
    returns_percent = profitsCumulative.dropna()
    for cum_percent in returns_percent:
        past_percents.append(cum_percent)
        draw = min(0, (cum_percent - max(past_percents))/max(past_percents))
        drawdowns.append(draw)
    max_drawdown = abs(min(drawdowns))*100
    # annualize the total return over the number of weeks in the backtest
    annual_return = (returns_percent.iloc[-1]**(52/returns_percent.shape[0]) - 1)*100
    win_rate = profits[profits > 0].shape[0] / profits.shape[0]*100
    average_wins = profits[profits > 0].mean()*100
    average_loss = profits[profits < 0].mean()*100
    volatility = profits.std()*100
    # Sharpe ratio assuming a 4% risk-free rate
    sharpe = (annual_return - 4) / (volatility*np.sqrt(52))
    print('        max drawdown: ' + str(round(max_drawdown, 2))+'%')
    print('       annual return: ' + str(round(annual_return, 2))+'%')
    print('            win rate: ' + str(round(win_rate, 2))+'%')
    print('average winning week: ' + str(round(average_wins, 2))+'%')
    print(' average losing week: ' + str(round(average_loss, 2))+'%')
    print('        sharpe ratio: ' + str(round(sharpe, 2)))
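As an aside, the drawdown loop above can be written in a couple of lines with pandas’ cummax(); this sketch is equivalent to the loop, not a change in methodology:
# Vectorized equivalent of the drawdown loop:
def maxDrawdown(profitsCumulative):
    curve = profitsCumulative.dropna()
    drawdown = curve/curve.cummax() - 1   # distance below the running peak
    return abs(drawdown.min())*100        # worst peak-to-trough drop, in percent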
print("Equally Distributed Weighting:")
backtestStatistics(equalProfitsCumulative, equalProfits)
print("\nOptimized for Low Volatility:")
backtestStatistics(optimizedProfitsCumulative, optimizedProfits)
Equally Distributed Weighting:
        max drawdown: 61.42%
       annual return: 2.42%
            win rate: 50.96%
average winning week: 0.59%
 average losing week: -0.57%
        sharpe ratio: -0.28

Optimized for Low Volatility:
        max drawdown: 55.16%
       annual return: 13.01%
            win rate: 53.53%
average winning week: 0.58%
 average losing week: -0.52%
        sharpe ratio: 1.31
We can see that the weights optimized for low volatility produced superior returns compared to the equally distributed weights. At this point you might hope that a setup this simple is the ticket to limitless profits. Unfortunately, there are some issues with this strategy: if you take this code and start adjusting the parameters, you’ll notice the results change drastically. For example, let’s look at what happens when we increase the resolution of our sweep from testing every fifth length to testing every single length.
def backtest(volatility_optimized):
    # process the csv file that contains all the data we care about
    data = processDatetimes('OANDA_EURUSD, 1W_594d3.csv')
    # calculate the arithmetic returns of all the currencies
    # (shift(-1) means each row holds the *following* week's return)
    for currency in data.columns:
        data[currency] = (data[currency].shift(-1) - data[currency]) / data[currency]
    # select our train size and test size for each window (assuming 52 weeks in a year)
    trainPeriod = 52*3
    testPeriod = 26
    # generate all the windows that we'll be training and testing on
    indicestoTest = CreatelistofIndices(data, trainPeriod, testPeriod)
    # these series accumulate the out-of-sample results from each window
    backtestProfits = pd.Series(dtype=float)
    backtestReturns = pd.Series(dtype=float)
    backtestSignals = pd.Series(dtype=float)
    # this list will capture the best length from each parameter sweep
    optimalLengths = []
    for indices in indicestoTest:
        lengths = []
        trainProfits = []
        if volatility_optimized:
            weight_EURUSD, weight_EURCHF, weight_EURCAD, weight_EURAUD, weight_EURGBP, weight_EURJPY = optimizeWeights(data.loc[indices].iloc[:-testPeriod])
        else:
            weight_EURUSD, weight_EURCHF, weight_EURCAD, weight_EURAUD, weight_EURGBP, weight_EURJPY = (1/6) * np.ones(6)
        data['return'] = weight_EURUSD*data['EURUSD'] + weight_EURCHF*data['EURCHF'] + weight_EURCAD*data['EURCAD'] + weight_EURAUD*data['EURAUD'] + weight_EURGBP*data['EURGBP'] + weight_EURJPY*data['EURJPY']
        for length in range(5, 50, 1):  ##### CHANGE WAS MADE HERE #####
            trainData = data
            # momentum signal: long if the index rose over the last `length` weeks, short otherwise
            trainData['signal'] = np.where((trainData['return']+1).cumprod().shift(1) - (trainData['return']+1).cumprod().shift(length) > 0, 1, -1)
            trainData = trainData.loc[indices].iloc[:-testPeriod]
            profits = trainData['signal']*trainData['return'] + 1
            cumulativeProfit = profits.cumprod().iloc[-2]
            volatility = profits.std()*np.sqrt(profits.shape[0])
            sharpe = cumulativeProfit/volatility
            lengths.append(length)
            trainProfits.append(sharpe)
        results = pd.DataFrame()
        results['lengths'] = lengths
        results['train profits'] = trainProfits
        optimalLength = results[results['train profits'] == results['train profits'].max()]['lengths'].iloc[0]
        ### Test Section
        testData = data
        testData['signal'] = np.where((testData['return']+1).cumprod().shift(1) - (testData['return']+1).cumprod().shift(optimalLength) > 0, 1, -1)
        # if even the best train-period score is below 1, sit out the test period
        if results['train profits'].max() < 1:
            testData['signal'] = testData['signal']*0
        testData = testData.loc[indices].iloc[-testPeriod:]
        testProfits = testData['signal']*testData['return']
        backtestProfits = pd.concat([backtestProfits, testProfits])
        backtestReturns = pd.concat([backtestReturns, testData['return']])
        backtestSignals = pd.concat([backtestSignals, testData['signal']])
        optimalLengths.append(optimalLength)
    plt.rcParams["figure.figsize"] = (15, 5)
    # 10% of equity at 50:1 leverage -> 5x the weekly index return
    backtestProfitsCumulative = (backtestProfits*50*0.1 + 1).cumprod()
    return backtestProfitsCumulative, backtestProfits
backtestProfitsCumulative, backtestProfits = backtest(volatility_optimized=True)
plt.plot(backtestProfitsCumulative)
[Figure: cumulative equity curve with the finer parameter sweep]
Without even outputting the statistics, we can see that the results are very different after tweaking a single parameter. This should raise a red flag: either this is not a tradeable concept, or more study is needed.
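One way to quantify this sensitivity would be to parameterize the sweep step and compare final equity across settings. This is only a sketch: it assumes backtest() is given an extra sweep_step argument that gets plumbed into the range() call, which the version above doesn’t have.
# Hypothetical sensitivity check, assuming backtest() gains a `sweep_step`
# argument used as `for length in range(5, 50, sweep_step)`:
for sweep_step in [1, 2, 3, 5]:
    curve, _ = backtest(volatility_optimized=True, sweep_step=sweep_step)
    print('step', sweep_step, '-> final equity multiple:', round(curve.iloc[-1], 2))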
For future revisions, the following should be considered:
- See whether optimizing for low volatility is more useful when each currency is traded separately and a portfolio of the individual strategies is optimized, rather than the current approach of building an index first
- Check whether the profitability of momentum on low-volatility currency baskets is statistically significant before going straight into backtesting (a rough sketch of one such test follows below)
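For the second item, a one-sample t-test of the out-of-sample weekly profits against zero would be one rough starting point. Weekly returns aren’t i.i.d., so treat the p-value as indicative rather than conclusive:
# Rough significance sketch using the out-of-sample profits computed above:
from scipy import stats
t_stat, p_value = stats.ttest_1samp(optimizedProfits.dropna(), 0)
print('t-statistic:', round(t_stat, 2))
print('p-value:    ', round(p_value, 4))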
Version History

Version | Date      | Author        | Description
------- | --------- | ------------- | ---------------
1.0     | 4/10/2023 | Quanty Python | Initial Release
Disclaimer
The information provided on this blog is for educational and informational purposes only. It should not be considered as financial or investment advice. The content presented here is based on the personal opinions and experiences of the author and should not be interpreted as a recommendation to buy, sell, or trade any financial instrument.
Trading and investing in the financial markets involve significant risks. It is important to conduct thorough research, seek professional advice, and carefully consider your financial situation and risk tolerance before making any investment decisions. Past performance is not indicative of future results, and no guarantee can be made regarding the profitability or success of any trading strategy or investment approach discussed on this blog.
The author and the blog shall not be held responsible for any losses, damages, or liabilities arising from the use of the information presented here. Readers are solely responsible for their own investment decisions and should seek the advice of a qualified financial professional before taking any action.
The blog may contain links to external websites or resources for informational purposes. The author and the blog do not endorse or guarantee the accuracy, completeness, or reliability of any information or content provided on these external sites.
Trading financial instruments involves inherent risks, and it is important to understand the potential consequences of your actions. Always exercise caution and diligence when engaging in any financial transactions or investment activities.