To improve our portfolio optimization tool, we carefully investigated the parameters it uses in order to assess the effectiveness and limitations of our model. Some of the aspects examined are briefly presented below. A more complete analysis, with more details on how each figure is constructed, can be found in the technical paper on our resource page.

Impact of pre-simulation data length: For stochastic models to be effective, they need to incorporate some training data from before the start of the simulation. We therefore include a period of known historical data that immediately precedes the simulation. But how does the amount of pre-simulation training data affect a stochastic model? Specifically, does incorporating more data necessarily lead to better results?

Left: Predicted Coca-Cola stock price movement using one year of historical data. Right: Predicted Coca-Cola stock price movement using five years of historical data.

In the particular case shown above, model accuracy decreased with a larger pre-simulation data set. One possible reason is that going back too far in time includes price trends that are no longer relevant. For example, in the period from five years to one year before the simulation start, Coca-Cola stock performed far worse than in the year immediately preceding the simulation. As a result, with five years of data all models underestimated the real-world price movement more severely.

Overall, though, in periods with fewer abnormalities, we found that accuracy generally increased as more pre-simulation training data was used. As shown above, however, there are many exceptions to this trend, and care should be taken to look for abnormal factors, such as stock market crashes, when deciding what training data to include.
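To make the effect of the training window concrete, here is a minimal sketch of how the estimated trend can change with the amount of history used. It is not the platform's actual calibration code: the function name `estimate_drift_and_vol`, the 252-trading-day year, and the synthetic price series (which is not real Coca-Cola data) are all assumptions made for illustration.

```python
import numpy as np

def estimate_drift_and_vol(prices, lookback_days, trading_days_per_year=252):
    """Annualized mean log return and volatility over the last `lookback_days` days."""
    window = np.asarray(prices, dtype=float)[-(lookback_days + 1):]
    log_returns = np.diff(np.log(window))
    drift = log_returns.mean() * trading_days_per_year
    vol = log_returns.std(ddof=1) * np.sqrt(trading_days_per_year)
    return drift, vol

# Synthetic history (assumption, NOT real market data): four weak years
# followed by one strong year, similar in spirit to the case discussed above.
rng = np.random.default_rng(0)
weak = rng.normal(-0.0003, 0.008, size=4 * 252)   # daily log returns, weak period
strong = rng.normal(0.002, 0.008, size=252)       # daily log returns, strong year
prices = 40.0 * np.exp(np.cumsum(np.concatenate([[0.0], weak, strong])))

drift_1y, vol_1y = estimate_drift_and_vol(prices, 252)
drift_5y, vol_5y = estimate_drift_and_vol(prices, 5 * 252)
print(f"Estimated annual drift, 1-year window: {drift_1y:.3f}")
print(f"Estimated annual drift, 5-year window: {drift_5y:.3f}")
```

In this synthetic setup the five-year window blends the weak years into the estimate, so a model calibrated on it would project a flatter trend than one calibrated on the most recent year alone.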

Impact of simulation length: Another parameter to vary is the simulation length. For example, we can run a three-month simulation that tries to predict price trends over the next three months, or a two-year simulation that tries to predict trends over the next two years. We therefore need to investigate how different simulation lengths affect the accuracy of stock price trend predictions.

Left: Three-month Coca-Cola stock price simulation using five years of historical data. Right: One-year Coca-Cola stock price simulation using five years of historical data.

Generally, as simulation length increases, the historical price becomes more likely to fall within the 90% confidence interval. This is likely driven by two factors. First, as a simulation goes on, the variance of the predicted price naturally increases, reflecting the fact that models become less certain the further they look into the future. Second, price abnormalities are more likely to cancel out or be mitigated over longer simulations. For example, a real-world price spike that the model did not account for may make the model very inaccurate in the short term. However, so long as price spikes are not too frequent, or they occur frequently but in opposite directions, the model will revert closer to the real-world price over time.
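The widening of the confidence interval can be illustrated with a small Monte Carlo sketch. This assumes a plain geometric Brownian motion with made-up drift and volatility values, which is a generic stand-in rather than the simulation procedure described in the technical paper; the helper names `simulate_gbm_paths`, `interval_90`, and `coverage` are hypothetical.

```python
import numpy as np

def simulate_gbm_paths(start_price, drift, vol, n_days, n_paths, seed=0):
    """Simulate daily prices under geometric Brownian motion (annualized drift and vol)."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / 252  # one trading day, in years
    shocks = rng.standard_normal((n_paths, n_days))
    log_steps = (drift - 0.5 * vol**2) * dt + vol * np.sqrt(dt) * shocks
    return start_price * np.exp(np.cumsum(log_steps, axis=1))

def interval_90(paths):
    """Per-day 5th and 95th percentiles across all simulated paths."""
    lower, upper = np.percentile(paths, [5, 95], axis=0)
    return lower, upper

def coverage(actual, lower, upper):
    """Fraction of days on which the actual price lies inside the interval."""
    actual = np.asarray(actual, dtype=float)
    return float(np.mean((actual >= lower) & (actual <= upper)))

# Illustrative parameters only (assumptions, not fitted to any real stock).
paths = simulate_gbm_paths(start_price=50.0, drift=0.05, vol=0.20,
                           n_days=252, n_paths=5000)
lower, upper = interval_90(paths)
width_3m = upper[62] - lower[62]   # roughly three months of trading days
width_1y = upper[-1] - lower[-1]   # one year
print(f"90% interval width after ~3 months: {width_3m:.2f}")
print(f"90% interval width after 1 year:    {width_1y:.2f}")
```

Passing a real price series of matching length to `coverage` gives the share of days on which the observed price stays inside the simulated band, which is one simple way to compare simulation lengths.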


Impact of varying stocks: The two sets of simulations above looked only at Coca-Cola, and overall had fairly poor accuracy. Why is Coke’s price trend predicted so inaccurately? Will these models be more or less accurate for other stocks?

Left: Three-month Exxon stock price simulation using one year of historical data. Right: Three-month Coca-Cola stock price simulation using one year of historical data.

We can see that the prediction for Exxon is much more accurate than the one for Coke. Comparing the months preceding each simulation, Exxon continues to perform in a way much more similar to its past than Coke does. In the Coke stock, there is an initial rapid increase in price that was not accounted for by the model. Generally speaking, stocks that continue to perform similarly to their past performance will have much more accurate model results. Of course, this cannot be known in advance, so there is inherently a risk associated with using models.

Stochastic model comparison: So far we have seen the CAPM, Fama-French and Carhart models in action for just two stocks in a small variety of simulations. Overall, which of the three models is the most accurate? Does one tend to perform better than the others under certain conditions?

Top: Efficient frontier for CAPM. Middle: Efficient frontier for Fama-French model. Bottom: Efficient frontier for Carhart model.


Generally, the CAPM has the worst accuracy, as highlighted by the figure above, although all three models are reasonably good at capturing the expected trend. The Fama-French and Carhart models are generally of comparable accuracy. The Carhart model is typically better when a stock continues to perform, relative to the rest of the stock market, to a similar degree as it did in the past. However, this is difficult to know in advance, so either the Fama-French or the Carhart model would be suitable for deriving insights.
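For reference, the three models are usually written as nested factor regressions, shown below in their standard textbook form. This assumes the three-factor version of Fama-French; the exact specification used by the platform is given in the technical paper and may differ. Here $R_{i,t}$ is the return of stock $i$, $R_{f,t}$ the risk-free rate, $R_{m,t}$ the market return, $\mathrm{SMB}_t$ and $\mathrm{HML}_t$ the size and value factors, and $\mathrm{MOM}_t$ the momentum factor.

```latex
\begin{align*}
\text{CAPM:} \quad
  & R_{i,t} - R_{f,t} = \alpha_i + \beta_i\,(R_{m,t} - R_{f,t}) + \varepsilon_{i,t} \\
\text{Fama-French:} \quad
  & R_{i,t} - R_{f,t} = \alpha_i + \beta_i\,(R_{m,t} - R_{f,t})
    + s_i\,\mathrm{SMB}_t + h_i\,\mathrm{HML}_t + \varepsilon_{i,t} \\
\text{Carhart:} \quad
  & R_{i,t} - R_{f,t} = \alpha_i + \beta_i\,(R_{m,t} - R_{f,t})
    + s_i\,\mathrm{SMB}_t + h_i\,\mathrm{HML}_t + m_i\,\mathrm{MOM}_t + \varepsilon_{i,t}
\end{align*}
```

The only difference between the Fama-French and Carhart forms is the momentum term $m_i\,\mathrm{MOM}_t$, which is consistent with the Carhart model doing better when a stock's recent relative performance persists.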

Impact of rebalancing: Investors are unlikely to keep their portfolio exactly the same for many years and will likely rebalance it. But how often should an investor rebalance? What role does the number of times a portfolio is rebalanced play in terms of returns and accuracy?

Top: Efficient frontier with one-time rebalancing. Middle: Efficient frontier with rebalancing three times. Bottom: Efficient frontier with daily rebalancing.

Here we can see that rebalancing more frequently can improve predicted returns. However, these returns are often not realized, with the real-world portfolio typically vastly under-performing the prediction. This is because model accuracy is worse when simulation periods are too short. If portfolios are rebalanced at a high frequency, such as daily, there is so much cumulative inaccuracy that predictions are almost completely ineffective. As a result, investors should not rebalance frequently unless they have a strong financial background that provides additional insights our platform cannot.
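The mechanics of rebalancing at different frequencies can be sketched as follows. This is a simplified illustration that resets the portfolio to fixed target weights; the platform presumably re-runs its optimization at each rebalance, which is not modeled here. The function name `simulate_rebalanced_portfolio` and the example returns are hypothetical.

```python
import numpy as np

def simulate_rebalanced_portfolio(asset_returns, target_weights, rebalance_every):
    """Grow a portfolio from value 1.0, resetting to target weights periodically.

    asset_returns: shape (n_periods, n_assets), simple return per period.
    rebalance_every: number of periods between rebalances (1 = every period).
    Returns the portfolio value after each period.
    """
    asset_returns = np.asarray(asset_returns, dtype=float)
    target_weights = np.asarray(target_weights, dtype=float)
    holdings = target_weights.copy()  # value currently held in each asset
    values = []
    for t, period_returns in enumerate(asset_returns):
        if t > 0 and t % rebalance_every == 0:
            # Redistribute the current total value according to the targets.
            holdings = holdings.sum() * target_weights
        holdings = holdings * (1.0 + period_returns)
        values.append(holdings.sum())
    return np.array(values)

# Made-up two-asset example: asset 0 gains 10% per period, asset 1 is flat.
example_returns = [[0.10, 0.0], [0.10, 0.0]]
never = simulate_rebalanced_portfolio(example_returns, [0.5, 0.5], rebalance_every=10)
every = simulate_rebalanced_portfolio(example_returns, [0.5, 0.5], rebalance_every=1)
print(f"Final value without rebalancing: {never[-1]:.4f}")
print(f"Final value rebalancing each period: {every[-1]:.4f}")
```

In a fuller version, each rebalance would use a fresh model prediction over the next interval, so shorter intervals mean more predictions, each made over a horizon where the model is less reliable.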