Tuesday, May 31, 2011

The Data-Driven Manifesto

First and foremost, timing is everything. And not just in humor.

The correct timing is the sole requirement for profitability. Whether it be signals generated by a statistical model or an important event in the news, all information is incorporated very quickly into market prices. Thus, the only thing that matters is timing each trade just right, whether that means capturing imbalances in supply and demand at the microstructure level, parsing news articles with information extraction algorithms, or betting on a trend reversal after seeing a familiar signal. There is no such thing as a temporally-irrelevant investment. Everything depends on time.

If data is old, one cannot use it to invest successfully. No fundamental, technical, or obvious statistical data is of any use on its own, because it is public knowledge that has already been incorporated into the price of the security, commodity, currency, or derivative in question. Betting on a known fact is useless, because the market price already reflects that fact and will not change because of it. How could it? The information has already been unveiled. It is not going to unveil itself again--it cannot become common knowledge twice!

However, if the data suggest a statistical anomaly in the presumably efficient market, then it may be proper to act on it, provided that others are not aware of the anomaly. Widely observed statistical phenomena such as the value, size, and momentum factors are poor foundations for trading strategies precisely because they are widely observed, and crowding makes them riskier; indeed, the more market participants use a strategy, the more potential that strategy has to underperform the overall market index.

The whole reason investment strategies are made market-neutral is that their creators did not want to worry about predicting where investors, as a flock, would go next. Unfortunately, any strategy with sufficient popularity suffers from the same problem: people pour capital into the strategy and then withdraw it, creating volatility in the strategy's returns. This is true for everything from statistical arbitrage to the failed quantitative equity selection models based on value and size. The strategies worked well only because few people used them; their returns came from corrections, not from crowd behavior. For instance, when stat arb was still new, success came in the form of lower returns for an outperforming stock. It was not that investors using stat arb recognized that the value of the stock was too high, but rather that the strategy's successes were phenomena unrelated to the direct investment decisions of any individual or institution. Once such strategies became popular, their returns were directly impacted by a larger portion of the market, which now knew of and traded the strategy directly. At that point, the strategies were inherently subject to the same fickleness that market indices have always been subject to.

Rather intuitively, the more participants use a strategy, the lower its returns and the lower its Sharpe ratio. The standard deviation of an overused strategy's returns is also much higher, again for intuitive reasons: the number of market participants is positively correlated with the number of multi-strategy investors; as the number of strategies in use increases, the probability that at least one strategy fails increases as well--eventually one of the many strategies is bound to fail; and as the number of participants increases, the number of investors most heavily exposed to the failing strategy increases as well. As the number of investors most heavily exposed to the failing strategy approaches infinity, the probability that at least one of them is heavily leveraged and consequently receives a large margin call (as a result of the failing strategy) converges to unity. From that point, the participant with the margin call liquidates an arbitrary number of strategy portfolios. If these include, say, a value/momentum book, then the value/momentum strategy will suffer from the unwinding of the participant's portfolio.

The problem, then, is not market crashes or strategy failure per se, but rather the spillover onto other portfolios using the same over-popular strategy. The solution is obvious: use strategies that other participants have (literally) never even heard of. The mere knowledge that a strategy is possible may encourage participants to covertly experiment with it, and perhaps put it into practice without declaring it. Luckily, as more participants use the strategy, its Sharpe ratio declines, and vigilant observation will allow the original users of the strategy to leave quietly before it fails in dramatic fashion.

* * * * *

The problem with theory-driven strategies is that they usually reject temporal trading rules. For instance, CAPM, EMH, APT, and MPT in general fail to account for the possibility of different expected returns across time. They cannot adjust to changing market conditions either, and their models often make too many restrictive assumptions.

The fullest implication of "data-driven" strategies is that their associated models are not merely specified ahead of time and then calibrated to data; rather, the data itself determines the model's structure, not just its parameters.

For instance, the number of hidden layers in an artificial neural network, the number of iterations of a genetic algorithm for portfolio selection, and the number of states in a hidden Markov model are all chosen by a human programmer. Yet these human decisions are precisely what keep the model from being fully shaped by the data. Thus, these decisions, too, need to be supported by the data.
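
As a minimal sketch of what "the data determines the structure" might look like in practice, one could let an information criterion choose the number of hidden states in a Markov model rather than fixing it by hand. This assumes the third-party hmmlearn package and uses simulated returns as a stand-in for real market data; the parameter count in the BIC is only a rough one.

```python
# Sketch: let the data choose the number of HMM states via BIC,
# rather than hard-coding it. Assumes the third-party `hmmlearn`
# package; the returns below are simulated stand-ins for market data.
import numpy as np
from hmmlearn.hmm import GaussianHMM

def hmm_bic(model, X):
    """Rough BIC for a fitted GaussianHMM with diagonal covariances."""
    k, d = model.n_components, X.shape[1]
    # free parameters: initial probs + transition matrix + means + variances
    n_params = (k - 1) + k * (k - 1) + k * d + k * d
    log_likelihood = model.score(X)
    return -2.0 * log_likelihood + n_params * np.log(X.shape[0])

def select_n_states(X, candidates=range(2, 7), seed=0):
    best_model, best_bic = None, np.inf
    for k in candidates:
        model = GaussianHMM(n_components=k, covariance_type="diag",
                            n_iter=200, random_state=seed)
        model.fit(X)
        bic = hmm_bic(model, X)
        if bic < best_bic:
            best_model, best_bic = model, bic
    return best_model

# Example usage with simulated returns standing in for market data:
returns = np.random.default_rng(0).normal(0.0, 0.01, size=(1000, 1))
model = select_n_states(returns)
print("number of states chosen by the data:", model.n_components)
```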

The optimization method for the most accurate model (as measured by the probability of the model producing the training sequence) should be the one that leads to global optimality. Hill-climbing algorithms can only guarantee local optimality and are therefore less desirable than algorithms that search for global maxima. This is intuitive, since the more accurate the model is, the better it represents the truth behind how the market moves and works.
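
The EM-style fitting routine used above is itself a hill-climber, so one crude hedge against local optima, sketched here under the same assumptions (and reusing the imports) as the previous snippet, is to restart from many random initializations and keep the highest-likelihood fit:

```python
# Sketch: approximate a global optimum by restarting the local
# (hill-climbing) EM fit from many random seeds and keeping the
# model with the highest training log-likelihood.
def fit_with_restarts(X, n_states, n_restarts=20):
    best_model, best_score = None, -np.inf
    for seed in range(n_restarts):
        model = GaussianHMM(n_components=n_states, covariance_type="diag",
                            n_iter=200, random_state=seed)
        model.fit(X)
        score = model.score(X)
        if score > best_score:
            best_model, best_score = model, score
    return best_model
```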

Once the globally optimal model's structure is perfected through historical profitability tests, the model is ready to use, and no more human intervention is necessary. However, as a scientific experiment, it would be interesting to see what kind of unexpected connections, classifications, and procedures the models could come up with. They will be complex, and likely counterintuitive.

"Black box" has become finance-lingo for any algorithmic trading strategy without a simple, logical backing. The models' structures and parameters--or anything too complex or too counterintuitive for a human to understand--are labeled as "black-box" as if that was a bad thing. The fact that the strategy is obscure helps to avoid the crowding effect that leads to the downfall of every hyped investment strategy, from LTCM's fixed income arbitrage disaster to PDT's temporary Stat Arb troubles in August and November 2007. The more black-box it is, the less others are likely to catch on, and the better the strategy's performance will be. Put differently, assuming that it is sound, it won't be subject to failure on account of sheer popularity. From a bigger perspective, all data-driven strategies work well--assuming that they are not too well known, in which case their success is nothing but a house of cards that has been lucky not to suffer from a gust of wind--because they exploit market phenomena that do exist, rather than ones that ought to exist. Indeed, data-driven strategies are valid because they have been validated by the market.

High Frequency Portfolios

An investment management company must keep track of risk across all positions and strategies used in each of its funds. At high frequency, however, this matters less, because positions are held for only a short time and the strategies can be made market-neutral or even componentless (as measured by Principal Components Analysis). Instead, the focus is on transaction costs. How much money does the strategy lose to market impact when entering and exiting the trade? How much of the remaining profit is eaten away by broker fees? These costs determine how large a position may be built in the time available for the strategy to work. Thus, the costs and the time-series forecast determine how much may be invested, and the investment is entirely time-based. Enter here; wait for the time series' next value; exit here if the prediction is unfavorable or if a better post-transaction-cost profit is predicted elsewhere--and hold the position if it is favorable; repeat.
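
The enter/wait/exit loop can be written out as a sketch. Everything here is hypothetical: the `forecast`, `impact_cost`, and `broker_fee` helpers are assumed inputs, not real library calls, and the exit-to-a-better-opportunity branch is omitted for brevity.

```python
# Hypothetical sketch of the time-based loop described above.
# `forecast(t)`, `impact_cost(qty)`, and `broker_fee(qty)` are assumed
# helpers supplied by the caller; none of them are real library calls.
def run_strategy(prices, forecast, impact_cost, broker_fee, qty):
    position, entry_price, pnl = 0, None, 0.0
    for t in range(len(prices)):
        predicted_move = forecast(t)                  # next-step forecast
        costs = impact_cost(qty) + broker_fee(qty)    # round-trip cost estimate
        if position == 0 and predicted_move * qty > costs:
            position, entry_price = qty, prices[t]    # enter: forecast beats costs
        elif position > 0 and predicted_move <= 0:
            pnl += position * (prices[t] - entry_price) - costs
            position, entry_price = 0, None           # exit: the edge is gone
        # otherwise hold and wait for the next value of the time series
    return pnl
```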

However, the size of orders should also be limited to a certain percentage of portfolio value, leverage should remain fixed, and a Value-at-Risk system should be used to determine whether a trade's marginal VaR is too high for it to be a worthwhile investment.
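
A minimal, hypothetical pre-trade check combining those three limits might look like the following; the thresholds and the externally computed marginal VaR are assumptions for illustration, not a real risk system.

```python
# Hypothetical pre-trade check for the three limits described above:
# position size as a fraction of portfolio value, a fixed leverage cap,
# and a ceiling on the trade's marginal Value-at-Risk.
def trade_allowed(order_value, portfolio_value, gross_exposure, marginal_var,
                  max_position_pct=0.02, max_leverage=4.0,
                  max_marginal_var_pct=0.001):
    within_size = order_value <= max_position_pct * portfolio_value
    within_leverage = (gross_exposure + order_value) / portfolio_value <= max_leverage
    within_var = marginal_var <= max_marginal_var_pct * portfolio_value
    return within_size and within_leverage and within_var
```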

One theoretical (gasp!) price impact model is written as an integral from the initial price to the final acceptable transaction price, where the integrand is the price multiplied by the liquidity function. The liquidity function is itself a function of the price and can be estimated empirically through experimentation and some statistical regression. Of course, we integrate with respect to the price.
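
Written out, purely as a restatement of the prose above, with p_0 the initial price, p_1 the final acceptable transaction price, and L(p) the empirically fitted liquidity function, the dollar impact C and the number of shares acquired Q over that price range are:

```latex
C \;=\; \int_{p_0}^{p_1} p \, L(p) \, dp ,
\qquad
Q \;=\; \int_{p_0}^{p_1} L(p) \, dp
```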

This is intuitive because there is only an infinitesimally small amount we can buy without moving the price. The price moves as we buy, so we take the price times the amount we can buy at that price, for every price between the initial one and the final one. (And this works in reverse when selling--the upper limit of the integral will simply be lower than the lower limit.)

If there is a largest absolute price impact that the investor is willing to allow, he can simply set that impact equal to the integral described above and solve for the upper limit. Then he can see how many shares he will be buying by integrating the liquidity function with respect to the price, using the same limits of integration. Conversely, if an investor wants to see how much of an impact a trade of a certain size will make, the latter expression can be set equal to the desired number of shares and the upper limit of the integral solved for. In that case, we plug the upper limit into the former expression and integrate to find the market impact of the trade.
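
As a numerical illustration of the first procedure, here is a sketch using SciPy's quadrature and root-finding routines. The exponential-decay liquidity function and all of the numbers are made up for the example; in practice L(p) would come from the empirical regression mentioned above.

```python
# Sketch: given a maximum acceptable dollar impact C_max, solve for the
# upper limit p1 of the impact integral, then integrate the liquidity
# function over the same limits to see how many shares that buys.
# The liquidity function and numbers below are toy assumptions.
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

def liquidity(p, p0=100.0, depth=5000.0, decay=2.0):
    # toy depth density: fewer shares available as price moves away from p0
    return depth * np.exp(-decay * (p - p0))

def impact(p1, p0=100.0):
    # C(p1) = integral of p * L(p) dp from p0 to p1
    return quad(lambda p: p * liquidity(p, p0), p0, p1)[0]

def shares(p1, p0=100.0):
    # Q(p1) = integral of L(p) dp from p0 to p1
    return quad(lambda p: liquidity(p, p0), p0, p1)[0]

C_max = 50_000.0                                        # largest impact allowed
p1 = brentq(lambda p: impact(p) - C_max, 100.0, 110.0)  # solve for the upper limit
print(f"final price {p1:.4f}, shares bought {shares(p1):.1f}")
```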

From there, the investor can construct a search heuristic (particle swarm optimization, cuckoo search, etc.) to find the optimal trade size. With a size in hand, an execution algorithm that spreads transactions across time can be used--something that minimizes (if buying) or maximizes (if selling) the average execution price relative to the VWAP or TWAP.
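
As a crude stand-in for the heuristics named above (a real implementation would use particle swarm or cuckoo search), the following sketch does a plain random search over the upper price limit, maximizing forecast profit net of impact and fees. It reuses the toy liquidity, impact, and shares functions from the previous snippet, and the forecast price and fee are made-up numbers.

```python
# Sketch: random search (a simple stand-in for particle swarm or cuckoo
# search) over the upper price limit p1, maximizing forecast profit net
# of impact and broker fees. Reuses the toy functions defined above.
rng = np.random.default_rng(42)
forecast_price = 100.5        # hypothetical next-period price forecast
fee_per_share = 0.005         # hypothetical broker fee

def net_profit(p1):
    q = shares(p1)
    cost = impact(p1) + fee_per_share * q
    return q * forecast_price - cost

candidates = rng.uniform(100.0001, 101.0, size=500)  # candidate upper limits
best_p1 = max(candidates, key=net_profit)
print(f"best upper limit {best_p1:.4f}, shares {shares(best_p1):.1f}, "
      f"expected net profit {net_profit(best_p1):.2f}")
```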