Mid-Price Movement Prediction in Limit Order Books Using Feature Engineering and Machine Learning
|Status||Published - 25 October 2019|
|Name||Tampere University Dissertations|
The thesis develops novel methods for identifying limit order book characteristics that give traders and market makers an information edge in their trading. A good proxy for traders and market makers is the prediction of mid-price movement, which is the main target of this thesis. The contributions of this thesis are categorized chronologically into three parts. The first part introduces to the literature the first publicly available limit order book dataset for high-frequency trading, intended for the task of mid-price movement prediction. The dataset is accompanied by an experimental protocol that uses methods inspired by ridge regression and a single-layer feed-forward neural network as classifiers. These classifiers use state-of-the-art limit order book features as inputs for the target task.
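As a rough illustration of the target task, the sketch below computes the mid-price from the best ask and best bid and turns it into movement labels. The three-class labeling (up, stationary, down), the `horizon` and `threshold` parameters, and the function names are assumptions made for this example, not the protocol defined in the thesis.

```python
def mid_price(best_ask, best_bid):
    """Mid-price: the average of the best ask and best bid prices."""
    return (best_ask + best_bid) / 2.0


def label_movements(mids, horizon=1, threshold=0.0):
    """Label each step by the relative mid-price change `horizon` steps ahead.

    Returns 1 (up), -1 (down), or 0 (stationary). Both `horizon` and
    `threshold` are hypothetical parameters chosen for illustration.
    """
    labels = []
    for t in range(len(mids) - horizon):
        change = (mids[t + horizon] - mids[t]) / mids[t]
        if change > threshold:
            labels.append(1)
        elif change < -threshold:
            labels.append(-1)
        else:
            labels.append(0)
    return labels


# Toy best-ask / best-bid series (made-up numbers, not from the dataset).
asks = [101.0, 102.0, 102.0, 100.0]
bids = [99.0, 100.0, 100.0, 98.0]
mids = [mid_price(a, b) for a, b in zip(asks, bids)]
labels = label_movements(mids)
```

A classifier such as the ones mentioned above would then be trained to predict these labels from limit order book features observed at each step.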
The next contribution of this thesis is the use and development of a wide range of technical and quantitative indicators for the task of mid-price movement prediction via an extensive feature selection process. This feature selection process identifies which features improve predictive performance. The results suggest that the newly introduced quantitative feature, based on an adaptive logistic regression model for online learning, was selected first according to several criteria. These criteria are based on entropy, linear discriminant analysis, and least mean square error.
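To make the idea of a feature derived from an online logistic regression model more concrete, the following is a minimal sketch under strong assumptions. It uses plain stochastic gradient descent with a fixed learning rate and emits the model's predicted probability as the feature value before each update. The class name, the single-input toy stream, and the learning rate are hypothetical, and the adaptive scheme used in the thesis is not reproduced here.

```python
import math


class OnlineLogisticFeature:
    """Logistic regression updated one sample at a time (plain SGD).

    Assumption: a fixed learning rate stands in for whatever adaptive
    scheme the thesis uses; this is an illustration only.
    """

    def __init__(self, n_inputs, learning_rate=0.1):
        self.w = [0.0] * n_inputs
        self.b = 0.0
        self.lr = learning_rate

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        # Numerically stable sigmoid.
        if z >= 0:
            return 1.0 / (1.0 + math.exp(-z))
        e = math.exp(z)
        return e / (1.0 + e)

    def update(self, x, y):
        """Return the prediction made before seeing y, then take one SGD step."""
        p = self.predict_proba(x)
        err = p - y  # gradient of the log-loss with respect to z
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err
        return p


# Toy stream: one hypothetical input per event, y = 1 if the mid-price rose.
stream = [([1.0], 1), ([-1.0], 0)] * 50
model = OnlineLogisticFeature(n_inputs=1)
features = [model.update(x, y) for x, y in stream]
```

Because each feature value is computed before the corresponding label is used for the update, the resulting series does not leak the current label into the feature.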
The third contribution is the introduction of econometric features as inputs to deep learning models for the task of mid-price movement prediction. An extensive comparison against other state-of-the-art hand-crafted features and fully automated feature extraction processes is provided. Furthermore, a new experimental protocol is developed for the task of mid-price prediction to overcome the problem of time irregularities, which characterize high-frequency data. Results suggest that advanced hand-crafted features, such as econometric indicators, can predict movements of proxies such as the mid-price.