https://doi.org/10.55640/
A Hybrid Volatility-Driven Statistical Arbitrage Framework: Integrating Advanced Time-Series Econometrics and Machine Learning for Enhanced Stock Market Trend Prediction
Piotr S. Volkov, Faculty of Computational Finance, Moscow State University, Moscow, Russia
Aisha N. Mensah, Department of Computer Science, Kwame Nkrumah University of Science and Technology, Kumasi, Ghana
Abstract
Purpose: This study introduces the Hybrid Volatility-Driven Statistical Arbitrage (HVSA) framework, an integrated quantitative strategy designed to improve stock market trend prediction and exploit mean-reversion opportunities. The methodology combines advanced time-series econometrics with machine learning to generate robust, non-spurious trading signals, focusing on assets with mid-range volatility.
Methods: The HVSA framework proceeds in four stages. First, a Gaussian Mixture Model clusters a large universe of assets by their drift-independent realized volatility profiles, isolating those in tradable volatility regimes. Second, the Granger Causality Test is applied within the resulting clusters to identify predictive, causal linkages between asset pairs, going beyond simple co-integration. Third, the extracted causal features, together with Volatility-of-Volatility metrics, are fed into a Deep Neural Network classifier, trained with an adaptive resampling protocol, that predicts the directional trend of the arbitrage spread. Finally, the entire strategy is validated with a stringent forwardtesting protocol that accounts for realistic market constraints and transaction costs.
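The first two stages can be illustrated with a minimal sketch, offered under stated assumptions rather than as the authors' implementation. The abstract does not name its volatility estimator, so the Yang-Zhang estimator is assumed here as a common drift-independent choice. The Granger test is shown as the standard F-statistic comparing a regression of y on its own lags against one that also includes lags of x. The function names `yang_zhang_vol` and `granger_f_stat` and the `lags` parameter are hypothetical, and the clustering and neural network stages are omitted.

```python
import numpy as np

def yang_zhang_vol(o, h, l, c):
    """Per-period Yang-Zhang volatility from open/high/low/close arrays
    (assumed estimator; needs at least 3 observations)."""
    o, h, l, c = (np.asarray(a, dtype=float) for a in (o, h, l, c))
    overnight = np.log(o[1:] / c[:-1])    # previous close to open
    open_close = np.log(c[1:] / o[1:])    # open to close
    u = np.log(h[1:] / o[1:])             # open to high
    d = np.log(l[1:] / o[1:])             # open to low
    # Rogers-Satchell component, unaffected by drift
    rs = np.mean(u * (u - open_close) + d * (d - open_close))
    n = overnight.size
    k = 0.34 / (1.34 + (n + 1) / (n - 1))
    var = overnight.var(ddof=1) + k * open_close.var(ddof=1) + (1 - k) * rs
    return float(np.sqrt(max(var, 0.0)))

def _rss(design, target):
    """Residual sum of squares of an ordinary least-squares fit."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    return float(resid @ resid)

def granger_f_stat(x, y, lags=2):
    """F-statistic for H0: lags of x do not help predict y beyond y's own lags."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = y.size
    target = y[lags:]
    # Column i holds the series shifted back by i periods
    y_lags = np.column_stack([y[lags - i:n - i] for i in range(1, lags + 1)])
    x_lags = np.column_stack([x[lags - i:n - i] for i in range(1, lags + 1)])
    const = np.ones((target.size, 1))
    restricted = np.hstack([const, y_lags])
    unrestricted = np.hstack([const, y_lags, x_lags])
    rss_r = _rss(restricted, target)
    rss_u = _rss(unrestricted, target)
    dof = target.size - unrestricted.shape[1]
    return ((rss_r - rss_u) / lags) / (rss_u / dof)
```

In a pipeline of this shape, `yang_zhang_vol` would be evaluated per asset over rolling windows to build the volatility profiles that are clustered, and `granger_f_stat` would be evaluated in both directions for each candidate pair inside a cluster. A large statistic in one direction only suggests a predictive lead-lag relation, which is a weaker notion than structural causality.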
Results: Empirical evaluation shows that the HVSA framework achieves superior risk-adjusted returns compared to traditional benchmarks. The strategy, whose predictive power rests on verified causal volatility dependencies, achieved an annualized Sharpe ratio of 2.31 and a maximum drawdown of 4.2% during the forwardtesting period. The Deep Neural Network, continuously retrained via an adaptive rolling-window scheme, proved highly effective at capturing the non-linear patterns of mean reversion, a task where simpler linear models often fail.
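For readers unfamiliar with the two headline metrics, the following sketch shows one conventional way to compute them from a series of periodic strategy returns. The abstract does not state the return frequency or risk-free rate used, so daily returns, 252 trading periods per year, and a zero risk-free rate are assumptions here, and the function names are hypothetical.

```python
import numpy as np

def annualized_sharpe(returns, periods_per_year=252, risk_free_per_period=0.0):
    """Mean excess return over its sample standard deviation, scaled to a year.
    The 252-period default and zero risk-free rate are assumptions."""
    excess = np.asarray(returns, dtype=float) - risk_free_per_period
    return float(np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1))

def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve,
    returned as a positive fraction (0.042 would correspond to 4.2%)."""
    equity = np.cumprod(1.0 + np.asarray(returns, dtype=float))
    equity = np.concatenate([[1.0], equity])       # start from initial capital
    running_peak = np.maximum.accumulate(equity)   # highest value seen so far
    return float(np.max(1.0 - equity / running_peak))
```

Both functions expect simple (not log) returns net of transaction costs; feeding gross returns would overstate the Sharpe ratio and understate the drawdown.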
Conclusion: The HVSA framework provides evidence that integrating statistically rigorous volatility analysis with adaptively trained machine learning classification is crucial for developing robust statistical arbitrage strategies. This hybrid methodology improves the predictability of stock market trends and offers a viable pathway for generating alpha in contemporary financial markets.
Keywords
Statistical Arbitrage, Volatility, Machine Learning, Deep Neural Networks, Time-Series Econometrics, Granger Causality, Forwardtesting
Copyright License
Copyright (c) 2025 Piotr S. Volkov, Aisha N. Mensah

This work is licensed under a Creative Commons Attribution 4.0 International License.