Articles | Open Access | https://doi.org/10.55640/ijdsml-05-02-17

Predictive Modeling for Budget Overruns in Large-Scale Infrastructure Projects: Leveraging Historical Data for Proactive Cost Control

Aishwarya Korde, Project Manager, OSP Financial Controls & Forecasting, Fastbridge Fiber, Wyomissing, PA, USA

Abstract

Cost overruns remain one of the most pressing challenges in large-scale infrastructure projects. Telecom fiber rollouts, transportation systems, and energy networks often experience escalating budgets that reduce profitability, delay schedules, and undermine stakeholder confidence. Traditional forecasting methods—such as expert judgment and deterministic models—tend to be reactive and rarely anticipate risks early enough for corrective action. This research introduces a predictive analytics framework that leverages historical project data to forecast potential budget overruns and provide early-warning signals for financial decision-makers.

Budgetary performance is shaped by technical, organizational, and external factors including scope changes, terrain complexity, vendor delays, and shocks such as weather or regulatory constraints. Conventional cost-control systems often fail to account for the nonlinear interactions among these variables. By applying machine learning–based predictive modeling, this study seeks to uncover hidden patterns in historical datasets and enhance the precision of overrun forecasts.

Methodologically, the research compares traditional regression and time-series approaches with machine learning algorithms such as Random Forest, Gradient Boosting, and Support Vector Machines. The anticipated result is that machine learning models reduce forecasting error by 15–25% relative to the traditional methods, and that classification metrics computed on their predictions help identify projects at risk of escalation.
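To make the two kinds of evaluation mentioned here concrete, the following is a minimal Python sketch of how a relative reduction in forecasting error and basic classification metrics could be computed. The abstract does not name the error measure used, so mean absolute percentage error (MAPE) is an assumption. The function names and all cost figures and risk flags below are hypothetical values chosen only to show the arithmetic; they are not data or results from the study.

```python
from statistics import mean


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (assumed error measure)."""
    return 100 * mean(abs(a - f) / abs(a) for a, f in zip(actual, forecast))


def relative_reduction(baseline_error, model_error):
    """Percentage by which model_error is lower than baseline_error."""
    return 100 * (baseline_error - model_error) / baseline_error


def precision_recall(actual_flags, predicted_flags):
    """Precision and recall for a binary 'at risk of overrun' flag."""
    pairs = list(zip(actual_flags, predicted_flags))
    tp = sum(1 for a, p in pairs if a and p)        # correctly flagged
    fp = sum(1 for a, p in pairs if not a and p)    # flagged, no overrun
    fn = sum(1 for a, p in pairs if a and not p)    # missed overrun
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical final costs (in $M) for five completed projects.
actual_cost = [100.0, 200.0, 150.0, 80.0, 120.0]
# Hypothetical forecasts from a traditional baseline and from an ML model.
baseline_forecast = [90.0, 170.0, 165.0, 72.0, 132.0]
ml_forecast = [92.0, 176.0, 162.0, 74.0, 129.0]

baseline_mape = mape(actual_cost, baseline_forecast)
ml_mape = mape(actual_cost, ml_forecast)
reduction = relative_reduction(baseline_mape, ml_mape)

# Hypothetical overrun outcomes and the model's at-risk predictions.
actual_overrun = [True, True, False, False, True]
predicted_overrun = [True, False, False, True, True]
precision, recall = precision_recall(actual_overrun, predicted_overrun)

print(f"baseline MAPE: {baseline_mape:.1f}%")
print(f"ML MAPE: {ml_mape:.1f}%")
print(f"relative error reduction: {reduction:.1f}%")
print(f"precision: {precision:.2f}, recall: {recall:.2f}")
```

In practice the forecasts would come from fitted models evaluated on held-out projects; the sketch covers only the comparison step, so that a figure such as "15–25% lower error" can be read as a relative change in a stated error measure.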

The contributions are threefold: (1) identifying cost drivers most strongly associated with overruns; (2) demonstrating the relative performance gains of machine learning compared with traditional approaches; and (3) outlining a practical framework for embedding predictive outputs into business intelligence dashboards used in financial planning and analysis. By focusing on infrastructure finance, particularly telecom rollouts where terrain-driven costs create high uncertainty, the study emphasizes how predictive analytics can strengthen financial governance, mitigate overruns, and support more reliable decision-making in capital-intensive projects.

Keywords

Cost overruns, predictive analytics, machine learning (ML), infrastructure finance, telecom, business intelligence, financial planning

How to Cite

Predictive Modeling for Budget Overruns in Large-Scale Infrastructure Projects: Leveraging Historical Data for Proactive Cost Control. (2025). International Journal of Data Science and Machine Learning, 5(02), 193-210. https://doi.org/10.55640/ijdsml-05-02-17