Predicting IPO Listing Day Returns and Profitability Using Linear Regression, Random Forest, and XGBoost Models
Abstract
IPO listing day outcomes remain highly unpredictable, exposing investors to substantial risk even amid strong subscription demand and prevailing market optimism. This study develops a rigorous, data-driven machine learning framework to forecast IPO listing day performance and classify profitability outcomes, thereby reducing the investment community’s reliance on sentiment-based decision-making.
Methods: The study analysed over 800 IPO records spanning 2010–2025 through a multi-phase machine learning pipeline. After data preprocessing and feature engineering, key predictors were constructed from subscription metrics by investor category (QIB, HNI, RII), issue characteristics, and market momentum indicators. Linear Regression was applied to predict continuous listing day returns, while ensemble methods (Random Forest and XGBoost) were employed for binary profit/loss classification. Model performance was evaluated using R² and RMSE for the regression task and accuracy and ROC-AUC for the classification task, with cross-validation; feature importance was assessed through regression coefficients and tree-based importance rankings.
Results: Linear Regression explained a meaningful share of listing day return variability (R² ≈ 0.38), while Random Forest and XGBoost delivered strong classification performance, each surpassing 80% accuracy with robust ROC-AUC scores. Market momentum, institutional subscription levels, and issue size consistently emerged as the most influential predictors across both prediction tasks.
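To make the evaluation metrics named in the Methods concrete, the following is a minimal sketch of how R², RMSE, accuracy, and ROC-AUC can be computed from predictions. It is an illustration only, not the study's code: the function names, the 0.5 classification threshold, and the six toy values are assumptions introduced here, and the numbers are not drawn from the IPO dataset. Cross-validation is not covered.

```python
import math

def r_squared(y_true, y_pred):
    """Share of variance in y_true explained by y_pred."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the returns."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))

def accuracy(labels, scores, threshold=0.5):
    """Fraction of correct profit/loss calls at an assumed threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == l for p, l in zip(preds, labels)) / len(labels)

def roc_auc(labels, scores):
    """Probability that a random profitable IPO is scored above a random
    loss-making one (ties count as half); equals the area under the ROC curve."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy values (hypothetical): listing day returns in percent.
actual_returns = [12.0, -5.0, 30.0, 4.0, -10.0, 18.0]
predicted_returns = [8.0, -2.0, 20.0, 6.0, -4.0, 14.0]

# Binary target: 1 if the IPO listed at a gain, 0 otherwise.
labels = [1 if r > 0 else 0 for r in actual_returns]
# Hypothetical classifier probabilities of a profitable listing.
scores = [0.8, 0.5, 0.9, 0.45, 0.2, 0.7]

print("R^2:", round(r_squared(actual_returns, predicted_returns), 3))
print("RMSE:", round(rmse(actual_returns, predicted_returns), 3))
print("Accuracy:", round(accuracy(labels, scores), 3))
print("ROC-AUC:", round(roc_auc(labels, scores), 3))
```

R² and RMSE apply to the continuous return forecasts, while accuracy and ROC-AUC apply to the profit/loss classifier. ROC-AUC does not depend on the chosen threshold, which is one reason it is usually reported alongside accuracy.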