Student: Edoardo De Besi
Abstract
This thesis addresses a gap in the literature on predictive analytics for used car prices: existing research predominantly uses machine learning to produce point predictions of prices and provides no measure of the uncertainty associated with them. The objective is to calculate prediction intervals using both conformal quantile regression and frequentist quantile regression with a Light Gradient Boosting Machine (LightGBM) model trained on a comprehensive dataset of used car listings collected in May 2021 from the United States marketplace. The thesis empirically compares the two methodologies at various nominal coverage probabilities. The study reveals a trade-off that decision-makers must consider between accuracy (attaining the nominal coverage) and precision (the width of the prediction intervals). Conformal predictions uniquely offer a guarantee of the nominal coverage level, at the expense of wider prediction intervals. The choice of method therefore depends on the target nominal coverage probability. As the nominal coverage probability increases, the median interval width of conformal quantile regression increases more than proportionally compared to that of frequentist quantile regression. The coverage guarantee thus becomes more costly in terms of width as the nominal coverage probability rises, making conformal quantile regression more advantageous at lower nominal coverage probabilities.
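To illustrate where the extra width of conformal intervals comes from, the sketch below shows the calibration step of split-conformal quantile regression in its commonly described form. It is a minimal illustration, not the thesis's implementation: the function names `cqr_correction` and `conformalize` are hypothetical, the numbers are made-up toy values, and the lower and upper quantile predictions are assumed to come from an already fitted quantile model (for example, two LightGBM models trained with a quantile objective) evaluated on a held-out calibration set.

```python
import math


def cqr_correction(y_cal, lo_cal, hi_cal, alpha):
    """Return the split-conformal correction for quantile intervals.

    y_cal  : observed prices on the calibration set
    lo_cal : predicted lower quantiles on the calibration set
    hi_cal : predicted upper quantiles on the calibration set
    alpha  : miscoverage level (nominal coverage is 1 - alpha)
    """
    # Conformity score: how far each observation falls outside its interval
    # (negative when the observation lies strictly inside the interval).
    scores = sorted(
        max(lo - y, y - hi) for y, lo, hi in zip(y_cal, lo_cal, hi_cal)
    )
    n = len(scores)
    # Finite-sample rank: the ceil((n + 1)(1 - alpha))-th smallest score.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this coverage level:
        # only an unbounded interval carries the guarantee.
        return math.inf
    return scores[k - 1]


def conformalize(lo, hi, q):
    """Widen (or shrink, if q < 0) each interval by q on both sides."""
    return [(l - q, h + q) for l, h in zip(lo, hi)]


# Toy calibration data (hypothetical prices in thousands of dollars).
y_cal = [10.0, 12.0, 9.0, 15.0, 11.0]
lo_cal = [9.0, 10.0, 9.5, 12.0, 10.0]
hi_cal = [11.0, 13.0, 11.0, 14.0, 12.0]

q = cqr_correction(y_cal, lo_cal, hi_cal, alpha=0.2)
intervals = conformalize([20.0], [25.0], q)
```

Because the correction is taken from the upper tail of the calibration scores, a higher nominal coverage selects a more extreme score, which is consistent with the widening of conformal intervals reported in the abstract.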
Master's Final Work