Student: Cheng Chen
Abstract
Applying data science techniques in the insurance industry has become increasingly popular
in recent years, especially in the pricing of non-life insurance. Machine Learning (ML)
models have demonstrated better accuracy than traditional Generalized Linear Models (GLMs).
In terms of interpretability, however, a GLM is inherently easier to explain than an ML
model: its beta coefficients provide straightforward indicators of how important a factor
is and how it influences the predictions. As a result, finding ways to better interpret ML
models has become essential for companies that rely solely on ML models, or on ensemble
models that combine ML and GLM results. SHapley Additive exPlanations (SHAP), a
model-agnostic method that applies the Shapley value to machine learning, can help us
better understand the mechanisms behind ML's "black box". We expect SHAP to play a role in
ML similar to that of the beta coefficients in a GLM. This project focuses primarily on
using SHAP to analyze the ML model, rather than on the model itself, aiming to make
companies more confident when using less interpretable models. In addition to SHAP, we
also applied another method, the Partial Dependence Plot (PDP), to validate the
conclusions drawn from SHAP.
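The two techniques named above can be sketched concretely. For a small number of features, the Shapley attribution that SHAP builds on can be computed exactly by enumerating feature coalitions, and a PDP is available directly from scikit-learn. The sketch below uses a synthetic dataset whose feature names are purely illustrative stand-ins for pricing factors; a real application would use the `shap` library on the actual pricing model rather than this hand-rolled enumeration.

```python
from itertools import combinations
from math import factorial

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import partial_dependence

rng = np.random.default_rng(0)

# Synthetic stand-in for pricing factors (hypothetical names):
# column 0 ~ "driver age", 1 ~ "vehicle power", 2 ~ "bonus-malus".
n, d = 500, 3
X = rng.normal(size=(n, d))
y = 2.0 * X[:, 0] - X[:, 1] + 0.5 * X[:, 0] * X[:, 2] + rng.normal(scale=0.1, size=n)

model = GradientBoostingRegressor(random_state=0).fit(X, y)

def shapley_values(predict, x, background):
    """Exact interventional Shapley values for one instance x.

    v(S) is the mean prediction when the features in S are fixed to
    their values in x and the rest keep the background sample's values.
    """
    d = len(x)
    phi = np.zeros(d)

    def v(S):
        Xb = background.copy()
        if S:
            Xb[:, list(S)] = x[list(S)]
        return predict(Xb).mean()

    for j in range(d):
        others = [k for k in range(d) if k != j]
        for size in range(d):
            for S in combinations(others, size):
                # Classic Shapley weight |S|! (d - |S| - 1)! / d!
                w = factorial(size) * factorial(d - size - 1) / factorial(d)
                phi[j] += w * (v(S + (j,)) - v(S))
    return phi

x0 = X[0]
bg = X[rng.choice(n, size=100, replace=False)]
phi = shapley_values(model.predict, x0, bg)

# Efficiency property: the attributions sum to f(x0) minus the base
# value, mirroring how SHAP decomposes a single predicted premium.
base = model.predict(bg).mean()
print("phi =", phi, "| sum =", phi.sum(),
      "| f(x0) - base =", model.predict(x0[None])[0] - base)

# PDP for the first feature, the cross-check used alongside SHAP.
pdp = partial_dependence(model, X, features=[0], kind="average")
print("PDP grid mean:", pdp["average"].mean())
```

The efficiency check at the end is the property that makes SHAP comparable to GLM beta coefficients: each prediction is additively split into per-factor contributions around a base value.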
Master's Final Project