« A new classification method to target prevention actions » by Romain GAUCHON

Romain Gauchon, currently a CIFRE PhD student in a partnership between ACTUARIS and the SAF laboratory at ISFA, has just defended his actuarial thesis, which addresses one of the central issues of the Chair Prevent’Horizon: « A new classification method to target prevention actions ».
The jury was unanimous in recognizing his talents as an actuarial researcher and was impressed by the innovation he has developed to improve patient / insured targeting techniques.

The work was co-supervised by Stéphane LOISEL and Jean-Louis RULLIERE for ISFA and by Alexandra BARRAL, Céline BLATTNER and Cécile PARADIS for Actuaris.

A prevention plan cannot be effective without being targeted. Because medical data are highly sensitive, insurance companies hold only two types of data: very general information used to compute the actuarial risk (age, sex), and the nature of the reimbursements paid. Targeting a prevention plan is therefore a major challenge for an insurance company.
With such limited data, the usual descriptive statistics are generally not sufficient to target a tertiary prevention plan. Moreover, most data science techniques are supervised, which makes them inappropriate here.

This context led us to develop a new clustering method. It has to be unsupervised, complementary to the usual descriptive statistics, and easy to interpret in order to fit the practical constraints of prevention.
The method developed in this actuarial thesis has three steps. First, we convert a typical insurance data set into a new matrix. Each row of this matrix describes the consumption of one policyholder (in this thesis, we used the consumption frequency).
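As an illustration of this first step, here is a minimal sketch of how claim lines could be turned into a policyholder-by-category frequency matrix. The record layout, the policyholder identifiers and the care categories are hypothetical; the thesis does not specify them here.

```python
from collections import Counter

# Hypothetical claim lines: (policyholder_id, care_category).
# Real insurance data would carry more fields (date, amount, ...).
claims = [
    ("P1", "optical"),
    ("P1", "pharmacy"),
    ("P1", "pharmacy"),
    ("P2", "dental"),
    ("P3", "pharmacy"),
    ("P3", "optical"),
    ("P3", "optical"),
]


def frequency_matrix(claims):
    """Build one row per policyholder and one column per care category.

    Each cell holds the number of claim lines of that category
    for that policyholder (the consumption frequency).
    """
    policyholders = sorted({pid for pid, _ in claims})
    categories = sorted({cat for _, cat in claims})
    counts = Counter(claims)  # counts each (policyholder, category) pair
    matrix = [
        [counts[(pid, cat)] for cat in categories]
        for pid in policyholders
    ]
    return policyholders, categories, matrix


policyholders, categories, matrix = frequency_matrix(claims)
for pid, row in zip(policyholders, matrix):
    print(pid, dict(zip(categories, row)))
```

All entries are non-negative counts, which is what makes a non-negative factorisation applicable afterwards.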

Second, we reduce the dimension of the new problem using NMF (Non-negative Matrix Factorisation) algorithms. This step helps to interpret the final clusters and dramatically improves the quality of the results.
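To make the dimension reduction concrete, the sketch below factorises a small frequency matrix V into two non-negative matrices W and H, so that V is approximately W times H. It uses the classical multiplicative update rules for the Frobenius norm, which is an assumption: the text does not say which NMF variant was used. The toy matrix, the rank and the iteration count are also illustrative.

```python
import numpy as np


def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Approximate V (n x m, non-negative) by W (n x rank) @ H (rank x m).

    Multiplicative updates keep W and H non-negative at every
    iteration; eps avoids division by zero.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy frequency matrix: 4 policyholders x 3 care categories (hypothetical).
V = np.array(
    [[0, 1, 2],
     [1, 0, 0],
     [0, 2, 1],
     [2, 0, 0]],
    dtype=float,
)

W, H = nmf(V, rank=2)
print("reduced policyholder profiles (W):")
print(W.round(2))
print("consumption patterns (H):")
print(H.round(2))
```

Each row of H can be read as a consumption pattern over the care categories, and each row of W as the weight of those patterns for one policyholder. This additive, non-negative reading is what tends to make the later clusters easier to interpret.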

Finally, we cluster the policyholders using a Kohonen self-organising map. We usually obtain between fifteen and twenty clusters. Kohonen maps offer a natural visualisation of the results, so it is easy to compare the results obtained on two samples of policyholders. In this way, the method can be seen as a new descriptive statistics tool. Thanks to the dimension reduction, the clusters are also easy to interpret. Hence, this process can be used to target a tertiary prevention plan, for example on falls among seniors. The statistical description of the clusters can also lead to a better understanding of a class of risk and help to target a primary prevention plan.
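The clustering step can be sketched with a small self-organising map written directly in NumPy. This is an illustrative implementation under stated assumptions, not the one used in the thesis: the 4 x 4 grid (16 units, chosen to fall in the fifteen-to-twenty range mentioned above), the linear decay of the learning rate and neighbourhood radius, and the random input data are all hypothetical.

```python
import numpy as np


def train_som(X, grid=(4, 4), n_iter=500, lr0=0.5, sigma0=1.5, seed=0):
    """Train a rectangular self-organising map on the rows of X."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    # Position of every unit on the 2-D map, shape (rows * cols, 2).
    coords = np.array(
        [(r, c) for r in range(rows) for c in range(cols)], dtype=float
    )
    weights = rng.random((rows * cols, X.shape[1]))
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 1e-3     # decaying neighbourhood radius
        x = X[rng.integers(len(X))]              # draw one sample at random
        # Best matching unit: the unit whose weight vector is closest to x.
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
        # Gaussian neighbourhood measured on the map grid, not in data space.
        grid_dist2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
        h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights


def assign_clusters(X, weights):
    """Return, for each row of X, the index of its closest map unit."""
    d2 = ((X[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


# Hypothetical reduced profiles, standing in for the NMF output.
rng = np.random.default_rng(1)
X = rng.random((100, 2))

weights = train_som(X)
labels = assign_clusters(X, weights)
print(np.bincount(labels, minlength=len(weights)).reshape(4, 4))
```

The printed 4 x 4 table gives the number of policyholders per map unit. Because neighbouring units hold similar profiles, two samples can be compared by placing their tables side by side.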

We applied this method to two datasets for this actuarial thesis: a standard individual portfolio and a group portfolio with top-of-the-range guarantees.
The thesis describes the adopted process, the main results and all the validation tests carried out to analyse the stability and the quality of the method.