ANNs for Regression

##### Objectives

We currently have two objectives in using Machine Learning. The first is to improve the multi-dimensional regression methods applied to the American Monte-Carlo algorithm for XVA exposures. This also serves as a learning and practice exercise before moving on to the second objective: speeding up the calibration of pricing models.

We use Python’s numerous free resources to build prototypes testing various methodologies. Once a good methodology is identified, we implement it in C# for full usage in SDev.

To run our Jupyter notebooks, we recommend working within Anaconda. The required packages are NumPy, Matplotlib, Scikit-Learn, Keras and pyGAM.

##### Exposure Regression for XVAs

The purpose of this project is to address the issues that arise when regressing on non-polynomial payoffs. Typical global polynomial regressions sometimes fail to accurately fit non-polynomial payoffs, which occur often in the context of XVA exposures: call/put options and their various combinations such as digital coupons, butterflies, risk reversals, etc.
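
To make the issue concrete, here is a minimal sketch that fits a global cubic polynomial to a noise-free call payoff. The strike, the sampling range, the polynomial degree and all variable names are assumptions chosen for illustration; they are not taken from our notebook.

```python
import numpy as np

# Hypothetical setup: underlying values x and a call payoff with strike 1.0.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 2.0, 2000)
strike = 1.0
payoff = np.maximum(x - strike, 0.0)

# Global polynomial regression of degree 3, fitted by least squares.
coeffs = np.polyfit(x, payoff, deg=3)
fitted = np.polyval(coeffs, x)

# Measure how far the smooth polynomial is from the kinked payoff.
abs_err = np.abs(fitted - payoff)
max_err = abs_err.max()
min_fitted = fitted.min()
print(f"max abs error: {max_err:.4f}, min fitted value: {min_fitted:.4f}")
```

In this setup the largest error sits near the kink at the strike, and the polynomial dips below zero in the region where the true payoff is flat, which means a negative fitted value for a payoff that is never negative.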

In 1 dimension, a rather efficient solution is to use local regressions such as LOESS, which we have found to fit such payoffs accurately without significant loss of speed.
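
As an illustration of the idea, the sketch below implements a basic LOESS variant: a local linear fit with tricube weights over the nearest samples. It is a simplified stand-in and not the implementation we use. The function name `loess_1d`, the `frac` parameter and the test data are assumptions.

```python
import numpy as np

def loess_1d(x, y, x_eval, frac=0.2):
    """Local linear regression with tricube weights (a basic LOESS variant)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    k = max(2, int(np.ceil(frac * len(x))))  # samples per local window
    out = np.empty(len(x_eval))
    for i, x0 in enumerate(x_eval):
        dist = np.abs(x - x0)
        idx = np.argpartition(dist, k - 1)[:k]  # indices of the k nearest samples
        h = max(dist[idx].max(), 1e-12)         # local bandwidth
        w = (1.0 - (dist[idx] / h) ** 3) ** 3   # tricube kernel weights
        # Weighted least squares for a local line a + b * (x - x0).
        sw = np.sqrt(w)
        A = np.column_stack([np.ones(k), x[idx] - x0]) * sw[:, None]
        beta, *_ = np.linalg.lstsq(A, y[idx] * sw, rcond=None)
        out[i] = beta[0]  # the intercept is the fitted value at x0
    return out

# Hypothetical test data: a noisy call payoff with strike 1.0.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 2.0, 1000)
y = np.maximum(x - 1.0, 0.0) + rng.normal(0.0, 0.05, x.size)

grid = np.linspace(0.1, 1.9, 50)
loess_fit = loess_1d(x, y, grid, frac=0.1)
loess_max_err = np.abs(loess_fit - np.maximum(grid - 1.0, 0.0)).max()
print(f"max abs error on grid: {loess_max_err:.4f}")
```

Because each evaluation point only sees its own neighbourhood, the kink at the strike affects the fit locally instead of distorting it over the whole range.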

In multiple dimensions, the problem becomes harder. Kd-trees coupled with local regression may provide an answer, but they may not scale well to the high dimensions needed for XVAs (netting-set-level regression or high-dimensional payoffs).
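
One way to picture the kd-tree approach is sketched below: a kd-tree finds the nearest samples of each evaluation point, and a weighted linear fit is run on those neighbours. This is an assumption-laden illustration, not our implementation. It uses SciPy's `cKDTree` (SciPy is not in the package list above, but it is a dependency of Scikit-Learn and ships with Anaconda), and the function name `knn_local_linear` and the 2-dimensional basket payoff are hypothetical.

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_local_linear(X, y, X_eval, k=50):
    """Local linear regression on the k nearest neighbours found by a kd-tree."""
    tree = cKDTree(X)
    dist, idx = tree.query(X_eval, k=k)  # both arrays have shape (n_eval, k)
    out = np.empty(len(X_eval))
    for i in range(len(X_eval)):
        h = max(dist[i].max(), 1e-12)          # local bandwidth
        w = (1.0 - (dist[i] / h) ** 3) ** 3    # tricube kernel weights
        sw = np.sqrt(w)
        Z = X[idx[i]] - X_eval[i]              # neighbours centred on the query point
        A = np.column_stack([np.ones(k), Z]) * sw[:, None]
        beta, *_ = np.linalg.lstsq(A, y[idx[i]] * sw, rcond=None)
        out[i] = beta[0]  # the intercept is the fitted value at the query point
    return out

# Hypothetical example: noisy call on the average of two underlyings.
rng = np.random.default_rng(2)
X = rng.uniform(0.0, 2.0, size=(5000, 2))
y = np.maximum(X.mean(axis=1) - 1.0, 0.0) + rng.normal(0.0, 0.05, 5000)

X_eval = np.array([[0.5, 0.5], [1.0, 1.0], [1.5, 1.5]])
fit = knn_local_linear(X, y, X_eval, k=200)
print(fit)  # true values are 0.0, 0.0 and 0.5
```

The scaling concern is visible in the neighbour search: as the dimension grows, a fixed number of samples covers the space ever more sparsely, so the nearest neighbours are no longer local and kd-tree queries lose their speed advantage.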

We are thus looking for regression methods that can accommodate non-polynomial payoffs accurately, scale well with dimension, and are fast at runtime. This is where our interest in Machine Learning, and in particular Neural Networks, comes from.

Current progress

Our notebook, available for download above, generates 1-dimensional samples for various payoffs that can cause difficulties for global polynomial methods.
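
For readers without the notebook at hand, the sketch below shows what such a sample generator might look like. The payoff definitions, strikes, noise level and the `make_samples` helper are all hypothetical and do not reproduce the notebook's actual code.

```python
import numpy as np

def call(x, k):
    return np.maximum(x - k, 0.0)

def put(x, k):
    return np.maximum(k - x, 0.0)

# Hypothetical payoff catalogue, with strikes chosen for illustration only.
payoffs = {
    "call": lambda x: call(x, 1.0),
    "digital": lambda x: (x > 1.0).astype(float),
    "butterfly": lambda x: call(x, 0.8) - 2.0 * call(x, 1.0) + call(x, 1.2),
    "risk_reversal": lambda x: call(x, 1.2) - put(x, 0.8),
}

def make_samples(name, n=1000, noise=0.05, seed=0):
    """Draw n underlying values in [0, 2] and return noisy payoff samples."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 2.0, n)
    y = payoffs[name](x) + rng.normal(0.0, noise, n)
    return x, y

x, y = make_samples("butterfly")
print(x.shape, y.shape)
```

Each of these payoffs has at least one kink or jump, which is the feature that a single global polynomial struggles to reproduce.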

We then use packages available in Python to test multiple regression methods: K-nearest neighbours, Support Vector Machines, Decision Trees, Random Forests, and finally Neural Networks (MLPs).
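
A comparison of this kind can be sketched with Scikit-Learn alone, as below. The data, the hyper-parameter values and the error metric are assumptions picked for illustration, not the settings or results of our notebook. Scikit-Learn's `MLPRegressor` is used here as a stand-in for a Keras MLP.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor, RandomForestRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Hypothetical training data: a noisy call payoff with strike 1.0.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 2.0, 2000)
y = np.maximum(x - 1.0, 0.0) + rng.normal(0.0, 0.05, x.size)
X = x.reshape(-1, 1)  # scikit-learn expects a 2-D feature array

# Noise-free payoff on a regular grid, used as the reference.
grid = np.linspace(0.0, 2.0, 201).reshape(-1, 1)
truth = np.maximum(grid.ravel() - 1.0, 0.0)

# Untuned, illustrative hyper-parameters.
models = {
    "k-NN": KNeighborsRegressor(n_neighbors=50),
    "SVR": SVR(C=1.0, epsilon=0.01),
    "Decision tree": DecisionTreeRegressor(min_samples_leaf=50, random_state=0),
    "Random forest": RandomForestRegressor(
        n_estimators=100, min_samples_leaf=20, random_state=0),
    "Extra-Trees": ExtraTreesRegressor(
        n_estimators=100, min_samples_leaf=20, random_state=0),
    "MLP": MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0),
}

# Fit each model and record its root-mean-square error against the true payoff.
rmse = {}
for name, model in models.items():
    model.fit(X, y)
    rmse[name] = float(np.sqrt(np.mean((model.predict(grid) - truth) ** 2)))
    print(f"{name:15s} RMSE = {rmse[name]:.4f}")
```

All estimators share the same `fit`/`predict` interface, so adding further payoffs or methods to the comparison only requires extending the data generation or the `models` dictionary.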

Currently, in 1 dimension, we find several competitive methods. In particular, Random Forests with Extra-Trees and MLPs seem to give good results, with Random Forests being far easier to operate when it comes to choosing hyper-parameters.

The next steps are to tune the hyper-parameters in 1 dimension and to compare all methods over a wide range of scenarios, before turning to the multi-dimensional case.