Heterogeneous Pearson Correlation Coefficient

Introduction

The Pearson correlation coefficient is a measure of the linear correlation between two variables, X and Y, defined as

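$$\rho_{X,Y} = \frac{\operatorname{cov}(X,Y)}{\sigma_X \sigma_Y} = \frac{E[(X-\mu_X)(Y-\mu_Y)]}{\sigma_X \sigma_Y} = E\left[\frac{X-\mu_X}{\sigma_X}\cdot\frac{Y-\mu_Y}{\sigma_Y}\right]$$

Let

$$\tilde{X} = \frac{X-\mu_X}{\sigma_X}, \quad \tilde{Y} = \frac{Y-\mu_Y}{\sigma_Y}$$

then,

$$\rho_{X,Y} = E[\tilde{X}\tilde{Y}]$$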
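The rewritten form says that the coefficient is the mean of the element-wise product of the two standardized variables. The following is a minimal plaintext sketch of that identity, with no federation or encryption involved. The function names `standardize` and `pearson` are illustrative and are not part of the component's API. The population standard deviation is assumed, so that the mean of the products matches the definition exactly.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu = fmean(values)
    sigma = pstdev(values)  # a constant column has sigma == 0 and cannot be standardized
    return [(v - mu) / sigma for v in values]


def pearson(x, y):
    """Pearson coefficient as the mean of products of standardized values."""
    x_tilde = standardize(x)
    y_tilde = standardize(y)
    return fmean([a * b for a, b in zip(x_tilde, y_tilde)])


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
rho = pearson(x, y)
print(rho)
```

Because standardization only needs each party's own column, each side can standardize locally. Only the product step requires values from both parties.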

Implementation Detail

We use SPDZ, a secure multi-party computation (MPC) protocol, to calculate the heterogeneous Pearson correlation coefficient. For more details, refer to [README].
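To give a feel for how an SPDZ-style protocol multiplies two values without either party seeing the other's input, here is a toy sketch of multiplication on additive secret shares using a Beaver triple. It is an illustration under simplifying assumptions and not the implementation used here: a trusted dealer stands in for the offline phase, there are no MACs, values are plain integers modulo a prime, and the names `share`, `reconstruct` and `beaver_multiply` are hypothetical.

```python
import secrets

P = 2**61 - 1  # prime modulus of the toy field


def share(value):
    """Split a value into two additive shares modulo P."""
    r = secrets.randbelow(P)
    return r, (value - r) % P


def reconstruct(s0, s1):
    """Recombine two additive shares."""
    return (s0 + s1) % P


def beaver_multiply(x_shares, y_shares):
    """Return shares of x * y given shares of x and y."""
    # Assumption: a trusted dealer hands out shares of a random triple (a, b, a*b).
    a, b = secrets.randbelow(P), secrets.randbelow(P)
    a_sh, b_sh, c_sh = share(a), share(b), share(a * b % P)

    # Each party masks its shares locally; only the masked values d and e are opened.
    d = reconstruct(*[(x_shares[i] - a_sh[i]) % P for i in range(2)])
    e = reconstruct(*[(y_shares[i] - b_sh[i]) % P for i in range(2)])

    # Local computation of the product shares; one party adds the public term d * e.
    z = [(c_sh[i] + d * b_sh[i] + e * a_sh[i]) % P for i in range(2)]
    z[0] = (z[0] + d * e) % P
    return tuple(z)


product_shares = beaver_multiply(share(7), share(6))
product = reconstruct(*product_shares)
```

The opened values d = x - a and e = y - b reveal nothing about x and y because a and b are uniformly random. A real deployment would also need a fixed-point encoding for the standardized, non-integer feature values, which this sketch leaves out.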

Param

How to Use

params
column_indexes

-1 or a list of int. If -1 is provided, all columns are used for the calculation. If a list of int is provided, the columns with the given indexes are used.

column_names

Names of the columns used for the calculation.

Note

If both params are provided, the union of the indicated columns is used for the calculation.
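The selection rule described by these two params can be sketched as follows. The helper `select_columns`, its `header` argument, its `None` defaults and its error handling are hypothetical and only illustrate the union behaviour. They do not reflect the component's actual code.

```python
def select_columns(header, column_indexes=None, column_names=None):
    """Resolve column_indexes and column_names to the columns used for calculation."""
    if column_indexes == -1:
        # -1 selects every column, so the union is the full header.
        return list(header)

    selected = {header[i] for i in (column_indexes or [])}
    selected.update(column_names or [])

    unknown = selected - set(header)
    if unknown:
        raise ValueError(f"unknown columns: {sorted(unknown)}")

    # Return the union in header order so the result is deterministic.
    return [name for name in header if name in selected]


header = ["x0", "x1", "x2", "x3"]
chosen = select_columns(header, column_indexes=[0, 2], column_names=["x2", "x3"])
```

Here `chosen` holds `x0`, `x2` and `x3`: the column `x2` is named by both params but appears only once in the union.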

examples

An example [conf] and an example [dsl] are provided.