Federated Machine Learning


FederatedML includes implementations of many common machine learning algorithms for federated learning. All modules are developed in a decoupled, modular fashion to enhance scalability. Specifically, we provide:

  1. Federated Statistic: PSI, Union, Pearson Correlation, etc.

  2. Federated Feature Engineering: Feature Sampling, Feature Binning, Feature Selection, etc.

  3. Federated Machine Learning Algorithms: LR, GBDT, DNN, TransferLearning, which support Heterogeneous and Homogeneous styles.

  4. Model Evaluation: Binary | Multiclass | Regression Evaluation, Local vs Federated Comparison.

  5. Secure Protocol: Provides multiple security protocols for secure multi-party computing and interaction between participants.

federatedml structure

Algorithm List

Each entry below gives, in order: Algorithm, Module Name, Description, Data Input, Data Output, Model Input, Model Output. Fields that do not apply to an algorithm are omitted.

DataIO

Module Name: DataIO

Description: This component is typically the first component of a modeling task. It transforms user-uploaded data into Instance objects that the following components can use.

Data Input: DTable whose values are raw data.

Data Output: Transformed DTable whose values are data instances as defined in federatedml/feature/instance.py.
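As a rough illustration of this step, the sketch below turns raw key/value rows into instance objects. The `Instance` dataclass, the `to_instances` helper, and the comma-separated row layout with the label first are all assumptions made for this example; they are not the definitions found in federatedml/feature/instance.py.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Instance:
    """Hypothetical stand-in for a data instance: features plus an optional label."""
    features: List[float]
    label: Optional[int] = None


def to_instances(raw: Dict[str, str], with_label: bool = True) -> Dict[str, Instance]:
    """Parse raw comma-separated values into Instance objects, keyed by sample id."""
    out = {}
    for key, value in raw.items():
        fields = value.split(",")
        if with_label:
            # Assumed layout: the first field is the label, the rest are features.
            out[key] = Instance([float(v) for v in fields[1:]], int(fields[0]))
        else:
            out[key] = Instance([float(v) for v in fields])
    return out


raw_table = {"id1": "1,0.5,2.0", "id2": "0,1.5,3.0"}
instances = to_instances(raw_table)
```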

Intersect

Module Name: Intersection

Description: Computes the intersection of two parties' data sets without leaking information about the records outside the intersection. Mainly used in hetero scenario tasks.

Data Input: DTable.

Data Output: DTable whose keys occur in both parties.
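This section does not say which protocol the Intersection module uses. As one possible illustration of the idea, the sketch below uses Diffie-Hellman-style double blinding: each party raises hashed keys to its own secret exponent, and because exponentiation commutes, keys held by both sides end up with equal doubly blinded values. Both parties are simulated in one process, the function names are hypothetical, and the toy modulus is far too small for real use.

```python
import hashlib
import secrets

# Toy modulus (the Mersenne prime 2**127 - 1); an assumption, not a safe parameter.
P = 2**127 - 1


def hash_to_group(key: str) -> int:
    """Map a key to a nonzero element modulo P."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def blind(values, secret):
    """Raise every value to a party's secret exponent modulo P."""
    return [pow(v, secret, P) for v in values]


def intersect(guest_keys, host_keys):
    """Return the guest keys that also occur on the host side."""
    a = secrets.randbelow(P - 3) + 2  # guest secret exponent
    b = secrets.randbelow(P - 3) + 2  # host secret exponent

    # Each party blinds its own hashed keys and sends them to the other side.
    guest_once = blind([hash_to_group(k) for k in guest_keys], a)
    host_once = blind([hash_to_group(k) for k in host_keys], b)

    # Each party blinds the other's values again. Equal keys now match,
    # while keys outside the intersection stay hidden behind the exponents.
    guest_twice = blind(guest_once, b)     # done by the host, order preserved
    host_twice = set(blind(host_once, a))  # done by the guest

    return [k for k, v in zip(guest_keys, guest_twice) if v in host_twice]


common = intersect(["u1", "u2", "u3"], ["u3", "u4", "u1"])
```

Only the guest learns the result here; a real deployment would also need vetted group parameters and a decision on whether the host receives the intersection.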

Federated Sampling

Module Name: FederatedSample

Description: Samples data in a federated manner so that its distribution becomes balanced in each party. This module supports both federated and standalone versions.

Data Input: DTable.

Data Output: The sampled data. Both random and stratified sampling are supported.
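The difference between the two sampling modes can be sketched locally as follows. This is a minimal single-party illustration using only the standard library; the function names and the per-label fraction dictionary are assumptions, not the module's configuration interface.

```python
import random
from collections import defaultdict


def random_sample(rows, fraction, seed=0):
    """Draw round(fraction * len(rows)) rows uniformly without replacement."""
    rng = random.Random(seed)
    return rng.sample(rows, round(len(rows) * fraction))


def stratified_sample(rows, fractions, seed=0):
    """Sample each label group at its own rate; rows are (key, label) pairs."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for key, label in rows:
        groups[label].append((key, label))
    sampled = []
    for label, members in groups.items():
        k = round(len(members) * fractions[label])
        sampled.extend(rng.sample(members, k))
    return sampled


# 90 negatives and 10 positives: downsample the majority class to balance them.
data = [(f"id{i}", 0) for i in range(90)] + [(f"id{i}", 1) for i in range(90, 100)]
balanced = stratified_sample(data, {0: 10 / 90, 1: 1.0})
```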

Feature Scale

Module Name: FeatureScale

Description: Module for feature scaling and standardization.

Data Input: DTable whose values are instances.

Data Output: Transformed DTable.

Model Output: Transform factors such as min/max and mean/std.
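To show how transform factors such as min/max and mean/std can be fitted once and then reused, here is a minimal sketch for a single column. The function names and the dictionary layout of the factors are assumptions for illustration, and the population standard deviation is an arbitrary choice.

```python
from statistics import mean, pstdev


def fit_min_max(column):
    """Record the factors needed for min-max scaling."""
    return {"min": min(column), "max": max(column)}


def min_max_scale(column, factors):
    """Map values to [0, 1]; a constant column maps to 0.0."""
    span = factors["max"] - factors["min"]
    return [(x - factors["min"]) / span if span else 0.0 for x in column]


def fit_standard(column):
    """Record the factors needed for standardization."""
    return {"mean": mean(column), "std": pstdev(column)}


def standard_scale(column, factors):
    """Center on the mean and divide by the standard deviation."""
    std = factors["std"]
    return [(x - factors["mean"]) / std if std else 0.0 for x in column]


ages = [20.0, 30.0, 40.0]
scaled = min_max_scale(ages, fit_min_max(ages))
standardized = standard_scale(ages, fit_standard(ages))
```

Keeping the fitted factors separate from the scaling step lets the same factors be applied to later data.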

Hetero Feature Binning

Module Name: HeteroFeatureBinning

Description: Bins the input data, calculates each column's IV and WOE, and transforms the data according to the binning information.

Data Input: DTable with y in guest and without y in host.

Data Output: Transformed DTable.

Model Output: IV/WOE, split points, event counts, non-event counts, etc. of each column.
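The quantities listed above can be computed for one column as in the sketch below, which runs locally with labels in hand. It assumes the common definitions WOE_i = ln(event share_i / non-event share_i) and IV = sum over bins of (event share_i - non-event share_i) * WOE_i. The sign convention for WOE varies between libraries (IV is unaffected), and the function name and bin boundary rule are assumptions.

```python
import math
from bisect import bisect_left


def woe_iv(values, labels, split_points):
    """Return per-bin (events, non_events, woe) tuples and the column's IV."""
    n_bins = len(split_points) + 1
    events = [0] * n_bins
    non_events = [0] * n_bins
    for x, y in zip(values, labels):
        b = bisect_left(split_points, x)  # bin i holds x <= split_points[i]
        if y == 1:
            events[b] += 1
        else:
            non_events[b] += 1

    total_e, total_ne = sum(events), sum(non_events)
    bins, iv = [], 0.0
    for e, ne in zip(events, non_events):
        if e == 0 or ne == 0:
            # A real implementation would smooth the counts instead of failing.
            raise ValueError("empty event or non-event count in a bin")
        e_share, ne_share = e / total_e, ne / total_ne
        woe = math.log(e_share / ne_share)
        iv += (e_share - ne_share) * woe
        bins.append((e, ne, woe))
    return bins, iv


bins, iv = woe_iv([1, 2, 3, 4, 5, 6, 7, 8], [0, 0, 0, 1, 0, 1, 1, 1], [4])
```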

OneHot Encoder

Module Name: OneHotEncoder

Description: Transforms a column into one-hot format.

Data Input: Input DTable.

Data Output: Transformed DTable with new headers.

Model Output: Map from the original header and feature values to the new headers.
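A minimal sketch of the transformation and of the header map it produces is shown below. The `one_hot` helper and the `<column>_<value>` naming scheme for new headers are assumptions for illustration, not the module's actual naming.

```python
def one_hot(header, rows, column):
    """Replace `column` with one indicator column per distinct value."""
    idx = header.index(column)
    values = sorted({row[idx] for row in rows})
    # Map each (original header, feature value) pair to a new header name.
    header_map = {(column, v): f"{column}_{v}" for v in values}
    new_header = (
        header[:idx] + [header_map[(column, v)] for v in values] + header[idx + 1:]
    )
    new_rows = [
        row[:idx] + [1 if row[idx] == v else 0 for v in values] + row[idx + 1:]
        for row in rows
    ]
    return new_header, new_rows, header_map


new_header, new_rows, header_map = one_hot(
    ["age", "color"], [[30, "red"], [25, "blue"]], "color"
)
```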

Hetero Feature Selection

Module Name: HeteroFeatureSelection

Description: Provides 5 types of filters. Each filter selects columns according to the user config.

Data Input: Input DTable.

Data Output: Transformed DTable with new headers and filtered data instances.

Model Input: If IV filters are used, a hetero_binning model is needed.

Model Output: Whether each column is kept or not.

Union

Module Name: Union

Description: Combines multiple data tables into one.

Data Input: Input DTable(s).

Data Output: One DTable with the combined values from the input DTables.
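Treating each table as a key/value mapping, the combination can be sketched as follows. How duplicate keys are resolved is not stated in this section, so the keep-first policy below is purely an assumption, as is the `union` helper itself.

```python
def union(*tables):
    """Combine several key/value tables into one."""
    combined = {}
    for table in tables:
        for key, value in table.items():
            # Assumed policy: on a duplicate key, keep the first value seen.
            combined.setdefault(key, value)
    return combined


merged = union({"a": 1, "b": 2}, {"b": 9, "c": 3})
```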

Hetero-LR

Module Name: HeteroLR

Description: Builds a hetero logistic regression model across multiple parties.

Data Input: Input DTable.

Model Output: Logistic Regression model.

Local Baseline

Module Name: LocalBaseline

Description: Wrapper that runs an sklearn Logistic Regression model on local data.

Data Input: Input DTable.

Model Output: Logistic Regression model.

Hetero-LinR

Module Name: HeteroLinR

Description: Builds a hetero linear regression model across multiple parties.

Data Input: Input DTable.

Model Output: Linear Regression model.

Hetero-Poisson

Module Name: HeteroPoisson

Description: Builds a hetero Poisson regression model across multiple parties.

Data Input: Input DTable.

Model Output: Poisson Regression model.

Homo-LR

Module Name: HomoLR

Description: Builds a homo logistic regression model across multiple parties.

Data Input: Input DTable.

Model Output: Logistic Regression model.

Homo-NN

Module Name: HomoNN

Description: Builds a homo neural network model across multiple parties.

Data Input: Input DTable.

Model Output: Neural Network model.

Hetero Secure Boosting

Module Name: HeteroSecureBoost

Description: Builds a hetero secure boosting model across multiple parties.

Data Input: DTable whose values are instances.

Model Output: SecureBoost model, consisting of model-meta and model-param.

Evaluation

Module Name: Evaluation

Hetero Pearson

Module Name: HeteroPearson

Hetero-NN

Module Name: HeteroNN

Description: Builds a hetero neural network model.

Data Input: Input DTable.

Model Output: Hetero neural network model.

Homo Secure Boosting

Module Name: HomoSecureBoost

Description: Builds a homo secure boosting model across multiple parties.

Data Input: DTable whose values are instances.

Model Output: SecureBoost model, consisting of model-meta and model-param.