Federated Machine Learning¶

[中文]

FederatedML includes implementation of many common machine learning algorithms on federated learning. All modules are developed in a decoupling modular approach to enhance scalability. Specifically, we provide:

Federated Statistic: PSI, Union, Pearson Correlation, etc.
Federated Information Retrieval: PIR(SIR) Based OT
Federated Feature Engineering: Feature Sampling, Feature Binning, Feature Selection, etc.
Federated Machine Learning Algorithms: LR, GBDT, DNN, TransferLearning, UnsupervisedLearning which support Heterogeneous and Homogeneous styles, Semi-supervisedLearning which support Heterogeneous styles
Model Evaluation: Binary | Multiclass | Regression | Clustering Evaluation, Local vs Federated Comparison.
Secure Protocol: Provides multiple security protocols for secure multi-party computing and interaction between participants.

Algorithm List¶

Algorithm	Module Name	Description	Data Input	Data Output	Model Input	Model Output
DataTransform	DataTransform	This component transforms user-uploaded data into Instance object.	Table, values are raw data.	Transformed Table, values are data instance defined here		DataTransform Model
Intersect	Intersection	Compute intersect data set of multiple parties without leakage of difference set information. Mainly used in hetero scenario task.	Table.	Table with only common instance keys.		Intersect Model
Federated Sampling	FederatedSample	Federated Sampling data so that its distribution become balance in each party.This module supports standalone and federated versions.	Table	Table of sampled data; both random and stratified sampling methods are supported.
Feature Scale	FeatureScale	module for feature scaling and standardization.	Table，values are instances.	Transformed Table.	Transform factors like min/max, mean/std.
Hetero Feature Binning	HeteroFeatureBinning	With binning input data, calculates each column's iv and woe and transform data according to the binned information.	Table, values are instances.	Transformed Table.		iv/woe, split points, event count, non-event count etc. of each column.
Homo Feature Binning	HomoFeatureBinning	Calculate quantile binning through multiple parties	Table	Transformed Table		Split points of each column
OneHot Encoder	OneHotEncoder	Transfer a column into one-hot format.	Table, values are instances.	Transformed Table with new header.		Feature-name mapping between original header and new header.
Hetero Feature Selection	HeteroFeatureSelection	Provide 5 types of filters. Each filters can select columns according to user config	Table	Transformed Table with new header and filtered data instance.	If iv filters used, hetero_binning model is needed.	Whether each column is filtered.
Union	Union	Combine multiple data tables into one.	Tables.	Table with combined values from input Tables.
Hetero-LR	HeteroLR	Build hetero logistic regression model through multiple parties.	Table, values are instances	Table, values are instances.		Logistic Regression Model, consists of model-meta and model-param.
Local Baseline	LocalBaseline	Wrapper that runs sklearn(scikit-learn) Logistic Regression model with local data.	Table, values are instances.	Table, values are instances.
Hetero-LinR	HeteroLinR	Build hetero linear regression model through multiple parties.	Table, values are instances.	Table, values are instances.		Linear Regression Model, consists of model-meta and model-param.
Hetero-Poisson	HeteroPoisson	Build hetero poisson regression model through multiple parties.	Table, values are instances.	Table, values are instances.		Poisson Regression Model, consists of model-meta and model-param.
Homo-LR	HomoLR	Build homo logistic regression model through multiple parties.	Table, values are instances.	Table, values are instances.		Logistic Regression Model, consists of model-meta and model-param.
Homo-NN	HomoNN	Build homo neural network model through multiple parties.	Table, values are instances.	Table, values are instances.		Neural Network Model, consists of model-meta and model-param.
Hetero Secure Boosting	HeteroSecureBoost	Build hetero secure boosting model through multiple parties	Table, values are instances.	Table, values are instances.		SecureBoost Model, consists of model-meta and model-param.
Hetero Fast Secure Boosting	HeteroFastSecureBoost	Build hetero secure boosting model through multiple parties in layered/mix manners.	Table, values are instances.	Table, values are instances.		FastSecureBoost Model, consists of model-meta and model-param.
Evaluation	Evaluation	Output the model evaluation metrics for user.	Table(s), values are instances.
Hetero Pearson	HeteroPearson	Calculate hetero correlation of features from different parties.	Table, values are instances.
Hetero-NN	HeteroNN	Build hetero neural network model.	Table, values are instances.	Table, values are instances.		Hetero Neural Network Model, consists of model-meta and model-param.
Homo Secure Boosting	HomoSecureBoost	Build homo secure boosting model through multiple parties	Table, values are instances.	Table, values are instances.		SecureBoost Model, consists of model-meta and model-param.
Homo OneHot Encoder	HomoOneHotEncoder	Build homo onehot encoder model through multiple parties.	Table, values are instances.	Transformed Table with new header.		Feature-name mapping between original header and new header.
Hetero Data Split	HeteroDataSplit	Split one data table into 3 tables by given ratio or count	Table, values are instances.	3 Tables, values are instance.
Homo Data Split	HomoDataSplit	Split one data table into 3 tables by given ratio or count	Table, values are instances.	3 Tables, values are instance.
Column Expand	ColumnExpand	Add arbitrary number of columns with user-provided values.	Table, values are raw data.	Transformed Table with added column(s) and new header.		Column Expand Model
Secure Information Retrieval	SecureInformationRetrieval	Securely retrieves information from host through oblivious transfer	Table, values are instance	Table, values are instance
Hetero Federated Transfer Learning	FTL	Build Hetero FTL Model Between 2 party	Table, values are instance			Hetero FTL Model
Hetero KMeans	HeteroKMeans	Build Hetero KMeans model through multiple parties	Table, values are instance	Table, values are instance; Arbier outputs 2 Tables		Hetero KMeans Model
PSI	PSI	Compute PSI value of features between two table	Table, values are instance			PSI Results
Data Statistics	DataStatistics	This component will do some statistical work on the data, including statistical mean, maximum and minimum, median, etc.	Table, values are instance	Table		Statistic Result
Scorecard	Scorecard	Scale predict score to credit score by given scaling parameters	Table, values are predict score	Table, values are score results
Sample Weight	SampleWeight	Assign weight to instances according to user-specified parameters	Table, values are instance	Table, values are weighted instance		SampleWeight Model
Feldman Verifiable Sum	FeldmanVerifiableSum	This component will sum multiple privacy values without exposing data	Table, values to sum	Table, values are sum results
Feature Imputation	FeatureImputation	This component imputes missing features using arbitrary methods/values	Table, values are Instances	Table, values with missing features filled		FeatureImputation Model
Label Transform	LabelTransform	Replaces label values of input data instances and predict results	Table, values are Instances or prediction results	Table, values with transformed label values		LabelTransform Model
Hetero SSHE Logistic Regression	HeteroSSHELR	Build hetero logistic regression model without arbiter	Table, values are Instances	Table, values are Instances		SSHE LR Model
Hetero SSHE Linear Regression	HeteroSSHELinR	Build hetero linear regression model without arbiter	Table, values are Instances	Table, values are Instances		SSHE LinR Model
Positive Unlabeled Learning	PositiveUnlabeled	Build positive unlabeled learning model	Table, values are Instances	Table, values are Instances

Secure Protocol¶

Params¶

`param` ¶

Attributes¶

all = ['BoostingParam', 'ObjectiveParam', 'DecisionTreeParam', 'CrossValidationParam', 'DataSplitParam', 'DataIOParam', 'DataTransformParam', 'EncryptParam', 'EncryptedModeCalculatorParam', 'FeatureBinningParam', 'FeatureSelectionParam', 'FTLParam', 'HeteroNNParam', 'HomoNNParam', 'HomoOneHotParam', 'InitParam', 'IntersectParam', 'EncodeParam', 'RSAParam', 'LinearParam', 'LocalBaselineParam', 'LogisticParam', 'OneVsRestParam', 'PearsonParam', 'PoissonParam', 'PositiveUnlabeledParam', 'PredictParam', 'PSIParam', 'SampleParam', 'ScaleParam', 'SecureAddExampleParam', 'StochasticQuasiNewtonParam', 'StatisticsParam', 'StepwiseParam', 'UnionParam', 'ColumnExpandParam', 'KmeansParam', 'ScorecardParam', 'SecureInformationRetrievalParam', 'SampleWeightParam', 'FeldmanVerifiableSumParam', 'EvaluateParam'] `module-attribute` ¶

Classes¶

`PSIParam(max_bin_num=20, need_run=True, dense_missing_val=None, binning_error=consts.DEFAULT_RELATIVE_ERROR)` ¶

Bases: BaseParam

Source code in python/federatedml/param/psi_param.py

def __init__(self, max_bin_num=20, need_run=True, dense_missing_val=None,
             binning_error=consts.DEFAULT_RELATIVE_ERROR):
    super(PSIParam, self).__init__()
    self.max_bin_num = max_bin_num
    self.need_run = need_run
    self.dense_missing_val = dense_missing_val
    self.binning_error = binning_error

Attributes¶

max_bin_num = max_bin_num instance-attribute ¶

need_run = need_run instance-attribute ¶

dense_missing_val = dense_missing_val instance-attribute ¶

binning_error = binning_error instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/psi_param.py

def check(self):
    assert isinstance(self.max_bin_num, int) and self.max_bin_num > 0, 'max bin must be an integer larger than 0'
    assert isinstance(self.need_run, bool)

    if self.dense_missing_val is not None:
        assert isinstance(self.dense_missing_val, str) or isinstance(self.dense_missing_val, int) or \
            isinstance(self.dense_missing_val, float), \
            'missing value type {} not supported'.format(type(self.dense_missing_val))

    self.check_decimal_float(self.binning_error, "psi's param")

`HomoOneHotParam(transform_col_indexes=-1, transform_col_names=None, need_run=True, need_alignment=True)` ¶

Bases: BaseParam

Parameters:

Name	Description	Default
`transform_col_indexes`	Specify which columns need to calculated. -1 represent for all columns.	`-1`
`need_run`	Indicate if this module needed to be run	`True`
`need_alignment`	Indicated whether alignment of features is turned on	`True`

Source code in python/federatedml/param/homo_onehot_encoder_param.py

def __init__(self, transform_col_indexes=-1, transform_col_names=None, need_run=True, need_alignment=True):
    super(HomoOneHotParam, self).__init__()
    self.transform_col_indexes = transform_col_indexes
    self.transform_col_names = transform_col_names
    self.need_run = need_run
    self.need_alignment = need_alignment

Attributes¶

transform_col_indexes = transform_col_indexes instance-attribute ¶

transform_col_names = transform_col_names instance-attribute ¶

need_run = need_run instance-attribute ¶

need_alignment = need_alignment instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/homo_onehot_encoder_param.py

def check(self):
    descr = "One-hot encoder with alignment param's"
    self.check_defined_type(self.transform_col_indexes, descr, ['list', 'int'])
    self.check_boolean(self.need_run, descr)
    self.check_boolean(self.need_alignment, descr)

    self.transform_col_names = [] if self.transform_col_names is None else self.transform_col_names
    return True

`DataIOParam(input_format='dense', delimitor=',', data_type='float64', exclusive_data_type=None, tag_with_value=False, tag_value_delimitor=':', missing_fill=False, default_value=0, missing_fill_method=None, missing_impute=None, outlier_replace=False, outlier_replace_method=None, outlier_impute=None, outlier_replace_value=0, with_label=False, label_name='y', label_type='int', output_format='dense', need_run=True)` ¶

Bases: BaseParam

Define dataio parameters that used in federated ml.

Parameters:

Name	Type	Description	Default
`input_format`		please have a look at this tutorial at "DataIO" section of federatedml/util/README.md. Formally, dense input format data should be set to "dense", svm-light input format data should be set to "sparse", tag or tag:value input format data should be set to "tag".	`'dense'`
`delimitor`	`str`	the delimitor of data input, default: ','	`','`
`data_type`		the data type of data input	`'float64'`
`exclusive_data_type`	`dict`	the key of dict is col_name, the value is data_type, use to specified special data type of some features.	`None`
`tag_with_value`		use if input_format is 'tag', if tag_with_value is True, input column data format should be tag[delimitor]value, otherwise is tag only	`False`
`tag_value_delimitor`		use if input_format is 'tag' and 'tag_with_value' is True, delimitor of tag[delimitor]value column value.	`':'`
`missing_fill`	`bool`	need to fill missing value or not, accepted only True/False, default: False	`False`
`default_value`	`None or object or list`	the value to replace missing value. if None, it will use default value define in federatedml/feature/imputer.py, if single object, will fill missing value with this object, if list, it's length should be the sample of input data' feature dimension, means that if some column happens to have missing values, it will replace it the value by element in the identical position of this list.	`0`
`missing_fill_method`		the method to replace missing value	`None`
`missing_impute`		element of list can be any type, or auto generated if value is None, define which values to be consider as missing	`None`
`outlier_replace`		need to replace outlier value or not, accepted only True/False, default: True	`False`
`outlier_replace_method`		the method to replace missing value	`None`
`outlier_impute`		element of list can be any type, which values should be regard as missing value, default: None	`None`
`outlier_replace_value`	`None or object or list`	the value to replace outlier. if None, it will use default value define in federatedml/feature/imputer.py, if single object, will replace outlier with this object, if list, it's length should be the sample of input data' feature dimension, means that if some column happens to have outliers, it will replace it the value by element in the identical position of this list.	`0`
`with_label`	`bool`	True if input data consist of label, False otherwise. default: 'false'	`False`
`label_name`	`str`	column_name of the column where label locates, only use in dense-inputformat. default: 'y'	`'y'`
`label_type`		use when with_label is True.	`'int'`
`output_format`		output format	`'dense'`

Source code in python/federatedml/param/dataio_param.py

def __init__(self, input_format="dense", delimitor=',', data_type='float64',
             exclusive_data_type=None,
             tag_with_value=False, tag_value_delimitor=":",
             missing_fill=False, default_value=0, missing_fill_method=None,
             missing_impute=None, outlier_replace=False, outlier_replace_method=None,
             outlier_impute=None, outlier_replace_value=0,
             with_label=False, label_name='y',
             label_type='int', output_format='dense', need_run=True):
    self.input_format = input_format
    self.delimitor = delimitor
    self.data_type = data_type
    self.exclusive_data_type = exclusive_data_type
    self.tag_with_value = tag_with_value
    self.tag_value_delimitor = tag_value_delimitor
    self.missing_fill = missing_fill
    self.default_value = default_value
    self.missing_fill_method = missing_fill_method
    self.missing_impute = missing_impute
    self.outlier_replace = outlier_replace
    self.outlier_replace_method = outlier_replace_method
    self.outlier_impute = outlier_impute
    self.outlier_replace_value = outlier_replace_value
    self.with_label = with_label
    self.label_name = label_name
    self.label_type = label_type
    self.output_format = output_format
    self.need_run = need_run

Attributes¶

input_format = input_format instance-attribute ¶

delimitor = delimitor instance-attribute ¶

data_type = data_type instance-attribute ¶

exclusive_data_type = exclusive_data_type instance-attribute ¶

tag_with_value = tag_with_value instance-attribute ¶

tag_value_delimitor = tag_value_delimitor instance-attribute ¶

missing_fill = missing_fill instance-attribute ¶

default_value = default_value instance-attribute ¶

missing_fill_method = missing_fill_method instance-attribute ¶

missing_impute = missing_impute instance-attribute ¶

outlier_replace = outlier_replace instance-attribute ¶

outlier_replace_method = outlier_replace_method instance-attribute ¶

outlier_impute = outlier_impute instance-attribute ¶

outlier_replace_value = outlier_replace_value instance-attribute ¶

with_label = with_label instance-attribute ¶

label_name = label_name instance-attribute ¶

label_type = label_type instance-attribute ¶

output_format = output_format instance-attribute ¶

need_run = need_run instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/dataio_param.py

def check(self):

    descr = "dataio param's"

    self.input_format = self.check_and_change_lower(self.input_format,
                                                    ["dense", "sparse", "tag"],
                                                    descr)

    self.output_format = self.check_and_change_lower(self.output_format,
                                                     ["dense", "sparse"],
                                                     descr)

    self.data_type = self.check_and_change_lower(self.data_type,
                                                 ["int", "int64", "float", "float64", "str", "long"],
                                                 descr)

    if type(self.missing_fill).__name__ != 'bool':
        raise ValueError("dataio param's missing_fill {} not supported".format(self.missing_fill))

    if self.missing_fill_method is not None:
        self.missing_fill_method = self.check_and_change_lower(self.missing_fill_method,
                                                               ['min', 'max', 'mean', 'designated'],
                                                               descr)

    if self.outlier_replace_method is not None:
        self.outlier_replace_method = self.check_and_change_lower(self.outlier_replace_method,
                                                                  ['min', 'max', 'mean', 'designated'],
                                                                  descr)

    if type(self.with_label).__name__ != 'bool':
        raise ValueError("dataio param's with_label {} not supported".format(self.with_label))

    if self.with_label:
        if not isinstance(self.label_name, str):
            raise ValueError("dataio param's label_name {} should be str".format(self.label_name))

        self.label_type = self.check_and_change_lower(self.label_type,
                                                      ["int", "int64", "float", "float64", "str", "long"],
                                                      descr)

    if self.exclusive_data_type is not None and not isinstance(self.exclusive_data_type, dict):
        raise ValueError("exclusive_data_type is should be None or a dict")

    return True

`DataTransformParam(input_format='dense', delimitor=',', data_type='float64', exclusive_data_type=None, tag_with_value=False, tag_value_delimitor=':', missing_fill=False, default_value=0, missing_fill_method=None, missing_impute=None, outlier_replace=False, outlier_replace_method=None, outlier_impute=None, outlier_replace_value=0, with_label=False, label_name='y', label_type='int', output_format='dense', need_run=True, with_match_id=False, match_id_name='', match_id_index=0)` ¶

Bases: BaseParam

Define data transform parameters that used in federated ml.

Parameters:

Name	Type	Description	Default
`input_format`		please have a look at this tutorial at "DataTransform" section of federatedml/util/README.md. Formally, dense input format data should be set to "dense", svm-light input format data should be set to "sparse", tag or tag:value input format data should be set to "tag". Note: in fate's version >= 1.9.0, this params can be used in uploading/binding data's meta	`'dense'`
`delimitor`	`str`	the delimitor of data input, default: ','	`','`
`data_type`	`int`	{'float64','float','int','int64','str','long'} the data type of data input	`'float64'`
`exclusive_data_type`	`dict`	the key of dict is col_name, the value is data_type, use to specified special data type of some features.	`None`
`tag_with_value`		use if input_format is 'tag', if tag_with_value is True, input column data format should be tag[delimitor]value, otherwise is tag only	`False`
`tag_value_delimitor`		use if input_format is 'tag' and 'tag_with_value' is True, delimitor of tag[delimitor]value column value.	`':'`
`missing_fill`	`bool`	need to fill missing value or not, accepted only True/False, default: False	`False`
`default_value`	`None or object or list`	the value to replace missing value. if None, it will use default value define in federatedml/feature/imputer.py, if single object, will fill missing value with this object, if list, it's length should be the sample of input data' feature dimension, means that if some column happens to have missing values, it will replace it the value by element in the identical position of this list.	`0`
`missing_fill_method`		the method to replace missing value, should be one of [None, 'min', 'max', 'mean', 'designated']	`None`
`missing_impute`		element of list can be any type, or auto generated if value is None, define which values to be consider as missing	`None`
`outlier_replace`		need to replace outlier value or not, accepted only True/False, default: True	`False`
`outlier_replace_method`		the method to replace missing value, should be one of [None, 'min', 'max', 'mean', 'designated']	`None`
`outlier_impute`		element of list can be any type, which values should be regard as missing value	`None`
`outlier_replace_value`		the value to replace outlier. if None, it will use default value define in federatedml/feature/imputer.py, if single object, will replace outlier with this object, if list, it's length should be the sample of input data' feature dimension, means that if some column happens to have outliers, it will replace it the value by element in the identical position of this list.	`0`
`with_label`	`bool`	True if input data consist of label, False otherwise. default: 'false' Note: in fate's version >= 1.9.0, this params can be used in uploading/binding data's meta	`False`
`label_name`	`str`	column_name of the column where label locates, only use in dense-inputformat. default: 'y'	`'y'`
`label_type`		use when with_label is True	`'int','int64','float','float64','long','str'`
`output_format`		output format	`'dense'`
`with_match_id`		True if dataset has match_id, default: False Note: in fate's version >= 1.9.0, this params can be used in uploading/binding data's meta	`False`
`match_id_name`		Valid if input_format is "dense", and multiple columns are considered as match_ids, the name of match_id to be used in current job Note: in fate's version >= 1.9.0, this params can be used in uploading/binding data's meta	`''`
`match_id_index`		Valid if input_format is "tag" or "sparse", and multiple columns are considered as match_ids, the index of match_id, default: 0 This param works only when data meta has been set with uploading/binding.	`0`

Source code in python/federatedml/param/data_transform_param.py

def __init__(self, input_format="dense", delimitor=',', data_type='float64',
             exclusive_data_type=None,
             tag_with_value=False, tag_value_delimitor=":",
             missing_fill=False, default_value=0, missing_fill_method=None,
             missing_impute=None, outlier_replace=False, outlier_replace_method=None,
             outlier_impute=None, outlier_replace_value=0,
             with_label=False, label_name='y',
             label_type='int', output_format='dense', need_run=True,
             with_match_id=False, match_id_name='', match_id_index=0):
    self.input_format = input_format
    self.delimitor = delimitor
    self.data_type = data_type
    self.exclusive_data_type = exclusive_data_type
    self.tag_with_value = tag_with_value
    self.tag_value_delimitor = tag_value_delimitor
    self.missing_fill = missing_fill
    self.default_value = default_value
    self.missing_fill_method = missing_fill_method
    self.missing_impute = missing_impute
    self.outlier_replace = outlier_replace
    self.outlier_replace_method = outlier_replace_method
    self.outlier_impute = outlier_impute
    self.outlier_replace_value = outlier_replace_value
    self.with_label = with_label
    self.label_name = label_name
    self.label_type = label_type
    self.output_format = output_format
    self.need_run = need_run
    self.with_match_id = with_match_id
    self.match_id_name = match_id_name
    self.match_id_index = match_id_index

Attributes¶

input_format = input_format instance-attribute ¶

delimitor = delimitor instance-attribute ¶

data_type = data_type instance-attribute ¶

exclusive_data_type = exclusive_data_type instance-attribute ¶

tag_with_value = tag_with_value instance-attribute ¶

tag_value_delimitor = tag_value_delimitor instance-attribute ¶

missing_fill = missing_fill instance-attribute ¶

default_value = default_value instance-attribute ¶

missing_fill_method = missing_fill_method instance-attribute ¶

missing_impute = missing_impute instance-attribute ¶

outlier_replace = outlier_replace instance-attribute ¶

outlier_replace_method = outlier_replace_method instance-attribute ¶

outlier_impute = outlier_impute instance-attribute ¶

outlier_replace_value = outlier_replace_value instance-attribute ¶

with_label = with_label instance-attribute ¶

label_name = label_name instance-attribute ¶

label_type = label_type instance-attribute ¶

output_format = output_format instance-attribute ¶

need_run = need_run instance-attribute ¶

with_match_id = with_match_id instance-attribute ¶

match_id_name = match_id_name instance-attribute ¶

match_id_index = match_id_index instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/data_transform_param.py

def check(self):

    descr = "data_transform param's"

    self.input_format = self.check_and_change_lower(self.input_format,
                                                    ["dense", "sparse", "tag"],
                                                    descr)

    self.output_format = self.check_and_change_lower(self.output_format,
                                                     ["dense", "sparse"],
                                                     descr)

    self.data_type = self.check_and_change_lower(self.data_type,
                                                 ["int", "int64", "float", "float64", "str", "long"],
                                                 descr)

    if type(self.missing_fill).__name__ != 'bool':
        raise ValueError("data_transform param's missing_fill {} not supported".format(self.missing_fill))

    if self.missing_fill_method is not None:
        self.missing_fill_method = self.check_and_change_lower(self.missing_fill_method,
                                                               ['min', 'max', 'mean', 'designated'],
                                                               descr)

    if self.outlier_replace_method is not None:
        self.outlier_replace_method = self.check_and_change_lower(self.outlier_replace_method,
                                                                  ['min', 'max', 'mean', 'designated'],
                                                                  descr)

    if type(self.with_label).__name__ != 'bool':
        raise ValueError("data_transform param's with_label {} not supported".format(self.with_label))

    if self.with_label:
        if not isinstance(self.label_name, str):
            raise ValueError("data transform param's label_name {} should be str".format(self.label_name))

        self.label_type = self.check_and_change_lower(self.label_type,
                                                      ["int", "int64", "float", "float64", "str", "long"],
                                                      descr)

    if self.exclusive_data_type is not None and not isinstance(self.exclusive_data_type, dict):
        raise ValueError("exclusive_data_type is should be None or a dict")

    if not isinstance(self.with_match_id, bool):
        raise ValueError("with_match_id should be boolean variable, but {} find".format(self.with_match_id))

    if not isinstance(self.match_id_index, int) or self.match_id_index < 0:
        raise ValueError("match_id_index should be non negative integer")

    if self.match_id_name is not None and not isinstance(self.match_id_name, str):
        raise ValueError("match_id_name should be str")

    return True

`FeldmanVerifiableSumParam(sum_cols=None, q_n=6)` ¶

Bases: BaseParam

Define how to transfer the cols

Parameters:

Name	Type	Description	Default
`sum_cols`	`list of column index, default`	Specify which columns need to be sum. If column index is None, each of columns will be sum.	`None`
`q_n`	`int, positive integer less than or equal to 16, default`	q_n is the number of significant decimal digit, If the data type is a float, the maximum significant digit is 16. The sum of integer and significant decimal digits should be less than or equal to 16.	`6`

Source code in python/federatedml/param/feldman_verifiable_sum_param.py

def __init__(self, sum_cols=None, q_n=6):
    self.sum_cols = sum_cols
    self.q_n = q_n

Attributes¶

sum_cols = sum_cols instance-attribute ¶

q_n = q_n instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/feldman_verifiable_sum_param.py

def check(self):
    self.sum_cols = [] if self.sum_cols is None else self.sum_cols
    if isinstance(self.sum_cols, list):
        for idx in self.sum_cols:
            if not isinstance(idx, int):
                raise ValueError(f"type mismatch, column_indexes with element {idx}(type is {type(idx)})")

    if not isinstance(self.q_n, int):
        raise ValueError(f"Init param's q_n {self.q_n} not supported, should be int type", type is {type(self.q_n)})

    if self.q_n < 0:
        raise ValueError(f"param's q_n {self.q_n} not supported, should be non-negative int value")
    elif self.q_n > 16:
        raise ValueError(f"param's q_n {self.q_n} not supported, should be less than or equal to 16")

`InitParam(init_method='random_uniform', init_const=1, fit_intercept=True, random_seed=None)` ¶

Bases: BaseParam

Initialize Parameters used in initializing a model.

Parameters:

Name	Type	Description	Default
`init_method`		Initial method.	`'random_uniform'`
`init_const`	`int or float, default`	Required when init_method is 'const'. Specify the constant.	`1`
`fit_intercept`	`bool, default`	Whether to initialize the intercept or not.	`True`

Source code in python/federatedml/param/init_model_param.py

def __init__(self, init_method='random_uniform', init_const=1, fit_intercept=True, random_seed=None):
    super().__init__()
    self.init_method = init_method
    self.init_const = init_const
    self.fit_intercept = fit_intercept
    self.random_seed = random_seed

Attributes¶

init_method = init_method instance-attribute ¶

init_const = init_const instance-attribute ¶

fit_intercept = fit_intercept instance-attribute ¶

random_seed = random_seed instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/init_model_param.py

def check(self):
    if type(self.init_method).__name__ != "str":
        raise ValueError(
            "Init param's init_method {} not supported, should be str type".format(self.init_method))
    else:
        self.init_method = self.init_method.lower()
        if self.init_method not in ['random_uniform', 'random_normal', 'ones', 'zeros', 'const']:
            raise ValueError(
                "Init param's init_method {} not supported, init_method should in 'random_uniform',"
                " 'random_normal' 'ones', 'zeros' or 'const'".format(self.init_method))

    if type(self.init_const).__name__ not in ['int', 'float']:
        raise ValueError(
            "Init param's init_const {} not supported, should be int or float type".format(self.init_const))

    if type(self.fit_intercept).__name__ != 'bool':
        raise ValueError(
            "Init param's fit_intercept {} not supported, should be bool type".format(self.fit_intercept))

    if self.random_seed is not None:
        if type(self.random_seed).__name__ != 'int':
            raise ValueError(
                "Init param's random_seed {} not supported, should be int or float type".format(self.random_seed))

    return True

`SecureAddExampleParam(seed=None, partition=1, data_num=1000)` ¶

Bases: BaseParam

Source code in python/federatedml/param/secure_add_example_param.py

def __init__(self, seed=None, partition=1, data_num=1000):
    self.seed = seed
    self.partition = partition
    self.data_num = data_num

Attributes¶

seed = seed instance-attribute ¶

partition = partition instance-attribute ¶

data_num = data_num instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/secure_add_example_param.py

def check(self):
    if self.seed is not None and type(self.seed).__name__ != "int":
        raise ValueError("random seed should be None or integers")

    if type(self.partition).__name__ != "int" or self.partition < 1:
        raise ValueError("partition should be an integer large than 0")

    if type(self.data_num).__name__ != "int" or self.data_num < 1:
        raise ValueError("data_num should be an integer large than 0")

`StochasticQuasiNewtonParam(update_interval_L=3, memory_M=5, sample_size=5000, random_seed=None)` ¶

Bases: BaseParam

Parameters used for stochastic quasi-newton method.

Parameters:

Name	Type	Description	Default
`update_interval_L`	`int, default`	Set how many iteration to update hess matrix	`3`
`memory_M`	`int, default`	Stack size of curvature information, i.e. y_k and s_k in the paper.	`5`
`sample_size`	`int, default`	Sample size of data that used to update Hess matrix	`5000`

Source code in python/federatedml/param/sqn_param.py

def __init__(self, update_interval_L=3, memory_M=5, sample_size=5000, random_seed=None):
    super().__init__()
    self.update_interval_L = update_interval_L
    self.memory_M = memory_M
    self.sample_size = sample_size
    self.random_seed = random_seed

Attributes¶

update_interval_L = update_interval_L instance-attribute ¶

memory_M = memory_M instance-attribute ¶

sample_size = sample_size instance-attribute ¶

random_seed = random_seed instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/sqn_param.py

def check(self):
    descr = "hetero sqn param's"
    self.check_positive_integer(self.update_interval_L, descr)
    self.check_positive_integer(self.memory_M, descr)
    self.check_positive_integer(self.sample_size, descr)
    if self.random_seed is not None:
        self.check_positive_integer(self.random_seed, descr)
    return True

`EncryptParam(method=consts.PAILLIER, key_length=1024)` ¶

Bases: BaseParam

Define encryption method that used in federated ml.

Parameters:

Name	Type	Description	Default
`method`		If method is 'Paillier', Paillier encryption will be used for federated ml. To use non-encryption version in HomoLR, set this to None. For detail of Paillier encryption, please check out the paper mentioned in README file.	`'Paillier'`
`key_length`	`int, default`	Used to specify the length of key in this encryption method.	`1024`

Source code in python/federatedml/param/encrypt_param.py

def __init__(self, method=consts.PAILLIER, key_length=1024):
    super(EncryptParam, self).__init__()
    self.method = method
    self.key_length = key_length

Attributes¶

method = method instance-attribute ¶

key_length = key_length instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/encrypt_param.py

def check(self):
    if self.method is not None and type(self.method).__name__ != "str":
        raise ValueError(
            "encrypt_param's method {} not supported, should be str type".format(
                self.method))
    elif self.method is None:
        pass
    else:
        user_input = self.method.lower()
        if user_input == "paillier":
            self.method = consts.PAILLIER
        elif user_input == consts.ITERATIVEAFFINE.lower() or user_input == consts.RANDOM_ITERATIVEAFFINE:
            LOGGER.warning('Iterative Affine and Random Iterative Affine are not supported in version>=1.7.1 '
                           'due to safety concerns, encrypt method will be reset to Paillier')
            self.method = consts.PAILLIER
        elif user_input == "ipcl":
            self.method = consts.PAILLIER_IPCL
        else:
            raise ValueError(
                "encrypt_param's method {} not supported".format(user_input))

    if type(self.key_length).__name__ != "int":
        raise ValueError(
            "encrypt_param's key_length {} not supported, should be int type".format(self.key_length))
    elif self.key_length <= 0:
        raise ValueError(
            "encrypt_param's key_length must be greater or equal to 1")

    LOGGER.debug("Finish encrypt parameter check!")
    return True

`EncryptedModeCalculatorParam(mode='strict', re_encrypted_rate=1)` ¶

Bases: BaseParam

Define the encrypted_mode_calulator parameters.

Parameters:

Name	Type	Description	Default
`mode`		encrypted mode, default: strict	`'strict'`
`re_encrypted_rate`		numeric number in [0, 1], use when mode equals to 'balance', default: 1	`1`

Source code in python/federatedml/param/encrypted_mode_calculation_param.py

def __init__(self, mode="strict", re_encrypted_rate=1):
    self.mode = mode
    self.re_encrypted_rate = re_encrypted_rate

Attributes¶

mode = mode instance-attribute ¶

re_encrypted_rate = re_encrypted_rate instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/encrypted_mode_calculation_param.py

def check(self):
    descr = "encrypted_mode_calculator param"
    self.mode = self.check_and_change_lower(self.mode,
                                            ["strict", "fast", "balance", "confusion_opt", "confusion_opt_balance"],
                                            descr)

    if self.mode != "strict":
        LOGGER.warning("encrypted_mode_calculator will be remove in later version, "
                       "but in current version user can still use it, but it only supports strict mode, "
                       "other mode will be reset to strict for compatibility")
        self.mode = "strict"

    return True

`EvaluateParam(eval_type='binary', pos_label=1, need_run=True, metrics=None, run_clustering_arbiter_metric=False, unfold_multi_result=False)` ¶

Bases: BaseParam

Define the evaluation method of binary/multiple classification and regression

Parameters:

Name	Type	Description	Default
`eval_type`		support 'binary' for HomoLR, HeteroLR and Secureboosting, support 'regression' for Secureboosting, 'multi' is not support these version	`'binary'`
`unfold_multi_result`	`bool`	unfold multi result and get several one-vs-rest binary classification results	`False`
`pos_label`	`int or float or str`	specify positive label type, depend on the data's label. this parameter effective only for 'binary'	`1`
`need_run`		Indicate if this module needed to be run	`True`

Source code in python/federatedml/param/evaluation_param.py

def __init__(self, eval_type="binary", pos_label=1, need_run=True, metrics=None,
             run_clustering_arbiter_metric=False, unfold_multi_result=False):
    super().__init__()
    self.eval_type = eval_type
    self.pos_label = pos_label
    self.need_run = need_run
    self.metrics = metrics
    self.unfold_multi_result = unfold_multi_result
    self.run_clustering_arbiter_metric = run_clustering_arbiter_metric

    self.default_metrics = {
        consts.BINARY: consts.ALL_BINARY_METRICS,
        consts.MULTY: consts.ALL_MULTI_METRICS,
        consts.REGRESSION: consts.ALL_REGRESSION_METRICS,
        consts.CLUSTERING: consts.ALL_CLUSTER_METRICS
    }

    self.allowed_metrics = {
        consts.BINARY: consts.ALL_BINARY_METRICS,
        consts.MULTY: consts.ALL_MULTI_METRICS,
        consts.REGRESSION: consts.ALL_REGRESSION_METRICS,
        consts.CLUSTERING: consts.ALL_CLUSTER_METRICS
    }

Attributes¶

eval_type = eval_type instance-attribute ¶

pos_label = pos_label instance-attribute ¶

need_run = need_run instance-attribute ¶

metrics = metrics instance-attribute ¶

unfold_multi_result = unfold_multi_result instance-attribute ¶

run_clustering_arbiter_metric = run_clustering_arbiter_metric instance-attribute ¶

default_metrics = {consts.BINARY: consts.ALL_BINARY_METRICS, consts.MULTY: consts.ALL_MULTI_METRICS, consts.REGRESSION: consts.ALL_REGRESSION_METRICS, consts.CLUSTERING: consts.ALL_CLUSTER_METRICS}

instance-attribute ¶

allowed_metrics = {consts.BINARY: consts.ALL_BINARY_METRICS, consts.MULTY: consts.ALL_MULTI_METRICS, consts.REGRESSION: consts.ALL_REGRESSION_METRICS, consts.CLUSTERING: consts.ALL_CLUSTER_METRICS}

instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/evaluation_param.py

def check(self):

    descr = "evaluate param's "
    self.eval_type = self.check_and_change_lower(self.eval_type,
                                                 [consts.BINARY, consts.MULTY, consts.REGRESSION,
                                                  consts.CLUSTERING],
                                                 descr)

    if type(self.pos_label).__name__ not in ["str", "float", "int"]:
        raise ValueError(
            "evaluate param's pos_label {} not supported, should be str or float or int type".format(
                self.pos_label))

    if type(self.need_run).__name__ != "bool":
        raise ValueError(
            "evaluate param's need_run {} not supported, should be bool".format(
                self.need_run))

    if self.metrics is None or len(self.metrics) == 0:
        self.metrics = self.default_metrics[self.eval_type]
        LOGGER.warning('use default metric {} for eval type {}'.format(self.metrics, self.eval_type))

    self.check_boolean(self.unfold_multi_result, 'multi_result_unfold')

    self.metrics = self._check_valid_metric(self.metrics)

    return True

check_single_value_default_metric() ¶

Source code in python/federatedml/param/evaluation_param.py

def check_single_value_default_metric(self):
    self._use_single_value_default_metrics()

    # in validation strategy, psi f1-score and confusion-mat pr-quantile are not supported in cur version
    if self.metrics is None or len(self.metrics) == 0:
        self.metrics = self.default_metrics[self.eval_type]
        LOGGER.warning('use default metric {} for eval type {}'.format(self.metrics, self.eval_type))

    ban_metric = [consts.PSI, consts.F1_SCORE, consts.CONFUSION_MAT, consts.QUANTILE_PR]
    for metric in self.metrics:
        if metric in ban_metric:
            self.metrics.remove(metric)
    self.check()

`KmeansParam(k=5, max_iter=300, tol=0.001, random_stat=None)` ¶

Bases: BaseParam

Parameters:

Name	Type	Description	Default
`k`	`int, default 5`	The number of the centroids to generate. should be larger than 1 and less than 100 in this version	`5`
`max_iter`	`int, default 300.`	Maximum number of iterations of the hetero-k-means algorithm to run.	`300`
`tol`	`float, default 0.001.`	tol	`0.001`
`random_stat`	`None or int`	random seed	`None`

Source code in python/federatedml/param/hetero_kmeans_param.py

def __init__(self, k=5, max_iter=300, tol=0.001, random_stat=None):
    super(KmeansParam, self).__init__()
    self.k = k
    self.max_iter = max_iter
    self.tol = tol
    self.random_stat = random_stat

Attributes¶

k = k instance-attribute ¶

max_iter = max_iter instance-attribute ¶

tol = tol instance-attribute ¶

random_stat = random_stat instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/hetero_kmeans_param.py

def check(self):
    descr = "Kmeans_param's"

    if not isinstance(self.k, int):
        raise ValueError(
            descr + "k {} not supported, should be int type".format(self.k))
    elif self.k <= 1:
        raise ValueError(
            descr + "k {} not supported, should be larger than 1")
    elif self.k > 100:
        raise ValueError(
            descr + "k {} not supported, should be less than 100 in this version")

    if not isinstance(self.max_iter, int):
        raise ValueError(
            descr + "max_iter not supported, should be int type".format(self.max_iter))
    elif self.max_iter <= 0:
        raise ValueError(
            descr + "max_iter not supported, should be larger than 0".format(self.max_iter))

    if not isinstance(self.tol, (float, int)):
        raise ValueError(
            descr + "tol not supported, should be float type".format(self.tol))
    elif self.tol < 0:
        raise ValueError(
            descr + "tol not supported, should be larger than or equal to 0".format(self.tol))

    if self.random_stat is not None:
        if not isinstance(self.random_stat, int):
            raise ValueError(descr + "random_stat not supported, should be int type".format(self.random_stat))
        elif self.random_stat < 0:
            raise ValueError(
                descr + "random_stat not supported, should be larger than/equal to 0".format(self.random_stat))

`PearsonParam(column_names=None, column_indexes=None, cross_parties=True, need_run=True, use_mix_rand=False, calc_local_vif=True)` ¶

Bases: BaseParam

param for pearson correlation

Parameters:

Name	Type	Description	Default
`column_names`	`list of string`	list of column names	`None`
`column_index`	`list of int`	list of column index	required
`cross_parties`	`bool, default`	if True, calculate correlation of columns from both party	`True`
`need_run`	`bool`	set False to skip this party	`True`
`use_mix_rand`	`bool, defalut`	mix system random and pseudo random for quicker calculation	`False`
`calc_loca_vif`	`bool, default True`	calculate VIF for columns in local	required

Source code in python/federatedml/param/pearson_param.py

def __init__(
    self,
    column_names=None,
    column_indexes=None,
    cross_parties=True,
    need_run=True,
    use_mix_rand=False,
    calc_local_vif=True,
):
    super().__init__()
    self.column_names = column_names
    self.column_indexes = column_indexes
    self.cross_parties = cross_parties
    self.need_run = need_run
    self.use_mix_rand = use_mix_rand
    self.calc_local_vif = calc_local_vif

Attributes¶

column_names = column_names instance-attribute ¶

column_indexes = column_indexes instance-attribute ¶

cross_parties = cross_parties instance-attribute ¶

need_run = need_run instance-attribute ¶

use_mix_rand = use_mix_rand instance-attribute ¶

calc_local_vif = calc_local_vif instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/pearson_param.py

def check(self):
    if not isinstance(self.use_mix_rand, bool):
        raise ValueError(
            f"use_mix_rand accept bool type only, {type(self.use_mix_rand)} got"
        )
    if self.cross_parties and (not self.need_run):
        raise ValueError(
            f"need_run should be True(which is default) when cross_parties is True."
        )

    self.column_indexes = [] if self.column_indexes is None else self.column_indexes
    self.column_names = [] if self.column_names is None else self.column_names
    if not isinstance(self.column_names, list):
        raise ValueError(
            f"type mismatch, column_names with type {type(self.column_names)}"
        )
    for name in self.column_names:
        if not isinstance(name, str):
            raise ValueError(
                f"type mismatch, column_names with element {name}(type is {type(name)})"
            )

    if isinstance(self.column_indexes, list):
        for idx in self.column_indexes:
            if not isinstance(idx, int):
                raise ValueError(
                    f"type mismatch, column_indexes with element {idx}(type is {type(idx)})"
                )

    if isinstance(self.column_indexes, int) and self.column_indexes != -1:
        raise ValueError(
            f"column_indexes with type int and value {self.column_indexes}(only -1 allowed)"
        )

    if self.need_run:
        if isinstance(self.column_indexes, list) and isinstance(
            self.column_names, list
        ):
            if len(self.column_indexes) == 0 and len(self.column_names) == 0:
                raise ValueError(f"provide at least one column")

`PositiveUnlabeledParam(strategy='probability', threshold=0.9)` ¶

Bases: BaseParam

Parameters used for positive unlabeled.¶

strategy: {"probability", "quantity", "proportion", "distribution"} The strategy of converting unlabeled value.

threshold: int or float, default: 0.9 The threshold in labeling strategy.

Source code in python/federatedml/param/positive_unlabeled_param.py

def __init__(self, strategy="probability", threshold=0.9):
    super(PositiveUnlabeledParam, self).__init__()
    self.strategy = strategy
    self.threshold = threshold

Attributes¶

strategy = strategy instance-attribute ¶

threshold = threshold instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/positive_unlabeled_param.py

def check(self):
    base_descr = "Positive Unlabeled Param's "
    float_descr = "Probability or Proportion Strategy Param's "
    int_descr = "Quantity Strategy Param's "
    numeric_descr = "Distribution Strategy Param's "

    self.check_valid_value(self.strategy, base_descr,
                           [consts.PROBABILITY, consts.QUANTITY, consts.PROPORTION, consts.DISTRIBUTION])

    self.check_defined_type(self.threshold, base_descr, [consts.INT, consts.FLOAT])

    if self.strategy == consts.PROBABILITY or self.strategy == consts.PROPORTION:
        self.check_decimal_float(self.threshold, float_descr)

    if self.strategy == consts.QUANTITY:
        self.check_positive_integer(self.threshold, int_descr)

    if self.strategy == consts.DISTRIBUTION:
        self.check_positive_number(self.threshold, numeric_descr)

    return True

`SampleParam(mode='random', method='downsample', fractions=None, random_state=None, task_type='hetero', need_run=True)` ¶

Bases: BaseParam

Define the sample method

Parameters:

Name	Type	Description	Default
`mode`		specify sample to use, default: 'random'	`'random'`

fractions: None or float or list if mode equals to random, it should be a float number greater than 0, otherwise a list of elements of pairs like [label_i, sample_rate_i], e.g. [[0, 0.5], [1, 0.8], [2, 0.3]]. default: None

random_state: int, RandomState instance or None, default: None random state

need_run: bool, default True Indicate if this module needed to be run

Source code in python/federatedml/param/sample_param.py

def __init__(self, mode="random", method="downsample", fractions=None,
             random_state=None, task_type="hetero", need_run=True):
    self.mode = mode
    self.method = method
    self.fractions = fractions
    self.random_state = random_state
    self.task_type = task_type
    self.need_run = need_run

Attributes¶

mode = mode instance-attribute ¶

method = method instance-attribute ¶

fractions = fractions instance-attribute ¶

random_state = random_state instance-attribute ¶

task_type = task_type instance-attribute ¶

need_run = need_run instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/sample_param.py

def check(self):
    descr = "sample param"
    self.mode = self.check_and_change_lower(self.mode,
                                            ["random", "stratified", "exact_by_weight"],
                                            descr)

    self.method = self.check_and_change_lower(self.method,
                                              ["upsample", "downsample"],
                                              descr)

    if self.mode == "stratified" and self.fractions is not None:
        if not isinstance(self.fractions, list):
            raise ValueError("fractions of sample param when using stratified should be list")
        for ele in self.fractions:
            if not isinstance(ele, collections.Container) or len(ele) != 2:
                raise ValueError(
                    "element in fractions of sample param using stratified should be a pair like [label_i, rate_i]")

    return True

`ScaleParam(method='standard_scale', mode='normal', scale_col_indexes=-1, scale_names=None, feat_upper=None, feat_lower=None, with_mean=True, with_std=True, need_run=True)` ¶

Bases: BaseParam

Define the feature scale parameters.

Parameters:

Name	Type	Description	Default
`method`		like scale in sklearn, now it support "min_max_scale" and "standard_scale", and will support other scale method soon. Default standard_scale, which will do nothing for scale	`"standard_scale"`
`mode`		for mode is "normal", the feat_upper and feat_lower is the normal value like "10" or "3.1" and for "cap", feat_upper and feature_lower will between 0 and 1, which means the percentile of the column. Default "normal"	`"normal"`
`feat_upper`	`int or float or list of int or float`	the upper limit in the column. If use list, mode must be "normal", and list length should equal to the number of features to scale. If the scaled value is larger than feat_upper, it will be set to feat_upper	`None`
`feat_lower`		the lower limit in the column. If use list, mode must be "normal", and list length should equal to the number of features to scale. If the scaled value is less than feat_lower, it will be set to feat_lower	`None`
`scale_col_indexes`		the idx of column in scale_column_idx will be scaled, while the idx of column is not in, it will not be scaled.	`-1`
`scale_names`	`list of string`	Specify which columns need to scaled. Each element in the list represent for a column name in header. default: []	`None`
`with_mean`	`bool`	used for "standard_scale". Default True.	`True`
`with_std`	`bool`	used for "standard_scale". Default True. The standard scale of column x is calculated as : $z = (x - u) / s$ , where $u$ is the mean of the column and $s$ is the standard deviation of the column. if with_mean is False, $u$ will be 0, and if with_std is False, $s$ will be 1.	`True`
`need_run`	`bool`	Indicate if this module needed to be run, default True	`True`

Source code in python/federatedml/param/scale_param.py

def __init__(
        self,
        method="standard_scale",
        mode="normal",
        scale_col_indexes=-1,
        scale_names=None,
        feat_upper=None,
        feat_lower=None,
        with_mean=True,
        with_std=True,
        need_run=True):
    super().__init__()
    self.scale_names = [] if scale_names is None else scale_names

    self.method = method
    self.mode = mode
    self.feat_upper = feat_upper
    # LOGGER.debug("self.feat_upper:{}, type:{}".format(self.feat_upper, type(self.feat_upper)))
    self.feat_lower = feat_lower
    self.scale_col_indexes = scale_col_indexes

    self.with_mean = with_mean
    self.with_std = with_std

    self.need_run = need_run

Attributes¶

scale_names = [] if scale_names is None else scale_names instance-attribute ¶

method = method instance-attribute ¶

mode = mode instance-attribute ¶

feat_upper = feat_upper instance-attribute ¶

feat_lower = feat_lower instance-attribute ¶

scale_col_indexes = scale_col_indexes instance-attribute ¶

with_mean = with_mean instance-attribute ¶

with_std = with_std instance-attribute ¶

need_run = need_run instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/scale_param.py

def check(self):
    if self.method is not None:
        descr = "scale param's method"
        self.method = self.check_and_change_lower(self.method,
                                                  [consts.MINMAXSCALE, consts.STANDARDSCALE],
                                                  descr)

    descr = "scale param's mode"
    self.mode = self.check_and_change_lower(self.mode,
                                            [consts.NORMAL, consts.CAP],
                                            descr)
    # LOGGER.debug("self.feat_upper:{}, type:{}".format(self.feat_upper, type(self.feat_upper)))
    # if type(self.feat_upper).__name__ not in ["float", "int"]:
    #     raise ValueError("scale param's feat_upper {} not supported, should be float or int".format(
    #         self.feat_upper))

    if self.scale_col_indexes != -1 and not isinstance(self.scale_col_indexes, list):
        raise ValueError("scale_col_indexes is should be -1 or a list")

    if self.scale_names is None:
        self.scale_names = []
    if not isinstance(self.scale_names, list):
        raise ValueError("scale_names is should be a list of string")
    else:
        for e in self.scale_names:
            if not isinstance(e, str):
                raise ValueError("scale_names is should be a list of string")

    self.check_boolean(self.with_mean, "scale_param with_mean")
    self.check_boolean(self.with_std, "scale_param with_std")
    self.check_boolean(self.need_run, "scale_param need_run")

    LOGGER.debug("Finish scale parameter check!")
    return True

`DataSplitParam(random_state=None, test_size=None, train_size=None, validate_size=None, stratified=False, shuffle=True, split_points=None, need_run=True)` ¶

Bases: BaseParam

Define data split param that used in data split.

Parameters:

Name	Type	Description	Default
`random_state`	`None or int, default`	Specify the random state for shuffle.	`None`
`test_size`	`float or int or None, default`	Specify test data set size. float value specifies fraction of input data set, int value specifies exact number of data instances	`None`
`train_size`	`float or int or None, default`	Specify train data set size. float value specifies fraction of input data set, int value specifies exact number of data instances	`None`
`validate_size`	`float or int or None, default`	Specify validate data set size. float value specifies fraction of input data set, int value specifies exact number of data instances	`None`
`stratified`	`bool, default`	Define whether sampling should be stratified, according to label value.	`False`
`shuffle`	`bool, default`	Define whether do shuffle before splitting or not.	`True`
`split_points`	`None or list, default`	Specify the point(s) by which continuous label values are bucketed into bins for stratified split. eg.[0.2] for two bins or [0.1, 1, 3] for 4 bins	`None`
`need_run`		Specify whether to run data split	`True`

Source code in python/federatedml/param/data_split_param.py

def __init__(self, random_state=None, test_size=None, train_size=None, validate_size=None, stratified=False,
             shuffle=True, split_points=None, need_run=True):
    super(DataSplitParam, self).__init__()
    self.random_state = random_state
    self.test_size = test_size
    self.train_size = train_size
    self.validate_size = validate_size
    self.stratified = stratified
    self.shuffle = shuffle
    self.split_points = split_points
    self.need_run = need_run

Attributes¶

random_state = random_state instance-attribute ¶

test_size = test_size instance-attribute ¶

train_size = train_size instance-attribute ¶

validate_size = validate_size instance-attribute ¶

stratified = stratified instance-attribute ¶

shuffle = shuffle instance-attribute ¶

split_points = split_points instance-attribute ¶

need_run = need_run instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/data_split_param.py

def check(self):
    model_param_descr = "data split param's "
    if self.random_state is not None:
        if not isinstance(self.random_state, int):
            raise ValueError(f"{model_param_descr} random state should be int type")
        BaseParam.check_nonnegative_number(self.random_state, f"{model_param_descr} random_state ")

    if self.test_size is not None:
        BaseParam.check_nonnegative_number(self.test_size, f"{model_param_descr} test_size ")
        if isinstance(self.test_size, float):
            BaseParam.check_decimal_float(self.test_size, f"{model_param_descr} test_size ")
    if self.train_size is not None:
        BaseParam.check_nonnegative_number(self.train_size, f"{model_param_descr} train_size ")
        if isinstance(self.train_size, float):
            BaseParam.check_decimal_float(self.train_size, f"{model_param_descr} train_size ")
    if self.validate_size is not None:
        BaseParam.check_nonnegative_number(self.validate_size, f"{model_param_descr} validate_size ")
        if isinstance(self.validate_size, float):
            BaseParam.check_decimal_float(self.validate_size, f"{model_param_descr} validate_size ")
    # use default size values if none given
    if self.test_size is None and self.train_size is None and self.validate_size is None:
        self.test_size = 0.0
        self.train_size = 0.8
        self.validate_size = 0.2

    BaseParam.check_boolean(self.stratified, f"{model_param_descr} stratified ")
    BaseParam.check_boolean(self.shuffle, f"{model_param_descr} shuffle ")
    BaseParam.check_boolean(self.need_run, f"{model_param_descr} need run ")

    if self.split_points is not None:
        if not isinstance(self.split_points, list):
            raise ValueError(f"{model_param_descr} split_points should be list type")

    LOGGER.debug("Finish data_split parameter check!")
    return True

`OneVsRestParam(need_one_vs_rest=False, has_arbiter=True)` ¶

Bases: BaseParam

Define the one_vs_rest parameters.

Parameters:

Name	Type	Description	Default
`has_arbiter`		For some algorithm, may not has arbiter, for instances, secureboost of FATE, for these algorithms, it should be set to false.	`True`

Source code in python/federatedml/param/one_vs_rest_param.py

def __init__(self, need_one_vs_rest=False, has_arbiter=True):
    super().__init__()
    self.need_one_vs_rest = need_one_vs_rest
    self.has_arbiter = has_arbiter

Attributes¶

need_one_vs_rest = need_one_vs_rest instance-attribute ¶

has_arbiter = has_arbiter instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/one_vs_rest_param.py

def check(self):
    if type(self.has_arbiter).__name__ != "bool":
        raise ValueError(
            "one_vs_rest param's has_arbiter {} not supported, should be bool type".format(
                self.has_arbiter))

    LOGGER.debug("Finish one_vs_rest parameter check!")
    return True

`SampleWeightParam(class_weight=None, sample_weight_name=None, normalize=False, need_run=True)` ¶

Bases: BaseParam

Define sample weight parameters

Parameters:

Name	Type	Description	Default
`class_weight`	`str or dict, or None, default None`	class weight dictionary or class weight computation mode, string value only accepts 'balanced'; If dict provided, key should be class(label), and weight will not be normalize, e.g.: {'0': 1, '1': 2} If both class_weight and sample_weight_name are None, return original input data.	`None`
`sample_weight_name`	`str`	name of column which specifies sample weight. feature name of sample weight; if both class_weight and sample_weight_name are None, return original input data	`None`
`normalize`	`bool, default False`	whether to normalize sample weight extracted from `sample_weight_name` column	`False`
`need_run`	`bool, default True`	whether to run this module or not	`True`

Source code in python/federatedml/param/sample_weight_param.py

def __init__(self, class_weight=None, sample_weight_name=None, normalize=False, need_run=True):
    self.class_weight = class_weight
    self.sample_weight_name = sample_weight_name
    self.normalize = normalize
    self.need_run = need_run

Attributes¶

class_weight = class_weight instance-attribute ¶

sample_weight_name = sample_weight_name instance-attribute ¶

normalize = normalize instance-attribute ¶

need_run = need_run instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/sample_weight_param.py

def check(self):

    descr = "sample weight param's"

    if self.class_weight:
        if not isinstance(self.class_weight, str) and not isinstance(self.class_weight, dict):
            raise ValueError(f"{descr} class_weight must be str, dict, or None.")
        if isinstance(self.class_weight, str):
            self.class_weight = self.check_and_change_lower(self.class_weight,
                                                            [consts.BALANCED],
                                                            f"{descr} class_weight")
        if isinstance(self.class_weight, dict):
            for k, v in self.class_weight.items():
                if v < 0:
                    LOGGER.warning(f"Negative value {v} provided for class {k} as class_weight.")

    if self.sample_weight_name:
        self.check_string(self.sample_weight_name, f"{descr} sample_weight_name")

    self.check_boolean(self.need_run, f"{descr} need_run")

    self.check_boolean(self.normalize, f"{descr} normalize")

    return True

`StepwiseParam(score_name='AIC', mode=consts.HETERO, role=consts.GUEST, direction='both', max_step=10, nvmin=2, nvmax=None, need_stepwise=False)` ¶

Bases: BaseParam

Define stepwise params

Parameters:

Name	Description	Default
`score_name`	Specify which model selection criterion to be used	`'AIC'`
`mode`	Indicate what mode is current task	`consts.HETERO`
`role`	Indicate what role is current party	`consts.GUEST`
`direction`	Indicate which direction to go for stepwise. 'forward' means forward selection; 'backward' means elimination; 'both' means possible models of both directions are examined at each step.	`'both'`
`max_step`	Specify total number of steps to run before forced stop.	`10`
`nvmin`	Specify the min subset size of final model, cannot be lower than 2. When nvmin > 2, the final model size may be smaller than nvmin due to max_step limit.	`2`
`nvmax`	Specify the max subset size of final model, 2 <= nvmin <= nvmax. The final model size may be larger than nvmax due to max_step limit.	`None`
`need_stepwise`	Indicate if this module needed to be run	`False`

Source code in python/federatedml/param/stepwise_param.py

def __init__(self, score_name="AIC", mode=consts.HETERO, role=consts.GUEST, direction="both",
             max_step=10, nvmin=2, nvmax=None, need_stepwise=False):
    super(StepwiseParam, self).__init__()
    self.score_name = score_name
    self.mode = mode
    self.role = role
    self.direction = direction
    self.max_step = max_step
    self.nvmin = nvmin
    self.nvmax = nvmax
    self.need_stepwise = need_stepwise

Attributes¶

score_name = score_name instance-attribute ¶

mode = mode instance-attribute ¶

role = role instance-attribute ¶

direction = direction instance-attribute ¶

max_step = max_step instance-attribute ¶

nvmin = nvmin instance-attribute ¶

nvmax = nvmax instance-attribute ¶

need_stepwise = need_stepwise instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/stepwise_param.py

def check(self):
    model_param_descr = "stepwise param's"
    self.score_name = self.check_and_change_lower(self.score_name, ["aic", "bic"], model_param_descr)
    self.check_valid_value(self.mode, model_param_descr, valid_values=[consts.HOMO, consts.HETERO])
    self.check_valid_value(self.role, model_param_descr, valid_values=[consts.HOST, consts.GUEST, consts.ARBITER])
    self.direction = self.check_and_change_lower(self.direction, ["forward", "backward", "both"], model_param_descr)
    self.check_positive_integer(self.max_step, model_param_descr)
    self.check_positive_integer(self.nvmin, model_param_descr)
    if self.nvmin < 2:
        raise ValueError(model_param_descr + " nvmin must be no less than 2.")
    if self.nvmax is not None:
        self.check_positive_integer(self.nvmax, model_param_descr)
        if self.nvmin > self.nvmax:
            raise ValueError(model_param_descr + " nvmax must be greater than nvmin.")
    self.check_boolean(self.need_stepwise, model_param_descr)

`UnionParam(need_run=True, allow_missing=False, keep_duplicate=False)` ¶

Bases: BaseParam

Define the union method for combining multiple dTables and keep entries with the same id

Parameters:

Name	Description	Default
`need_run`	Indicate if this module needed to be run	`True`
`allow_missing`	Whether allow mismatch between feature length and header length in the result. Note that empty tables will always be skipped regardless of this param setting.	`False`
`keep_duplicate`	Whether to keep entries with duplicated keys. If set to True, a new id will be generated for duplicated entry in the format {id}_{table_name}.	`False`

Source code in python/federatedml/param/union_param.py

def __init__(self, need_run=True, allow_missing=False, keep_duplicate=False):
    super().__init__()
    self.need_run = need_run
    self.allow_missing = allow_missing
    self.keep_duplicate = keep_duplicate

Attributes¶

need_run = need_run instance-attribute ¶

allow_missing = allow_missing instance-attribute ¶

keep_duplicate = keep_duplicate instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/union_param.py

def check(self):
    descr = "union param's "

    if type(self.need_run).__name__ != "bool":
        raise ValueError(
            descr + "need_run {} not supported, should be bool".format(
                self.need_run))

    if type(self.allow_missing).__name__ != "bool":
        raise ValueError(
            descr + "allow_missing {} not supported, should be bool".format(
                self.allow_missing))

    if type(self.keep_duplicate).__name__ != "bool":
        raise ValueError(
            descr + "keep_duplicate {} not supported, should be bool".format(
                self.keep_duplicate))

    LOGGER.info("Finish union parameter check!")
    return True

`ColumnExpandParam(append_header=None, method='manual', fill_value=consts.FLOAT_ZERO, need_run=True)` ¶

Bases: BaseParam

Define method used for expanding column

Parameters:

Name	Type	Description	Default
`append_header`	`None or str or List[str], default`	Name(s) for appended feature(s). If None is given, module outputs the original input value without any operation.	`None`
`method`	`str, default`	If method is 'manual', use user-specified `fill_value` to fill in new features.	`'manual'`
`fill_value`	`int or float or str or List[int] or List[float] or List[str], default`	Used for filling expanded feature columns. If given a list, length of the list must match that of `append_header`	`consts.FLOAT_ZERO`
`need_run`		Indicate if this module needed to be run.	`True`

Source code in python/federatedml/param/column_expand_param.py

def __init__(self, append_header=None, method="manual",
             fill_value=consts.FLOAT_ZERO, need_run=True):
    super(ColumnExpandParam, self).__init__()
    self.append_header = append_header
    self.method = method
    self.fill_value = fill_value
    self.need_run = need_run

Attributes¶

append_header = append_header instance-attribute ¶

method = method instance-attribute ¶

fill_value = fill_value instance-attribute ¶

need_run = need_run instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/column_expand_param.py

def check(self):
    descr = "column_expand param's "
    if not isinstance(self.method, str):
        raise ValueError(f"{descr}method {self.method} not supported, should be str type")
    else:
        user_input = self.method.lower()
        if user_input == "manual":
            self.method = consts.MANUAL
        else:
            raise ValueError(f"{descr} method {user_input} not supported")

    BaseParam.check_boolean(self.need_run, descr=descr)

    self.append_header = [] if self.append_header is None else self.append_header
    if not isinstance(self.append_header, list):
        raise ValueError(f"{descr} append_header must be None or list of str. "
                         f"Received {type(self.append_header)} instead.")
    for feature_name in self.append_header:
        BaseParam.check_string(feature_name, descr + "append_header values")

    if isinstance(self.fill_value, list):
        if len(self.append_header) != len(self.fill_value):
            raise ValueError(
                f"{descr} `fill value` is set to be list, "
                f"and param `append_header` must also be list of the same length.")
    else:
        self.fill_value = [self.fill_value]
    for value in self.fill_value:
        if type(value).__name__ not in ["float", "int", "long", "str"]:
            raise ValueError(
                f"{descr} fill value(s) must be float, int, or str. Received type {type(value)} instead.")

    LOGGER.debug("Finish column expand parameter check!")
    return True

`CrossValidationParam(n_splits=5, mode=consts.HETERO, role=consts.GUEST, shuffle=True, random_seed=1, need_cv=False, output_fold_history=True, history_value_type='score')` ¶

Bases: BaseParam

Define cross validation params

Parameters:

Name	Description	Default
`n_splits`	Specify how many splits used in KFold	`5`
`mode`	Indicate what mode is current task	`consts.HETERO`
`role`	Indicate what role is current party	`consts.GUEST`
`shuffle`	Define whether do shuffle before KFold or not.	`True`
`random_seed`	Specify the random seed for numpy shuffle	`1`
`need_cv`	Indicate if this module needed to be run	`False`
`output_fold_history`	Indicate whether to output table of ids used by each fold, else return original input data returned ids are formatted as: {original_id}#fold{fold_num}#{train/validate}	`True`
`history_value_type`	Indicate whether to include original instance or predict score in the output fold history, only effective when output_fold_history set to True	`'score'`

Source code in python/federatedml/param/cross_validation_param.py

def __init__(self, n_splits=5, mode=consts.HETERO, role=consts.GUEST, shuffle=True, random_seed=1,
             need_cv=False, output_fold_history=True, history_value_type="score"):
    super(CrossValidationParam, self).__init__()
    self.n_splits = n_splits
    self.mode = mode
    self.role = role
    self.shuffle = shuffle
    self.random_seed = random_seed
    # self.evaluate_param = copy.deepcopy(evaluate_param)
    self.need_cv = need_cv
    self.output_fold_history = output_fold_history
    self.history_value_type = history_value_type

Attributes¶

n_splits = n_splits instance-attribute ¶

mode = mode instance-attribute ¶

role = role instance-attribute ¶

shuffle = shuffle instance-attribute ¶

random_seed = random_seed instance-attribute ¶

need_cv = need_cv instance-attribute ¶

output_fold_history = output_fold_history instance-attribute ¶

history_value_type = history_value_type instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/cross_validation_param.py

def check(self):
    model_param_descr = "cross validation param's "
    self.check_positive_integer(self.n_splits, model_param_descr)
    self.check_valid_value(self.mode, model_param_descr, valid_values=[consts.HOMO, consts.HETERO])
    self.check_valid_value(self.role, model_param_descr, valid_values=[consts.HOST, consts.GUEST, consts.ARBITER])
    self.check_boolean(self.shuffle, model_param_descr)
    self.check_boolean(self.output_fold_history, model_param_descr)
    self.history_value_type = self.check_and_change_lower(
        self.history_value_type, ["instance", "score"], model_param_descr)
    if self.random_seed is not None:
        self.check_positive_integer(self.random_seed, model_param_descr)

`ScorecardParam(method='credit', offset=500, factor=20, factor_base=2, upper_limit_ratio=3, lower_limit_value=0, need_run=True)` ¶

Bases: BaseParam

Define method used for transforming prediction score to credit score

Parameters:

Name	Type	Description	Default
`method`		score method, currently only supports "credit"	`"credit"`
`offset`	`int or float, default`	score baseline	`500`
`factor`	`int or float, default`	scoring step, when odds double, result score increases by this factor	`20`
`factor_base`	`int or float, default`	factor base, value ln(factor_base) is used for calculating result score	`2`
`upper_limit_ratio`	`int or float, default`	upper bound for odds, credit score upper bound is upper_limit_ratio * offset	`3`
`lower_limit_value`	`int or float, default`	lower bound for result score	`0`
`need_run`	`bool, default`	Indicate if this module needs to be run.	`True`

Source code in python/federatedml/param/scorecard_param.py

def __init__(
        self,
        method="credit",
        offset=500,
        factor=20,
        factor_base=2,
        upper_limit_ratio=3,
        lower_limit_value=0,
        need_run=True):
    super(ScorecardParam, self).__init__()
    self.method = method
    self.offset = offset
    self.factor = factor
    self.factor_base = factor_base
    self.upper_limit_ratio = upper_limit_ratio
    self.lower_limit_value = lower_limit_value
    self.need_run = need_run

Attributes¶

method = method instance-attribute ¶

offset = offset instance-attribute ¶

factor = factor instance-attribute ¶

factor_base = factor_base instance-attribute ¶

upper_limit_ratio = upper_limit_ratio instance-attribute ¶

lower_limit_value = lower_limit_value instance-attribute ¶

need_run = need_run instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/scorecard_param.py

def check(self):
    descr = "scorecard param"
    if not isinstance(self.method, str):
        raise ValueError(f"{descr}method {self.method} not supported, should be str type")
    else:
        user_input = self.method.lower()
        if user_input == "credit":
            self.method = consts.CREDIT
        else:
            raise ValueError(f"{descr} method {user_input} not supported")

    if type(self.offset).__name__ not in ["int", "long", "float"]:
        raise ValueError(f"{descr} offset must be numeric,"
                         f"received {type(self.offset)} instead.")

    if type(self.factor).__name__ not in ["int", "long", "float"]:
        raise ValueError(f"{descr} factor must be numeric,"
                         f"received {type(self.factor)} instead.")

    if type(self.factor_base).__name__ not in ["int", "long", "float"]:
        raise ValueError(f"{descr} factor_base must be numeric,"
                         f"received {type(self.factor_base)} instead.")

    if type(self.upper_limit_ratio).__name__ not in ["int", "long", "float"]:
        raise ValueError(f"{descr} upper_limit_ratio must be numeric,"
                         f"received {type(self.upper_limit_ratio)} instead.")

    if type(self.lower_limit_value).__name__ not in ["int", "long", "float"]:
        raise ValueError(f"{descr} lower_limit_value must be numeric,"
                         f"received {type(self.lower_limit_value)} instead.")

    BaseParam.check_boolean(self.need_run, descr=descr + "need_run ")

    LOGGER.debug("Finish Scorecard parameter check!")
    return True

`LocalBaselineParam(model_name='LogisticRegression', model_opts=None, predict_param=PredictParam(), need_run=True)` ¶

Bases: BaseParam

Define the local baseline model param

Parameters:

Name	Type	Description	Default
`model_name`	`str`	sklearn model used to train on baseline model	`'LogisticRegression'`
`model_opts`	`dict or none, default None`	Param to be used as input into baseline model	`None`
`predict_param`	`PredictParam object, default`	predict param	`PredictParam()`
`need_run`		Indicate if this module needed to be run	`True`

Source code in python/federatedml/param/local_baseline_param.py

def __init__(self, model_name="LogisticRegression", model_opts=None, predict_param=PredictParam(), need_run=True):
    super(LocalBaselineParam, self).__init__()
    self.model_name = model_name
    self.model_opts = model_opts
    self.predict_param = copy.deepcopy(predict_param)
    self.need_run = need_run

Attributes¶

model_name = model_name instance-attribute ¶

model_opts = model_opts instance-attribute ¶

predict_param = copy.deepcopy(predict_param) instance-attribute ¶

need_run = need_run instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/local_baseline_param.py

def check(self):
    descr = "local baseline param"

    self.model_name = self.check_and_change_lower(self.model_name,
                                                  ["logisticregression"],
                                                  descr)
    self.check_boolean(self.need_run, descr)
    if self.model_opts is not None:
        if not isinstance(self.model_opts, dict):
            raise ValueError(descr + " model_opts must be None or dict.")
    if self.model_opts is None:
        self.model_opts = {}
    self.predict_param.check()

    return True

`PredictParam(threshold=0.5)` ¶

Bases: BaseParam

Define the predict method of HomoLR, HeteroLR, SecureBoosting

Parameters:

Name	Type	Description	Default
`threshold`		The threshold use to separate positive and negative class. Normally, it should be (0,1)	`0.5`

Source code in python/federatedml/param/predict_param.py

def __init__(self, threshold=0.5):
    self.threshold = threshold

Attributes¶

threshold = threshold instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/predict_param.py

def check(self):

    if type(self.threshold).__name__ not in ["float", "int"]:
        raise ValueError("predict param's predict_param {} not supported, should be float or int".format(
            self.threshold))

    LOGGER.debug("Finish predict parameter check!")
    return True

`SecureInformationRetrievalParam(security_level=0.5, oblivious_transfer_protocol=consts.OT_HAUCK, commutative_encryption=consts.CE_PH, non_committing_encryption=consts.AES, key_size=consts.DEFAULT_KEY_LENGTH, dh_params=DHParam(), raw_retrieval=False, target_cols=None)` ¶

Bases: BaseParam

Parameters:

Name	Description	Default
`security_level`	security level, should set value in [0, 1] if security_level equals 0.0 means raw data retrieval	`0.5`
`oblivious_transfer_protocol`	OT type, only supports OT_Hauck	`consts.OT_HAUCK`
`commutative_encryption`	the commutative encryption scheme used	`"CommutativeEncryptionPohligHellman"`
`non_committing_encryption`	the non-committing encryption scheme used	`"aes"`
`dh_params`	params for Pohlig-Hellman Encryption	`DHParam()`
`key_size`	the key length of the commutative cipher; note that this param will be deprecated in future, please specify key_length in PHParam instead.	`consts.DEFAULT_KEY_LENGTH`
`raw_retrieval`	perform raw retrieval if raw_retrieval	`False`
`target_cols`	target cols to retrieve; any values not retrieved will be marked as "unretrieved", if target_cols is None, label will be retrieved, same behavior as in previous version default None	`None`

Source code in python/federatedml/param/sir_param.py

def __init__(self, security_level=0.5,
             oblivious_transfer_protocol=consts.OT_HAUCK,
             commutative_encryption=consts.CE_PH,
             non_committing_encryption=consts.AES,
             key_size=consts.DEFAULT_KEY_LENGTH,
             dh_params=DHParam(),
             raw_retrieval=False,
             target_cols=None):
    super(SecureInformationRetrievalParam, self).__init__()
    self.security_level = security_level
    self.oblivious_transfer_protocol = oblivious_transfer_protocol
    self.commutative_encryption = commutative_encryption
    self.non_committing_encryption = non_committing_encryption
    self.dh_params = dh_params
    self.key_size = key_size
    self.raw_retrieval = raw_retrieval
    self.target_cols = target_cols

Attributes¶

security_level = security_level instance-attribute ¶

oblivious_transfer_protocol = oblivious_transfer_protocol instance-attribute ¶

commutative_encryption = commutative_encryption instance-attribute ¶

non_committing_encryption = non_committing_encryption instance-attribute ¶

dh_params = dh_params instance-attribute ¶

key_size = key_size instance-attribute ¶

raw_retrieval = raw_retrieval instance-attribute ¶

target_cols = target_cols instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/sir_param.py

def check(self):
    descr = "secure information retrieval param's "
    self.check_decimal_float(self.security_level, descr + "security_level")
    self.oblivious_transfer_protocol = self.check_and_change_lower(self.oblivious_transfer_protocol,
                                                                   [consts.OT_HAUCK.lower()],
                                                                   descr + "oblivious_transfer_protocol")
    self.commutative_encryption = self.check_and_change_lower(self.commutative_encryption,
                                                              [consts.CE_PH.lower()],
                                                              descr + "commutative_encryption")
    self.non_committing_encryption = self.check_and_change_lower(self.non_committing_encryption,
                                                                 [consts.AES.lower()],
                                                                 descr + "non_committing_encryption")
    if self._warn_to_deprecate_param("key_size", descr, "dh_param's key_length"):
        self.dh_params.key_length = self.key_size
    self.dh_params.check()
    if self._warn_to_deprecate_param("raw_retrieval", descr, "dh_param's security_level = 0"):
        self.check_boolean(self.raw_retrieval, descr)

    self.target_cols = [] if self.target_cols is None else self.target_cols
    if not isinstance(self.target_cols, list):
        self.target_cols = [self.target_cols]
    for col in self.target_cols:
        self.check_string(col, descr + "target_cols")
    if len(self.target_cols) == 0:
        LOGGER.warning(f"Both 'target_cols' and 'target_indexes' are empty. Label will be retrieved.")

`StatisticsParam(statistics='summary', column_names=None, column_indexes=-1, need_run=True, abnormal_list=None, quantile_error=consts.DEFAULT_RELATIVE_ERROR, bias=True)` ¶

Bases: BaseParam

Define statistics params

Parameters:

Name	Description	Default
`statistics`	Specify the statistic types to be computed. "summary" represents list: [consts.SUM, consts.MEAN, consts.STANDARD_DEVIATION, consts.MEDIAN, consts.MIN, consts.MAX, consts.MISSING_COUNT, consts.SKEWNESS, consts.KURTOSIS]	`'summary'`
`column_names`	Specify columns to be used for statistic computation by column names in header	`None`
`column_indexes`	Specify columns to be used for statistic computation by column order in header -1 indicates to compute statistics over all columns	`-1`
`bias`	If False, the calculations of skewness and kurtosis are corrected for statistical bias.	`True`
`need_run`	Indicate whether to run this modules	`True`

Source code in python/federatedml/param/statistics_param.py

def __init__(self, statistics="summary", column_names=None,
             column_indexes=-1, need_run=True, abnormal_list=None,
             quantile_error=consts.DEFAULT_RELATIVE_ERROR, bias=True):
    super().__init__()
    self.statistics = statistics
    self.column_names = column_names
    self.column_indexes = column_indexes
    self.abnormal_list = abnormal_list
    self.need_run = need_run
    self.quantile_error = quantile_error
    self.bias = bias

Attributes¶

LEGAL_STAT = [consts.COUNT, consts.SUM, consts.MEAN, consts.STANDARD_DEVIATION, consts.MEDIAN, consts.MIN, consts.MAX, consts.VARIANCE, consts.COEFFICIENT_OF_VARIATION, consts.MISSING_COUNT, consts.MISSING_RATIO, consts.SKEWNESS, consts.KURTOSIS]

instance-attribute class-attribute ¶

BASIC_STAT = [consts.SUM, consts.MEAN, consts.STANDARD_DEVIATION, consts.MEDIAN, consts.MIN, consts.MAX, consts.MISSING_RATIO, consts.MISSING_COUNT, consts.SKEWNESS, consts.KURTOSIS, consts.COEFFICIENT_OF_VARIATION]

instance-attribute class-attribute ¶

LEGAL_QUANTILE = re.compile('^(100)|([1-9]?[0-9])%$') instance-attribute class-attribute ¶

statistics = statistics instance-attribute ¶

column_names = column_names instance-attribute ¶

column_indexes = column_indexes instance-attribute ¶

abnormal_list = abnormal_list instance-attribute ¶

need_run = need_run instance-attribute ¶

quantile_error = quantile_error instance-attribute ¶

bias = bias instance-attribute ¶

Functions¶

find_stat_name_match(stat_name) staticmethod ¶

Source code in python/federatedml/param/statistics_param.py

@staticmethod
def find_stat_name_match(stat_name):
    if stat_name in StatisticsParam.LEGAL_STAT or StatisticsParam.LEGAL_QUANTILE.match(stat_name):
        return True
    return False

check() ¶

Source code in python/federatedml/param/statistics_param.py

def check(self):
    model_param_descr = "Statistics's param statistics"
    BaseParam.check_boolean(self.need_run, model_param_descr)
    statistics = copy.copy(self.BASIC_STAT)
    if not isinstance(self.statistics, list):
        if self.statistics in [consts.SUMMARY]:
            self.statistics = statistics
        else:
            if self.statistics not in statistics:
                statistics.append(self.statistics)
            self.statistics = statistics
    else:
        for s in self.statistics:
            if s not in statistics:
                statistics.append(s)
        self.statistics = statistics

    for stat_name in self.statistics:
        match_found = StatisticsParam.find_stat_name_match(stat_name)
        if not match_found:
            raise ValueError(f"Illegal statistics name provided: {stat_name}.")

    self.column_names = [] if self.column_names is None else self.column_names
    self.column_indexes = [] if self.column_indexes is None else self.column_indexes
    self.abnormal_list = [] if self.abnormal_list is None else self.abnormal_list
    model_param_descr = "Statistics's param column_names"
    if not isinstance(self.column_names, list):
        raise ValueError(f"column_names should be list of string.")
    for col_name in self.column_names:
        BaseParam.check_string(col_name, model_param_descr)

    model_param_descr = "Statistics's param column_indexes"
    if not isinstance(self.column_indexes, list) and self.column_indexes != -1:
        raise ValueError(f"column_indexes should be list of int or -1.")
    if self.column_indexes != -1:
        for col_index in self.column_indexes:
            if not isinstance(col_index, int):
                raise ValueError(f"{model_param_descr} should be int or list of int")
            if col_index < -consts.FLOAT_ZERO:
                raise ValueError(f"{model_param_descr} should be non-negative int value(s)")

    if not isinstance(self.abnormal_list, list):
        raise ValueError(f"abnormal_list should be list of int or string.")

    self.check_decimal_float(self.quantile_error, "Statistics's param quantile_error ")
    self.check_boolean(self.bias, "Statistics's param bias ")
    return True

`EncodeParam(salt='', encode_method='none', base64=False)` ¶

Bases: BaseParam

Define the hash method for raw intersect method

Parameters:

Name	Description	Default
`salt`	the src id will be str = str + salt, default by empty string	`''`
`encode_method`	the hash method of src id, support md5, sha1, sha224, sha256, sha384, sha512, sm3, default by None	`'none'`
`base64`	if True, the result of hash will be changed to base64, default by False	`False`

Source code in python/federatedml/param/intersect_param.py

def __init__(self, salt='', encode_method='none', base64=False):
    super().__init__()
    self.salt = salt
    self.encode_method = encode_method
    self.base64 = base64

Attributes¶

salt = salt instance-attribute ¶

encode_method = encode_method instance-attribute ¶

base64 = base64 instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/intersect_param.py

def check(self):
    if type(self.salt).__name__ != "str":
        raise ValueError(
            "encode param's salt {} not supported, should be str type".format(
                self.salt))

    descr = "encode param's "

    self.encode_method = self.check_and_change_lower(self.encode_method,
                                                     ["none", consts.MD5, consts.SHA1, consts.SHA224,
                                                      consts.SHA256, consts.SHA384, consts.SHA512,
                                                      consts.SM3],
                                                     descr)

    if type(self.base64).__name__ != "bool":
        raise ValueError(
            "hash param's base64 {} not supported, should be bool type".format(self.base64))

    LOGGER.debug("Finish EncodeParam check!")
    LOGGER.warning(f"'EncodeParam' will be replaced by 'RAWParam' in future release."
                   f"Please do not rely on current param naming in application.")
    return True

PoissonParam(penalty='L2', tol=0.0001, alpha=1.0, optimizer='rmsprop', batch_size=-1, learning_rate=0.01, init_param=InitParam(), max_iter=20, early_stop='diff', exposure_colname=None, encrypt_param=EncryptParam(), encrypted_mode_calculator_param=EncryptedModeCalculatorParam(), cv_param=CrossValidationParam(), stepwise_param=StepwiseParam(), decay=1, decay_sqrt=True, validation_freqs=None, early_stopping_rounds=None, metrics=None, use_first_metric_only=False, floating_point_precision=23, callback_param=CallbackParam()) ¶

Bases: LinearModelParam

Parameters used for Poisson Regression.

Parameters:

Name	Type	Description	Default
`penalty`		Penalty method used in Poisson. Please note that, when using encrypted version in HeteroPoisson, 'L1' is not supported.	`'L2'`
`tol`	`float, default`	The tolerance of convergence	`0.0001`
`alpha`	`float, default`	Regularization strength coefficient.	`1.0`
`optimizer`		Optimize method	`'rmsprop'`
`batch_size`	`int, default`	Batch size when updating model. -1 means use all data in a batch. i.e. Not to use mini-batch strategy.	`-1`
`learning_rate`	`float, default`	Learning rate	`0.01`
`max_iter`	`int, default`	The maximum iteration for training.	`20`
`init_param`		Init param method object.	`InitParam()`
`early_stop`	`str, 'weight_diff', 'diff' or 'abs', default`	Method used to judge convergence. a) diff： Use difference of loss between two iterations to judge whether converge. b) weight_diff: Use difference between weights of two consecutive iterations c) abs: Use the absolute value of loss to judge whether converge. i.e. if loss < eps, it is converged.	`'diff'`
`exposure_colname`		Name of optional exposure variable in dTable.	`None`
`encrypt_param`		encrypt param	`EncryptParam()`
`encrypted_mode_calculator_param`		encrypted mode calculator param	`EncryptedModeCalculatorParam()`
`cv_param`		cv param	`CrossValidationParam()`
`stepwise_param`		stepwise param	`StepwiseParam()`
`decay`		Decay rate for learning rate. learning rate will follow the following decay schedule. lr = lr0/(1+decayt) if decay_sqrt is False. If decay_sqrt is True, lr = lr0 / sqrt(1+decayt) where t is the iter number.	`1`
`decay_sqrt`		lr = lr0/(1+decayt) if decay_sqrt is False, otherwise, lr = lr0 / sqrt(1+decayt)	`True`
`validation_freqs`		validation frequency during training, required when using early stopping. The default value is None, 1 is suggested. You can set it to a number larger than 1 in order to speed up training by skipping validation rounds. When it is larger than 1, a number which is divisible by "max_iter" is recommended, otherwise, you will miss the validation scores of the last training iteration.	`None`
`early_stopping_rounds`		If positive number specified, at every specified training rounds, program checks for early stopping criteria. Validation_freqs must also be set when using early stopping.	`None`
`metrics`		Specify which metrics to be used when performing evaluation during training process. If metrics have not improved at early_stopping rounds, trianing stops before convergence. If set as empty, default metrics will be used. For regression tasks, default metrics are ['root_mean_squared_error', 'mean_absolute_error']	`None`
`use_first_metric_only`		Indicate whether to use the first metric in `metrics` as the only criterion for early stopping judgement.	`False`
`floating_point_precision`		if not None, use floating_point_precision-bit to speed up calculation, e.g.: convert an x to round(x * 2floating_point_precision) during Paillier operation, divide the result by 2floating_point_precision in the end.	`23`
`callback_param`		callback param	`CallbackParam()`

Source code in python/federatedml/param/poisson_regression_param.py

def __init__(self, penalty='L2',
             tol=1e-4, alpha=1.0, optimizer='rmsprop',
             batch_size=-1, learning_rate=0.01, init_param=InitParam(),
             max_iter=20, early_stop='diff',
             exposure_colname=None,
             encrypt_param=EncryptParam(),
             encrypted_mode_calculator_param=EncryptedModeCalculatorParam(),
             cv_param=CrossValidationParam(), stepwise_param=StepwiseParam(),
             decay=1, decay_sqrt=True,
             validation_freqs=None, early_stopping_rounds=None, metrics=None, use_first_metric_only=False,
             floating_point_precision=23, callback_param=CallbackParam()):
    super(PoissonParam, self).__init__(penalty=penalty, tol=tol, alpha=alpha, optimizer=optimizer,
                                       batch_size=batch_size, learning_rate=learning_rate,
                                       init_param=init_param, max_iter=max_iter,
                                       early_stop=early_stop, cv_param=cv_param, decay=decay,
                                       decay_sqrt=decay_sqrt, validation_freqs=validation_freqs,
                                       early_stopping_rounds=early_stopping_rounds, metrics=metrics,
                                       floating_point_precision=floating_point_precision,
                                       encrypt_param=encrypt_param,
                                       use_first_metric_only=use_first_metric_only,
                                       stepwise_param=stepwise_param,
                                       callback_param=callback_param)
    self.encrypted_mode_calculator_param = copy.deepcopy(encrypted_mode_calculator_param)
    self.exposure_colname = exposure_colname

Attributes¶

encrypted_mode_calculator_param = copy.deepcopy(encrypted_mode_calculator_param) instance-attribute ¶

exposure_colname = exposure_colname instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/poisson_regression_param.py

def check(self):
    descr = "poisson_regression_param's "
    super(PoissonParam, self).check()
    if self.encrypt_param.method != consts.PAILLIER:
        raise ValueError(
            descr + "encrypt method supports 'Paillier' only")
    if self.optimizer not in ['sgd', 'rmsprop', 'adam', 'adagrad']:
        raise ValueError(
            descr + "optimizer not supported, optimizer should be"
                    " 'sgd', 'rmsprop', 'adam', or 'adagrad'")
    if self.exposure_colname is not None:
        if type(self.exposure_colname).__name__ != "str":
            raise ValueError(
                descr + "exposure_colname {} not supported, should be string type".format(self.exposure_colname))
    self.encrypted_mode_calculator_param.check()
    return True

LinearParam(penalty='L2', tol=0.0001, alpha=1.0, optimizer='sgd', batch_size=-1, learning_rate=0.01, init_param=InitParam(), max_iter=20, early_stop='diff', encrypt_param=EncryptParam(), sqn_param=StochasticQuasiNewtonParam(), encrypted_mode_calculator_param=EncryptedModeCalculatorParam(), cv_param=CrossValidationParam(), decay=1, decay_sqrt=True, validation_freqs=None, early_stopping_rounds=None, stepwise_param=StepwiseParam(), metrics=None, use_first_metric_only=False, floating_point_precision=23, callback_param=CallbackParam()) ¶

Bases: LinearModelParam

Parameters used for Linear Regression.

Parameters:

Name	Type	Description	Default
`penalty`		Penalty method used in LinR. Please note that, when using encrypted version in HeteroLinR, 'L1' is not supported. When using Homo-LR, 'L1' is not supported	`'L2' or 'L1'`
`tol`	`float, default`	The tolerance of convergence	`0.0001`
`alpha`	`float, default`	Regularization strength coefficient.	`1.0`
`optimizer`		Optimize method	`'sgd'`
`batch_size`	`int, default`	Batch size when updating model. -1 means use all data in a batch. i.e. Not to use mini-batch strategy.	`-1`
`learning_rate`	`float, default`	Learning rate	`0.01`
`max_iter`	`int, default`	The maximum iteration for training.	`20`
`init_param`		Init param method object.	`InitParam()`
`early_stop`		Method used to judge convergence. a) diff： Use difference of loss between two iterations to judge whether converge. b) abs: Use the absolute value of loss to judge whether converge. i.e. if loss < tol, it is converged. c) weight_diff: Use difference between weights of two consecutive iterations	`'diff'`
`encrypt_param`		encrypt param	`EncryptParam()`
`encrypted_mode_calculator_param`		encrypted mode calculator param	`EncryptedModeCalculatorParam()`
`cv_param`		cv param	`CrossValidationParam()`
`decay`		Decay rate for learning rate. learning rate will follow the following decay schedule. lr = lr0/(1+decayt) if decay_sqrt is False. If decay_sqrt is True, lr = lr0 / sqrt(1+decayt) where t is the iter number.	`1`
`decay_sqrt`		lr = lr0/(1+decayt) if decay_sqrt is False, otherwise, lr = lr0 / sqrt(1+decayt)	`True`
`validation_freqs`		validation frequency during training, required when using early stopping. The default value is None, 1 is suggested. You can set it to a number larger than 1 in order to speed up training by skipping validation rounds. When it is larger than 1, a number which is divisible by "max_iter" is recommended, otherwise, you will miss the validation scores of the last training iteration.	`None`
`early_stopping_rounds`		If positive number specified, at every specified training rounds, program checks for early stopping criteria. Validation_freqs must also be set when using early stopping.	`None`
`metrics`		Specify which metrics to be used when performing evaluation during training process. If metrics have not improved at early_stopping rounds, trianing stops before convergence. If set as empty, default metrics will be used. For regression tasks, default metrics are ['root_mean_squared_error', 'mean_absolute_error']	`None`
`use_first_metric_only`		Indicate whether to use the first metric in `metrics` as the only criterion for early stopping judgement.	`False`
`floating_point_precision`		if not None, use floating_point_precision-bit to speed up calculation, e.g.: convert an x to round(x * 2floating_point_precision) during Paillier operation, divide the result by 2floating_point_precision in the end.	`23`
`callback_param`		callback param	`CallbackParam()`

Source code in python/federatedml/param/linear_regression_param.py

def __init__(self, penalty='L2',
             tol=1e-4, alpha=1.0, optimizer='sgd',
             batch_size=-1, learning_rate=0.01, init_param=InitParam(),
             max_iter=20, early_stop='diff',
             encrypt_param=EncryptParam(), sqn_param=StochasticQuasiNewtonParam(),
             encrypted_mode_calculator_param=EncryptedModeCalculatorParam(),
             cv_param=CrossValidationParam(), decay=1, decay_sqrt=True, validation_freqs=None,
             early_stopping_rounds=None, stepwise_param=StepwiseParam(), metrics=None, use_first_metric_only=False,
             floating_point_precision=23, callback_param=CallbackParam()):
    super(LinearParam, self).__init__(penalty=penalty, tol=tol, alpha=alpha, optimizer=optimizer,
                                      batch_size=batch_size, learning_rate=learning_rate,
                                      init_param=init_param, max_iter=max_iter, early_stop=early_stop,
                                      encrypt_param=encrypt_param, cv_param=cv_param, decay=decay,
                                      decay_sqrt=decay_sqrt, validation_freqs=validation_freqs,
                                      early_stopping_rounds=early_stopping_rounds,
                                      stepwise_param=stepwise_param, metrics=metrics,
                                      use_first_metric_only=use_first_metric_only,
                                      floating_point_precision=floating_point_precision,
                                      callback_param=callback_param)
    self.sqn_param = copy.deepcopy(sqn_param)
    self.encrypted_mode_calculator_param = copy.deepcopy(encrypted_mode_calculator_param)

Attributes¶

sqn_param = copy.deepcopy(sqn_param) instance-attribute ¶

encrypted_mode_calculator_param = copy.deepcopy(encrypted_mode_calculator_param) instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/linear_regression_param.py

def check(self):
    descr = "linear_regression_param's "
    super(LinearParam, self).check()
    if self.optimizer not in ['sgd', 'rmsprop', 'adam', 'adagrad', 'sqn']:
        raise ValueError(
            descr + "optimizer not supported, optimizer should be"
                    " 'sgd', 'rmsprop', 'adam', 'sqn' or 'adagrad'")
    self.sqn_param.check()
    if self.encrypt_param.method != consts.PAILLIER:
        raise ValueError(
            descr + "encrypt method supports 'Paillier' only")
    return True

LogisticParam(penalty='L2', tol=0.0001, alpha=1.0, optimizer='rmsprop', batch_size=-1, shuffle=True, batch_strategy='full', masked_rate=5, learning_rate=0.01, init_param=InitParam(), max_iter=100, early_stop='diff', encrypt_param=EncryptParam(), predict_param=PredictParam(), cv_param=CrossValidationParam(), decay=1, decay_sqrt=True, multi_class='ovr', validation_freqs=None, early_stopping_rounds=None, stepwise_param=StepwiseParam(), floating_point_precision=23, metrics=None, use_first_metric_only=False, callback_param=CallbackParam()) ¶

Bases: LinearModelParam

Parameters used for Logistic Regression both for Homo mode or Hetero mode.

Parameters:

Name	Type	Description	Default
`penalty`		Penalty method used in LR. Please note that, when using encrypted version in HomoLR, 'L1' is not supported.	`'L2'`
`tol`	`float, default`	The tolerance of convergence	`0.0001`
`alpha`	`float, default`	Regularization strength coefficient.	`1.0`
`optimizer`		Optimize method.	`'rmsprop'`
`batch_strategy`	`str`	Strategy to generate batch data. a) full: use full data to generate batch_data, batch_nums every iteration is ceil(data_size / batch_size) b) random: select data randomly from full data, batch_num will be 1 every iteration.	`'full'`
`batch_size`	`int, default`	Batch size when updating model. -1 means use all data in a batch. i.e. Not to use mini-batch strategy.	`-1`
`shuffle`	`bool, default`	Work only in hetero logistic regression, batch data will be shuffle in every iteration.	`True`
`masked_rate`		Use masked data to enhance security of hetero logistic regression	`5`
`learning_rate`	`float, default`	Learning rate	`0.01`
`max_iter`	`int, default`	The maximum iteration for training.	`100`
`early_stop`		Method used to judge converge or not. a) diff： Use difference of loss between two iterations to judge whether converge. b) weight_diff: Use difference between weights of two consecutive iterations c) abs: Use the absolute value of loss to judge whether converge. i.e. if loss < eps, it is converged. Please note that for hetero-lr multi-host situation, this parameter support "weight_diff" only. In homo-lr, weight_diff is not supported	`'diff'`
`decay`		Decay rate for learning rate. learning rate will follow the following decay schedule. lr = lr0/(1+decayt) if decay_sqrt is False. If decay_sqrt is True, lr = lr0 / sqrt(1+decayt) where t is the iter number.	`1`
`decay_sqrt`		lr = lr0/(1+decayt) if decay_sqrt is False, otherwise, lr = lr0 / sqrt(1+decayt)	`True`
`encrypt_param`		encrypt param	`EncryptParam()`
`predict_param`		predict param	`PredictParam()`
`callback_param`		callback param	`CallbackParam()`
`cv_param`		cv param	`CrossValidationParam()`
`multi_class`		If it is a multi_class task, indicate what strategy to use. Currently, support 'ovr' short for one_vs_rest only.	`'ovr'`
`validation_freqs`		validation frequency during training.	`None`
`early_stopping_rounds`		Will stop training if one metric doesn’t improve in last early_stopping_round rounds	`None`
`metrics`		Indicate when executing evaluation during train process, which metrics will be used. If set as empty, default metrics for specific task type will be used. As for binary classification, default metrics are ['auc', 'ks']	`None`
`use_first_metric_only`		Indicate whether use the first metric only for early stopping judgement.	`False`
`floating_point_precision`		if not None, use floating_point_precision-bit to speed up calculation, e.g.: convert an x to round(x * 2floating_point_precision) during Paillier operation, divide the result by 2floating_point_precision in the end.	`23`

Source code in python/federatedml/param/logistic_regression_param.py

def __init__(self, penalty='L2',
             tol=1e-4, alpha=1.0, optimizer='rmsprop',
             batch_size=-1, shuffle=True, batch_strategy="full", masked_rate=5,
             learning_rate=0.01, init_param=InitParam(),
             max_iter=100, early_stop='diff', encrypt_param=EncryptParam(),
             predict_param=PredictParam(), cv_param=CrossValidationParam(),
             decay=1, decay_sqrt=True,
             multi_class='ovr', validation_freqs=None, early_stopping_rounds=None,
             stepwise_param=StepwiseParam(), floating_point_precision=23,
             metrics=None,
             use_first_metric_only=False,
             callback_param=CallbackParam()
             ):
    super(LogisticParam, self).__init__()
    self.penalty = penalty
    self.tol = tol
    self.alpha = alpha
    self.optimizer = optimizer
    self.batch_size = batch_size
    self.learning_rate = learning_rate
    self.init_param = copy.deepcopy(init_param)
    self.max_iter = max_iter
    self.early_stop = early_stop
    self.encrypt_param = encrypt_param
    self.shuffle = shuffle
    self.batch_strategy = batch_strategy
    self.masked_rate = masked_rate
    self.predict_param = copy.deepcopy(predict_param)
    self.cv_param = copy.deepcopy(cv_param)
    self.decay = decay
    self.decay_sqrt = decay_sqrt
    self.multi_class = multi_class
    self.validation_freqs = validation_freqs
    self.stepwise_param = copy.deepcopy(stepwise_param)
    self.early_stopping_rounds = early_stopping_rounds
    self.metrics = metrics or []
    self.use_first_metric_only = use_first_metric_only
    self.floating_point_precision = floating_point_precision
    self.callback_param = copy.deepcopy(callback_param)

Attributes¶

penalty = penalty instance-attribute ¶

tol = tol instance-attribute ¶

alpha = alpha instance-attribute ¶

optimizer = optimizer instance-attribute ¶

batch_size = batch_size instance-attribute ¶

learning_rate = learning_rate instance-attribute ¶

init_param = copy.deepcopy(init_param) instance-attribute ¶

max_iter = max_iter instance-attribute ¶

early_stop = early_stop instance-attribute ¶

encrypt_param = encrypt_param instance-attribute ¶

shuffle = shuffle instance-attribute ¶

batch_strategy = batch_strategy instance-attribute ¶

masked_rate = masked_rate instance-attribute ¶

predict_param = copy.deepcopy(predict_param) instance-attribute ¶

cv_param = copy.deepcopy(cv_param) instance-attribute ¶

decay = decay instance-attribute ¶

decay_sqrt = decay_sqrt instance-attribute ¶

multi_class = multi_class instance-attribute ¶

validation_freqs = validation_freqs instance-attribute ¶

stepwise_param = copy.deepcopy(stepwise_param) instance-attribute ¶

early_stopping_rounds = early_stopping_rounds instance-attribute ¶

metrics = metrics or [] instance-attribute ¶

use_first_metric_only = use_first_metric_only instance-attribute ¶

floating_point_precision = floating_point_precision instance-attribute ¶

callback_param = copy.deepcopy(callback_param) instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/logistic_regression_param.py

def check(self):
    descr = "logistic_param's"
    super(LogisticParam, self).check()
    self.predict_param.check()
    if self.encrypt_param.method not in [consts.PAILLIER, consts.PAILLIER_IPCL, None]:
        raise ValueError(
            "logistic_param's encrypted method support 'Paillier' or None only")
    self.multi_class = self.check_and_change_lower(
        self.multi_class, ["ovr"], f"{descr}")
    if not isinstance(self.masked_rate, (float, int)) or self.masked_rate < 0:
        raise ValueError(
            "masked rate should be non-negative numeric number")
    if not isinstance(self.batch_strategy, str) or self.batch_strategy.lower() not in ["full", "random"]:
        raise ValueError("batch strategy should be full or random")
    self.batch_strategy = self.batch_strategy.lower()
    if not isinstance(self.shuffle, bool):
        raise ValueError("shuffle should be boolean type")
    return True

`ObjectiveParam(objective='cross_entropy', params=None)` ¶

Bases: BaseParam

Define objective parameters that used in federated ml.

Parameters:

Name	Type	Description	Default
`objective`		None in host's config, should be str in guest'config. when task_type is classification, only support 'cross_entropy', other 6 types support in regression task	`None`
`params`	`None or list`	should be non empty list when objective is 'tweedie','fair','huber', first element of list shoulf be a float-number large than 0.0 when objective is 'fair', 'huber', first element of list should be a float-number in [1.0, 2.0) when objective is 'tweedie'	`None`

Source code in python/federatedml/param/boosting_param.py

def __init__(self, objective='cross_entropy', params=None):
    self.objective = objective
    self.params = params

Attributes¶

objective = objective instance-attribute ¶

params = params instance-attribute ¶

Functions¶

check(task_type=None) ¶

Source code in python/federatedml/param/boosting_param.py

def check(self, task_type=None):
    if self.objective is None:
        return True

    descr = "objective param's"

    LOGGER.debug('check objective {}'.format(self.objective))

    if task_type not in [consts.CLASSIFICATION, consts.REGRESSION]:
        self.objective = self.check_and_change_lower(self.objective,
                                                     ["cross_entropy", "lse", "lae", "huber", "fair",
                                                      "log_cosh", "tweedie"],
                                                     descr)

    if task_type == consts.CLASSIFICATION:
        if self.objective != "cross_entropy":
            raise ValueError("objective param's objective {} not supported".format(self.objective))

    elif task_type == consts.REGRESSION:
        self.objective = self.check_and_change_lower(self.objective,
                                                     ["lse", "lae", "huber", "fair", "log_cosh", "tweedie"],
                                                     descr)

        params = self.params
        if self.objective in ["huber", "fair", "tweedie"]:
            if type(params).__name__ != 'list' or len(params) < 1:
                raise ValueError(
                    "objective param's params {} not supported, should be non-empty list".format(params))

            if type(params[0]).__name__ not in ["float", "int", "long"]:
                raise ValueError("objective param's params[0] {} not supported".format(self.params[0]))

            if self.objective == 'tweedie':
                if params[0] < 1 or params[0] >= 2:
                    raise ValueError("in tweedie regression, objective params[0] should betweend [1, 2)")

            if self.objective == 'fair' or 'huber':
                if params[0] <= 0.0:
                    raise ValueError("in {} regression, objective params[0] should greater than 0.0".format(
                        self.objective))
    return True

`FTLParam(alpha=1, tol=1e-06, n_iter_no_change=False, validation_freqs=None, optimizer={'optimizer': 'Adam', 'learning_rate': 0.01}, nn_define={}, epochs=1, intersect_param=IntersectParam(consts.RSA), config_type='keras', batch_size=-1, encrypte_param=EncryptParam(), encrypted_mode_calculator_param=EncryptedModeCalculatorParam(mode='confusion_opt'), predict_param=PredictParam(), mode='plain', communication_efficient=False, local_round=5, callback_param=CallbackParam())` ¶

Bases: BaseParam

Parameters:

Name	Type	Description	Default
`alpha`	`float`	a loss coefficient defined in paper, it defines the importance of alignment loss	`1`
`tol`	`float`	loss tolerance	`1e-06`
`n_iter_no_change`	`bool`	check loss convergence or not	`False`
`validation_freqs`	`None or positive integer or container object in python`	Do validation in training process or Not. if equals None, will not do validation in train process; if equals positive integer, will validate data every validation_freqs epochs passes; if container object in python, will validate data if epochs belong to this container. e.g. validation_freqs = [10, 15], will validate data when epoch equals to 10 and 15. The default value is None, 1 is suggested. You can set it to a number larger than 1 in order to speed up training by skipping validation rounds. When it is larger than 1, a number which is divisible by "epochs" is recommended, otherwise, you will miss the validation scores of last training epoch.	`None`
`optimizer`	`str or dict`	optimizer method, accept following types: 1. a string, one of "Adadelta", "Adagrad", "Adam", "Adamax", "Nadam", "RMSprop", "SGD" 2. a dict, with a required key-value pair keyed by "optimizer", with optional key-value pairs such as learning rate. defaults to "SGD"	`{'optimizer': 'Adam', 'learning_rate': 0.01}`
`nn_define`	`dict`	a dict represents the structure of neural network, it can be output by tf-keras	`{}`
`epochs`	`int`	epochs num	`1`
`intersect_param`		define the intersect method	`IntersectParam(consts.RSA)`
`config_type`		config type	`'tf-keras'`
`batch_size`	`int`	batch size when computing transformed feature embedding, -1 use full data.	`-1`
`encrypte_param`		encrypted param	`EncryptParam()`
`encrypted_mode_calculator_param`		encrypted mode calculator param:	`EncryptedModeCalculatorParam(mode='confusion_opt')`
`predict_param`		predict param	`PredictParam()`
`mode`			`'plain'`
`communication_efficient`		will use communication efficient or not. when communication efficient is enabled, FTL model will update gradients by several local rounds using intermediate data	`False`
`local_round`		local update round when using communication efficient	`5`

Source code in python/federatedml/param/ftl_param.py

def __init__(self, alpha=1, tol=0.000001,
             n_iter_no_change=False, validation_freqs=None, optimizer={'optimizer': 'Adam', 'learning_rate': 0.01},
             nn_define={}, epochs=1, intersect_param=IntersectParam(consts.RSA), config_type='keras', batch_size=-1,
             encrypte_param=EncryptParam(),
             encrypted_mode_calculator_param=EncryptedModeCalculatorParam(mode="confusion_opt"),
             predict_param=PredictParam(), mode='plain', communication_efficient=False,
             local_round=5, callback_param=CallbackParam()):
    """
    Parameters
    ----------
    alpha : float
        a loss coefficient defined in paper, it defines the importance of alignment loss
    tol : float
        loss tolerance
    n_iter_no_change : bool
        check loss convergence or not
    validation_freqs : None or positive integer or container object in python
        Do validation in training process or Not.
        if equals None, will not do validation in train process;
        if equals positive integer, will validate data every validation_freqs epochs passes;
        if container object in python, will validate data if epochs belong to this container.
        e.g. validation_freqs = [10, 15], will validate data when epoch equals to 10 and 15.
        The default value is None, 1 is suggested. You can set it to a number larger than 1 in order to
        speed up training by skipping validation rounds. When it is larger than 1, a number which is
        divisible by "epochs" is recommended, otherwise, you will miss the validation scores
        of last training epoch.
    optimizer : str or dict
        optimizer method, accept following types:
        1. a string, one of "Adadelta", "Adagrad", "Adam", "Adamax", "Nadam", "RMSprop", "SGD"
        2. a dict, with a required key-value pair keyed by "optimizer",
            with optional key-value pairs such as learning rate.
        defaults to "SGD"
    nn_define : dict
        a dict represents the structure of neural network, it can be output by tf-keras
    epochs : int
        epochs num
    intersect_param
        define the intersect method
    config_type : {'tf-keras'}
        config type
    batch_size : int
        batch size when computing transformed feature embedding, -1 use full data.
    encrypte_param
        encrypted param
    encrypted_mode_calculator_param
        encrypted mode calculator param:
    predict_param
        predict param
    mode: {"plain", "encrypted"}
        plain: will not use any encrypt algorithms, data exchanged in plaintext
        encrypted: use paillier to encrypt gradients
    communication_efficient: bool
        will use communication efficient or not. when communication efficient is enabled, FTL model will
        update gradients by several local rounds using intermediate data
    local_round: int
        local update round when using communication efficient
    """

    super(FTLParam, self).__init__()
    self.alpha = alpha
    self.tol = tol
    self.n_iter_no_change = n_iter_no_change
    self.validation_freqs = validation_freqs
    self.optimizer = optimizer
    self.nn_define = nn_define
    self.epochs = epochs
    self.intersect_param = copy.deepcopy(intersect_param)
    self.config_type = config_type
    self.batch_size = batch_size
    self.encrypted_mode_calculator_param = copy.deepcopy(encrypted_mode_calculator_param)
    self.encrypt_param = copy.deepcopy(encrypte_param)
    self.predict_param = copy.deepcopy(predict_param)
    self.mode = mode
    self.communication_efficient = communication_efficient
    self.local_round = local_round
    self.callback_param = copy.deepcopy(callback_param)

Attributes¶

alpha = alpha instance-attribute ¶

tol = tol instance-attribute ¶

n_iter_no_change = n_iter_no_change instance-attribute ¶

validation_freqs = validation_freqs instance-attribute ¶

optimizer = optimizer instance-attribute ¶

nn_define = nn_define instance-attribute ¶

epochs = epochs instance-attribute ¶

intersect_param = copy.deepcopy(intersect_param) instance-attribute ¶

config_type = config_type instance-attribute ¶

batch_size = batch_size instance-attribute ¶

encrypted_mode_calculator_param = copy.deepcopy(encrypted_mode_calculator_param) instance-attribute ¶

encrypt_param = copy.deepcopy(encrypte_param) instance-attribute ¶

predict_param = copy.deepcopy(predict_param) instance-attribute ¶

mode = mode instance-attribute ¶

communication_efficient = communication_efficient instance-attribute ¶

local_round = local_round instance-attribute ¶

callback_param = copy.deepcopy(callback_param) instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/ftl_param.py

def check(self):
    self.intersect_param.check()
    self.encrypt_param.check()
    self.encrypted_mode_calculator_param.check()

    self.optimizer = self._parse_optimizer(self.optimizer)

    supported_config_type = ["keras"]
    if self.config_type not in supported_config_type:
        raise ValueError(f"config_type should be one of {supported_config_type}")

    if not isinstance(self.tol, (int, float)):
        raise ValueError("tol should be numeric")

    if not isinstance(self.epochs, int) or self.epochs <= 0:
        raise ValueError("epochs should be a positive integer")

    if self.nn_define and not isinstance(self.nn_define, dict):
        raise ValueError("bottom_nn_define should be a dict defining the structure of neural network")

    if self.batch_size != -1:
        if not isinstance(self.batch_size, int) \
                or self.batch_size < consts.MIN_BATCH_SIZE:
            raise ValueError(
                " {} not supported, should be larger than 10 or -1 represent for all data".format(self.batch_size))

    for p in deprecated_param_list:
        # if self._warn_to_deprecate_param(p, "", ""):
        if self._deprecated_params_set.get(p):
            if "callback_param" in self.get_user_feeded():
                raise ValueError(f"{p} and callback param should not be set simultaneously，"
                                 f"{self._deprecated_params_set}, {self.get_user_feeded()}")
            else:
                self.callback_param.callbacks = ["PerformanceEvaluate"]
            break

    descr = "ftl's"

    if self._warn_to_deprecate_param("validation_freqs", descr, "callback_param's 'validation_freqs'"):
        self.callback_param.validation_freqs = self.validation_freqs

    if self._warn_to_deprecate_param("metrics", descr, "callback_param's 'metrics'"):
        self.callback_param.metrics = self.metrics

    if self.validation_freqs is None:
        pass
    elif isinstance(self.validation_freqs, int):
        if self.validation_freqs < 1:
            raise ValueError("validation_freqs should be larger than 0 when it's integer")
    elif not isinstance(self.validation_freqs, collections.Container):
        raise ValueError("validation_freqs should be None or positive integer or container")

    assert isinstance(self.communication_efficient, bool), 'communication efficient must be a boolean'
    assert self.mode in [
        'encrypted', 'plain'], 'mode options: encrpyted or plain, but {} is offered'.format(
        self.mode)

    self.check_positive_integer(self.epochs, 'epochs')
    self.check_positive_number(self.alpha, 'alpha')
    self.check_positive_integer(self.local_round, 'local round')

`HomoNNParam(trainer=TrainerParam(), dataset=DatasetParam(), torch_seed=100, nn_define=None, loss=None, optimizer=None, ds_config=None)` ¶

Bases: BaseParam

Source code in python/federatedml/param/homo_nn_param.py

def __init__(self,
             trainer: TrainerParam = TrainerParam(),
             dataset: DatasetParam = DatasetParam(),
             torch_seed: int = 100,
             nn_define: dict = None,
             loss: dict = None,
             optimizer: dict = None,
             ds_config: dict = None
             ):

    super(HomoNNParam, self).__init__()
    self.trainer = trainer
    self.dataset = dataset
    self.torch_seed = torch_seed
    self.nn_define = nn_define
    self.loss = loss
    self.optimizer = optimizer
    self.ds_config = ds_config

Attributes¶

trainer = trainer instance-attribute ¶

dataset = dataset instance-attribute ¶

torch_seed = torch_seed instance-attribute ¶

nn_define = nn_define instance-attribute ¶

loss = loss instance-attribute ¶

optimizer = optimizer instance-attribute ¶

ds_config = ds_config instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/homo_nn_param.py

def check(self):

    assert isinstance(self.trainer, TrainerParam), 'trainer must be a TrainerParam()'
    assert isinstance(self.dataset, DatasetParam), 'dataset must be a DatasetParam()'

    self.trainer.check()
    self.dataset.check()

    # torch seed >= 0
    if isinstance(self.torch_seed, int):
        assert self.torch_seed >= 0, 'torch seed should be an int >=0'
    else:
        raise ValueError('torch seed should be an int >=0')

    if self.nn_define is not None:
        assert isinstance(self.nn_define, dict), 'nn define should be a dict defining model structures'
    if self.loss is not None:
        assert isinstance(self.loss, dict), 'loss parameter should be a loss config dict'
    if self.optimizer is not None:
        assert isinstance(self.optimizer, dict), 'optimizer parameter should be a config dict'

`DecisionTreeParam(criterion_method='xgboost', criterion_params=[0.1, 0], max_depth=3, min_sample_split=2, min_impurity_split=0.001, min_leaf_node=1, max_split_nodes=consts.MAX_SPLIT_NODES, feature_importance_type='split', n_iter_no_change=True, tol=0.001, min_child_weight=0, use_missing=False, zero_as_missing=False, deterministic=False)` ¶

Bases: BaseParam

Define decision tree parameters that used in federated ml.

Parameters:

Name	Description	Default
`criterion_method`	the criterion function to use	`"xgboost"`
`criterion_params`	should be non empty and elements are float-numbers, if a list is offered, the first one is l2 regularization value, and the second one is l1 regularization value. if a dict is offered, make sure it contains key 'l1', and 'l2'. l1, l2 regularization values are non-negative floats. default: [0.1, 0] or {'l1':0, 'l2':0,1}	`[0.1, 0]`
`max_depth`	the max depth of a decision tree, default: 3	`3`
`min_sample_split`	least quantity of nodes to split, default: 2	`2`
`min_impurity_split`	least gain of a single split need to reach, default: 1e-3	`0.001`
`min_child_weight`	sum of hessian needed in child nodes. default is 0	`0`
`min_leaf_node`	when samples no more than min_leaf_node, it becomes a leave, default: 1	`1`
`max_split_nodes`	we will use no more than max_split_nodes to parallel finding their splits in a batch, for memory consideration. default is 65536	`consts.MAX_SPLIT_NODES`
`feature_importance_type`	if is 'split', feature_importances calculate by feature split times, if is 'gain', feature_importances calculate by feature split gain. default: 'split' Due to the safety concern, we adjust training strategy of Hetero-SBT in FATE-1.8, When running Hetero-SBT, this parameter is now abandoned. In Hetero-SBT of FATE-1.8, guest side will compute split, gain of local features, and receive anonymous feature importance results from hosts. Hosts will compute split importance of local features.	`'split'`
`use_missing`	use missing value in training process or not.	`False`
`zero_as_missing`	regard 0 as missing value or not, will be use only if use_missing=True, default: False	`False`
`deterministic`	ensure stability when computing histogram. Set this to true to ensure stable result when using same data and same parameter. But it may slow down computation.	`False`

Source code in python/federatedml/param/boosting_param.py

def __init__(self, criterion_method="xgboost", criterion_params=[0.1, 0], max_depth=3,
             min_sample_split=2, min_impurity_split=1e-3, min_leaf_node=1,
             max_split_nodes=consts.MAX_SPLIT_NODES, feature_importance_type='split',
             n_iter_no_change=True, tol=0.001, min_child_weight=0,
             use_missing=False, zero_as_missing=False, deterministic=False):

    super(DecisionTreeParam, self).__init__()

    self.criterion_method = criterion_method
    self.criterion_params = criterion_params
    self.max_depth = max_depth
    self.min_sample_split = min_sample_split
    self.min_impurity_split = min_impurity_split
    self.min_leaf_node = min_leaf_node
    self.min_child_weight = min_child_weight
    self.max_split_nodes = max_split_nodes
    self.feature_importance_type = feature_importance_type
    self.n_iter_no_change = n_iter_no_change
    self.tol = tol
    self.use_missing = use_missing
    self.zero_as_missing = zero_as_missing
    self.deterministic = deterministic

Attributes¶

criterion_method = criterion_method instance-attribute ¶

criterion_params = criterion_params instance-attribute ¶

max_depth = max_depth instance-attribute ¶

min_sample_split = min_sample_split instance-attribute ¶

min_impurity_split = min_impurity_split instance-attribute ¶

min_leaf_node = min_leaf_node instance-attribute ¶

min_child_weight = min_child_weight instance-attribute ¶

max_split_nodes = max_split_nodes instance-attribute ¶

feature_importance_type = feature_importance_type instance-attribute ¶

n_iter_no_change = n_iter_no_change instance-attribute ¶

tol = tol instance-attribute ¶

use_missing = use_missing instance-attribute ¶

zero_as_missing = zero_as_missing instance-attribute ¶

deterministic = deterministic instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/boosting_param.py

def check(self):
    descr = "decision tree param"

    self.criterion_method = self.check_and_change_lower(self.criterion_method,
                                                        ["xgboost"],
                                                        descr)

    if len(self.criterion_params) == 0:
        raise ValueError("decisition tree param's criterio_params should be non empty")

    if isinstance(self.criterion_params, list):
        assert len(self.criterion_params) == 2, 'length of criterion_param should be 2: l1, l2 regularization ' \
                                                'values are needed'
        self.check_nonnegative_number(self.criterion_params[0], 'l2 reg value')
        self.check_nonnegative_number(self.criterion_params[1], 'l1 reg value')

    elif isinstance(self.criterion_params, dict):
        assert 'l1' in self.criterion_params and 'l2' in self.criterion_params, 'l1 and l2 keys are needed in ' \
                                                                                'criterion_params dict'
        self.criterion_params = [self.criterion_params['l2'], self.criterion_params['l1']]
    else:
        raise ValueError('criterion_params should be a dict or a list contains l1, l2 reg value')

    if type(self.max_depth).__name__ not in ["int", "long"]:
        raise ValueError("decision tree param's max_depth {} not supported, should be integer".format(
            self.max_depth))

    if self.max_depth < 1:
        raise ValueError("decision tree param's max_depth should be positive integer, no less than 1")

    if type(self.min_sample_split).__name__ not in ["int", "long"]:
        raise ValueError("decision tree param's min_sample_split {} not supported, should be integer".format(
            self.min_sample_split))

    if type(self.min_impurity_split).__name__ not in ["int", "long", "float"]:
        raise ValueError("decision tree param's min_impurity_split {} not supported, should be numeric".format(
            self.min_impurity_split))

    if type(self.min_leaf_node).__name__ not in ["int", "long"]:
        raise ValueError("decision tree param's min_leaf_node {} not supported, should be integer".format(
            self.min_leaf_node))

    if type(self.max_split_nodes).__name__ not in ["int", "long"] or self.max_split_nodes < 1:
        raise ValueError("decision tree param's max_split_nodes {} not supported, " +
                         "should be positive integer between 1 and {}".format(self.max_split_nodes,
                                                                              consts.MAX_SPLIT_NODES))

    if type(self.n_iter_no_change).__name__ != "bool":
        raise ValueError("decision tree param's n_iter_no_change {} not supported, should be bool type".format(
            self.n_iter_no_change))

    if type(self.tol).__name__ not in ["float", "int", "long"]:
        raise ValueError("decision tree param's tol {} not supported, should be numeric".format(self.tol))

    self.feature_importance_type = self.check_and_change_lower(self.feature_importance_type,
                                                               ["split", "gain"],
                                                               descr)
    self.check_nonnegative_number(self.min_child_weight, 'min_child_weight')
    self.check_boolean(self.deterministic, 'deterministic')

    return True

`RSAParam(salt='', hash_method='sha256', final_hash_method='sha256', split_calculation=False, random_base_fraction=None, key_length=consts.DEFAULT_KEY_LENGTH, random_bit=DEFAULT_RANDOM_BIT)` ¶

Bases: BaseParam

Specify parameters for RSA intersect method

Parameters:

Name	Description	Default
`salt`	the src id will be str = str + salt, default ''	`''`
`hash_method`	the hash method of src id, support sha256, sha384, sha512, sm3, default sha256	`'sha256'`
`final_hash_method`	the hash method of result data string, support md5, sha1, sha224, sha256, sha384, sha512, sm3, default sha256	`'sha256'`
`split_calculation`	if True, Host & Guest split operations for faster performance, recommended on large data set	`False`
`random_base_fraction`	if not None, generate (fraction * public key id count) of r for encryption and reuse generated r; note that value greater than 0.99 will be taken as 1, and value less than 0.01 will be rounded up to 0.01	`None`
`key_length`	value >= 1024, bit count of rsa key, default 1024	`consts.DEFAULT_KEY_LENGTH`
`random_bit`	it will define the size of blinding factor in rsa algorithm, default 128	`DEFAULT_RANDOM_BIT`

Source code in python/federatedml/param/intersect_param.py

def __init__(self, salt='', hash_method='sha256', final_hash_method='sha256',
             split_calculation=False, random_base_fraction=None, key_length=consts.DEFAULT_KEY_LENGTH,
             random_bit=DEFAULT_RANDOM_BIT):
    super().__init__()
    self.salt = salt
    self.hash_method = hash_method
    self.final_hash_method = final_hash_method
    self.split_calculation = split_calculation
    self.random_base_fraction = random_base_fraction
    self.key_length = key_length
    self.random_bit = random_bit

Attributes¶

salt = salt instance-attribute ¶

hash_method = hash_method instance-attribute ¶

final_hash_method = final_hash_method instance-attribute ¶

split_calculation = split_calculation instance-attribute ¶

random_base_fraction = random_base_fraction instance-attribute ¶

key_length = key_length instance-attribute ¶

random_bit = random_bit instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/intersect_param.py

def check(self):
    descr = "rsa param's "
    self.check_string(self.salt, f"{descr}salt")

    self.hash_method = self.check_and_change_lower(self.hash_method,
                                                   [consts.SHA256, consts.SHA384, consts.SHA512, consts.SM3],
                                                   f"{descr}hash_method")

    self.final_hash_method = self.check_and_change_lower(self.final_hash_method,
                                                         [consts.MD5, consts.SHA1, consts.SHA224,
                                                          consts.SHA256, consts.SHA384, consts.SHA512,
                                                          consts.SM3],
                                                         f"{descr}final_hash_method")

    self.check_boolean(self.split_calculation, f"{descr}split_calculation")

    if self.random_base_fraction:
        self.check_positive_number(self.random_base_fraction, descr)
        self.check_decimal_float(self.random_base_fraction, f"{descr}random_base_fraction")

    self.check_positive_integer(self.key_length, f"{descr}key_length")
    if self.key_length < 1024:
        raise ValueError(f"key length must be >= 1024")
    self.check_positive_integer(self.random_bit, f"{descr}random_bit")

    LOGGER.debug("Finish RSAParam parameter check!")
    return True

`FeatureBinningParam(method=consts.QUANTILE, compress_thres=consts.DEFAULT_COMPRESS_THRESHOLD, head_size=consts.DEFAULT_HEAD_SIZE, error=consts.DEFAULT_RELATIVE_ERROR, bin_num=consts.G_BIN_NUM, bin_indexes=-1, bin_names=None, adjustment_factor=0.5, transform_param=TransformParam(), local_only=False, category_indexes=None, category_names=None, need_run=True, skip_static=False)` ¶

Bases: BaseParam

Define the feature binning method

Parameters:

Name	Type	Description	Default
`method`	`str, quantile`	Binning method.	`consts.QUANTILE`
`compress_thres`		When the number of saved summaries exceed this threshold, it will call its compress function	`consts.DEFAULT_COMPRESS_THRESHOLD`
`head_size`		The buffer size to store inserted observations. When head list reach this buffer size, the QuantileSummaries object start to generate summary(or stats) and insert into its sampled list.	`consts.DEFAULT_HEAD_SIZE`
`error`		The error of tolerance of binning. The final split point comes from original data, and the rank of this value is close to the exact rank. More precisely, floor((p - 2 * error) * N) <= rank(x) <= ceil((p + 2 * error) * N) where p is the quantile in float, and N is total number of data.	`consts.DEFAULT_RELATIVE_ERROR`
`bin_num`		The max bin number for binning	`consts.G_BIN_NUM`
`bin_indexes`	`list of int or int, default`	Specify which columns need to be binned. -1 represent for all columns. If you need to indicate specific cols, provide a list of header index instead of -1. Note tha columns specified by `bin_indexes` and `bin_names` will be combined.	`-1`
`bin_names`	`list of string, default`	Specify which columns need to calculated. Each element in the list represent for a column name in header. Note tha columns specified by `bin_indexes` and `bin_names` will be combined.	`None`
`adjustment_factor`	`float, default`	the adjustment factor when calculating WOE. This is useful when there is no event or non-event in a bin. Please note that this parameter will NOT take effect for setting in host.	`0.5`
`category_indexes`	`list of int or int, default`	Specify which columns are category features. -1 represent for all columns. List of int indicate a set of such features. For category features, bin_obj will take its original values as split_points and treat them as have been binned. If this is not what you expect, please do NOT put it into this parameters. The number of categories should not exceed bin_num set above. Note tha columns specified by `category_indexes` and `category_names` will be combined.	`None`
`category_names`	`list of string, default`	Use column names to specify category features. Each element in the list represent for a column name in header. Note tha columns specified by `category_indexes` and `category_names` will be combined.	`None`
`local_only`	`bool, default`	Whether just provide binning method to guest party. If true, host party will do nothing. Warnings: This parameter will be deprecated in future version.	`False`
`transform_param`		Define how to transfer the binned data.	`TransformParam()`
`need_run`		Indicate if this module needed to be run	`True`
`skip_static`		If true, binning will not calculate iv, woe etc. In this case, optimal-binning will not be supported.	`False`

Source code in python/federatedml/param/feature_binning_param.py

def __init__(self, method=consts.QUANTILE,
             compress_thres=consts.DEFAULT_COMPRESS_THRESHOLD,
             head_size=consts.DEFAULT_HEAD_SIZE,
             error=consts.DEFAULT_RELATIVE_ERROR,
             bin_num=consts.G_BIN_NUM, bin_indexes=-1, bin_names=None, adjustment_factor=0.5,
             transform_param=TransformParam(),
             local_only=False,
             category_indexes=None, category_names=None,
             need_run=True, skip_static=False):
    super(FeatureBinningParam, self).__init__()
    self.method = method
    self.compress_thres = compress_thres
    self.head_size = head_size
    self.error = error
    self.adjustment_factor = adjustment_factor
    self.bin_num = bin_num
    self.bin_indexes = bin_indexes
    self.bin_names = bin_names
    self.category_indexes = category_indexes
    self.category_names = category_names
    self.transform_param = copy.deepcopy(transform_param)
    self.need_run = need_run
    self.skip_static = skip_static
    self.local_only = local_only

Attributes¶

method = method instance-attribute ¶

compress_thres = compress_thres instance-attribute ¶

head_size = head_size instance-attribute ¶

error = error instance-attribute ¶

adjustment_factor = adjustment_factor instance-attribute ¶

bin_num = bin_num instance-attribute ¶

bin_indexes = bin_indexes instance-attribute ¶

bin_names = bin_names instance-attribute ¶

category_indexes = category_indexes instance-attribute ¶

category_names = category_names instance-attribute ¶

transform_param = copy.deepcopy(transform_param) instance-attribute ¶

need_run = need_run instance-attribute ¶

skip_static = skip_static instance-attribute ¶

local_only = local_only instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/feature_binning_param.py

def check(self):
    descr = "Binning param's"
    self.check_string(self.method, descr)
    self.method = self.method.lower()
    self.check_positive_integer(self.compress_thres, descr)
    self.check_positive_integer(self.head_size, descr)
    self.check_decimal_float(self.error, descr)
    self.check_positive_integer(self.bin_num, descr)
    if self.bin_indexes != -1:
        self.check_defined_type(self.bin_indexes, descr, ['list', 'RepeatedScalarContainer', "NoneType"])
    self.check_defined_type(self.bin_names, descr, ['list', "NoneType"])
    self.check_defined_type(self.category_indexes, descr, ['list', "NoneType"])
    self.check_defined_type(self.category_names, descr, ['list', "NoneType"])
    self.check_open_unit_interval(self.adjustment_factor, descr)
    self.check_boolean(self.local_only, descr)

HeteroNNParam(task_type='classification', bottom_nn_define=None, top_nn_define=None, interactive_layer_define=None, interactive_layer_lr=0.9, config_type='pytorch', optimizer='SGD', loss=None, epochs=100, batch_size=-1, early_stop='diff', tol=1e-05, seed=100, encrypt_param=EncryptParam(), encrypted_mode_calculator_param=EncryptedModeCalculatorParam(), predict_param=PredictParam(), cv_param=CrossValidationParam(), validation_freqs=None, early_stopping_rounds=None, metrics=None, use_first_metric_only=True, selector_param=SelectorParam(), floating_point_precision=23, callback_param=CallbackParam(), coae_param=CoAEConfuserParam(), dataset=DatasetParam()) ¶

Bases: BaseParam

Parameters used for Hetero Neural Network.

Parameters:

Name	Type	Description	Default
`task_type`			`'classification'`
`bottom_nn_define`			`None`
`interactive_layer_define`			`None`
`interactive_layer_lr`			`0.9`
`top_nn_define`			`None`
`optimizer`		a string, one of "Adadelta", "Adagrad", "Adam", "Adamax", "Nadam", "RMSprop", "SGD" a dict, with a required key-value pair keyed by "optimizer", with optional key-value pairs such as learning rate. defaults to "SGD".	`'SGD'`
`loss`			`None`
`epochs`			`100`
`batch_size`	`int, batch size when updating model.`	-1 means use all data in a batch. i.e. Not to use mini-batch strategy. defaults to -1.	`-1`
`early_stop`	`str, accept 'diff' only in this version, default`	Method used to judge converge or not. a) diff： Use difference of loss between two iterations to judge whether converge.	`'diff'`
`floating_point_precision`		e.g.: convert an x to round(x * 2floating_point_precision) during Paillier operation, divide the result by 2floating_point_precision in the end.	`23`
`callback_param`			`CallbackParam()`

Source code in python/federatedml/param/hetero_nn_param.py

def __init__(self,
             task_type='classification',
             bottom_nn_define=None,
             top_nn_define=None,
             interactive_layer_define=None,
             interactive_layer_lr=0.9,
             config_type='pytorch',
             optimizer='SGD',
             loss=None,
             epochs=100,
             batch_size=-1,
             early_stop="diff",
             tol=1e-5,
             seed=100,
             encrypt_param=EncryptParam(),
             encrypted_mode_calculator_param=EncryptedModeCalculatorParam(),
             predict_param=PredictParam(),
             cv_param=CrossValidationParam(),
             validation_freqs=None,
             early_stopping_rounds=None,
             metrics=None,
             use_first_metric_only=True,
             selector_param=SelectorParam(),
             floating_point_precision=23,
             callback_param=CallbackParam(),
             coae_param=CoAEConfuserParam(),
             dataset=DatasetParam()
             ):

    super(HeteroNNParam, self).__init__()

    self.task_type = task_type
    self.bottom_nn_define = bottom_nn_define
    self.interactive_layer_define = interactive_layer_define
    self.interactive_layer_lr = interactive_layer_lr
    self.top_nn_define = top_nn_define
    self.batch_size = batch_size
    self.epochs = epochs
    self.early_stop = early_stop
    self.tol = tol
    self.optimizer = optimizer
    self.loss = loss
    self.validation_freqs = validation_freqs
    self.early_stopping_rounds = early_stopping_rounds
    self.metrics = metrics or []
    self.use_first_metric_only = use_first_metric_only
    self.encrypt_param = copy.deepcopy(encrypt_param)
    self.encrypted_model_calculator_param = encrypted_mode_calculator_param
    self.predict_param = copy.deepcopy(predict_param)
    self.cv_param = copy.deepcopy(cv_param)
    self.selector_param = selector_param
    self.floating_point_precision = floating_point_precision
    self.callback_param = copy.deepcopy(callback_param)
    self.coae_param = coae_param
    self.dataset = dataset
    self.seed = seed
    self.config_type = 'pytorch'  # pytorch only

Attributes¶

task_type = task_type instance-attribute ¶

bottom_nn_define = bottom_nn_define instance-attribute ¶

interactive_layer_define = interactive_layer_define instance-attribute ¶

interactive_layer_lr = interactive_layer_lr instance-attribute ¶

top_nn_define = top_nn_define instance-attribute ¶

batch_size = batch_size instance-attribute ¶

epochs = epochs instance-attribute ¶

early_stop = early_stop instance-attribute ¶

tol = tol instance-attribute ¶

optimizer = optimizer instance-attribute ¶

loss = loss instance-attribute ¶

validation_freqs = validation_freqs instance-attribute ¶

early_stopping_rounds = early_stopping_rounds instance-attribute ¶

metrics = metrics or [] instance-attribute ¶

use_first_metric_only = use_first_metric_only instance-attribute ¶

encrypt_param = copy.deepcopy(encrypt_param) instance-attribute ¶

encrypted_model_calculator_param = encrypted_mode_calculator_param instance-attribute ¶

predict_param = copy.deepcopy(predict_param) instance-attribute ¶

cv_param = copy.deepcopy(cv_param) instance-attribute ¶

selector_param = selector_param instance-attribute ¶

floating_point_precision = floating_point_precision instance-attribute ¶

callback_param = copy.deepcopy(callback_param) instance-attribute ¶

coae_param = coae_param instance-attribute ¶

dataset = dataset instance-attribute ¶

seed = seed instance-attribute ¶

config_type = 'pytorch' instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/hetero_nn_param.py

def check(self):

    assert isinstance(self.dataset, DatasetParam), 'dataset must be a DatasetParam()'

    self.dataset.check()

    self.check_positive_integer(self.seed, 'seed')

    if self.task_type not in ["classification", "regression"]:
        raise ValueError("config_type should be classification or regression")

    if not isinstance(self.tol, (int, float)):
        raise ValueError("tol should be numeric")

    if not isinstance(self.epochs, int) or self.epochs <= 0:
        raise ValueError("epochs should be a positive integer")

    if self.bottom_nn_define and not isinstance(self.bottom_nn_define, dict):
        raise ValueError("bottom_nn_define should be a dict defining the structure of neural network")

    if self.top_nn_define and not isinstance(self.top_nn_define, dict):
        raise ValueError("top_nn_define should be a dict defining the structure of neural network")

    if self.interactive_layer_define is not None and not isinstance(self.interactive_layer_define, dict):
        raise ValueError(
            "the interactive_layer_define should be a dict defining the structure of interactive layer")

    if self.batch_size != -1:
        if not isinstance(self.batch_size, int) \
                or self.batch_size < consts.MIN_BATCH_SIZE:
            raise ValueError(
                " {} not supported, should be larger than 10 or -1 represent for all data".format(self.batch_size))

    if self.early_stop != "diff":
        raise ValueError("early stop should be diff in this version")

    if self.metrics is not None and not isinstance(self.metrics, list):
        raise ValueError("metrics should be a list")

    if self.floating_point_precision is not None and \
            (not isinstance(self.floating_point_precision, int) or
             self.floating_point_precision < 0 or self.floating_point_precision > 63):
        raise ValueError("floating point precision should be null or a integer between 0 and 63")

    self.encrypt_param.check()
    self.encrypted_model_calculator_param.check()
    self.predict_param.check()
    self.selector_param.check()
    self.coae_param.check()

    descr = "hetero nn param's "

    for p in ["early_stopping_rounds", "validation_freqs",
              "use_first_metric_only"]:
        if self._deprecated_params_set.get(p):
            if "callback_param" in self.get_user_feeded():
                raise ValueError(f"{p} and callback param should not be set simultaneously，"
                                 f"{self._deprecated_params_set}, {self.get_user_feeded()}")
            else:
                self.callback_param.callbacks = ["PerformanceEvaluate"]
            break

    if self._warn_to_deprecate_param("validation_freqs", descr, "callback_param's 'validation_freqs'"):
        self.callback_param.validation_freqs = self.validation_freqs

    if self._warn_to_deprecate_param("early_stopping_rounds", descr, "callback_param's 'early_stopping_rounds'"):
        self.callback_param.early_stopping_rounds = self.early_stopping_rounds

    if self._warn_to_deprecate_param("metrics", descr, "callback_param's 'metrics'"):
        if self.metrics:
            self.callback_param.metrics = self.metrics

    if self._warn_to_deprecate_param("use_first_metric_only", descr, "callback_param's 'use_first_metric_only'"):
        self.callback_param.use_first_metric_only = self.use_first_metric_only

`BoostingParam(task_type=consts.CLASSIFICATION, objective_param=ObjectiveParam(), learning_rate=0.3, num_trees=5, subsample_feature_rate=1, n_iter_no_change=True, tol=0.0001, bin_num=32, predict_param=PredictParam(), cv_param=CrossValidationParam(), validation_freqs=None, metrics=None, random_seed=100, binning_error=consts.DEFAULT_RELATIVE_ERROR)` ¶

Bases: BaseParam

Basic parameter for Boosting Algorithms

Parameters:

Name	Type	Description	Default
`task_type`		task type	`'classification'`
`objective_param`	`ObjectiveParam Object, default`	objective param	`ObjectiveParam()`
`learning_rate`	`float, int or long`	the learning rate of secure boost. default: 0.3	`0.3`
`num_trees`	`int or float`	the max number of boosting round. default: 5	`5`
`subsample_feature_rate`	`float`	a float-number in [0, 1], default: 1.0	`1`
`n_iter_no_change`	`bool`	when True and residual error less than tol, tree building process will stop. default: True	`True`
`bin_num`		bin number use in quantile. default: 32	`32`
`validation_freqs`		Do validation in training process or Not. if equals None, will not do validation in train process; if equals positive integer, will validate data every validation_freqs epochs passes; if container object in python, will validate data if epochs belong to this container. e.g. validation_freqs = [10, 15], will validate data when epoch equals to 10 and 15. Default: None	`None`

Source code in python/federatedml/param/boosting_param.py

def __init__(self, task_type=consts.CLASSIFICATION,
             objective_param=ObjectiveParam(),
             learning_rate=0.3, num_trees=5, subsample_feature_rate=1, n_iter_no_change=True,
             tol=0.0001, bin_num=32,
             predict_param=PredictParam(), cv_param=CrossValidationParam(),
             validation_freqs=None, metrics=None, random_seed=100,
             binning_error=consts.DEFAULT_RELATIVE_ERROR):

    super(BoostingParam, self).__init__()

    self.task_type = task_type
    self.objective_param = copy.deepcopy(objective_param)
    self.learning_rate = learning_rate
    self.num_trees = num_trees
    self.subsample_feature_rate = subsample_feature_rate
    self.n_iter_no_change = n_iter_no_change
    self.tol = tol
    self.bin_num = bin_num
    self.predict_param = copy.deepcopy(predict_param)
    self.cv_param = copy.deepcopy(cv_param)
    self.validation_freqs = validation_freqs
    self.metrics = metrics
    self.random_seed = random_seed
    self.binning_error = binning_error

Attributes¶

task_type = task_type instance-attribute ¶

objective_param = copy.deepcopy(objective_param) instance-attribute ¶

learning_rate = learning_rate instance-attribute ¶

num_trees = num_trees instance-attribute ¶

subsample_feature_rate = subsample_feature_rate instance-attribute ¶

n_iter_no_change = n_iter_no_change instance-attribute ¶

tol = tol instance-attribute ¶

bin_num = bin_num instance-attribute ¶

predict_param = copy.deepcopy(predict_param) instance-attribute ¶

cv_param = copy.deepcopy(cv_param) instance-attribute ¶

validation_freqs = validation_freqs instance-attribute ¶

metrics = metrics instance-attribute ¶

random_seed = random_seed instance-attribute ¶

binning_error = binning_error instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/boosting_param.py

def check(self):

    descr = "boosting tree param's"

    if self.task_type not in [consts.CLASSIFICATION, consts.REGRESSION]:
        raise ValueError("boosting_core tree param's task_type {} not supported, should be {} or {}".format(
            self.task_type, consts.CLASSIFICATION, consts.REGRESSION))

    self.objective_param.check(self.task_type)

    if type(self.learning_rate).__name__ not in ["float", "int", "long"]:
        raise ValueError("boosting_core tree param's learning_rate {} not supported, should be numeric".format(
            self.learning_rate))

    if type(self.subsample_feature_rate).__name__ not in ["float", "int", "long"] or \
            self.subsample_feature_rate < 0 or self.subsample_feature_rate > 1:
        raise ValueError(
            "boosting_core tree param's subsample_feature_rate should be a numeric number between 0 and 1")

    if type(self.n_iter_no_change).__name__ != "bool":
        raise ValueError("boosting_core tree param's n_iter_no_change {} not supported, should be bool type".format(
            self.n_iter_no_change))

    if type(self.tol).__name__ not in ["float", "int", "long"]:
        raise ValueError("boosting_core tree param's tol {} not supported, should be numeric".format(self.tol))

    if type(self.bin_num).__name__ not in ["int", "long"] or self.bin_num < 2:
        raise ValueError(
            "boosting_core tree param's bin_num {} not supported, should be positive integer greater than 1".format(
                self.bin_num))

    if self.validation_freqs is None:
        pass
    elif isinstance(self.validation_freqs, int):
        if self.validation_freqs < 1:
            raise ValueError("validation_freqs should be larger than 0 when it's integer")
    elif not isinstance(self.validation_freqs, collections.Container):
        raise ValueError("validation_freqs should be None or positive integer or container")

    if self.metrics is not None and not isinstance(self.metrics, list):
        raise ValueError("metrics should be a list")

    if self.random_seed is not None:
        assert isinstance(self.random_seed, int) and self.random_seed >= 0, 'random seed must be an integer >= 0'

    self.check_decimal_float(self.binning_error, descr)

    return True

IntersectParam(intersect_method=consts.RSA, random_bit=DEFAULT_RANDOM_BIT, sync_intersect_ids=True, join_role=consts.GUEST, only_output_key=False, with_encode=False, encode_params=EncodeParam(), raw_params=RAWParam(), rsa_params=RSAParam(), dh_params=DHParam(), ecdh_params=ECDHParam(), join_method=consts.INNER_JOIN, new_sample_id=False, sample_id_generator=consts.GUEST, intersect_cache_param=IntersectCache(), run_cache=False, cardinality_only=False, sync_cardinality=False, cardinality_method=consts.ECDH, run_preprocess=False, intersect_preprocess_params=IntersectPreProcessParam(), repeated_id_process=False, repeated_id_owner=consts.GUEST, with_sample_id=False, allow_info_share=False, info_owner=consts.GUEST) ¶

Bases: BaseParam

Define the intersect method

Parameters:

Name	Type	Description	Default
`intersect_method`	`str`	it supports 'rsa', 'raw', 'dh', 'ecdh', default by 'rsa'	`consts.RSA`
`random_bit`		it will define the size of blinding factor in rsa algorithm, default 128 note that this param will be deprecated in future, please use random_bit in RSAParam instead	`DEFAULT_RANDOM_BIT`
`sync_intersect_ids`		In rsa, 'sync_intersect_ids' is True means guest or host will send intersect results to the others, and False will not. while in raw, 'sync_intersect_ids' is True means the role of "join_role" will send intersect results and the others will get them. Default by True.	`True`
`join_role`		role who joins ids, supports "guest" and "host" only and effective only for raw. If it is "guest", the host will send its ids to guest and find the intersection of ids in guest; if it is "host", the guest will send its ids to host. Default by "guest"; note this param will be deprecated in future version, please use 'join_role' in raw_params instead	`consts.GUEST`
`only_output_key`	`bool`	if false, the results of intersection will include key and value which from input data; if true, it will just include key from input data and the value will be empty or filled by uniform string like "intersect_id"	`False`
`with_encode`		if True, it will use hash method for intersect ids, effective for raw method only; note that this param will be deprecated in future version, please use 'use_hash' in raw_params; currently if this param is set to True, specification by 'encode_params' will be taken instead of 'raw_params'.	`False`
`encode_params`		effective only when with_encode is True; this param will be deprecated in future version, use 'raw_params' in future implementation	`EncodeParam()`
`raw_params`		this param is deprecated	`RAWParam()`
`rsa_params`		effective for rsa method only, this param is deprecated	`RSAParam()`
`dh_params`		effective for dh method only	`DHParam()`
`ecdh_params`		effective for ecdh method only	`ECDHParam()`
`join_method`		if 'left_join', participants will all include sample_id_generator's (imputed) ids in output, default 'inner_join'	`consts.INNER_JOIN`
`new_sample_id`	`bool`	whether to generate new id for sample_id_generator's ids, only effective when join_method is 'left_join' or when input data are instance with match id, default False	`False`
`sample_id_generator`		role whose ids are to be kept, effective only when join_method is 'left_join' or when input data are instance with match id, default 'guest'	`consts.GUEST`
`intersect_cache_param`		specification for cache generation, with ver1.7 and above, this param is ignored.	`IntersectCache()`
`run_cache`	`bool`	whether to store Host's encrypted ids, only valid when intersect method is 'rsa', 'dh', 'ecdh', default False	`False`
`cardinality_only`	`bool`	whether to output estimated intersection count(cardinality); if sync_cardinality is True, then sync cardinality count with host(s)	`False`
`cardinality_method`		specify which intersect method to use for coutning cardinality, default "ecdh"; note that with "rsa", estimated cardinality will be produced; while "dh" and "ecdh" method output exact cardinality, it only supports single-host task	`consts.ECDH`
`sync_cardinality`	`bool`	whether to sync cardinality with all participants, default False, only effective when cardinality_only set to True	`False`
`run_preprocess`	`bool`	whether to run preprocess process, default False	`False`
`intersect_preprocess_params`		used for preprocessing and cardinality_only mode	`IntersectPreProcessParam()`
`repeated_id_process`		if true, intersection will process the ids which can be repeatable; in ver 1.7 and above,repeated id process will be automatically applied to data with instance id, this param will be ignored	`False`
`repeated_id_owner`		which role has the repeated id; in ver 1.7 and above, this param is ignored	`consts.GUEST`
`allow_info_share`	`bool`	in ver 1.7 and above, this param is ignored	`False`
`info_owner`		in ver 1.7 and above, this param is ignored	`consts.GUEST`
`with_sample_id`		data with sample id or not, default False; in ver 1.7 and above, this param is ignored	`False`

Source code in python/federatedml/param/intersect_param.py

def __init__(self, intersect_method: str = consts.RSA, random_bit=DEFAULT_RANDOM_BIT, sync_intersect_ids=True,
             join_role=consts.GUEST, only_output_key: bool = False,
             with_encode=False, encode_params=EncodeParam(),
             raw_params=RAWParam(), rsa_params=RSAParam(), dh_params=DHParam(), ecdh_params=ECDHParam(),
             join_method=consts.INNER_JOIN, new_sample_id: bool = False, sample_id_generator=consts.GUEST,
             intersect_cache_param=IntersectCache(), run_cache: bool = False,
             cardinality_only: bool = False, sync_cardinality: bool = False, cardinality_method=consts.ECDH,
             run_preprocess: bool = False,
             intersect_preprocess_params=IntersectPreProcessParam(),
             repeated_id_process=False, repeated_id_owner=consts.GUEST,
             with_sample_id=False, allow_info_share: bool = False, info_owner=consts.GUEST):
    super().__init__()
    self.intersect_method = intersect_method
    self.random_bit = random_bit
    self.sync_intersect_ids = sync_intersect_ids
    self.join_role = join_role
    self.with_encode = with_encode
    self.encode_params = copy.deepcopy(encode_params)
    self.raw_params = copy.deepcopy(raw_params)
    self.rsa_params = copy.deepcopy(rsa_params)
    self.only_output_key = only_output_key
    self.sample_id_generator = sample_id_generator
    self.intersect_cache_param = copy.deepcopy(intersect_cache_param)
    self.run_cache = run_cache
    self.repeated_id_process = repeated_id_process
    self.repeated_id_owner = repeated_id_owner
    self.allow_info_share = allow_info_share
    self.info_owner = info_owner
    self.with_sample_id = with_sample_id
    self.join_method = join_method
    self.new_sample_id = new_sample_id
    self.dh_params = copy.deepcopy(dh_params)
    self.cardinality_only = cardinality_only
    self.sync_cardinality = sync_cardinality
    self.cardinality_method = cardinality_method
    self.run_preprocess = run_preprocess
    self.intersect_preprocess_params = copy.deepcopy(intersect_preprocess_params)
    self.ecdh_params = copy.deepcopy(ecdh_params)

Attributes¶

intersect_method = intersect_method instance-attribute ¶

random_bit = random_bit instance-attribute ¶

sync_intersect_ids = sync_intersect_ids instance-attribute ¶

join_role = join_role instance-attribute ¶

with_encode = with_encode instance-attribute ¶

encode_params = copy.deepcopy(encode_params) instance-attribute ¶

raw_params = copy.deepcopy(raw_params) instance-attribute ¶

rsa_params = copy.deepcopy(rsa_params) instance-attribute ¶

only_output_key = only_output_key instance-attribute ¶

sample_id_generator = sample_id_generator instance-attribute ¶

intersect_cache_param = copy.deepcopy(intersect_cache_param) instance-attribute ¶

run_cache = run_cache instance-attribute ¶

repeated_id_process = repeated_id_process instance-attribute ¶

repeated_id_owner = repeated_id_owner instance-attribute ¶

allow_info_share = allow_info_share instance-attribute ¶

info_owner = info_owner instance-attribute ¶

with_sample_id = with_sample_id instance-attribute ¶

join_method = join_method instance-attribute ¶

new_sample_id = new_sample_id instance-attribute ¶

dh_params = copy.deepcopy(dh_params) instance-attribute ¶

cardinality_only = cardinality_only instance-attribute ¶

sync_cardinality = sync_cardinality instance-attribute ¶

cardinality_method = cardinality_method instance-attribute ¶

run_preprocess = run_preprocess instance-attribute ¶

intersect_preprocess_params = copy.deepcopy(intersect_preprocess_params) instance-attribute ¶

ecdh_params = copy.deepcopy(ecdh_params) instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/intersect_param.py

def check(self):
    descr = "intersect param's "

    if self.intersect_method.lower() == consts.RAW.lower():
        self.intersect_method = consts.ECDH
        LOGGER.warning("Raw intersect method is not supported, it will be replaced by ECDH")

    self.intersect_method = self.check_and_change_lower(self.intersect_method,
                                                        [consts.RSA, consts.RAW, consts.DH, consts.ECDH],
                                                        f"{descr}intersect_method")

    if self._warn_to_deprecate_param("random_bit", descr, "rsa_params' 'random_bit'"):
        if "rsa_params.random_bit" in self.get_user_feeded():
            raise ValueError(f"random_bit and rsa_params.random_bit should not be set simultaneously")
        self.rsa_params.random_bit = self.random_bit

    self.check_boolean(self.sync_intersect_ids, f"{descr}intersect_ids")

    if self._warn_to_deprecate_param("encode_param", "", ""):
        if "raw_params" in self.get_user_feeded():
            raise ValueError(f"encode_param and raw_params should not be set simultaneously")
        else:
            self.callback_param.callbacks = ["PerformanceEvaluate"]

    if self._warn_to_deprecate_param("join_role", descr, "raw_params' 'join_role'"):
        if "raw_params.join_role" in self.get_user_feeded():
            raise ValueError(f"join_role and raw_params.join_role should not be set simultaneously")
        self.raw_params.join_role = self.join_role

    self.check_boolean(self.only_output_key, f"{descr}only_output_key")

    self.join_method = self.check_and_change_lower(self.join_method, [consts.INNER_JOIN, consts.LEFT_JOIN],
                                                   f"{descr}join_method")
    self.check_boolean(self.new_sample_id, f"{descr}new_sample_id")
    self.sample_id_generator = self.check_and_change_lower(self.sample_id_generator,
                                                           [consts.GUEST, consts.HOST],
                                                           f"{descr}sample_id_generator")

    if self.join_method == consts.LEFT_JOIN:
        if not self.sync_intersect_ids:
            raise ValueError(f"Cannot perform left join without sync intersect ids")

    self.check_boolean(self.run_cache, f"{descr} run_cache")

    if self._warn_to_deprecate_param("encode_params", descr, "raw_params") or \
            self._warn_to_deprecate_param("with_encode", descr, "raw_params' 'use_hash'"):
        # self.encode_params.check()
        if "with_encode" in self.get_user_feeded() and "raw_params.use_hash" in self.get_user_feeded():
            raise ValueError(f"'raw_params' and 'encode_params' should not be set simultaneously.")
        if "raw_params" in self.get_user_feeded() and "encode_params" in self.get_user_feeded():
            raise ValueError(f"'raw_params' and 'encode_params' should not be set simultaneously.")
        LOGGER.warning(f"Param values from 'encode_params' will override 'raw_params' settings.")
        self.raw_params.use_hash = self.with_encode
        self.raw_params.hash_method = self.encode_params.encode_method
        self.raw_params.salt = self.encode_params.salt
        self.raw_params.base64 = self.encode_params.base64

    self.raw_params.check()
    self.rsa_params.check()
    self.dh_params.check()
    self.ecdh_params.check()
    self.check_boolean(self.cardinality_only, f"{descr}cardinality_only")
    self.check_boolean(self.sync_cardinality, f"{descr}sync_cardinality")
    self.check_boolean(self.run_preprocess, f"{descr}run_preprocess")
    self.intersect_preprocess_params.check()
    if self.cardinality_only:
        if self.cardinality_method not in [consts.RSA, consts.DH, consts.ECDH]:
            raise ValueError(f"cardinality-only mode only support rsa, dh, ecdh.")
        if self.cardinality_method == consts.RSA and self.rsa_params.split_calculation:
            raise ValueError(f"cardinality-only mode only supports unified calculation.")
    if self.run_preprocess:
        if self.intersect_preprocess_params.false_positive_rate < 0.01:
            raise ValueError(f"for preprocessing ids, false_positive_rate must be no less than 0.01")
        if self.cardinality_only:
            raise ValueError(f"cardinality_only mode cannot run preprocessing.")
    if self.run_cache:
        if self.intersect_method not in [consts.RSA, consts.DH, consts.ECDH]:
            raise ValueError(f"Only rsa, dh, or ecdh method supports cache.")
        if self.intersect_method == consts.RSA and self.rsa_params.split_calculation:
            raise ValueError(f"RSA split_calculation does not support cache.")
        if self.cardinality_only:
            raise ValueError(f"cache is not available for cardinality_only mode.")
        if self.run_preprocess:
            raise ValueError(f"Preprocessing does not support cache.")

    deprecated_param_list = ["repeated_id_process", "repeated_id_owner", "intersect_cache_param",
                             "allow_info_share", "info_owner", "with_sample_id"]
    for param in deprecated_param_list:
        self._warn_deprecated_param(param, descr)

    LOGGER.debug("Finish intersect parameter check!")
    return True

FeatureSelectionParam(select_col_indexes=-1, select_names=None, filter_methods=None, unique_param=UniqueValueParam(), iv_value_param=IVValueSelectionParam(), iv_percentile_param=IVPercentileSelectionParam(), iv_top_k_param=IVTopKParam(), variance_coe_param=VarianceOfCoeSelectionParam(), outlier_param=OutlierColsSelectionParam(), manually_param=ManuallyFilterParam(), percentage_value_param=PercentageValueParam(), iv_param=IVFilterParam(), statistic_param=CommonFilterParam(metrics=consts.MEAN), psi_param=CommonFilterParam(metrics=consts.PSI, take_high=False), vif_param=CommonFilterParam(metrics=consts.VIF, threshold=5.0, take_high=False), sbt_param=CommonFilterParam(metrics=consts.FEATURE_IMPORTANCE), correlation_param=CorrelationFilterParam(), use_anonymous=False, need_run=True) ¶

Bases: BaseParam

Define the feature selection parameters.

Parameters:

Name	Type	Description	Default
`select_col_indexes`		Specify which columns need to calculated. -1 represent for all columns. Note tha columns specified by `select_col_indexes` and `select_names` will be combined.	`-1`
`select_names`	`list of string, default`	Specify which columns need to calculated. Each element in the list represent for a column name in header. Note tha columns specified by `select_col_indexes` and `select_names` will be combined.	`None`
`filter_methods`		“hetero_sbt_filter", "homo_sbt_filter", "hetero_fast_sbt_filter", "percentage_value", "vif_filter", "correlation_filter"], default: ["manually"]. The following methods will be deprecated in future version: "unique_value", "iv_value_thres", "iv_percentile", "coefficient_of_variation_value_thres", "outlier_cols" Specify the filter methods used in feature selection. The orders of filter used is depended on this list. Please be notified that, if a percentile method is used after some certain filter method, the percentile represent for the ratio of rest features. e.g. If you have 10 features at the beginning. After first filter method, you have 8 rest. Then, you want top 80% highest iv feature. Here, we will choose floor(0.8 * 8) = 6 features instead of 8.	`None`
`unique_param`		filter the columns if all values in this feature is the same	`UniqueValueParam()`
`iv_value_param`		Use information value to filter columns. If this method is set, a float threshold need to be provided. Filter those columns whose iv is smaller than threshold. Will be deprecated in the future.	`IVValueSelectionParam()`
`iv_percentile_param`		Use information value to filter columns. If this method is set, a float ratio threshold need to be provided. Pick floor(ratio * feature_num) features with higher iv. If multiple features around the threshold are same, all those columns will be keep. Will be deprecated in the future.	`IVPercentileSelectionParam()`
`variance_coe_param`		Use coefficient of variation to judge whether filtered or not. Will be deprecated in the future.	`VarianceOfCoeSelectionParam()`
`outlier_param`		Filter columns whose certain percentile value is larger than a threshold. Will be deprecated in the future.	`OutlierColsSelectionParam()`
`percentage_value_param`		Filter the columns that have a value that exceeds a certain percentage.	`PercentageValueParam()`
`iv_param`		Setting how to filter base on iv. It support take high mode only. All of "threshold", "top_k" and "top_percentile" are accepted. Check more details in CommonFilterParam. To use this filter, hetero-feature-binning module has to be provided.	`IVFilterParam()`
`statistic_param`		Setting how to filter base on statistic values. All of "threshold", "top_k" and "top_percentile" are accepted. Check more details in CommonFilterParam. To use this filter, data_statistic module has to be provided.	`CommonFilterParam(metrics=consts.MEAN)`
`psi_param`		Setting how to filter base on psi values. All of "threshold", "top_k" and "top_percentile" are accepted. Its take_high properties should be False to choose lower psi features. Check more details in CommonFilterParam. To use this filter, data_statistic module has to be provided.	`CommonFilterParam(metrics=consts.PSI, take_high=False)`
`use_anonymous`		whether to interpret 'select_names' as anonymous names.	`False`
`need_run`		Indicate if this module needed to be run	`True`

Source code in python/federatedml/param/feature_selection_param.py

def __init__(self, select_col_indexes=-1, select_names=None, filter_methods=None,
             unique_param=UniqueValueParam(),
             iv_value_param=IVValueSelectionParam(),
             iv_percentile_param=IVPercentileSelectionParam(),
             iv_top_k_param=IVTopKParam(),
             variance_coe_param=VarianceOfCoeSelectionParam(),
             outlier_param=OutlierColsSelectionParam(),
             manually_param=ManuallyFilterParam(),
             percentage_value_param=PercentageValueParam(),
             iv_param=IVFilterParam(),
             statistic_param=CommonFilterParam(metrics=consts.MEAN),
             psi_param=CommonFilterParam(metrics=consts.PSI,
                                         take_high=False),
             vif_param=CommonFilterParam(metrics=consts.VIF,
                                         threshold=5.0,
                                         take_high=False),
             sbt_param=CommonFilterParam(metrics=consts.FEATURE_IMPORTANCE),
             correlation_param=CorrelationFilterParam(),
             use_anonymous=False,
             need_run=True
             ):
    super(FeatureSelectionParam, self).__init__()
    self.correlation_param = correlation_param
    self.vif_param = vif_param
    self.select_col_indexes = select_col_indexes
    if select_names is None:
        self.select_names = []
    else:
        self.select_names = select_names
    if filter_methods is None:
        self.filter_methods = [consts.MANUALLY_FILTER]
    else:
        self.filter_methods = filter_methods

    # deprecate in the future
    self.unique_param = copy.deepcopy(unique_param)
    self.iv_value_param = copy.deepcopy(iv_value_param)
    self.iv_percentile_param = copy.deepcopy(iv_percentile_param)
    self.iv_top_k_param = copy.deepcopy(iv_top_k_param)
    self.variance_coe_param = copy.deepcopy(variance_coe_param)
    self.outlier_param = copy.deepcopy(outlier_param)
    self.percentage_value_param = copy.deepcopy(percentage_value_param)

    self.manually_param = copy.deepcopy(manually_param)
    self.iv_param = copy.deepcopy(iv_param)
    self.statistic_param = copy.deepcopy(statistic_param)
    self.psi_param = copy.deepcopy(psi_param)
    self.sbt_param = copy.deepcopy(sbt_param)
    self.need_run = need_run
    self.use_anonymous = use_anonymous

Attributes¶

correlation_param = correlation_param instance-attribute ¶

vif_param = vif_param instance-attribute ¶

select_col_indexes = select_col_indexes instance-attribute ¶

select_names = [] instance-attribute ¶

filter_methods = [consts.MANUALLY_FILTER] instance-attribute ¶

unique_param = copy.deepcopy(unique_param) instance-attribute ¶

iv_value_param = copy.deepcopy(iv_value_param) instance-attribute ¶

iv_percentile_param = copy.deepcopy(iv_percentile_param) instance-attribute ¶

iv_top_k_param = copy.deepcopy(iv_top_k_param) instance-attribute ¶

variance_coe_param = copy.deepcopy(variance_coe_param) instance-attribute ¶

outlier_param = copy.deepcopy(outlier_param) instance-attribute ¶

percentage_value_param = copy.deepcopy(percentage_value_param) instance-attribute ¶

manually_param = copy.deepcopy(manually_param) instance-attribute ¶

iv_param = copy.deepcopy(iv_param) instance-attribute ¶

statistic_param = copy.deepcopy(statistic_param) instance-attribute ¶

psi_param = copy.deepcopy(psi_param) instance-attribute ¶

sbt_param = copy.deepcopy(sbt_param) instance-attribute ¶

need_run = need_run instance-attribute ¶

use_anonymous = use_anonymous instance-attribute ¶

Functions¶

check() ¶

Source code in python/federatedml/param/feature_selection_param.py

def check(self):
    descr = "hetero feature selection param's"

    self.check_defined_type(self.filter_methods, descr, ['list'])

    for idx, method in enumerate(self.filter_methods):
        method = method.lower()
        self.check_valid_value(method, descr, [consts.UNIQUE_VALUE, consts.IV_VALUE_THRES, consts.IV_PERCENTILE,
                                               consts.COEFFICIENT_OF_VARIATION_VALUE_THRES, consts.OUTLIER_COLS,
                                               consts.MANUALLY_FILTER, consts.PERCENTAGE_VALUE,
                                               consts.IV_FILTER, consts.STATISTIC_FILTER, consts.IV_TOP_K,
                                               consts.PSI_FILTER, consts.HETERO_SBT_FILTER,
                                               consts.HOMO_SBT_FILTER, consts.HETERO_FAST_SBT_FILTER,
                                               consts.VIF_FILTER, consts.CORRELATION_FILTER])

        self.filter_methods[idx] = method

    self.check_defined_type(self.select_col_indexes, descr, ['list', 'int'])

    self.unique_param.check()
    self.iv_value_param.check()
    self.iv_percentile_param.check()
    self.iv_top_k_param.check()
    self.variance_coe_param.check()
    self.outlier_param.check()
    self.manually_param.check()
    self.percentage_value_param.check()

    self.iv_param.check()
    for th in self.iv_param.take_high:
        if not th:
            raise ValueError("Iv filter should take higher iv features")
    for m in self.iv_param.metrics:
        if m != consts.IV:
            raise ValueError("For iv filter, metrics should be 'iv'")

    self.statistic_param.check()
    self.psi_param.check()
    for th in self.psi_param.take_high:
        if th:
            raise ValueError("PSI filter should take lower psi features")
    for m in self.psi_param.metrics:
        if m != consts.PSI:
            raise ValueError("For psi filter, metrics should be 'psi'")

    self.sbt_param.check()
    for th in self.sbt_param.take_high:
        if not th:
            raise ValueError("SBT filter should take higher feature_importance features")
    for m in self.sbt_param.metrics:
        if m != consts.FEATURE_IMPORTANCE:
            raise ValueError("For SBT filter, metrics should be 'feature_importance'")

    self.vif_param.check()
    for m in self.vif_param.metrics:
        if m != consts.VIF:
            raise ValueError("For VIF filter, metrics should be 'vif'")

    self.correlation_param.check()
    self.check_boolean(self.use_anonymous, f"{descr} use_anonymous")

    self._warn_to_deprecate_param("iv_value_param", descr, "iv_param")
    self._warn_to_deprecate_param("iv_percentile_param", descr, "iv_param")
    self._warn_to_deprecate_param("iv_top_k_param", descr, "iv_param")
    self._warn_to_deprecate_param("variance_coe_param", descr, "statistic_param")
    self._warn_to_deprecate_param("unique_param", descr, "statistic_param")
    self._warn_to_deprecate_param("outlier_param", descr, "statistic_param")

Last update: 2023-05-31

Federated Machine Learning¶

Algorithm List¶

Secure Protocol¶

Params¶

param ¶

Attributes¶

Classes¶

PSIParam(max_bin_num=20, need_run=True, dense_missing_val=None, binning_error=consts.DEFAULT_RELATIVE_ERROR) ¶

Attributes¶

Functions¶

HomoOneHotParam(transform_col_indexes=-1, transform_col_names=None, need_run=True, need_alignment=True) ¶

Attributes¶

Functions¶

Attributes¶

Functions¶

Attributes¶

Functions¶

FeldmanVerifiableSumParam(sum_cols=None, q_n=6) ¶

Attributes¶

Functions¶

InitParam(init_method='random_uniform', init_const=1, fit_intercept=True, random_seed=None) ¶

Attributes¶

Functions¶

SecureAddExampleParam(seed=None, partition=1, data_num=1000) ¶

Attributes¶

Functions¶

StochasticQuasiNewtonParam(update_interval_L=3, memory_M=5, sample_size=5000, random_seed=None) ¶

Attributes¶

Functions¶

EncryptParam(method=consts.PAILLIER, key_length=1024) ¶

Attributes¶

Functions¶

EncryptedModeCalculatorParam(mode='strict', re_encrypted_rate=1) ¶

Attributes¶

Functions¶

EvaluateParam(eval_type='binary', pos_label=1, need_run=True, metrics=None, run_clustering_arbiter_metric=False, unfold_multi_result=False) ¶

Attributes¶

Functions¶

KmeansParam(k=5, max_iter=300, tol=0.001, random_stat=None) ¶

Attributes¶

Functions¶

PearsonParam(column_names=None, column_indexes=None, cross_parties=True, need_run=True, use_mix_rand=False, calc_local_vif=True) ¶

Attributes¶

Functions¶

PositiveUnlabeledParam(strategy='probability', threshold=0.9) ¶

Parameters used for positive unlabeled.¶

Attributes¶

Functions¶

SampleParam(mode='random', method='downsample', fractions=None, random_state=None, task_type='hetero', need_run=True) ¶

Attributes¶

Functions¶

ScaleParam(method='standard_scale', mode='normal', scale_col_indexes=-1, scale_names=None, feat_upper=None, feat_lower=None, with_mean=True, with_std=True, need_run=True) ¶

Attributes¶

Functions¶

DataSplitParam(random_state=None, test_size=None, train_size=None, validate_size=None, stratified=False, shuffle=True, split_points=None, need_run=True) ¶

Attributes¶

Functions¶

OneVsRestParam(need_one_vs_rest=False, has_arbiter=True) ¶

Attributes¶

Functions¶

SampleWeightParam(class_weight=None, sample_weight_name=None, normalize=False, need_run=True) ¶

Attributes¶

Functions¶

StepwiseParam(score_name='AIC', mode=consts.HETERO, role=consts.GUEST, direction='both', max_step=10, nvmin=2, nvmax=None, need_stepwise=False) ¶

Attributes¶

Functions¶

UnionParam(need_run=True, allow_missing=False, keep_duplicate=False) ¶

Attributes¶

Functions¶

ColumnExpandParam(append_header=None, method='manual', fill_value=consts.FLOAT_ZERO, need_run=True) ¶

Attributes¶

Functions¶

CrossValidationParam(n_splits=5, mode=consts.HETERO, role=consts.GUEST, shuffle=True, random_seed=1, need_cv=False, output_fold_history=True, history_value_type='score') ¶

Attributes¶

Functions¶

ScorecardParam(method='credit', offset=500, factor=20, factor_base=2, upper_limit_ratio=3, lower_limit_value=0, need_run=True) ¶

Attributes¶

Functions¶

LocalBaselineParam(model_name='LogisticRegression', model_opts=None, predict_param=PredictParam(), need_run=True) ¶

Attributes¶

`param` ¶

`PSIParam(max_bin_num=20, need_run=True, dense_missing_val=None, binning_error=consts.DEFAULT_RELATIVE_ERROR)` ¶

`HomoOneHotParam(transform_col_indexes=-1, transform_col_names=None, need_run=True, need_alignment=True)` ¶

`FeldmanVerifiableSumParam(sum_cols=None, q_n=6)` ¶

`InitParam(init_method='random_uniform', init_const=1, fit_intercept=True, random_seed=None)` ¶

`SecureAddExampleParam(seed=None, partition=1, data_num=1000)` ¶

`StochasticQuasiNewtonParam(update_interval_L=3, memory_M=5, sample_size=5000, random_seed=None)` ¶

`EncryptParam(method=consts.PAILLIER, key_length=1024)` ¶

`EncryptedModeCalculatorParam(mode='strict', re_encrypted_rate=1)` ¶

`EvaluateParam(eval_type='binary', pos_label=1, need_run=True, metrics=None, run_clustering_arbiter_metric=False, unfold_multi_result=False)` ¶

`KmeansParam(k=5, max_iter=300, tol=0.001, random_stat=None)` ¶

`PearsonParam(column_names=None, column_indexes=None, cross_parties=True, need_run=True, use_mix_rand=False, calc_local_vif=True)` ¶

`PositiveUnlabeledParam(strategy='probability', threshold=0.9)` ¶

`SampleParam(mode='random', method='downsample', fractions=None, random_state=None, task_type='hetero', need_run=True)` ¶

`ScaleParam(method='standard_scale', mode='normal', scale_col_indexes=-1, scale_names=None, feat_upper=None, feat_lower=None, with_mean=True, with_std=True, need_run=True)` ¶

`DataSplitParam(random_state=None, test_size=None, train_size=None, validate_size=None, stratified=False, shuffle=True, split_points=None, need_run=True)` ¶

`OneVsRestParam(need_one_vs_rest=False, has_arbiter=True)` ¶

`SampleWeightParam(class_weight=None, sample_weight_name=None, normalize=False, need_run=True)` ¶

`StepwiseParam(score_name='AIC', mode=consts.HETERO, role=consts.GUEST, direction='both', max_step=10, nvmin=2, nvmax=None, need_stepwise=False)` ¶

`UnionParam(need_run=True, allow_missing=False, keep_duplicate=False)` ¶

`ColumnExpandParam(append_header=None, method='manual', fill_value=consts.FLOAT_ZERO, need_run=True)` ¶

`CrossValidationParam(n_splits=5, mode=consts.HETERO, role=consts.GUEST, shuffle=True, random_seed=1, need_cv=False, output_fold_history=True, history_value_type='score')` ¶

`ScorecardParam(method='credit', offset=500, factor=20, factor_base=2, upper_limit_ratio=3, lower_limit_value=0, need_run=True)` ¶

`LocalBaselineParam(model_name='LogisticRegression', model_opts=None, predict_param=PredictParam(), need_run=True)` ¶

`PredictParam(threshold=0.5)` ¶

`SecureInformationRetrievalParam(security_level=0.5, oblivious_transfer_protocol=consts.OT_HAUCK, commutative_encryption=consts.CE_PH, non_committing_encryption=consts.AES, key_size=consts.DEFAULT_KEY_LENGTH, dh_params=DHParam(), raw_retrieval=False, target_cols=None)` ¶

`StatisticsParam(statistics='summary', column_names=None, column_indexes=-1, need_run=True, abnormal_list=None, quantile_error=consts.DEFAULT_RELATIVE_ERROR, bias=True)` ¶

`EncodeParam(salt='', encode_method='none', base64=False)` ¶

`ObjectiveParam(objective='cross_entropy', params=None)` ¶

`HomoNNParam(trainer=TrainerParam(), dataset=DatasetParam(), torch_seed=100, nn_define=None, loss=None, optimizer=None, ds_config=None)` ¶

`RSAParam(salt='', hash_method='sha256', final_hash_method='sha256', split_calculation=False, random_base_fraction=None, key_length=consts.DEFAULT_KEY_LENGTH, random_bit=DEFAULT_RANDOM_BIT)` ¶