Federated Machine Learning

The Federatedml module includes federated implementations of many common machine learning algorithms. All modules are developed in a decoupled, modular way to improve extensibility. Specifically, we provide:

- Federated statistics: including private set intersection, union, Pearson correlation, PSI, etc.
- Federated information retrieval: OT-based PIR (SIR)
- Federated feature engineering: including federated sampling, federated feature binning, federated feature selection, etc.
- Federated machine learning algorithms: including homo (horizontal) and hetero (vertical) federated LR, GBDT, DNN, transfer learning, unsupervised learning, hetero semi-supervised learning, etc.
- Model evaluation: binary classification, multi-class classification, regression and clustering evaluation, as well as comparison between federated and local models
- Secure protocols: a variety of secure protocols for more secure multi-party interactive computation

Algorithm List
Algorithm | Module Name | Description | Data Input | Data Output | Model Input | Model Output |
---|---|---|---|---|---|---|
DataTransform | DataTransform | Transforms raw input data into Instance objects. | Table, values are raw data | Transformed Table, values are Data Instance objects | | DataTransform Model |
Intersect | Intersection | Computes the intersection of the two parties' datasets without leaking any information about non-overlapping records. Mainly used in hetero (vertical) tasks. | Table | The intersection of the two parties' Tables | | Intersect Model |
Federated Sampling | FederatedSample | Samples data in a federated way so that the data distribution becomes balanced across parties. Supports both standalone and cluster versions. | Table | Sampled data; both random and stratified sampling are supported | | |
Feature Scale | FeatureScale | Feature normalization and standardization. | Table, values are Instance | Transformed Table | | Transform coefficients, e.g. min/max, mean/std |
Hetero Feature Binning | HeteroFeatureBinning | With binned input data, computes each column's iv and woe and transforms the data according to the binning information. | Table, with label y on the guest side and without label y on the host side | Transformed Table | | iv/woe, split points, event count, non-event count, etc. of each column |
Homo Feature Binning | HomoFeatureBinning | Computes equal-frequency binning for the homo (horizontal) scenario. | Table | Transformed Table | | Split points of each column |
OneHot Encoder | OneHotEncoder | Transforms a column into one-hot format. | Table, values are Instance | Transformed Table with new column names | | Mapping from original column names and feature values to new column names |
Hetero Feature Selection | HeteroFeatureSelection | Provides multiple types of filters; each filter selects columns according to the user configuration. | Table, values are Instance | Transformed Table with a new header and filtered data instances | If iv filters are used, a hetero_binning model is required | Whether each column is kept |
Union | Union | Combines multiple data tables into one. | Tables | Table combined from multiple Tables | | |
Hetero-LR | HeteroLR | Builds a hetero (vertical) logistic regression model across multiple parties. | Table, values are Instance | | | Logistic regression model, consisting of the model itself and model parameters |
Local Baseline | LocalBaseline | Runs a sklearn logistic regression model with local data. | Table, values are Instance | | | |
Hetero-LinR | HeteroLinR | Builds a hetero linear regression model across multiple parties. | Table, values are Instance | | | Linear regression model, consisting of the model itself and model parameters |
Hetero-Poisson | HeteroPoisson | Builds a hetero Poisson regression model across multiple parties. | Table, values are Instance | | | Poisson regression model, consisting of the model itself and model parameters |
Homo-LR | HomoLR | Builds a homo (horizontal) logistic regression model across multiple parties. | Table, values are Instance | | | Logistic regression model, consisting of the model itself and model parameters |
Homo-NN | HomoNN | Builds a homo neural network model across multiple parties. | Table, values are Instance | | | Neural network model, consisting of the model itself and model parameters |
Hetero Secure Boosting | HeteroSecureBoost | Builds a hetero SecureBoost model across multiple parties. | Table, values are Instance | | | SecureBoost model, consisting of the model itself and model parameters |
Hetero Fast Secure Boosting | HeteroFastSecureBoost | Builds tree models quickly in layered/mix mode. | Table, values are Instance | Table, values are Instance | | FastSecureBoost Model |
Evaluation | Evaluation | Outputs model evaluation metrics for the user. | Table(s), values are Instance | | | |
Hetero Pearson | HeteroPearson | Computes the Pearson correlation coefficients of features from different parties. | Table, values are Instance | | | |
Hetero-NN | HeteroNN | Builds a hetero neural network model. | Table, values are Instance | | | Hetero neural network model |
Homo Secure Boosting | HomoSecureBoost | Builds a homo SecureBoost model across multiple parties. | Table, values are Instance | | | SecureBoost model, consisting of the model itself and model parameters |
Homo OneHot Encoder | HomoOneHotEncoder | Transforms a column into one-hot format. | Table, values are Instance | Transformed Table with new column names | | Mapping from original column names and feature values to new column names |
Hetero Data Split | HeteroDataSplit | Splits the input dataset into 3 subsets according to user-defined ratios or sample counts. | Table, values are Instance | 3 Tables | | |
Homo Data Split | HomoDataSplit | Splits the input dataset into 3 subsets according to user-defined ratios or sample counts. | Table, values are Instance | 3 Tables | | |
Column Expand | ColumnExpand | Appends an arbitrary number of columns with arbitrary values to the original Table. | Table, values are raw data | Transformed Table with new columns and column names | | Column Expand Model |
Secure Information Retrieval | SecureInformationRetrieval | Securely retrieves target values via an oblivious transfer protocol. | Table, values are Instance | Table, values are retrieved values | | |
Hetero Federated Transfer Learning | FTL | Builds a federated transfer learning model between two parties. | Table, values are Instance | | | FTL neural network model parameters, etc. |
PSI | PSI | Computes the PSI value between features of two tables. | Table, values are Instance | PSI results | | |
Hetero KMeans | HeteroKMeans | Builds a K-means module. | Table, values are Instance | Table, values are Instance; the Arbiter party outputs 2 Tables | | Hetero KMeans Model |
Data Statistics | DataStatistics | Computes statistics on the data, including mean, max/min, median, etc. | Table, values are Instance | Table | | Statistic Result |
Scorecard | Scorecard | Transforms binary classification prediction scores into credit scores. | Table, values are binary prediction results | Table, values are transformed credit scores | | |
Sample Weight | SampleWeight | Weights the input data according to user settings. | Table, values are Instance | Table, values are weighted Instance | | SampleWeight Model |
Feldman Verifiable Sum | FeldmanVerifiableSum | Sums private data from multiple parties without exposing any party's data. | Table, values are the addends | Table, values are the sum | | |
Feature Imputation | FeatureImputation | Fills missing feature values with a specified method or value. | Table, values are Instance | Table, values are Instance after imputation | Feature Imputation Model | FeatureImputation Model |
Label Transform | LabelTransform | Transforms the label values of input data and prediction results. | Table, values are Instance or prediction results | Table, values are Instance or prediction results with transformed labels | | LabelTransform Model |
Hetero SSHE Logistic Regression | HeteroSSHELR | Builds a two-party hetero logistic regression model (no trusted third party). | Table, values are Instance | Table, values are Instance | | SSHE LR Model |
Hetero SSHE Linear Regression | HeteroSSHELinR | Builds a two-party hetero linear regression model (no trusted third party). | Table, values are Instance | Table, values are Instance | | SSHE LinR Model |
Positive Unlabeled Learning | PositiveUnlabeled | Builds a positive-unlabeled learning (PU learning) model. | Table, values are Instance | Table, values are Instance | | |
Secure Protocols
- Encrypt
- Hash
- Diffie-Hellman Key Exchange (see the sketch after this list)
- SecretShare MPC Protocol(SPDZ)
- Oblivious Transfer
- Feldman Verifiable Secret Sharing
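As a concrete, library-free illustration of one of these building blocks, here is a toy Diffie-Hellman key exchange. The parameters are deliberately tiny demo values and this is not FATE's implementation; real deployments use large, properly generated parameters.

```python
# Toy Diffie-Hellman key exchange for illustration only.
import secrets

p = 0xFFFFFFFB  # small demo prime (largest prime below 2**32); far too small for real use
g = 5

a = secrets.randbelow(p - 2) + 1   # Alice's private key
b = secrets.randbelow(p - 2) + 1   # Bob's private key

A = pow(g, a, p)                   # Alice sends A to Bob
B = pow(g, b, p)                   # Bob sends B to Alice

shared_alice = pow(B, a, p)
shared_bob = pow(A, b, p)
assert shared_alice == shared_bob  # both sides derive the same shared secret
```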
Algorithm Parameters
param

Attributes
__all__ = ['BoostingParam', 'ObjectiveParam', 'DecisionTreeParam', 'CrossValidationParam', 'DataSplitParam', 'DataIOParam', 'DataTransformParam', 'EncryptParam', 'EncryptedModeCalculatorParam', 'FeatureBinningParam', 'FeatureSelectionParam', 'FTLParam', 'HeteroNNParam', 'HomoNNParam', 'HomoOneHotParam', 'InitParam', 'IntersectParam', 'EncodeParam', 'RSAParam', 'LinearParam', 'LocalBaselineParam', 'LogisticParam', 'OneVsRestParam', 'PearsonParam', 'PoissonParam', 'PositiveUnlabeledParam', 'PredictParam', 'PSIParam', 'SampleParam', 'ScaleParam', 'SecureAddExampleParam', 'StochasticQuasiNewtonParam', 'StatisticsParam', 'StepwiseParam', 'UnionParam', 'ColumnExpandParam', 'KmeansParam', 'ScorecardParam', 'SecureInformationRetrievalParam', 'SampleWeightParam', 'FeldmanVerifiableSumParam', 'EvaluateParam']
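Since the names above are exported in __all__, a typical way to work with these parameter classes is to import them, configure them, and call check() to validate the settings. This is a minimal sketch assuming the federatedml package is installed and that these classes are importable from federatedml.param; the exact import style may vary by FATE version.

```python
# Hedged sketch: names are taken from __all__ above; import path is an assumption.
from federatedml.param import LogisticParam, InitParam

lr_param = LogisticParam(
    penalty="L2",
    optimizer="rmsprop",
    max_iter=100,
    init_param=InitParam(init_method="random_uniform"),
)
lr_param.check()  # each Param class validates its own fields
```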
Classes
PSIParam(max_bin_num=20, need_run=True, dense_missing_val=None, binning_error=consts.DEFAULT_RELATIVE_ERROR)
Bases: BaseParam

Source code in federatedml/param/psi_param.py

Instance attributes: max_bin_num, need_run, dense_missing_val, binning_error

Functions: check()
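As a minimal usage sketch (assuming the import path shown by the source location above), a PSIParam can be constructed with keyword arguments and validated with check(), which raises if a value is out of range:

```python
# Hedged sketch: assumes PSIParam is importable from the module path listed above.
from federatedml.param.psi_param import PSIParam

psi_param = PSIParam(max_bin_num=20, need_run=True)
psi_param.check()  # raises if any configured value is invalid
```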
HomoOneHotParam(transform_col_indexes=-1, transform_col_names=None, need_run=True, need_alignment=True)
Bases: BaseParam

Parameters:

- transform_col_indexes (default: -1): Specify which columns need to be calculated. -1 represents all columns.
- need_run (default: True): Indicate whether this module needs to be run.
- need_alignment (default: True): Indicate whether alignment of features is turned on.

Source code in federatedml/param/homo_onehot_encoder_param.py

Instance attributes: transform_col_indexes, transform_col_names, need_run, need_alignment

Functions: check()
DataIOParam(input_format='dense', delimitor=',', data_type='float64', exclusive_data_type=None, tag_with_value=False, tag_value_delimitor=':', missing_fill=False, default_value=0, missing_fill_method=None, missing_impute=None, outlier_replace=False, outlier_replace_method=None, outlier_impute=None, outlier_replace_value=0, with_label=False, label_name='y', label_type='int', output_format='dense', need_run=True)
Bases: BaseParam

Define dataio parameters used in federated ml.

Parameters:

- input_format (default: 'dense'): Please have a look at the tutorial in the "DataIO" section of federatedml/util/README.md. Formally, dense input format data should be set to "dense", svm-light input format data should be set to "sparse", and tag or tag:value input format data should be set to "tag".
- delimitor (str, default: ','): The delimiter of the data input.
- data_type (default: 'float64'): The data type of the data input.
- exclusive_data_type (dict, default: None): The key of the dict is a col_name and the value is a data_type; used to specify special data types for some features.
- tag_with_value (default: False): Used if input_format is 'tag'. If tag_with_value is True, the input column data format should be tag[delimitor]value, otherwise tag only.
- tag_value_delimitor (default: ':'): Used if input_format is 'tag' and tag_with_value is True; the delimiter of the tag[delimitor]value column value.
- missing_fill (bool, default: False): Whether missing values need to be filled; accepts only True/False.
- default_value (None or object or list, default: 0): The value to replace missing values. If None, the default value defined in federatedml/feature/imputer.py is used; if a single object, missing values are filled with this object; if a list, its length should equal the feature dimension of the input data, meaning that if some column happens to have missing values, they are replaced by the element at the identical position of this list.
- missing_fill_method (default: None): The method used to replace missing values.
- missing_impute (default: None): Defines which values are considered missing; elements of the list can be any type, or auto-generated if the value is None.
- outlier_replace (default: False): Whether outlier values need to be replaced; accepts only True/False.
- outlier_replace_method (default: None): The method used to replace outlier values.
- outlier_impute (default: None): Defines which values are regarded as outliers; elements of the list can be any type.
- outlier_replace_value (None or object or list, default: 0): The value to replace outliers. If None, the default value defined in federatedml/feature/imputer.py is used; if a single object, outliers are replaced with this object; if a list, its length should equal the feature dimension of the input data, meaning that if some column happens to have outliers, they are replaced by the element at the identical position of this list.
- with_label (bool, default: False): True if the input data contains a label, False otherwise.
- label_name (str, default: 'y'): Column name of the column where the label is located; only used in dense input format.
- label_type (default: 'int'): Used when with_label is True.
- output_format (default: 'dense'): Output format.

Source code in federatedml/param/dataio_param.py

Instance attributes: input_format, delimitor, data_type, exclusive_data_type, tag_with_value, tag_value_delimitor, missing_fill, default_value, missing_fill_method, missing_impute, outlier_replace, outlier_replace_method, outlier_impute, outlier_replace_value, with_label, label_name, label_type, output_format, need_run

Functions: check()
DataTransformParam(input_format='dense', delimitor=',', data_type='float64', exclusive_data_type=None, tag_with_value=False, tag_value_delimitor=':', missing_fill=False, default_value=0, missing_fill_method=None, missing_impute=None, outlier_replace=False, outlier_replace_method=None, outlier_impute=None, outlier_replace_value=0, with_label=False, label_name='y', label_type='int', output_format='dense', need_run=True, with_match_id=False, match_id_name='', match_id_index=0)
Bases: BaseParam

Define data transform parameters used in federated ml.

Parameters:

- input_format (default: 'dense'): Please have a look at the tutorial in the "DataTransform" section of federatedml/util/README.md. Formally, dense input format data should be set to "dense", svm-light input format data should be set to "sparse", and tag or tag:value input format data should be set to "tag". Note: in FATE version >= 1.9.0, this param can be used in uploading/binding data's meta.
- delimitor (str, default: ','): The delimiter of the data input.
- data_type ({'float64', 'float', 'int', 'int64', 'str', 'long'}, default: 'float64'): The data type of the data input.
- exclusive_data_type (dict, default: None): The key of the dict is a col_name and the value is a data_type; used to specify special data types for some features.
- tag_with_value (default: False): Used if input_format is 'tag'. If tag_with_value is True, the input column data format should be tag[delimitor]value, otherwise tag only.
- tag_value_delimitor (default: ':'): Used if input_format is 'tag' and tag_with_value is True; the delimiter of the tag[delimitor]value column value.
- missing_fill (bool, default: False): Whether missing values need to be filled; accepts only True/False.
- default_value (None or object or list, default: 0): The value to replace missing values. If None, the default value defined in federatedml/feature/imputer.py is used; if a single object, missing values are filled with this object; if a list, its length should equal the feature dimension of the input data, meaning that if some column happens to have missing values, they are replaced by the element at the identical position of this list.
- missing_fill_method (default: None): The method used to replace missing values; should be one of [None, 'min', 'max', 'mean', 'designated'].
- missing_impute (default: None): Defines which values are considered missing; elements of the list can be any type, or auto-generated if the value is None.
- outlier_replace (default: False): Whether outlier values need to be replaced; accepts only True/False.
- outlier_replace_method (default: None): The method used to replace outlier values; should be one of [None, 'min', 'max', 'mean', 'designated'].
- outlier_impute (default: None): Defines which values are regarded as outliers; elements of the list can be any type.
- outlier_replace_value (default: 0): The value to replace outliers. If None, the default value defined in federatedml/feature/imputer.py is used; if a single object, outliers are replaced with this object; if a list, its length should equal the feature dimension of the input data, meaning that if some column happens to have outliers, they are replaced by the element at the identical position of this list.
- with_label (bool, default: False): True if the input data contains a label, False otherwise. Note: in FATE version >= 1.9.0, this param can be used in uploading/binding data's meta.
- label_name (str, default: 'y'): Column name of the column where the label is located; only used in dense input format.
- label_type ({'int', 'int64', 'float', 'float64', 'long', 'str'}, default: 'int'): Used when with_label is True.
- output_format (default: 'dense'): Output format.
- with_match_id (default: False): True if the dataset has a match_id. Note: in FATE version >= 1.9.0, this param can be used in uploading/binding data's meta.
- match_id_name (default: ''): Valid if input_format is "dense" and multiple columns are considered as match_ids; the name of the match_id to be used in the current job. Note: in FATE version >= 1.9.0, this param can be used in uploading/binding data's meta.
- match_id_index (default: 0): Valid if input_format is "tag" or "sparse" and multiple columns are considered as match_ids; the index of the match_id. This param works only when the data meta has been set with uploading/binding.

Source code in federatedml/param/data_transform_param.py

Instance attributes: input_format, delimitor, data_type, exclusive_data_type, tag_with_value, tag_value_delimitor, missing_fill, default_value, missing_fill_method, missing_impute, outlier_replace, outlier_replace_method, outlier_impute, outlier_replace_value, with_label, label_name, label_type, output_format, need_run, with_match_id, match_id_name, match_id_index

Functions: check()
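A minimal configuration sketch for dense, labelled input with designated missing-value filling (assuming the import path shown by the source location above):

```python
# Hedged sketch: assumes DataTransformParam is importable from the module path above.
from federatedml.param.data_transform_param import DataTransformParam

dt_param = DataTransformParam(
    input_format="dense",        # CSV-style rows
    delimitor=",",
    with_label=True,
    label_name="y",
    missing_fill=True,
    missing_fill_method="designated",
    default_value=0,             # fill missing cells with 0
)
dt_param.check()
```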
FeldmanVerifiableSumParam(sum_cols=None, q_n=6)
Bases: BaseParam

Define how to transfer the cols.

Parameters:

- sum_cols (list of column indexes, default: None): Specify which columns need to be summed. If None, every column is summed.
- q_n (int, positive integer less than or equal to 16, default: 6): The number of significant decimal digits. If the data type is float, the maximum number of significant digits is 16; the sum of integer digits and significant decimal digits should be less than or equal to 16.

Source code in federatedml/param/feldman_verifiable_sum_param.py

Instance attributes: sum_cols, q_n

Functions: check()
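To make the q_n constraint concrete, here is a small, purely illustrative fixed-point encoding with q_n decimal digits, which is why integer digits plus q_n decimal digits must stay within 16 significant digits; this is not FATE's actual implementation.

```python
# Illustrative fixed-point encoding with Q_N decimal digits (not FATE internals).
Q_N = 6
SCALE = 10 ** Q_N

def encode(x: float) -> int:
    """Encode a float as an integer with Q_N decimal digits of precision."""
    return round(x * SCALE)

def decode(v: int) -> float:
    return v / SCALE

values = [3.141592, 2.718281, 1.414213]
total = decode(sum(encode(v) for v in values))
print(total)  # 7.274086
```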
InitParam(init_method='random_uniform', init_const=1, fit_intercept=True, random_seed=None)
Bases: BaseParam

Initialize parameters used in initializing a model.

Parameters:

- init_method (default: 'random_uniform'): Initialization method.
- init_const (int or float, default: 1): Required when init_method is 'const'; specifies the constant.
- fit_intercept (bool, default: True): Whether to initialize the intercept or not.

Source code in federatedml/param/init_model_param.py

Instance attributes: init_method, init_const, fit_intercept, random_seed

Functions: check()
SecureAddExampleParam(seed=None, partition=1, data_num=1000)
Bases: BaseParam

Source code in federatedml/param/secure_add_example_param.py

Instance attributes: seed, partition, data_num

Functions: check()
StochasticQuasiNewtonParam(update_interval_L=3, memory_M=5, sample_size=5000, random_seed=None)
Bases: BaseParam

Parameters used for the stochastic quasi-Newton method.

Parameters:

- update_interval_L (int, default: 3): How many iterations between updates of the Hessian matrix.
- memory_M (int, default: 5): Stack size of curvature information, i.e. y_k and s_k in the paper.
- sample_size (int, default: 5000): Sample size of the data used to update the Hessian matrix.

Source code in federatedml/param/sqn_param.py

Instance attributes: update_interval_L, memory_M, sample_size, random_seed

Functions: check()
EncryptParam(method=consts.PAILLIER, key_length=1024)
Bases: BaseParam

Define the encryption method used in federated ml.

Parameters:

- method (default: 'Paillier'): If method is 'Paillier', Paillier encryption is used for federated ml. To use the non-encryption version in HomoLR, set this to None. For details of Paillier encryption, please check out the paper mentioned in the README file.
- key_length (int, default: 1024): The length of the key used in this encryption method.

Source code in federatedml/param/encrypt_param.py

Instance attributes: method, key_length

Functions: check()
EncryptedModeCalculatorParam(mode='strict', re_encrypted_rate=1)
Bases: BaseParam

Define the encrypted_mode_calculator parameters.

Parameters:

- mode (default: 'strict'): Encrypted mode.
- re_encrypted_rate (default: 1): A numeric value in [0, 1]; used when mode equals 'balance'.

Source code in federatedml/param/encrypted_mode_calculation_param.py

Instance attributes: mode, re_encrypted_rate

Functions: check()
EvaluateParam(eval_type='binary', pos_label=1, need_run=True, metrics=None, run_clustering_arbiter_metric=False, unfold_multi_result=False)
Bases: BaseParam

Define the evaluation method for binary/multi-class classification and regression.

Parameters:

- eval_type (default: 'binary'): Supports 'binary' for HomoLR, HeteroLR and Secureboosting, and 'regression' for Secureboosting; 'multi' is not supported in these versions.
- unfold_multi_result (bool, default: False): Unfold the multi-class result and get several one-vs-rest binary classification results.
- pos_label (int or float or str, default: 1): Specify the positive label type, depending on the data's label; effective only for 'binary'.
- need_run (default: True): Indicate whether this module needs to be run.

Source code in federatedml/param/evaluation_param.py

Instance attributes: eval_type, pos_label, need_run, metrics, unfold_multi_result, run_clustering_arbiter_metric, default_metrics, allowed_metrics. Both default_metrics and allowed_metrics are set to {consts.BINARY: consts.ALL_BINARY_METRICS, consts.MULTY: consts.ALL_MULTI_METRICS, consts.REGRESSION: consts.ALL_REGRESSION_METRICS, consts.CLUSTERING: consts.ALL_CLUSTER_METRICS}.

Functions: check(), check_single_value_default_metric()
KmeansParam(k=5, max_iter=300, tol=0.001, random_stat=None)
Bases: BaseParam

Parameters:

- k (int, default: 5): The number of centroids to generate; should be larger than 1 and less than 100 in this version.
- max_iter (int, default: 300): Maximum number of iterations of the hetero-k-means algorithm to run.
- tol (float, default: 0.001): Convergence tolerance.
- random_stat (None or int, default: None): Random seed.

Source code in federatedml/param/hetero_kmeans_param.py

Instance attributes: k, max_iter, tol, random_stat

Functions: check()
PearsonParam(column_names=None, column_indexes=None, cross_parties=True, need_run=True, use_mix_rand=False, calc_local_vif=True)
Bases: BaseParam

Parameters for Pearson correlation.

Parameters:

- column_names (list of string, default: None): List of column names.
- column_indexes (list of int, default: None): List of column indexes.
- cross_parties (bool, default: True): If True, calculate the correlation of columns from both parties.
- need_run (bool, default: True): Set False to skip this party.
- use_mix_rand (bool, default: False): Mix system random and pseudo random for quicker calculation.
- calc_local_vif (bool, default: True): Calculate VIF for local columns.

Source code in federatedml/param/pearson_param.py

Instance attributes: column_names, column_indexes, cross_parties, need_run, use_mix_rand, calc_local_vif

Functions: check()
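For intuition about what this component computes, here is a small local-only sketch in plain NumPy (not the federated implementation): the Pearson correlation matrix of a feature block, and the local VIF values that calc_local_vif refers to, using the standard identity that for standardized features VIF_i equals the i-th diagonal entry of the inverse correlation matrix.

```python
# Local-only illustration with NumPy; not FATE code.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
X[:, 2] = 0.7 * X[:, 0] + 0.3 * rng.normal(size=500)  # introduce correlation

R = np.corrcoef(X, rowvar=False)      # Pearson correlation matrix
vif = np.diag(np.linalg.inv(R))       # VIF_i = [R^-1]_{ii} for standardized features

print(np.round(R, 3))
print(np.round(vif, 3))
```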
PositiveUnlabeledParam(strategy='probability', threshold=0.9)
Bases: BaseParam

Parameters used for positive unlabeled learning.

Parameters:

- strategy ({"probability", "quantity", "proportion", "distribution"}, default: "probability"): The strategy for converting unlabeled values.
- threshold (int or float, default: 0.9): The threshold used in the labeling strategy.

Source code in federatedml/param/positive_unlabeled_param.py

Instance attributes: strategy, threshold

Functions: check()
SampleParam(mode='random', method='downsample', fractions=None, random_state=None, task_type='hetero', need_run=True)
Bases: BaseParam

Define the sample method.

Parameters:

- mode (default: 'random'): Specify the sample mode to use.
- fractions (None or float or list, default: None): If mode equals 'random', this should be a float number greater than 0; otherwise a list of [label_i, sample_rate_i] pairs, e.g. [[0, 0.5], [1, 0.8], [2, 0.3]].
- random_state (int, RandomState instance or None, default: None): Random state.
- need_run (bool, default: True): Indicate whether this module needs to be run.

Source code in federatedml/param/sample_param.py

Instance attributes: mode, method, fractions, random_state, task_type, need_run

Functions: check()
ScaleParam(method='standard_scale', mode='normal', scale_col_indexes=-1, scale_names=None, feat_upper=None, feat_lower=None, with_mean=True, with_std=True, need_run=True)
Bases: BaseParam

Define the feature scale parameters.

Parameters:

- method (default: "standard_scale"): Like scale in sklearn; currently supports "min_max_scale" and "standard_scale", and will support other scale methods soon.
- mode (default: "normal"): When mode is "normal", feat_upper and feat_lower are plain values like "10" or "3.1"; for "cap", feat_upper and feat_lower are between 0 and 1, which means the percentile of the column.
- feat_upper (int or float or list of int or float, default: None): The upper limit of the column. If a list is used, mode must be "normal" and the list length should equal the number of features to scale. If the scaled value is larger than feat_upper, it is set to feat_upper.
- feat_lower (default: None): The lower limit of the column. If a list is used, mode must be "normal" and the list length should equal the number of features to scale. If the scaled value is less than feat_lower, it is set to feat_lower.
- scale_col_indexes (default: -1): Columns whose index is in scale_col_indexes are scaled; columns not in it are not scaled.
- scale_names (list of string, default: None): Specify which columns need to be scaled; each element in the list represents a column name in the header. Default: [].
- with_mean (bool, default: True): Used for "standard_scale".
- with_std (bool, default: True): Used for "standard_scale". The standard scale of column x is calculated as z = (x - u) / s, where u is the mean of the column and s is the standard deviation of the column. If with_mean is False, u is 0; if with_std is False, s is 1.
- need_run (bool, default: True): Indicate whether this module needs to be run.

Source code in federatedml/param/scale_param.py

Instance attributes: scale_names (set to [] if None), method, mode, feat_upper, feat_lower, scale_col_indexes, with_mean, with_std, need_run

Functions: check()
DataSplitParam(random_state=None, test_size=None, train_size=None, validate_size=None, stratified=False, shuffle=True, split_points=None, need_run=True)
Bases: BaseParam

Define data split parameters used in data split.

Parameters:

- random_state (None or int, default: None): Specify the random state for shuffle.
- test_size (float or int or None, default: None): Specify the test set size. A float value specifies the fraction of the input data set; an int value specifies the exact number of data instances.
- train_size (float or int or None, default: None): Specify the train set size. A float value specifies the fraction of the input data set; an int value specifies the exact number of data instances.
- validate_size (float or int or None, default: None): Specify the validate set size. A float value specifies the fraction of the input data set; an int value specifies the exact number of data instances.
- stratified (bool, default: False): Define whether sampling should be stratified according to label value.
- shuffle (bool, default: True): Define whether to shuffle before splitting or not.
- split_points (None or list, default: None): Specify the point(s) by which continuous label values are bucketed into bins for stratified split, e.g. [0.2] for two bins or [0.1, 1, 3] for 4 bins.
- need_run (default: True): Specify whether to run data split.

Source code in federatedml/param/data_split_param.py

Instance attributes: random_state, test_size, train_size, validate_size, stratified, shuffle, split_points, need_run

Functions: check()
OneVsRestParam(need_one_vs_rest=False, has_arbiter=True)
Bases: BaseParam

Define the one_vs_rest parameters.

Parameters:

- has_arbiter (default: True): Some algorithms, for instance secureboost of FATE, may not have an arbiter; for these algorithms this should be set to False.

Source code in federatedml/param/one_vs_rest_param.py

Instance attributes: need_one_vs_rest, has_arbiter

Functions: check()
SampleWeightParam(class_weight=None, sample_weight_name=None, normalize=False, need_run=True)
Bases: BaseParam

Define sample weight parameters.

Parameters:

- class_weight (str or dict or None, default: None): Class weight dictionary or class weight computation mode; the string value only accepts 'balanced'. If a dict is provided, keys should be classes (labels) and the weights will not be normalized, e.g. {'0': 1, '1': 2}. If both class_weight and sample_weight_name are None, the original input data is returned.
- sample_weight_name (str, default: None): Name of the column that specifies the sample weight. If both class_weight and sample_weight_name are None, the original input data is returned.
- normalize (bool, default: False): Whether to normalize the extracted sample weights.
- need_run (bool, default: True): Whether to run this module or not.

Source code in federatedml/param/sample_weight_param.py

Instance attributes: class_weight, sample_weight_name, normalize, need_run

Functions: check()
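As a point of reference for the 'balanced' mode, class weights are commonly computed as n_samples / (n_classes * class_count), as in scikit-learn. The sketch below is illustrative only and is not necessarily FATE's exact formula.

```python
# Illustrative 'balanced' class-weight computation (sklearn-style), not FATE internals.
from collections import Counter

labels = [0, 0, 0, 0, 1, 1]
counts = Counter(labels)
n_samples, n_classes = len(labels), len(counts)

weights = {c: n_samples / (n_classes * cnt) for c, cnt in counts.items()}
print(weights)  # {0: 0.75, 1: 1.5}
```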
StepwiseParam(score_name='AIC', mode=consts.HETERO, role=consts.GUEST, direction='both', max_step=10, nvmin=2, nvmax=None, need_stepwise=False)
Bases: BaseParam

Define stepwise parameters.

Parameters:

- score_name (default: 'AIC'): Specify which model selection criterion to use.
- mode (default: consts.HETERO): Indicate the mode of the current task.
- role (default: consts.GUEST): Indicate the role of the current party.
- direction (default: 'both'): Indicate which direction to go for stepwise. 'forward' means forward selection; 'backward' means elimination; 'both' means possible models of both directions are examined at each step.
- max_step (default: 10): Specify the total number of steps to run before a forced stop.
- nvmin (default: 2): Specify the minimum subset size of the final model; cannot be lower than 2. When nvmin > 2, the final model size may be smaller than nvmin due to the max_step limit.
- nvmax (default: None): Specify the maximum subset size of the final model, 2 <= nvmin <= nvmax. The final model size may be larger than nvmax due to the max_step limit.
- need_stepwise (default: False): Indicate whether this module needs to be run.

Source code in federatedml/param/stepwise_param.py

Instance attributes: score_name, mode, role, direction, max_step, nvmin, nvmax, need_stepwise

Functions: check()
UnionParam(need_run=True, allow_missing=False, keep_duplicate=False)
Bases: BaseParam

Define the union method for combining multiple dTables while keeping entries with the same id.

Parameters:

- need_run (default: True): Indicate whether this module needs to be run.
- allow_missing (default: False): Whether to allow a mismatch between feature length and header length in the result. Note that empty tables are always skipped regardless of this setting.
- keep_duplicate (default: False): Whether to keep entries with duplicated keys. If True, a new id is generated for each duplicated entry in the format {id}_{table_name}.

Source code in federatedml/param/union_param.py

Instance attributes: need_run, allow_missing, keep_duplicate

Functions: check()
ColumnExpandParam(append_header=None, method='manual', fill_value=consts.FLOAT_ZERO, need_run=True)
Bases: BaseParam

Define the method used for expanding columns.

Parameters:

- append_header (None or str or List[str], default: None): Name(s) for the appended feature(s). If None, the module outputs the original input value without any operation.
- method (str, default: 'manual'): If method is 'manual', the user-specified fill_value is used.
- fill_value (int or float or str or List[int] or List[float] or List[str], default: consts.FLOAT_ZERO): Used for filling the expanded feature columns. If a list is given, its length must match that of append_header.
- need_run (default: True): Indicate whether this module needs to be run.

Source code in federatedml/param/column_expand_param.py

Instance attributes: append_header, method, fill_value, need_run

Functions: check()
CrossValidationParam(n_splits=5, mode=consts.HETERO, role=consts.GUEST, shuffle=True, random_seed=1, need_cv=False, output_fold_history=True, history_value_type='score')
Bases: BaseParam

Define cross validation parameters.

Parameters:

- n_splits (default: 5): Specify how many splits are used in KFold.
- mode (default: consts.HETERO): Indicate the mode of the current task.
- role (default: consts.GUEST): Indicate the role of the current party.
- shuffle (default: True): Define whether to shuffle before KFold or not.
- random_seed (default: 1): Specify the random seed for the numpy shuffle.
- need_cv (default: False): Indicate whether this module needs to be run.
- output_fold_history (default: True): Indicate whether to output a table of the ids used by each fold; otherwise the original input data is returned. Returned ids are formatted as {original_id}#fold{fold_num}#{train/validate}.
- history_value_type (default: 'score'): Indicate whether to include the original instance or the predict score in the output fold history; only effective when output_fold_history is set to True.

Source code in federatedml/param/cross_validation_param.py

Instance attributes: n_splits, mode, role, shuffle, random_seed, need_cv, output_fold_history, history_value_type

Functions: check()
ScorecardParam(method='credit', offset=500, factor=20, factor_base=2, upper_limit_ratio=3, lower_limit_value=0, need_run=True)
Bases: BaseParam

Define the method used for transforming prediction scores into credit scores.

Parameters:

- method (default: "credit"): Score method; currently only supports "credit".
- offset (int or float, default: 500): Score baseline.
- factor (int or float, default: 20): Scoring step; when the odds double, the result score increases by this factor.
- factor_base (int or float, default: 2): Factor base; ln(factor_base) is used for calculating the result score.
- upper_limit_ratio (int or float, default: 3): Upper bound for odds; the credit score upper bound is upper_limit_ratio * offset.
- lower_limit_value (int or float, default: 0): Lower bound for the result score.
- need_run (bool, default: True): Indicate whether this module needs to be run.

Source code in federatedml/param/scorecard_param.py

Instance attributes: method, offset, factor, factor_base, upper_limit_ratio, lower_limit_value, need_run

Functions: check()
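Reading the parameters above together, a plausible form of the transform (an illustration of the parameter semantics, not necessarily the component's exact implementation) is score = offset + factor * ln(odds) / ln(factor_base), clipped to [lower_limit_value, upper_limit_ratio * offset], with the odds derived from the predicted probability:

```python
# Illustrative scorecard transform based on the parameter descriptions above;
# not guaranteed to match the component's exact formula.
import math

def credit_score(p, offset=500, factor=20, factor_base=2,
                 upper_limit_ratio=3, lower_limit_value=0):
    odds = p / (1 - p)  # odds from the predicted probability
    score = offset + factor * math.log(odds) / math.log(factor_base)
    return min(max(score, lower_limit_value), upper_limit_ratio * offset)

print(round(credit_score(0.5), 2))  # 500.0 (odds = 1)
print(round(credit_score(0.8), 2))  # 540.0 (odds = 4 -> two doublings -> +2 * factor)
```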
LocalBaselineParam(model_name='LogisticRegression', model_opts=None, predict_param=PredictParam(), need_run=True)
Bases: BaseParam

Define the local baseline model parameters.

Parameters:

- model_name (str, default: 'LogisticRegression'): The sklearn model used to train the baseline model.
- model_opts (dict or None, default: None): Parameters to be used as input to the baseline model.
- predict_param (PredictParam object, default: PredictParam()): Predict param.
- need_run (default: True): Indicate whether this module needs to be run.

Source code in federatedml/param/local_baseline_param.py

Instance attributes: model_name, model_opts, predict_param (deep copy), need_run

Functions: check()
PredictParam(threshold=0.5)
Bases: BaseParam

Define the predict method of HomoLR, HeteroLR and SecureBoosting.

Parameters:

- threshold (default: 0.5): The threshold used to separate the positive and negative classes; normally it should be in (0, 1).

Source code in federatedml/param/predict_param.py

Instance attributes: threshold

Functions: check()
SecureInformationRetrievalParam(security_level=0.5, oblivious_transfer_protocol=consts.OT_HAUCK, commutative_encryption=consts.CE_PH, non_committing_encryption=consts.AES, key_size=consts.DEFAULT_KEY_LENGTH, dh_params=DHParam(), raw_retrieval=False, target_cols=None)
Bases: BaseParam

Parameters:

- security_level (default: 0.5): Security level; should be a value in [0, 1]. A security_level of 0.0 means raw data retrieval.
- oblivious_transfer_protocol (default: consts.OT_HAUCK): OT type; only supports OT_Hauck.
- commutative_encryption (default: "CommutativeEncryptionPohligHellman"): The commutative encryption scheme used.
- non_committing_encryption (default: "aes"): The non-committing encryption scheme used.
- dh_params (default: DHParam()): Params for Pohlig-Hellman encryption.
- key_size (default: consts.DEFAULT_KEY_LENGTH): The key length of the commutative cipher; note that this param will be deprecated in the future, please specify key_length in PHParam instead.
- raw_retrieval (default: False): Perform raw retrieval if raw_retrieval is set.
- target_cols (default: None): Target columns to retrieve; any values not retrieved will be marked as "unretrieved". If target_cols is None, the label will be retrieved, the same behavior as in previous versions.

Source code in federatedml/param/sir_param.py

Instance attributes: security_level, oblivious_transfer_protocol, commutative_encryption, non_committing_encryption, dh_params, key_size, raw_retrieval, target_cols

Functions: check()
StatisticsParam(statistics='summary', column_names=None, column_indexes=-1, need_run=True, abnormal_list=None, quantile_error=consts.DEFAULT_RELATIVE_ERROR, bias=True)
Bases: BaseParam

Define statistics parameters.

Parameters:

- statistics (default: 'summary'): Specify the statistic types to be computed. "summary" represents the list [consts.SUM, consts.MEAN, consts.STANDARD_DEVIATION, consts.MEDIAN, consts.MIN, consts.MAX, consts.MISSING_COUNT, consts.SKEWNESS, consts.KURTOSIS].
- column_names (default: None): Specify the columns used for statistic computation by column name in the header.
- column_indexes (default: -1): Specify the columns used for statistic computation by column order in the header; -1 indicates computing statistics over all columns.
- bias (default: True): If False, the calculations of skewness and kurtosis are corrected for statistical bias.
- need_run (default: True): Indicate whether to run this module.

Source code in federatedml/param/statistics_param.py

Class attributes: LEGAL_STAT = [consts.COUNT, consts.SUM, consts.MEAN, consts.STANDARD_DEVIATION, consts.MEDIAN, consts.MIN, consts.MAX, consts.VARIANCE, consts.COEFFICIENT_OF_VARIATION, consts.MISSING_COUNT, consts.MISSING_RATIO, consts.SKEWNESS, consts.KURTOSIS]; BASIC_STAT = [consts.SUM, consts.MEAN, consts.STANDARD_DEVIATION, consts.MEDIAN, consts.MIN, consts.MAX, consts.MISSING_RATIO, consts.MISSING_COUNT, consts.SKEWNESS, consts.KURTOSIS, consts.COEFFICIENT_OF_VARIATION]; LEGAL_QUANTILE = re.compile('^(100)|([1-9]?[0-9])%$')

Instance attributes: statistics, column_names, column_indexes, abnormal_list, need_run, quantile_error, bias

Functions: find_stat_name_match(stat_name) (staticmethod), check()
EncodeParam(salt='', encode_method='none', base64=False)
Bases: BaseParam

Define the hash method for the raw intersect method.

Parameters:

- salt (default: ''): The src id is hashed as str = str + salt; default is the empty string.
- encode_method (default: 'none'): The hash method for the src id; supports md5, sha1, sha224, sha256, sha384, sha512 and sm3.
- base64 (default: False): If True, the hash result is converted to base64.

Source code in federatedml/param/intersect_param.py

Instance attributes: salt, encode_method, base64

Functions: check()
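To illustrate the salt/encode_method/base64 semantics described above, here is a local sketch with hashlib; it mirrors the parameter meanings but is not the intersect component itself (sm3 is omitted since it is not in the standard library).

```python
# Local illustration of salted hashing with optional base64 output.
import base64
import hashlib

def encode_id(src_id: str, salt: str = "", encode_method: str = "sha256",
              use_base64: bool = False) -> str:
    digest = hashlib.new(encode_method, (src_id + salt).encode()).digest()
    return base64.b64encode(digest).decode() if use_base64 else digest.hex()

print(encode_id("alice_001", salt="fate", encode_method="sha256"))
print(encode_id("alice_001", salt="fate", encode_method="md5", use_base64=True))
```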
PoissonParam(penalty='L2', tol=0.0001, alpha=1.0, optimizer='rmsprop', batch_size=-1, learning_rate=0.01, init_param=InitParam(), max_iter=20, early_stop='diff', exposure_colname=None, encrypt_param=EncryptParam(), encrypted_mode_calculator_param=EncryptedModeCalculatorParam(), cv_param=CrossValidationParam(), stepwise_param=StepwiseParam(), decay=1, decay_sqrt=True, validation_freqs=None, early_stopping_rounds=None, metrics=None, use_first_metric_only=False, floating_point_precision=23, callback_param=CallbackParam())
Bases: LinearModelParam

Parameters used for Poisson Regression.

Parameters:

- penalty (default: 'L2'): Penalty method used in Poisson regression. Please note that, when using the encrypted version in HeteroPoisson, 'L1' is not supported.
- tol (float, default: 0.0001): The tolerance of convergence.
- alpha (float, default: 1.0): Regularization strength coefficient.
- optimizer (default: 'rmsprop'): Optimization method.
- batch_size (int, default: -1): Batch size when updating the model; -1 means use all data in a batch, i.e. do not use the mini-batch strategy.
- learning_rate (float, default: 0.01): Learning rate.
- max_iter (int, default: 20): The maximum number of iterations for training.
- init_param (default: InitParam()): Init param method object.
- early_stop (str, 'weight_diff', 'diff' or 'abs', default: 'diff'): Method used to judge convergence. a) diff: use the difference of loss between two iterations. b) weight_diff: use the difference between the weights of two consecutive iterations. c) abs: use the absolute value of the loss, i.e. converged if loss < eps.
- exposure_colname (default: None): Name of the optional exposure variable in the dTable.
- encrypt_param (default: EncryptParam()): Encrypt param.
- encrypted_mode_calculator_param (default: EncryptedModeCalculatorParam()): Encrypted mode calculator param.
- cv_param (default: CrossValidationParam()): CV param.
- stepwise_param (default: StepwiseParam()): Stepwise param.
- decay (default: 1): Decay rate for the learning rate. The learning rate follows lr = lr0 / (1 + decay * t) if decay_sqrt is False; if decay_sqrt is True, lr = lr0 / sqrt(1 + decay * t), where t is the iteration number.
- decay_sqrt (default: True): lr = lr0 / (1 + decay * t) if decay_sqrt is False, otherwise lr = lr0 / sqrt(1 + decay * t).
- validation_freqs (default: None): Validation frequency during training; required when using early stopping. The default value is None; 1 is suggested. It can be set to a number larger than 1 to speed up training by skipping validation rounds. When it is larger than 1, a number divisible into "max_iter" is recommended; otherwise the validation scores of the last training iteration will be missed.
- early_stopping_rounds (default: None): If a positive number is specified, the program checks the early stopping criteria every that many training rounds. validation_freqs must also be set when using early stopping.
- metrics (default: None): Specify which metrics are used for evaluation during training. If the metrics have not improved within early_stopping rounds, training stops before convergence. If set as empty, default metrics are used; for regression tasks, the defaults are ['root_mean_squared_error', 'mean_absolute_error'].
- use_first_metric_only (default: False): Indicate whether to use only the first metric for early stopping.
- floating_point_precision (default: 23): If not None, use floating_point_precision bits to speed up calculation, e.g. convert an x to round(x * 2**floating_point_precision) during Paillier operations and divide the result by 2**floating_point_precision at the end.
- callback_param (default: CallbackParam()): Callback param.

Source code in federatedml/param/poisson_regression_param.py

Instance attributes: encrypted_mode_calculator_param (deep copy), exposure_colname

Functions: check()
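The decay/decay_sqrt schedule described above (lr = lr0 / (1 + decay * t), or lr0 / sqrt(1 + decay * t) when decay_sqrt is True) can be made concrete with a few lines:

```python
# Worked example of the learning-rate decay schedule described in the parameter docs.
import math

def decayed_lr(lr0: float, decay: float, t: int, decay_sqrt: bool = True) -> float:
    factor = 1 + decay * t
    return lr0 / math.sqrt(factor) if decay_sqrt else lr0 / factor

lr0, decay = 0.01, 1
for t in range(4):
    print(t, round(decayed_lr(lr0, decay, t), 5), round(decayed_lr(lr0, decay, t, False), 5))
# t=0 -> 0.01 for both; t=3 -> 0.005 (sqrt decay) vs 0.0025 (linear decay)
```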
LinearParam(penalty='L2', tol=0.0001, alpha=1.0, optimizer='sgd', batch_size=-1, learning_rate=0.01, init_param=InitParam(), max_iter=20, early_stop='diff', encrypt_param=EncryptParam(), sqn_param=StochasticQuasiNewtonParam(), encrypted_mode_calculator_param=EncryptedModeCalculatorParam(), cv_param=CrossValidationParam(), decay=1, decay_sqrt=True, validation_freqs=None, early_stopping_rounds=None, stepwise_param=StepwiseParam(), metrics=None, use_first_metric_only=False, floating_point_precision=23, callback_param=CallbackParam())
Bases: LinearModelParam

Parameters used for Linear Regression.

Parameters:

- penalty ('L2' or 'L1', default: 'L2'): Penalty method used in LinR. Please note that, when using the encrypted version in HeteroLinR, 'L1' is not supported; when using Homo-LR, 'L1' is not supported.
- tol (float, default: 0.0001): The tolerance of convergence.
- alpha (float, default: 1.0): Regularization strength coefficient.
- optimizer (default: 'sgd'): Optimization method.
- batch_size (int, default: -1): Batch size when updating the model; -1 means use all data in a batch, i.e. do not use the mini-batch strategy.
- learning_rate (float, default: 0.01): Learning rate.
- max_iter (int, default: 20): The maximum number of iterations for training.
- init_param (default: InitParam()): Init param method object.
- early_stop (default: 'diff'): Method used to judge convergence. a) diff: use the difference of loss between two iterations. b) abs: use the absolute value of the loss, i.e. converged if loss < tol. c) weight_diff: use the difference between the weights of two consecutive iterations.
- encrypt_param (default: EncryptParam()): Encrypt param.
- sqn_param (default: StochasticQuasiNewtonParam()): Stochastic quasi-Newton param.
- encrypted_mode_calculator_param (default: EncryptedModeCalculatorParam()): Encrypted mode calculator param.
- cv_param (default: CrossValidationParam()): CV param.
- stepwise_param (default: StepwiseParam()): Stepwise param.
- decay (default: 1): Decay rate for the learning rate. The learning rate follows lr = lr0 / (1 + decay * t) if decay_sqrt is False; if decay_sqrt is True, lr = lr0 / sqrt(1 + decay * t), where t is the iteration number.
- decay_sqrt (default: True): lr = lr0 / (1 + decay * t) if decay_sqrt is False, otherwise lr = lr0 / sqrt(1 + decay * t).
- validation_freqs (default: None): Validation frequency during training; required when using early stopping. The default value is None; 1 is suggested. It can be set to a number larger than 1 to speed up training by skipping validation rounds. When it is larger than 1, a number divisible into "max_iter" is recommended; otherwise the validation scores of the last training iteration will be missed.
- early_stopping_rounds (default: None): If a positive number is specified, the program checks the early stopping criteria every that many training rounds. validation_freqs must also be set when using early stopping.
- metrics (default: None): Specify which metrics are used for evaluation during training. If the metrics have not improved within early_stopping rounds, training stops before convergence. If set as empty, default metrics are used; for regression tasks, the defaults are ['root_mean_squared_error', 'mean_absolute_error'].
- use_first_metric_only (default: False): Indicate whether to use only the first metric for early stopping.
- floating_point_precision (default: 23): If not None, use floating_point_precision bits to speed up calculation, e.g. convert an x to round(x * 2**floating_point_precision) during Paillier operations and divide the result by 2**floating_point_precision at the end.
- callback_param (default: CallbackParam()): Callback param.

Source code in federatedml/param/linear_regression_param.py

Instance attributes: sqn_param (deep copy), encrypted_mode_calculator_param (deep copy)

Functions: check()
LogisticParam(penalty='L2', tol=0.0001, alpha=1.0, optimizer='rmsprop', batch_size=-1, shuffle=True, batch_strategy='full', masked_rate=5, learning_rate=0.01, init_param=InitParam(), max_iter=100, early_stop='diff', encrypt_param=EncryptParam(), predict_param=PredictParam(), cv_param=CrossValidationParam(), decay=1, decay_sqrt=True, multi_class='ovr', validation_freqs=None, early_stopping_rounds=None, stepwise_param=StepwiseParam(), floating_point_precision=23, metrics=None, use_first_metric_only=False, callback_param=CallbackParam())
¶
Bases: LinearModelParam
Parameters used for Logistic Regression both for Homo mode or Hetero mode.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
penalty | | Penalty method used in LR. Please note that when using the encrypted version in HomoLR, 'L1' is not supported. | 'L2' |
tol | float | The tolerance of convergence. | 0.0001 |
alpha | float | Regularization strength coefficient. | 1.0 |
optimizer | | Optimization method. | 'rmsprop' |
batch_strategy | str | Strategy used to generate batch data. a) full: use the full data to generate batches; the number of batches per iteration is ceil(data_size / batch_size). b) random: select data randomly from the full data; the number of batches per iteration is 1. | 'full' |
batch_size | int | Batch size when updating the model. -1 means use all data in one batch, i.e. do not use the mini-batch strategy. | -1 |
shuffle | bool | Works only in hetero logistic regression; batch data will be shuffled in every iteration. | True |
masked_rate | | Use masked data to enhance the security of hetero logistic regression. | 5 |
learning_rate | float | Learning rate. | 0.01 |
max_iter | int | The maximum number of iterations for training. | 100 |
early_stop | | Method used to judge convergence (see the convergence sketch at the end of this entry). a) diff: use the difference of loss between two iterations. b) weight_diff: use the difference between the weights of two consecutive iterations. c) abs: use the absolute value of loss, i.e. if loss < eps, the model has converged. Please note that for the hetero-LR multi-host situation, this parameter supports "weight_diff" only; in homo-LR, weight_diff is not supported. | 'diff' |
decay | | Decay rate for the learning rate. The learning rate follows this decay schedule: lr = lr0 / (1 + decay * t) if decay_sqrt is False, otherwise lr = lr0 / sqrt(1 + decay * t), where t is the iteration number. | 1 |
decay_sqrt | | lr = lr0 / (1 + decay * t) if decay_sqrt is False, otherwise lr = lr0 / sqrt(1 + decay * t). | True |
encrypt_param | | Encrypt param. | EncryptParam() |
predict_param | | Predict param. | PredictParam() |
callback_param | | Callback param. | CallbackParam() |
cv_param | | Cross-validation param. | CrossValidationParam() |
multi_class | | If it is a multi-class task, indicate which strategy to use. Currently only 'ovr', short for one_vs_rest, is supported. | 'ovr' |
validation_freqs | | Validation frequency during training. | None |
early_stopping_rounds | | Training stops if no metric improves in the last early_stopping_rounds rounds. | None |
metrics | | Indicate which metrics to use when executing evaluation during training. If set as empty, default metrics for the specific task type will be used. For binary classification, the default metrics are ['auc', 'ks']. | None |
use_first_metric_only | | Indicate whether to use only the first metric for early stopping judgement. | False |
floating_point_precision | | If not None, use floating_point_precision bits to speed up calculation, e.g. convert an x to round(x * 2**floating_point_precision) during Paillier operations and divide the result by 2**floating_point_precision at the end (see the fixed-point sketch after this table). | 23 |
Source code in federatedml/param/logistic_regression_param.py, lines 103-141
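The floating_point_precision trick described in the table is essentially a fixed-point encoding around Paillier's integer-only arithmetic. A minimal sketch, using plain Python and ignoring the actual encryption step (not FATE's encrypted-mode calculator code):

```python
PRECISION_BITS = 23  # matches the default floating_point_precision above

def encode(x: float, bits: int = PRECISION_BITS) -> int:
    # Scale the float to a fixed-point integer so integer-only Paillier arithmetic can handle it.
    return round(x * 2 ** bits)

def decode(n: int, bits: int = PRECISION_BITS) -> float:
    # Undo the scaling once the (homomorphic) arithmetic is done.
    return n / 2 ** bits

# Adding two encoded values and decoding recovers the sum up to roughly 2**-23 precision.
a, b = 0.123456, 7.654321
assert abs(decode(encode(a) + encode(b)) - (a + b)) < 2 ** -PRECISION_BITS
```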
Attributes¶
Instance attributes (as assigned in the constructor):
- penalty = penalty
- tol = tol
- alpha = alpha
- optimizer = optimizer
- batch_size = batch_size
- learning_rate = learning_rate
- init_param = copy.deepcopy(init_param)
- max_iter = max_iter
- early_stop = early_stop
- encrypt_param = encrypt_param
- shuffle = shuffle
- batch_strategy = batch_strategy
- masked_rate = masked_rate
- predict_param = copy.deepcopy(predict_param)
- cv_param = copy.deepcopy(cv_param)
- decay = decay
- decay_sqrt = decay_sqrt
- multi_class = multi_class
- validation_freqs = validation_freqs
- stepwise_param = copy.deepcopy(stepwise_param)
- early_stopping_rounds = early_stopping_rounds
- metrics = metrics or []
- use_first_metric_only = use_first_metric_only
- floating_point_precision = floating_point_precision
- callback_param = copy.deepcopy(callback_param)
Functions¶
check()
Source code in federatedml/param/logistic_regression_param.py, lines 143-160
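The three early_stop strategies listed above ('diff', 'abs', 'weight_diff') amount to the following convergence test, shown here as an illustrative sketch (numpy assumed for the weight norm; this is not FATE's converge-function implementation):

```python
import numpy as np

def has_converged(early_stop: str, loss_history, prev_weights, curr_weights, tol: float) -> bool:
    """Illustrative convergence check for the 'diff', 'abs' and 'weight_diff' strategies."""
    if early_stop == "diff":
        # converged when the loss barely changes between two iterations
        return len(loss_history) >= 2 and abs(loss_history[-1] - loss_history[-2]) < tol
    if early_stop == "abs":
        # converged when the loss itself is small enough
        return abs(loss_history[-1]) < tol
    if early_stop == "weight_diff":
        # converged when the weight vector barely moves between two iterations
        return np.linalg.norm(curr_weights - prev_weights) < tol
    raise ValueError(f"unknown early_stop method: {early_stop}")

print(has_converged("diff", [0.693, 0.51, 0.5099], None, None, tol=1e-3))  # True
```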
ObjectiveParam(objective='cross_entropy', params=None)
Bases: BaseParam
Define objective parameters used in federated ml.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
objective | | None in host's config; should be str in guest's config. When task_type is classification, only 'cross_entropy' is supported; the other 6 objective types are supported for regression tasks. | None |
params | None or list | Should be a non-empty list when objective is 'tweedie', 'fair' or 'huber'. When objective is 'fair' or 'huber', the first element of the list should be a float larger than 0.0; when objective is 'tweedie', the first element should be a float in [1.0, 2.0) (see the validation sketch after this table). | None |
Source code in federatedml/param/boosting_param.py, lines 50-52
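A sketch of the validation implied by the params description above, assuming exactly the rules as stated (FATE's own check() in boosting_param.py is authoritative and may report errors differently):

```python
def check_objective_params(objective: str, params) -> None:
    """Validate `params` for the objectives that require an extra argument, per the rules above."""
    if objective not in ("tweedie", "fair", "huber"):
        return  # other objectives take no extra params
    if not isinstance(params, list) or not params:
        raise ValueError(f"objective '{objective}' requires a non-empty params list")
    first = float(params[0])
    if objective in ("fair", "huber") and first <= 0.0:
        raise ValueError(f"first element of params must be > 0.0 for '{objective}'")
    if objective == "tweedie" and not (1.0 <= first < 2.0):
        raise ValueError("first element of params must lie in [1.0, 2.0) for 'tweedie'")

check_objective_params("tweedie", [1.5])   # ok
check_objective_params("huber", [0.8])     # ok
```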
Attributes¶
Instance attributes (as assigned in the constructor):
- objective = objective
- params = params
Functions¶
check(task_type=None)
Source code in federatedml/param/boosting_param.py, lines 54-94
FTLParam(alpha=1, tol=1e-06, n_iter_no_change=False, validation_freqs=None, optimizer={'optimizer': 'Adam', 'learning_rate': 0.01}, nn_define={}, epochs=1, intersect_param=IntersectParam(consts.RSA), config_type='keras', batch_size=-1, encrypte_param=EncryptParam(), encrypted_mode_calculator_param=EncryptedModeCalculatorParam(mode='confusion_opt'), predict_param=PredictParam(), mode='plain', communication_efficient=False, local_round=5, callback_param=CallbackParam())
Bases: BaseParam
Parameters:
Name | Type | Description | Default |
---|---|---|---|
alpha | float | A loss coefficient defined in the paper; it defines the importance of the alignment loss. | 1 |
tol | float | Loss tolerance. | 1e-06 |
n_iter_no_change | bool | Whether to check loss convergence. | False |
validation_freqs | None, positive integer or container object in python | Whether to do validation during training. If None, no validation is done in the training process; if a positive integer, data is validated every validation_freqs epochs; if a container object, data is validated when the epoch number belongs to the container, e.g. validation_freqs = [10, 15] validates at epochs 10 and 15. The default value is None; 1 is suggested. You can set it to a number larger than 1 to speed up training by skipping validation rounds. When it is larger than 1, a number that divides "epochs" is recommended, otherwise you will miss the validation scores of the last training epoch. | None |
optimizer | str or dict | Optimizer method; accepts the following types: 1. a string, one of "Adadelta", "Adagrad", "Adam", "Adamax", "Nadam", "RMSprop", "SGD"; 2. a dict with a required key "optimizer" and optional key-value pairs such as the learning rate. Defaults to "SGD" (see the normalization sketch after this table). | {'optimizer': 'Adam', 'learning_rate': 0.01} |
nn_define | dict | A dict representing the structure of the neural network; it can be produced by tf-keras. | {} |
epochs | int | Number of epochs. | 1 |
intersect_param | | Defines the intersect method. | IntersectParam(consts.RSA) |
config_type | | Config type. | 'tf-keras' |
batch_size | int | Batch size when computing the transformed feature embedding; -1 means use the full data. | -1 |
encrypte_param | | Encrypt param. | EncryptParam() |
encrypted_mode_calculator_param | | Encrypted mode calculator param. | EncryptedModeCalculatorParam(mode='confusion_opt') |
predict_param | | Predict param. | PredictParam() |
mode | | | 'plain' |
communication_efficient | | Whether to use the communication-efficient mode. When it is enabled, the FTL model updates gradients over several local rounds using intermediate data. | False |
local_round | | Number of local update rounds when using the communication-efficient mode. | 5 |
Source code in federatedml/param/ftl_param.py, lines 37-112
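Because optimizer accepts either a string or a dict, a caller-side normalization could look like the following sketch (illustrative only; the names in SUPPORTED_OPTIMIZERS are the ones listed in the table above, and FTLParam's own check() remains the authoritative validation):

```python
SUPPORTED_OPTIMIZERS = {"Adadelta", "Adagrad", "Adam", "Adamax", "Nadam", "RMSprop", "SGD"}

def normalize_optimizer(optimizer) -> dict:
    """Turn the string / dict forms accepted by FTLParam into a single dict form."""
    if isinstance(optimizer, str):
        if optimizer not in SUPPORTED_OPTIMIZERS:
            raise ValueError(f"unsupported optimizer: {optimizer}")
        return {"optimizer": optimizer}
    if isinstance(optimizer, dict):
        if "optimizer" not in optimizer:
            raise ValueError("dict form must contain the key 'optimizer'")
        return dict(optimizer)
    raise TypeError("optimizer must be a str or a dict")

print(normalize_optimizer("SGD"))                                        # {'optimizer': 'SGD'}
print(normalize_optimizer({"optimizer": "Adam", "learning_rate": 0.01}))
```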
Attributes¶
Instance attributes (as assigned in the constructor):
- alpha = alpha
- tol = tol
- n_iter_no_change = n_iter_no_change
- validation_freqs = validation_freqs
- optimizer = optimizer
- nn_define = nn_define
- epochs = epochs
- intersect_param = copy.deepcopy(intersect_param)
- config_type = config_type
- batch_size = batch_size
- encrypted_mode_calculator_param = copy.deepcopy(encrypted_mode_calculator_param)
- encrypt_param = copy.deepcopy(encrypte_param)
- predict_param = copy.deepcopy(predict_param)
- mode = mode
- communication_efficient = communication_efficient
- local_round = local_round
- callback_param = copy.deepcopy(callback_param)
Functions¶
check()
Source code in federatedml/param/ftl_param.py, lines 114-173
HomoNNParam(trainer=TrainerParam(), dataset=DatasetParam(), torch_seed=100, nn_define=None, loss=None, optimizer=None)
Bases: BaseParam
Source code in federatedml/param/homo_nn_param.py, lines 38-53
Attributes¶
Instance attributes (as assigned in the constructor):
- trainer = trainer
- dataset = dataset
- torch_seed = torch_seed
- nn_define = nn_define
- loss = loss
- optimizer = optimizer
Functions¶
check()
Source code in federatedml/param/homo_nn_param.py, lines 55-74
DecisionTreeParam(criterion_method='xgboost', criterion_params=[0.1, 0], max_depth=3, min_sample_split=2, min_impurity_split=0.001, min_leaf_node=1, max_split_nodes=consts.MAX_SPLIT_NODES, feature_importance_type='split', n_iter_no_change=True, tol=0.001, min_child_weight=0, use_missing=False, zero_as_missing=False, deterministic=False)
Bases: BaseParam
Define decision tree parameters used in federated ml.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
criterion_method | | The criterion function to use. | "xgboost" |
criterion_params | | Should be non-empty with float elements. If a list is given, the first element is the l2 regularization value and the second is the l1 regularization value; if a dict is given, make sure it contains the keys 'l1' and 'l2'. The l1 and l2 regularization values are non-negative floats. Default: [0.1, 0] or {'l1': 0, 'l2': 0.1} (see the split-gain sketch after this table). | [0.1, 0] |
max_depth | | The max depth of a decision tree. Default: 3. | 3 |
min_sample_split | | The minimum number of samples required to split a node. Default: 2. | 2 |
min_impurity_split | | The least gain a single split needs to reach. Default: 1e-3. | 0.001 |
min_child_weight | | Sum of hessian needed in child nodes. Default: 0. | 0 |
min_leaf_node | | When a node contains no more than min_leaf_node samples, it becomes a leaf. Default: 1. | 1 |
max_split_nodes | | At most max_split_nodes nodes are processed in parallel when finding their splits in one batch, for memory consideration. Default: 65536. | consts.MAX_SPLIT_NODES |
feature_importance_type | | If 'split', feature importances are calculated by feature split times; if 'gain', by feature split gain. Default: 'split'. Due to safety concerns, the training strategy of Hetero-SBT was adjusted in FATE-1.8; when running Hetero-SBT, this parameter is now abandoned. In Hetero-SBT of FATE-1.8, the guest side computes split and gain of local features and receives anonymous feature importance results from hosts, while hosts compute split importance of their local features. | 'split' |
use_missing | | Whether to use missing values in the training process. | False |
zero_as_missing | | Whether to regard 0 as a missing value; used only when use_missing=True. Default: False. | False |
deterministic | | Ensure stability when computing the histogram. Set this to True to ensure a stable result given the same data and the same parameters, at the cost of slower computation. | False |
Source code in federatedml/param/boosting_param.py, lines 144-165
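To show where the l1/l2 values from criterion_params enter the picture, here is a sketch of an XGBoost-style split gain with both regularization terms (the standard textbook formulation; FATE's actual criterion code may differ in detail):

```python
def _threshold_l1(g_sum: float, l1: float) -> float:
    # soft-thresholding of the gradient sum by the l1 term
    if g_sum > l1:
        return g_sum - l1
    if g_sum < -l1:
        return g_sum + l1
    return 0.0

def node_score(g_sum: float, h_sum: float, l1: float, l2: float) -> float:
    # XGBoost-style structure score of a node given its gradient and hessian sums
    t = _threshold_l1(g_sum, l1)
    return t * t / (h_sum + l2)

def split_gain(g_l, h_l, g_r, h_r, l1=0.0, l2=0.1) -> float:
    # gain of splitting a parent node into left/right children
    parent = node_score(g_l + g_r, h_l + h_r, l1, l2)
    return 0.5 * (node_score(g_l, h_l, l1, l2) + node_score(g_r, h_r, l1, l2) - parent)

# With criterion_params = [0.1, 0], i.e. l2 = 0.1 and l1 = 0:
print(split_gain(g_l=-4.0, h_l=5.0, g_r=3.0, h_r=4.0, l1=0.0, l2=0.1))
```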
Attributes¶
Instance attributes (as assigned in the constructor):
- criterion_method = criterion_method
- criterion_params = criterion_params
- max_depth = max_depth
- min_sample_split = min_sample_split
- min_impurity_split = min_impurity_split
- min_leaf_node = min_leaf_node
- min_child_weight = min_child_weight
- max_split_nodes = max_split_nodes
- feature_importance_type = feature_importance_type
- n_iter_no_change = n_iter_no_change
- tol = tol
- use_missing = use_missing
- zero_as_missing = zero_as_missing
- deterministic = deterministic
Functions¶
check()
Source code in federatedml/param/boosting_param.py, lines 167-227
FeatureBinningParam(method=consts.QUANTILE, compress_thres=consts.DEFAULT_COMPRESS_THRESHOLD, head_size=consts.DEFAULT_HEAD_SIZE, error=consts.DEFAULT_RELATIVE_ERROR, bin_num=consts.G_BIN_NUM, bin_indexes=-1, bin_names=None, adjustment_factor=0.5, transform_param=TransformParam(), local_only=False, category_indexes=None, category_names=None, need_run=True, skip_static=False)
Bases: BaseParam
Define the feature binning method
Parameters:
Name | Type | Description | Default |
---|---|---|---|
method | str | Binning method. | consts.QUANTILE |
compress_thres | | When the number of saved summaries exceeds this threshold, the compress function is called. | consts.DEFAULT_COMPRESS_THRESHOLD |
head_size | | The buffer size used to store inserted observations. When the head list reaches this buffer size, the QuantileSummaries object starts to generate summaries (or stats) and inserts them into its sampled list. | consts.DEFAULT_HEAD_SIZE |
error | | The tolerated error of binning. The final split point comes from the original data, and the rank of this value is close to the exact rank. More precisely, floor((p - 2 * error) * N) <= rank(x) <= ceil((p + 2 * error) * N), where p is the quantile as a float and N is the total number of data points (see the worked example after this table). | consts.DEFAULT_RELATIVE_ERROR |
bin_num | | The max bin number for binning. | consts.G_BIN_NUM |
bin_indexes | list of int or int | Specify which columns need to be binned. -1 represents all columns. If you need to indicate specific columns, provide a list of header indexes instead of -1. Note that columns specified by bin_indexes and bin_names will be combined. | -1 |
bin_names | list of string | Specify which columns need to be calculated. Each element in the list represents a column name in the header. Note that columns specified by bin_indexes and bin_names will be combined. | None |
adjustment_factor | float | The adjustment factor used when calculating WOE. This is useful when there is no event or non-event in a bin. Please note that this parameter will NOT take effect for settings in the host. | 0.5 |
category_indexes | list of int or int | Specify which columns are category features. -1 represents all columns; a list of int indicates a set of such features. For category features, the bin object takes their original values as split points and treats them as already binned. If this is not what you expect, do NOT put them into this parameter. The number of categories should not exceed bin_num set above. Note that columns specified by category_indexes and category_names will be combined. | None |
category_names | list of string | Use column names to specify category features. Each element in the list represents a column name in the header. Note that columns specified by category_indexes and category_names will be combined. | None |
local_only | bool | Whether to provide the binning method to the guest party only. If True, the host party does nothing. Warning: this parameter will be deprecated in a future version. | False |
transform_param | | Defines how to transform the binned data. | TransformParam() |
need_run | | Indicate whether this module needs to be run. | True |
skip_static | | If True, binning will not calculate iv, woe, etc. In that case, optimal binning is not supported. | False |
Source code in federatedml/param/feature_binning_param.py, lines 171-194
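A worked example of the rank guarantee quoted for error, using illustrative values (error = 0.0001 and N = 1,000,000 are assumptions for the example, not documented defaults):

```python
import math

def rank_bounds(p: float, error: float, n: int):
    """Bounds on the rank of the returned split point for quantile p, per the guarantee above."""
    return math.floor((p - 2 * error) * n), math.ceil((p + 2 * error) * n)

# The 0.5 quantile of 1,000,000 rows with error = 0.0001:
low, high = rank_bounds(p=0.5, error=0.0001, n=1_000_000)
print(low, high)  # 499800 500200 -> the split point's true rank falls in this window
```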
Attributes¶
Instance attributes (as assigned in the constructor):
- method = method
- compress_thres = compress_thres
- head_size = head_size
- error = error
- adjustment_factor = adjustment_factor
- bin_num = bin_num
- bin_indexes = bin_indexes
- bin_names = bin_names
- category_indexes = category_indexes
- category_names = category_names
- transform_param = copy.deepcopy(transform_param)
- need_run = need_run
- skip_static = skip_static
- local_only = local_only
Functions¶
check()
Source code in federatedml/param/feature_binning_param.py, lines 196-210
RSAParam(salt='', hash_method='sha256', final_hash_method='sha256', split_calculation=False, random_base_fraction=None, key_length=consts.DEFAULT_KEY_LENGTH, random_bit=DEFAULT_RANDOM_BIT)
Bases: BaseParam
Specify parameters for RSA intersect method
Parameters:
Name | Type | Description | Default |
---|---|---|---|
salt | | The src id will be processed as str = str + salt. Default: ''. | '' |
hash_method | | The hash method for the src id; supports sha256, sha384, sha512, sm3. Default: sha256. | 'sha256' |
final_hash_method | | The hash method for the resulting data string; supports md5, sha1, sha224, sha256, sha384, sha512, sm3. Default: sha256. | 'sha256' |
split_calculation | | If True, Host and Guest split the operations for faster performance; recommended on large data sets. | False |
random_base_fraction | | If not None, generate (fraction * public key id count) values of r for encryption and reuse the generated r; note that a value greater than 0.99 is taken as 1, and a value less than 0.01 is rounded up to 0.01. | None |
key_length | | Bit length of the RSA key; value >= 1024. Default: 1024. | consts.DEFAULT_KEY_LENGTH |
random_bit | | Defines the size of the blinding factor in the RSA algorithm (see the blinding sketch after this table). Default: 128. | DEFAULT_RANDOM_BIT |
Source code in federatedml/param/intersect_param.py, lines 144-154
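To give intuition for the blinding factor whose size random_bit controls, here is a toy, single-id sketch of RSA blind-signature intersection (deliberately tiny key and simplified flow; this is a conceptual illustration, not FATE's protocol implementation):

```python
import hashlib
import math
import random

# Toy RSA key held by the host. Real deployments use key_length >= 1024 bits.
p, q = 61, 53
n, e, d = p * q, 17, 2753          # (e * d) % ((p - 1) * (q - 1)) == 1

def h(value: str) -> int:
    # hash the id into Z_n (hash_method above; sha256 here)
    return int.from_bytes(hashlib.sha256(value.encode()).digest(), "big") % n

# Guest blinds its hashed id with a random factor r (whose size random_bit governs).
hid = h("sample_id_42")
r = random.randrange(2, n - 1)
while math.gcd(r, n) != 1:
    r = random.randrange(2, n - 1)
blinded = (hid * pow(r, e, n)) % n          # sent to host

# Host signs the blinded value with its private key without learning hid.
signed_blinded = pow(blinded, d, n)         # sent back to guest

# Guest removes the blinding factor and recovers the host's signature on hid.
signature = (signed_blinded * pow(r, -1, n)) % n
assert signature == pow(hid, d, n)          # same value the host computes on its own ids
```

In the full protocol both sides then hash the resulting signatures with final_hash_method and intersect those hashed values, so neither side learns ids outside the intersection.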
Attributes¶
Instance attributes (as assigned in the constructor):
- salt = salt
- hash_method = hash_method
- final_hash_method = final_hash_method
- split_calculation = split_calculation
- random_base_fraction = random_base_fraction
- key_length = key_length
- random_bit = random_bit
Functions¶
check()
Source code in federatedml/param/intersect_param.py, lines 156-182
HeteroNNParam(task_type='classification', bottom_nn_define=None, top_nn_define=None, interactive_layer_define=None, interactive_layer_lr=0.9, config_type='pytorch', optimizer='SGD', loss=None, epochs=100, batch_size=-1, early_stop='diff', tol=1e-05, seed=100, encrypt_param=EncryptParam(), encrypted_mode_calculator_param=EncryptedModeCalculatorParam(), predict_param=PredictParam(), cv_param=CrossValidationParam(), validation_freqs=None, early_stopping_rounds=None, metrics=None, use_first_metric_only=True, selector_param=SelectorParam(), floating_point_precision=23, callback_param=CallbackParam(), coae_param=CoAEConfuserParam(), dataset=DatasetParam())
Bases: BaseParam
Parameters used for Hetero Neural Network.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
task_type | | | 'classification' |
bottom_nn_define | | | None |
interactive_layer_define | | | None |
interactive_layer_lr | | | 0.9 |
top_nn_define | | | None |
optimizer | | | 'SGD' |
loss | | | None |
epochs | | | 100 |
batch_size | int | Batch size when updating the model. -1 means use all data in one batch, i.e. do not use the mini-batch strategy. Defaults to -1. | -1 |
early_stop | str | Method used to judge convergence; only 'diff' is accepted in this version: use the difference of loss between two iterations to judge whether the model has converged. | 'diff' |
floating_point_precision | | If not None, use floating_point_precision bits to speed up calculation, e.g. convert an x to round(x * 2**floating_point_precision) during Paillier operations and divide the result by 2**floating_point_precision at the end. | 23 |
callback_param | | | CallbackParam() |
Source code in federatedml/param/hetero_nn_param.py, lines 164-220
Attributes¶
Instance attributes (as assigned in the constructor):
- task_type = task_type
- bottom_nn_define = bottom_nn_define
- interactive_layer_define = interactive_layer_define
- interactive_layer_lr = interactive_layer_lr
- top_nn_define = top_nn_define
- batch_size = batch_size
- epochs = epochs
- early_stop = early_stop
- tol = tol
- optimizer = optimizer
- loss = loss
- validation_freqs = validation_freqs
- early_stopping_rounds = early_stopping_rounds
- metrics = metrics or []
- use_first_metric_only = use_first_metric_only
- encrypt_param = copy.deepcopy(encrypt_param)
- encrypted_model_calculator_param = encrypted_mode_calculator_param
- predict_param = copy.deepcopy(predict_param)
- cv_param = copy.deepcopy(cv_param)
- selector_param = selector_param
- floating_point_precision = floating_point_precision
- callback_param = copy.deepcopy(callback_param)
- coae_param = coae_param
- dataset = dataset
- seed = seed
- config_type = 'pytorch'
Functions¶
check()
Source code in federatedml/param/hetero_nn_param.py, lines 222-295
BoostingParam(task_type=consts.CLASSIFICATION, objective_param=ObjectiveParam(), learning_rate=0.3, num_trees=5, subsample_feature_rate=1, n_iter_no_change=True, tol=0.0001, bin_num=32, predict_param=PredictParam(), cv_param=CrossValidationParam(), validation_freqs=None, metrics=None, random_seed=100, binning_error=consts.DEFAULT_RELATIVE_ERROR)
Bases: BaseParam
Basic parameter for Boosting Algorithms
Parameters:
Name | Type | Description | Default |
---|---|---|---|
task_type | | Task type. | 'classification' |
objective_param | ObjectiveParam object | Objective param. | ObjectiveParam() |
learning_rate | float, int or long | The learning rate of secure boost. Default: 0.3. | 0.3 |
num_trees | int or float | The max number of boosting rounds. Default: 5. | 5 |
subsample_feature_rate | float | A float in [0, 1]. Default: 1.0. | 1 |
n_iter_no_change | bool | When True and the residual error is less than tol, the tree building process stops. Default: True. | True |
bin_num | | The number of bins used in quantile binning. Default: 32. | 32 |
validation_freqs | | Whether to do validation during training. If None, no validation is done in the training process; if a positive integer, data is validated every validation_freqs epochs; if a container object in Python, data is validated when the epoch belongs to the container, e.g. validation_freqs = [10, 15] validates at epochs 10 and 15 (see the schedule sketch after this table). Default: None. | None |
Source code in federatedml/param/boosting_param.py, lines 259-282
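The three accepted forms of validation_freqs (None, a positive integer, or a container) reduce to a single predicate, sketched below for illustration (this is not the actual callback code):

```python
def should_validate(validation_freqs, epoch: int) -> bool:
    """Decide whether to run validation at `epoch`, following the rules described above."""
    if validation_freqs is None:
        return False                           # never validate during training
    if isinstance(validation_freqs, int):
        return epoch % validation_freqs == 0   # validate every validation_freqs epochs
    return epoch in validation_freqs           # container form: validate only at listed epochs

print([e for e in range(20) if should_validate(5, e)])         # [0, 5, 10, 15]
print([e for e in range(20) if should_validate([10, 15], e)])  # [10, 15]
```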
Attributes¶
Instance attributes (as assigned in the constructor):
- task_type = task_type
- objective_param = copy.deepcopy(objective_param)
- learning_rate = learning_rate
- num_trees = num_trees
- subsample_feature_rate = subsample_feature_rate
- n_iter_no_change = n_iter_no_change
- tol = tol
- bin_num = bin_num
- predict_param = copy.deepcopy(predict_param)
- cv_param = copy.deepcopy(cv_param)
- validation_freqs = validation_freqs
- metrics = metrics
- random_seed = random_seed
- binning_error = binning_error
Functions¶
check()
Source code in federatedml/param/boosting_param.py, lines 284-331
IntersectParam(intersect_method=consts.RSA, random_bit=DEFAULT_RANDOM_BIT, sync_intersect_ids=True, join_role=consts.GUEST, only_output_key=False, with_encode=False, encode_params=EncodeParam(), raw_params=RAWParam(), rsa_params=RSAParam(), dh_params=DHParam(), ecdh_params=ECDHParam(), join_method=consts.INNER_JOIN, new_sample_id=False, sample_id_generator=consts.GUEST, intersect_cache_param=IntersectCache(), run_cache=False, cardinality_only=False, sync_cardinality=False, cardinality_method=consts.ECDH, run_preprocess=False, intersect_preprocess_params=IntersectPreProcessParam(), repeated_id_process=False, repeated_id_owner=consts.GUEST, with_sample_id=False, allow_info_share=False, info_owner=consts.GUEST)
Bases: BaseParam
Define the intersect method
Parameters:
Name | Type | Description | Default |
---|---|---|---|
intersect_method | str | Supports 'rsa', 'raw', 'dh', 'ecdh'. Default: 'rsa'. | consts.RSA |
random_bit | | Defines the size of the blinding factor in the RSA algorithm; default 128. Note that this param will be deprecated in a future version; please use random_bit in RSAParam instead. | DEFAULT_RANDOM_BIT |
sync_intersect_ids | | In rsa, True means guest or host will send the intersect results to the others, while False means they will not. In raw, True means the role given by "join_role" will send the intersect results and the others will receive them. Default: True. | True |
join_role | | The role that joins ids; supports "guest" and "host" only and is effective only for the raw method. If "guest", the host sends its ids to the guest and the intersection is found on the guest side; if "host", the guest sends its ids to the host. Default: "guest". Note that this param will be deprecated in a future version; please use 'join_role' in raw_params instead. | consts.GUEST |
only_output_key | bool | If False, the intersection results include both the key and the value from the input data; if True, they include only the key, and the value is empty or filled with a uniform string such as "intersect_id". | False |
with_encode | | If True, a hash method is used for the intersect ids; effective for the raw method only. Note that this param will be deprecated in a future version; please use 'use_hash' in raw_params. Currently, if this param is set to True, the specification in 'encode_params' is used instead of 'raw_params'. | False |
encode_params | | Effective only when with_encode is True; this param will be deprecated in a future version, use 'raw_params' in future implementations. | EncodeParam() |
raw_params | | Effective for the raw method only. | RAWParam() |
rsa_params | | Effective for the rsa method only. | RSAParam() |
dh_params | | Effective for the dh method only. | DHParam() |
ecdh_params | | Effective for the ecdh method only. | ECDHParam() |
join_method | | If 'left_join', all participants include the sample_id_generator's (imputed) ids in the output. Default: 'inner_join'. | consts.INNER_JOIN |
new_sample_id | bool | Whether to generate new ids for the sample_id_generator's ids; only effective when join_method is 'left_join' or when the input data are instances with match id. Default: False. | False |
sample_id_generator | | The role whose ids are to be kept; effective only when join_method is 'left_join' or when the input data are instances with match id. Default: 'guest'. | consts.GUEST |
intersect_cache_param | | Specification for cache generation; with ver 1.7 and above, this param is ignored. | IntersectCache() |
run_cache | bool | Whether to store the Host's encrypted ids; only valid when the intersect method is 'rsa', 'dh' or 'ecdh'. Default: False. | False |
cardinality_only | bool | Whether to output the estimated intersection count (cardinality); if sync_cardinality is True, the cardinality count is synced with the host(s). | False |
cardinality_method | | Specify which intersect method to use for counting cardinality. Default: "ecdh". Note that "rsa" produces an estimated cardinality, while "dh" and "ecdh" output the exact cardinality but support single-host tasks only. | consts.ECDH |
sync_cardinality | bool | Whether to sync cardinality with all participants. Default: False; only effective when cardinality_only is set to True. | False |
run_preprocess | bool | Whether to run the preprocess process. Default: False. | False |
intersect_preprocess_params | | Used for preprocessing and cardinality_only mode. | IntersectPreProcessParam() |
repeated_id_process | | If True, intersection will process ids that can be repeated; in ver 1.7 and above, repeated id processing is automatically applied to data with instance id and this param is ignored. | False |
repeated_id_owner | | Which role has the repeated id; in ver 1.7 and above, this param is ignored. | consts.GUEST |
allow_info_share | bool | In ver 1.7 and above, this param is ignored. | False |
info_owner | | In ver 1.7 and above, this param is ignored. | consts.GUEST |
with_sample_id | | Whether the data carries sample id. Default: False; in ver 1.7 and above, this param is ignored. | False |
Source code in federatedml/param/intersect_param.py, lines 456-493
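A minimal configuration sketch, assuming IntersectParam and RSAParam are importable from the module path shown above and that consts lives under federatedml.util (both assumptions); in practice the parameters are usually supplied through a pipeline or DSL conf rather than constructed directly:

```python
from federatedml.param.intersect_param import IntersectParam, RSAParam
from federatedml.util import consts  # assumed location of the consts module

# RSA-based private set intersection with a 2048-bit key, syncing ids to both sides.
intersect_param = IntersectParam(
    intersect_method=consts.RSA,
    sync_intersect_ids=True,
    only_output_key=False,
    rsa_params=RSAParam(hash_method="sha256", key_length=2048),
)
intersect_param.check()  # validates the combination of options before the job runs
```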
Attributes¶
Instance attributes (as assigned in the constructor):
- intersect_method = intersect_method
- random_bit = random_bit
- sync_intersect_ids = sync_intersect_ids
- join_role = join_role
- with_encode = with_encode
- encode_params = copy.deepcopy(encode_params)
- raw_params = copy.deepcopy(raw_params)
- rsa_params = copy.deepcopy(rsa_params)
- only_output_key = only_output_key
- sample_id_generator = sample_id_generator
- intersect_cache_param = copy.deepcopy(intersect_cache_param)
- run_cache = run_cache
- repeated_id_process = repeated_id_process
- repeated_id_owner = repeated_id_owner
- allow_info_share = allow_info_share
- info_owner = info_owner
- with_sample_id = with_sample_id
- join_method = join_method
- new_sample_id = new_sample_id
- dh_params = copy.deepcopy(dh_params)
- cardinality_only = cardinality_only
- sync_cardinality = sync_cardinality
- cardinality_method = cardinality_method
- run_preprocess = run_preprocess
- intersect_preprocess_params = copy.deepcopy(intersect_preprocess_params)
- ecdh_params = copy.deepcopy(ecdh_params)
Functions¶
check()
Source code in federatedml/param/intersect_param.py, lines 495-582
FeatureSelectionParam(select_col_indexes=-1, select_names=None, filter_methods=None, unique_param=UniqueValueParam(), iv_value_param=IVValueSelectionParam(), iv_percentile_param=IVPercentileSelectionParam(), iv_top_k_param=IVTopKParam(), variance_coe_param=VarianceOfCoeSelectionParam(), outlier_param=OutlierColsSelectionParam(), manually_param=ManuallyFilterParam(), percentage_value_param=PercentageValueParam(), iv_param=IVFilterParam(), statistic_param=CommonFilterParam(metrics=consts.MEAN), psi_param=CommonFilterParam(metrics=consts.PSI, take_high=False), vif_param=CommonFilterParam(metrics=consts.VIF, threshold=5.0, take_high=False), sbt_param=CommonFilterParam(metrics=consts.FEATURE_IMPORTANCE), correlation_param=CorrelationFilterParam(), use_anonymous=False, need_run=True)
Bases: BaseParam
Define the feature selection parameters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
select_col_indexes | | Specify which columns need to be calculated. -1 represents all columns. Note that columns specified by select_col_indexes and select_names will be combined. | -1 |
select_names | list of string | Specify which columns need to be calculated. Each element in the list represents a column name in the header. Note that columns specified by select_col_indexes and select_names will be combined. | None |
filter_methods | | Specify the filter methods used in feature selection; supported methods include "hetero_sbt_filter", "homo_sbt_filter", "hetero_fast_sbt_filter", "percentage_value", "vif_filter" and "correlation_filter". Default: ["manually"]. The following methods will be deprecated in a future version: "unique_value", "iv_value_thres", "iv_percentile", "coefficient_of_variation_value_thres", "outlier_cols". Filters are applied in the order given in this list. Please note that if a percentile method is used after some other filter method, the percentile refers to the ratio of the remaining features, e.g. if you start with 10 features and 8 remain after the first filter, asking for the top 80% highest-iv features keeps floor(0.8 * 8) = 6 features instead of 8 (see the sketch after this table). | None |
unique_param | | Filter the columns where all values of the feature are the same. | UniqueValueParam() |
iv_value_param | | Use information value to filter columns. If this method is set, a float threshold must be provided; columns whose iv is smaller than the threshold are filtered. Will be deprecated in the future. | IVValueSelectionParam() |
iv_percentile_param | | Use information value to filter columns. If this method is set, a float ratio threshold must be provided. Pick floor(ratio * feature_num) features with higher iv. If multiple features around the threshold have the same iv, all of those columns are kept. Will be deprecated in the future. | IVPercentileSelectionParam() |
variance_coe_param | | Use the coefficient of variation to judge whether a column is filtered or not. Will be deprecated in the future. | VarianceOfCoeSelectionParam() |
outlier_param | | Filter columns whose value at a certain percentile is larger than a threshold. Will be deprecated in the future. | OutlierColsSelectionParam() |
percentage_value_param | | Filter the columns that have a value that exceeds a certain percentage of occurrences. | PercentageValueParam() |
iv_param | | Set how to filter based on iv. It supports take-high mode only. All of "threshold", "top_k" and "top_percentile" are accepted. Check more details in CommonFilterParam. To use this filter, the hetero-feature-binning module has to be provided. | IVFilterParam() |
statistic_param | | Set how to filter based on statistic values. All of "threshold", "top_k" and "top_percentile" are accepted. Check more details in CommonFilterParam. To use this filter, the data_statistic module has to be provided. | CommonFilterParam(metrics=consts.MEAN) |
psi_param | | Set how to filter based on psi values. All of "threshold", "top_k" and "top_percentile" are accepted. Its take_high property should be False in order to choose features with lower psi. Check more details in CommonFilterParam. To use this filter, the data_statistic module has to be provided. | CommonFilterParam(metrics=consts.PSI, take_high=False) |
use_anonymous | | Whether to interpret 'select_names' as anonymous names. | False |
need_run | | Indicate whether this module needs to be run. | True |
Source code in federatedml/param/feature_selection_param.py, lines 450-499
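A tiny sketch of the percentile arithmetic spelled out in the filter_methods description (plain Python, purely illustrative):

```python
import math

def features_kept_after_percentile(n_remaining: int, top_percentile: float) -> int:
    # A percentile filter applied after other filters works on the *remaining* feature count.
    return math.floor(top_percentile * n_remaining)

# Start with 10 features, a first filter leaves 8; "top 80% by iv" then keeps floor(0.8 * 8) = 6.
print(features_kept_after_percentile(8, 0.8))  # 6
```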
Attributes¶
Instance attributes (as assigned in the constructor):
- correlation_param = correlation_param
- vif_param = vif_param
- select_col_indexes = select_col_indexes
- select_names = []
- filter_methods = [consts.MANUALLY_FILTER]
- unique_param = copy.deepcopy(unique_param)
- iv_value_param = copy.deepcopy(iv_value_param)
- iv_percentile_param = copy.deepcopy(iv_percentile_param)
- iv_top_k_param = copy.deepcopy(iv_top_k_param)
- variance_coe_param = copy.deepcopy(variance_coe_param)
- outlier_param = copy.deepcopy(outlier_param)
- percentage_value_param = copy.deepcopy(percentage_value_param)
- manually_param = copy.deepcopy(manually_param)
- iv_param = copy.deepcopy(iv_param)
- statistic_param = copy.deepcopy(statistic_param)
- psi_param = copy.deepcopy(psi_param)
- sbt_param = copy.deepcopy(sbt_param)
- need_run = need_run
- use_anonymous = use_anonymous
Functions¶
check()
Source code in federatedml/param/feature_selection_param.py, lines 501-567