Feature Scale¶
Feature scale is a process that rescales each feature column by column. The Feature Scale module supports min-max scale and standard scale.
- min-max scale: scales and translates each feature individually so that it lies in a given range on the training set, e.g. between the min and max value of each feature.
- standard scale: standardizes features by removing the mean and scaling to unit variance.
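The two methods above can be sketched with NumPy as follows. This is a minimal illustration of the formulas only, not the module's implementation: the function names `min_max_scale` and `standard_scale`, the `feature_range` argument, the handling of constant columns, and the use of the population standard deviation are all assumptions made for the example.

```python
import numpy as np


def min_max_scale(X, feature_range=(0.0, 1.0)):
    """Linearly map each column of X into feature_range (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    lo, hi = feature_range
    col_min = X.min(axis=0)
    col_max = X.max(axis=0)
    span = col_max - col_min
    # Assumption: a constant column has zero span; divide by 1 instead,
    # which maps every value in that column to the lower bound.
    span = np.where(span == 0, 1.0, span)
    return (X - col_min) / span * (hi - lo) + lo


def standard_scale(X):
    """Subtract the column mean and divide by the column standard deviation."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)  # population standard deviation (an assumption)
    std = np.where(std == 0, 1.0, std)  # avoid division by zero
    return (X - mean) / std


# Each column is scaled independently of the others.
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]])
mm = min_max_scale(X)
ss = standard_scale(X)
```

After min-max scaling, every column spans exactly the requested range; after standard scaling, every non-constant column has zero mean and unit variance.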
Param¶
scale_param¶
Attributes¶
Classes¶
ScaleParam(method='standard_scale', mode='normal', scale_col_indexes=-1, scale_names=None, feat_upper=None, feat_lower=None, with_mean=True, with_std=True, need_run=True)¶
Bases: BaseParam
Define the feature scale parameters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
method | | Scaling method, similar to the scalers in sklearn. Currently supports "min_max_scale" and "standard_scale"; other methods may be added later. | "standard_scale" |
mode | | How feat_upper and feat_lower are interpreted. With "normal", they are plain values such as "10" or "3.1". With "cap", they must lie between 0 and 1 and denote percentiles of the column. | "normal" |
feat_upper | int or float or list of int or float | The upper limit of the column. If a list is given, mode must be "normal" and the list length must equal the number of features to scale. A scaled value larger than feat_upper is set to feat_upper. | None |
feat_lower | | The lower limit of the column. If a list is given, mode must be "normal" and the list length must equal the number of features to scale. A scaled value less than feat_lower is set to feat_lower. | None |
scale_col_indexes | | Indexes of the columns to scale. Columns whose index is in scale_col_indexes are scaled; columns whose index is not are left unscaled. | -1 |
scale_names | list of string | Names of the columns to scale. Each element of the list is a column name in the header. None is treated as an empty list. | None |
with_mean | bool | Used for "standard_scale". | True |
with_std | bool | Used for "standard_scale". The standard scale of column x is calculated as z = (x - u) / s, where u is the mean of the column and s is the standard deviation of the column. If with_mean is False, u is 0; if with_std is False, s is 1. | True |
need_run | bool | Whether this module needs to be run. | True |
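The interaction of with_mean, with_std, mode, feat_upper and feat_lower described in the table can be illustrated with a small NumPy sketch. It is an interpretation of the parameter descriptions, not the module's source: the helper names `standard_scale_column` and `clip_column` are hypothetical, the population standard deviation is assumed, and "cap" mode is assumed to map a fraction in [0, 1] to a column quantile with NumPy's default interpolation.

```python
import numpy as np


def standard_scale_column(x, with_mean=True, with_std=True):
    """z = (x - u) / s, with u = 0 if with_mean is False and s = 1 if with_std is False."""
    x = np.asarray(x, dtype=float)
    u = x.mean() if with_mean else 0.0
    s = x.std() if with_std else 1.0  # population std (an assumption)
    if s == 0:
        s = 1.0  # assumption: leave constant columns unscaled
    return (x - u) / s


def clip_column(x, feat_lower=None, feat_upper=None, mode="normal"):
    """Limit one column to [feat_lower, feat_upper] (hypothetical helper).

    mode="normal": the bounds are used as plain values.
    mode="cap":    the bounds are fractions in [0, 1], read as column quantiles.
    """
    x = np.asarray(x, dtype=float)
    if mode == "cap":
        lower = None if feat_lower is None else np.quantile(x, feat_lower)
        upper = None if feat_upper is None else np.quantile(x, feat_upper)
    elif mode == "normal":
        lower, upper = feat_lower, feat_upper
    else:
        raise ValueError(f"unsupported mode: {mode!r}")
    if lower is not None:
        x = np.maximum(x, lower)  # values below the lower limit are set to it
    if upper is not None:
        x = np.minimum(x, upper)  # values above the upper limit are set to it
    return x


col = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
normal_clipped = clip_column(col, feat_lower=2, feat_upper=10, mode="normal")
cap_clipped = clip_column(col, feat_lower=0.25, feat_upper=0.75, mode="cap")
centered_only = standard_scale_column([1.0, 2.0, 3.0], with_std=False)
```

In "cap" mode the outlier 100.0 is pulled down to the 75th percentile of the column, whereas in "normal" mode it is cut at the literal value 10.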
Source code in federatedml/param/scale_param.py
Attributes¶
scale_names = [] if scale_names is None else scale_names (instance-attribute)
method = method (instance-attribute)
mode = mode (instance-attribute)
feat_upper = feat_upper (instance-attribute)
feat_lower = feat_lower (instance-attribute)
scale_col_indexes = scale_col_indexes (instance-attribute)
with_mean = with_mean (instance-attribute)
with_std = with_std (instance-attribute)
need_run = need_run (instance-attribute)
Functions¶
check()¶
Source code in federatedml/param/scale_param.py