Stepwise

Stepwise is a simple, effective model selection technique. FATE provides a stepwise wrapper for heterogeneous linear models. The compatible models are listed below:

Please note that, because loss history is not available, Stepwise does not support multi-host modeling.

Note also that the Stepwise module currently does not support a validation strategy or early stopping. Validation data may be set in the job configuration file, but it will not be used.

To use stepwise, set ‘need_stepwise’ to True and specify the stepwise parameters as desired. Below is an example of a stepwise parameter setting in a job configuration file.

{
    "stepwise_param": {
        "score_name": "AIC",
        "direction": "both",
        "need_stepwise": true,
        "max_step": 3,
        "nvmin": 2,
        "nvmax": 6
    }
}
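
The constraints documented for these parameters (allowed values of direction, nvmin of at least 2, nvmin no larger than nvmax) can be checked before a job is submitted. The following is a minimal sketch in plain Python; the helper check_stepwise_param is hypothetical and is not part of FATE, and it checks only the rules stated on this page.

```python
import json

# Same settings as the example configuration above.
CONFIG_TEXT = """
{
    "stepwise_param": {
        "score_name": "AIC",
        "direction": "both",
        "need_stepwise": true,
        "max_step": 3,
        "nvmin": 2,
        "nvmax": 6
    }
}
"""


def check_stepwise_param(param):
    """Hypothetical helper (not a FATE API): return a list of problems found."""
    problems = []
    direction = param.get("direction", "both")
    if direction not in {"forward", "backward", "both"}:
        problems.append(f"unknown direction: {direction!r}")
    nvmin = param.get("nvmin", 2)
    nvmax = param.get("nvmax")  # None means no upper bound was given
    if nvmin < 2:
        problems.append("nvmin cannot be lower than 2")
    if nvmax is not None and nvmax < nvmin:
        problems.append("nvmax must be at least nvmin")
    return problems


stepwise_param = json.loads(CONFIG_TEXT)["stepwise_param"]
print(check_stepwise_param(stepwise_param))
```

An empty list means that none of the documented constraints is violated; FATE itself may apply further checks that this sketch does not cover.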

For examples of using stepwise with linear models, please refer to examples/federatedml-1.x-examples/hetero_stepwise. For an explanation of each stepwise parameter, please refer to the comments in stepwise_param.py.

Please note that on FATE Board, the model information (max iters and coefficient/intercept values) describes the final result model.

Param

class StepwiseParam(score_name='AIC', mode='hetero', role='guest', direction='both', max_step=10, nvmin=2, nvmax=None, need_stepwise=False)

Define stepwise params

Parameters
  • score_name (str, default: 'AIC') – Specify which model selection criterion to use

  • mode (str, default: 'hetero') – Indicate the mode of the current task

  • role (str, default: 'guest') – Indicate the role of the current party

  • direction (str, default: 'both') – Indicate which direction to go for stepwise. ‘forward’ means forward selection; ‘backward’ means backward elimination; ‘both’ means that possible models of both directions are examined at each step.

  • max_step (int, default: 10) – Specify the total number of steps to run before a forced stop.

  • nvmin (int, default: 2) – Specify the minimum subset size of the final model; cannot be lower than 2. When nvmin > 2, the final model size may be smaller than nvmin due to the max_step limit.

  • nvmax (int, default: None) – Specify the maximum subset size of the final model; must satisfy 2 <= nvmin <= nvmax. The final model size may be larger than nvmax due to the max_step limit.

  • need_stepwise (bool, default: False) – Indicate whether this module needs to be run
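
To illustrate how direction, max_step, nvmin and nvmax interact, here is a minimal single-machine sketch of a greedy stepwise search in which a lower score is better, as with AIC. It is not FATE's federated implementation: the functions neighbours, stepwise and toy_score are hypothetical, the score is a made-up stand-in rather than a real AIC computed from a fitted model, and subset size is assumed here to count features only.

```python
from typing import Callable, FrozenSet, Iterable, List, Optional, Tuple

Subset = FrozenSet[str]


def neighbours(current: Subset, features: Subset, direction: str,
               nvmin: int, nvmax: int) -> List[Subset]:
    """Return subsets that differ from `current` by one added or dropped feature."""
    out: List[Subset] = []
    if direction in ("forward", "both") and len(current) < nvmax:
        out += [current | {f} for f in sorted(features - current)]
    if direction in ("backward", "both") and len(current) > nvmin:
        out += [current - {f} for f in sorted(current)]
    return out


def stepwise(features: Iterable[str], start: Iterable[str],
             score: Callable[[Subset], float], direction: str = "both",
             max_step: int = 10, nvmin: int = 2,
             nvmax: Optional[int] = None) -> Tuple[Subset, float]:
    """Greedy search: take the best single move per step while the score improves."""
    all_features = frozenset(features)
    current = frozenset(start)
    upper = len(all_features) if nvmax is None else nvmax
    best = score(current)
    for _ in range(max_step):  # forced stop after max_step steps
        candidates = neighbours(current, all_features, direction, nvmin, upper)
        if not candidates:
            break
        challenger = min(candidates, key=score)
        challenger_score = score(challenger)
        if challenger_score >= best:  # no candidate improves the score
            break
        current, best = challenger, challenger_score
    return current, best


def toy_score(subset: Subset) -> float:
    """Made-up criterion: penalise size, reward two 'informative' features."""
    informative = {"x1", "x3"}
    return 2 * len(subset) - 10 * len(subset & informative)


selected, value = stepwise(["x1", "x2", "x3", "x4"], ["x1", "x2"],
                           toy_score, direction="both", max_step=3)
print(sorted(selected), value)
```

In this sketch the size bounds only restrict which moves are allowed at each step, so a search that starts outside the bounds and runs out of steps can end outside them. That is one way to read the notes on nvmin and nvmax above, though FATE's actual behaviour may differ.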