Stepwise

Stepwise is a simple, effective model selection technique. FATE provides a stepwise wrapper for heterogeneous linear models. The compatible models are listed below:

Please note that, because loss history is not available, Stepwise does not support multi-host modeling.

The Stepwise module currently does not support validation strategies or early stopping. Validation data may be set in the job configuration file, but it will not be used in the stepwise process.

To use stepwise, set ‘need_stepwise’ to true and specify the stepwise parameters as desired. Below is an example of a stepwise parameter setting in the job configuration file.

{
    "stepwise_param": {
        "score_name": "AIC",
        "direction": "both",
        "need_stepwise": true,
        "max_step": 3,
        "nvmin": 2,
        "nvmax": 6
    }
}
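
The same fragment can be generated programmatically. The following is a minimal sketch that builds the `stepwise_param` dictionary in Python and checks the constraints documented for StepwiseParam before serializing it. The helper name `build_stepwise_param` is hypothetical and not part of FATE, and the case-insensitive check on `score_name` is an assumption based on the documentation showing both 'AIC' and ‘aic’.

```python
import json

VALID_DIRECTIONS = {"forward", "backward", "both"}
VALID_SCORES = {"aic", "bic"}


def build_stepwise_param(score_name="AIC", direction="both", max_step=10,
                         nvmin=2, nvmax=None, need_stepwise=True):
    """Hypothetical helper (not a FATE API): build the "stepwise_param" dict."""
    # Assumption: score names are compared case-insensitively.
    if score_name.lower() not in VALID_SCORES:
        raise ValueError(f"score_name must be AIC or BIC, got {score_name!r}")
    if direction not in VALID_DIRECTIONS:
        raise ValueError(f"unknown direction: {direction!r}")
    # Documented constraint: 2 <= nvmin <= nvmax (when nvmax is given).
    if nvmin < 2:
        raise ValueError("nvmin cannot be lower than 2")
    if nvmax is not None and nvmax < nvmin:
        raise ValueError("nvmax must be greater than or equal to nvmin")
    return {
        "score_name": score_name,
        "direction": direction,
        "need_stepwise": need_stepwise,
        "max_step": max_step,
        "nvmin": nvmin,
        "nvmax": nvmax,
    }


# Reproduce the example values used in the configuration fragment.
config = {"stepwise_param": build_stepwise_param(max_step=3, nvmax=6)}
print(json.dumps(config, indent=4))
```

Checking the constraints locally catches an invalid combination such as nvmax < nvmin before the job is submitted.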

For examples of using stepwise with linear models, please refer here. For an explanation of the stepwise module parameters, please refer to stepwise_param.

Please note that the model information shown on FATE Board (max iters and coefficient/intercept values) is that of the final result model.

Param

class StepwiseParam(score_name='AIC', mode='hetero', role='guest', direction='both', max_step=10, nvmin=2, nvmax=None, need_stepwise=False)

Define stepwise params

Parameters
  • score_name (str, default: 'AIC') – Specify which model selection criterion to use; choose ‘aic’ or ‘bic’.

  • mode (str, default: 'hetero') – Indicate the mode of the current task.

  • role (str, default: 'guest') – Indicate the role of the current party.

  • direction (str, default: 'both') – Indicate which direction the stepwise search takes. ‘forward’ means forward selection; ‘backward’ means backward elimination; ‘both’ means that possible models in both directions are examined at each step.

  • max_step (int, default: 10) – Specify the total number of steps to run before a forced stop.

  • nvmin (int, default: 2) – Specify the minimum subset size of the final model; cannot be lower than 2. When nvmin > 2, the final model size may be smaller than nvmin because of the max_step limit.

  • nvmax (int, default: None) – Specify the maximum subset size of the final model; must satisfy 2 <= nvmin <= nvmax. The final model size may be larger than nvmax because of the max_step limit.

  • need_stepwise (bool, default: False) – Indicate whether this module needs to be run.
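
To illustrate how direction, max_step, nvmin and nvmax interact, the following is a minimal, single-machine sketch of a greedy stepwise search. It is not FATE's federated implementation: the function names, the toy gains, and the choice to count only features in the subset size are all assumptions made for illustration. The `aic` and `bic` functions use the textbook definitions (lower is better), which may differ from the exact form FATE computes.

```python
import math


def aic(log_likelihood, k):
    """Textbook AIC: 2k - 2 ln L, where k is the number of parameters."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Textbook BIC: k ln n - 2 ln L, where n is the sample count."""
    return k * math.log(n) - 2 * log_likelihood


def stepwise_select(features, score_fn, start=(), direction="both",
                    max_step=10, nvmin=2, nvmax=None):
    """Greedy search; score_fn(frozenset) returns a score to minimize."""
    nvmax = len(features) if nvmax is None else nvmax
    current = frozenset(start)
    best = score_fn(current)
    for _ in range(max_step):
        candidates = []
        # Forward moves add one feature, but never grow past nvmax.
        if direction in ("forward", "both") and len(current) < nvmax:
            candidates += [current | {f} for f in features if f not in current]
        # Backward moves drop one feature, but never shrink below nvmin.
        if direction in ("backward", "both") and len(current) > nvmin:
            candidates += [current - {f} for f in sorted(current)]
        if not candidates:
            break
        challenger = min(candidates, key=score_fn)
        if score_fn(challenger) >= best:
            break  # no neighboring model improves the score
        current, best = challenger, score_fn(challenger)
    return sorted(current), best


# Toy data (made up): log-likelihood gain contributed by each feature.
GAINS = {"x1": 5.0, "x2": 3.0, "x3": 0.2, "x4": 0.1}


def toy_score(subset):
    log_likelihood = sum(GAINS[f] for f in subset)
    return aic(log_likelihood, k=len(subset))


print(stepwise_select(list(GAINS), toy_score))
print(stepwise_select(list(GAINS), toy_score, max_step=1))
```

In this toy setup the first call selects x1 and x2, while the second stops after a single step with only x1, which shows how a small max_step can leave the final model smaller than nvmin. Starting a backward search from the full feature set with a small max_step similarly leaves the model larger than nvmax.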