Data¶
This document describes the data sets used for running the examples and provides source information. Many of the data sets have been scaled or transformed from their original versions.
Data Set Naming Rule¶
All data sets are named according to the following pattern: “{content}_{mode}_{size}_{role}_{role_index}”
content: brief description of data content
mode: how the original data is divided, either “homo” or “hetero”; some data sets do not have this information
size: includes keyword “mini” if the data set is truncated from another larger set
role: role name, either “host” or “guest”
role_index: if a data set is further divided and shared among multiple hosts in an example, indices starting at 1 are used to distinguish the parties
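The naming rule above can be sketched as a small helper. This is a minimal illustration, not part of FATE: the function name `build_dataset_name` and its parameters are hypothetical, and it assumes that optional fields (mode, size, role_index) are simply omitted when they do not apply.

```python
from typing import Optional


def build_dataset_name(content: str, role: str, mode: Optional[str] = None,
                       mini: bool = False, role_index: Optional[int] = None) -> str:
    """Join the naming-rule fields with underscores, skipping absent ones.

    Hypothetical helper: the field order follows
    {content}_{mode}_{size}_{role}_{role_index}.
    """
    if mode is not None and mode not in ("homo", "hetero"):
        raise ValueError(f"mode must be 'homo' or 'hetero', got {mode!r}")
    if role_index is not None and role_index < 1:
        raise ValueError("role_index starts at 1")

    parts = [content]
    if mode is not None:
        parts.append(mode)
    if mini:
        # The size field only ever carries the keyword "mini".
        parts.append("mini")
    # The role is not validated here; the listings also use "test"
    # in the role position for some horizontally divided data sets.
    parts.append(role)
    if role_index is not None:
        parts.append(str(role_index))
    return "_".join(parts)


print(build_dataset_name("breast", "guest", mode="hetero", mini=True))
# breast_hetero_mini_guest
print(build_dataset_name("default_credit", "host", mode="homo", role_index=2))
# default_credit_homo_host_2
```

Only a builder is shown because splitting an existing name back into fields is ambiguous: the content part may itself contain underscores, as in “default_credit”.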
Data sets used for running examples are uploaded to FATE data storage at the time of deployment.
Uploaded tables share the same namespace, “experiment”, and have a table_name matching the original file name (without the “.csv” extension).
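As a rough sketch of that mapping, the table name and namespace for an example file can be derived from the file name alone. The helper `table_info` is hypothetical and does not call any FATE API; it only reproduces the “name” and “namespace” pairs used in the listings in this document.

```python
from pathlib import PurePosixPath

NAMESPACE = "experiment"  # shared namespace stated in the text above


def table_info(csv_path: str) -> dict:
    """Map an example CSV file to its table name and namespace.

    Hypothetical helper: the table name is the file name without
    its extension, and the namespace is always "experiment".
    """
    return {"name": PurePosixPath(csv_path).stem, "namespace": NAMESPACE}


print(table_info("breast_homo_guest.csv"))
# {'name': 'breast_homo_guest', 'namespace': 'experiment'}
```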
The example data sets and their details are listed below.
Horizontally Divided Data¶
For Homogeneous Federated Learning
breast_homo:¶
30 features
label type: binary
data sets:
“breast_homo_guest.csv”
name: “breast_homo_guest”
namespace: “experiment”
“breast_homo_host.csv”
name: “breast_homo_host”
namespace: “experiment”
“breast_homo_test.csv”
name: “breast_homo_test”
namespace: “experiment”
default_credit_homo:¶
23 features
label type: binary
data sets:
“default_credit_homo_guest.csv”
name: “default_credit_homo_guest”
namespace: “experiment”
“default_credit_homo_host_1.csv”
name: “default_credit_homo_host_1”
namespace: “experiment”
“default_credit_homo_host_2.csv”
name: “default_credit_homo_host_2”
namespace: “experiment”
“default_credit_homo_test.csv”
name: “default_credit_homo_test”
namespace: “experiment”
epsilon_5k_homo:¶
100 features
label type: binary
mock data
data sets:
“epsilon_5k_homo_guest.csv”
name: “epsilon_5k_homo_guest”
namespace: “experiment”
“epsilon_5k_homo_host.csv”
name: “epsilon_5k_homo_host”
namespace: “experiment”
“epsilon_5k_homo_test.csv”
name: “epsilon_5k_homo_test”
namespace: “experiment”
give_credit_homo:¶
10 features
label type: binary
data sets:
“give_credit_homo_guest.csv”
name: “give_credit_homo_guest”
namespace: “experiment”
“give_credit_homo_host.csv”
name: “give_credit_homo_host”
namespace: “experiment”
“give_credit_homo_test.csv”
name: “give_credit_homo_test”
namespace: “experiment”
student_homo:¶
13 features
label type: continuous
data sets:
“student_homo_guest.csv”
name: “student_homo_guest”
namespace: “experiment”
“student_homo_host.csv”
name: “student_homo_host”
namespace: “experiment”
“student_homo_test.csv”
name: “student_homo_test”
namespace: “experiment”
vehicle_scale_homo:¶
18 features
label type: multi-class
data sets:
“vehicle_scale_homo_guest.csv”
name: “vehicle_scale_homo_guest”
namespace: “experiment”
“vehicle_scale_homo_host.csv”
name: “vehicle_scale_homo_host”
namespace: “experiment”
“vehicle_scale_homo_test.csv”
name: “vehicle_scale_homo_test”
namespace: “experiment”
Vertically Divided Data¶
For Heterogeneous Federated Learning
breast_hetero:¶
30 features
label type: binary
data sets:
“breast_hetero_guest.csv”
name: “breast_hetero_guest”
namespace: “experiment”
“breast_hetero_host.csv”
name: “breast_hetero_host”
namespace: “experiment”
breast_hetero_mini:¶
7 features
label type: binary
data sets:
“breast_hetero_mini_guest.csv”
name: “breast_hetero_mini_guest”
namespace: “experiment”
“breast_hetero_mini_host.csv”
name: “breast_hetero_mini_host”
namespace: “experiment”
default_credit_hetero:¶
23 features
label type: binary
data sets:
“default_credit_hetero_guest.csv”
name: “default_credit_hetero_guest”
namespace: “experiment”
“default_credit_hetero_host.csv”
name: “default_credit_hetero_host”
namespace: “experiment”
dvisits_hetero:¶
12 features
label type: continuous
data sets:
“dvisits_hetero_guest.csv”
name: “dvisits_hetero_guest”
namespace: “experiment”
“dvisits_hetero_host.csv”
name: “dvisits_hetero_host”
namespace: “experiment”
epsilon_5k_hetero:¶
100 features
label type: binary
mock data
data sets:
“epsilon_5k_hetero_guest.csv”
name: “epsilon_5k_hetero_guest”
namespace: “experiment”
“epsilon_5k_hetero_host.csv”
name: “epsilon_5k_hetero_host”
namespace: “experiment”
give_credit_hetero:¶
10 features
label type: binary
data sets:
“give_credit_hetero_guest.csv”
name: “give_credit_hetero_guest”
namespace: “experiment”
“give_credit_hetero_host.csv”
name: “give_credit_hetero_host”
namespace: “experiment”
ionosphere_scale_hetero:¶
34 features
label type: binary
data sets:
“ionosphere_scale_hetero_guest.csv”
name: “ionosphere_scale_hetero_guest”
namespace: “experiment”
“ionosphere_scale_hetero_host.csv”
name: “ionosphere_scale_hetero_host”
namespace: “experiment”
motor_hetero:¶
11 features
label type: continuous
data sets:
“motor_hetero_guest.csv”
name: “motor_hetero_guest”
namespace: “experiment”
“motor_hetero_host.csv”
name: “motor_hetero_host”
namespace: “experiment”
“motor_hetero_host_1.csv”
name: “motor_hetero_host_1”
namespace: “experiment”
“motor_hetero_host_2.csv”
name: “motor_hetero_host_2”
namespace: “experiment”
motor_hetero_mini:¶
7 features
label type: continuous
data sets:
“motor_hetero_mini_guest.csv”
name: “motor_hetero_mini_guest”
namespace: “experiment”
“motor_hetero_mini_host.csv”
name: “motor_hetero_mini_host”
namespace: “experiment”
student_hetero:¶
13 features
label type: continuous
data sets:
“student_hetero_guest.csv”
name: “student_hetero_guest”
namespace: “experiment”
“student_hetero_host.csv”
name: “student_hetero_host”
namespace: “experiment”
Federated Transfer Learning Data¶
For Federated Transfer Learning
nus_wide:¶
636/1000 features
label type: binary
data sets:
“nus_wide_train_guest.csv”
name: “nus_wide_train_guest”
namespace: “experiment”
“nus_wide_train_host.csv”
name: “nus_wide_train_host”
namespace: “experiment”
“nus_wide_validate_guest.csv”
name: “nus_wide_validate_guest”
namespace: “experiment”
“nus_wide_validate_host.csv”
name: “nus_wide_validate_host”
namespace: “experiment”