pipefitter.transformer.imputer.Imputer.transform

Imputer.transform(table, value=None)

Perform the imputation on the given data set.
Parameters:
    table : data set
        The data set to impute.
    value : ImputerMethod or scalar or dict, optional
        Same as for constructor.

Returns:
    data set
        Data set of the same type as table.

Examples
Sample data set used for imputing examples:
>>> data.head()
      A     B     C     D     E  F  G  H
0   1.0   2.0   3.0   4.0   5.0  a  b  c
1   6.0   NaN   8.0   9.0   NaN  j  e  f
2  11.0   NaN  13.0  14.0   NaN     h  i
3  16.0  17.0  18.0   NaN  20.0  j     l
4   NaN  22.0  23.0  24.0   NaN     n  o
Impute values using the mean:
>>> meanimp = Imputer(Imputer.MEAN)
>>> newdata = meanimp.transform(data)
>>> newdata.head()
      A          B     C      D     E  F  G  H
0   1.0   2.000000   3.0   4.00   5.0  a  b  c
1   6.0  13.666667   8.0   9.00  12.5  j  e  f
2  11.0  13.666667  13.0  14.00  12.5     h  i
3  16.0  17.000000  18.0  12.75  20.0  j     l
4   8.5  22.000000  23.0  24.00  12.5     n  o
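The arithmetic behind the mean example can be checked by hand. The following is a minimal plain-Python sketch, not pipefitter's implementation: `impute_mean` is a hypothetical helper, and the column values are copied from the sample data shown above.

```python
import math
from statistics import mean


def impute_mean(column):
    """Return a copy of `column` with NaN entries replaced by the mean
    of the observed (non-NaN) entries. Hypothetical helper for illustration."""
    observed = [x for x in column if not math.isnan(x)]
    fill = mean(observed)
    return [fill if math.isnan(x) else x for x in column]


# Columns B and E from the sample data set
col_b = [2.0, math.nan, math.nan, 17.0, 22.0]
col_e = [5.0, math.nan, math.nan, 20.0, math.nan]

filled_b = impute_mean(col_b)  # missing entries become (2 + 17 + 22) / 3
filled_e = impute_mean(col_e)  # missing entries become (5 + 20) / 2
print(filled_b)
print(filled_e)
```

The fill values agree with the 13.666667 and 12.5 shown in the output above. The string columns F, G and H are left unchanged in that output.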
Impute values using the mode:
>>> modeimp = Imputer(Imputer.MODE)
>>> newdata = modeimp.transform(data)
>>> newdata.head()
      A     B     C     D     E  F  G  H
0   1.0   2.0   3.0   4.0   5.0  a  b  c
1   6.0   2.0   8.0   9.0   5.0  j  e  f
2  11.0   2.0  13.0  14.0   5.0  j  h  i
3  16.0  17.0  18.0   4.0  20.0  j  b  l
4   1.0  22.0  23.0  24.0   5.0  j  n  o
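In the mode output above, several columns have no repeated value (column A, for example), yet a fill value is still chosen. One rule that reproduces the output shown is to break ties by taking the smallest value; this is an assumption made for illustration, not documented pipefitter behavior. A minimal plain-Python sketch, with hypothetical helpers `is_missing` and `impute_mode`, and with `None` standing in for a missing string:

```python
import math
from collections import Counter


def is_missing(x):
    """Treat None and NaN as missing (an assumption for this sketch)."""
    return x is None or (isinstance(x, float) and math.isnan(x))


def impute_mode(column):
    """Replace missing entries with the most frequent observed value.
    Ties are broken by taking the smallest value (assumed rule)."""
    observed = [x for x in column if not is_missing(x)]
    counts = Counter(observed)
    top = max(counts.values())
    fill = min(value for value, count in counts.items() if count == top)
    return [fill if is_missing(x) else x for x in column]


# Columns A and F from the sample data set
col_a = [1.0, 6.0, 11.0, 16.0, math.nan]
col_f = ['a', 'j', None, 'j', None]

filled_a = impute_mode(col_a)  # all values unique, so the smallest (1.0) is used
filled_f = impute_mode(col_f)  # 'j' occurs twice, so it is the mode
print(filled_a)
print(filled_f)
```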
Impute a constant value:
>>> cimp = Imputer(100)
>>> newdata = cimp.transform(data)
>>> newdata.head()
       A      B     C      D      E  F  G  H
0    1.0    2.0   3.0    4.0    5.0  a  b  c
1    6.0  100.0   8.0    9.0  100.0  j  e  f
2   11.0  100.0  13.0   14.0  100.0     h  i
3   16.0   17.0  18.0  100.0   20.0  j     l
4  100.0   22.0  23.0   24.0  100.0     n  o
Impute values in specified columns:
>>> dimp = Imputer({'A': 1, 'B': 100,
...                 'F': 'none', 'G': 'miss'})
>>> newdata = dimp.transform(data)
>>> newdata.head()
      A      B     C     D     E     F     G  H
0   1.0    2.0   3.0   4.0   5.0     a     b  c
1   6.0  100.0   8.0   9.0   NaN     j     e  f
2  11.0  100.0  13.0  14.0   NaN  none     h  i
3  16.0   17.0  18.0   NaN  20.0     j  miss  l
4   1.0   22.0  23.0  24.0   NaN  none     n  o
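With a dict, only the columns named as keys are filled; columns D and E keep their missing values in the output above. A minimal plain-Python sketch of that per-column behavior follows. `impute_by_column` and `is_missing` are hypothetical helpers, the table is modeled as a dict of lists rather than a real data set object, and `None` stands in for a missing string.

```python
import math


def is_missing(x):
    """Treat None and NaN as missing (an assumption for this sketch)."""
    return x is None or (isinstance(x, float) and math.isnan(x))


def impute_by_column(table, fills):
    """Fill missing entries only in columns that appear in `fills`;
    every other column is copied through unchanged."""
    return {
        name: [fills[name] if name in fills and is_missing(x) else x
               for x in values]
        for name, values in table.items()
    }


# Columns A, E and F from the sample data set
table = {
    'A': [1.0, 6.0, 11.0, 16.0, math.nan],
    'E': [5.0, math.nan, math.nan, 20.0, math.nan],
    'F': ['a', 'j', None, 'j', None],
}

result = impute_by_column(table, {'A': 1, 'F': 'none'})
print(result['A'])  # last entry filled with 1
print(result['E'])  # untouched: 'E' is not a key in the dict
print(result['F'])  # missing entries filled with 'none'
```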