dlpy.splitting.three_way_split
dlpy.splitting.three_way_split(tbl, valid_rate=20, test_rate=20, stratify=True, im_table=True, stratify_by='_label_', image_col='_image_', train_name=None, valid_name=None, test_name=None, **kwargs)
Split image data into training, validation, and testing sets.
Parameters:
- tbl : CASTable
The CAS table to split
- valid_rate : double, optional
Specifies the percentage of the data assigned to the validation set, e.g. 20 means 20% of the images will be in the validation set.
- test_rate : double, optional
Specifies the percentage of the data assigned to the testing set, e.g. 20 means 20% of the images will be in the testing set.
Note: the total of valid_rate and test_rate cannot exceed 100.
- stratify : boolean, optional
If True, stratify the sampling by the stratify_by column. If False, do random sampling without stratification.
- im_table : boolean, optional
If True, the outputs are converted to ImageTable objects. If False, CASTables are returned with all columns.
- stratify_by : string, optional
The variable to stratify by
- image_col : string
Name of image column if returning ImageTable
- train_name : string
Specifies the output table name for the training set
- valid_name : string
Specifies the output table name for the validation set
- test_name : string
Specifies the output table name for the test set
- kwargs : keyword arguments, optional
Additional keyword arguments passed to the sample.stratified or sample.srs actions. For details, see sample.stratified and sample.srs.
Returns:
- ( train CASTable, valid CASTable, test CASTable )
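The following sketch illustrates how the valid_rate, test_rate, stratify, and stratify_by parameters described above interact. It is not DLPy code and makes no CAS calls: the helper sketch_three_way_split is a hypothetical stand-in that works on a plain list of dicts, and its seed argument and rounding-down behavior are assumptions made for the illustration, not documented behavior of three_way_split.

```python
import random
from collections import defaultdict


def sketch_three_way_split(rows, valid_rate=20, test_rate=20,
                           stratify=True, stratify_by='_label_', seed=0):
    """Hypothetical stand-in (not DLPy): split a list of dicts three ways."""
    # Rates are percentages, so together they cannot exceed 100.
    if valid_rate < 0 or test_rate < 0 or valid_rate + test_rate > 100:
        raise ValueError('valid_rate + test_rate must be between 0 and 100')

    rng = random.Random(seed)  # assumed seed, only to make the sketch repeatable

    # Group rows by label when stratifying; otherwise use one group for all rows.
    groups = defaultdict(list)
    for row in rows:
        groups[row[stratify_by] if stratify else None].append(row)

    train, valid, test = [], [], []
    for members in groups.values():
        members = members[:]  # copy so the caller's list is not reordered
        rng.shuffle(members)
        # Assumption: partial rows are rounded down.
        n_valid = int(len(members) * valid_rate / 100)
        n_test = int(len(members) * test_rate / 100)
        valid.extend(members[:n_valid])
        test.extend(members[n_valid:n_valid + n_test])
        train.extend(members[n_valid + n_test:])
    return train, valid, test


# 100 rows, half labeled 'cat' and half labeled 'dog'.
rows = [{'_image_': f'img_{i}', '_label_': 'cat' if i % 2 else 'dog'}
        for i in range(100)]
train, valid, test = sketch_three_way_split(rows)
print(len(train), len(valid), len(test))  # prints: 60 20 20
```

With the default rates of 20 and 20, the remaining 60% of the rows form the training set. Because the sketch samples within each label group, the validation and testing sets each keep the same 50/50 label balance as the input.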