dlpy.timeseries.TimeseriesTable.timeseries_partition
TimeseriesTable.timeseries_partition(training_start=None, validation_start=None, testing_start=None, end_time=None, partition_var_name='split_id', traintbl_suffix='train', validtbl_suffix='valid', testtbl_suffix='test')
Split the dataset into training, validation, and testing sets.
Parameters:
- training_start : float or datetime.datetime or datetime.date, optional
    The starting time stamp of the training set. If None, the training set starts at the earliest observation in the table.
    Default: None
- validation_start : float or datetime.datetime or datetime.date, optional
    The starting time stamp of the validation set. The training set ends right before it. If None, there is no validation set, and the training set ends right before the start of the testing set.
    Default: None
- testing_start : float or datetime.datetime or datetime.date, optional
    The starting time stamp of the testing set. The validation set (or the training set, if no validation set is specified) ends right before it. If None, there is no testing set, and the validation set (or the training set, if no validation set is specified) ends at end_time.
    Default: None
- end_time : float or datetime.datetime or datetime.date, optional
    The end time for the table.
    Default: None
- partition_var_name : string, optional
    The name of the indicator column that marks each observation as training, validation, or testing.
    Default: 'split_id'
- traintbl_suffix : string, optional
    The suffix of the CASTable name for the training set.
    Default: 'train'
- validtbl_suffix : string, optional
    The suffix of the CASTable name for the validation set.
    Default: 'valid'
- testtbl_suffix : string, optional
    The suffix of the CASTable name for the testing set.
    Default: 'test'
Returns:
- (training TimeseriesTable, validation TimeseriesTable, testing TimeseriesTable)
Examples
>>> from swat import CAS
>>> from dlpy.timeseries import TimeseriesTable
>>> import datetime
>>> # s is assumed to be an existing swat.CAS session; its creation is not shown here.
>>> time_tbl = TimeseriesTable.from_localfile(
...     s,
...     r"U:\data\tmp\timeseries_exp1.csv",
...     casout=dict(name='table1', replace=True))
>>> valid_start = datetime.datetime(2015, 1, 7, 0, 0, 0)
>>> test_start = datetime.date(2015, 1, 9)
>>> traintbl, validtbl, testtbl = time_tbl.timeseries_partition(validation_start=valid_start,
...                                                             testing_start=test_start)
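
The boundary rules described under Parameters can be sketched in plain Python, without a CAS server. This is only an illustration of the documented semantics, not DLPy's implementation: `assign_split` is a hypothetical helper, the labels it returns are made up for the sketch, and treating `end_time` as inclusive is an assumption (the documentation only says the last set "ends at" it). All time stamps here are `datetime.datetime` values, because plain Python cannot compare a `datetime.datetime` with a `datetime.date`.

```python
import datetime


def assign_split(ts, training_start=None, validation_start=None,
                 testing_start=None, end_time=None):
    """Return 'train', 'valid', 'test', or None for one time stamp.

    Hypothetical helper for illustration only; it is not part of DLPy.
    """
    # Observations before training_start fall outside every partition.
    if training_start is not None and ts < training_start:
        return None
    # Assumption: end_time is inclusive, so only later stamps are dropped.
    if end_time is not None and ts > end_time:
        return None
    # Each set ends right before the next one starts, so check the
    # latest boundary first.
    if testing_start is not None and ts >= testing_start:
        return 'test'
    if validation_start is not None and ts >= validation_start:
        return 'valid'
    return 'train'


valid_start = datetime.datetime(2015, 1, 7)
test_start = datetime.datetime(2015, 1, 9)
stamps = [datetime.datetime(2015, 1, day) for day in (5, 6, 7, 8, 9, 10)]

labels = [assign_split(t, validation_start=valid_start,
                       testing_start=test_start) for t in stamps]
print(labels)

# With validation_start=None there is no validation set, and the
# training set runs up to (but not including) testing_start.
no_valid = [assign_split(t, testing_start=test_start) for t in stamps]
print(no_valid)
```

Under these assumptions, the first call labels January 5 and 6 as training, January 7 and 8 as validation, and January 9 and 10 as testing. The second call labels everything before January 9 as training.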