dlpy.timeseries.TimeseriesTable.timeseries_partition

TimeseriesTable.timeseries_partition(training_start=None, validation_start=None, testing_start=None, end_time=None, partition_var_name='split_id', traintbl_suffix='train', validtbl_suffix='valid', testtbl_suffix='test')

Split the dataset into training, validation, and testing sets.

Parameters:

training_start : float or datetime.datetime or datetime.date, optional: The training set starting time stamp. If None, the training set starts at the earliest observation record in the table.
Default: None
validation_start : float or datetime.datetime or datetime.date, optional: The validation set starting time stamp. The training set ends right before it. If None, there is no validation set, and the training set ends right before the start of the testing set.
Default: None
testing_start : float or datetime.datetime or datetime.date, optional: The testing set starting time stamp. The validation set (or training set if validation set is not specified) ends right before it. If None, there is no testing set, and the validation set (or training set if validation set is not set) ends at the end_time.
Default: None
end_time : float or datetime.datetime or datetime.date, optional: The end time for the table.
Default: None
partition_var_name : string, optional: The name of the indicator column that marks each observation as training, validation, or testing.
Default: ‘split_id’
traintbl_suffix : string, optional: The suffix name of the CASTable for the training set.
Default: ‘train’
validtbl_suffix : string, optional: The suffix name of the CASTable for the validation set.
Default: ‘valid’
testtbl_suffix : string, optional: The suffix name of the CASTable for the testing set.
Default: ‘test’

Returns:

( training TimeseriesTable, validation TimeseriesTable, testing TimeseriesTable )

Examples

>>> from swat import CAS
>>> from dlpy.timeseries import TimeseriesTable
>>> import datetime
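>>> # Hypothetical host and port; substitute your own CAS server.
>>> s = CAS('cas.example.com', 5570)
>>> time_tbl = TimeseriesTable.from_localfile(
...     s,
...     r"U:\data\tmp\timeseries_exp1.csv",
...     casout=dict(name='table1', replace=True))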
>>> valid_start = datetime.datetime(2015, 1, 7, 0, 0, 0)
>>> test_start = datetime.date(2015, 1, 9)
>>> traintbl, validtbl, testtbl = time_tbl.timeseries_partition(validation_start=valid_start,
... testing_start=test_start)
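
The boundary rules described under Parameters can be illustrated without a CAS session. The following is a minimal plain-Python sketch, not DLPy's implementation: the helper names as_datetime and assign_split are hypothetical, training_start and end_time are ignored, and float time stamps are not handled. It only shows that each set ends right before the next one starts, and that omitting validation_start extends the training set up to testing_start.

```python
import datetime


def as_datetime(value):
    """Promote a datetime.date to a datetime.datetime at midnight.

    datetime.datetime and datetime.date cannot be compared directly,
    so both kinds of boundary are normalized first.
    """
    if isinstance(value, datetime.datetime):
        return value
    return datetime.datetime.combine(value, datetime.time.min)


def assign_split(timestamp, validation_start=None, testing_start=None):
    """Return 'train', 'valid', or 'test' for one time stamp.

    Hypothetical helper for illustration only; it mirrors the documented
    rule that a set ends right before the next set's starting time stamp.
    """
    ts = as_datetime(timestamp)
    if testing_start is not None and ts >= as_datetime(testing_start):
        return 'test'
    if validation_start is not None and ts >= as_datetime(validation_start):
        return 'valid'
    return 'train'


# Same boundaries as the example above.
valid_start = datetime.datetime(2015, 1, 7, 0, 0, 0)
test_start = datetime.date(2015, 1, 9)

# Six daily observations, 2015-01-05 through 2015-01-10.
days = [datetime.date(2015, 1, d) for d in range(5, 11)]
labels = [assign_split(d, valid_start, test_start) for d in days]
print(labels)

# Without a validation boundary, training runs up to testing_start.
labels_no_valid = [assign_split(d, None, test_start) for d in days]
print(labels_no_valid)
```

Under these assumptions, the observations on January 7 and January 9 fall into the validation and testing sets respectively, because each boundary time stamp belongs to the set it starts.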