dlpy.timeseries.TimeseriesTable.timeseries_partition

TimeseriesTable.timeseries_partition(training_start=None, validation_start=None, testing_start=None, end_time=None, partition_var_name='split_id', traintbl_suffix='train', validtbl_suffix='valid', testtbl_suffix='test')

Split the dataset into training, validation, and testing sets.

Parameters:
training_start : float or datetime.datetime or datetime.date, optional

The training set starting time stamp. If None, the training set starts at the earliest observation record in the table.
Default: None

validation_start : float or datetime.datetime or datetime.date, optional

The validation set starting time stamp. The training set ends right before it. If None, there is no validation set, and the training set ends right before the start of the testing set.
Default: None

testing_start : float or datetime.datetime or datetime.date, optional

The testing set starting time stamp. The validation set (or the training set, if no validation set is specified) ends right before it. If None, there is no testing set, and the validation set (or the training set, if no validation set is specified) ends at end_time.
Default: None

end_time : float or datetime.datetime or datetime.date, optional

The end time for the table.
Default: None

partition_var_name : string, optional

The name of the indicator column that marks each observation as training, validation, or testing.
Default: ‘split_id’

traintbl_suffix : string, optional

The suffix name of the CASTable for the training set.
Default: ‘train’

validtbl_suffix : string, optional

The suffix name of the CASTable for the validation set.
Default: ‘valid’

testtbl_suffix : string, optional

The suffix name of the CASTable for the testing set.
Default: ‘test’

Returns:
( training TimeseriesTable, validation TimeseriesTable, testing TimeseriesTable )

Examples

>>> from swat import CAS
>>> from dlpy.timeseries import TimeseriesTable
>>> import datetime
>>> s = CAS('cas-host.example.com', 5570)  # placeholder host and port; use your own CAS server
>>> time_tbl = TimeseriesTable.from_localfile(
...     s,
...     r"U:\data\tmp\timeseries_exp1.csv",
...     casout=dict(name='table1', replace=True))
>>> valid_start = datetime.datetime(2015, 1, 7, 0, 0, 0)
>>> test_start = datetime.date(2015, 1, 9)
>>> traintbl, validtbl, testtbl = time_tbl.timeseries_partition(
...     validation_start=valid_start,
...     testing_start=test_start)
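
The boundary rules described under Parameters can be illustrated without a CAS server. The following is a minimal pure-Python sketch, not DLPy's implementation: `assign_split` is a hypothetical helper, and treating `end_time` as inclusive and dropping observations outside the overall range are assumptions made for illustration only.

```python
import datetime


def assign_split(timestamp, training_start=None, validation_start=None,
                 testing_start=None, end_time=None):
    """Return 'train', 'valid', 'test', or None for a single time stamp.

    Hypothetical helper that mirrors the documented boundary rules.
    """
    # Assumption: observations before training_start or after end_time
    # belong to no split. A value of None means "no bound on that side".
    if training_start is not None and timestamp < training_start:
        return None
    if end_time is not None and timestamp > end_time:
        return None
    # Check the latest boundary first, so each set ends right before
    # the start of the next one.
    if testing_start is not None and timestamp >= testing_start:
        return 'test'
    if validation_start is not None and timestamp >= validation_start:
        return 'valid'
    return 'train'


# Six daily observations, 2015-01-05 through 2015-01-10.
days = [datetime.datetime(2015, 1, d) for d in range(5, 11)]
labels = [
    assign_split(day,
                 validation_start=datetime.datetime(2015, 1, 7),
                 testing_start=datetime.datetime(2015, 1, 9))
    for day in days
]
print(labels)
```

With these boundaries, the first two days fall in the training set, the next two in the validation set, and the last two in the testing set. The sketch uses `datetime.datetime` for every boundary because plain Python raises a `TypeError` when a `datetime.date` is compared with a `datetime.datetime`.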