dlpy.audio.AudioTable.create_audio_feature_table

AudioTable.create_audio_feature_table(frame_shift=10, frame_length=25, n_bins=40, n_ceps=40, feature_scaling_method='STANDARDIZATION', n_output_frames=500, casout=None, label_level=0, random_shuffle=True)

Extracts audio features from the audio table and creates a new CASTable that contains the features.

Parameters:

frame_shift : int, optional: Specifies the time difference (in milliseconds) between the beginnings of consecutive frames.
Default: 10
frame_length : int, optional: Specifies the length of a frame (in milliseconds).
Default: 25
n_bins : int, optional: Specifies the number of triangular mel-frequency bins.
Default: 40
n_ceps : int, optional: Specifies the number of cepstral coefficients in each MFCC feature frame (including C0).
Default: 40
feature_scaling_method : string, optional: Specifies the feature scaling method to apply to the computed feature vectors.
Default: 'STANDARDIZATION'
n_output_frames : int, optional: Specifies the exact number of frames to include in the output table (extra frames are dropped and missing frames are padded with zeros).
Default: 500
casout : dict or string or CASTable, optional: Specifies the CAS output table.
Default: None
label_level : int, optional: Specifies which path level is used to generate the class label for each audio file. For instance, label_level = 1 means the first directory and label_level = -2 means the last directory. Internally, this uses the SAS scan function (see https://www.sascrunch.com/scan-function.html for details). By default, no class labels are generated.
Default: 0
random_shuffle : bool, optional: Specifies whether to randomly shuffle the generated CAS table.
Default: True
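
The relationship between frame_shift, frame_length, and n_output_frames can be illustrated with a little arithmetic. The sketch below is not part of DLPy: count_frames and fit_to_length are hypothetical helpers, and the framing convention (only full frames are counted) is an assumption that may differ from what the underlying CAS action does at clip boundaries. It only shows why the defaults correspond to roughly five seconds of audio and what "dropped" and "padded with zeros" mean for a fixed-length output.

```python
def count_frames(duration_ms, frame_shift=10, frame_length=25):
    """Count full frames in a clip of duration_ms milliseconds.

    Assumption: a frame is counted only if it fits entirely inside the clip.
    """
    if duration_ms < frame_length:
        return 0
    return 1 + (duration_ms - frame_length) // frame_shift


def fit_to_length(frames, n_output_frames=500, n_features=40):
    """Drop extra frames or append all-zero frames to reach a fixed length."""
    kept = [list(f) for f in frames[:n_output_frames]]
    # range() of a non-positive number is empty, so no padding is added
    # when the clip already has enough frames.
    padding = [[0.0] * n_features for _ in range(n_output_frames - len(kept))]
    return kept + padding


# With the defaults, 500 frames span (500 - 1) * 10 + 25 = 5015 ms.
frames_in_5s = count_frames(5015)
# A 1-second clip yields far fewer frames and would be zero-padded.
frames_in_1s = count_frames(1000)
```

Under these assumptions, clips longer than about five seconds lose their trailing frames with the default settings, so n_output_frames may need to be raised for longer recordings.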

Returns:

AudioTable: If the table exists
None: If no table exists

Examples

>>> import swat
>>> from dlpy.audio import AudioTable
>>> s = swat.CAS("cloud.example.com", 5570)
>>> aud_table = AudioTable.load_audio_files(s, "/path/to/audio/file.txt")
>>> feature_table = aud_table.create_audio_feature_table()
>>> feature_table.head()
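
To make the label_level parameter more concrete, the following sketch mimics in plain Python how a scan-style lookup picks one component of a file path. It is an approximation for illustration only: scan_path is a hypothetical helper, the example path is made up, and the use of "/" as the only delimiter is an assumption about how the path is split.

```python
def scan_path(path, level, delimiter="/"):
    """Return the level-th path component, or None when no label applies.

    Positive levels count components from the left, negative levels from
    the right. Empty components (from leading or repeated delimiters) are
    ignored, similar to the default behavior of the SAS scan function.
    """
    words = [w for w in path.split(delimiter) if w]
    # level 0 corresponds to "no class labels"; returning None for that case
    # and for out-of-range levels is a choice made for this sketch.
    if level == 0 or abs(level) > len(words):
        return None
    return words[level - 1] if level > 0 else words[level]


example_path = "/data/audio/cats/meow.wav"  # hypothetical file path
first_dir = scan_path(example_path, 1)    # "data"
last_dir = scan_path(example_path, -2)    # "cats"
```

In this sketch, level -1 would select the file name itself, which is why -2 is the level that refers to the last directory.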