dlpy.audio.AudioTable.create_audio_feature_table

AudioTable.create_audio_feature_table(frame_shift=10, frame_length=25, n_bins=40, n_ceps=40, feature_scaling_method='STANDARDIZATION', n_output_frames=500, casout=None, label_level=0, random_shuffle=True)

Extracts audio features from the audio table and creates a new CASTable that contains the features.

Parameters
frame_shift : int, optional

Specifies the time difference (in milliseconds) between the beginnings of consecutive frames. Default: 10

frame_length : int, optional

Specifies the length of a frame (in milliseconds). Default: 25

n_bins : int, optional

Specifies the number of triangular mel-frequency bins. Default: 40

n_ceps : int, optional

Specifies the number of cepstral coefficients in each MFCC feature frame (including C0). Default: 40

feature_scaling_method : string, optional

Specifies the feature scaling method to apply to the computed feature vectors. Default: 'STANDARDIZATION'

n_output_frames : int, optional

Specifies the exact number of frames to include in the output table (extra frames are dropped and missing frames are padded with zeros). Default: 500

casout : dict or string or CASTable, optional

Specifies the CAS output table.

label_level : int, optional

Specifies which path level is used to generate the class label for each audio file. For instance, label_level = 1 means the first directory and label_level = -2 means the last directory. Internally, this uses the SAS scan function (see https://www.sascrunch.com/scan-function.html for more details). By default, no class labels are generated. Default: 0

random_shuffle : bool, optional

Specifies whether to shuffle the generated CAS table randomly. Default: True
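
The interaction between the framing parameters, n_output_frames, and label_level can be illustrated with a small pure-Python sketch. It does not call DLPy or CAS. The helper names (count_frames, fit_frames, label_from_path) are hypothetical and exist only for this illustration. The frame count assumes the common convention of keeping only complete frames with no edge padding, and the label helper only approximates the word-counting behavior of the SAS scan function with "/" as the delimiter; the server-side action may differ in both respects.

```python
def count_frames(duration_ms, frame_shift=10, frame_length=25):
    """Number of complete frames in a clip (assumes no edge padding)."""
    if duration_ms < frame_length:
        return 0
    return 1 + (duration_ms - frame_length) // frame_shift


def fit_frames(frames, n_output_frames=500, n_features=40):
    """Drop extra frames, or append all-zero frames, to reach n_output_frames."""
    kept = [list(frame) for frame in frames[:n_output_frames]]
    missing = n_output_frames - len(kept)
    kept.extend([0.0] * n_features for _ in range(missing))
    return kept


def label_from_path(path, label_level, sep="/"):
    """Pick one path component, counting from the left (positive level)
    or from the right (negative level). Level 0 yields no label."""
    if label_level == 0:
        return None
    # Like scan, treat repeated and leading delimiters as a single separator.
    words = [word for word in path.split(sep) if word]
    index = label_level - 1 if label_level > 0 else label_level
    try:
        return words[index]
    except IndexError:
        return None


# With the default 10 ms shift and 25 ms length, a 5015 ms clip
# yields exactly the default n_output_frames.
print(count_frames(5015))  # 500

# A short clip is padded with zero frames up to the requested length.
print(fit_frames([[1.0, 2.0]], n_output_frames=3, n_features=2))

# Hypothetical file path used only for illustration.
example_path = "/data/audio/dog/bark1.wav"
print(label_from_path(example_path, 1))   # data
print(label_from_path(example_path, -2))  # dog
```

Under these assumptions, the defaults (frame_shift=10, frame_length=25, n_output_frames=500) correspond to roughly five seconds of audio per row; longer clips are truncated and shorter ones are zero-padded.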

Returns
AudioTable

If the table exists

None

If no table exists