dlpy.audio.AudioTable.extract_audio_features
classmethod AudioTable.extract_audio_features(conn, table, frame_shift=10, frame_length=25, n_bins=40, n_ceps=40, feature_scaling_method='STANDARDIZATION', n_output_frames=500, casout=None, random_shuffle=True, **kwargs)
Extracts audio features from the audio files.
Parameters:
- conn : CAS
A connection object to the current session.
- table : AudioTable
An audio table containing the audio files.
- frame_shift : int, optional
Specifies the time difference (in milliseconds) between the beginnings of consecutive frames.
Default: 10
- frame_length : int, optional
Specifies the length of a frame (in milliseconds).
Default: 25
- n_bins : int, optional
Specifies the number of triangular mel-frequency bins.
Default: 40
- n_ceps : int, optional
Specifies the number of cepstral coefficients in each MFCC feature frame (including C0).
Default: 40
- feature_scaling_method : string, optional
Specifies the feature scaling method to apply to the computed feature vectors.
Default: 'standardization'
- n_output_frames : int, optional
Specifies the exact number of frames to include in the output table (extra frames are dropped and missing frames are padded with zeros).
Default: 500
- casout : dict or string or CASTable, optional
Specifies the CAS output table for the extracted features.
- random_shuffle : bool, optional
Specifies whether to randomly shuffle the generated CAS table.
Default: True
- kwargs : keyword-arguments, optional
Additional parameters for feature extraction. For details, see the documentation for audio.computeFeatures.
Returns:
- AudioTable
If table exists
- None
If no table exists
Examples
>>> import swat
>>> from dlpy.audio import AudioTable
>>> s = swat.CAS("cloud.example.com", 5570)
>>> aud_table = AudioTable.load_audio_files(s, "/path/to/audio/file.txt")
>>> feature_table = aud_table.extract_audio_features(s, aud_table)
>>> feature_table.summary()
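To get a feel for how frame_shift, frame_length, and n_output_frames interact, the following pure-Python sketch may help. It does not call DLPy or CAS. The helper names `frame_count` and `fit_frames` are hypothetical, and the framing convention (only full frames are counted, with no padding at the clip edges) is an assumption; the audio.computeFeatures action may count frames differently. The drop-or-zero-pad behavior mirrors the description of n_output_frames above.

```python
def frame_count(duration_ms, frame_shift=10, frame_length=25):
    """Number of full frames in a clip (assumed convention: no edge padding)."""
    if duration_ms < frame_length:
        return 0
    # The first frame covers [0, frame_length); each later frame starts
    # frame_shift milliseconds after the previous one.
    return 1 + (duration_ms - frame_length) // frame_shift


def fit_frames(frames, n_output_frames=500, n_ceps=40):
    """Drop extra frames or append all-zero frames to reach n_output_frames."""
    kept = [list(f) for f in frames[:n_output_frames]]
    missing = n_output_frames - len(kept)
    # range() is empty when missing <= 0, so nothing is appended in that case.
    kept.extend([0.0] * n_ceps for _ in range(missing))
    return kept


# A 3-second clip yields 1 + (3000 - 25) // 10 = 298 frames under this assumption,
# so 202 zero frames would be appended to reach the default of 500.
short_clip = [[1.0] * 40 for _ in range(frame_count(3000))]
fixed = fit_frames(short_clip)
print(len(short_clip), len(fixed))
```

Under the same assumption, the default of 500 output frames corresponds to roughly 5 seconds of audio (25 ms + 499 × 10 ms = 5015 ms); longer clips would be truncated and shorter ones zero-padded.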