dlpy.speech.Speech.transcribe
Speech.transcribe(audio_path, max_path_size=100, alpha=1.0, beta=0.0, gpu=None)

Transcribe the audio file into text.
Note that this API assumes the speech-to-text models published with SAS Viya 3.4 are used. Please download the acoustic and language model files from here: https://support.sas.com/documentation/prod-p/vdmml/zip/speech_19w21.zip
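The model archive referenced above can be fetched and unpacked with standard Python tooling. The following is a minimal sketch, not part of the dlpy API; the URL comes from the note above, and the archive and directory names are placeholders.

import urllib.request
import zipfile

# Download the acoustic/language model archive referenced in the note above.
url = 'https://support.sas.com/documentation/prod-p/vdmml/zip/speech_19w21.zip'
urllib.request.urlretrieve(url, 'speech_19w21.zip')

# Unpack the models into a local directory (placeholder name).
with zipfile.ZipFile('speech_19w21.zip') as zf:
    zf.extractall('speech_models')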
Parameters:
- audio_path : string
Specifies the location of the audio file (client-side, absolute or relative path).
- max_path_size : int, optional
Specifies the maximum number of paths kept as candidates of the final results during the decoding process. Default = 100
- alpha : double, optional
Specifies the weight of the language model, relative to the acoustic model. Default = 1.0
- beta : double, optional
Specifies the weight of the sentence length, relative to the acoustic model. Default = 0.0
- gpu : class, optional
When specified, the action uses Graphics Processing Unit hardware. The simplest way to enable GPU processing is to specify gpu=1; in this case, the default values of the other GPU parameters are used and all available GPU devices are enabled. Setting gpu=0 disables GPU processing.
Returns:
- string
The transcribed text from the audio file located at audio_path.
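A minimal usage sketch follows. It assumes a running CAS server; the connection details, server-side paths, and the Speech constructor arguments shown here (data_path, local_path, acoustic_model_path, language_model_path) are illustrative assumptions and may differ from your environment.

import swat
from dlpy.speech import Speech

# Connect to a CAS server (host, port, and credentials are placeholders).
conn = swat.CAS('cas-server.example.com', 5570, 'username', 'password')

# Build a Speech object pointing at the acoustic and language model files
# downloaded from the SAS support link above (all paths are assumptions).
speech = Speech(conn,
                data_path='/casdata/speech',
                local_path='/tmp/speech',
                acoustic_model_path='/casdata/speech/acoustic_model.sashdat',
                language_model_path='/casdata/speech/language_model.sashdat')

# Transcribe a client-side audio file; gpu=1 enables all available GPU devices.
text = speech.transcribe('sample.wav', max_path_size=100, alpha=1.0, beta=0.0, gpu=1)
print(text)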