dlpy.speech.Speech

class dlpy.speech.Speech(conn, data_path, local_path=None, acoustic_model_path=None, language_model_path=None)

Bases: object
Class to perform speech recognition using SAS Viya.
Parameters:

- conn : CAS Connection
Specifies the CAS connection object.
- data_path : string
Specifies the absolute path of the folder where segmented audio files are stored (server side).
The “audio_path” parameter of the “transcribe” method refers to a client-side location. To transcribe the audio, the .wav file must first be saved somewhere the CAS server can access. Also, if the audio is long, it may need to be segmented into multiple files before copying.
Note that this is the location where the temporary audio files are stored. The Python client must have both read and write permission for this folder, and the CAS server must have at least read permission for it.
- local_path : string, optional
Specifies the path of the folder where segmented audio files are stored (client side). Default = None
Notice that “data_path” and “local_path” point to the same physical folder: “data_path” is the path as seen by the CAS server, and “local_path” is the path as seen by the Python client. The two strings are identical only when the CAS server and the Python client run on the same machine.
- acoustic_model_path : string, optional
Specifies the absolute server-side path of the acoustic model file. Make sure the weights file and the weights attribute file are placed in the same directory. Default = None
- language_model_path : string, optional
Specifies the absolute server-side path of the language model file. Default = None
__init__(conn, data_path, local_path=None, acoustic_model_path=None, language_model_path=None)

Initialize self. See help(type(self)) for accurate signature.
Methods

__init__(conn, data_path[, local_path, …])
    Initialize self.
load_acoustic_model(acoustic_model_path)
    Load the RNN acoustic model.
load_language_model(language_model_path)
    Load the N-gram language model.
transcribe(audio_path[, max_path_size, …])
    Transcribe the audio file into text.
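A minimal usage sketch of the workflow above, assuming a reachable CAS server (the host, port, model file names, and audio path below are all hypothetical placeholders) and a shared folder that is mounted at the same path on both the server and the client:

```python
import swat
from dlpy.speech import Speech

# Hypothetical connection details -- replace with your own CAS host/port.
conn = swat.CAS("cas-server.example.com", 5570)

speech = Speech(
    conn,
    data_path="/shared/audio_tmp",   # temp folder as seen by the CAS server
    local_path="/shared/audio_tmp",  # same folder as seen by the Python client
)

# Load pretrained models from hypothetical server-side paths.
speech.load_acoustic_model("/models/asr_acoustic.sashdat")
speech.load_language_model("/models/asr_language.sashdat")

# Transcribe a client-side .wav file; segmented copies are written to
# local_path so the CAS server can read them from data_path.
text = speech.transcribe("/home/user/recording.wav")
print(text)
```

Because `data_path` and `local_path` name the same folder here, this sketch corresponds to the case where the server and client share a filesystem; with separate machines, the two arguments would differ while still pointing at the same physical location.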