dlpy.speech.Speech

class dlpy.speech.Speech(conn, data_path, local_path=None, acoustic_model_path=None, language_model_path=None)

Bases: object

Class for performing speech recognition using SAS Viya.

Parameters:
conn : CAS Connection

Specifies the CAS connection object

data_path : string

Specifies the absolute path of the folder where segmented audio files are stored (server side).

The “audio_path” parameter of the “transcribe” method refers to a file on the client side. To transcribe the audio, the .wav file must first be copied somewhere the CAS server can access. Also, if the audio is long, it may need to be segmented into multiple files before copying.

Note that this is the location where the temporary audio files are stored. The Python client needs both read and write permission for this folder, and the CAS server needs at least read permission for it.

local_path : string, optional

Specifies the path of the folder where segmented audio files are stored (client side). Default = None

Note that “data_path” and “local_path” point to the same physical folder; the two path strings differ only when the CAS server and the Python client run on different machines (for example, when the folder is shared over a network mount). If the server and client are on the same machine, the two paths are identical.

acoustic_model_path : string, optional

Specifies the absolute server-side path of the acoustic model file. Make sure the weights file and the weights attribute file are placed in the same directory. Default = None

language_model_path : string, optional

Specifies the absolute server-side path of the language model file. Default = None

__init__(conn, data_path, local_path=None, acoustic_model_path=None, language_model_path=None)

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__(conn, data_path[, local_path, …]) Initialize self.
load_acoustic_model(acoustic_model_path) Load the RNN acoustic model.
load_language_model(language_model_path) Load the N-gram language model.
transcribe(audio_path[, max_path_size, …]) Transcribe the audio file into text.
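As a sketch of typical usage: the host name, port, credentials, and all file paths below are placeholder assumptions, not values from this documentation, and running it requires a live CAS server with the SWAT and DLPy packages installed.

```python
def transcribe_file(audio_file):
    """Transcribe a client-side .wav file using a hypothetical CAS setup."""
    # Imports are local so this sketch can be inspected without SWAT/DLPy installed.
    import swat
    import dlpy.speech

    # Connect to a running CAS server (host and port are placeholders).
    conn = swat.CAS("cas-server.example.com", 5570)

    # "data_path" is the temporary-audio folder as the CAS server sees it
    # (server side); "local_path" is the same physical folder as the Python
    # client sees it (client side). All paths here are placeholders.
    sp = dlpy.speech.Speech(
        conn,
        data_path="/shared/audio/tmp",
        local_path="/mnt/shared/audio/tmp",
        acoustic_model_path="/models/asr/acoustic_model.sashdat",
        language_model_path="/models/asr/language_model.sashdat",
    )

    # Segment, copy, and transcribe the client-side .wav file into text.
    return sp.transcribe(audio_file)
```

Because the acoustic and language model paths were passed to the constructor, the models are loaded up front; alternatively, they can be supplied later through load_acoustic_model and load_language_model before calling transcribe.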