sasctl.pzmm.write_score_code
- class sasctl.pzmm.write_score_code.ScoreCode
Bases: object
- static convert_mas_to_cas(mas_code: str, model: str | dict | RestObj) → str
Using the score.sas code generated by the Python wrapper API, convert the SAS Micro Analytic Service based code to CAS-compatible code.
- static sanitize_model_prefix(prefix: str) → str
Check that model_prefix is a valid Python function name and adjust it if it is not.
- Parameters:
prefix (str) – The variable for the model name that is used when naming model files. (For example: hmeqClassTree + [Score.py || .pickle]).
- Returns:
model_prefix (str) – Returns a model_prefix, adjusted as needed for valid Python function names.
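As an illustration of the kind of adjustment described here, the following is a minimal sketch that coerces a prefix into a valid Python identifier. It is not the sasctl implementation: the helper name sanitize_prefix_sketch and its replacement rules (underscores for invalid characters, a leading underscore before a digit, a trailing underscore after a reserved word) are assumptions made for this example.

```python
import keyword
import re


def sanitize_prefix_sketch(prefix: str) -> str:
    """Hypothetical helper: coerce a model prefix into a valid Python identifier."""
    # Replace every character that is not a letter, digit, or underscore.
    cleaned = re.sub(r"\W", "_", prefix)
    # Identifiers cannot be empty or start with a digit, so prepend an underscore.
    if not cleaned or cleaned[0].isdigit():
        cleaned = "_" + cleaned
    # Reserved words such as "class" cannot be used as function names.
    if keyword.iskeyword(cleaned):
        cleaned += "_"
    return cleaned


for raw in ["hmeqClassTree", "hmeq class-tree", "2024_model", "class"]:
    print(raw, "->", sanitize_prefix_sketch(raw))
```

A prefix that is already valid, such as hmeqClassTree, passes through unchanged in this sketch. The real function may apply different rules.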
- static upload_and_copy_score_resources(model: str | dict | RestObj, files: List[Any]) → RestObj
Upload score resources to SAS Model Manager and copy them to the Compute server.
- write_score_code(model_prefix: str, input_data: DataFrame | List[dict], predict_method: Callable[[...], List] | List[Any], target_variable: str | None = None, target_values: List | None = None, score_metrics: List[str] | None = None, predict_threshold: float | None = None, model: str | dict | RestObj | None = None, pickle_type: str = 'pickle', missing_values: bool | list | DataFrame = False, score_cas: bool | None = True, score_code_path: str | Path | None = None, target_index: int | None = None, preprocess_function: Callable[[DataFrame], DataFrame] | None = None, **kwargs) → dict | None
Generates Python score code based on training data used to create the model object.
If a score_code_path argument is included, then a Python file is written to disk and can be included in the zip archive that is imported or registered into the common model repository. If no path is provided, then a dictionary is returned with the relevant score code files as strings.
The following files are generated by this function if score_code_path is provided:
- ‘*_score.py’
The Python score code file for the model.
- ‘dmcas_epscorecode.sas’ (for SAS Viya 3.5 models)
Python score code wrapped in DS2 and prepared for CAS scoring or publishing.
- ‘dmcas_packagescorecode.sas’ (for SAS Viya 3.5 models)
Python score code wrapped in DS2 and prepared for SAS Micro Analytic Service scoring or publishing.
The function determines the type of model based on the following arguments: score_metrics, target_values, predict_threshold. As an example, consider the popular iris dataset, in which the input dataset contains a number of flower features and their numerical values.
For a binary classification model, where the model is determining if a flower is or is not the setosa species, the following can be passed:
score_metrics = ["Setosa"] or ["Setosa", "Setosa_Proba"],
target_values = ["1", "0"],
predict_threshold = 0.4
For a multi-classification model, where the model is determining if a flower is one of three species, the following can be passed:
score_metrics = ["Species"] or ["Species", "Setosa_Proba", "Versicolor_Proba", "Virginica_Proba"]
target_values = ["Setosa", "Versicolor", "Virginica"]
predict_threshold = None
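To make the difference between the two argument sets above concrete, here is a small sketch of how a threshold and a list of target values could map class probabilities to a label. It is an illustration only: label_from_probabilities is a hypothetical helper, and treating the first probability and target_values[0] as the event class is an assumption, not a statement about the score code that sasctl generates.

```python
from typing import List, Optional, Sequence


def label_from_probabilities(
    probabilities: Sequence[float],
    target_values: List[str],
    predict_threshold: Optional[float] = None,
) -> str:
    """Hypothetical helper: turn class probabilities into a single label."""
    if predict_threshold is not None and len(target_values) == 2:
        # Binary case: probabilities[0] is assumed to be the probability of
        # the event class, which is assumed to be target_values[0].
        event, non_event = target_values
        return event if probabilities[0] >= predict_threshold else non_event
    # Multi-class case (or no threshold): pick the most probable class.
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return target_values[best]


binary = label_from_probabilities([0.45, 0.55], ["1", "0"], predict_threshold=0.4)
multi = label_from_probabilities([0.2, 0.7, 0.1], ["Setosa", "Versicolor", "Virginica"])
print(binary, multi)
```

With a threshold of 0.4, an event probability of 0.45 is labeled "1" in this sketch even though the other class is more probable. Without a threshold, the most probable of the three species is returned.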
Disclaimer: The score code that is generated is designed to be a working template for any Python model, but is not guaranteed to work out of the box for scoring, publishing, or validating the model.
- Parameters:
model_prefix (str) – The variable for the model name that is used when naming model files. (For example: hmeqClassTree + [Score.py || .pickle]).
input_data (pandas.DataFrame or list of dict) – The pandas.DataFrame object contains the training data, and includes only the predictor columns. The write_score_code function currently supports int(64), float(64), and string data types for scoring. Providing a list of dict objects signals that the model files are being created from an MLflow model.
predict_method (Callable or list of Any) –
The Python function used for model predictions and the expected output types. The expected output types can be passed as example values or as the value types. For example, if the model is a Scikit-Learn DecisionTreeClassifier, then pass either of the following:
[sklearn.tree.DecisionTreeClassifier.predict, ["A"]]
[sklearn.tree.DecisionTreeClassifier.predict_proba, [0.4, float]]
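The two forms above can be sketched without Scikit-Learn installed. In the example below, StandInClassifier is a hypothetical stand-in for a model class and output_types is an illustrative helper, not a sasctl function. It only shows that an output descriptor may be either an example value or a type.

```python
from typing import Any, List


class StandInClassifier:
    """Hypothetical stand-in for a model class; not part of sasctl."""

    def predict(self, rows: List[List[float]]) -> List[str]:
        return ["A" for _ in rows]

    def predict_proba(self, rows: List[List[float]]) -> List[List[float]]:
        return [[0.4, 0.6] for _ in rows]


# Form 1: prediction function plus an example value for its single output.
label_method: List[Any] = [StandInClassifier.predict, ["A"]]

# Form 2: probability function; one output given as an example value,
# the other given directly as a type.
proba_method: List[Any] = [StandInClassifier.predict_proba, [0.4, float]]


def output_types(method: List[Any]) -> List[type]:
    """Resolve each output descriptor to a type, whether given as a value or a type."""
    return [d if isinstance(d, type) else type(d) for d in method[1]]


print(output_types(label_method), output_types(proba_method))
```

Both descriptor styles resolve to the same kind of information in this sketch, which is why mixing an example value (0.4) with a type (float) in one list is acceptable.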
target_variable (str, optional) – Target variable to be predicted by the model. The default value is None.
target_values (list of str, optional) – A list of target values for the target variable. The default value is None.
score_metrics (list of str, optional) – The scoring metrics for the model. For classification models, it is assumed that the first value in the list represents the classification output. This function supports single- and multi-classification models as well as prediction models. The default value is None.
predict_threshold (float, optional) – The prediction threshold for normalized probability score metrics. Values are expected to be between 0 and 1. The default value is None.
model (str, dict, or RestObj, optional) – The name or id of the model, or a dictionary representation of the model. The default value is None. This argument is only necessary for models that will be hosted on SAS Viya 3.5.
pickle_type (str, optional) – Indicator for the package used to serialize the model file to be uploaded to SAS Model Manager. The default value is pickle.
missing_values (bool, list, or dict, optional) – Sets whether data handled by the score code will impute for missing values. If set to True, then the function determines the imputed values based on the input_data argument. In order to set the imputation values, pass a dict with variable and value key-value pairs or a list of values in the same variable order as the input_data argument. The default value is False.
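The dict and list forms described for missing_values carry the same information, as the following sketch shows. The helper imputation_map and the column names are hypothetical and chosen for illustration. The example only demonstrates that a list is matched to the input columns by position.

```python
from typing import Any, Dict, List, Union


def imputation_map(
    columns: List[str],
    missing_values: Union[Dict[str, Any], List[Any]],
) -> Dict[str, Any]:
    """Hypothetical helper: normalize either form to a column -> value dict."""
    if isinstance(missing_values, dict):
        return dict(missing_values)
    # A list must supply exactly one value per column, in column order.
    if len(missing_values) != len(columns):
        raise ValueError("Expected one imputation value per input column.")
    return dict(zip(columns, missing_values))


columns = ["SepalLength", "SepalWidth", "PetalLength"]
as_list = imputation_map(columns, [5.8, 3.0, 3.8])
as_dict = imputation_map(
    columns, {"SepalLength": 5.8, "SepalWidth": 3.0, "PetalLength": 3.8}
)
print(as_list == as_dict)
```

The dict form is less error-prone when the column order of input_data might change, because the list form depends entirely on position.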
score_cas (bool, optional) – Sets whether models registered to SAS Viya 3.5 should be able to be scored and validated through both CAS and SAS Micro Analytic Service. If set to False, then the model will only be able to be scored and validated through SAS Micro Analytic Service. The default value is True.
score_code_path (str or pathlib.Path, optional) – Path for output score code file(s) to be generated. If no value is supplied, a dict is returned instead. The default value is None.
target_index (int, optional) – Sets the index of success for a binary model. If target_values are given, this index should match the index of the target outcome in target_values. If target_values are not given, this index should indicate whether the target probability variable is the first or second variable returned by the model. The default value is 1.
**kwargs –
Other keyword arguments are passed to one of the following functions:
sasctl.pzmm.ScoreCode._write_imports(pickle_type, mojo_model=None, binary_h2o_model=None, binary_string=None)
sasctl.pzmm.ScoreCode._viya35_model_load(model_id, pickle_type, model_file_name, mojo_model=None, binary_h2o_model=None)
sasctl.pzmm.ScoreCode._viya4_model_load(pickle_type, model_file_name, mojo_model=None, binary_h2o_model=None)
sasctl.pzmm.ScoreCode._predict_method(predict_method, input_var_list, dtype_list=None, statsmodels_model=None)
sasctl.pzmm.ScoreCode._predictions_to_metrics(output_variables, target_values=None, predict_threshold=None, h2o_model=None)