sasctl

Version 1.4.5

Introduction

sasctl enables easy integration with the SAS Viya platform.

It can be used directly as a Python module:

>>> sasctl.folders.list_folders()

Or as a command line utility:

$ sasctl folders list

Prerequisites

sasctl requires the following Python packages to be installed. If not already present, these packages are downloaded and installed automatically.

  • requests

  • six

The following additional packages are recommended for full functionality:

  • swat

  • kerberos

Installation

For basic functionality:

pip install sasctl

Functionality that depends on additional packages can be installed using the following:

pip install sasctl[swat]
pip install sasctl[kerberos]
pip install sasctl[all]

Quickstart

As a Module

Once the sasctl package has been installed and you have a SAS Viya server to connect to, the first step is to establish a session:

>>> from sasctl import Session

>>> s = Session(host, username, password)

Once a session has been created, all commands will target that environment by default. The easiest way to use sasctl is often to use a pre-defined task, which will handle all necessary communication with the SAS Viya server:

>>> from sasctl import Session, register_model
>>> from sklearn import linear_model as lm

>>> with Session('example.com', authinfo=<authinfo file>):
...    model = lm.LogisticRegression()
...    register_model(model, 'Sklearn Model', 'My Project')

A slightly more low-level way to interact with the environment is to use the service methods directly:

>>> from pprint import pprint
>>> from sasctl import Session
>>> from sasctl.services import folders

>>> with Session(host, username, password):
...    folder_list = folders.list_folders()
...    pprint(folder_list)

{'links': [{'href': '/folders/folders',
            'method': 'GET',
            'rel': 'folders',

...  # truncated for clarity

            'rel': 'createSubfolder',
            'type': 'application/vnd.sas.content.folder',
            'uri': '/folders/folders?parentFolderUri=/folders/folders/{parentId}'}],
 'version': 1}

The most basic way to interact with the server is to call REST functions directly, though in general this is not recommended:

>>> from pprint import pprint
>>> from sasctl import Session

>>> with Session(host, username, password) as s:
...    folders = s.get('/folders')
...    pprint(folders)

{'links': [{'href': '/folders/folders',
            'method': 'GET',
            'rel': 'folders',

...  # truncated for clarity

            'rel': 'createSubfolder',
            'type': 'application/vnd.sas.content.folder',
            'uri': '/folders/folders?parentFolderUri=/folders/folders/{parentId}'}],
 'version': 1}
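Responses like the one above carry a 'links' list in which each entry is identified by its 'rel' value. The following is a minimal sketch of looking up one entry, assuming the response behaves like a plain dictionary. find_link is a hypothetical helper, not part of sasctl, and the sample data is abbreviated from the output shown above.

```python
def find_link(response, rel):
    """Return the first link whose 'rel' matches, or None if there is none."""
    for link in response.get('links', []):
        if link.get('rel') == rel:
            return link
    return None


# Abbreviated sample shaped like the output above (not a full server response).
sample = {
    'links': [
        {'href': '/folders/folders', 'method': 'GET', 'rel': 'folders'},
        {'rel': 'createSubfolder',
         'type': 'application/vnd.sas.content.folder'},
    ],
    'version': 1,
}

link = find_link(sample, 'folders')
print(link['href'])
```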

As a Command Line Utility

When using sasctl as a command line utility, you must pass authentication credentials using environment variables. See the Authentication section for details.

Once these environment variables have been set you can simply run sasctl and ask for help:

$ sasctl -h

usage: sasctl [-h] [-k] [-v] [--version]
              {folders,performanceTasks,models,repositories,projects} ...

sasctl interacts with a SAS Viya environment.

optional arguments:
  -h, --help            show this help message and exit
  -k, --insecure        Skip SSL verification
  -v, --verbose
  --version             show program's version number and exit

service:
  {folders,performanceTasks,models,repositories,projects}

This also works on individual commands:

$ sasctl folders -h

usage: sasctl folders [-h] {create,get,list,update,delete} ...

optional arguments:
  -h, --help            show this help message and exit

command:
  {create,get,list,update,delete}
    create
    get                 Returns a folders instance.
    list                List all folders available in the environment.
    update              Updates a folders instance.
    delete              Deletes a folders instance.
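When the command line utility is driven from another Python program, the environment variables can be supplied for a single call instead of being exported globally. This is a rough sketch using only the standard library. cli_environment and list_folders_via_cli are hypothetical helper names, and the second function assumes the sasctl executable is on the PATH.

```python
import os
import subprocess


def cli_environment(server, username, password, base=None):
    """Return a copy of the environment with the sasctl variables set."""
    env = dict(os.environ if base is None else base)
    env['SASCTL_SERVER_NAME'] = server
    env['SASCTL_USER_NAME'] = username
    env['SASCTL_PASSWORD'] = password
    return env


def list_folders_via_cli(server, username, password):
    """Run `sasctl folders list` and return its standard output as text."""
    result = subprocess.run(['sasctl', 'folders', 'list'],
                            env=cli_environment(server, username, password),
                            stdout=subprocess.PIPE,
                            universal_newlines=True,
                            check=True)  # raise if the command fails
    return result.stdout
```

Copying the environment rather than modifying os.environ keeps the credentials out of the parent process and any other child processes.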

Common Uses

Registering a model built with SWAT.

examples/astore_model.py
import swat

from sasctl import Session
from sasctl.tasks import register_model, publish_model
from sasctl.services import microanalytic_score as mas

s = swat.CAS('hostname', 'username', 'password')
s.loadactionset('decisionTree')

tbl = s.CASTable('iris')
tbl.decisiontree.gbtreetrain(target='Species',
                             inputs=['SepalLength', 'SepalWidth',
                                     'PetalLength', 'PetalWidth'],
                             savestate='gradboost_astore')

astore = s.CASTable('gradboost_astore')

with Session(s):
    model = register_model(astore, 'Gradient Boosting', 'Iris')
    module = publish_model(model, 'maslocal')
    response = mas.execute_module_step(module, 'score',
                                       SepalLength=5.1,
                                       SepalWidth=3.5,
                                       PetalLength=1.4,
                                       PetalWidth=0.2)

Registering a model built with scikit-learn.

examples/sklearn_model.py
import pandas as pd
from sklearn import datasets
from sklearn.linear_model import LogisticRegression

from sasctl import Session, register_model, publish_model


# Load the Iris data set and convert into a Pandas data frame.
raw = datasets.load_iris()
X = pd.DataFrame(raw.data, columns=['SepalLength', 'SepalWidth',
                                    'PetalLength', 'PetalWidth'])
y = pd.DataFrame(raw.target, columns=['Species'], dtype='category')
y['Species'] = y['Species'].cat.rename_categories(list(raw.target_names))

# Fit a scikit-learn model
model = LogisticRegression()
model.fit(X, y)

# Establish a session with Viya
with Session('hostname', 'username', 'password'):
    model_name = 'Iris Regression'

    # Register the model in Model Manager
    register_model(model,
                   model_name,
                   input=X,         # Use X to determine model inputs
                   project='Iris',  # Register in "Iris" project
                   force=True)      # Create project if it doesn't exist

    # Publish the model to the real-time scoring engine
    module = publish_model(model_name, 'maslocal')

    # Select the first row of training data
    x = X.iloc[0, :]

    # Call the published module and score the record
    result = module.score(**x)
    print(result)
  • publish a model

  • score a model

See the examples/ directory in the repository for more complete examples.

User Guide

Authentication

There are a variety of ways to provide authentication credentials when creating a Session. They are presented here in the order of precedence in which they are recognized. The simplest method is to just provide the information directly:

>>> s = Session(hostname, username, password)

Although this is often the easiest method when getting started, it is not the most secure. If your program will be used interactively, consider using the built-in getpass module to avoid hard-coded user names and passwords.
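A minimal sketch of the getpass approach follows. prompt_credentials is a hypothetical helper, not part of sasctl. The prompt functions are parameters only so that the helper can be exercised without a terminal.

```python
import getpass


def prompt_credentials(ask=input, ask_secret=getpass.getpass):
    """Prompt for a user name (echoed) and a password (not echoed)."""
    username = ask('Username: ')
    password = ask_secret('Password: ')
    return username, password


# Interactive use, with hostname defined elsewhere:
#     username, password = prompt_credentials()
#     s = Session(hostname, username, password)
```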

Because sasctl augments analytic modeling tasks, it may frequently be used in conjunction with the swat module. When this is the case, another easy way to create a Session is to reuse the existing CAS connection from swat:

>>> cas_session = swat.CAS(hostname, username, password)
...
>>> s = Session(cas_session)

Note: this method will only work when the SWAT connection to CAS is using REST.

A related option is to piggy-back on the .authinfo file used by swat or a .netrc file:

>>> s = Session(hostname, authinfo=file_path)

Note: this method will not work with SAS-encoded passwords that may be contained in a .authinfo file.
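Before pointing a Session at a credentials file, it can help to confirm which login the file holds for a given host. The sketch below uses Python's standard netrc module for that check. How sasctl itself reads the file is not shown here, and read_credentials is a hypothetical helper.

```python
import netrc
import os
import tempfile


def read_credentials(path, host):
    """Return (login, password) for host from a .netrc-style file, or None."""
    entry = netrc.netrc(path).authenticators(host)
    if entry is None:
        return None
    login, _account, password = entry
    return login, password


# Demonstrate with a throwaway file; a real file would live in the home directory.
with tempfile.NamedTemporaryFile('w', suffix='.netrc', delete=False) as f:
    f.write('machine example.com login user password secret\n')
demo_path = f.name

found = read_credentials(demo_path, 'example.com')
os.remove(demo_path)
print(found)
```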

If the SAS Viya server is configured for Kerberos and a TGT is already present on the client, then a session can be instantiated using simply the hostname:

>>> s = Session(hostname)

The final method for supplying credentials is also simple and straightforward: environment variables.

sasctl recognizes the following authentication-related environment variables:

  • SASCTL_SERVER_NAME

  • SASCTL_USER_NAME

  • SASCTL_PASSWORD

SSL Certificates

By default, sasctl uses HTTPS connections when communicating with the server and validates the certificate presented by the server against the client’s trusted CA list.

While this behavior should be sufficient for most use cases, there may be times when it is necessary to trust a certificate that has not been signed by a trusted CA. The following environment variables can be used to accomplish this:

  • CAS_CLIENT_SSL_CA_LIST

  • SSLCALISTLOC

  • REQUESTS_CA_BUNDLE

In addition, it is possible to disable SSL certificate validation entirely, although this should be used with caution. When instantiating a Session instance you can set the verify_ssl parameter to False:

>>> s = Session(hostname, username, verify_ssl=False)

If you’re using sasctl from the command line, or want to disable SSL validation for all sessions, you can set the SSLREQCERT environment variable to no or false.
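The SSL-related variables can be summarized in a short sketch. It follows the precedence given in the Environment Variables reference (CAS_CLIENT_SSL_CA_LIST, then SSLCALISTLOC, then REQUESTS_CA_BUNDLE). Both functions are hypothetical helpers for inspecting an environment, not sasctl's own implementation. Treating empty values as unset and matching SSLREQCERT case-insensitively are assumptions.

```python
import os

# Highest precedence first.
CA_VARIABLES = ('CAS_CLIENT_SSL_CA_LIST', 'SSLCALISTLOC', 'REQUESTS_CA_BUNDLE')


def resolve_ca_bundle(environ=None):
    """Return (variable, path) for the first CA variable that is set, else None."""
    environ = os.environ if environ is None else environ
    for name in CA_VARIABLES:
        path = environ.get(name)
        if path:  # assumption: an empty value counts as unset
            return name, path
    return None


def verification_disabled(environ=None):
    """Return True when SSLREQCERT is set to 'no' or 'false'."""
    environ = os.environ if environ is None else environ
    # Assumption: surrounding whitespace and letter case are ignored.
    return environ.get('SSLREQCERT', '').strip().lower() in ('no', 'false')
```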

Logging

All logging is handled through the built-in logging module with standard module-level loggers. The one exception to this is Session request/response logging. Sessions contain a message_log which is exclusively used to record requests and responses made through the session. Message recording can be configured on a per-session basis by updating this logger, or the sasctl.core.session logger can be configured to control all message recording by all sessions.
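As a rough illustration of the second option, the sasctl.core.session logger can be configured with the standard logging module. enable_message_logging is a hypothetical helper. The level at which sasctl emits request and response records is an assumption, so DEBUG is used here to capture everything.

```python
import io
import logging


def enable_message_logging(stream=None, level=logging.DEBUG):
    """Send records from the sasctl.core.session logger to a stream."""
    logger = logging.getLogger('sasctl.core.session')
    handler = logging.StreamHandler(stream)  # None means sys.stderr
    handler.setFormatter(
        logging.Formatter('%(name)s %(levelname)s %(message)s'))
    logger.addHandler(handler)
    logger.setLevel(level)
    return logger


# Capture records in memory; pass no stream to write to the console instead.
buffer = io.StringIO()
logger = enable_message_logging(buffer)
logger.debug('example record')  # stands in for a record emitted by a Session
```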

HATEOAS

Coming soon.

API Reference

Environment Variables

CAS_CLIENT_SSL_CA_LIST

Client-side path to a certificate file containing CA certificates to be trusted. Used by the swat module. This will take precedence over SSLCALISTLOC and REQUESTS_CA_BUNDLE.

SSLCALISTLOC

Client-side path to a certificate file containing CA certificates to be trusted. Used by the swat module. This will take precedence over REQUESTS_CA_BUNDLE.

REQUESTS_CA_BUNDLE

Client-side path to a certificate file containing CA certificates to be trusted. Used by the requests module.

SSLREQCERT

Disables validation of SSL certificates when set to no or false.

SASCTL_SERVER_NAME

Hostname of the SAS Viya server to connect to. Required for CLI usage.

SASCTL_USER_NAME

The name of the user that will be used when creating the Session instance.

SASCTL_PASSWORD

Password for authentication to the SAS Viya server.
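Before invoking the command line utility, a script can check that the authentication variables above are present. This is a sketch only. missing_auth_variables is a hypothetical helper, and requiring all three variables assumes user name and password authentication; only SASCTL_SERVER_NAME is documented as required for CLI usage.

```python
import os

AUTH_VARIABLES = ('SASCTL_SERVER_NAME', 'SASCTL_USER_NAME', 'SASCTL_PASSWORD')


def missing_auth_variables(environ=None):
    """Return the names of authentication variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in AUTH_VARIABLES if not environ.get(name)]
```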

Contributor Guide

Ways to contribute

Improving the documentation

Accurate, clear, and concise documentation that is also easy to read is critical for the success of any project. This makes improving or expanding the sasctl documentation one of the most valuable ways to contribute. In addition, simple code examples that demonstrate how to use various features are always welcome. These may be added into the examples/ directory or placed inline in the appropriate documentation.

All documentation is contained in the doc/ directory of the source repository and is written in reStructuredText. The .rst files are then processed by Sphinx to produce the final documentation. See the Useful Tox Commands section for details on how to build the final documentation.

Contributing new code

All code contributions are managed through the standard GitHub pull request process. Consult GitHub Help for more information on using pull requests. In addition, all pull requests must include appropriate test cases to verify the changes. See the Testing section for more information on how test cases are configured.

Contributions to this project must also be accompanied by a signed Contributor Agreement. You (or your employer) retain the copyright to your contribution; the agreement simply gives us permission to use and redistribute your contributions as part of the project.

  1. Fork the repository

  2. Run all unit and integration tests and ensure they pass. This can be easily accomplished by running the following:

tox

See Useful Tox Commands for more information on using Tox.

  3. If any tests fail, you should investigate and correct the failure before making any changes.

  4. Make your code changes

  5. Include new tests that validate your changes

  6. Rerun all unit and integration tests and ensure they pass.

  7. Submit a GitHub pull request

All code submissions must meet the following requirements before the pull request will be accepted:
  • Contain appropriate unit/integration tests to validate the changes

  • Document all public members with numpydoc-style docstrings

  • Adhere to the PEP 8 style guide
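For reference, a numpydoc-style docstring looks roughly like the following. folder_path is a hypothetical function used only to show the section layout and is not part of sasctl.

```python
def folder_path(parent, name):
    """Join a parent folder path and a child folder name.

    Parameters
    ----------
    parent : str
        Path of the parent folder, for example ``'/Public'``.
    name : str
        Name of the child folder.

    Returns
    -------
    str
        The combined path with exactly one ``/`` between the two parts.

    Examples
    --------
    >>> folder_path('/Public/', 'models')
    '/Public/models'

    """
    return parent.rstrip('/') + '/' + name.lstrip('/')
```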

Testing

Automated testing is used to ensure that existing functionality and new features continue to work on all supported platforms. To accomplish this, the tests/ folder contains a collection of unit tests and integration tests, where “unit” tests generally target a single function and “integration” tests target a series of functions or classes working together.

Test execution is handled by the pytest module, which supports tests written using the built-in unittest framework but also adds some powerful features like test fixtures. It is recommended that you review tests/conftest.py and the existing test cases to understand what features are currently available for testing.

To isolate individual methods for testing, the unit test cases make extensive use of mocking via the built-in unittest.mock module.
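A unit test written in this style might look like the sketch below. folder_names is a hypothetical function under test, and the 'items' response shape is an assumption made for illustration; neither is taken from the sasctl code base.

```python
from unittest import mock


def folder_names(session):
    """Hypothetical function under test: folder names from GET /folders."""
    response = session.get('/folders')
    return [item['name'] for item in response['items']]


def test_folder_names():
    # Replace the session with a mock so that no server is contacted.
    session = mock.Mock()
    session.get.return_value = {'items': [{'name': 'Public'},
                                          {'name': 'Users'}]}

    assert folder_names(session) == ['Public', 'Users']
    # Verify that the function made exactly the expected request.
    session.get.assert_called_once_with('/folders')
```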

Most of the integration tests execute end-to-end functionality that would normally depend on a running SAS Viya environment. However, it can be difficult to reliably test against dynamic environments, and not all developers may have access to an environment that is suitable for development. Therefore, most of the integration tests rely on the betamax module to record and replay HTTP requests and responses. These recordings are scrubbed of any sensitive information and stored in tests/cassettes/. This allows a test to be rerun repeatedly once it has been recorded.

And finally, the tox module is used to ensure that sasctl will install and work correctly on all supported Python versions.

Useful Tox Commands

tox is used to automate common development tasks such as testing, linting, and building documentation. Running tox from the project root directory will automatically build virtual environments for all Python interpreters found on the system and then install the packages required to perform a given task. The simplest way to run Tox is:

$ tox

This will run the flake8 linter followed by pytest to test the code against all Python runtimes found on the machine. One of the great features of Tox is the ability to run specific tasks by specifying the environment to run. A few useful environments are listed below, where XX indicates a Python version present in your development environment, such as ‘27’ or ‘36’.

  1. $ tox -e pyXX-flake8

    Runs the flake8 linter against all sasctl source code.

  2. $ tox -e pyXX-flake8 src/sasctl/tasks.py

    Runs the flake8 linter against a specific file.

  3. $ tox -e pyXX-tests

    Runs all tests using the specified Python interpreter.

  4. $ tox -e pyXX-doc

    Builds the documentation.

  5. $ tox -e pyXX-tests -- python

    Starts a Python REPL in an environment with sasctl already installed.

For additional information on configuring and using Tox, see the official documentation or Sean Hammond’s excellent tutorial.