API Reference¶
Utility Functions¶
concat(objs, \*\*kwargs) |
Concatenate data in given objects |
merge(left, right, \*\*kwargs) |
Merge data in given objects |
CAS¶
The CAS object is the connection to the CAS server. CAS actions can be called on this object. It also incorporates many of the data reader functions of the Pandas package.
Session Management¶
CAS.close(self[, close_session]) |
Close the CAS connection |
CAS.terminate(self) |
End the session and close the CAS connection |
CAS.copy(self) |
Create a copy of the connection |
CAS.fork(self[, num]) |
Create multiple copies of a connection |
CAS.session_context(self, \*args, \*\*kwargs) |
Create a context of session options |
Reading Data¶
There are various ways of loading data into CAS: server-side parsed and loaded, client-side parsed, and client-side files uploaded and parsed on the server. They follow a naming convention to prevent confusion.
load_* : Loads server-side paths
read_* : Uses client-side parsers, then uploads the result
upload_* : Uploads client-side files as-is; the files are then parsed on the server
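The convention can be sketched as follows. This is a minimal illustration rather than code from the SWAT package: `RecordingConnection` is a hypothetical stand-in that only records which method was called, so the snippet runs without a CAS server, and both file paths are placeholders. With a real server you would pass a CAS connection object instead of the stand-in.

```python
class RecordingConnection:
    """Hypothetical stand-in for a CAS connection; it only records calls."""

    def __init__(self):
        self.calls = []

    def load_path(self, path=None, **kwargs):
        self.calls.append(("load_path", path))

    def read_csv(self, filepath_or_buffer, **kwargs):
        self.calls.append(("read_csv", filepath_or_buffer))

    def upload_file(self, data, **kwargs):
        self.calls.append(("upload_file", data))


def load_three_ways(conn, server_path, client_csv):
    # load_*: the path is resolved on the server, inside a CASLib.
    conn.load_path(path=server_path)
    # read_*: the file is parsed on the client, then the result is uploaded.
    conn.read_csv(client_csv)
    # upload_*: the raw file is sent as-is and parsed on the server.
    conn.upload_file(client_csv)


conn = RecordingConnection()
load_three_ways(conn, "data/cars.sashdat", "local/cars.csv")
for method, target in conn.calls:
    print(method, target)
```

The prefix of the method name tells you where the file lives and where it is parsed, which is the point of the convention.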
Server-Side Files¶
CAS.load_path(self[, path, readahead, …]) |
Load a path from a CASLib |
Client-Side Files¶
CAS.read_pickle(self, path[, casout]) |
Load pickled pandas object from the specified path |
CAS.read_table(self, filepath_or_buffer[, …]) |
Read general delimited file into a CAS table |
CAS.read_csv(self, filepath_or_buffer[, casout]) |
Read CSV file into a CAS table |
CAS.read_fwf(self, filepath_or_buffer[, casout]) |
Read a table of fixed-width formatted lines into a CAS table |
CAS.read_clipboard(self[, casout]) |
Read text from clipboard and pass to read_table() |
CAS.read_excel(self, io[, casout]) |
Read an Excel table into a CAS table |
CAS.read_html(self, io[, casout]) |
Read HTML tables into a list of CASTable objects |
CAS.read_hdf(self, path_or_buf[, casout]) |
Read from the HDF store and create a CAS table |
CAS.read_sas(self, filepath_or_buffer[, casout]) |
Read SAS files stored as XPORT or SAS7BDAT into a CAS table |
CAS.read_sql_table(self, table_name, con[, …]) |
Read SQL database table into a CAS table |
CAS.read_sql_query(self, sql, con[, casout]) |
Read SQL query results into a CAS table |
CAS.read_sql(self, sql, con[, casout]) |
Read SQL query or database table into a CAS table |
CAS.read_gbq(self, query[, casout]) |
Load data from Google BigQuery into a CAS table |
CAS.read_stata(self, filepath_or_buffer[, …]) |
Read Stata file into a CAS table |
CAS.upload_file(self, data[, importoptions, …]) |
Upload a client-side data file to CAS and parse it into a CAS table |
Client-Side DataFrames¶
CAS.upload_frame(self, data[, …]) |
Upload a client-side data file to CAS and parse it into a CAS table |
Running Actions¶
CAS.retrieve(self, _name_, \*\*kwargs) |
Call the action and aggregate the results |
CAS.invoke(self, _name_, \*\*kwargs) |
Call an action on the server |
CAS.__iter__(self) |
Iterate over responses from CAS |
getone(connection[, datamsghandler]) |
Get a single response from a connection |
getnext(\*objs, \*\*kwargs) |
Return responses as they appear from multiple connections |
CASResults¶
The CASResults object is a subclass of Python’s ordered dictionary. CAS actions can return any number of result objects which are accessible by the dictionary keys. This class also defines several methods for handling tables in By groups.
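Because the results behave like an ordered dictionary, ordinary dictionary access applies. The sketch below uses a plain `collections.OrderedDict` as a stand-in so that it runs without a server. The key names (`ByGroup1.Summary` and so on), the string values, and the helper `tables_ending_with` are assumptions made for illustration; the helper only mimics the idea of selecting By group tables by name and is not the library's implementation.

```python
from collections import OrderedDict

# Stand-in for a CASResults object; key names and values are made up.
results = OrderedDict()
results["ByGroup1.Summary"] = "summary table for group 1"
results["ByGroup1.Correlation"] = "correlation table for group 1"
results["ByGroup2.Summary"] = "summary table for group 2"


def tables_ending_with(res, name):
    """Collect values whose key is `name` or ends with `.name`, in order."""
    return [value for key, value in res.items()
            if key == name or key.endswith("." + name)]


summaries = tables_ending_with(results, "Summary")

print(list(results.keys()))         # keys keep their insertion order
print(results["ByGroup1.Summary"])  # access a single result by key
print(summaries)
```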
Constructor¶
CASResults(*args, **kwargs) |
Ordered collection of results from a CAS action |
By Group Processing¶
CASResults.get_tables(self, name[, set, concat]) |
Return all tables ending with name in all By groups |
CASResults.get_group(_self_, \*name, \*\*kwargs) |
Return a CASResults object of the specified By group tables |
CASResults.get_set(self, num) |
Return a CASResults object of the By group set |
CASResults.concat_bygroups(self[, inplace]) |
Concatenate all tables within a By group into a single table |
SASDataFrame¶
The SASDataFrame object is a simple subclass of pandas.DataFrame. It merely adds attributes to hold SAS metadata such as titles, labels, column metadata, etc. It also adds a few utility methods for handling By group representations.
Constructor¶
SASDataFrame([data, index, columns, dtype, …]) |
Two-dimensional tabular data structure with SAS metadata added |
Column Metadata¶
SASColumnSpec(name[, label, dtype, width, …]) |
Create a SASDataFrame column information object |
Utilities¶
reshape_bygroups(items[, bygroup_columns, …]) |
Convert current By group representation to the specified representation |
SASFormatter¶
The SASFormatter object can be used to apply SAS data formats to Python values. It works only with built-in SAS data formats, not user-defined formats. If you need user-defined formats, the fetch action can be configured to bring back formatted values rather than raw values.
Constructor¶
SASFormatter([locale, soptions]) |
Create a locale-aware SAS value formatter |
Formatting Data¶
SASFormatter.format(self, value[, sasfmt, width]) |
Format the given value |
CASTable¶
The CASTable is essentially a client-side view of a table in the CAS server. CAS actions can be called on it directly just like a CAS connection object, and it also supports much of the pandas.DataFrame API.
CAS Connections¶
CASTable.get_connection(self) |
Get the registered connection object |
CASTable.set_connection(self, connection) |
Set the connection to use for action calls |
Setters and Getters¶
CASTable.__setattr__(self, name, value) |
Set attribute or parameter value |
CASTable.__getattr__(self, name) |
Get named parameter, CAS action, or table column |
CASTable.__delattr__(self, name) |
Delete an attribute |
Attributes and Underlying Data¶
columns: column labels
CASTable.as_matrix(self[, columns, n]) |
Represent CASTable as a Numpy array |
Series of the data types in the table |
|
Series of the ftypes (indication of sparse/dense and dtype) in the table |
|
Retrieve the frequency of CAS table column data types |
|
Retrieve the frequency of CAS table column data types |
|
CASTable.select_dtypes(self[, include, …]) |
Return a subset CASTable including/excluding columns based on data type |
Numpy representation of the table |
|
List of the row axis labels and column axis labels |
|
Number of axes dimensions |
|
Number of elements in the table |
|
Return a tuple representing the dimensionality of the table |
Indexing, Iteration¶
CASTable.drop(self, labels[, axis, level, …]) |
Return a new CASTable object with the specified columns removed |
CASTable.head(self[, n, columns, …]) |
Retrieve first n rows |
Label-based indexer with integer position fallback |
|
Label-based indexer |
|
Integer-based indexer for selecting by position |
|
CASTable.__iter__(self) |
Iterate through all visible column names in self |
CASTable.iteritems(self) |
Iterate over column names and CASColumn objects |
CASTable.iterrows(self[, chunksize]) |
Iterate over the rows of a CAS table as (index, pandas.Series) pairs |
CASTable.itertuples(self[, index, chunksize]) |
Iterate over rows as tuples |
CASTable.lookup(self, row_labels, col_labels) |
Retrieve values indicated by row_labels, col_labels positions |
CASTable.tail(self[, n, columns, …]) |
Retrieve last n rows |
CASTable.query(self, expr[, inplace, engine]) |
Query the table with a boolean expression |
For more information on .ix, .loc, and .iloc, see the indexing documentation.
GroupBy¶
CASTable.groupby(self, by[, axis, level, …]) |
Specify grouping variables for the table |
Computations / Descriptive Stats¶
CASTable.abs(self) |
Return a new CASTable with absolute values of numerics |
CASTable.all(self[, axis, bool_only, …]) |
Return True for each column with only elements that evaluate to true |
CASTable.any(self[, axis, bool_only, …]) |
Return True for each column with at least one true element |
CASTable.clip(self[, lower, upper, axis]) |
Clip values at thresholds |
CASTable.clip_lower(self, threshold[, axis]) |
Clip values at lower threshold |
CASTable.clip_upper(self, threshold[, axis]) |
Clip values at upper threshold |
CASTable.corr(self[, method, min_periods]) |
Compute pairwise correlation of columns |
CASTable.count(self[, axis, level, numeric_only]) |
Return total number of non-missing values in each column |
CASTable.css(self[, casout]) |
Return the corrected sum of squares of the values of each column |
CASTable.cv(self[, casout]) |
Return the coefficient of variation of the values of each column |
CASTable.describe(self[, percentiles, …]) |
Get descriptive statistics |
CASTable.eval(self, expr[, inplace, kwargs]) |
Evaluate a CAS table expression |
CASTable.kurt(self[, axis, skipna, level, …]) |
Return the kurtosis of the values of each column |
CASTable.max(self[, axis, skipna, level, …]) |
Return the maximum value of each column |
CASTable.mean(self[, axis, skipna, level, …]) |
Return the mean value of each column |
CASTable.median(self[, axis, skipna, level, …]) |
Return the median value of each numeric column |
CASTable.min(self[, axis, skipna, level, …]) |
Return the minimum value of each column |
CASTable.mode(self[, axis, numeric_only, …]) |
Return the mode of each column |
CASTable.nmiss(self[, axis, level, …]) |
Return total number of missing values in each column |
CASTable.probt(self[, casout]) |
Return the p-value of the T-statistics of the values of each column |
CASTable.quantile(self[, q, axis, …]) |
Return values at the given quantile |
CASTable.skew(self[, axis, skipna, level, …]) |
Return the skewness of the values of each column |
CASTable.stderr(self[, casout]) |
Return the standard error of the values of each column |
CASTable.sum(self[, axis, skipna, level, …]) |
Return the sum of the values of each column |
CASTable.std(self[, axis, skipna, level, …]) |
Return the standard deviation of the values of each column |
CASTable.tvalue(self[, casout]) |
Return the T-statistics for hypothesis testing of the values of each column |
CASTable.uss(self[, casout]) |
Return the uncorrected sum of squares of the values of each column |
CASTable.var(self[, axis, skipna, level, …]) |
Return the variance of the values of each column |
Reindexing / Selection / Label manipulation¶
CASTable.head(self[, n, columns, …]) |
Retrieve first n rows |
CASTable.sample(self[, n, frac, replace, …]) |
Returns a random sample of the table rows |
CASTable.tail(self[, n, columns, …]) |
Retrieve last n rows |
Sorting¶
Note
There is no concept of a sorted table in the server. The sort_values method merely stores sorting information that is applied when fetching data.
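The deferred behavior described in the note can be pictured with a toy class. This is a conceptual sketch only: `DeferredSortTable` and its `fetch` method are hypothetical names, the rows are made-up data, and none of it reflects how CASTable is implemented. It shows the sort parameters being stored on the client-side view and applied only when rows are retrieved.

```python
class DeferredSortTable:
    """Toy client-side view: sorting is recorded, not applied to the data."""

    def __init__(self, rows, sort_key=None, ascending=True):
        self._rows = rows            # stands in for the unsorted server table
        self._sort_key = sort_key
        self._ascending = ascending

    def sort_values(self, by, ascending=True):
        # Return a new view that remembers the sort parameters.
        return DeferredSortTable(self._rows, sort_key=by, ascending=ascending)

    def fetch(self, n=None):
        # The stored sort is applied only when rows are brought back.
        rows = list(self._rows)
        if self._sort_key is not None:
            rows.sort(key=lambda r: r[self._sort_key],
                      reverse=not self._ascending)
        return rows if n is None else rows[:n]


rows = [{"make": "B", "price": 30}, {"make": "A", "price": 10},
        {"make": "C", "price": 20}]
tbl = DeferredSortTable(rows)
by_price = tbl.sort_values("price")

print([r["price"] for r in by_price.fetch()])  # sorted when fetched
print([r["price"] for r in rows])              # underlying data unchanged
```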
CASTable.sort_values(self, by[, axis, …]) |
Specify sort parameters for data in a CASTable |
CASTable.nlargest(self, n, columns[, keep, …]) |
Return the n largest values ordered by columns |
CASTable.nsmallest(self, n, columns[, keep, …]) |
Return the n smallest values ordered by columns |
CASTable.to_xarray(self, \*args, \*\*kwargs) |
Represent table data as an xarray object |
Combining / Merging¶
CASTable.append(self, other[, ignore_index, …]) |
Append rows of other to self |
CASTable.merge(self, right[, how, on, …]) |
Merge CASTable objects using a database-style join on a column |
Plotting¶
CASTable.plot() is both a callable method and a namespace attribute for specific plotting methods of the form CASTable.plot.<kind>.
Note
In all of the plotting methods, the rendering is done completely on the client side. This means that all of the data is fetched in the background prior to doing the plotting.
Because plotting is done on the client side, data must be downloaded to create the graphs. By default, the amount of data pulled down is limited by the cas.dataset.max_rows_fetched option. Sampling is used to randomize the data that is plotted. You can control the sampling with the following options:
- sample_pct=float
The percentage of the rows of the original table to return, given as a float value between 0 and 1. Using this option disables the row limit set by the cas.dataset.max_rows_fetched option.
- sample_seed=int
The seed for the random number generator, given as an integer. This can be set to create deterministic sampling.
- stratify_by='var-name'
Specifies the variable to do stratified sampling by.
- sample=bool
A boolean used to indicate that the values fetched should be sampled. This is used in conjunction with the cas.dataset.max_rows_fetched option to return random samples up to that limit. It is assumed to be true when sample_pct= is specified.
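The options above can be collected into a dictionary of keyword arguments. The helper below is a hypothetical convenience, not part of the package: it assumes, as the option syntax suggests, that the options are passed as keyword arguments to a plotting call, and the exact bounds check on `sample_pct` is also an assumption. The column names in the closing comment are placeholders.

```python
def sampling_options(sample_pct=None, sample_seed=None,
                     stratify_by=None, sample=None):
    """Build a dict of the sampling options described above."""
    opts = {}
    if sample_pct is not None:
        if not 0 < sample_pct <= 1:
            raise ValueError("sample_pct must be a float between 0 and 1")
        # sample_pct already implies sampling, so `sample` is not added.
        opts["sample_pct"] = float(sample_pct)
    elif sample is not None:
        # On its own, `sample` works together with the fetch row limit.
        opts["sample"] = bool(sample)
    if sample_seed is not None:
        opts["sample_seed"] = int(sample_seed)
    if stratify_by is not None:
        opts["stratify_by"] = stratify_by
    return opts


opts = sampling_options(sample_pct=0.1, sample_seed=42)
print(opts)

# With a CASTable `tbl` (not created here), the options would be passed as
# keyword arguments, for example: tbl.plot.scatter(x="a", y="b", **opts)
```

Fixing `sample_seed` makes repeated plots draw the same sample, which helps when comparing charts.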
Plot the data in the table |
|
CASTable.plot.area(self[, x, y]) |
Area plot |
CASTable.plot.bar(self[, x, y]) |
Bar plot |
CASTable.plot.barh(self[, x, y]) |
Horizontal bar plot |
CASTable.plot.box(self[, by]) |
Boxplot |
CASTable.plot.density(self, \*\*kwargs) |
Kernel density estimate plot |
CASTable.plot.hexbin(self[, x, y, C, …]) |
Hexbin plot |
CASTable.plot.hist(self[, by, bins]) |
Histogram |
CASTable.plot.kde(self, \*\*kwargs) |
Kernel density estimate plot |
CASTable.plot.line(self[, x, y]) |
Line plot |
CASTable.plot.pie(self[, y]) |
Pie chart |
CASTable.plot.scatter(self, x, y[, s, c]) |
Scatter plot |
CASTable.boxplot(self[, column, by]) |
Make a boxplot from the table data |
CASTable.hist(self[, column, by]) |
Make a histogram from the table data |
Serialization / IO / Conversion¶
CASTable.from_csv(connection, path[, casout]) |
Create a CASTable from a CSV file |
CASTable.from_dict(connection, data[, casout]) |
Create a CASTable from a dictionary |
CASTable.from_items(connection, items[, casout]) |
Create a CASTable from (key, value) pairs |
CASTable.from_records(connection, data[, casout]) |
Create a CASTable from records |
CASTable.info(self[, verbose, buf, …]) |
Print summary of CASTable information |
CASTable.to_pickle(self, \*args, \*\*kwargs) |
Pickle (serialize) the table data |
CASTable.to_csv(self, \*args, \*\*kwargs) |
Write table data to comma-separated values (CSV) |
CASTable.to_hdf(self, \*args, \*\*kwargs) |
Write table data to HDF |
CASTable.to_sql(self, \*args, \*\*kwargs) |
Write table records to SQL database |
CASTable.to_dict(self, \*args, \*\*kwargs) |
Convert table data to a Python dictionary |
CASTable.to_excel(self, \*args, \*\*kwargs) |
Write table data to an Excel spreadsheet |
CASTable.to_json(self, \*args, \*\*kwargs) |
Convert the table data to a JSON string |
CASTable.to_html(self, \*args, \*\*kwargs) |
Render the table data to an HTML table |
CASTable.to_latex(self, \*args, \*\*kwargs) |
Render the table data to a LaTeX tabular environment |
CASTable.to_stata(self, \*args, \*\*kwargs) |
Write table data to Stata file |
CASTable.to_msgpack(self, \*args, \*\*kwargs) |
Write table data to msgpack object |
CASTable.to_gbq(self, \*args, \*\*kwargs) |
Write table data to a Google BigQuery table |
CASTable.to_records(self, \*args, \*\*kwargs) |
Convert table data to record array |
CASTable.to_sparse(self, \*args, \*\*kwargs) |
Convert table data to SparseDataFrame |
CASTable.to_dense(self, \*args, \*\*kwargs) |
Return dense representation of table data |
CASTable.to_string(self, \*args, \*\*kwargs) |
Render the table to a console-friendly tabular output |
CASTable.to_clipboard(self, \*args, \*\*kwargs) |
Write the table data to the clipboard |
Utilities¶
CASTable.copy(self[, deep, exclude]) |
Make a copy of the CASTable object |
CASTable.with_params(self, \*\*kwargs) |
Create copy of table with kwargs inserted as parameters |
CASColumn¶
While CAS does not have a true concept of a standalone column, the CASColumn object emulates one by creating a client-side view of the CAS table using just a single column. CASColumn objects are used in much the same way as pandas.Series objects. They support many of the pandas.Series methods, and can also be used in indexing operations to filter data in a CAS table.
Constructor¶
CASColumn(name, **table_params) |
Special subclass of CASTable for holding single columns |
Attributes¶
Return column data as numpy.ndarray() |
|
The data type of the underlying data |
|
The data type and whether it is sparse or dense |
|
Return a tuple of the shape of the underlying data |
|
Return the number of dimensions of the underlying data |
|
Return the number of elements in the underlying data |
Indexing, Iteration¶
Label-based indexer with integer position fallback |
|
Label-based indexer |
|
Integer-based indexer for selecting by position |
|
CASColumn.__iter__(self) |
Iterate through all visible column names in self |
CASColumn.iteritems(self[, chunksize]) |
Lazily iterate over (index, value) tuples |
For more information on .ix, .loc, and .iloc, see the indexing documentation.
Binary Operator Functions¶
CASColumn.add(self, other[, level, …]) |
Addition of CASColumn with other, element-wise |
CASColumn.sub(self, other[, level, …]) |
Subtraction of CASColumn with other, element-wise |
CASColumn.mul(self, other[, level, …]) |
Multiplication of CASColumn with other, element-wise |
CASColumn.div(self, other[, level, …]) |
Floating division of CASColumn and other, element-wise |
CASColumn.truediv(self, other[, level, …]) |
Floating division of CASColumn and other, element-wise |
CASColumn.floordiv(self, other[, level, …]) |
Integer division of CASColumn and other, element-wise |
CASColumn.mod(self, other[, level, …]) |
Modulo of CASColumn and other, element-wise |
CASColumn.pow(self, other[, level, …]) |
Exponential power of CASColumn and other, element-wise |
CASColumn.radd(self, other[, level, …]) |
Addition of CASColumn and other, element-wise |
CASColumn.rsub(self, other[, level, …]) |
Subtraction of CASColumn and other, element-wise |
CASColumn.rmul(self, other[, level, …]) |
Multiplication of CASColumn and other, element-wise |
CASColumn.rdiv(self, other[, level, …]) |
Floating division of CASColumn and other, element-wise |
CASColumn.rtruediv(self, other[, level, …]) |
Floating division of CASColumn and other, element-wise |
CASColumn.rfloordiv(self, other[, level, …]) |
Integer division of CASColumn and other, element-wise |
CASColumn.rmod(self, other[, level, …]) |
Modulo of CASColumn and other, element-wise |
CASColumn.rpow(self, other[, level, …]) |
Exponential power of CASColumn and other, element-wise |
CASColumn.round(self[, decimals, out]) |
Round each value of the CASColumn to the given number of decimals |
CASColumn.lt(self, other[, axis]) |
Less-than comparison of CASColumn and other, element-wise |
CASColumn.gt(self, other[, axis]) |
Greater-than comparison of CASColumn and other, element-wise |
CASColumn.le(self, other[, axis]) |
Less-than-or-equal-to comparison of CASColumn and other, element-wise |
CASColumn.ge(self, other[, axis]) |
Greater-than-or-equal-to comparison of CASColumn and other, element-wise |
CASColumn.ne(self, other[, axis]) |
Not-equal-to comparison of CASColumn and other, element-wise |
CASColumn.eq(self, other[, axis]) |
Equal-to comparison of CASColumn and other, element-wise |
GroupBy¶
CASColumn.groupby(self, by[, axis, level, …]) |
Specify grouping variables for the table |
Computations / Descriptive Stats¶
CASColumn.abs(self) |
Return absolute values element-wise |
CASColumn.all(self[, axis, bool_only, …]) |
Return whether all elements are True |
CASColumn.any(self[, axis, bool_only, …]) |
Return True for each column with one or more elements treated as true |
CASColumn.between(self, left, right[, inclusive]) |
Return boolean CASColumn equivalent to left <= value <= right |
CASColumn.clip(self[, lower, upper, out, axis]) |
Trim values at input threshold(s) |
CASColumn.clip_lower(self, threshold[, axis]) |
Trim values below given threshold |
CASColumn.clip_upper(self, threshold[, axis]) |
Trim values above given threshold |
CASColumn.count(self[, level]) |
Return the number of non-NA/null observations in the CASColumn |
CASColumn.describe(self[, percentiles, …]) |
Generate various summary statistics |
CASColumn.max(self[, axis, skipna, level, …]) |
Return the maximum value |
CASColumn.mean(self[, axis, skipna, level, …]) |
Return the mean value |
CASColumn.median(self[, q, axis, …]) |
Return the median value |
CASColumn.min(self[, axis, skipna, level, …]) |
Return the minimum value |
CASColumn.mode(self[, axis, max_tie]) |
Return the mode values |
CASColumn.nlargest(self[, n, keep, casout]) |
Return the n largest values |
CASColumn.nsmallest(self[, n, keep, casout]) |
Return the n smallest values |
CASColumn.quantile(self[, q, axis, …]) |
Return the value at the given quantile |
CASColumn.std(self[, axis, skipna, level, …]) |
Return the standard deviation of the values |
CASColumn.sum(self[, axis, skipna, level, …]) |
Return the sum of the values |
CASColumn.var(self[, axis, skipna, level, …]) |
Return the unbiased variance of the values |
CASColumn.nmiss(self[, casout]) |
Return number of missing values |
CASColumn.stderr(self[, casout]) |
Return standard error of the values |
CASColumn.uss(self[, casout]) |
Return uncorrected sum of squares of the values |
CASColumn.css(self[, casout]) |
Return corrected sum of squares of the values |
CASColumn.cv(self[, casout]) |
Return coefficient of variation of the values |
CASColumn.tvalue(self[, casout]) |
Return value of T-statistic for hypothesis testing |
CASColumn.probt(self[, casout]) |
Return p-value of the T-statistic |
CASColumn.skew(self[, casout]) |
Return skewness |
CASColumn.kurt(self[, casout]) |
Return kurtosis |
CASColumn.unique(self[, casout]) |
Return array of unique values in the CASColumn |
CASColumn.nunique(self[, dropna, casout]) |
Return number of unique elements in the CASColumn |
Return boolean indicating if the values in the CASColumn are unique |
|
CASColumn.value_counts(self[, normalize, …]) |
Return object containing counts of unique values |
Selection¶
CASColumn.head(self[, n, bygroup_as_index, …]) |
Return first n rows of the column in a Series |
CASColumn.isin(self, values) |
Return a boolean CASColumn indicating if the value is in the given values |
CASColumn.sample(self[, n, frac, replace, …]) |
Returns a random sample of the table rows |
CASColumn.tail(self[, n, bygroup_as_index, …]) |
Return last n rows of the column in a Series |
Sorting¶
Note
There is no concept of a sorted table in the server. The sort_values method merely stores sorting information that is applied when fetching data.
CASColumn.sort_values(self[, axis, …]) |
Apply sort order parameters to fetches of the data in this column |
Datetime Properties¶
CASColumn.dt can be used to access the values of a CAS table column as datetime-like properties. They are accessed as CASColumn.dt.<property>.
The year of the datetime |
|
The month of the datetime January=1, December=12 |
|
The day of the datetime |
|
The hour of the datetime |
|
The minute of the datetime |
|
The second of the datetime |
|
The microsecond of the datetime |
|
The nanosecond of the datetime (always zero) |
|
The week ordinal of the year |
|
The week ordinal of the year |
|
The day of the week (Monday=0, Sunday=6) |
|
The day of the week (Monday=0, Sunday=6) |
|
The ordinal day of the year |
|
The quarter of the date |
|
Logical indicating if first day of the month |
|
Logical indicating if last day of the month |
|
Logical indicating if first day of quarter |
|
Logical indicating if last day of the quarter |
|
Logical indicating if first day of the year |
|
Logical indicating if the last day of the year |
|
The number of days in the month |
|
The number of days in the month |
String Handling¶
CASColumn.str can be used to access the values of a CAS table column as strings and apply operations. They are accessed as CASColumn.str.<method/property>.
CASColumn.str.capitalize(self) |
Capitalize first letter, lowercase the rest |
CASColumn.str.contains(self, pat[, …]) |
Indicates whether the value contains the specified pattern |
CASColumn.str.count(self, pat[, flags]) |
Count occurrences of pattern in each value |
CASColumn.str.endswith(self, pat[, …]) |
Indicates whether the table column ends with the given pattern |
CASColumn.str.find(self, sub[, …]) |
Return lowest index of pattern in each value, or -1 on failure |
CASColumn.str.index(self, sub[, …]) |
Return lowest index of pattern in each value |
CASColumn.str.len(self) |
Compute the length of each value |
CASColumn.str.lower(self) |
Lowercase the value |
CASColumn.str.lstrip(self[, to_strip]) |
Strip leading spaces |
CASColumn.str.repeat(self, repeats) |
Duplicate value the specified number of times |
CASColumn.str.replace(self, pat, repl) |
Replace a pattern in the data |
CASColumn.str.rfind(self, sub[, …]) |
Return highest index of the pattern |
CASColumn.str.rindex(self, sub[, …]) |
Return highest index of the pattern |
CASColumn.str.rstrip(self[, to_strip]) |
Strip trailing whitespace |
CASColumn.str.startswith(self, pat) |
Indicates whether the table column starts with the given pattern |
CASColumn.str.strip(self[, to_strip]) |
Strip leading and trailing whitespace |
CASColumn.str.title(self) |
Capitalize each word in the value |
CASColumn.str.upper(self) |
Uppercase the value |
CASColumn.str.isalpha(self) |
Indicates whether the value contains only alpha characters |
CASColumn.str.isdigit(self) |
Indicates whether the value contains only digits |
CASColumn.str.isspace(self) |
Indicates whether the value contains only whitespace |
CASColumn.str.islower(self) |
Indicates whether the value contains only lowercase characters |
CASColumn.str.isupper(self) |
Indicates whether the value contains only uppercase characters |
CASColumn.str.istitle(self) |
Indicates whether the value is equivalent to the title representation |
CASColumn.str.isnumeric(self) |
Indicates whether the value contains a numeric representation |
CASColumn.str.isdecimal(self) |
Indicates whether the value contains a decimal representation |
SAS Functions¶
CASColumn.sas can be used to apply SAS functions to values in a table column. They are accessed as CASColumn.sas.<method>. Documentation for SAS functions can be seen at support.sas.com.
CASColumn.sas.abs(self) |
Computes the absolute value |
CASColumn.sas.airy(self) |
Computes the value of the Airy function |
CASColumn.sas.beta(self, param) |
Computes the value of the beta function |
CASColumn.sas.cnonct(self, df, prob) |
Computes the noncentrality parameter from a chi-square distribution |
CASColumn.sas.constant(self, name[, …]) |
Computes machine and mathematical constants |
CASColumn.sas.dairy(self) |
Computes the derivative of the Airy function |
CASColumn.sas.digamma(self) |
Computes the value of the digamma function |
CASColumn.sas.erf(self) |
Computes the value of the (normal) error function |
CASColumn.sas.erfc(self) |
Computes the value of the complementary (normal) error function |
CASColumn.sas.exp(self) |
Computes the value of the exponential function |
CASColumn.sas.fact(self) |
Computes a factorial |
CASColumn.sas.fnonct(self, ndf, ddf, prob) |
Computes the value of the noncentrality parameter of an F distribution |
CASColumn.sas.gamma(self) |
Computes the value of the gamma function |
CASColumn.sas.lgamma(self) |
Computes the natural logarithm of the Gamma function |
CASColumn.sas.log(self) |
Computes the natural (base e) logarithm |
CASColumn.sas.log1px(self) |
Computes the log of 1 plus the argument |
CASColumn.sas.log10(self) |
Computes the logarithm to the base 10 |
CASColumn.sas.log2(self) |
Computes the logarithm to the base 2 |
CASColumn.sas.logbeta(self, param) |
Computes the logarithm of the beta function |
CASColumn.sas.mod(self, divisor) |
Computes the remainder from the division with fuzzing |
CASColumn.sas.modz(self, divisor) |
Computes the remainder from the division without fuzzing |
CASColumn.sas.sign(self) |
Returns the sign of a value |
CASColumn.sas.sqrt(self) |
Computes the square root of a value |
CASColumn.sas.tnonct(self, df, prob) |
Computes the noncentrality parameter from the Student’s t distribution |
CASColumn.sas.trigamma(self) |
Returns the value of the trigamma function |
CASTableGroupBy¶
CASTableGroupBy objects are returned by CASTable.groupby() and CASColumn.groupby().
Constructor¶
CASTableGroupBy(table, by[, axis, level, …]) |
Group CASTable / CASColumn objects by specified values |
Indexing and Iteration¶
CASTableGroupBy.__iter__(self) |
|
CASTableGroupBy.get_group(self, name[, obj]) |
Construct a CASTable / CASColumn with the given group key |
CASTableGroupBy.query(self, \*args, \*\*kwargs) |
Query the table with a boolean expression |
Conversion¶
CASTableGroupBy.to_frame(self, \*\*kwargs) |
Retrieve all values into a DataFrame |
CASTableGroupBy.to_series(self[, name]) |
Retrieve all values into a Series |
Computations / Descriptive Statistics¶
CASTableGroupBy.css(self, \*args, \*\*kwargs) |
Get css using groups |
CASTableGroupBy.cv(self, \*args, \*\*kwargs) |
Get cv using groups |
CASTableGroupBy.describe(self, \*args, …) |
Get basic statistics using groups |
CASTableGroupBy.head(self, \*args, \*\*kwargs) |
Retrieve first values of each group |
CASTableGroupBy.max(self, \*args, \*\*kwargs) |
Get maximum values using groups |
CASTableGroupBy.mean(self, \*args, \*\*kwargs) |
Get mean values using groups |
CASTableGroupBy.median(self, \*args, \*\*kwargs) |
Get median values using groups |
CASTableGroupBy.min(self, \*args, \*\*kwargs) |
Get minimum values using groups |
CASTableGroupBy.mode(self, \*args, \*\*kwargs) |
Get mode values using groups |
CASTableGroupBy.nth(self, n[, dropna]) |
Return the nth row from each group |
CASTableGroupBy.nmiss(self, \*args, \*\*kwargs) |
Get nmiss using groups |
CASTableGroupBy.nlargest(self, \*args, …) |
Return the n largest values ordered by columns |
CASTableGroupBy.nsmallest(self, \*args, …) |
Return the n smallest values ordered by columns |
CASTableGroupBy.nunique(self, \*args, \*\*kwargs) |
Get number of unique values using groups |
CASTableGroupBy.probt(self, \*args, \*\*kwargs) |
Get probt using groups |
CASTableGroupBy.quantile(self, \*args, …) |
Get quantiles using groups |
CASTableGroupBy.std(self, \*args, \*\*kwargs) |
Get std using groups |
CASTableGroupBy.stderr(self, \*args, \*\*kwargs) |
Get stderr using groups |
CASTableGroupBy.sum(self, \*args, \*\*kwargs) |
Get sum using groups |
CASTableGroupBy.tvalue(self, \*args, \*\*kwargs) |
Get tvalue using groups |
CASTableGroupBy.skew(self, \*args, \*\*kwargs) |
Get skewness using groups |
CASTableGroupBy.kurt(self, \*args, \*\*kwargs) |
Get kurtosis using groups |
CASTableGroupBy.unique(self, \*args, \*\*kwargs) |
Get unique values using groups |
CASTableGroupBy.uss(self, \*args, \*\*kwargs) |
Get uss using groups |
CASTableGroupBy.value_counts(self, \*args, …) |
Get value counts using groups |
CASTableGroupBy.var(self, \*args, \*\*kwargs) |
Get var using groups |
CASResponse¶
CASResponse objects are primarily used internally, but they can be used in more advanced workflows. They are never instantiated directly; they are always created by the CAS connection object and returned by an iterator.
Constructor¶
CASResponse(_sw_response[, soptions, connection]) |
Response from a CAS action |
Response Properties¶
CASDisposition(_sw_response) |
Disposition of a CAS response |
CASPerformance(_sw_response) |
Performance metrics of a CAS response |
Data Message Handlers¶
Data message handlers are used to create custom data loaders. They construct the parameters to the addtable CAS action and handle the piece-wise loading of data into the server.
Note
Data message handlers are not currently supported in the REST interface.
CASDataMsgHandler(vars[, nrecs, reclen, …]) |
Base class for all CAS data message handlers |
PandasDataFrame(data[, nrecs, dtype, …]) |
CAS data message handler for pandas.DataFrame objects |
SAS7BDAT(path[, nrecs, transformers]) |
Create a SAS7BDAT data message handler |
CSV(path[, nrecs, transformers]) |
Create a CSV data message handler |
Text(path[, nrecs, transformers]) |
Create a Text data message handler |
FWF(path[, nrecs, transformers]) |
Create an FWF data message handler |
JSON(path[, nrecs, transformers]) |
Create a JSON data message handler |
HTML(path[, index, nrecs, transformers]) |
Create an HTML data message handler |
SQLTable(table, engine[, nrecs, transformers]) |
Create an SQLTable data message handler |
SQLQuery(query, engine[, nrecs, transformers]) |
Create an SQLQuery data message handler |
Excel(path[, sheet, nrecs, transformers]) |
Create an Excel data message handler |
Clipboard([nrecs, transformers]) |
Create a Clipboard data message handler |
DBAPI(module, cursor[, nrecs, transformers]) |
Create a Python DB-API 2.0 compliant data message handler |
Image(data[, nrecs, subdirs]) |
CAS data message handler for images |
Date and Time Functions¶
The following date / time / datetime functions can be used to convert dates to and from Python, CAS, and SAS date values.
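To give a feel for what these conversions involve, here is a standard-library sketch. It is not the package's implementation, and the function names are hypothetical. It relies on SAS dates being counted in days, and SAS datetimes in seconds, from 1 January 1960. It also assumes that CAS datetimes count microseconds from the same epoch. Time zones are ignored; all values are naive.

```python
from datetime import date, datetime, timedelta

SAS_EPOCH_DATE = date(1960, 1, 1)
SAS_EPOCH_DATETIME = datetime(1960, 1, 1)


def sas_date_to_python(days):
    """SAS date (days since 1960-01-01) to datetime.date."""
    return SAS_EPOCH_DATE + timedelta(days=days)


def python_date_to_sas(pydt):
    """datetime.date to SAS date (days since 1960-01-01)."""
    return (pydt - SAS_EPOCH_DATE).days


def sas_datetime_to_python(seconds):
    """SAS datetime (seconds since 1960-01-01) to naive datetime."""
    return SAS_EPOCH_DATETIME + timedelta(seconds=seconds)


def sas_datetime_to_cas(seconds):
    """SAS datetime to CAS datetime, assuming CAS counts microseconds."""
    return int(round(seconds * 1_000_000))


print(sas_date_to_python(0))
print(python_date_to_sas(date(1970, 1, 1)))
print(sas_datetime_to_python(86400))
print(sas_datetime_to_cas(1.5))
```

The library functions listed below also accept strings and, in some cases, a `tz` argument, which this sketch does not attempt to cover.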
CAS Dates and Times¶
cas2python_timestamp(cts[, tz]) |
Convert a CAS datetime to Python datetime |
cas2python_datetime(cts[, tz]) |
Convert a CAS datetime to Python datetime |
cas2python_date(cdt) |
Convert a CAS date to a Python date |
cas2python_time(ctm) |
Convert a CAS time to a Python time |
python2cas_timestamp(pyts) |
Convert a Python datetime to CAS datetime |
python2cas_datetime(pyts) |
Convert a Python datetime to CAS datetime |
python2cas_date(pydt) |
Convert a Python date to a CAS date |
python2cas_time(pytm) |
Convert a Python time to a CAS time |
str2cas_timestamp(dts) |
Convert a string to a CAS timestamp |
str2cas_datetime(dts) |
Convert a string to a CAS timestamp |
str2cas_date(dts) |
Convert a string to a CAS date |
str2cas_time(dts) |
Convert a string to a CAS time |
cas2sas_timestamp(cdt) |
Convert a CAS timestamp to a SAS timestamp |
cas2sas_datetime(cdt) |
Convert a CAS timestamp to a SAS timestamp |
cas2sas_date(cdt) |
Convert a CAS date to a SAS date |
cas2sas_time(cdt) |
Convert a CAS time to a SAS time |
SAS Dates and Times¶
sas2python_timestamp(sts[, tz]) |
Convert a SAS datetime to Python datetime |
sas2python_datetime(sts[, tz]) |
Convert a SAS datetime to Python datetime |
sas2python_date(sdt) |
Convert a SAS date to a Python date |
sas2python_time(sts) |
Convert a SAS time to a Python time |
python2sas_timestamp(pyts[, tz]) |
Convert a Python datetime to SAS datetime |
python2sas_datetime(pyts[, tz]) |
Convert a Python datetime to SAS datetime |
python2sas_date(pydt) |
Convert a Python date to a SAS date |
python2sas_time(pytm) |
Convert a Python time to a SAS time |
str2sas_timestamp(dts) |
Convert a string to a SAS timestamp |
str2sas_datetime(dts) |
Convert a string to a SAS timestamp |
str2sas_date(dts) |
Convert a string to a SAS date |
str2sas_time(dts) |
Convert a string to a SAS time |
sas2cas_timestamp(sts) |
Convert a SAS datetime to CAS datetime |
sas2cas_datetime(sts) |
Convert a SAS datetime to CAS datetime |
sas2cas_date(sdt) |
Convert a SAS date to a CAS date |
sas2cas_time(sts) |
Convert a SAS time to a CAS time |