Sorting

Because data in CAS can be spread across many machines, and may even be redistributed in response to events on the server, it is not stored in any particular order. For statistical actions this usually makes little difference, since the CAS actions doing the work handle the data regardless of its order. However, when you bring a table of data back to the client from CAS using the fetch action, you may want it returned in sorted order.

In [1]: conn = swat.CAS(host, port, username, password)

In [2]: tbl = conn.read_csv('https://raw.githubusercontent.com/'
   ...:                     'sassoftware/sas-viya-programming/master/data/cars.csv')
   ...: 

Note: Cloud Analytic Services made the uploaded file available as table TMP9XJ3EKR8 in caslib CASUSER(castest).

Note: The table TMP9XJ3EKR8 has been created in caslib CASUSER(castest) from binary data uploaded to Cloud Analytic Services.

In [3]: tbl.fetch(to=5)
Out[3]: 
[Fetch]

 Selected Rows from Table TMP9XJ3EKR8
 
     Make           Model   Type   ...    Weight Wheelbase  Length
 0  Acura             MDX    SUV   ...    4451.0     106.0   189.0
 1  Acura  RSX Type S 2dr  Sedan   ...    2778.0     101.0   172.0
 2  Acura         TSX 4dr  Sedan   ...    3230.0     105.0   183.0
 3  Acura          TL 4dr  Sedan   ...    3575.0     108.0   186.0
 4  Acura      3.5 RL 4dr  Sedan   ...    3880.0     115.0   197.0
 
 [5 rows x 15 columns]

+ Elapsed: 0.00828s, user: 0.006s, sys: 0.002s, mem: 1.94mb

In [4]: tbl.fetch(to=5, sortby=['MSRP'])
Out[4]: 
[Fetch]

 Selected Rows from Table TMP9XJ3EKR8
 
       Make             Model   Type   ...    Weight Wheelbase  Length
 0      Kia    Rio 4dr manual  Sedan   ...    2403.0      95.0   167.0
 1  Hyundai  Accent 2dr hatch  Sedan   ...    2255.0      96.0   167.0
 2   Toyota   Echo 2dr manual  Sedan   ...    2035.0      93.0   163.0
 3   Saturn          Ion1 4dr  Sedan   ...    2692.0     103.0   185.0
 4      Kia      Rio 4dr auto  Sedan   ...    2458.0      95.0   167.0
 
 [5 rows x 15 columns]

+ Elapsed: 0.0144s, user: 0.01s, sys: 0.003s, mem: 6.76mb

You can also set the sort direction by passing a dictionary with 'name' and 'order' keys in place of a plain column name.

In [5]: tbl.fetch(to=5, sortby=[{'name':'MSRP', 'order':'descending'}])
Out[5]: 
[Fetch]

 Selected Rows from Table TMP9XJ3EKR8
 
             Make                  Model  ...   Wheelbase Length
 0        Porsche            911 GT2 2dr  ...        93.0  175.0
 1  Mercedes-Benz              CL600 2dr  ...       114.0  196.0
 2  Mercedes-Benz  SL600 convertible 2dr  ...       101.0  179.0
 3  Mercedes-Benz           SL55 AMG 2dr  ...       101.0  179.0
 4  Mercedes-Benz              CL500 2dr  ...       114.0  196.0
 
 [5 rows x 15 columns]

+ Elapsed: 0.0144s, user: 0.011s, sys: 0.004s, mem: 6.76mb
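When several columns with different directions are involved, writing the sortby dictionaries by hand becomes verbose. The following is a minimal sketch of a helper that builds them. The name make_sortby and the leading '-' convention are hypothetical and not part of SWAT; the sketch also assumes that sortby accepts more than one entry and that 'ascending' is the counterpart of the 'descending' value shown above.

```python
def make_sortby(*keys):
    """Build a list of sort-key dicts from column names.

    Hypothetical helper, not part of SWAT. A leading '-' on a column
    name requests descending order; anything else is ascending.
    """
    sortby = []
    for key in keys:
        if key.startswith('-'):
            sortby.append({'name': key[1:], 'order': 'descending'})
        else:
            sortby.append({'name': key, 'order': 'ascending'})
    return sortby


# Sort by Make ascending, then by MSRP descending within each Make.
spec = make_sortby('Make', '-MSRP')
print(spec)
```

With a live connection, the resulting list could then be passed along the lines of tbl.fetch(to=5, sortby=spec).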

If you are using the pandas.DataFrame style API for CASTable, you can also use the sort_values() method on CASTable objects.

In [6]: sorttbl = tbl.sort_values(['MSRP'])

In [7]: sorttbl.head()
Out[7]: 
Selected Rows from Table TMP9XJ3EKR8

      Make             Model   Type   ...    Weight Wheelbase  Length
0      Kia    Rio 4dr manual  Sedan   ...    2403.0      95.0   167.0
1  Hyundai  Accent 2dr hatch  Sedan   ...    2255.0      96.0   167.0
2   Toyota   Echo 2dr manual  Sedan   ...    2035.0      93.0   163.0
3   Saturn          Ion1 4dr  Sedan   ...    2692.0     103.0   185.0
4      Kia      Rio 4dr auto  Sedan   ...    2458.0      95.0   167.0

[5 rows x 15 columns]

In [8]: sorttbl = tbl.sort_values(['MSRP'], ascending=False)

In [9]: sorttbl.head()
Out[9]: 
Selected Rows from Table TMP9XJ3EKR8

            Make                  Model  ...   Wheelbase Length
0        Porsche            911 GT2 2dr  ...        93.0  175.0
1  Mercedes-Benz              CL600 2dr  ...       114.0  196.0
2  Mercedes-Benz  SL600 convertible 2dr  ...       101.0  179.0
3  Mercedes-Benz           SL55 AMG 2dr  ...       101.0  179.0
4  Mercedes-Benz              CL500 2dr  ...       114.0  196.0

[5 rows x 15 columns]

As previously mentioned, sorting this way doesn’t affect anything in the table on the CAS server itself; it merely stores the sort keys and applies them when data is fetched (either through fetch directly or through any method that calls fetch in the background).
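The idea of storing sort keys and applying them only at fetch time can be illustrated with a small pure-Python toy. This is not how SWAT is implemented; the class LazySortedRows, its attributes, and the sample rows are all made up for illustration, and the sketch only mirrors the behavior described above.

```python
class LazySortedRows:
    """Toy stand-in for a table handle that remembers sort keys."""

    def __init__(self, rows, sort_keys=None):
        self._rows = rows                        # shared data, never reordered
        self._sort_keys = list(sort_keys or [])  # (column, ascending) pairs

    def sort_values(self, by, ascending=True):
        # Return a new handle that records the keys; no sorting happens yet.
        return LazySortedRows(self._rows, [(col, ascending) for col in by])

    def fetch(self, to=5):
        rows = list(self._rows)  # work on a copy of the row list
        # Apply keys from least to most significant; sorted() is stable.
        for col, ascending in reversed(self._sort_keys):
            rows = sorted(rows, key=lambda r: r[col], reverse=not ascending)
        return rows[:to]


# Made-up sample rows, standing in for a server-side table.
data = [
    {'Model': 'A', 'MSRP': 30000},
    {'Model': 'B', 'MSRP': 10000},
    {'Model': 'C', 'MSRP': 20000},
]
tbl = LazySortedRows(data)
sorttbl = tbl.sort_values(['MSRP'])

print([r['Model'] for r in sorttbl.fetch()])  # sorted view
print([r['Model'] for r in tbl.fetch()])      # original order is unchanged
```

Because sort_values() returns a new handle rather than reordering the shared rows, the original handle keeps returning data in its stored order.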