Sorting

Since data in CAS can be spread across many machines and may even be redistributed depending on events that occur, the data is not stored in an ordered form. In general, when using statistical actions, this doesn’t make much difference since the CAS actions doing the work will handle the data regardless of the order that it is in. However, when you are bringing a table of data back to the client from CAS using the fetch action, you may want to have it come back in a sorted form.

In [1]: conn = swat.CAS(host, port, username, password)

In [2]: tbl = conn.read_csv('https://raw.githubusercontent.com/'
   ...:                     'sassoftware/sas-viya-programming/master/data/cars.csv')
   ...: 

Note: Cloud Analytic Services made the uploaded file available as table TMPMRJ21JTM in caslib CASUSER(castest).

Note: The table TMPMRJ21JTM has been created in caslib CASUSER(castest) from binary data uploaded to Cloud Analytic Services.
In [3]: tbl.fetch(to=5) Out[3]: [Fetch] Selected Rows from Table TMPMRJ21JTM Make Model Type ... Weight Wheelbase Length 0 Acura MDX SUV ... 4451.0 106.0 189.0 1 Acura RSX Type S 2dr Sedan ... 2778.0 101.0 172.0 2 Acura TSX 4dr Sedan ... 3230.0 105.0 183.0 3 Acura TL 4dr Sedan ... 3575.0 108.0 186.0 4 Acura 3.5 RL 4dr Sedan ... 3880.0 115.0 197.0 [5 rows x 15 columns] + Elapsed: 0.00788s, user: 0.006s, sys: 0.001s, mem: 2.04mb In [4]: tbl.fetch(to=5, sortby=['MSRP']) Out[4]: [Fetch] Selected Rows from Table TMPMRJ21JTM Make Model Type ... Weight Wheelbase Length 0 Kia Rio 4dr manual Sedan ... 2403.0 95.0 167.0 1 Hyundai Accent 2dr hatch Sedan ... 2255.0 96.0 167.0 2 Toyota Echo 2dr manual Sedan ... 2035.0 93.0 163.0 3 Saturn Ion1 4dr Sedan ... 2692.0 103.0 185.0 4 Kia Rio 4dr auto Sedan ... 2458.0 95.0 167.0 [5 rows x 15 columns] + Elapsed: 0.0141s, user: 0.013s, sys: 0.000999s, mem: 6.82mb

Of course, it is possible to set the direction of the sorting as well.

In [5]: tbl.fetch(to=5, sortby=[{'name':'MSRP', 'order':'descending'}])
Out[5]: 
[Fetch]

 Selected Rows from Table TMPMRJ21JTM
 
             Make                  Model  ...   Wheelbase Length
 0        Porsche            911 GT2 2dr  ...        93.0  175.0
 1  Mercedes-Benz              CL600 2dr  ...       114.0  196.0
 2  Mercedes-Benz  SL600 convertible 2dr  ...       101.0  179.0
 3  Mercedes-Benz           SL55 AMG 2dr  ...       101.0  179.0
 4  Mercedes-Benz              CL500 2dr  ...       114.0  196.0
 
 [5 rows x 15 columns]

+ Elapsed: 0.0138s, user: 0.011s, sys: 0.002s, mem: 6.82mb

If you are using the pandas.DataFrame style API for CASTable, you can also use the sort_values() method on CASTable objects.

In [6]: sorttbl = tbl.sort_values(['MSRP'])

In [7]: sorttbl.head()
Out[7]: 
Selected Rows from Table TMPMRJ21JTM

      Make             Model   Type   ...    Weight Wheelbase  Length
0      Kia    Rio 4dr manual  Sedan   ...    2403.0      95.0   167.0
1  Hyundai  Accent 2dr hatch  Sedan   ...    2255.0      96.0   167.0
2   Toyota   Echo 2dr manual  Sedan   ...    2035.0      93.0   163.0
3   Saturn          Ion1 4dr  Sedan   ...    2692.0     103.0   185.0
4      Kia      Rio 4dr auto  Sedan   ...    2458.0      95.0   167.0

[5 rows x 15 columns]

In [8]: sorttbl = tbl.sort_values(['MSRP'], ascending=False)

In [9]: sorttbl.head()
Out[9]: 
Selected Rows from Table TMPMRJ21JTM

            Make                  Model  ...   Wheelbase Length
0        Porsche            911 GT2 2dr  ...        93.0  175.0
1  Mercedes-Benz              CL600 2dr  ...       114.0  196.0
2  Mercedes-Benz  SL600 convertible 2dr  ...       101.0  179.0
3  Mercedes-Benz           SL55 AMG 2dr  ...       101.0  179.0
4  Mercedes-Benz              CL500 2dr  ...       114.0  196.0

[5 rows x 15 columns]

As previously mentioned, this doesn’t affect anything in the table on the CAS server itself, it merely stores away the sort keys and applies them when data is fetched (either through fetch directly, or any method that calls fetch in the background).