PHPFixing
  • Privacy Policy
  • TOS
  • Ask Question
  • Contact Us
  • Home
  • PHP
  • Programming
  • SQL Injection
  • Web3.0

Thursday, November 10, 2022

[FIXED] How to write proc sql without windowfunction over partition by sum?

 November 10, 2022     proc, proc-sql, sas, sql     No comments   

Issue

I just started learning SAS and realised that proc sql don't use window functions. As I am more at ease with sql I was wondering how I can simulate a sum window function in proc?

desired result

 select a.active, a.store_id, a.nbr, sum(nbr) over (partition by a.store_id)
from(select active, store_id, count(customer_id) as nbr from customer group by active, store_id) as a
;
active store_id nbr sum
0 1 8 326
1 1 318 326
0 2 7 273
1 2 266 273

eg of raw data

select active, store_id, customer_id
from customer
limit 10;
active store_id customer_id
1 1 1
1 1 2
1 2 3
1 2 4
1 1 5
1 1 6
0 1 7
1 2 8
1 1 9
1 2 10

current result and query

    select a.active, a.store_id, a.nbr, sum(nbr) 
    from(select active, store_id, count(customer_id) as nbr from customer group by active, store_id) as a
    group by a.active, a.store_id, a.nbr;
active store_id nbr sum
0 1 8 8
1 1 318 318
0 2 7 7
1 2 266 266

Solution

You can do the equivalent in proc sql by merging two sub-queries: one for the count of customers by active, store_id, and another for the total customers for each store_id.

proc sql noprint;
    create table want as
        select t1.active
             , t1.store_id
             , t1.nbr
             , t2.sum
        from (select active
                   , store_id
                   , count(customer_id) as nbr
              from have 
              group by store_id, active
             ) as t1

        LEFT JOIN

             (select store_id
                   , count(customer_id) as sum
              from have
              group by store_id
             ) as t2

        ON t1.store_id = t2.store_id
    ;
quit;

If you wanted to do this in a more SASsy way, you can run proc means and merge together the results from a single dataset that holds everything you need. proc means will calculate all possible combinations of your variables by default.

proc means data=have noprint;
    class store_id active;
    ways 1 2;

    output out=want_total
        n(customer_id) = total
    ;
run;

data want;
    merge want_total(where=(_TYPE_ = 3) rename=(total = nbr) )
          want_total(where=(_TYPE_ = 2) rename=(total = sum) keep=_TYPE_ store_id total)
    ;
    by store_id;

    drop _:;
run;

Or, in SQL:

proc sql;
    create table want as
        select t1.store_id
             , t1.active
             , t1.total as nbr
             , t2.total as sum
        from want_total as t1
        LEFT JOIN
             want_total as t2
        ON t1.store_id = t2.store_id
        where     t1._TYPE_ = 3
              AND t2._TYPE_ = 2
    ;
quit;

The _TYPE_ variable identifies the level of the analysis. For example, _TYPE_ = 1 is for active only, _TYPE_ = 2 is for store_id only, and _TYPE_ = 3 is for all combinations. You can view this in the output dataset from proc means:

store_id    active  _TYPE_  _FREQ_  total
.           0       1        3      3
.           1       1        7      7
1           .       2        6      6
2           .       2        4      4
1           0       3        1      1
1           1       3        5      5
2           0       3        2      2
2           1       3        2      2

And if you wanted faster high-performance results, check out its big sibling, proc hpsummary.

Therein lies the cool thing about SAS: You can bounce between PROCs, SQL, the DATA Step, and Python via Pandas/proc python. You can exploit the unique benefits of each of these methods and thought processes for any number of data engineering and statistics problems.



Answered By - Stu Sztukowski
Answer Checked By - Cary Denson (PHPFixing Admin)
  • Share This:  
  •  Facebook
  •  Twitter
  •  Stumble
  •  Digg
Newer Post Older Post Home

0 Comments:

Post a Comment

Note: Only a member of this blog may post a comment.

Total Pageviews

Featured Post

Why Learn PHP Programming

Why Learn PHP Programming A widely-used open source scripting language PHP is one of the most popular programming languages in the world. It...

Subscribe To

Posts
Atom
Posts
Comments
Atom
Comments

Copyright © PHPFixing