PHPFixing
  • Privacy Policy
  • TOS
  • Ask Question
  • Contact Us
  • Home
  • PHP
  • Programming
  • SQL Injection
  • Web3.0

Thursday, May 19, 2022

[FIXED] How to generate new variable (column) based on a specific given column containing cycles of numbers from 0 to n, where n is a positive integer

 May 19, 2022     excel, excel-formula, python, variables, vba     No comments   

Issue

The dataset contains data about COVID-19 patients. It is in both EXCEL and CSV file formats, and contains several variables and over 7 thousand records (rows) which has made the problem extremely harder and very time consuming to solve manually. Below are the 4 most important variables (columns) needed in solving the problem; 1: id for identifying each record (row), 2: day_at_hosp for each day a patient remained admitted at the hospital, 3: sex of patient, 4: death for whether the patient eventually died or survived. I want to create a new variable total_days_at_hosp which should contain a total of days a patient remained admitted at hospital.

Old Table:
 _______________________________________
|   id  | day_at_hosp |  sex   | death  |
|_______|_____________|________|________|
|   1   |     0       |  male  |   no   |
|   2   |     1       |        |        |
|   3   |     2       |        |        |
|   4   |     0       | female |   no   |
|   5   |     1       |        |        |
|   6   |     0       |  male  |   no   |
|   7   |     0       | female |   no   |
|   8   |     0       |  male  |   no   |
|   9   |     1       |        |        |
|  10   |     2       |        |        |
|  11   |     3       |        |        |
|  12   |     4       |        |        |
| ...   |    ...      |   ...  |  ...   |
| 7882  |     0       | female |   no   |
| 7883  |     1       |        |        |
|_______|_____________|________|________|

New Table:
I want to convert table above into table below:
 ____________________________________________
|   id  |total_days_at_hosp|  sex   | death  |
|_______|__________________|________|________|
|   1   |        3         |  male  |   no   |
|   4   |        2         |  male  |   yes  |
|   6   |        1         |  male  |   yes  |
|   7   |        1         | female |   no   |
|   8   |        5         |  male  |   no   |
| ...   |       ...        |   ...  |  ...   |
| 2565  |        2         | female |   no   |
|_______|__________________|________|________|

NOTE: the id column is for every record entered, and multiple records were entered for each patient depending on how long a patient remained admitted at the hospital. The day_at_hosp variable contains days: 0=initial day at hospital, 1=second day at hospital, ... , n=nth last day at hospital. The record (row) where the variable (column) day_at_hosp is 0 corresponds to all entries in other columns, if the record (row) for day_at_hosp is *not 0, say 1,2,3, ...,5 then it belongs to the patient right above, and all the corresponding variables (columns) are left blank. However the dataset I need should look like the table below. It should include a new variable (column) called total_days_at_hosp generated from the variable (column) day_at_hosp. The new variable (column) total_days_at_hosp is more useful in statistical tests to be conducted and will replace variable (column) day_at_hosp, so that all blank rows can be deleted. To move from old table to new table the needed program should do the following:

day_at_hosp ===> total_days_at_hosp
    0                                 
    1        --->        3               
    2
-------------------------------------                              
    0        --->        2           
    1                              
-------------------------------------
    0        --->        1          
-------------------------------------
    0        --->        1
-------------------------------------
    0                                  
    1                                  
    2        --->        5                
    3
    4                                  
-------------------------------------
   ...                                 
------------------------------------- 
    0         --->       2                                              
    1
-------------------------------------

How can I achieve this?


Solution

Another formula option without dummy value placed at end of the Old/New Table.

1] Create New Table by >>

  • Copy and paste all Old Table data to a unused area
  • Click "Autofilter"
  • In "days_at_hospital" column select =0 value
  • Copy and paste filter of admissions to New Table column F
  • Delete all 0s in rows of Column G

Then,

2] In G2, formula copied down :

=IF(F2="","",IF(F3="",MATCH(9^9,A:A)+1,MATCH(F3,A:A,0))-MATCH(F2,A:A,0))

Remark : If your "ID Column" is Text value, formula changed to :

=IF(F2="","",IF(F3="",MATCH("zzz",A:A)+1,MATCH(F3,A:A,0))-MATCH(F2,A:A,0))

enter image description here



Answered By - bosco_yip
Answer Checked By - David Goodson (PHPFixing Volunteer)
  • Share This:  
  •  Facebook
  •  Twitter
  •  Stumble
  •  Digg
Newer Post Older Post Home

0 Comments:

Post a Comment

Note: Only a member of this blog may post a comment.

Total Pageviews

Featured Post

Why Learn PHP Programming

Why Learn PHP Programming A widely-used open source scripting language PHP is one of the most popular programming languages in the world. It...

Subscribe To

Posts
Atom
Posts
Comments
Atom
Comments

Copyright © PHPFixing