Friday, May 13, 2022

[FIXED] How do I iteratively append to text?

Issue

I have a dataframe whose URLs need page suffixes appended, up to page/9/, in Python.

df:

/soccer/england/premier-league-2020-2021/results/
/soccer/england/premier-league-2019-2020/results/
/soccer/england/premier-league-2018-2019/results/

For every row in df, I have to append #/, #/page/2/, #/page/3/, #/page/4/, etc., up to #/page/9/, as below.

How can I do it in Python?

expected df:

/soccer/england/premier-league-2020-2021/results/#/
/soccer/england/premier-league-2020-2021/results/#/page/2/
/soccer/england/premier-league-2020-2021/results/#/page/3/
.
.
/soccer/england/premier-league-2020-2021/results/#/page/9/
/soccer/england/premier-league-2019-2020/results/#/
/soccer/england/premier-league-2019-2020/results/#/page/2/
/soccer/england/premier-league-2019-2020/results/#/page/3/
.
.
/soccer/england/premier-league-2019-2020/results/#/page/9/
/soccer/england/premier-league-2018-2019/results/#/
/soccer/england/premier-league-2018-2019/results/#/page/2/
/soccer/england/premier-league-2018-2019/results/#/page/3/
.
.
/soccer/england/premier-league-2018-2019/results/#/page/9/

Solution

Sample dataframe used by me:

import pandas as pd

df=pd.DataFrame({'col': {0: '/soccer/england/premier-league-2020-2021/results/',
  1: '/soccer/england/premier-league-2019-2020/results/',
  2: '/soccer/england/premier-league-2018-2019/results/',
  3: '/soccer/england/premier-league-2020-2021/results/',
  4: '/soccer/england/premier-league-2019-2020/results/',
  5: '/soccer/england/premier-league-2018-2019/results/',
  6: '/soccer/england/premier-league-2020-2021/results/',
  7: '/soccer/england/premier-league-2019-2020/results/',
  8: '/soccer/england/premier-league-2018-2019/results/',
  9: '/soccer/england/premier-league-2020-2021/results/',
  10: '/soccer/england/premier-league-2019-2020/results/',
  11: '/soccer/england/premier-league-2018-2019/results/'}})

You can try:

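df['h']=df.index%9+1
#helper column: page number 1..9, cycling with the index
df['col']=df['col']+("#/page/"+df['h'].astype(str)+'/').mask(df['h'].eq(1),"#/")
#append '#/page/<pagenumber>/', or just '#/' where the page number is 1
df=df.drop(columns='h')
#remove the helper column (the positional form drop('h',1) no longer works in pandas 2.0+)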

Now if you print df, you will get your desired output.
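To see what the index-based page number does, here is a minimal sketch on a hypothetical 12-row frame with shortened placeholder URLs (the '/results/' strings and the names page, suffix and result are assumptions for illustration, not part of the question). Because the page number is index % 9 + 1, it restarts at 1 after every ninth row regardless of which URL sits in that row:

```python
import pandas as pd

# Hypothetical stand-in data: 12 rows with a shortened placeholder URL.
df = pd.DataFrame({'col': ['/results/'] * 12})

# Page number 1..9 derived from the default integer index.
page = pd.Series(df.index % 9 + 1, index=df.index)

# '#/page/<n>/' for every row, replaced by plain '#/' where the page is 1.
suffix = ('#/page/' + page.astype(str) + '/').mask(page.eq(1), '#/')

result = (df['col'] + suffix).tolist()
print(result)
```

Rows 0 to 8 get pages 1 to 9 and rows 9 to 11 start again at page 1, so this only lines up per URL when each URL already occupies nine consecutive rows.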

Update:

IIUC, you need 9 URLs for every unique URL, so:

out=pd.DataFrame(df['col'].unique(),columns=['col'])
#create a dataframe from the unique values of the 'col' column
out=out.reindex(out.index.repeat(9)).reset_index(drop=True)
#repeat each row 9 times and renumber the index 0..n-1
out['h']=out.index%9+1
#helper column: page number 1..9 within each block of 9 rows
out['col']=out['col']+("#/page/"+out['h'].astype(str)+'/').mask(out['h'].eq(1),"#/")
#append '#/page/<pagenumber>/', or just '#/' where the page number is 1
out=out.drop(columns='h')
#remove the helper column (the positional form drop('h',1) no longer works in pandas 2.0+)

Now if you print the out dataframe, you will get your desired output.
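As an alternative to the helper column, the same expansion can be sketched by building the nine suffixes once and pairing each unique URL with each of them. This is only a sketch under the assumption that the input looks like the question's df; the names suffixes and expanded are made up for illustration:

```python
import pandas as pd

# Sample input mirroring the three URLs from the question.
df = pd.DataFrame({'col': [
    '/soccer/england/premier-league-2020-2021/results/',
    '/soccer/england/premier-league-2019-2020/results/',
    '/soccer/england/premier-league-2018-2019/results/',
]})

# Page 1 uses the bare '#/' suffix; pages 2..9 use '#/page/<n>/'.
suffixes = ['#/'] + [f'#/page/{n}/' for n in range(2, 10)]

# Pair every unique URL (in order of first appearance) with every suffix.
expanded = pd.DataFrame(
    {'col': [url + suffix
             for url in df['col'].unique()
             for suffix in suffixes]}
)

print(expanded)
```

Changing the upper bound of range(2, 10) adjusts the number of pages without touching the rest of the code.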



Answered By - Anurag Dabas
Answer Checked By - Marie Seifert (PHPFixing Admin)
