PHPFixing

Sunday, August 28, 2022

[FIXED] How to read csv with redundant characters as dataframe?

Tags: csv, pandas, python

Issue

I have hundreds of CSV files that use a comma as the field delimiter and also as the decimal separator. The files look like this:

ID,columnA,columnB
A,0,"15,6"
B,"1,2",0
C,0,

I am trying to read all of these files in Python with pandas, but I cannot get the values to split cleanly into three numeric columns, perhaps because of the decimal separator or because some values are wrapped in quotation marks.

I first tried the code below, and also experimented with different encodings, but neither gave the result I need:

import pandas as pd

df = pd.read_csv("test.csv", sep=",")

Could anyone help me? The result should be a dataframe like this:

  ID  columnA  columnB
0  A      0.0     15.6
1  B      1.2      0.0
2  C      0.0      NaN

Solution

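You just need to specify decimal=",". The quoted fields are already handled by read_csv's default quote character, so the only missing piece is telling pandas that a comma inside a number is the decimal separator: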

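from io import StringIO

import pandas as pd

# Inline copy of the sample file from the question
file = '''ID,columnA,columnB
A,0,"15,6"
B,"1,2",0
C,0,'''

# decimal="," makes pandas parse "15,6" as 15.6
df = pd.read_csv(StringIO(file), decimal=",")
print(df)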

Output:

  ID  columnA  columnB
0  A      0.0     15.6
1  B      1.2      0.0
2  C      0.0      NaN
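
Since the question mentions hundreds of files, the same option can be applied in a loop. The following is only a sketch under assumptions: the helper name read_decimal_comma_csvs, the "*.csv" pattern, and the two throwaway sample files are hypothetical and stand in for the real data, which is assumed to share the same three columns.

```python
import glob
import os
import tempfile

import pandas as pd


def read_decimal_comma_csvs(pattern):
    """Read every CSV matching `pattern` with decimal="," and stack the rows."""
    paths = sorted(glob.glob(pattern))
    frames = [pd.read_csv(path, decimal=",") for path in paths]
    return pd.concat(frames, ignore_index=True)


# Demo with two throwaway files standing in for the real data (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "a.csv": 'ID,columnA,columnB\nA,0,"15,6"\nB,"1,2",0\n',
        "b.csv": 'ID,columnA,columnB\nC,0,\n',
    }
    for name, text in samples.items():
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as fh:
            fh.write(text)
    combined = read_decimal_comma_csvs(os.path.join(tmp, "*.csv"))

print(combined)
print(combined.dtypes)
```

Passing ignore_index=True gives the combined dataframe a fresh 0..n-1 index instead of repeating each file's own row numbers. Note that pd.concat raises an error if the pattern matches no files.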


Answered By - BeRT2me
Answer Checked By - Marilyn (PHPFixing Volunteer)