Issue
I have a comma-separated text file, but several columns contain commas themselves, which creates extra columns where none are needed. I tried removing all the commas and then using a regex to find only the numbers and add a comma after them, following the solution in "Put comma after a pattern in python regex", but that did not work.
Excel has the same problem, as do other text editors.
0111,Cultivo de cereales y otros cultivos n.c.p.,011,Cultivos en general; cultivo de productos de mercado; hortic,01,AGRICULTURA, GANADERIA, CAZA Y ACTIVIDADES DE SERVICIOS CONE,01,**AGRICULTURA, GANADERIA, CAZA Y SILVICULTURA**
As you can see from the text marked with **, Python creates three columns there instead of one.
Another solution would be to wrap those values in quotation marks (" "), but I have not found a way to add them.
Solution
Your data source is buggy. It should put quotes (" ") around such values; then pandas would be able to parse them. Without quotes, there is no reliable way to tell the fields apart, because the meaning of a comma has become ambiguous.
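To illustrate why the quotes matter, here is a minimal sketch using Python's standard `csv` module. The two sample lines are shortened, hypothetical versions of the row in the question, not the real file; pandas follows the same quoting convention by default.

```python
import csv
import io

# Hypothetical, shortened rows: same content, with and without quotes
quoted_line = '01,"AGRICULTURA, GANADERIA, CAZA Y SILVICULTURA"\n'
unquoted_line = '01,AGRICULTURA, GANADERIA, CAZA Y SILVICULTURA\n'

# With quotes, the commas inside the value are treated as text
quoted_row = next(csv.reader(io.StringIO(quoted_line)))

# Without quotes, every comma is treated as a separator
unquoted_row = next(csv.reader(io.StringIO(unquoted_line)))

print(len(quoted_row), quoted_row)
print(len(unquoted_row), unquoted_row)
```

The quoted line yields two fields, while the unquoted one is split at every comma.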
A heuristic solution is to assume that any comma followed by a space belongs to the text and should be removed, while all other commas are separators and should be kept. You could try that, but it can still fail in some cases.
# assumes data holds the raw file contents as a str; replace() returns a new string
data = data.replace(", ", " ")
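A variation on the same heuristic, sketched here under the same assumption (a comma followed by a space is part of the text), is to split only on commas that are not followed by a space. This keeps the inner commas instead of deleting them, and writing the fields back with `csv.writer` adds the quotation marks the question asks about. The sample line is the row from the question with the ** markers removed; the variable names are illustrative.

```python
import csv
import io
import re

line = ("0111,Cultivo de cereales y otros cultivos n.c.p.,011,"
        "Cultivos en general; cultivo de productos de mercado; hortic,01,"
        "AGRICULTURA, GANADERIA, CAZA Y ACTIVIDADES DE SERVICIOS CONE,01,"
        "AGRICULTURA, GANADERIA, CAZA Y SILVICULTURA")

# Heuristic: split only on commas NOT followed by a space
fields = re.split(r",(?! )", line)

# csv.writer quotes any field that contains a comma (default QUOTE_MINIMAL)
buffer = io.StringIO()
csv.writer(buffer).writerow(fields)
quoted_line = buffer.getvalue()

# The quoted line now parses back into the same fields
reparsed = next(csv.reader(io.StringIO(quoted_line)))

print(len(fields))
print(quoted_line)
```

As with the simpler replace, this fails whenever a genuine separator happens to be followed by a space, or a comma inside a value is not, so it is worth checking the column count of each row afterwards.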
Answered By - CherryDT Answer Checked By - Willingham (PHPFixing Volunteer)