Issue
My current understanding of the quotechar parameter is that it surrounds the fields that are separated by commas. I'm reading the csv documentation for Python and have written code similar to theirs:
import csv

with open("test.csv", newline="") as file:
    reader = csv.reader(file, delimiter=",", quotechar="|")
    for row in reader:
        print(row)
My csv file contains the following:
|Hello|,|My|,|name|,|is|,|John|
The output gives a list of strings as expected:
['Hello', 'My', 'name', 'is', 'John']
The problem arises when there is whitespace next to the commas in my csv file. For example, if I have a space after the closing | of a field, like this:
|Hello| ,|My| ,|name| ,|is| ,|John|
It gives the same output as before, but now a trailing space is included in the affected strings in the list:
['Hello ', 'My ', 'name ', 'is ', 'John']
It was my understanding that the quotechar parameter would only consider what was between the | symbols. Any help is greatly appreciated!
Solution
The quotechar argument is described in the csv documentation as:
A one-character string used to quote fields containing special characters, such as the delimiter or quotechar, or which contain new-line characters. It defaults to '"'.
For example, if your csv file contains data of the form
|Hello|,|My|,|name|,|is|,|"John"|
|Hello|,|My|,|name|,|is|,|"Tom"|
then you can't use the default quotechar, which is ", because it already appears inside the fields. To tell the csv reader that "John" should be kept exactly as it is in the output, specify some other quotechar, such as | or ;, or any other character that suits your data.
With quotechar set to |, the output includes John and Tom in quotation marks:
['Hello', 'My', 'name', 'is', '"John"']
['Hello', 'My', 'name', 'is', '"Tom"']
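A minimal, self-contained sketch of this first example, assuming the two rows above are held in an in-memory string (via io.StringIO) rather than in a file. The variable names are illustrative only:

```python
import csv
from io import StringIO

# Sample rows from the example above, kept in memory instead of a file.
data = '|Hello|,|My|,|name|,|is|,|"John"|\n|Hello|,|My|,|name|,|is|,|"Tom"|\n'

# With quotechar="|", the double quotes are ordinary characters and are kept.
rows = list(csv.reader(StringIO(data), delimiter=",", quotechar="|"))
for row in rows:
    print(row)
```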
Consider another example where a csv field itself contains the delimiter. Suppose the csv file contains:
"Fruit","Quantity","Cost"
"Strawberry","1000","$2,200"
"Apple","500","$1,100"
In this case the quotechar is what lets the csv reader distinguish between an actual delimiter (control character) and a comma that is part of a field (literal character). Because the fields here are wrapped in ", the default quotechar already works, so you don't have to pass it explicitly.
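As a rough illustration of that distinction, the sketch below parses the same sample twice: once with the defaults, and once with quoting=csv.QUOTE_NONE so that " is treated as a literal character. The in-memory string and variable names are assumptions made for the example:

```python
import csv
from io import StringIO

data = '"Fruit","Quantity","Cost"\n"Strawberry","1000","$2,200"\n"Apple","500","$1,100"\n'

# No quotechar argument: the default '"' keeps "$2,200" as a single field.
quoted_rows = list(csv.reader(StringIO(data)))

# For contrast, QUOTE_NONE disables quote handling, so the comma inside
# the cost is seen as a delimiter and splits the field in two.
unquoted_rows = list(csv.reader(StringIO(data), quoting=csv.QUOTE_NONE))

print(quoted_rows[1])
print(unquoted_rows[1])
```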
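Now, coming to your code: quotechar does not strip characters that follow the closing quote, so the space before the delimiter is kept as part of the field. To get rid of it, replace the extra whitespace before the delimiter in the csv file with an empty string. You can do this in the following way: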
import csv
from io import StringIO

with open("test.csv", newline="") as f:
    # Drop the space before each delimiter, then parse the cleaned text
    file = StringIO(f.read().replace(" ,", ","))
    reader = csv.reader(file, delimiter=",", quotechar="|")
    for row in reader:
        print(row)
This outputs:
['Hello', 'My', 'name', 'is', 'John']
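As an alternative sketch, the stray whitespace could be stripped after parsing instead of before. This assumes no field needs to keep leading or trailing spaces, since str.strip() also removes spaces that were inside the quotes. The in-memory sample stands in for test.csv:

```python
import csv
from io import StringIO

# Stand-in for the contents of test.csv from the question.
data = "|Hello| ,|My| ,|name| ,|is| ,|John|\n"

reader = csv.reader(StringIO(data), delimiter=",", quotechar="|")

# Strip surrounding whitespace from every parsed field.
cleaned = [[field.strip() for field in row] for row in reader]
print(cleaned)
```

Unlike the text replacement, this leaves a " ," sequence that appears in the middle of a quoted field untouched.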
Answered By - Shubham Sharma Answer Checked By - Timothy Miller (PHPFixing Admin)