PHPFixing

Friday, August 26, 2022

[FIXED] How does the quotechar parameter of the csv reader function work?

csv, python

Issue

My current understanding of the quotechar parameter is that it surrounds the fields that are separated by a comma. I'm reading the csv documentation for Python and have written code similar to theirs:

import csv
with open("test.csv", newline="") as file:
    reader = csv.reader(file, delimiter=",", quotechar="|")
    for row in reader:
        print(row)

My csv file contains the following:

|Hello|,|My|,|name|,|is|,|John|

The output gives a list of strings as expected:

['Hello', 'My', 'name', 'is', 'John']

The problem arises when I have whitespace between a field and the following comma in my csv file. For example, if I have a space after the closing | of a field, like this:

|Hello| ,|My| ,|name| ,|is| ,|John|

It gives the same output as before, but now the space is included in the strings in the list:

['Hello ', 'My ', 'name ', 'is ', 'John']

It was my understanding that the quotechar parameter would only consider what was between the | symbols. Any help is greatly appreciated!


Solution

The quotechar argument is

A one-character string used to quote fields containing special characters, such as the delimiter or quotechar, or which contain new-line characters. It defaults to '"'.

For example, if your csv file contains data of the form

|Hello|,|My|,|name|,|is|,|"John"|
|Hello|,|My|,|name|,|is|,|"Tom"|

then the fields are wrapped in | and some of them contain literal " characters. The default quotechar, which is ", does not describe this file, so you pass quotechar="|" to tell the csv reader that | marks the start and end of a field and that "John" should be kept as it is in the output. In general, quotechar must be whichever character the file actually uses to wrap its fields.

The output now includes John and Tom in quotation marks:

['Hello', 'My', 'name', 'is', '"John"']
['Hello', 'My', 'name', 'is', '"Tom"']
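To check this without creating a file, here is a minimal sketch that parses the same two lines from an in-memory string. The variable names (data, rows_pipe, rows_default) are only illustrative. It also parses the data with the default quotechar for comparison, in which case the | characters are treated as ordinary data.

```python
import csv
from io import StringIO

# Sample data from the example above: fields wrapped in |, with literal " inside.
data = '|Hello|,|My|,|name|,|is|,|"John"|\n|Hello|,|My|,|name|,|is|,|"Tom"|\n'

# quotechar="|" matches how the file wraps its fields, so | is stripped
# and the " characters are kept as part of the data.
rows_pipe = list(csv.reader(StringIO(data, newline=""), delimiter=",", quotechar="|"))

# With the default quotechar ('"'), | is just another character in the field.
rows_default = list(csv.reader(StringIO(data, newline="")))

for row in rows_pipe:
    print(row)
for row in rows_default:
    print(row)
```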

Consider another example, where a csv field itself contains the delimiter. Suppose the csv file contains:

"Fruit","Quantity","Cost"
"Strawberry","1000","$2,200"
"Apple","500","$1,100"

In this case the quotes let the csv reader distinguish the actual delimiter (a control character) from a comma that is part of a field's data (a literal character), as in $2,200. Because these fields are wrapped in ", which is the default quotechar, you do not need to pass quotechar explicitly here.
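A small sketch of the fruit example, again reading from an in-memory string rather than a file (the names data, rows and rows_unquoted are illustrative). For contrast, it also parses the data with quoting=csv.QUOTE_NONE, which turns off quote handling so the comma inside $2,200 is treated as a delimiter.

```python
import csv
from io import StringIO

data = (
    '"Fruit","Quantity","Cost"\n'
    '"Strawberry","1000","$2,200"\n'
    '"Apple","500","$1,100"\n'
)

# Default settings: delimiter="," and quotechar='"'.
# Commas inside the quotes stay part of the field.
rows = list(csv.reader(StringIO(data, newline="")))

# Quote handling disabled: every comma splits, and the quotes stay in the data.
rows_unquoted = list(csv.reader(StringIO(data, newline=""), quoting=csv.QUOTE_NONE))

print(rows[1])           # three fields, cost kept whole
print(rows_unquoted[1])  # four fields, cost split at the comma
```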


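Now, coming to your code: the csv reader does not discard characters that appear between the closing quotechar and the delimiter. In its default, non-strict mode it appends them to the field, which is why the trailing space shows up in the output. One fix is to remove the extra whitespace before each delimiter before parsing the file: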

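import csv
from io import StringIO

with open("test.csv", newline="") as f:
    # Drop the space before each comma. Note that this also changes
    # any " ," sequence that appears inside a quoted field.
    file = StringIO(f.read().replace(" ,", ","))
    reader = csv.reader(file, delimiter=",", quotechar="|")
    for row in reader:
        print(row)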

This outputs:

['Hello', 'My', 'name', 'is', 'John']
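As an alternative to editing the raw text, here is a sketch that strips whitespace from each field after parsing. This is not part of the original answer, and it assumes you never want leading or trailing whitespace in a field, including whitespace inside the | quotes. The sketch also shows strict=True, which makes the reader raise csv.Error on input like this instead of silently appending the space. The names data, rows and strict_failed are illustrative.

```python
import csv
from io import StringIO

# Same content as the problematic test.csv: a space after each closing |.
data = "|Hello| ,|My| ,|name| ,|is| ,|John|\n"

# Option 1: parse as usual, then strip whitespace from every field.
reader = csv.reader(StringIO(data, newline=""), delimiter=",", quotechar="|")
rows = [[field.strip() for field in row] for row in reader]
print(rows)

# Option 2: detect the malformed input instead of cleaning it up.
strict_failed = False
try:
    list(csv.reader(StringIO(data, newline=""), delimiter=",",
                    quotechar="|", strict=True))
except csv.Error as exc:
    strict_failed = True
    print("strict mode rejected the input:", exc)
```

Note that skipinitialspace=True does not help here: it only ignores whitespace immediately after a delimiter, not before one.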


Answered By - Shubham Sharma
Answer Checked By - Timothy Miller (PHPFixing Admin)