Dear Lazyweb,
I have a CSV file: fields surrounded by double quotes, comma-separated. There are a bit more than 230K rows.
Some of the fields have hard returns in them, which breaks my rows, and my importer (phpMyAdmin) chokes.
What’s the easiest way to remove only those hard returns? I’m thinking something like a fancy regex in vim to remove hard returns from the ends of lines that don’t end in double quotes, but my regex-fu isn’t that awesome.
Anyone know how to do that, or have a better idea?
#!/usr/bin/perl
use strict;
use warnings;

# A line is complete when it is nothing but fully quoted, comma-separated fields.
sub is_legit_csv {
    return $_[0] =~ /^("[^"]*",)*("[^"]*")\s*$/;
}

while (my $line = <>) {
    # Keep pulling in the next physical line until the quotes balance out.
    while (not is_legit_csv($line)) {
        chomp $line;
        my $next = <>;
        last unless defined $next;    # stop at end of file
        $line .= $next;
    }
    print $line;
}
===============
~ ed$ cat foobar.csv
"foo","bar
baz","bang"
"fim","fam","foom"
~ ed$ ./fixer.pl foobar.csv
"foo","barbaz","bang"
"fim","fam","foom"
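For what it's worth, a CSV-aware parser will also cope with quoted fields that contain hard returns, so there's no need to glue the physical lines back together by hand. A minimal sketch, assuming the Text::CSV module is installed (the field flattening and printing to STDOUT are just illustrative choices):

#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV;

# binary => 1 lets quoted fields contain embedded newlines;
# always_quote keeps the output in the same all-quoted style as the input.
my $csv = Text::CSV->new({ binary => 1, always_quote => 1, eol => "\n" });

open my $in, '<', 'foobar.csv' or die "can't open foobar.csv: $!";
while (my $row = $csv->getline($in)) {
    s/\n//g for @$row;            # strip the hard returns inside each field
    $csv->print(\*STDOUT, $row);
}
close $in;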
Looks like Ed beat me to it, but in vi, you should be able to do this:
:%sm/\([^"]\)\n/\1 /gc
It doesn't do any fancy checking as to whether it's valid CSV or not; it just replaces each newline not preceded by a " with a space.
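For anyone without vim handy, roughly the same substitution can be run from the shell as a Perl one-liner (a sketch only; fixed.csv is just a placeholder output name):

perl -0777 -pe 's/([^"])\n/$1 /g' foobar.csv > fixed.csv

The -0777 switch slurps the whole file into one string so the substitution can see the newlines; like the vim command, it replaces each newline not preceded by a double quote with a space.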
Mine wasn’t quite right, but Topher and I are hashing it out in IMs. 🙂
Can you post whether or not this is solved? I don't know what the heck Vim is or anything 😉 but I can dance with you regex boys to an extent.
Ed’s method may have worked, but was taking a very long time.
Mark’s did the trick with this:
:%sm/\([^"]\)\n/\1 /g
in about 3/4 of a second. Vim rocks.