CSV Parser in Ruby
So, I have written like a million versions of these in different languages. Then I found out that Ruby has a built-in CSV parser, so I said cool. I read through the Pickaxe, got the info, plugged it in and boom, no worky. I hate it when things no worky. So I spent the next hour on Google looking for how to fix it. Nobody else seems to be having the problem I am having, which is that it just doesn't work. Weird, but fuck it, I have written enough of them to do it my damn self. All I wanted was to parse a CSV and keep the column headers, so that I could dynamically create ActiveRecord objects in my Rails app and not have to worry about enforcing column order or parsing it in the controller. It took me about 20 minutes to write, and yes, it shows. Here's the kicker: it works.
That is, out of the box, plug and play, whatever: it works for me, so maybe it will work for you. If you would rather stick with the poorly documented and wholly mystical Ruby version, you might be better off, but I like doing things myself. That way I completely understand how it works and why, and if I want to make a change I can.
Anyway, so here it is:
Using a csv like this for the example:

"Song Name","Artist","Album","Duration"
"Girls And Boys","Good Charlotte","The Young and the Hopeless",03:05
"Meet Virginia","Train","Unknown",03:59
"Sacrifice","Elton John","Unknown",05:06
class Kcsv
  def initialize(file, options = {})
    @file = file
    @options = options
    # nil (rather than false) means "no header row", so the checks in parse work
    @headers = options[:header] ? parse_header_line : nil
    @children = []
    parse
  end

  def parse
    @file.rewind
    @file.each_line.with_index do |line, i| # looping through each line of the file
      next if i.zero? && @headers # skip the first line when it holds the headers
      row = {} # the row
      # A line without a comma splits into a single column, so one loop covers both cases.
      # Note: a plain split breaks quoted fields that contain a comma.
      line.split(',').each_with_index do |column, x| # x is our position in the headers
        if @headers
          row[@headers[x][0]] = clean(column) unless @headers[x].nil?
        else
          row[x] = clean(column)
        end
      end
      @children << row
    end
    true
  end

  def to_a
    @children
  end

  protected

  def clean(string)
    string.gsub('"', '').strip
  end

  def parse_header_line
    headers = []
    # left over from my Rails app; only consulted for single-column files
    accepted_headers = ["ebay_id", "image", "price", "quantity", "title", "desc", "id", "action", "is_live", "is_type"]
    @file.rewind
    line = @file.gets # only the first line matters here
    return headers if line.nil?
    if !line.include?(',')
      name = clean(line)
      headers << (accepted_headers.include?(name) ? [name, 0] : [0, 0])
    else
      line.split(',').each_with_index do |col, x|
        headers << [clean(col), x]
      end
    end
    headers
  end
end

# this is just to test that it works as expected, returning an array of hashes with to_a
csv = Kcsv.new(File.open('songs.csv', 'r'), :header => true)
csv.to_a.each do |row|
  puts "#{row["Song Name"]} by #{row["Artist"]} from #{row["Album"]} -- #{row["Duration"]}"
end
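Since each row comes back as a hash keyed by header name, building objects from it does not depend on column order, which was the whole point. Here is a rough sketch of that idea. Note the assumptions: `Song` is a made-up Struct standing in for an ActiveRecord model, `attribute_name` and `build_songs` are hypothetical helpers, and `rows` is written by hand to look like what `to_a` returns (with the columns deliberately shuffled).

```ruby
# Song is a hypothetical stand-in for an ActiveRecord model.
Song = Struct.new(:song_name, :artist, :album, :duration, keyword_init: true)

# Turn a header like "Song Name" into an attribute name like :song_name.
def attribute_name(header)
  header.downcase.gsub(/\s+/, '_').to_sym
end

# Build one Song per row hash; the key names decide where each value goes,
# so the order of the columns in the file does not matter.
def build_songs(rows)
  rows.map do |row|
    attrs = row.each_with_object({}) do |(header, value), result|
      result[attribute_name(header)] = value
    end
    Song.new(**attrs)
  end
end

# Hand-written row in the shape Kcsv#to_a returns, columns out of order.
rows = [
  { "Artist" => "Train", "Song Name" => "Meet Virginia", "Duration" => "03:59", "Album" => "Unknown" }
]

songs = build_songs(rows)
puts "#{songs.first.song_name} by #{songs.first.artist}"
```

With a real model you would presumably pass the same attribute hash to the model's constructor instead of the Struct.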
Anyway, there it is, have some fun. There may be a better or faster way; however, this is working, so I am happy enough.
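One thing the `split(',')` approach does not handle is a quoted field that contains a comma. Below is a minimal sketch of one way around that, not a full CSV implementation. `split_csv_line` is a hypothetical helper that is not part of the class above, and the sample line is made up for illustration. It also does not handle escaped quotes (`""`) inside a field.

```ruby
# split_csv_line is a hypothetical helper, not part of Kcsv.
# It walks the line one character at a time and only treats a comma
# as a separator when it sits outside double quotes.
def split_csv_line(line)
  fields = []
  current = +''
  in_quotes = false

  line.chomp.each_char do |char|
    if char == '"'
      in_quotes = !in_quotes # toggle quoting; the quote itself is dropped
    elsif char == ',' && !in_quotes
      fields << current.strip
      current = +''
    else
      current << char
    end
  end

  fields << current.strip # the last field has no trailing comma
  fields
end

# Made-up sample line with a comma inside the first quoted field.
p split_csv_line('"Last, First","Some Artist","Unknown",01:23')
```

Swapping something like this in for `line.split(',')` would also make the separate quote-stripping step unnecessary, since the quotes are dropped while splitting.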