CSV Parser in Ruby

So, I have written like a million versions of these in different languages, and then I found out that Ruby has a built-in CSV parser, so I said cool. I read through the Pickaxe, got the info, plugged it in and boom, no worky. I hate it when things no worky. I spent the next hour on Google looking for a fix, but nobody else seems to be having my problem, which is that it just doesn't work. Weird, but whatever, I have written enough of these to do it my damn self. All I wanted was to parse a CSV and keep the column headers, so that I could dynamically create ActiveRecord objects in my Rails app without having to worry about enforcing column order or parsing it in the controller. It took me about 20 minutes to write, and yes, it shows. Here's the kicker: it works.

That is, out of the box, plug and play, whatever: it works for me, so maybe it will work for you. Unless you want to stick with the poorly documented and wholly mystical Ruby version, that is; you might be better off. But I like doing things myself. That way I completely understand how it works and why, and if I want to make a change, I can.

Anyway, so here it is:

Using a CSV like this for the example:

"Song Name","Artist","Album","Duration"
"Girls And Boys","Good Charlotte","The Young and the Hopeless",03:05
"Meet Virginia","Train","Unknown",03:59
"Sacrifice","Elton John","Unknown",05:06
class Kcsv
	def initialize(file, options = {})
		@file = file
		@options = options
		# Read the header row only when asked to; nil means "key columns by index".
		@headers = options[:header] ? parse_header_line : nil
		@children = []
		parse
	end
 
	def parse
		@file.rewind

		@file.each_line.with_index do |line, i|
			# Skip the first line only when it was read as the header row.
			next if i == 0 && @headers
			next if line.strip.empty? # ignore blank lines

			row = {}
			# A line without commas splits into one column, so no special case is needed.
			# NOTE: a plain split does not handle commas inside quoted fields.
			line.split(',').each_with_index do |column, x|
				# Key by header name when there is one for this position, otherwise by index.
				key = (@headers && @headers[x]) ? @headers[x][0] : x
				row[key] = clean(column)
			end
			@children << row
		end
		true
	end
 
	def to_a
		@children
	end
 
	protected
	def clean(string)
		return string.gsub('"','').strip
	end
	def parse_header_line
		# Header names accepted when the file has only a single column.
		accepted_headers = ["ebay_id","image","price","quantity","title","desc","id","action","is_live","is_type"]
		@file.rewind
		first_line = @file.gets # only the first line holds headers
		return [] if first_line.nil? # empty file

		if not first_line.include? ','
			# Single column: keep the name if it is accepted, otherwise fall back to the index.
			name = clean(first_line)
			return [[accepted_headers.include?(name) ? name : 0, 0]]
		end
		# Each header is stored as a [name, position] pair.
		first_line.split(',').each_with_index.map { |col, x| [clean(col), x] }
	end
end
# This is just to test that it works as expected: to_a returns an array of hashes keyed by header name.
File.open('songs.csv', 'r') do |file|
	csv = Kcsv.new(file, :header => true)
	csv.to_a.each do |row|
		puts "#{row["Song Name"]} by #{row["Artist"]} from #{row["Album"]} -- #{row["Duration"]}"
	end
end

Anyway, there it is, have some fun. There may be a better or faster way, but this is working, so I am happy enough.
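
For comparison, and assuming a newer Ruby (1.9 or later, where the standard `csv` library handles header rows through the `headers: true` option), the same array of hashes can probably be had like the sketch below. The second data row is invented to show the one thing a plain `split(',')` gets wrong: a comma inside a quoted field.

```ruby
require 'csv'

# Sample data; the second row is a made-up example with a comma inside quotes.
data = <<~CSV_TEXT
  "Song Name","Artist","Album","Duration"
  "Girls And Boys","Good Charlotte","The Young and the Hopeless",03:05
  "Title, With Comma","Some Artist","Unknown",01:23
CSV_TEXT

# headers: true makes the first line the header row; each row then acts like a hash.
rows = CSV.parse(data, headers: true).map(&:to_h)

rows.each do |row|
  puts "#{row["Song Name"]} by #{row["Artist"]} from #{row["Album"]} -- #{row["Duration"]}"
end

# The naive approach cuts the quoted title in two, giving one column too many.
naive = data.lines[2].split(',')
puts "split(',') found #{naive.length} columns, the CSV library found #{rows[1].length}"
```

So if the data can ever contain commas inside quotes, the library version is likely the safer bet; for simple files like the one above, the hand-rolled class does the job.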
