Zipfiles, CSV and UUIDs

MethodBox currently uses a Java servlet to create Comma Separated Value (CSV) files inside a zip file.  These are then sent back to the Rails application on demand over HTTP.  What we needed was a metadata file inside the zip file explaining what the CSV files contain.  The easiest way to get the metadata was from the ActiveRecord records describing the CSV columns.

For the zipfile manipulation thanks go to this post and the original rubyzip API.

So, first make the HTTP request:

require 'net/http'

http = Net::HTTP.new("localhost", 10000)
http.read_timeout = 6000
response = http.get("/csvserver/download/" + filename)

#check that the request was OK
if response.is_a?(Net::HTTPOK)

#check what the response contains

if response.content_type == 'application/zip'

#make a new file to hold the zipped data

uuid = UUIDTools::UUID.random_create.to_s

#you might need to install the uuidtools gem

FileUtils.mkdir_p(RAILS_ROOT + "/" + "filestore" + "/" + filename)
#open in binary mode ("wb") so the zip data is written unchanged
uf = File.open(RAILS_ROOT + "/" + "filestore" + "/" + filename + "/" + uuid + ".zip", "wb")

#write the content in the response out to the zip file
uf.write(response.body)
uf.close
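
The save step above can be pulled into one small method. This is only a sketch: store_zip is a hypothetical name, and SecureRandom.uuid from the standard library stands in for the uuidtools gem so that the example has no dependencies.

```ruby
require 'fileutils'
require 'securerandom'

# Write raw zip bytes to <base_dir>/<name>/<uuid>.zip and return the path.
# store_zip is a hypothetical helper name, not part of MethodBox.
def store_zip(base_dir, name, body)
  dir = File.join(base_dir, name)
  FileUtils.mkdir_p(dir) # also creates any missing parent directories
  path = File.join(dir, "#{SecureRandom.uuid}.zip")
  # 'wb' stops newline translation from corrupting binary data on Windows;
  # the block form closes the file even if the write raises.
  File.open(path, 'wb') { |f| f.write(body) }
  path
end
```

In the controller this would be called as something like store_zip(File.join(RAILS_ROOT, "filestore"), filename, response.body).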

After this, the metadata is built as a string called metadata, which can then be added to the zip we just created from the HTTP response.  This next line writes the metadata string into the zip as a file called metadata.txt.

Zip::ZipFile.open(RAILS_ROOT + "/" + "filestore" + "/" + filename + "/" + uuid + ".zip", Zip::ZipFile::CREATE) { |zip| zip.get_output_stream("metadata.txt") { |f| f.puts metadata } }
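
How the metadata string gets built is not shown here, so the following is purely an illustration of one way it could be done. Column is a hypothetical stand-in for the ActiveRecord model describing a CSV column, and the "name: description" layout is an assumption, not the format MethodBox actually writes.

```ruby
# Hypothetical stand-in for the ActiveRecord model that describes a CSV column.
Column = Struct.new(:name, :description)

# Build the text for metadata.txt: one "name: description" line per column.
# The layout is assumed for illustration only.
def build_metadata(columns)
  columns.map { |c| "#{c.name}: #{c.description}" }.join("\n")
end

columns = [
  Column.new('age', 'Age of respondent in years'),
  Column.new('sex', 'Sex of respondent')
]
metadata = build_metadata(columns)
```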

We can then send the completed file back to the requester

send_file RAILS_ROOT + "/" + "filestore" + "/" + filename + "/" + uuid + ".zip", :filename => "something" + ".zip", :type => @archive.content_type, :disposition => 'attachment'

I have also been looking at using Ruby to do all the CSV file creation, but so far speed is proving to be an issue.  However, I found the following link which mentions several libraries.  I am now looking at using a combination of FasterCSV and CSVScan (also available as a gem).  This enables me to write incredibly complex code like this (don’t you just love Ruby’s blocks!):

CSVScan.scan(infile) { |row| line = FCSV.parse(row[0], :col_sep => "\t"); new_row = Array.new; if i == 0 then variable_hash[key].each { |var| pos.push(line[0].rindex(var)) } end; i = i + 1; pos.each { |col| val = line[0][col]; new_row.push(val) }; csv_arr.push(new_row) }

Yeah, it scared me as well!!  It parses a CSV file for several headers, pulls out the columns that correspond to those headers and adds them to an array.  In MethodBox these files can have 2,000 headers (i.e. columns) and 20,000 rows, so you can see why speed matters when pulling out only a subset of them.
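
For comparison, here is a more readable sketch of the same header-matching idea. It assumes Ruby 1.9 or later, where FasterCSV became the standard csv library, and it has not been benchmarked against the CSVScan version. extract_columns is a hypothetical name, and the input is assumed to be tab-separated with the headers in the first row.

```ruby
require 'csv'

# Return every row of a tab-separated file, cut down to the wanted headers.
# The header row itself is kept as the first row of the result.
# extract_columns is a hypothetical helper name.
def extract_columns(path, wanted)
  positions = nil
  result = []
  CSV.foreach(path, col_sep: "\t") do |row|
    # Only the first row is used to work out where each wanted header sits;
    # headers that are not present in the file are skipped.
    positions ||= wanted.map { |name| row.rindex(name) }.compact
    result << row.values_at(*positions)
  end
  result
end
```

The columns come back in the order the headers were asked for, not the order they appear in the file.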

We can then use FasterCSV to write out the new CSV file:

FasterCSV.open(outfile, "w") do |csv_file| csv_arr.each { |csv| csv_file << csv } end
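
On Ruby 1.9 and later the same write can be done with the standard csv library, which is FasterCSV under a new name. A minimal sketch, with write_rows as a hypothetical helper name:

```ruby
require 'csv'

# Write an array of rows (each an array of fields) to a CSV file.
# write_rows is a hypothetical helper name.
def write_rows(path, rows)
  CSV.open(path, 'w') do |csv|
    rows.each { |row| csv << row } # fields are quoted automatically when needed
  end
end
```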