Working With CSV Files in ColdFusion

I have been working on few applications that use CSV files quite heavily. I wish people will move into something more sensible like XML when transfering data, but CSV seems to have some magical qualities everybody like. But ColdFusion do not have native function dedicated to dealing with CSV files.

Option 1:
<CFHTTP> have capacity to read a properly formatted CSV into a Query object and it is supper fast. It took 5 tick counts to read my CSV file of 700 rows in the localhost and return as query.

   1: <cfhttp url="urlto.csv" name="q">
But this have a crippling limitation. This uses first row of the CSV as column headers and first row must contains words that can correspond with a legal variable name.  That means if your first row constrains a word like "Phone #" this fails flat on the face and you could not use this on any CSV file that you have not pre approved. 

Option 2:
Ben Nadel's CSV to Query function is a mature and solid piece of code and it does not suffer from <CFHTTP> limitations. I been using this function in few applications over the years and it never gave any trouble.

   1: <cffile action="read" file="#ExpandPath('1.csv')#" variable="myfile">
   2:  <cfset q = CSVToQuery(myfile)>
Show/Hide Line Numbers . Full Screen . Plain

This took 380 tick count to complete. If you do not have to process extremely large CSV files, this is a fine solution.

Option 3: 
If you have to work with large files and speed is critical, jumping into Java always helps. Opencsv completed the same task in 70 Tick counts. Not as fast as <CFHTTP> but it won't fail on unexpected CSV files.

   1: <cffunction name="CSVtoQuery" access="remote" output="No">
   2:  <cfargument name="file" required="yes" type="string">
   3:  <cfargument name="columnlist" required="No" type="string" default="" hint="[Empty] Take the first row as the column header, [Auto] create new column names [Column list]">
   4:  <cfargument name="fixColumn" required="No" type="boolean" default="Yes" hint="If columnlist taken from the first row of the file, validate names & fix it">
   5: 
   6:  <cfset local.fileReader = createobject("java","java.io.FileReader").init("#arguments.file#")>
   7:  <cfset local.csvReader = createObject("java","au.com.bytecode.opencsv.CSVReader").init(fileReader, ',', '"', chr(1), false)>
   8:  <cfset local.array = csvReader.readAll()>
   9: 
  10:  <!--- handle the column name --->
  11:  <cfswitch expression="#arguments.columnlist#">
  12:  <cfcase value="">
  13:  <cfset local.clm = local.array[1]>
  14:  <cfset local.start = 2>
  15:  </cfcase>
  16:  <cfcase value="auto">
  17:  <cfset local.clm = ArrayNew(1)>
  18:  <cfloop from="1" to="#ArrayLen(local.array[1])#" index="i">
  19:  <cfset ArrayAppend(local.clm,'col_#i#')>
  20:  </cfloop>
  21:  <cfset local.start = 1>
  22:  </cfcase>
  23:  <cfdefaultcase>
  24:  <cfset local.clm = ListToArray(arguments.columnlist)>
  25:  <cfset local.start = 1>
  26:  </cfdefaultcase>
  27:  </cfswitch>
  28:  <cfset local.clms = ArrayLen(local.clm)>
  29: 
  30:  <cfif YesNoFormat(arguments.fixColumn)>
  31:  <!--- validate/fix column names --->
  32:  <cfloop from="1" to="#local.clms#" index="i">
  33: <cfset local.clm[i] = rereplacenocase(trim(local.clm[i]),' |##|"|""|',"",'all')>
  34:  <cfif not refindnocase('^[a-zA-Z_][a-zA-Z0-9_]*$',local.clm[i])>
  35:  <cfset local.clm[i] = 'col_#i#'>
  36:  </cfif>
  37:  </cfloop>
  38:  </cfif>
  39:  <cfset local.q = QueryNew( ArrayToList( local.clm ) )>
  40: <!--- convert array to query --->
  41:  <cfloop from="#local.start#" to="#ArrayLen(local.array)#" index="i">
  42:  <cfset QueryAddRow(local.q)>
  43:  <cfloop from="1" to="#local.clms#" index="c">
  44: <cfif ArrayIsDefined(local.array[i],c)>
  45:  <cfset QuerySetCell(local.q,local.clm[c],ToString(local.array[i][c]))>
  46: </cfif>
  47:  </cfloop>
  48:  </cfloop>
  49: 
  50:  <cfreturn local.q>
  51: </cffunction>
Show/Hide Line Numbers . Full Screen . Plain

5 Comments :
Wouter
Wednesday 09 September 2015 01:52 PM
In CF8, I had to use explicit 'JavaCast()' calls to make the constructor call (still 'init()', of course - I was too hasty about that in my previous comment) work correctly. Also when using CF8, you have to remember that it is running on Java 6; you have to recompile 'opencsv' 3.5 or use an earlier version to avoid class version errors...
Wouter
Wednesday 09 September 2015 10:21 AM
About option 3: the 'opencsv' library is now (version 3.5) in the 'com.opencsv' package rather than 'au.com.bytecode.opencsv'. It also has new constructors: 'init()' won't cut it anymore. But I like this approach - thanks!
Wednesday 17 April 2013 10:31 PM
I do this a lot in MS SQL, which is very fast.
Wednesday 19 December 2012 04:18 AM
Hi Saman,

According to the (CF9) docs, you can control how cfhttp handles column names. If you know your CSV will contain invalid column names, you can specify your own using the "columns" attribute. If you then also set the "firstrowasheaders" attribute to true, it will ignore the first row containing the invalid names, and use your valid ones instead.

I haven't tried this, but I'd be interested in how fast that runs against your test CSV file.

Cheers
Julian (cfSimplicity).
Thursday 20 December 2012 11:26 AM
Hello Julian,
The only problem with that is we have to know how many columns are there in the CSV file before we call it. CFHTTP is perfect if we know the layout of the CSV file beforehand.