Comma-separated values (CSV) files are one of the most common data exchange formats. CSV is a “flat” format in which each row typically has the same layout. Each field, or data item, is separated from the others by a comma, and fields that contain commas are quoted.
The XML Pipeline Server offers high-performance Java and .NET data conversion components that convert CSV to XML and XML to CSV.
CSV Import and Export
CSV Import/Export is a feature of almost every program that deals with tabular data, from accounting programs like Intuit’s Quicken to office programs such as Microsoft Excel. It is a very old format, dating to the earliest days of computing. Even Radio Shack’s TRS-80 Model I used comma-separated data in DATA statements.
Despite the ubiquity of the CSV data format, implementations vary in a surprising number of ways. For this reason, XML Pipeline Server offers several switches that customize the converter's behavior for your specific application.
Convert CSV to XML
The following options are supported in the XML Pipeline Server™ CSV converter precisely because these details are not standardized in practice and vary between applications:
- Whether the first row of the CSV data contains the names of the fields, or they are known by the sending and receiving applications implicitly
- How commas within values are handled in CSV files
- How quotes within quoted values are handled in CSV files. Are they doubled to escape them, or is there a special escape character, such as a backslash?
- Whether single or double quotes are used
- Whether runs of consecutive but empty fields should be ignored
- What encoding is used – Windows-1252, ISO-8859-1 or US-ASCII are the most common
- What line ending is used – CR/LF from Windows, LF from Unix/Linux, or CR from the Macintosh are often seen
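To make the quoting and escaping options above concrete, here is a minimal sketch of a single-record parser in plain Java. It is not the XML Pipeline Server converter or its API; the class name `CsvLineParser` and its parameters are hypothetical, chosen only for illustration. The separator, the quote character, and the escape character are all parameters. Passing the quote character itself as the escape selects the "doubled quote" convention, while passing a backslash selects backslash escaping.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of XML Pipeline Server.
public class CsvLineParser {

    /**
     * Splits one record into fields.
     *
     * @param separator field delimiter, e.g. ',' for CSV
     * @param quote     quote character, e.g. '"' or '\''
     * @param escape    character that escapes a quote inside a quoted field;
     *                  pass the quote character itself for doubled quotes
     */
    public static List<String> parse(String line, char separator, char quote, char escape) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                boolean hasNext = i + 1 < line.length();
                if (c == escape && hasNext && line.charAt(i + 1) == quote) {
                    // Escaped quote: keep one literal quote and skip the next character.
                    current.append(quote);
                    i++;
                } else if (c == quote) {
                    // Closing quote ends the quoted section.
                    inQuotes = false;
                } else {
                    // Separators inside quotes are ordinary data.
                    current.append(c);
                }
            } else if (c == quote) {
                inQuotes = true;
            } else if (c == separator) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString()); // last field has no trailing separator
        return fields;
    }

    public static void main(String[] args) {
        // Double quotes, escaped by doubling.
        System.out.println(parse("1,\"Smith, John\",\"He said \"\"hi\"\"\"", ',', '"', '"'));
        // Single quotes, escaped with a backslash.
        System.out.println(parse("1,'It\\'s here',x", ',', '\'', '\\'));
    }
}
```

The sketch handles one record at a time and leaves encoding and line-ending detection to the caller, which is where the remaining options in the list would come into play.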
TSV Files
A close cousin of the CSV file is the TSV, or Tab Separated Value, file. TSV is also a common spreadsheet export format. There is a separate converter for it within XML Pipeline Server, but it is really just the CSV converter in disguise, since either format can be converted to the other by changing the separator character from comma to tab.
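The separator swap can be sketched in a few lines of plain Java. This is an illustration of the idea, not the XML Pipeline Server implementation; the class name `SeparatorSwap` is hypothetical. It assumes doubled-quote escaping and that unquoted fields never contain the target separator.

```java
// Hypothetical illustration only; not part of XML Pipeline Server.
public class SeparatorSwap {

    /**
     * Rewrites one record, replacing every separator that lies outside a
     * quoted field. Assumes quotes inside quoted fields are doubled.
     */
    public static String swap(String line, char from, char to, char quote) {
        StringBuilder out = new StringBuilder(line.length());
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == quote) {
                // A doubled quote toggles twice, so the state is unchanged after it.
                inQuotes = !inQuotes;
            }
            // Only separators outside quotes are structural; the rest is data.
            out.append(!inQuotes && c == from ? to : c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // CSV record to TSV record; the comma inside the quoted field is kept.
        System.out.println(swap("1,\"Smith, John\",42", ',', '\t', '"'));
    }
}
```

Calling `swap` with the arguments reversed (`'\t'` to `','`) goes the other way, under the same assumptions.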
CSV File Reading and Writing
Using the above file, a tiny Java, C# or VB.NET program can quickly transform the input CSV to XML, or convert XML to CSV. For example, the following Java program converts the above CSV sample file into XML and writes the output to the console.
XML to CSV
Going from XML to CSV is just as important, and the code is just as small. It takes no special schema, and it doesn’t even take XQuery or XSLT. Each element immediately under the root element becomes a row, and each element under a row becomes one of the values between the commas. In the above example, switching to the ConvertFromXML class and feeding in the sample XML would produce the initial CSV file. Just like the EDI converters and other converters in XML Pipeline Server, the CSV (and TSV) converters are fully bidirectional.
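The row-and-value mapping described above can be sketched with the standard Java DOM parser alone. This is not the ConvertFromXML class and does not use the XML Pipeline Server API; the class name `XmlToCsvSketch` and the element names in the sample (`table`, `row`, `id`, `name`) are assumptions made for illustration. The sketch writes no header row, uses CR/LF line endings, and quotes a value only when it contains a comma, a quote, or a line break.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration only; not part of XML Pipeline Server.
public class XmlToCsvSketch {

    /** Children of the root become rows; children of each row become values. */
    public static String toCsv(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
        StringBuilder csv = new StringBuilder();
        for (Element row : childElements(doc.getDocumentElement())) {
            List<String> values = new ArrayList<>();
            for (Element field : childElements(row)) {
                values.add(quoteIfNeeded(field.getTextContent()));
            }
            csv.append(String.join(",", values)).append("\r\n");
        }
        return csv.toString();
    }

    /** Returns only element children, skipping whitespace text and comments. */
    private static List<Element> childElements(Element parent) {
        List<Element> result = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                result.add((Element) n);
            }
        }
        return result;
    }

    /** Quotes a value that contains a comma, quote or line break, doubling inner quotes. */
    private static String quoteIfNeeded(String value) {
        if (value.contains(",") || value.contains("\"")
                || value.contains("\n") || value.contains("\r")) {
            return "\"" + value.replace("\"", "\"\"") + "\"";
        }
        return value;
    }

    public static void main(String[] args) {
        String xml = "<table>"
                + "<row><id>1</id><name>Smith, John</name></row>"
                + "<row><id>2</id><name>Jane</name></row>"
                + "</table>";
        System.out.print(toCsv(xml));
    }
}
```

Because the mapping relies only on document structure, the element names do not matter to the output; they would matter only if a header row were generated from them.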
How To Use CSV Files
CSV files are often used as a bridge between older applications and newer ones, so a tool that transparently reads and writes CSV files is a valuable addition to your toolbox. Whether you have to merge inventory data from an outside vendor into your database or publish a list of financial figures for your accountants, bridging these two worlds is the purpose of XML Pipeline Server.