.Net ramblings
# Tuesday, 24 May 2005
A few useful functions for importing and exporting data in asp.net. Excel, PDF, Datasets, DataGrids, etc.

the last few web projects i've been working on have had a lot of import/export requirements, and i've put together a class library that contains the functionality outlined below.  A lot of it i have collected from newsgroups and modified to my own purposes.

Import an Excel File

Imports all the worksheets in an excel file into a .Net dataset, with one datatable for each work sheet. Uses the ADOX COM component (reference Microsoft.ActiveX extensions 2.8 in VS, and adodb) and OleDb.  The reason i chose to use ADOX was because you need to know the worksheet names in the excel file if you're to query them with OleDb, and this isn't possible in my case.  So i use ADOX to iterate through the table names and create a new ADO.Net DataTable for each one.
I found some problems with excel documents that appeared to contain empty columns past the used range of cells, but OleDb complained about "Too many fields defined", which i presume is because it interprets all the excel columns that go on from A to XYZ or whatever, as proper columns, when we're only interested in using the used range of cells.  To overcome this, I opened the excel file and copied the range of cells into a new worksheet and deleted the old one, and it worked fine. 

Sample usage (with a Html File Control called 'fileToUpload' on the page):  

This sample code below uses the ImportExcel method in the class library code, to have the user browse to the excel file, and then upload it to the server and import it into a DataSet.

string fileName = ((System.Web.UI.HtmlControls.HtmlInputFile)this.fileToUpload).PostedFile.FileName;
if(fileName != "")
{
   try
   {
      // upload the excel file to a temp directory (needs write permissions)
      string uploadPath = Server.MapPath("/Temp/" + fileName.Remove(0,fileName.LastIndexOf("\\") + 1)));
      ((System.Web.UI.HtmlControls.HtmlInputFile)this.fileToUpload).PostedFile.SaveAs(uploadPath);

      // load the excel file into a dataset for processing
      DataSet ds = ImportExcel.ImportExcel(uploadPath);

Export Crystal Report To PDF

Converts a crystal report object into a PDF document that opens in Adobe Acrobat Reader.

Export a DataTable or DataView to Excel

This export method is similar to the common technique of binding a dataset to a datagrid/gridview and rendering the contents to produce a HTML table that Excel can understand. However the datagrid approach is not reliable if the data contains html characters, e.g. < or >, it produces invalid XML, which causes problems in Excel and OpenOffice. An alternative approach is to derive a GridView control that automatically sets HtmlEncode = true on all the BoundColumns, but this can produce very bloated output where non ASCII characters are represented and Excel will not decode the HtmlEncoded text.  I found the simplest approach is to parse the dataview and write out an XHTML table. This way the output is guaranteed to be valid XHTML, and compatible with Excel and OpenOffice (use the HtmlDocument filter when opening the file). 

In case you are worried about the performance of traversing the data like this, don't be, because it is sure to be less code than what happens inside the DataGrid class :)

Export a DataTable or DataView to CSV

this is similar to the above approach of parsing through the data and outputing the delimiters appropriately. there are regular expression based approaches, which i have tried before, but i found them unreliable when dealing with a complex character set, especially when trying to output in a format that both OpenOffice and Excel will be able to open.  This way i know i can trust, and it is lightning fast as well.

Comments?

if you have any questions on how to use it, or if you find bugs, or even better if you have some improvements... post a comment below. 
Enjoy.  Tim.

The Code

using System;
using System.Configuration;
using System.Data;
using System.Data.OleDb;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;
using System.Web;
using System.Web.Caching;
using System.Web.Security;
using System.Web.SessionState;
using System.Web.UI;
using System.Web.UI.WebControls;

namespace Tim.Library.WebForms
{
    /// <summary>
    /// Provides functionality to import and export datasets, datagrids, excel files etc.
    /// </summary>
    public class ImportExport
    {
        public enum ExportFormat{Excel, CSV};

        /// <summary>
        /// Imports all the worksheets in an excel file into a dataset,
        /// with one datatable for each sheet.
        /// </summary>
        public static DataSet ImportExcel(string path)
        {
            ADOX.CatalogClass cat = new ADOX.CatalogClass();
            // create an ADODB connection to use with the catalog
            ADODB.ConnectionClass connAdox = new ADODB.ConnectionClass();
            string connectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\"" + path + "\";Extended Properties=Excel 8.0;";
            // open the excel ADOX connection to get the table names
            connAdox.Open(connectionString, "admin","",0);
            cat.ActiveConnection = connAdox;
            DataSet ds = new DataSet();
            // create an OleDb connection to get data into ADO.Net
            OleDbConnection connOleDb = new OleDbConnection (connectionString);
            connOleDb.Open();
            foreach(ADOX.Table t in cat.Tables)
            {
                try
                {
                    string name = t.Name.Trim('_');
                    if(ds.Tables.Contains(name))
                        continue;    // avoid duplicate worksheet names... strange behaviour where multiple tables were added when only one sheet existed
                    OleDbCommand cmdSelect = new OleDbCommand (@"SELECT * FROM [" + name + "]", connOleDb);
                    OleDbDataAdapter dba = new OleDbDataAdapter();
                    dba.SelectCommand = cmdSelect;
                    DataTable dt = new DataTable(name);
                    dba.Fill(dt);
                    ds.Tables.Add(dt);
                }
                catch(Exception ex)
                {    
                    throw ex;
                }
            }
            connOleDb.Close();                
            connAdox.Close();
            return ds;
        }

        /// <summary>
        /// Opens a PDF window containing the specified crystal report object
        /// The Crystal DLLs must be deployed with the web app for this to work.
        /// </summary>
        /// <param name="rpt">The report object</param>
        /// <param name="filename">Should include include .pdf</param>
        public static void ExportCrystalReportToPDF(CrystalDecisions.CrystalReports.Engine.ReportClass rpt, string filename)
        {
            MemoryStream stream = (MemoryStream)rpt.ExportToStream(CrystalDecisions.Shared.ExportFormatType.PortableDocFormat);
            byte[] bytes = new Byte[stream.Length];
            stream.Read(bytes, 0, (int)stream.Length);
            stream.Close();
            HttpResponse response = HttpContext.Current.Response;
            response.Clear();
            response.ClearContent();
            response.ClearHeaders();
            response.Buffer= true;
            response.ContentType = "application/pdf";
            response.AddHeader("Content-Disposition", "attachment;filename=\"" + filename + "\"");
            response.BinaryWrite(bytes);
            response.End();
        }


/// <summary>
/// This method is an overload for the DataView version of the same name.
/// </summary>
public static void DataTableToCsv(DataTable dt, string filename, bool WriteToResponse)
{
DataView dv = new DataView(dt);
DataViewToCsv(dv, filename, WriteToResponse);
}

/// <summary>
/// Parses a dataview into a CSV format. I always use tab separated columns, with \n separated rows.
/// </summary>
/// <param name="dv">The data source</param>
/// <param name="filename">If WriteToResponse is true, this must be a file name, otherwise a full path+file name to save the file to</param>
/// <param name="WriteToResponse">if true, Response.Writes the output to the client browser,
/// otherwise writes the contents to the specified file path</param>
public static void DataViewToCsv(DataView dv, string filename, bool WriteToResponse)
{
char ColDelim = '\t';
char RowDelim = '\n';

using(StringWriter sw = new StringWriter())
{
// output the header row
foreach(DataColumn dc in dv.Table.Columns)
sw.Write(CsvEscape(dc.ColumnName) + ColDelim);
sw.Write(RowDelim);

foreach(DataRowView dr in dv)
{
foreach(object o in dr.Row.ItemArray)
sw.Write(CsvEscape(o.ToString()) + ColDelim);
sw.Write(RowDelim);
}

if(WriteToResponse)
{
HttpResponse response = HttpContext.Current.Response;
response.Clear();
response.Charset = System.Text.UTF8Encoding.UTF8.WebName;
response.ContentEncoding = System.Text.UTF8Encoding.UTF8;
response.AddHeader("Content-Disposition", String.Format("attachment; filename=\"{0}\";", filename));
response.ContentType = "text/txt";
response.Write(sw.ToString());
response.End();
}
else
{
File.WriteAllText(filename, sw.ToString());
}
}
}

/// <summary>
/// Strips out any row/col delimeters. This could be slightly destructive but not important in my case :)
/// </summary>
public static string CsvEscape(string s)
{
return Regex.Replace(s, "\r|\n|\t", "");
}

/// <summary>
/// This method is an overload for the DataView version of the same name.
/// </summary>
public static void DataTableToXhtmlTable(DataTable dt, string filename, bool WriteToResponse)
{
DataView dv = new DataView(dt);
DataViewToXhtmlTable(dv, filename, WriteToResponse);
}

/// <summary>
/// This export method is similar to the common technique of binding a dataset to a datagrid/gridview
/// and rendering the contents to produce a HTML table that Excel can understand. However the datagrid
/// approach is not reliable if the data contains html characters, e.g. < or >, it produces invalid XML,
/// which causes problems in Excel and OpenOffice.
/// An alternative approach is to derive a GridView control that automatically sets HtmlEncode = true on
/// all the BoundColumns, but this can produce very bloated output where non ASCII characters are represented
/// and Excel will not decode the HtmlEncoded text.
/// I found the simplest approach is to parse the dataview and write out an XHTML table. This way the
/// output is guaranteed to be valid XHTML, and compatible with Excel and OpenOffice (use the HtmlDocument filter).
/// </summary>
/// <param name="dv">The data source</param>
/// <param name="filename">If WriteToResponse is true, this must be a file name, otherwise a full path+file name to save the file to</param>
/// <param name="WriteToResponse">if true, Response.Writes the output to the client browser,
/// otherwise writes the contents to the specified file path</param>
public static void DataViewToXhtmlTable(DataView dv, string filename, bool WriteToResponse)
{
using(StringWriter sw = new StringWriter())
{
sw.Write("<table border=\"1\">\n");

// output the header row
sw.Write("<tr>\n");
foreach(DataColumn dc in dv.Table.Columns)
sw.Write("<th>{0}</th>\n", XmlEscape(dc.ColumnName));
sw.Write("</tr>\n");

foreach(DataRowView dr in dv)
{
sw.Write("<tr>\n");
foreach(object o in dr.Row.ItemArray)
sw.Write("<td>{0}</td>\n", XmlEscape(o.ToString()));
sw.Write("</tr>\n");
}
sw.Write("</table>\n");

if(WriteToResponse)
{
HttpResponse response = HttpContext.Current.Response;
response.Clear();
response.Charset = System.Text.UTF8Encoding.UTF8.WebName;
response.ContentEncoding = System.Text.UTF8Encoding.UTF8;
response.AddHeader("Content-Disposition", String.Format("attachment; filename=\"{0}\";", filename));
response.ContentType = "application/vnd.ms-excel";
response.Write(sw.ToString());
response.End();
}
else
{
File.WriteAllText(filename, sw.ToString());
}
}
}

/// <summary>
/// Replace < & > characters with their xml escaped equivalents
/// </summary>
public static string XmlEscape(string s)
{
s = Regex.Replace(s, "<", "&lt;");
s = Regex.Replace(s, ">", "&gt;");
s = Regex.Replace(s, "&", "&amp;");
return s;
}
}
}

Tuesday, 24 May 2005 23:45:00 (GMT Daylight Time, UTC+01:00)  #    Comments [7]  Asp.NetTracked by:
"ADOX + Excel: bogus worksheets" (Tim Mackey's Weblog) [Trackback]


Wednesday, 01 February 2006 19:41:43 (GMT Standard Time, UTC+00:00)
Hello,

Nice blog. Your Export-To-CSV option gives wings to my application. Do you have some simple way of converting a dataset to HTML without looping through each row?

THanks,

Raghu.
Raghu
Wednesday, 01 February 2006 20:23:17 (GMT Standard Time, UTC+00:00)
Hi raghu.
If you want the dataset in html, use a DataGrid!

If you just the html of the grid, try something like this:

DataGrid dg = new DataGrid();
dg.DataSource = ds.Tables[0].DefaultView;
dg.DataBind();
// get the text of the rendered datagrid
string dgText;
using (StringWriter sw = new StringWriter())
{
using (HtmlTextWriter htw = new HtmlTextWriter(sw))
{
// instantiate a datagrid
dg.RenderControl(htw);
gText = sw.ToString();
}
}

My preference would be just to create and bind the grid and then add it dynamically with Page.Controls.Add(dg);
Or something similar.

Hope this helps
Tim
Tim Mackey
Monday, 26 March 2007 07:23:16 (GMT Daylight Time, UTC+01:00)
Hi Tim,
Seeming that this function only use on Web base. Can you share with me function to export from dataset into CSV and Text. I try to use above function but I can not!

Thanks and best regards,
Dao
Le Trong Dao
Monday, 26 March 2007 11:25:12 (GMT Daylight Time, UTC+01:00)
hi Dao,
yes this code was designed for web application use. i have updated the article today, and removed the DataGrid element in favour of a more reliable (and faster) approach. take a look at the new functions DataViewToXhtmlTable and DataViewToCsv. You can remove the Response.Write parts of the code as they are not relevant for your windows app. it should compile no problem for you.
let me know if you have any trouble with it.
tim.
tim
Wednesday, 23 January 2008 19:05:01 (GMT Standard Time, UTC+00:00)
Hi Scott, with the code above to export a datatable/view to excel, I'm seeing that foreign characters are not being translated correctly.

Any thoughts on why or how to fix?

Thanks,
Matt
Matt
Wednesday, 23 January 2008 19:09:03 (GMT Standard Time, UTC+00:00)
Sorry TIM, I was looking at another post. I'm attempting to write a dataset containing foreign characters to the response stream with excel as the app to open. We tried the standard datagrid render control solution, but the users don't like the bold gridlines.

Your's is the closest thing I've seen so far that might keep the foreign chars intact, but when I open Excel, the chars are still coming out garbled.
Matt
Wednesday, 23 January 2008 21:17:46 (GMT Standard Time, UTC+00:00)
Hi Matt,
Not sure what to suggest here. You could try changing the response encoding, or else debug the export function and see if it is a HTML Encoding issue that might need specific decoding into Unicode before sending to the response.
Good luck with it
Tim
tim
OpenID
Please login with either your OpenID above, or your details below.
Name
E-mail
Home page

Comment (Some html is allowed: a@href@title, strike) where the @ means "attribute." For example, you can use <a href="" title=""> or <blockquote cite="Scott">.  

[Captcha]Enter the code shown (prevents robots):

Live Comment Preview