Search Suggest

Foreach Excel Worksheet Enumerator

Case
I have an Excel file with multiple identical worksheets (one foreach month) and I want to add the data to a single database table. Is there a foreach loop solution so that I don't need to add multiple data flows or sources.?

Solution
a) You could use a union all query in the Excel Source. Not very flexible, but very easy.
SELECT  Column1
, Column2
, Column3
FROM [Sheet1$]
UNION ALL
SELECT Column1
, Column2
, Column3
FROM [Sheet2$]
UNION ALL
SELECT Column1
, Column2
, Column3
FROM [Sheet3$]
b) You could loop through Excel tables by using the Foreach ADO.NET Schema Rowset enumerator. One downside: it returns both worksheets (which have the $ suffix) and named ranges. So you need an extra dummy task in your foreach loop with an expression on the precedence constraint to the next task. Something like: RIGHT(@[User::WorksheetName], 1) == "$"
c) You could use a Script Task to fill a SSIS object variable and use that to loop through. Let's elaborate that solution.

*update: Custom Excel Worksheet Enumerator.


1) Excel Connection Manager and Variables
Add an Excel Connection manager with a link to an Excel File with multiple Worksheets. I named mine "MyExcelFile". If you choose an other name, then make sure to also change that in the script task of step 3.

And we also need two SSIS variables. One object variable (ExcelWorksheets) to loop through in the foreach loop and one string variable (WorksheetName) to be filled by the foreach loop.
Two variables



















2) Script Task
Add a Script Task to your Control Flow and give it a suitable name. Edit the Script Task and add the SSIS Object variable "ExcelWorksheets" from step 1 as ReadWrite variable.
ReadWriteVariables






















3) The script
Edit the Script Task (open VSTA editor) and copy the following script.
// C# Code
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using System.Data.OleDb; // Added

namespace ST_0607edbf2c834733a551cd01039cb715.csproj
{
[System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{

#region VSTA generated code
enum ScriptResults
{
Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
};
#endregion

public void Main()
{
try
{
// Get connectionstring from Excel Connection Manager and use it to connect through OLE DB
OleDbConnection excelConnection = new OleDbConnection(Dts.Connections["MyExcelFile"].ConnectionString);
excelConnection.Open();

// Once connected, get the table schema and close the OLE DB connection
DataTable dtDatasetsInExcel = excelConnection.GetSchema("Tables");
excelConnection.Close();


// NOTE: The datatable from GetSchema includes both worksheets (which have the $ suffix)
// and named ranges. So we need to exclude those named ranges.

// Create a dataset.
DataSet dsWorksheetsInExcel = new DataSet();

// Create a new table in the dataset
DataTable dtWorksheetsInExcel = dsWorksheetsInExcel.Tables.Add();
dtWorksheetsInExcel.Columns.Add("WorksheetName", typeof(string));


// Loop through all tables and only get those ending with a $
foreach (DataRow drWorksheet in dtDatasetsInExcel.Rows)
{
// Check for $ suffix
if (drWorksheet["TABLE_NAME"].ToString().EndsWith("$"))
{
dtWorksheetsInExcel.Rows.Add(drWorksheet["TABLE_NAME"].ToString());
}
}

// Fire information event with the total number of worksheets
bool fireAgain = true;
Dts.Events.FireInformation(-1, "Foreach Worksheet", dtWorksheetsInExcel.Rows.Count.ToString() + " worksheets found.", string.Empty, 0, ref fireAgain);


// Fill SSIS Object variable with worksheet dataset.
Dts.Variables["User::ExcelWorksheets"].Value = dsWorksheetsInExcel;

// Success
Dts.TaskResult = (int)ScriptResults.Success;
}
catch (Exception ex)
{
// Something went wrong. Log error and fail Script Task
Dts.Events.FireError(-1,"Foreach Worksheet", ex.Message, string.Empty, 0);
Dts.TaskResult = (int)ScriptResults.Failure;
}
}
}
}


or VB.Net

' VB.Net code
Imports System
Imports System.Data
Imports System.Data.OleDb ' Added
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime

<System.AddIn.AddIn("ScriptMain", Version:="1.0", Publisher:="", Description:="" > _
<System.CLSCompliantAttribute(False)> _
Partial Public Class ScriptMain
Inherits Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase

Enum ScriptResults
Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success
Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
End Enum

Public Sub Main()
Try
' Get connectionstring from Excel Connection Manager and use it to connect through OLE DB
Dim excelConnection As New OleDbConnection(Dts.Connections("MyExcelFile").ConnectionString)
excelConnection.Open()

' Once connected, get the table schema and close the OLE DB connection
Dim dtDatasetsInExcel As DataTable = excelConnection.GetSchema("Tables")
excelConnection.Close()

' NOTE: The datatable from GetSchema includes both worksheets (which have the $ suffix)
' and named ranges. So we need to exclude those named ranges.

' Create a dataset.
Dim dsWorksheetsInExcel As New DataSet()

' Create a new table in the dataset
Dim dtWorksheetsInExcel As DataTable = dsWorksheetsInExcel.Tables.Add()
dtWorksheetsInExcel.Columns.Add("WorksheetName", GetType(String))

' Loop through all tables and only get those ending with a $
For Each drWorksheet As DataRow In dtDatasetsInExcel.Rows
' Check for $ suffix
If drWorksheet("TABLE_NAME").ToString().EndsWith("$") Then
dtWorksheetsInExcel.Rows.Add(drWorksheet("TABLE_NAME").ToString())
End If
Next

' Fire information event with the total number of worksheets
Dim fireAgain As Boolean = True
Dts.Events.FireInformation(-1, "Foreach Worksheet", dtWorksheetsInExcel.Rows.Count.ToString() & " worksheets found.", String.Empty, 0, fireAgain)

' Fill SSIS Object variable with worksheet dataset.
Dts.Variables("User::ExcelWorksheets").Value = dsWorksheetsInExcel

' Success
Dts.TaskResult = ScriptResults.Success
Catch ex As Exception
' Something went wrong. Log error and fail Script Task
Dts.Events.FireError(-1, "Foreach Worksheet", ex.Message, String.Empty, 0)
Dts.TaskResult = ScriptResults.Failure
End Try
End Sub
End Class



Worksheets and named ranges











4) Foreach Loop
Add a Foreach Loop Container to the Control Flow and give it a suitable name. Connect it to the Script Task from step 2.
Foreach Loop


























5) Foreach ADO enumerator
Edit the Foreach Loop Container and select the ADO enumerator in the collection tab. After that select the SSIS object variable from step 1 as ADO object source variable and select Rows in first table as the enumeration mode.
Foreach ADO enumerator
























6) Variable mappings
In the Variable mappings tab select the string variable from step 1. Now the Foreach Loop will fill this variable with the currect Worksheet name.
Variable mappings
























7) Excel Source
Add a Data Flow Task in the foreach loop and add a Excel Source to that Data Flow Task. Select the Excel Connection manager created in step 1 and select Table name or view name variable. Now you can select the variable that contains the Worksheet name.
Excel Source using variable for table name
























Note 1: hidden worksheets are not returned by the OLE DB provider.
Note 2: Worksheets are always sorted ascending on their name
Note 3: If you have different data structures, but worksheets all have some common columns, then you could create a query in a variable and use that as a source query. Something like "SELECT column1, column4 FROM " + @[User::WorksheetName]

إرسال تعليق