Search Suggest

Use filedates in SSIS

Case
A while ago someone asked me if it was possible to use the actual filedate of a file in SSIS.

Solution
Well, a Script task can retrieve the creation and modify date of a file. In this example I will import the latest file in a folder and delete all files older than 5 days. And as a bonus I will add an audit column with the filedate. A lot of steps (may be a bit too much for the experienced users, but not for the novice ones).

1) Create variables
Create three variables named FilePath (string), FileDate (datetime) and SourceFolder (string). Fill the SourceFolder variable with an existing path where the sourcefiles are stored.
Variables










2) Script task
Drag a Script task and a Data Flow task to you Control Flow and give them suitable names.
Script and Data Flow task















3) Map variables in Script task
Edit the Script task and select the FilePath and FileDate as ReadWrite variables and the SourceFolder as ReadOnly variable.
Map variables



















4) The Script
Copy the code from the main method to your code and add the extra using.
// C# code 
// Script to get the last modified file
// Note: if there are two files with both
// the highest modified date, the script
// will pick a random one.
using System;
using System.Data;
using System.IO; // Added
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;

namespace ST_9f1bb3d696fe44b9b5e50ee8f5f648b5.csproj
{
[System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{

#region VSTA generated code
enum ScriptResults
{
Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
};
#endregion

public void Main()
{
// Get all files in the SourceFolder
string[] sourceFiles = Directory.GetFiles(Dts.Variables["User::SourceFolder"].Value.ToString());

// Store highest date en file
DateTime highestDate = new DateTime();
string lastFile = string.Empty;

// Variable for file information
FileInfo fileInfo;

// Loop through sourceFiles
foreach (string currentFile in sourceFiles)
{
// Fill fileInfo variable with file information
fileInfo = new FileInfo(currentFile);

// Choose between creation date and lastmodified date
// Compare the file date to the variable
if (fileInfo.LastWriteTime > highestDate)
{
// Fill variables if the current file is the newest
lastFile = currentFile;
highestDate = fileInfo.CreationTime;
}
}

// Return values to the SSIS variables
Dts.Variables["User::FilePath"].Value = lastFile;
Dts.Variables["User::FileDate"].Value = highestDate;

// Finish script
Dts.TaskResult = (int)ScriptResults.Success;
}
}
}

5) Edit Data Flow
Create a Flat file source, Derived Column and a target of your choice. It should look something like this:
Data Flow






















6) Source File
Add a connection manager to your Flat File Source. Select one of the files in your source folder for this connection manager. It doesn't matter which one, because the connection string will be overwritten by an expression. Add the colums to finish it. Now go to the properties of your newly created connection manager and add an expression.
Right Click if you don't see the properties


Add an expression






















7) Expression
Add an expression for the ConnectionString Property so the ConnectionString will be overwritten by the value of the FilePath variable.
Create an expression




















8) Derived Column
Edit the Derived Column and add a new column that is filled with the variable FileDate.
New column with FileDate



















9) Target
Finish the dataflow by adding a target and test the first part of this mapping.
The test result

















Now the second part which deletes files older than 5 days. We could do this in one simple Script task, but I will try to limit the .Net code because this is an SSIS blog and not a .Net blog.


10) Delete old files
Add a Foreach Loop to the Control Flow with a Script Task and a File System Task in it. Your Control Flow should look something like this.
Continuing Control Flow






















11) Configure Foreach Loop
Edit the Foreach Loop and select the Foreach File Enumerator. After that enter the path of you sourcefolder.
Foreach File



















12) Expression
This step is not really necessary for this example, but it's just a bit nicer. Goto the Expressions on the Collection tab of the Foreach Loop and add an expression for the Directory which will overwrite the Folder property (don't let the different names confuse you) with the variable SourceFolder.
Add an expression for the Directory/Folder






















13) Variable Mapping
Map the FilePath variable to Index 0 in the Foreach Loop.
Variable Mappings



















14) Script task
Copy the code from the main method to your code and add the extra using. Don't fill in the variables this time (it's done by script).
// C# code 
// Fill SSIS variable with the last modified date
// of the current source file
using System;
using System.Data;
using System.IO; // Added
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;

namespace ST_cb8dd466d98149fcb2e3852ead6b6a09.csproj
{
[System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{

#region VSTA generated code
enum ScriptResults
{
Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
};
#endregion

public void Main()
{
// Lock SSIS variables
Dts.VariableDispenser.LockForRead("User::FilePath");
Dts.VariableDispenser.LockForWrite("User::FileDate");

// Create a variables 'container' to store variables
Variables vars = null;

// Add variables from the VariableDispenser to the variables 'container'
Dts.VariableDispenser.GetVariables(ref vars);

// Variable for file information
FileInfo fileInfo;

// Fill fileInfo variable with file information
fileInfo = new FileInfo(vars["User::FilePath"].Value.ToString());

// Choose between creation date and lastmodified date
// Fill SSIS variable with last modified date
vars["User::FileDate"].Value = fileInfo.LastWriteTime;

// Release the locks
vars.Unlock();

Dts.TaskResult = (int)ScriptResults.Success;
}
}
}

15)  Precedence Constraint
Enter the following Contraint to the line between the Script Task and the File System Task:
DATEDIFF("dd", @[User::FileDate], @[System::StartTime] ) > 5
Date check in the Expression

















16) File System Task
Delete the file which is stored in the variabe FilePath. You need to enter some dummy default value in the variable. Otherwise you will get an error message: Variable "FilePath" is used as an source or destination and is empty.
Delete the old file



















17) The result
Now test the package with a couple of files.
Last file wasn't old.






















That's that! Let me know if you have some other interesting SSIS ideas with file properties.

Update: a whole bunch of file properties.

Post a Comment