Search Suggest

Validating XML file against XSD with SSIS - part II Script Task

Case
I have an XML file which I want to validate against an XSD file before processing it. However my XSD contains an include (or import) tag and the XML Task will not validate it and throws an error about certain parts that are not declared.

How can I validate an XML file against an XSD with an include (or import) tag?

Solution
Although the XML source component honors includes (and imports), the XML task doesn't. See(/vote for) this Microsoft Connect feedback.

You can use a Script Task to validate your XML file against the XSD file. For this example I use three files: movies.xmlmovies2.xsd and movies2_include.xsd.
<?xml version="1.0"?>
<movies>
<movie id="123">
<title>Pulp Fiction</title>
<year>1994</year>
<rating>8.9</rating>
</movie>
<movie id="124">
<title>Memento</title>
<year>2000</year>
<rating>8.6</rating>
</movie>
<movie id="124">
<title>The Usual Suspects</title>
<year>2000</year>
<rating>8.6</rating>
</movie>
</movies>

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:include schemaLocation="movies2_include.xsd"/>

<xs:element name="movies">
<xs:complexType>
<xs:sequence>
<xs:element minOccurs="0" maxOccurs="unbounded" name="movie">
<xs:complexType>
<xs:sequence>
<xs:element minOccurs="0" name="title" type="LimitedString" />
<xs:element minOccurs="0" name="year" type="xs:unsignedShort" />
<xs:element minOccurs="0" name="rating" type="xs:decimal" />
</xs:sequence>
<xs:attribute name="id" type="xs:unsignedByte" use="optional" />
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

<xs:simpleType name="LimitedString">
<xs:restriction base="xs:string">
<xs:maxLength value="50" />
</xs:restriction>
</xs:simpleType>
</xs:schema>

1) New File Connection
I will use Connection Managers in this example to make things easier to configure. Feel free to use something else such as variables. Right click in the Connection Managers pane and select "New File Connection...".
New File Connection for XML and XSD file






















Select the xml file in the File Connection Manager Editor and repeat these steps for the XSD file (movies2.xsd) so that you have two Connection Managers named movies.xml and movies2.xsd. You don't have to create a Connection Manager for the include xsd, because it's included by code.
File Connection Manager Editor













2) Script Task
Add an Script Task to your Control Flow and give it a suitable name. You can connect it to your Data Flow Task that processes the XML file.
Script Task to validate XML






















3) The script
Edit the Script Task and choose your Script Language, hit the edit button to edit the script and copy the code from the main method below to your main method and add the import/include row.
// C# Code
#region Namespaces
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using System.Xml; // Added
#endregion

namespace ST_d2a4d346c17d488da0cd327617293c52
{
[Microsoft.SqlServer.Dts.Tasks.ScriptTask.SSISScriptTaskEntryPointAttribute]
public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{
public void Main()
{
// create object for the XSD file that will be used
// for validating the XML file. Use the Connection
// Manager to get the path from the XSD file.
XmlReaderSettings xmlrs = new XmlReaderSettings();
xmlrs.ValidationType = ValidationType.Schema;
xmlrs.Schemas.Add(null, Dts.Connections["movies2.xsd"].ConnectionString);
xmlrs.Schemas.Compile();

// Try reading the XML file using the XSD. Use the
// Connection Manager to get the path from the XML file
try
{
XmlReader xmlr = XmlReader.Create(Dts.Connections["movies.xml"].ConnectionString, xmlrs);
while (xmlr.Read())
{
}
// The XML file was succesfully read.

// Close XML file
xmlr.Close();

Dts.TaskResult = (int)ScriptResults.Success;
}
catch (Exception ex)
{
// Validation failed, fire error event with errormessage
Dts.Events.FireError(-1, "Validate XML", "Validation error: " + ex.Message, string.Empty, 0);
Dts.TaskResult = (int)ScriptResults.Failure;
}
}

#region ScriptResults declaration
///
/// This enum provides a convenient shorthand within the scope of this class for setting the
/// result of the script.
///
/// This code was generated automatically.
///

enum ScriptResults
{
Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
};
#endregion
}
}
or VB.net
' VB.Net Code
#Region "Imports"
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime
Imports System.Xml ' Added
#End Region

<Microsoft.SqlServer.Dts.Tasks.ScriptTask.SSISScriptTaskEntryPointAttribute()> _
<System.CLSCompliantAttribute(False)> _
Partial Public Class ScriptMain
Inherits Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase

Public Sub Main()
' create object for the XSD file that will be used
' for validating the XML file. Use the Connection
' Manager to get the path from the XSD file.
Dim xmlrs As New XmlReaderSettings()
xmlrs.ValidationType = ValidationType.Schema
xmlrs.Schemas.Add(Nothing, Dts.Connections("movies2.xsd").ConnectionString)
xmlrs.Schemas.Compile()

' Try reading the XML file using the XSD. Use the
' Connection Manager to get the path from the XML file
Try
Dim xmlr As XmlReader = XmlReader.Create(Dts.Connections("movies.xml").ConnectionString, xmlrs)
While xmlr.Read()
End While
' The XML file was succesfully read.

' Close XML file
xmlr.Close()

Dts.TaskResult = ScriptResults.Success
Catch ex As Exception
' Validation failed, fire error event with errormessage
Dts.Events.FireError(-1, "Validate XML", "Validation error: " + ex.Message, String.Empty, 0)
Dts.TaskResult = ScriptResults.Failure
End Try
End Sub

#Region "ScriptResults declaration"
'This enum provides a convenient shorthand within the scope of this class for setting the
'result of the script.

'This code was generated automatically.
Enum ScriptResults
Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success
Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
End Enum

#End Region

End Class

Note: For those using SSIS 2005: use the taskresult row from 2005 that can be found in your main method and replace the followin lines: C# line 34 & 40 / VB.Net line 31 & 35.


4) Testing
Run the package. Change something in the XML file to make it fail (a movie title longer than 50 chars).
Test result










Post a Comment