Friday, April 23, 2010

Utilizing Transactions in SSIS to Rollback After a Failed Import

SSIS is used primarily to import data into a database, particularly from flat files, but also from other databases. The typical SSIS imagepackage for doing this task looks something like the picture at the left: backup the current data, delete the current data from the destination table, and then import the data from the source.
This is all good, and it works well. When it works. The problem is that if there’s a problem with reading the data from the source (say, the credentials for the source database changed), you’ve already blown away the current (production) data. The job fails, and the data remains missing.


Configure MSDTC

If you’re running the SSIS package on the destination database server, you can skip this step. We’ve chosen to have SSIS on a dedicated instance, which has made keeping track of packages and scheduled jobs much easier. But it also means we’ve got to do a little work on the back-end before we can make our packages use a single transaction
The MSDTC service (Distributed Transaction Coordinator) facilitates this process, and by default, it doesn’t allow remote communication. So we’ve got to change the settings on two systems: the database server and the server on which the job is executed. In our example, we’ll say that the db server runs Windows Server 2003 and the Server 2008; setting up the MSDTC service under each OS is just a tiny bit different.

Configure MSDTC in Windows Server 2003

Run the component service control panel: Start -> Administrative Tools -> Component services (or Start -> run -> dcomcnfg)
Go to Component Services -> Computers and right click on My Computer. Select Properties.
image
The properties window has a MSDTC tab. Click on the Security Configuration button in the MSDTC tab.
In the security configuration window (below), you'll need to grant Network DTC Access, allowing remote clients, both inbound and outbound. You do not need to allow remote administration, though doing so would allow these settings to be (re)configured remotely.
The settings below are what are required for a database server. image
Basically, we want our SSIS server to be able to communicate to the MSDTC service on our database server, and we want our service to be able to talk back to the SSIS box.
When you click on <OK>, the system will restart the Distributed Transaction Coordinator service.

Configure MSDTC in Windows Server 2008

The actual configuration settings and windows are identical in Windows Server 2008 to those in Windows Server 2003. There is a difference, though, in how you get to those settings:
Run the component service control panel: Start -> Administrative Tools -> Component services (or Start -> run -> dcomcnfg)
Go to Component Services -> Computers -> My Computer -> Distributed Transaction Coordinator
Right-click on on "Local DTC" and select properties. There is a Security tab on that window. The settings below are required on the server that is executing the SSIS job:
image
Here we don’t need remotely-initiated connections to the MSDTC service, so we’re not allowing remote clients or inbound communications; outbound is all we need.
By the way, if we don’t do this configuration of the MSDTC service, we’ll get the following error when we try to run our package:
[Connection manager "DestinationConnectionOLEDB"] Error: SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x8004D024.
and
[Connection manager "DestinationConnectionOLEDB"] Error: The SSIS Runtime has failed to enlist the OLE DB connection in a distributed transaction with error 0x8004D024 "The transaction manager has disabled its support for remote/network transactions.".

Configuring your Package for Transaction Support

There’s a pretty good Microsoft article on using transactions in SSL at http://msdn.microsoft.com/en-us/library/ms137690.aspx; I’d recommend it to you.
Here’s how it plays out.
You can set packages, loops, tasks, and sequence containers to use transactions, which really helps in ensuring consistency.
In our case, we’ll enable transactions at the package level, and then we’ll have the option of having our tasks be a part of that transaction:
Double-click on your package in SQL Server Business Intelligence Development Studio (BITS from now on). That opens the package. Now, right-click on the background and select properties, as below:
image
The properties data shows up on the right-hand side of the screen, and at the end of the list of properties, you’ll see “TransactionOption.” This is how we tell SSIS how to utilize transactions in the execution of the package. The options are as follows:
  • Required – Either start a new transaction, or join the transaction that was started by the parent to this object
  • Supported – Join a transaction, if the parent container was a part of one, but don’t start one. This is the default setting.
  • NotSupported – Don’t join a transaction, under any circumstances.
We’ll set the TransactionOption for our package to “Required.”
image
Now we can click one of our tasks, like the “Backup Original Table” task, and we’ll see that, since its setting is “Supported,” it’ll be a part of our package’s transaction.
Here’s something to consider: there are only certain transactions that can be rolled back. This is especially important to remember when we’re wanting to allow the rollback of deleted data.
There are circumstances in which cannot rollback a drop table, which often is used for these operations. Instead, unless you’re looking at a *ton* of data, use a “delete from” SQL statement. A truncate command will work in most instances, but check the rollback process in a test instance before putting it in production. This will ensure your original data can be rolled back with the transaction.
That’s all that’s required. Now, if we have a failure at the end of the process, when data is imported, that failure will trigger a rollback of our delete task.

2 comments:

  1. what about config firewall, we have to open ports in 2 servers, right?
    Do we need domain account or any specific account exist in both servers?

    ReplyDelete
    Replies
    1. Ah, good question. Generally speaking, you just need to have the database ports open: the DTC communicates with the local database driver for your remote database server. This means that, if you've got the ports open for database communication, you should be set.

      Delete

Thanks for leaving a comment!