I recently had to debug a co-worker’s SSIS package, trying to figure out why it wasn’t working quite right. We have a sharded infrastructure, meaning that our customers’ data is split out among a number of servers with equivalent schemas. For this particular package, we wanted to, for each customer, get the relevant data from a common ETL server and push results into the appropriate server for that customer.
My co-worker had set up a dynamic connection (see Rafael Salas or Hari Bagra for details on how to do this), but something weird was happening: the package was trying to push everything to the same server. I confirmed that if all relevant customers loaded were for the same server, that the process would work correctly, and that I could run each server load one at a time, so there weren’t any problems connecting to particular servers or parameters overriding this choice. It’s like the connection was “sticky,” connecting successfully to the first server and then ignoring the later changes.
It turns out, that’s exactly the case. He turned on the RetainSameConnection flag for the connection manager. If you set the value equal to true, then SSIS will open a single connection for the duration of the package run. There are a couple of times in which you might want to set this to true:
- You have a complex package which hits the same connections over and over again, or you have to do lots of operations in a foreach loop. This can remove the overhead of re-opening connections over and over again, thereby improving performance.
- You are using global temp tables and want multiple SSIS tasks to manipulate data in those global temp tables.
My co-worker was doing option #2, setting up a global temp table. He turned on RetainSameConnection to get that functionality working, not realizing that it necessarily precluded the other requirement, that this package be able to modify the connection to point to a different server based on the current customer.
As a note, if you set RetainSameConnection to false, you will lose the ability to use global temp tables in multiple SSIS tasks. The reason is that the connection will close between tasks, and once the last session which references a global temp table closes, that temp table can be marked for cleanup.
In my case, I ripped out the global temp table logic and set RetainSameConnection to False. That allowed me to use dynamic connections as expected.