SSIS Dataflow Performance Tuning
Buffer Architecture
Synchronous
Aka Row transformations
Asynchronous
Partially blocking
Fully blocking
Synchronous components
Same buffer used for input and output
Number of rows in = Number of rows out
Generally very quick
Examples:
Derived Column
Conditional Split
Multicast
Asynchronous components
Creates new buffers for output
Different “shaped” input and output buffers
Number of rows in <> Number of rows out
Generally slower
Examples:
Aggregate
Sort
Execution trees
…is a section of data flow starting from an asynchronous output and terminating at inputs on transforms that have no synchronous outputs
-Kirk Haselden, P546 of “Microsoft SQL Server Integration Services
Demo 2
Synchronous and Asynchronous components
(Count the expression trees)
Buffers don’t move
Design for perf!
“The Data Flow Task is performant by design, without any tuning or optimization the default settings generally deliver great performance”
Kirk Haselden, Microsoft SQL Server 2005 Integration Services, Chapter 23 – Data Flow Task Internals and Tuning
Only do what you have to
Remove unrequired columns (heed the warnings)
Fixed-width files – only parse what you need
ALWAYS use a SQL statement
Only parse when needed (or leave as strings)