Wednesday, April 15, 2009

SSIS: from Flat File Source to OLE DB destionation, everyday problems

While performing one of the most regular and simple tasks in SSIS – transferring data from a flat file to a table – I struggled with two, maybe the most regular errors coming along with this task. I struggled with them for hours – simply because these things are so simple and basic that most manuals don’t bother to give detailed instructions on them. Anyway, no one needs to spend HOURS inventing this wheel… These two errors are: Output column [column name] failed because truncation occurred and Invalid character value for cast specification.

Let’s, first, fight the truncation problem.

So, I have a simple flat file TEST1.txt:

[Row 1] AAABBB
[Row 2] aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa8
[Row 3] bbbbbbbbb
[Row 4] 12345

The file has 4 rows; the 1st row contains the field names; the “” symbol is a field separator; carriage return is a row separator; there are no values in row 3, column ‘BBB’, and no value in row 4, column ‘AAA’.
This data was planned to go to a table TEST17 with two fields: AAA varchar(30) and BBB varchar(30). However, there was an immediate warning posted on the destination:

Truncation may occur due to inserting data from data flow column "AAA" with a length of 50 to database column "AAA" with a length of 30…

(and same for a column “BBB”).

Speaking of “a length of 50”: beware that 50 is a default value for all columns of a flat file source; and, by default, they all are considered to be strings.
Speaking of warnings... from my small experience, they rarely allow you to pass them by harmlessly. And, sure enough, an attempt to run this package ended in a massive error output:

Data conversion failed. The data conversion for column "AAA" returned status value 4 and status text "Text was truncated or one or more characters had no match in the target code page."
Error: 0xC020902A at Data Flow Task, Flat File Source [1]: The "output column "AAA" (198)" failed because truncation occurred, and the truncation row disposition on "output column "AAA" (198)" specifies failure on truncation. A truncation error occurred on the specified object of the specified component.
Error: 0xC0202092 at Data Flow Task, Flat File Source [1]: An error occurred while processing file "C:\Test1.txt" on data row 2.


(...and so on. Thank God I was warned!) What to do now?

Option 1. So, the truncation row disposition on "output column "AAA" (198)" specifies failure on truncation? Let’s then change this.
Flat File Source – right click opens Flat File Source editor; choosing the Error output tab opens the window:

In a column Truncation open a drop-down list and choose Ignore Failure. Now the package runs with success, and in the result table we have a properly truncated string. (The warning messages, however, keep showing up).

Option 2. To put the column lengths into an accord - instead of truncating them - looks like a more peaceful solution. To do this we need or shorten the column length of the source file either make longer that of the destination.

Let's try shortening first. Right click on Flat File Source and choose Show Advanced Editor. The following window opens:


- where I’ve already switched to the Input and Output Properties and expanded the Flat File Source Output list, and, both, External Columns and Output Columns inside it. I’ll try to change the length of the external column AAA. Highlighting it shows the column’s properties in the right panel, where I change the column length to 30. Now I need to do the same to the outptu AAA columns. (Noteworthy that here, among other things, we also could change the truncation behavior of the column!).


The OK-exiting from this window, however, only changes the warning message to:

Validation warning. Data Flow Task: Flat File Source [1]: The external metadata column collection is out of synchronization with the data source columns. The column "AAA" needs to be updated in the external metadata column collection.

NB! If you go back and try to edit properties of the Flat File source now, you’ll risk to be forced to cancel changes just done, in a form a message:


Just answer “No”! (It will not bother you again).

The right place to match the metadata now is the Flat File Connection Manager. Go (down) to the Connection Managers panel, right click on the Flat File Connection Manager, choose Edit and you’ll see the Flat File Connection Manager Editor window, where – in Advanced tag - we choose the ‘AAA’ column from the column list and could, finally, set up the OutputColumnWidth property to 30:


OK exit… The warning message about the column ‘AAA’ should have gone now.
Of course, I still have the old warning for another column:

Validation warning. Data Flow Task: OLE DB Destination [158]: Truncation may occur due to inserting data from data flow column "BBB" with a length of 50 to database column "BBB" with a length of 20.

Let's see what happens if we try fight the issue from the destination side
The important thing here is that we cannot change the field length of the destination table from the place where it’s kind of suggested – from the Advanced Editor of the OLE DB Destination. I call it a “suggested” place because the value-in-question is there and looks open for editing:

However this is not the case: after changing this value to 50 we’ll only end up with yet another warning message:
Validation warning. Data Flow Task: OLE DB Destination [158]: The external metadata column collection is out of synchronization with the data source columns. The column "BBB" needs to be updated in the external metadata column collection.

Furthermore, when you open the same Advanced Editor window again, you’ll see that the length has been SILENTLY changed back to 20! So, no, we cannot change the length of the destination table’s column here, and maybe for a good reason: the destination table may be an inseparable part of the database. My conclusion is: we don’t change properties of a destination existing in the database - instead, the source’s properties are to be adjusted (or truncation ignored). However, if the destination is a new table that is being created in a course of SSIS execution, we have all power over it, and can request a column length we need.



And after this is done, we are, finally, warning messages clean!

…Now it’s time to pay attention to that fact that a source file, actually, suggests an integer value for a column ‘BBB’. The result, however, totally depends, again, on the destination table’s properties. Let’s change them to have an integer value in column ‘BBB’:



OK-exit, no warning messages, excellent… However, the test run generates the whole train of error messages, of them the most important:

Invalid character value for cast specification.

As I’ve mentioned, I spent long time looking for comments on this error message, while the solution was so on a surface! You either need to change the RetainNulls property to TRUE – in the source’s Properties window, or check the box Retain null values from the source as null values in the data flow – in the Flat File Source Editor. Here is the resulting table:

AAA BBB
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa 8
bbbbbbbbb NULL
NULL 12345

However, what’ll happen if we need to change the DML statement in the destination as follows?
CREATE TABLE [TEST17](
[AAA] VARCHAR(30),
[BBB] int NOT NULL )

An error, of course:

Error: 0xC020901C at Data Flow Task, OLE DB Destination [158]: There was an error with input column "BBB" (246) on input "OLE DB Destination Input" (171). The column status returned was: "The value violated the integrity constraints for the column."

This error could not be resolved by editing source-destination properties, but here we, finally, have a real SSIS thing – we’ll need to place a data transformation task in between the source and the destination. I used a Derived Column Task with the following expression for the derived column:

ISNULL(BBB) ? "0" : BBB
(To the word, an excellent source on SSIS expressions is at: http://technet.microsoft.com/en-us/library/ms141827.aspx )

1 comment:

Singamulla? said...

Hi,
we've been using a flat file source, it was working fine. now we have some issues with the file. Lines are not in proper format, we get some box kind of symbol instead of {CR}{LF}; what are the other new line characters which are possible for this? Any inputs on this? if so email me vivekraj@live.com
Thanks