Wednesday, March 28, 2012

HELP! Strange behavior using t-sql REPLACE function

I am seeing very odd behavior with the t-sql REPLACE funtion.

I am trying to wipe out the 'T' character (unicode value 222) with a space character (' ' unicode value 32) and it appears to be replacing occurrences of 'th' together as well. Here are some selects to show this behavior:

select nchar(unicode('T'))
select nchar(unicode(' '))
select unicode('T')
select unicode(' ')
select replace('Urban & Ethnic', nchar(unicode('T')), nchar(unicode(' ')))
select replace('Urban T Ethnic', nchar(unicode('T')), nchar(unicode(' ')))
select replace('Urban & Ethnic', nchar(222), nchar(32))
select replace('Urban T Ethnic', nchar(222), nchar(32))
select replace('Urban & Ethnic', 'T', ' ')
select replace('Urban T Ethnic', 'T', ' ')

NOTICE that the last two statements:

select replace('Urban & Ethnic', 'T', ' ') - RESULT 'Urban & Ethnic'
select replace('Urban T Ethnic', 'T', ' ') - RESULT 'Urban Ethnic'

...work properly, while the other ones dont:

select replace('Urban & Ethnic', nchar(unicode('T')), nchar(unicode(' '))) - RESULT 'Urban & E nic'
select replace('Urban T Ethnic', nchar(unicode('T')), nchar(unicode(' '))) - RESULT 'Urban E nic'
select replace('Urban & Ethnic', nchar(222), nchar(32)) - RESULT 'Urban & E nic'
select replace('Urban T Ethnic', nchar(222), nchar(32)) - RESULT 'Urban E nic'

This function ONLY seems to work with the character representation, any attempt to convert an integer value to a character representation fails miserably...

Please advise, and thanks for your help in advance.

It seems that the collation the active database uses treats "th" and "T" as the same character. To fix this, change the database collation to something more appropriate, or use tricks like the following in your queries:

select replace(N'Urban T Ethnic' COLLATE Latin1_General_BIN, N'T', N' ')

By forcing the collation to BIN (binary), a binary comparison is done, which doesn't do fancy character-equivilancy stuff.

-Ryan

|||

Thanks!

sp_helpsort reveals the default collation = Latin1-General, blah, blah, blah, etc...

I even tried to convert EVERYTHING to binary var types within the query and it STILL came up with the wrong result.

Thanks again for your help. I didn't even consider changing the COLLATION type in the query itself...

sql

No comments:

Post a Comment