Recently I was asked if I could identify columns in a single SQL Server database that had a collation. I didn’t need to change anything (well, not initially, but we’ll look at doing that anyway ), simply identify columns that were candidates to have their collation changed.
Now, there’s a few other considerations here, notably the database collation and server collation, but more on that later. For now, back to the problem at hand.
My first stop was to identify all data types that could possibly have a collation.
In order to do that I’m going to head to my old friends – the system tables.
First off, a quick trip to sys.types to see which datatypes may have collations
SELECT T.[name] FROM sys.types AS T WHERE T.collation_name IS NOT NULL;
This yields the following data results
- text
- ntext
- varchar
- char
- nvarchar
- nchar
- sysname
SYSNAME is an interesting one – it’s basically equivalent to NVARCHAR(128), there’s a discussion about that here
But in order to find the columns in our database that have collations we can use a query such as
SELECT
[Table Name] = T.[name]
,[Column Name] = C.[name]
,C.collation_name
,TY.[name]
,[Typelength] = CASE
WHEN TY.[name] IN ('text' , 'ntext') THEN UPPER(TY.[name])
WHEN C.[max_length] = 8000 THEN TY.[name] + N'(MAX)'
ELSE UPPER(TY.[name]) + + N'(' + CAST(C.[max_length] AS NVARCHAR(31)) + N')'
END
FROM sys.columns AS C
JOIN sys.tables AS T ON C.[object_id] = T.[object_id]
JOIN sys.types AS TY ON TY.[user_type_id] = C.[user_type_id]
WHERE C.collation_name IS NOT NULL
AND T.[is_ms_shipped] = 0
From here, it’s just a matter of extending the query to produced some T-SQL that will update the collations of the columns – I chose to put this in a Common Table Expression (CTE), but go with whatever suits you.
WITH CTE
AS
(
SELECT
[Table Name] = T.[name]
,[Column Name] = C.[name]
,C.collation_name
,TY.[name]
,[Typelength] = CASE
WHEN TY.[name] IN ('text' , 'ntext') THEN UPPER(TY.[name])
WHEN C.[max_length] = 8000 THEN TY.[name] + N'(MAX)'
ELSE UPPER(TY.[name]) + + N'(' + CAST(C.[max_length] AS NVARCHAR(31)) + N')'
END
FROM sys.columns AS C
JOIN sys.tables AS T ON C.[object_id] = T.[object_id]
JOIN sys.types AS TY ON TY.[user_type_id] = C.[user_type_id]
WHERE C.collation_name IS NOT NULL
AND T.[is_ms_shipped] = 0
)
SELECT N'ALTER TABLE ' + [Table Name] + N' ALTER COLUMN ' + [Column Name] + N' ' + [Typelength] + N' COLLATE Latin1_General_CI_AS' + ';' FROM CTE
This will produced statements that will change the collation of columns on an individual basis. Notice in the final SELECT statement from the CTE, I’ve stated COLLATE Latin1_General_CI_AS’. This will generate a lot of statements such as
ALTER TABLE CollationChangeTable ALTER COLUMN ColumnOne NCHAR(8) COLLATE Latin1_General_CI_AS;
I prefer this approach to a cursor as it gives me an opportunity to eyeball the T-SQL and column names.
A cursor may be more appropriate in a large database – you choice, there’s lots of them around.
Proceed with caution as getting column COLLATIONs out of sync with the database and instance can bite tou hard down the track – such as conflicts with tempdb joins.
Have a great day.
Cheers
Marty.