Convert text value in SQL Server from UTF8 to ISO 8859-1
Question
I have a column in SQL Server that contains UTF-8 text but uses the SQL_Latin1_General_CP1_CI_AS collation. How can I convert the text and save it in ISO 8859-1 encoding? I would like to do this in a query on SQL Server. Any tips?
OlÃ¡. Gostei do jogo. Quando "baixei" atÃ© achei que nÃ£o iria curtir muito
Recommended answer
I have written a function to repair UTF-8 text that is stored in a varchar field.
To check the fixed values you can use it like this:
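CREATE TABLE #Table1 (Column1 varchar(max));

-- Sample row: UTF-8 bytes stored in a varchar column, shown as code page 1252 text
INSERT #Table1
VALUES ('OlÃ¡. Gostei do jogo. Quando "baixei" atÃ© achei que nÃ£o iria curtir muito');

-- List only the rows that the function would change
SELECT *, NewColumn1 = dbo.DecodeUTF8String(Column1)
FROM #Table1
WHERE Column1 <> dbo.DecodeUTF8String(Column1);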
Output:
Column1
-------------------------------
OlÃ¡. Gostei do jogo. Quando "baixei" atÃ© achei que nÃ£o iria curtir muito
NewColumn1
-------------------------------
Olá. Gostei do jogo. Quando "baixei" até achei que não iria curtir muito
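To see where text of this kind comes from, the round trip can be reproduced outside SQL Server. The following is a minimal Python sketch, not part of the original answer. It assumes the text was encoded as UTF-8 and is then read back through Windows code page 1252, the code page used by SQL_Latin1_General_CP1_CI_AS. All variable names are illustrative.

```python
# Assumption: the column holds UTF-8 bytes that are displayed as code page 1252 text.
original = 'Olá. Gostei do jogo. Quando "baixei" até achei que não iria curtir muito'

# Reading UTF-8 bytes as cp1252 turns each accented letter into two characters.
garbled = original.encode("utf-8").decode("cp1252")

# Reversing the two steps recovers the intended text.
repaired = garbled.encode("cp1252").decode("utf-8")

# Every character in this sample exists in ISO 8859-1, so it can be stored in that encoding.
latin1_bytes = repaired.encode("iso-8859-1")

print(garbled)
print(repaired)
```

Text containing characters outside ISO 8859-1 would make the last `encode` call raise an error, so a real conversion would need a policy for such characters.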
Code:
CREATE FUNCTION dbo.DecodeUTF8String (@value varchar(max))
RETURNS nvarchar(max)
AS
BEGIN
    -- Transforms a UTF-8 encoded varchar string into Unicode
    -- By Anthony Faull 2014-07-31
    DECLARE @result nvarchar(max);

    -- If ASCII or null there's no work to do
    IF (@value IS NULL
        OR @value NOT LIKE '%[^ -~]%' COLLATE Latin1_General_BIN
    )
        RETURN @value;

    -- Generate all integers from 1 to the length of string
    WITH e0(n) AS (SELECT TOP(POWER(2,POWER(2,0))) NULL FROM (VALUES (NULL),(NULL)) e(n))
        , e1(n) AS (SELECT TOP(POWER(2,POWER(2,1))) NULL FROM e0 CROSS JOIN e0 e)
        , e2(n) AS (SELECT TOP(POWER(2,POWER(2,2))) NULL FROM e1 CROSS JOIN e1 e)
        , e3(n) AS (SELECT TOP(POWER(2,POWER(2,3))) NULL FROM e2 CROSS JOIN e2 e)
        , e4(n) AS (SELECT TOP(POWER(2,POWER(2,4))) NULL FROM e3 CROSS JOIN e3 e)
        , e5(n) AS (SELECT TOP(POWER(2.,POWER(2,5)-1)-1) NULL FROM e4 CROSS JOIN e4 e)
    , numbers(position) AS
    (
        SELECT TOP(DATALENGTH(@value)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
        FROM e5
    )
    -- UTF-8 Algorithm (http://en.wikipedia.org/wiki/UTF-8)
    -- For each octet, count the high-order one bits, and extract the data bits.
    , octets AS
    (
        SELECT position, highorderones, partialcodepoint
        FROM numbers a
        -- Split UTF8 string into rows of one octet each.
        CROSS APPLY (SELECT octet = ASCII(SUBSTRING(@value, position, 1))) b
        -- Count the number of leading one bits
        CROSS APPLY (SELECT highorderones = 8 - FLOOR(LOG( ~CONVERT(tinyint, octet) * 2 + 1)/LOG(2))) c
        CROSS APPLY (SELECT databits = 7 - highorderones) d
        CROSS APPLY (SELECT partialcodepoint = octet % POWER(2, databits)) e
    )
    -- Compute the Unicode codepoint for each sequence of 1 to 4 bytes
    , codepoints AS
    (
        SELECT position, codepoint
        FROM
        (
            -- Get the starting octet for each sequence (i.e. exclude the continuation bytes)
            SELECT position, highorderones, partialcodepoint
            FROM octets
            WHERE highorderones <> 1
        ) lead
        CROSS APPLY (SELECT sequencelength = CASE WHEN highorderones in (1,2,3,4) THEN highorderones ELSE 1 END) b
        CROSS APPLY (SELECT endposition = position + sequencelength - 1) c
        CROSS APPLY
        (
            -- Compute the codepoint of a single UTF-8 sequence
            SELECT codepoint = SUM(POWER(2, shiftleft) * partialcodepoint)
            FROM octets
            CROSS APPLY (SELECT shiftleft = 6 * (endposition - position)) b
            WHERE position BETWEEN lead.position AND endposition
        ) d
    )
    -- Concatenate the codepoints into a Unicode string
    SELECT @result = CONVERT(xml,
        (
            SELECT NCHAR(codepoint)
            FROM codepoints
            ORDER BY position
            FOR XML PATH('')
        )).value('.', 'nvarchar(max)');

    RETURN @result;
END
GO
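The per-octet steps in the function (count the high-order one bits, keep the remaining data bits, then combine the bytes of each sequence) can be followed more easily in a procedural form. The Python sketch below mirrors those steps for illustration. It is not the answer's code, the names `high_order_ones` and `decode_utf8_octets` are made up here, and it assumes well-formed UTF-8 input with no validation.

```python
def high_order_ones(octet: int) -> int:
    """Count the leading one bits of a byte (0 for ASCII, 1 for a continuation byte)."""
    count = 0
    mask = 0x80
    while mask and octet & mask:
        count += 1
        mask >>= 1
    return count


def decode_utf8_octets(octets: bytes) -> str:
    """Decode well-formed UTF-8 by hand, one sequence at a time."""
    chars = []
    position = 0
    while position < len(octets):
        ones = high_order_ones(octets[position])
        # A lead byte with 2, 3 or 4 leading ones starts a sequence of that length.
        length = ones if ones in (2, 3, 4) else 1
        codepoint = 0
        for octet in octets[position:position + length]:
            # The bits after the leading ones and the following zero carry the data.
            data_bits = max(7 - high_order_ones(octet), 0)
            # Each additional byte shifts the bits collected so far left by 6.
            codepoint = (codepoint << 6) | (octet % (1 << data_bits))
        chars.append(chr(codepoint))
        position += length
    return "".join(chars)


# The stored varchar value corresponds to these bytes under code page 1252.
stored = "OlÃ¡".encode("cp1252")
print(decode_utf8_octets(stored))  # prints: Olá
```

The SQL version reaches the same result with set-based operations: it multiplies each partial code point by a power of two based on its distance from the end of the sequence and sums the products, where the loop here shifts and ORs.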