@milovidov983
Created August 14, 2018 13:49
How to convert UTF-8 to UTF-8 with BOM (C# string)
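using System.Linq;
using System.Text;

// Prepends the UTF-8 BOM (EF BB BF) to the string's bytes and decodes them again,
// so the returned string starts with the U+FEFF character.
private string ConvertStringToUtf8Bom(string source) {
    var data = Encoding.UTF8.GetBytes(source);
    var result = Encoding.UTF8.GetPreamble().Concat(data).ToArray();
    var encoder = new UTF8Encoding(true);
    return encoder.GetString(result);
}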

chucklu commented Apr 12, 2021

@milovidov983, thanks. I printed the byte arrays, and the result makes sense.

var str = "aîn";
var str2 = ConvertStringToUtf8Bom(str);
Console.WriteLine(str2 == str);

Console.WriteLine($"length of {str} is {str.Length}");
var bytes1 = Encoding.UTF8.GetBytes(str);
Console.WriteLine(GetHexString(bytes1));

Console.WriteLine();

Console.WriteLine($"length of {str2} is {str2.Length}");
var bytes2 = Encoding.UTF8.GetBytes(str2);
Console.WriteLine(GetHexString(bytes2));

False
length of aîn is 3
61 C3 AE 6E

length of aîn is 4
EF BB BF 61 C3 AE 6E
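
GetHexString is called in the snippet above but is not shown anywhere in the thread. A minimal sketch of what it might look like, together with a static copy of the gist's conversion, is below. The class name BomDemo and the body of GetHexString are assumptions made for illustration, not the commenter's actual code.

```csharp
using System;
using System.Linq;
using System.Text;

public static class BomDemo
{
    // Hypothetical helper: formats each byte as two uppercase hex digits,
    // separated by spaces (for example "EF BB BF").
    public static string GetHexString(byte[] bytes) =>
        string.Join(" ", bytes.Select(b => b.ToString("X2")));

    // Same steps as the gist's ConvertStringToUtf8Bom, made static for reuse.
    public static string ConvertStringToUtf8Bom(string source)
    {
        var data = Encoding.UTF8.GetBytes(source);
        var withBom = Encoding.UTF8.GetPreamble().Concat(data).ToArray();
        // GetString does not strip the BOM; it decodes to the character U+FEFF.
        return new UTF8Encoding(true).GetString(withBom);
    }
}
```

With these definitions, the extra character counted in the second length is U+FEFF at index 0 of the converted string, which is why the two strings compare as unequal even though they print the same.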


chucklu commented Apr 12, 2021

By the way, I am using a StreamWriter to create a new file with Encoding.UTF8, and it writes the BOM automatically.
https://github.com/dotnet/runtime/blob/6ef4b2e7aba70c514d85c2b43eac1616216bea55/src/libraries/System.Private.CoreLib/src/System/IO/StreamWriter.cs#L273
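
A rough sketch of that StreamWriter approach, assuming a fresh file and a hypothetical helper name (BomFileDemo.WriteAndReadBack is made up for illustration). It writes the text and reads the raw bytes back so the BOM can be inspected.

```csharp
using System.IO;
using System.Text;

public static class BomFileDemo
{
    // Writes text to a new (or truncated) file using Encoding.UTF8.
    // Encoding.UTF8 has its preamble enabled, so StreamWriter emits
    // EF BB BF at the start of the file before the text itself.
    public static byte[] WriteAndReadBack(string path, string text)
    {
        using (var writer = new StreamWriter(path, false, Encoding.UTF8))
        {
            writer.Write(text);
        }
        // Return the raw file contents so the caller can check the first bytes.
        return File.ReadAllBytes(path);
    }
}
```

Note that the encoding has to be passed explicitly here: a StreamWriter created without an encoding argument defaults to UTF-8 without a BOM.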


muru82 commented Nov 9, 2022

Does the above code add a carriage return when processing?


luanrem commented Mar 22, 2023

Thanks guys! This helped!