[nem-en] Language support and BOM

Alexey Borzenkov snaury at gmail.com
Mon Jan 29 22:55:54 CET 2007


On 1/30/07, vc <vc at rsdn.ru> wrote:
>
> It's really complex. We should simply try to read the BOM.


And that's exactly where you're wrong. As you can see, it not only detected
BOMs and probable utf-8, it also let developers explicitly specify the file
encoding "emacs-style", so that people would at least have a chance to know
what encoding the file is in (and emacs would parse that automatically, of
course). Simply reading the BOM is worthless, because without a BOM the
encoding is still undetermined. MS C# and VS try to solve that by
auto-detecting utf-8. Without utf-8 auto-detection, a BOM-only approach
would *immediately* cause existing code to compile incorrectly, because by
current convention the lack of a BOM means utf-8.
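
To make the order concrete, here is a rough Python sketch of that kind of
detection: BOM first, then an emacs-style coding declaration, then a utf-8
validity check, and only then a fallback. The name detect_encoding, the
declaration regex, the two-line search window, and the cp1251 fallback are
all assumptions made for illustration, not what any existing compiler does.

```python
import codecs
import re

# UTF-32 LE must be checked before UTF-16 LE: its BOM (FF FE 00 00)
# starts with the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Hypothetical pattern for an emacs-style declaration such as
# "-*- coding: cp1251 -*-" placed in a comment near the top of the file.
_CODING_RE = re.compile(rb"coding[:=]\s*([-\w.]+)")


def detect_encoding(data: bytes, fallback: str = "cp1251") -> str:
    """Guess the encoding of raw source bytes (illustrative sketch only)."""
    # 1. An explicit BOM wins.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # 2. An explicit declaration in the first two lines (assumed window).
    for line in data.splitlines()[:2]:
        match = _CODING_RE.search(line)
        if match:
            return match.group(1).decode("ascii")
    # 3. No BOM, no declaration: accept utf-8 only if the bytes are valid.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # 4. Otherwise fall back to the "current" encoding (assumed cp1251).
        return fallback
```

With only step 1, every BOM-less file would land in the fallback, which is
exactly the problem with "simply read the BOM".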

> > Btw, VS does not treat files as encoded in current encoding. When file
> > is correct utf-8 (*not* ASCII) it detects that it's utf-8 and
> > parses/saves it as utf-8
>
> You are mistaken. Try using Russian in the files created by the default
> wizard, and you will see that the files are saved in 1251.


Where exactly am I mistaken? In those two sentences I was only talking about
the way VS *opens* files. Try removing the BOM from a valid utf-8 file and
opening it in VS. If you want files to be saved in utf-8, use Save As.

> > (Options->Text Editor->Auto-detect UTF-8 encoding without signature is
> > checked by default).
>
> Yes, but this detection find some spatial utf-8 character prefix which not
> contains in 1251 encoded files (for example).


No detection is perfect. However, I didn't really understand this sentence.
Do you mean that it incorrectly detects cp1251 files as utf-8? I'd say
that's highly unlikely, since Russian text simply won't happen to line up
as valid utf-8 sequences. :-/
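
A quick way to check this claim, sketched in Python. The helper name
looks_like_utf8 and the sample strings are made up for illustration. The
reasoning: in cp1251 the letters А-я occupy bytes 0xC0-0xFF, while utf-8
requires every lead byte to be followed by continuation bytes in 0x80-0xBF,
so two Cyrillic letters in a row can never form a valid utf-8 sequence.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as utf-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Ordinary Russian text saved as cp1251 (illustrative samples).
samples = ["Привет, мир", "кодировка файла", "строка"]
results = {s: looks_like_utf8(s.encode("cp1251")) for s in samples}

# A contrived counterexample: a letter followed by a cp1251 symbol whose
# byte happens to fall in the continuation range (0xD0 0xBB).
contrived = "Р»".encode("cp1251")
contrived_is_utf8 = looks_like_utf8(contrived)
```

All three samples fail utf-8 validation, while the contrived two-character
string passes, so a false positive is possible but needs very unnatural
text.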

> > I also believe it's possible to change file encoding from within VS
> > somehow...
>
> It's the wrong way. The user should make the choice themselves.


Why should they be able to? My MinGW gcc doesn't let me choose the file
encoding (it always uses the current encoding, sadly). A lot of other
programs don't let me either...