Date:	Tue, 29 Apr 2008 13:42:16 +0300
From:	Adrian Bunk <bunk@...nel.org>
To:	Willy Tarreau <w@....eu>
Cc:	Helge Hafting <helge.hafting@...el.hist.no>,
	"H. Peter Anvin" <hpa@...nel.org>, linux-kernel@...r.kernel.org,
	trivial@...nel.org
Subject: Re: [2.6 patch] UTF-8 fixes in comments

On Tue, Apr 29, 2008 at 12:09:34PM +0200, Willy Tarreau wrote:
> On Tue, Apr 29, 2008 at 11:06:05AM +0200, Helge Hafting wrote:
> > >Well, I accidentally used a freshly installed laptop running mandriva 2008.
> > >I was typing in a terminal inside KDE (I don't know the program name, sort
> > >of an xterm, but with huge borders all around). I made a typo in a word and
> > >typed in a "é" (e acute). Pressing backspace to fix it showed me that I
> > >remove more chars than typed. I tried again. Pressing this letter 5 times,
> > >then 10 times backspace. I removed 5 chars from the prompt. I suspect that
> > >if I had used some chars with wider encoding (eg 4 bytes), I could have
> > >removed as many... Clearly those tools are not ready.
> > >  
> > So don't use that particular tool
> 
> It was not my machine, and had you been there, you would have heard me call
> it names !
> 
> > and/or file a bug with the maintainer. :-)
> 
> It's too easy to impose crappy designs to end-users and tell them that if
> that does not work they have to file a bug. There are a minimal set of
> things that must be tested before shipping. Seeing that the default
> terminal emulator in KDE on Mandriva 2008 is configured in UTF-8 and does
> not properly render it simply makes me sick. This is broken by design and
> even distros trying to get it working for years still can't cope with it.
> There must be a reason.

I can reproduce your problem in a plain xterm when setting LANG=en_US
(most likely the same problem can occur with other non-UTF-8 settings).

In this case I'm actually more surprised that the character is displayed 
correctly than that you have to type backspace twice.
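
A rough sketch of the mismatch, not a trace of what xterm or the shell
actually do. It assumes the line editor counts one character per byte
(as it would in a single-byte locale) while the terminal decodes UTF-8
for display; the variable names are made up for illustration.

```python
# Assumption: the line editor treats every byte as one character
# (single-byte locale), while the terminal decodes UTF-8 for display.
typed = "\u00e9" * 5              # five times "e acute"
raw = typed.encode("utf-8")       # what actually arrives on the tty

# Cells a UTF-8 aware terminal draws for the input.
cells_on_screen = len(typed)

# "Characters" a byte-oriented editor believes it has to erase.
chars_seen_by_editor = len(raw.decode("latin-1"))

# Backspaces beyond the visible input eat into the prompt.
overshoot = chars_seen_by_editor - cells_on_screen

print(cells_on_screen, chars_seen_by_editor, overshoot)  # 5 10 5
```

With these assumptions, ten backspaces for five visible characters is
what you would expect, and the extra five land in the prompt.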

Mixing charsets in any way is highly problematic (which is also why my
patch was attached compressed), so if you disable UTF-8 anywhere in a
modern distribution, problems are to be expected (it could also be a
bug in Mandriva's default settings, but that would really surprise me).

>...
> > Unicode gives userland an opportunity to actually work decently
> > for the first time.
> 
> Unicode yes, UTF-8 no. UTF-8 is a compressed encoding of unicode.
> That's as silly as if you had to replace your terminals to read
> native gzip, and expect them as well as all the tools to work
> properly!

It's not a compressed encoding, it's a variable-length encoding.

Besides the size advantages, one main advantage of UTF-8 is that ASCII
is valid UTF-8. This means that for the ASCII source code in the kernel
it doesn't matter whether it's treated as ASCII or UTF-8, and no
conversion was needed.

You can't get this property with a fixed-size Unicode encoding.
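
A small check of both points, using Python's standard codecs. The
sample string is an arbitrary stand-in for ASCII-only kernel source,
and UTF-32 is used here as the example of a fixed-size encoding.

```python
# Arbitrary ASCII-only sample standing in for kernel source.
source = "static int x = 0; /* plain ASCII */\n"

ascii_bytes = source.encode("ascii")
utf8_bytes = source.encode("utf-8")
utf32_bytes = source.encode("utf-32-le")   # fixed 4 bytes per code point

# ASCII text is byte-for-byte identical when encoded as UTF-8 ...
same_as_ascii = (ascii_bytes == utf8_bytes)

# ... but not in a fixed-size encoding, which pads every character.
utf32_ratio = len(utf32_bytes) // len(ascii_bytes)

# Variable length: bytes needed in UTF-8 for code points of growing size
# (U+0041, U+00E9, U+20AC, U+1F600).
widths = [len(ch.encode("utf-8"))
          for ch in ("A", "\u00e9", "\u20ac", "\U0001F600")]

print(same_as_ascii, utf32_ratio, widths)  # True 4 [1, 2, 3, 4]
```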

>...
> Willy

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed
