Message-ID: <20200120173749.GG15860@mit.edu>
Date:   Mon, 20 Jan 2020 12:37:49 -0500
From:   "Theodore Y. Ts'o" <tytso@....edu>
To:     David Laight <David.Laight@...LAB.COM>
Cc:     "'Pali Rohár'" <pali.rohar@...il.com>,
        OGAWA Hirofumi <hirofumi@...l.parknet.co.jp>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
        Namjae Jeon <linkinjeon@...il.com>,
        Gabriel Krisman Bertazi <krisman@...labora.com>
Subject: Re: vfat: Broken case-insensitive support for UTF-8

On Mon, Jan 20, 2020 at 03:07:20PM +0000, David Laight wrote:
> What happens if the filesystem has filenames that are invalid UTF8 sequences
> or multiple filenames that decode from UTF8 to the same 'wchar' value?
> Never mind ones that are just case-differences for the same filename.
> 
> UTF8 is just so broken it should never have been allowed to become
> a standard.

Internationalization is an overconstrained problem which is impacted
and influenced by human politics, including from the Cold War and who
attended which internal standards bodies meetings.  So much so that an
I18N expert (very knowledgeable about the problems in this domain) has
been known to have said (in a bar, late at night, and after much
alcohol) that it would be simpler to teach the entire human race
English.

Unfortunately, that's not going to happen, and if we are going to deal
with the market of "everyone who doesn't speak English", we're going
to have to live with Unicode, warts and all.  Seriously speaking,
UTF-8 is the worst encoding, except for all of the others.  :-)

						- Ted
