lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sun, 15 Aug 2021 20:23:06 +0900
From:   OGAWA Hirofumi <hirofumi@...l.parknet.co.jp>
To:     Pali Rohár <pali@...nel.org>
Cc:     linux-fsdevel@...r.kernel.org,
        linux-ntfs-dev@...ts.sourceforge.net, linux-cifs@...r.kernel.org,
        jfs-discussion@...ts.sourceforge.net, linux-kernel@...r.kernel.org,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Jan Kara <jack@...e.cz>, "Theodore Y . Ts'o" <tytso@....edu>,
        Luis de Bethencourt <luisbg@...nel.org>,
        Salah Triki <salah.triki@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Dave Kleikamp <shaggy@...nel.org>,
        Anton Altaparmakov <anton@...era.com>,
        Pavel Machek <pavel@....cz>,
        Marek Behún <marek.behun@....cz>,
        Christoph Hellwig <hch@...radead.org>
Subject: Re: [RFC PATCH 01/20] fat: Fix iocharset=utf8 mount option

To: Pali Rohár <pali@...nel.org>
Cc: linux-fsdevel@...r.kernel.org,  linux-ntfs-dev@...ts.sourceforge.net,  linux-cifs@...r.kernel.org,  jfs-discussion@...ts.sourceforge.net,  linux-kernel@...r.kernel.org,  Alexander Viro <viro@...iv.linux.org.uk>,  Jan Kara <jack@...e.cz>,  "Theodore Y . Ts'o" <tytso@....edu>,  Luis de Bethencourt <luisbg@...nel.org>,  Salah Triki <salah.triki@...il.com>,  Andrew Morton <akpm@...ux-foundation.org>,  Dave Kleikamp <shaggy@...nel.org>,  Anton Altaparmakov <anton@...era.com>,  Pavel Machek <pavel@....cz>,  Marek Behún <marek.behun@....cz>,  Christoph Hellwig <hch@...radead.org>
Subject: Re: [RFC PATCH 01/20] fat: Fix iocharset=utf8 mount option
From: OGAWA Hirofumi <hirofumi@...l.parknet.co.jp>
Gcc: nnimap+ibmpc.myhome.or.jp:Sent
--text follows this line--
To: Pali Rohár <pali@...nel.org>
Cc: linux-fsdevel@...r.kernel.org,  linux-ntfs-dev@...ts.sourceforge.net,  linux-cifs@...r.kernel.org,  jfs-discussion@...ts.sourceforge.net,  linux-kernel@...r.kernel.org,  Alexander Viro <viro@...iv.linux.org.uk>,  Jan Kara <jack@...e.cz>,  "Theodore Y . Ts'o" <tytso@....edu>,  Luis de Bethencourt <luisbg@...nel.org>,  Salah Triki <salah.triki@...il.com>,  Andrew Morton <akpm@...ux-foundation.org>,  Dave Kleikamp <shaggy@...nel.org>,  Anton Altaparmakov <anton@...era.com>,  Pavel Machek <pavel@....cz>,  Marek Behún <marek.behun@....cz>,  Christoph Hellwig <hch@...radead.org>
Subject: Re: [RFC PATCH 01/20] fat: Fix iocharset=utf8 mount option
From: OGAWA Hirofumi <hirofumi@...l.parknet.co.jp>
Gcc: nnimap+ibmpc.myhome.or.jp:Sent
--text follows this line--
Pali Rohár <pali@...nel.org> writes:

>> This change is not equivalent to utf8=1. In the case of utf8=1, vfat
>> uses iocharset's conversion table and it can handle more than ascii.
>> 
>> So this patch is incompatible changes, and handles less chars than
>> utf8=1. So I think this is clean though, but this would be regression
>> for user of utf8=1.
>
> I do not think so... But please correct me, as this code around is mess.
>
> Without this change when utf8=1 is set then iocharset= encoding is used
> for case-insensitivity implementation (toupper / tolower conversion).
> For all other parts are use correct utf8* conversion functions.
>
> But you use touppper / tolower functions from iocharset= encoding on
> stream of utf8 bytes then you either get identity or some unpredictable
> garbage in utf8. So when comparing two (different) non-ASCII filenames
> via this method you in most cases get that filenames are different.
> Because converting their utf8 bytes via toupper / tolower functions from
> iocharset= encoding results in two different byte sequences in most
> cases. Even for two utf8 case-insensitive same strings.
>
> But you can play with it and I guess it is possible to find two
> different utf8 strings which after toupper / tolower conversion from
> some iocharset= encoding would lead to same byte sequence.
>
> This patch uses for utf8 tolower / touppser function simple 7-bit
> tolower / toupper ascii function. And so for 7-bit ascii file names
> there is no change.
>
> So this patch changes behavior when comparing non 7-bit ascii file
> names, but only in cases when previously two different file names were
> marked as same. As now they are marked correctly as different. So this
> is changed behavior, but I guess it is bug fix which is needed.
> If you want I can put this change into separate patch.
>
> Issue that two case-insensitive same files are marked as different is
> not changed by this patch and therefore this issue stay here.

OK, sure. utf8 looks like broken than I was thinking (although user can
use iocharset=ascii and utf8=1 for this). The code might be better to
clean up a bit more though, looks like good basically.

One thing, please update FAT_DEFAULT_IOCHARSET help in Kconfig and
Documentation/filesystems/vfat.rst (with new warning about iocharset=utf8).

Thanks.
-- 
OGAWA Hirofumi <hirofumi@...l.parknet.co.jp>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ