Message-ID: <87sgkan57p.fsf@mail.parknet.co.jp>
Date: Mon, 20 Jan 2020 13:04:42 +0900
From: OGAWA Hirofumi <hirofumi@...l.parknet.co.jp>
To: Pali Rohár <pali.rohar@...il.com>
Cc: linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
"Theodore Y. Ts'o" <tytso@....edu>,
Namjae Jeon <linkinjeon@...il.com>,
Gabriel Krisman Bertazi <krisman@...labora.com>
Subject: Re: vfat: Broken case-insensitive support for UTF-8
Pali Rohár <pali.rohar@...il.com> writes:
> Which means that fat_name_match(), vfat_hashi() and vfat_cmpi() are
> broken for vfat in UTF-8 mode.
Right. It is a known issue.
> I was thinking how to fix it, and the only possible way is to write a
> uni_tolower() function which takes one Unicode code point and returns
> lowercase of input's Unicode code point. We cannot do any Unicode
> normalization as VFAT specification does not say anything about it and
> MS reference fastfat.sys implementation does not do it either.
>
> So, what would be the best option for implementing that function?
>
> unicode_t uni_tolower(unicode_t u);
>
> Could the new fs/unicode code help with it? Or is it too tied to NFD
> normalization and therefore cannot be easily used or extended?
To be perfect, the table would have to emulate what Windows uses. That
could be the Unicode standard, or something else. And other filesystems
may use a table different from the one Windows uses.
So the table would have to be switchable in a perfect world (if there is
no consensus to use one table). If we use a switchable table, I think it
would be better to keep it in userspace and make it loadable, like
firmware data.
Well, so then it would not be simple work (especially to be perfect).
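To make the "switchable table" idea concrete, here is a minimal userspace
sketch of a per-code-point lowercase lookup that takes the table as a
parameter. Everything in it is an assumption for illustration: the names
uni_tolower_tbl, struct case_table and sample_table are hypothetical, the
four sample mappings are not what Windows or any specification uses, and
unicode_t is assumed to be a 32-bit integer.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t unicode_t;	/* assumption: 32-bit code point type */

/* One simple one-to-one case mapping (no normalization involved). */
struct case_pair {
	unicode_t from;
	unicode_t to;
};

/* A replaceable table; pairs must be sorted by .from for the search. */
struct case_table {
	const struct case_pair *pairs;
	size_t len;
};

/* Tiny illustrative table, NOT a real Windows or Unicode table. */
static const struct case_pair sample_pairs[] = {
	{ 0x00C0, 0x00E0 },	/* LATIN CAPITAL A WITH GRAVE */
	{ 0x00C9, 0x00E9 },	/* LATIN CAPITAL E WITH ACUTE */
	{ 0x0160, 0x0161 },	/* LATIN CAPITAL S WITH CARON */
	{ 0x0410, 0x0430 },	/* CYRILLIC CAPITAL A */
};

static const struct case_table sample_table = {
	sample_pairs,
	sizeof(sample_pairs) / sizeof(sample_pairs[0]),
};

/* Hypothetical helper: lowercase one code point using table t. */
static unicode_t uni_tolower_tbl(const struct case_table *t, unicode_t u)
{
	size_t lo = 0, hi = t->len;

	/* ASCII fast path, independent of the loaded table. */
	if (u >= 'A' && u <= 'Z')
		return u + ('a' - 'A');

	/* Binary search over the sorted pairs. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (t->pairs[mid].from == u)
			return t->pairs[mid].to;
		if (t->pairs[mid].from < u)
			lo = mid + 1;
		else
			hi = mid;
	}
	return u;	/* no mapping: return the input unchanged */
}
```

Because the table is passed in, a different table (for another filesystem,
or one loaded from userspace) could be swapped in without touching the
lookup code. How such a table would actually be loaded is left open here.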
Also, although it is not directly the same issue, there is a related
issue for case-insensitive lookup. Even if we use some sort of internal
wide char (e.g. 16 bits, as in nls), the dcache holds the name in the
user's encoding (e.g. utf8). So it is inefficient to convert the cached
name to wide char on each access.
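As a rough illustration of that cost, the sketch below compares two UTF-8
names case-insensitively by decoding both, one code point at a time, on
every call. It is a userspace sketch under stated assumptions, not kernel
code: utf8_decode, fold_simple and utf8_name_cmpi are hypothetical names,
the decoder does not reject overlong forms or surrogates, and the folding
only covers ASCII and the Latin-1 capitals.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: decode one UTF-8 sequence from s (at most len
 * bytes) into *out. Returns the number of bytes consumed, or -1 on
 * malformed input. Overlong forms and surrogates are not rejected.
 */
static int utf8_decode(const unsigned char *s, size_t len, uint32_t *out)
{
	uint32_t c;
	int n, i;

	if (len == 0)
		return -1;
	c = s[0];
	if (c < 0x80) {
		*out = c;
		return 1;
	} else if ((c & 0xE0) == 0xC0) {
		n = 2;
		c &= 0x1F;
	} else if ((c & 0xF0) == 0xE0) {
		n = 3;
		c &= 0x0F;
	} else if ((c & 0xF8) == 0xF0) {
		n = 4;
		c &= 0x07;
	} else {
		return -1;
	}
	if (len < (size_t)n)
		return -1;
	for (i = 1; i < n; i++) {
		if ((s[i] & 0xC0) != 0x80)
			return -1;
		c = (c << 6) | (s[i] & 0x3F);
	}
	*out = c;
	return n;
}

/* Illustrative folding only: ASCII plus Latin-1 capitals (not U+00D7). */
static uint32_t fold_simple(uint32_t u)
{
	if (u >= 'A' && u <= 'Z')
		return u + 0x20;
	if (u >= 0xC0 && u <= 0xDE && u != 0xD7)
		return u + 0x20;
	return u;
}

/* Returns 0 if the names match ignoring case, nonzero otherwise. */
static int utf8_name_cmpi(const unsigned char *a, size_t alen,
			  const unsigned char *b, size_t blen)
{
	while (alen > 0 && blen > 0) {
		uint32_t ua, ub;
		int na = utf8_decode(a, alen, &ua);
		int nb = utf8_decode(b, blen, &ub);

		if (na < 0 || nb < 0)
			return 1;	/* treat malformed input as a mismatch */
		if (fold_simple(ua) != fold_simple(ub))
			return 1;
		a += na;
		alen -= (size_t)na;
		b += nb;
		blen -= (size_t)nb;
	}
	return alen != 0 || blen != 0;
}
```

This avoids a separate wide-char buffer, but the cached name is still
decoded again on every comparison, which is the per-access work described
above.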
The relatively recent EXT4 case-insensitive support may have tackled
this, though I have not checked it yet.
> The new exfat code, which is under review and hopefully will be merged,
> contains its own Unicode upcase table (as defined by the exfat
> specification), so as exfat is similar to FAT32, maybe reusing it would
> be a better option?
exfat just puts a case conversion table in the fs. So I don't think it
helps fatfs.
Thanks.
--
OGAWA Hirofumi <hirofumi@...l.parknet.co.jp>