Message-ID: <20200120214046.f6uq7rlih7diqahz@pali>
Date:   Mon, 20 Jan 2020 22:40:46 +0100
From:   Pali Rohár <pali.rohar@...il.com>
To:     OGAWA Hirofumi <hirofumi@...l.parknet.co.jp>
Cc:     linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        "Theodore Y. Ts'o" <tytso@....edu>,
        Namjae Jeon <linkinjeon@...il.com>,
        Gabriel Krisman Bertazi <krisman@...labora.com>
Subject: Re: vfat: Broken case-insensitive support for UTF-8
On Monday 20 January 2020 21:07:12 OGAWA Hirofumi wrote:
> Pali Rohár <pali.rohar@...il.com> writes:
> 
> >> To be perfect, the table would have to emulate what Windows use. It can
> >> be unicode standard, or something other.
> >
> > Windows FAT32 implementation (fastfat.sys) is opensource. So it should
> > be possible to inspect code and figure out how it is working.
> >
> > I will try to look at it.
> 
> I don't think the conversion library is not in fs driver though,
> checking implement itself would be good.

Ok, I did some research. It took me longer than I expected, as a lot of
this is undocumented and the relevant information is hard to find.

So... fastfat.sys uses the ntos function RtlUpcaseUnicodeString(), which
takes a UTF-16 string and returns an upper-case UTF-16 string. There is
no mapping table in the fastfat.sys driver itself.

RtlUpcaseUnicodeString() is an ntos kernel function, and from my research
it seems that it uses only the conversion table stored in the file
l_intl.nls (from c:\windows\system32).
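
To make the idea concrete, here is a minimal sketch of a purely
table-driven upcase over UTF-16 code units. It assumes a simple
one-to-one mapping per 16-bit code unit; whether
RtlUpcaseUnicodeString() behaves exactly like this is an assumption.
SAMPLE_UPCASE and all function names are hypothetical, and the table is
a tiny hand-written sample, not data taken from l_intl.nls.

```python
# Hypothetical sample table: UTF-16 code unit -> upper-case code unit.
# Illustrative values only; NOT extracted from l_intl.nls.
SAMPLE_UPCASE = {
    0x0061: 0x0041,  # a -> A
    0x00E9: 0x00C9,  # e with acute -> E with acute
    0x00FF: 0x0178,  # y with diaeresis -> Y with diaeresis
}


def to_units(text):
    """Encode a Python string into a list of 16-bit UTF-16 code units."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def from_units(units):
    """Decode a list of 16-bit UTF-16 code units back into a string."""
    return b"".join(u.to_bytes(2, "little") for u in units).decode("utf-16-le")


def upcase_utf16(units, table):
    """Map each code unit through the table; unmapped units pass through."""
    return [table.get(u, u) for u in units]
```

With this sample table, "z" is left unchanged because it has no entry,
which shows why the result depends entirely on the table contents.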

The Wine project describes this file as "unicode casing tables" and seems
to be able to parse its format. Moreover, Wine distributes its own
version of this file, which appears to be generated from the official
Unicode UnicodeData.txt by the Perl script make_unicode (part of Wine).

So the question is... how much does MS change the l_intl.nls file
between released Windows versions?

I will try to decode the format of l_intl.nls and compare its data
across several Windows versions.
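
Once the tables are decoded, the comparison itself could look roughly
like the sketch below. It assumes each table has already been turned
into a dict from code point to upper-case code point; the decoding of
l_intl.nls is not shown because its format is not yet known here. The
function name and the two example tables are hypothetical.

```python
def diff_upcase_tables(old, new):
    """Return {code point: (old mapping, new mapping)} for differing entries.

    A code point missing from a table is treated as mapping to itself.
    """
    changed = {}
    for cp in sorted(set(old) | set(new)):
        a = old.get(cp, cp)
        b = new.get(cp, cp)
        if a != b:
            changed[cp] = (a, b)
    return changed


# Made-up example data, standing in for tables from two Windows versions.
table_a = {0x0061: 0x0041, 0x00FF: 0x0178}
table_b = {0x0061: 0x0041, 0x00FF: 0x00FF, 0x01C5: 0x01C4}

for cp, (a, b) in diff_upcase_tables(table_a, table_b).items():
    print(f"U+{cp:04X}: U+{a:04X} -> U+{b:04X}")
```

An empty result would mean the two versions upcase identically, at
least for one-to-one mappings.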

Can we reuse the upper-case mapping table from that file?

-- 
Pali Rohár
pali.rohar@...il.com