Message-ID: <TY1PR01MB15782019FA3094015950830590C70@TY1PR01MB1578.jpnprd01.prod.outlook.com>
Date: Fri, 3 Apr 2020 02:18:15 +0000
From: "Kohada.Tetsuhiro@...MitsubishiElectric.co.jp"
<Kohada.Tetsuhiro@...MitsubishiElectric.co.jp>
To: "'pali@...nel.org'" <pali@...nel.org>
CC: "'linux-fsdevel@...r.kernel.org'" <linux-fsdevel@...r.kernel.org>,
"'linux-kernel@...r.kernel.org'" <linux-kernel@...r.kernel.org>,
"'namjae.jeon@...sung.com'" <namjae.jeon@...sung.com>,
"'sj1557.seo@...sung.com'" <sj1557.seo@...sung.com>,
"'viro@...iv.linux.org.uk'" <viro@...iv.linux.org.uk>
Subject: Re: [PATCH 1/4] exfat: Simplify exfat_utf8_d_hash() for code points
above U+FFFF
> I guess it was designed for 8bit types, not for long (64bit types) and
> I'm not sure how effective it is even for 16bit types for which it is
> already used.
In partial_name_hash(), when an 8-bit or 16-bit value is passed as c,
the upper 8-12 bits of the result tend to be 0.
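A rough user-space sketch of this effect, for a single step starting from prevhash = 0. The 32-bit model partial_name_hash32() and the helper max_single_step() are hypothetical names used only for illustration; the kernel function operates on unsigned long.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit model of the quoted partial_name_hash() (assumption: the kernel
 * version uses unsigned long, which is 64 bits wide on many targets). */
uint32_t partial_name_hash32(uint32_t c, uint32_t prevhash)
{
	return (prevhash + (c << 4) + (c >> 4)) * 11;
}

/* Hypothetical helper: largest hash produced by one step with prevhash = 0
 * over all inputs c in [0, limit). */
uint32_t max_single_step(uint32_t limit)
{
	uint32_t c, h, max = 0;

	/* Inputs up to 16 bits keep the arithmetic free of wraparound. */
	assert(limit > 0 && limit <= 0x10000);

	for (c = 0; c < limit; c++) {
		h = partial_name_hash32(c, 0);
		if (h > max)
			max = h;
	}
	return max;
}
```

With limit = 0x10000 (all 16-bit inputs) the maximum stays below 2^24, so the top 8 bits of a 32-bit hash are never set by the first step. With limit = 0x100 (all 8-bit inputs) it stays below 2^16.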
> So question is, what should we do for either 21bit number (one Unicode
> code point = equivalent of UTF-32) or for sequence of 16bit numbers
> (UTF-16)?
If you want an unbiased hash value from an 8-bit or 16-bit input,
the hash32() function is a good choice.
ex1: Pre-mix the value with hash32() before calling partial_name_hash().
    hash = partial_name_hash(hash32(val16, 32), hash);
ex2: Use hash32() directly.
    hash += hash32(val16, 32);
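A user-space sketch of ex1 and ex2 applied to a sequence of UTF-16 code units. It assumes that hash32() here refers to the kernel's hash_32() from include/linux/hash.h (a multiply by GOLDEN_RATIO_32 followed by a right shift). The names hash_32_sketch(), partial_name_hash32(), hash_utf16_premix() and hash_utf16_sum() are hypothetical and not kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GOLDEN_RATIO_32 0x61C88647u

/* Sketch modelled on the kernel's hash_32(): keep the top `bits` bits
 * of val * GOLDEN_RATIO_32 (the product wraps modulo 2^32). */
uint32_t hash_32_sketch(uint32_t val, unsigned int bits)
{
	assert(bits >= 1 && bits <= 32);
	return (uint32_t)(val * GOLDEN_RATIO_32) >> (32 - bits);
}

/* 32-bit model of the quoted partial_name_hash(). */
uint32_t partial_name_hash32(uint32_t c, uint32_t prevhash)
{
	return (prevhash + (c << 4) + (c >> 4)) * 11;
}

/* ex1: pre-mix each code unit, then fold it in with partial_name_hash. */
uint32_t hash_utf16_premix(const uint16_t *s, size_t n)
{
	uint32_t hash = 0;
	size_t i;

	for (i = 0; i < n; i++)
		hash = partial_name_hash32(hash_32_sketch(s[i], 32), hash);
	return hash;
}

/* ex2: add the mixed code units directly. */
uint32_t hash_utf16_sum(const uint16_t *s, size_t n)
{
	uint32_t hash = 0;
	size_t i;

	for (i = 0; i < n; i++)
		hash += hash_32_sketch(s[i], 32);
	return hash;
}
```

In this sketch even a small input such as 1 sets high bits (hash_32_sketch(1, 32) is 0x61C88647). One thing to keep in mind with ex2 as written here: plain addition is commutative, so two names made of the same code units in a different order get the same hash.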
> partial_name_hash(unsigned long c, unsigned long prevhash)
> {
> return (prevhash + (c << 4) + (c >> 4)) * 11;
> }
Another option is to replace the body of partial_name_hash() with:
    return prevhash + hash32(c, 32);