Date: Tue, 21 Nov 2023 00:12:15 -0500
From: "Theodore Ts'o" <tytso@....edu>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Christian Brauner <brauner@...nel.org>,
	Gabriel Krisman Bertazi <krisman@...e.de>, viro@...iv.linux.org.uk,
	linux-f2fs-devel@...ts.sourceforge.net, ebiggers@...nel.org,
	linux-fsdevel@...r.kernel.org, jaegeuk@...nel.org,
	linux-ext4@...r.kernel.org
Subject: Re: [f2fs-dev] [PATCH v6 0/9] Support negative dentries on
 case-insensitive ext4 and f2fs

On Mon, Nov 20, 2023 at 07:03:13PM -0800, Linus Torvalds wrote:
> On Mon, 20 Nov 2023 at 18:29, Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
> >
> > It's a bit complicated, yes. But no, doing things one unicode
> > character at a time is just bad bad bad.
>
> Put another way: the _point_ of UTF-8 is that ASCII is still ASCII.
> It's literally why UTF-8 doesn't suck.
>
> So you can still compare ASCII strings as-is.
>
> No, that doesn't help people who are really using other locales, and
> are actively using complicated characters.
>
> But it very much does mean that you can compare "Bad" and "bad" and
> never ever look at any unicode translation ever.

Yeah, agreed, that would be a nice optimization. However, in the
unfortunate case where (a) it's non-ASCII, and (b) the input string is
non-normalized and/or differs in case, we end up scanning some portion
of the two strings twice; once doing the strcmp, and once doing the
Unicode slow path.

That being said, even when we're dealing with non-ASCII strings, in the
fairly common case where the program is doing a readdir() followed by
an open() or stat(), the filename will be byte-identical and so a
strcmp() will suffice. So I agree that it's a nice optimization.

It'd be interesting to see how much such an optimization would actually
show up in various benchmarks. It'd have to be something that was
really metadata-heavy, or else the filename lookups would get drowned
out.

					- Ted
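
For illustration, the ordering discussed above (byte-identical check first, then an ASCII-only case fold, and only then the Unicode slow path) could look roughly like the userspace sketch below. This is not the ext4/f2fs code. The names ci_name_equal(), ascii_fold(), slow_unicode_equal() and slow_path_calls are hypothetical, and the slow path is a placeholder that only counts how often it is reached. The early "return false" on an ASCII mismatch assumes the real slow path never changes ASCII characters other than folding A-Z to a-z.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Counts slow-path calls so the effect of the fast paths can be observed. */
static unsigned int slow_path_calls;

/*
 * Hypothetical placeholder for the Unicode slow path (normalize and
 * casefold both names, then compare). It only counts the call and
 * reports "no match"; it is not the kernel's implementation.
 */
static bool slow_unicode_equal(const char *a, size_t alen,
			       const char *b, size_t blen)
{
	(void)a; (void)alen; (void)b; (void)blen;
	slow_path_calls++;
	return false;
}

/* Fold ASCII upper case to lower case; every other byte is unchanged. */
static unsigned char ascii_fold(unsigned char c)
{
	return (c >= 'A' && c <= 'Z') ? (unsigned char)(c - 'A' + 'a') : c;
}

/* Case-insensitive name comparison that tries the cheap checks first. */
static bool ci_name_equal(const char *a, size_t alen,
			  const char *b, size_t blen)
{
	size_t i, n = alen < blen ? alen : blen;

	assert(a != NULL && b != NULL);

	/* Fast path 1: byte-identical names (e.g. readdir() then stat()). */
	if (alen == blen && memcmp(a, b, alen) == 0)
		return true;

	/* Fast path 2: fold byte by byte while both sides are ASCII. */
	for (i = 0; i < n; i++) {
		unsigned char ca = (unsigned char)a[i];
		unsigned char cb = (unsigned char)b[i];

		/* A byte with the high bit set is non-ASCII: give up here. */
		if ((ca | cb) & 0x80)
			return slow_unicode_equal(a, alen, b, blen);
		/* Assumption: an ASCII mismatch cannot be fixed up later. */
		if (ascii_fold(ca) != ascii_fold(cb))
			return false;
	}

	/* Same length and every byte matched after ASCII folding. */
	if (alen == blen)
		return true;

	/* Lengths differ; be conservative and let the slow path decide. */
	return slow_unicode_equal(a, alen, b, blen);
}
```

With this sketch, ci_name_equal("Bad", 3, "bad", 3) returns true without touching the slow path, and two byte-identical non-ASCII names are settled by the memcmp() alone. A non-ASCII pair that differs in case is the "scanned twice" situation: the memcmp() fails, the ASCII loop walks the common prefix, and the slow path then starts over from the beginning.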