lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Sat, 29 Jun 2024 23:41:46 +0800
From: "Ma, Yu" <yu.ma@...el.com>
To: Jan Kara <jack@...e.cz>, Mateusz Guzik <mjguzik@...il.com>
Cc: Christian Brauner <brauner@...nel.org>, viro@...iv.linux.org.uk,
 edumazet@...gle.com, linux-fsdevel@...r.kernel.org,
 linux-kernel@...r.kernel.org, pan.deng@...el.com, tianyou.li@...el.com,
 tim.c.chen@...el.com, tim.c.chen@...ux.intel.com, yu.ma@...el.com
Subject: Re: [PATCH v2 1/3] fs/file.c: add fast path in alloc_fd()


On 6/28/2024 5:12 PM, Jan Kara wrote:
> On Thu 27-06-24 21:59:12, Mateusz Guzik wrote:
>> On Thu, Jun 27, 2024 at 8:27 PM Ma, Yu <yu.ma@...el.com> wrote:
>>> 2. For fast path implementation, the essential and simple point is to
>>> directly return an available bit if there is free bit in [0-63]. I'd
>>> emphasize that it does not only improve low number of open fds (even it
>>> is the majority case on system as Honza agreed), but also improve the
>>> cases that lots of fds open/close frequently with short task (as per the
>>> algorithm, lower bits will be prioritized to allocate after being
>>> recycled). Not only blogbench, a synthetic benchmark, but also the
>>> realistic scenario as claimed in f3f86e33dc3d("vfs: Fix pathological
>>> performance case for __alloc_fd()"), which literally introduced this
>>> 2-levels bitmap searching algorithm to vfs as we see now.
>> I don't understand how using next_fd instead is supposed to be inferior.
>>
>> Maybe I should clarify that by API contract the kernel must return the
>> lowest free fd it can find. To that end it maintains the next_fd field
>> as a hint to hopefully avoid some of the search work.
>>
>> In the stock kernel the first thing done in alloc_fd is setting it as
>> a starting point:
>>          fdt = files_fdtable(files);
>>          fd = start;
>>          if (fd < files->next_fd)
>>                  fd = files->next_fd;
>>
>> that is all the calls which come here with 0 start their search from
>> next_fd position.
> Yup.
>
>> Suppose you implemented the patch as suggested by me and next_fd fits
>> the range of 0-63. Then you get the benefit of lower level bitmap
>> check just like in the patch you submitted, but without having to
>> first branch on whether you happen to be in that range.
>>
>> Suppose next_fd is somewhere higher up, say 80. With your general
>> approach the optimization wont be done whatsoever or it will be
>> attempted at the 0-63 range when it is an invariant it finds no free
>> fds.
>>
>> With what I'm suggesting the general idea of taking a peek at the
>> lower level bitmap can be applied across the entire fd space. Some
>> manual mucking will be needed to make sure this never pulls more than
>> one cacheline, easiest way out I see would be to align next_fd to
>> BITS_PER_LONG for the bitmap search purposes.
> Well, all you need to do is to call:
>
> 	bit = find_next_zero_bit(fdt->open_fds[start / BITS_PER_LONG],
> 				 BITS_PER_LONG, start & (BITS_PER_LONG-1));
> 	if (bit < BITS_PER_LONG)
> 		return bit + (start & ~(BITS_PER_LONG - 1));
>
>
> in find_next_fd(). Not sure if this is what you meant by aligning next_fd
> to BITS_PER_LONG...
>
> 								Honza

So the idea here is to take a peek at the word contains next_fd, return 
directly if get free bit there. Not sure about the efficiency here if 
open/close fd frequently and next_fd is distributed very randomly. Just 
give a quick try for this part of code, seems kernel failed to boot 
up😳, kind of out of expectation ...


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ