Message-ID: <9c193ad9-801d-a3d3-faf6-3a655a6fa209@huaweicloud.com>
Date: Sat, 6 Jan 2024 14:26:03 +0800
From: Hou Tao <houtao@...weicloud.com>
To: Vivek Goyal <vgoyal@...hat.com>, Matthew Wilcox <willy@...radead.org>
Cc: linux-fsdevel@...r.kernel.org, Miklos Szeredi <miklos@...redi.hu>,
 Stefan Hajnoczi <stefanha@...hat.com>, linux-kernel@...r.kernel.org,
 virtualization@...ts.linux.dev, houtao1@...wei.com
Subject: Re: [PATCH v2] virtiofs: use GFP_NOFS when enqueuing request through
 kworker

Hi Vivek,

On 1/6/2024 5:27 AM, Vivek Goyal wrote:
> On Fri, Jan 05, 2024 at 08:57:55PM +0000, Matthew Wilcox wrote:
>> On Fri, Jan 05, 2024 at 03:41:48PM -0500, Vivek Goyal wrote:
>>> On Fri, Jan 05, 2024 at 08:21:00PM +0000, Matthew Wilcox wrote:
>>>> On Fri, Jan 05, 2024 at 03:17:19PM -0500, Vivek Goyal wrote:
>>>>> On Fri, Jan 05, 2024 at 06:53:05PM +0800, Hou Tao wrote:
>>>>>> From: Hou Tao <houtao1@...wei.com>
>>>>>>
>>>>>> When invoking virtio_fs_enqueue_req() through kworker, both the
>>>>>> allocation of the sg array and the bounce buffer still use GFP_ATOMIC.
>>>>>> Considering the size of both the sg array and the bounce buffer may be
>>>>>> greater than PAGE_SIZE, use GFP_NOFS instead of GFP_ATOMIC to lower the
>>>>>> possibility of memory allocation failure.
>>>>>>
>>>>> What's the practical benefit of this patch. Looks like if memory
>>>>> allocation fails, we keep retrying at interval of 1ms and don't
>>>>> return error to user space.

The motivation for GFP_NOFS comes from another fix proposed for virtiofs
[1]. In that case, when inserting a big kernel module stored on a
cache-disabled virtiofs, the length of the fuse args is large (e.g.,
10MB), and the memory allocation in copy_args_to_argbuf() fails forever.
The proposed fix addresses the problem by limiting the length of data
kept in the fuse args, but the limit is still large (256KB in that
patch), so I think using GFP_NOFS will also help such memory
allocations.

[1]: https://lore.kernel.org/linux-fsdevel/20240103105929.1902658-1-houtao@huaweicloud.com/

>>>> You don't deplete the atomic reserves unnecessarily?
>>> Sounds reasonable. 
>>>
>>> With GFP_NOFS specified, can we still get -ENOMEM? Or will this block
>>> indefinitely till memory can be allocated?
>> If you need the "loop indefinitely" behaviour, that's
>> GFP_NOFS | __GFP_NOFAIL.  If you're actually doing something yourself
>> which can free up memory, this is a bad choice.  If you're just sleeping
>> and retrying, you might as well have the MM do that for you.

Even with GFP_NOFS, I think -ENOMEM is still possible, so the retry
logic is still necessary.
> I probably don't want to wait indefinitely. There might be some cases
> where I might want to return error to user space. For example, if
> virtiofs device has been hot-unplugged, then there is no point in
> waiting indefinitely for memory allocation. Even if memory was allocated,
> soon we will return error to user space with -ENOTCONN. 
>
> We are currently not doing that check after memory allocation failure but
> we probably could as an optimization.

Yes. It seems virtio_fs_enqueue_req() only checks fsvq->connected before
adding the sg list to the virtqueue, so if the virtio device is
hot-unplugged while free memory is low, it may retry unnecessarily.
Even worse, it may hang. I volunteer to post a patch that checks the
connected status after a memory allocation failure, if you are OK with
that.
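
To make the idea concrete, here is a minimal userspace sketch of the
check, not the actual patch. The names fake_vq, fake_alloc(),
enqueue_once() and enqueue_with_retry() are hypothetical; only the
connected flag corresponds to fsvq->connected, malloc() stands in for the
kernel allocation, and the bounded loop stands in for the kworker being
rescheduled after a failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the virtqueue state. */
struct fake_vq {
	bool connected;   /* mirrors the role of fsvq->connected */
	int fail_allocs;  /* how many allocations to fail (simulated low memory) */
};

/* Simulated allocator: fails while fail_allocs is positive. */
static void *fake_alloc(struct fake_vq *vq, size_t len)
{
	if (vq->fail_allocs > 0) {
		vq->fail_allocs--;
		return NULL;
	}
	return malloc(len);
}

/*
 * One enqueue attempt. On allocation failure, look at the connected
 * flag first: retrying only makes sense if the device is still there.
 */
static int enqueue_once(struct fake_vq *vq, size_t len)
{
	void *buf = fake_alloc(vq, len);

	if (!buf)
		return vq->connected ? -ENOMEM : -ENOTCONN;

	if (!vq->connected) {
		free(buf);
		return -ENOTCONN;
	}

	/* The real code would add the sg list to the virtqueue here. */
	free(buf);
	return 0;
}

/* Retry only on -ENOMEM; any other result ends the loop. */
static int enqueue_with_retry(struct fake_vq *vq, size_t len, int max_tries)
{
	int ret = -ENOMEM;

	assert(max_tries > 0);
	for (int i = 0; i < max_tries; i++) {
		ret = enqueue_once(vq, len);
		if (ret != -ENOMEM)
			break;
		/* The kernel path would requeue the work after a short delay. */
	}
	return ret;
}
```

With a disconnected queue the first failed allocation ends the loop with
-ENOTCONN instead of being retried, which is the behaviour I have in mind
for the follow-up patch.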

>
> So this patch looks good to me as it is. Thanks Hou Tao.
>
> Reviewed-by: Vivek Goyal <vgoyal@...hat.com>
>
> Thanks
> Vivek

