Date:   Tue, 25 Jan 2022 14:57:52 -0800
From:   Shakeel Butt <shakeelb@...gle.com>
To:     David Rientjes <rientjes@...gle.com>
Cc:     Jens Axboe <axboe@...nel.dk>,
        Pavel Begunkov <asml.silence@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linux MM <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>, io-uring@...r.kernel.org
Subject: Re: [PATCH] mm: io_uring: allow oom-killer from io_uring_setup

On Tue, Jan 25, 2022 at 10:35 AM David Rientjes <rientjes@...gle.com> wrote:
>
> On Mon, 24 Jan 2022, Shakeel Butt wrote:
>
> > On an overcommitted system running multiple workloads of varying
> > priorities, it is preferable to trigger the oom-killer to kill a
> > low priority workload than to let the high priority workload
> > receive ENOMEMs. On our memory overcommitted systems, we are
> > seeing a lot of ENOMEMs instead of oom-kills because the
> > io_uring_setup() call chain uses the __GFP_NORETRY gfp flag, which
> > avoids the oom-killer. Let's remove it and allow the oom-killer to
> > kill a lower priority job.
> >
>
> What is the size of the allocations that io_mem_alloc() is doing?
>
> If get_order(size) > PAGE_ALLOC_COSTLY_ORDER, then this will fail even
> without __GFP_NORETRY.  To guarantee that workloads do not receive
> ENOMEM, it seems like we'd need to guarantee that allocations going
> through io_mem_alloc() are sufficiently small.
>
> (And if we're really serious about it, then even something like a
> BUILD_BUG_ON().)
>

The test case provided to me, for which the user was seeing ENOMEMs,
was io_uring_setup() with 64 entries (nothing else).
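
For context, here is a minimal sketch of that test case as I understand
it; the exact reproducer was not posted in this thread, so the code
below is my assumption of what "64 entries, nothing else" looks like:

	/* Hypothetical reproducer: plain io_uring_setup() with 64 SQ
	 * entries and default params; under memory pressure this call
	 * was returning ENOMEM rather than triggering the oom-killer. */
	#include <linux/io_uring.h>
	#include <string.h>
	#include <stdio.h>
	#include <sys/syscall.h>
	#include <unistd.h>

	int main(void)
	{
		struct io_uring_params p;

		memset(&p, 0, sizeof(p));
		int fd = syscall(__NR_io_uring_setup, 64, &p);
		if (fd < 0)
			perror("io_uring_setup");
		return fd < 0 ? 1 : 0;
	}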

If I understand the rings_size() calculations correctly, an order-0
allocation was being requested in io_mem_alloc().
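
For reference, io_mem_alloc() as I read it in fs/io_uring.c looks
roughly like the sketch below (treat the exact flag set as my
paraphrase rather than a verbatim copy); the patch simply drops
__GFP_NORETRY from it:

	/* Sketch of the current ring allocator. Dropping __GFP_NORETRY
	 * lets a failing allocation enter the oom-killer path instead
	 * of bubbling ENOMEM up to io_uring_setup(). */
	static void *io_mem_alloc(size_t size)
	{
		gfp_t gfp = GFP_KERNEL | __GFP_ZERO | __GFP_NOWARN |
			    __GFP_COMP | __GFP_ACCOUNT |
			    __GFP_NORETRY;	/* removed by this patch */

		return (void *) __get_free_pages(gfp, get_order(size));
	}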

For order > PAGE_ALLOC_COSTLY_ORDER, maybe we can use
__GFP_RETRY_MAYFAIL. It will at least do more aggressive reclaim,
though I think that is a separate discussion. For this issue, we are
seeing ENOMEMs even for order-0 allocations.
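
Something along these lines, as a rough, untested sketch of that
separate idea (the order check is my assumption of how it could be
wired up, not part of this patch):

	/* Keep small-order allocations oom-killable; for costly orders,
	 * __GFP_RETRY_MAYFAIL retries reclaim more aggressively but
	 * still fails without invoking the oom-killer. */
	static void *io_mem_alloc(size_t size)
	{
		gfp_t gfp = GFP_KERNEL | __GFP_ZERO | __GFP_NOWARN |
			    __GFP_COMP | __GFP_ACCOUNT;
		int order = get_order(size);

		if (order > PAGE_ALLOC_COSTLY_ORDER)
			gfp |= __GFP_RETRY_MAYFAIL;

		return (void *) __get_free_pages(gfp, order);
	}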
