Message-ID: <20210325210159.r565fvfitoqeuykp@ava.usersys.com>
Date:   Thu, 25 Mar 2021 21:01:59 +0000
From:   Aaron Tomlin <atomlin@...hat.com>
To:     Michal Hocko <mhocko@...e.com>
Cc:     linux-mm@...ck.org, akpm@...ux-foundation.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm/page_alloc: try oom if reclaim is unable to make
 forward progress

On Mon 2021-03-22 11:47 +0100, Michal Hocko wrote:
> Costly orders already do have heuristics for the retry in place. Could
> you be more specific what kind of problem you see with those?

If I understand correctly, when the gfp_mask is
GFP_KERNEL | __GFP_RETRY_MAYFAIL in particular, an allocation request will
fail if and only if reclaim is unable to make progress.

The retry logic for costly-order allocations is handled primarily in
should_reclaim_retry(). Looking at that function, we see that the
no-progress counter (no_progress_loops) is always incremented in the
costly-order case, even when "some" progress has been made, as indicated
by the value stored in did_some_progress.

        if (costly_order && !(gfp_mask & __GFP_RETRY_MAYFAIL))
                goto nopage;

        if (should_reclaim_retry(gfp_mask, order, ac, alloc_flags,
                                 did_some_progress > 0, &no_progress_loops))
                goto retry;
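
To make the counter behaviour concrete, here is a small userspace sketch of
how I read the update in should_reclaim_retry(). It is not the kernel code:
update_no_progress_loops() and over_retry_limit() are hypothetical helpers,
the constants are the values I believe mainline uses, and everything else
the real function does (watermark checks, highatomic unreserve, and so on)
is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed values; check them against the tree you are looking at. */
#define PAGE_ALLOC_COSTLY_ORDER 3
#define MAX_RECLAIM_RETRIES     16

/*
 * Hypothetical helper: models only the counter update.
 * The counter is reset only when reclaim made progress AND the order is
 * not costly; otherwise it is incremented, progress or not.
 */
static int update_no_progress_loops(unsigned int order, bool did_some_progress,
				    int *no_progress_loops)
{
	assert(no_progress_loops != NULL);

	if (did_some_progress && order <= PAGE_ALLOC_COSTLY_ORDER)
		*no_progress_loops = 0;
	else
		(*no_progress_loops)++;

	return *no_progress_loops;
}

/* Only the comparison; what the kernel does once it is true is not modeled. */
static bool over_retry_limit(int no_progress_loops)
{
	return no_progress_loops > MAX_RECLAIM_RETRIES;
}
```

With order = 5 the counter grows on every pass even if did_some_progress is
true, whereas with order = 2 a single pass with progress resets it to zero.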

I think that after MAX_RECLAIM_RETRIES consecutive attempts without success,
and when the last known compaction attempt was "skipped", perhaps it would
be better to try the OOM killer or fail the allocation attempt?

I encountered a situation where the value of no_progress_loops was found to
be 31,611,688, i.e. significantly over MAX_RECLAIM_RETRIES; the allocation
order was 5. The gfp_mask had the following bits set:

     #define ___GFP_HIGHMEM          0x02
     #define ___GFP_IO               0x40
     #define ___GFP_FS               0x80
     #define ___GFP_NOWARN           0x200
     #define ___GFP_RETRY_MAYFAIL    0x400
     #define ___GFP_COMP             0x4000
     #define ___GFP_HARDWALL         0x20000
     #define ___GFP_DIRECT_RECLAIM   0x200000
     #define ___GFP_KSWAPD_RECLAIM   0x400000
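
For reference, OR-ing the bits above gives a single mask value. The sketch
below assumes the list is complete and that the bit values are the ones
quoted in this message (they differ between kernel versions).
REPORTED_GFP_MASK, GFP_KERNEL_BITS and mask_has() are names made up for
illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit values copied from the list in this message. */
#define ___GFP_HIGHMEM          0x02u
#define ___GFP_IO               0x40u
#define ___GFP_FS               0x80u
#define ___GFP_NOWARN           0x200u
#define ___GFP_RETRY_MAYFAIL    0x400u
#define ___GFP_COMP             0x4000u
#define ___GFP_HARDWALL         0x20000u
#define ___GFP_DIRECT_RECLAIM   0x200000u
#define ___GFP_KSWAPD_RECLAIM   0x400000u

/* Hypothetical name: the mask reconstructed from the bits listed above. */
#define REPORTED_GFP_MASK (___GFP_HIGHMEM | ___GFP_IO | ___GFP_FS | \
			   ___GFP_NOWARN | ___GFP_RETRY_MAYFAIL | \
			   ___GFP_COMP | ___GFP_HARDWALL | \
			   ___GFP_DIRECT_RECLAIM | ___GFP_KSWAPD_RECLAIM)

/* GFP_KERNEL is (direct reclaim | kswapd reclaim) | IO | FS. */
#define GFP_KERNEL_BITS (___GFP_DIRECT_RECLAIM | ___GFP_KSWAPD_RECLAIM | \
			 ___GFP_IO | ___GFP_FS)

/* Compile-time check that the reported mask covers GFP_KERNEL. */
static_assert((REPORTED_GFP_MASK & GFP_KERNEL_BITS) == GFP_KERNEL_BITS,
	      "reported mask should include all GFP_KERNEL bits");

/* Hypothetical helper: true if every bit in 'bits' is set in 'mask'. */
static bool mask_has(unsigned int mask, unsigned int bits)
{
	return (mask & bits) == bits;
}
```

Under those assumptions the mask works out to 0x6246c2, i.e. GFP_KERNEL
plus __GFP_RETRY_MAYFAIL and the remaining modifier bits.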



-- 
Aaron Tomlin
