Date: Tue, 4 Jun 2024 00:13:04 +0200
From: Erhard Furtner <erhard_f@...lbox.org>
To: Yu Zhao <yuzhao@...gle.com>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
 linuxppc-dev@...ts.ozlabs.org, Yosry Ahmed <yosryahmed@...gle.com>
Subject: Re: kswapd0: page allocation failure: order:0,
 mode:0x820(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0 (Kernel
 v6.5.9, 32bit ppc)

On Sun, 2 Jun 2024 20:03:32 +0200
Erhard Furtner <erhard_f@...lbox.org> wrote:

> On Sat, 1 Jun 2024 00:01:48 -0600
> Yu Zhao <yuzhao@...gle.com> wrote:
> 
> > The OOM kills on both kernel versions seem to be reasonable to me.
> > 
> > Your system has 2GB memory and it uses zswap with zsmalloc (which is
> > good since it can allocate from the highmem zone) and zstd/lzo (which
> > doesn't matter much). Somehow -- I couldn't figure out why -- it
> > splits the 2GB into a 0.25GB DMA zone and a 1.75GB highmem zone:
> > 
> > [    0.000000] Zone ranges:
> > [    0.000000]   DMA      [mem 0x0000000000000000-0x000000002fffffff]
> > [    0.000000]   Normal   empty
> > [    0.000000]   HighMem  [mem 0x0000000030000000-0x000000007fffffff]
> > 
> > The kernel can't allocate from the highmem zone -- only userspace and
> > zsmalloc can. OOM kills were due to the low memory conditions in the
> > DMA zone where the kernel itself failed to allocate from.
> > 
> > Do you know a kernel version that doesn't have OOM kills while running
> > the same workload? If so, could you send that .config to me? If not,
> > could you try disabling CONFIG_HIGHMEM? (It might not help but I'm out
> > of ideas at the moment.)

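(Side note on the allocation constraint described above: GFP_ATOMIC does not carry __GFP_HIGHMEM, so such requests can only be satisfied from the lowmem zones, and with ZONE_NORMAL empty on this machine they all land in the 0.25GB DMA zone. The snippet below is a simplified, standalone illustration of that zone selection; the flag value and helper name are stand-ins, not the kernel's actual gfp_zone() implementation.)

/* Simplified, standalone illustration of lowmem-only zone selection.
 * The flag value and this helper are stand-ins; the real logic lives
 * in the kernel's gfp_zone()/zonelist handling.
 */
enum zone_type { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM };

#define __GFP_HIGHMEM_ILLUSTRATIVE 0x02u   /* placeholder bit, not the real value */

static enum zone_type highest_allowed_zone(unsigned int gfp_mask)
{
        if (gfp_mask & __GFP_HIGHMEM_ILLUSTRATIVE)
                return ZONE_HIGHMEM;    /* userspace and zsmalloc can go here */
        return ZONE_NORMAL;             /* empty on this box -> falls back to ZONE_DMA */
}
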
Ok, the bisect I did actually revealed something meaningful:

 # git bisect good
b8cf32dc6e8c75b712cbf638e0fd210101c22f17 is the first bad commit
commit b8cf32dc6e8c75b712cbf638e0fd210101c22f17
Author: Yosry Ahmed <yosryahmed@...gle.com>
Date:   Tue Jun 20 19:46:44 2023 +0000

    mm: zswap: multiple zpools support
    
    Support using multiple zpools of the same type in zswap, for concurrency
    purposes.  A fixed number of 32 zpools is suggested by this commit, which
    was determined empirically.  It can be later changed or made into a config
    option if needed.
    
    On a setup with zswap and zsmalloc, comparing a single zpool to 32 zpools
    shows improvements in the zsmalloc lock contention, especially on the swap
    out path.
    
    The following shows the perf analysis of the swapout path when 10
    workloads are simultaneously reclaiming and refaulting tmpfs pages.  There
    are some improvements on the swap in path as well, but less significant.
    
    1 zpool:
    
     |--28.99%--zswap_frontswap_store
           |
           <snip>
           |
           |--8.98%--zpool_map_handle
           |     |
           |      --8.98%--zs_zpool_map
           |           |
           |            --8.95%--zs_map_object
           |                 |
           |                  --8.38%--_raw_spin_lock
           |                       |
           |                        --7.39%--queued_spin_lock_slowpath
           |
           |--8.82%--zpool_malloc
           |     |
           |      --8.82%--zs_zpool_malloc
           |           |
           |            --8.80%--zs_malloc
           |                 |
           |                 |--7.21%--_raw_spin_lock
           |                 |     |
           |                 |      --6.81%--queued_spin_lock_slowpath
           <snip>
    
    32 zpools:
    
     |--16.73%--zswap_frontswap_store
           |
           <snip>
           |
           |--1.81%--zpool_malloc
           |     |
           |      --1.81%--zs_zpool_malloc
           |           |
           |            --1.79%--zs_malloc
           |                 |
           |                  --0.73%--obj_malloc
           |
           |--1.06%--zswap_update_total_size
           |
           |--0.59%--zpool_map_handle
           |     |
           |      --0.59%--zs_zpool_map
           |           |
           |            --0.57%--zs_map_object
           |                 |
           |                  --0.51%--_raw_spin_lock
           <snip>
    
    Link: https://lkml.kernel.org/r/20230620194644.3142384-1-yosryahmed@google.com
    Signed-off-by: Yosry Ahmed <yosryahmed@...gle.com>
    Suggested-by: Yu Zhao <yuzhao@...gle.com>
    Acked-by: Chris Li (Google) <chrisl@...nel.org>
    Reviewed-by: Nhat Pham <nphamcs@...il.com>
    Tested-by: Nhat Pham <nphamcs@...il.com>
    Cc: Dan Streetman <ddstreet@...e.org>
    Cc: Domenico Cerasuolo <cerasuolodomenico@...il.com>
    Cc: Johannes Weiner <hannes@...xchg.org>
    Cc: Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
    Cc: Seth Jennings <sjenning@...hat.com>
    Cc: Vitaly Wool <vitaly.wool@...sulko.com>
    Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>

 mm/zswap.c | 81 +++++++++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 54 insertions(+), 27 deletions(-)

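For reference, the commit's approach amounts to keeping an array of zpools per zswap pool and hashing each entry onto one of them, roughly along these lines (a minimal sketch: ZSWAP_NR_ZPOOLS and zswap_find_zpool follow the patch's naming, but the surrounding types and the hash are stand-ins so the example is self-contained, not the verbatim mm/zswap.c code):

#include <stdint.h>

#define ZSWAP_NR_ZPOOLS 32              /* fixed count, chosen empirically per the commit */

struct zpool;                           /* opaque backing pool (zsmalloc here) */

struct zswap_pool {
        struct zpool *zpools[ZSWAP_NR_ZPOOLS];
};

struct zswap_entry {
        struct zswap_pool *pool;
        /* ... swap offset, compressed length, handle, ... */
};

/* Spread entries across the zpools so that concurrent stores no longer
 * serialize on a single zsmalloc pool lock. */
static struct zpool *zswap_find_zpool(struct zswap_entry *entry)
{
        unsigned int i = 0;

        if (ZSWAP_NR_ZPOOLS > 1)
                i = ((uintptr_t)entry >> 4) % ZSWAP_NR_ZPOOLS;  /* stand-in for hash_ptr() */

        return entry->pool->zpools[i];
}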

'bad' bisect steps were the ones where the "kswapd0: page allocation failure:" showed up while running the workload; 'good' steps were the cases where I only got the kernel's OOM reaper killing the workload. In the good cases the machine stayed usable via VNC; in the bad cases the machine crashed and rebooted >80% of the time shortly after the issue appeared in dmesg (via netconsole). I triple-checked the good cases to be sure only the OOM reaper showed up and not the kswapd0 page allocation failure.

Bisect.log attached.

Regards,
Erhard

