Date:   Wed, 6 Dec 2017 17:27:19 -0800
From:   Suren Baghdasaryan <surenb@...gle.com>
To:     Andrew Morton <akpm@...ux-foundation.org>
Cc:     mhocko@...e.com, Johannes Weiner <hannes@...xchg.org>,
        hillf.zj@...baba-inc.com, minchan@...nel.org,
        mgorman@...hsingularity.net, ying.huang@...el.com,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Tim Murray <timmurray@...gle.com>, Todd Kjos <tkjos@...gle.com>
Subject: Re: [PATCH] mm: terminate shrink_slab loop if signal is pending

>
> Some quantification of "quite time consuming" and "delay" would be
> interesting, please.
>

Unfortunately that depends on the implementation of the shrinkers
registered in the system including the ones from drivers. I've
captured traces showing delays of up to 100ms where the process with
pending SIGKILL is in direct memory reclaim and signal handling is
delayed because of that. I realize that it's not the fault of
shrink_slab_lmk() that some shrinkers take a long time to shrink their
slabs (sometimes for justifiable reasons and sometimes because of a bug
that has to be fixed), but this can be a safeguard against such cases.
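
To illustrate the safeguard, here is a minimal userspace sketch, not the
actual patch. It assumes the check sits in the loop that walks the
registered shrinkers. struct fake_shrinker, struct fake_task and
walk_shrinkers() are hypothetical names, the sigkill_pending flag stands
in for the kernel's fatal_signal_pending(current), and any per-shrinker
costs fed to it are made-up numbers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a registered shrinker. */
struct fake_shrinker {
	unsigned long cost_us;   /* pretend time one invocation takes */
	unsigned long freeable;  /* objects it frees when invoked */
};

/* Stand-in for the state fatal_signal_pending(current) would test. */
struct fake_task {
	bool sigkill_pending;
};

struct reclaim_result {
	unsigned long freed;
	unsigned long spent_us;
	size_t shrinkers_run;
};

/*
 * Walk the shrinker list the way a shrink_slab-style loop would.
 * sigkill_after models SIGKILL arriving once that many shrinkers have
 * run. With check_signal set, the walk stops as soon as the signal is
 * pending; without it, every remaining shrinker still runs.
 */
static struct reclaim_result
walk_shrinkers(const struct fake_shrinker *s, size_t n,
	       struct fake_task *task, size_t sigkill_after,
	       bool check_signal)
{
	struct reclaim_result r = {0, 0, 0};

	assert(s != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (i == sigkill_after)
			task->sigkill_pending = true;
		if (check_signal && task->sigkill_pending)
			break;	/* let the task go handle SIGKILL */
		r.freed += s[i].freeable;
		r.spent_us += s[i].cost_us;
		r.shrinkers_run++;
	}
	return r;
}
```

With the check enabled, the time spent after the signal arrives is
bounded by the one shrinker already running rather than by the sum of
all remaining ones.
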
A couple of shrinker examples that I found most time consuming follow
(most of that 100ms delay comes from the first two):

https://patchwork.kernel.org/patch/10096641/
The patch fixes the dm-bufio shrinker, which under certain conditions
reclaims only one buffer per scan, making the shrinking process very
inefficient.

https://android.googlesource.com/kernel/msm/+/android-7.1.0_r0.2/drivers/gpu/msm/kgsl_pool.c#420
This example is from a driver where the shrinker returns 0 instead of
SHRINK_STOP when it's unable to reclaim any more. As a result, when
total_scan in do_shrink_slab() is large, this causes multiple
scan_objects() calls with no memory being reclaimed. A patch for this
one is under review by the owners.
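
The effect of returning 0 versus SHRINK_STOP can be sketched in
userspace as below. This is a simplified model of the batching loop in
do_shrink_slab(), not the kernel code: struct fake_pool,
fake_scan_objects() and fake_do_shrink_slab() are hypothetical names,
and the loop condition is reduced to total_scan >= batch_size.
SHRINK_STOP is defined as ~0UL, matching the kernel's value.

```c
#include <assert.h>

#define SHRINK_STOP (~0UL)	/* same value the kernel uses */

/* Hypothetical shrinker state: an object pool plus a call counter. */
struct fake_pool {
	unsigned long objects;     /* objects still reclaimable */
	unsigned long scan_calls;  /* how often scan_objects ran */
	int stop_when_empty;       /* 1: return SHRINK_STOP, 0: return 0 */
};

/* Stand-in for a driver's scan_objects() callback. */
static unsigned long fake_scan_objects(struct fake_pool *p,
				       unsigned long nr_to_scan)
{
	unsigned long n;

	p->scan_calls++;
	if (p->objects == 0)
		return p->stop_when_empty ? SHRINK_STOP : 0;
	n = nr_to_scan < p->objects ? nr_to_scan : p->objects;
	p->objects -= n;
	return n;
}

/* Simplified batching loop: keeps calling until total_scan is used up. */
static unsigned long fake_do_shrink_slab(struct fake_pool *p,
					 unsigned long total_scan,
					 unsigned long batch_size)
{
	unsigned long freed = 0;

	assert(batch_size > 0);
	while (total_scan >= batch_size) {
		unsigned long ret = fake_scan_objects(p, batch_size);

		if (ret == SHRINK_STOP)
			break;	/* shrinker says further calls are pointless */
		freed += ret;
		total_scan -= batch_size;
	}
	return freed;
}
```

In this model both variants free the same number of objects. The one
returning 0 keeps getting called until total_scan is exhausted, while
the one returning SHRINK_STOP is called once more after the pool is
empty and then left alone.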

A shrinker that seems to be justifiably heavy is super_cache_scan()
in fs/super.c. I have traces where it takes up to 4ms to complete.
