Message-ID: <d5cc35f6-57a4-adb9-5b32-07c1db7c2a7a@I-love.SAKURA.ne.jp>
Date: Fri, 8 Dec 2017 20:36:16 +0900
From: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
To: Michal Hocko <mhocko@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>
Cc: akpm@...ux-foundation.org, hannes@...xchg.org,
hillf.zj@...baba-inc.com, minchan@...nel.org,
mgorman@...hsingularity.net, ying.huang@...el.com,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
timmurray@...gle.com, tkjos@...gle.com
Subject: Re: [PATCH v2] mm: terminate shrink_slab loop if signal is pending
On 2017/12/08 17:22, Michal Hocko wrote:
> On Thu 07-12-17 17:23:05, Suren Baghdasaryan wrote:
>> Slab shrinkers can be quite time consuming, and when a signal
>> is pending they can delay handling of the signal. If a fatal
>> signal is pending there is no point in shrinking that process
>> since it will be killed anyway.
>
> The thing is that we are _not_ shrinking _that_ process. We are
> shrinking globally shared objects and the fact that the memory pressure
> is so large that the kswapd doesn't keep pace with it means that we have
> to throttle all allocation sites by doing this direct reclaim. I agree
> that expediting a killed task is a good thing in general because such a
> process should free at least some memory.
But doesn't doing direct reclaim mean that allocation requests from threads
which are already fatal_signal_pending() will not succeed unless some memory
is reclaimed (or the thread is selected as an OOM victim)? Won't they just
spin in the "too small to fail" retry loop at full speed in the worst case?
>
>> This change checks for pending
>> fatal signals inside shrink_slab loop and if one is detected
>> terminates this loop early.
>
> This changelog doesn't really address my previous review feedback, I am
> afraid. You should mention more details about problems you are seeing
> and what causes them. If we have a shrinker which takes a considerable
> amount of time then we should be addressing that. If that is not
> possible then it should be documented at least.
Unfortunately, it is possible to get blocked inside shrink_slab() for a very
long time, as in this example from
http://lkml.kernel.org/r/1512705038.7843.6.camel@gmail.com :
----------
[18432.707027] INFO: task Chrome_IOThread:27225 blocked for more than 120 seconds.
[18432.707034]       Not tainted 4.15.0-rc2-amd-vega+ #10
[18432.707039] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[18432.707045] Chrome_IOThread D11304 27225 3654 0x00000000
[18432.707057] Call Trace:
[18432.707070] ? __schedule+0x2e3/0xb90
[18432.707086] ? __lock_page+0xa9/0x180
[18432.707095] schedule+0x2f/0x90
[18432.707102] io_schedule+0x12/0x40
[18432.707109] __lock_page+0xe9/0x180
[18432.707121] ? page_cache_tree_insert+0x130/0x130
[18432.707138] deferred_split_scan+0x2b6/0x300
[18432.707160] shrink_slab.part.47+0x1f8/0x590
[18432.707179] ? percpu_ref_put_many+0x84/0x100
[18432.707197] shrink_node+0x2f4/0x300
[18432.707219] do_try_to_free_pages+0xca/0x350
[18432.707236] try_to_free_pages+0x140/0x350
[18432.707259] __alloc_pages_slowpath+0x43c/0x1080
[18432.707298] __alloc_pages_nodemask+0x3ac/0x430
[18432.707316] alloc_pages_vma+0x7c/0x200
[18432.707331] __handle_mm_fault+0x8a1/0x1230
[18432.707359] handle_mm_fault+0x14c/0x310
[18432.707373] __do_page_fault+0x28c/0x530
[18432.707450] do_page_fault+0x32/0x270
[18432.707470] page_fault+0x22/0x30
----------
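
For completeness, a minimal sketch of the kind of check the quoted changelog
describes ("checks for pending fatal signals inside shrink_slab loop"). This
only illustrates the idea and is not the actual patch; run_one_shrinker() is
a hypothetical stand-in for the existing do_shrink_slab() call:
----------
list_for_each_entry(shrinker, &shrinker_list, list) {
	/*
	 * Proposed early termination: the caller is being killed, so do
	 * not spend more time in (possibly slow) shrinkers such as
	 * deferred_split_scan() above.
	 */
	if (fatal_signal_pending(current))
		break;

	freed += run_one_shrinker(shrinker);	/* stand-in for do_shrink_slab() */
}
----------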