linux-kernel - Re: [QUESTION] oom killed the key system process triggered by a bad process alloc memory with MAP


Message-ID: <2f368ad4-6a7f-798d-11c1-369eed757bb0@huawei.com>
Date:   Mon, 1 Nov 2021 19:16:36 +0800
From:   Yongqiang Liu <liuyongqiang13@...wei.com>
To:     Michal Hocko <mhocko@...e.com>
CC:     <rientjes@...gle.com>, <linux-mm@...ck.org>,
        <linux-kernel@...r.kernel.org>,
        <penguin-kernel@...ove.sakura.ne.jp>,
        "Wangkefeng (OS Kernel Lab)" <wangkefeng.wang@...wei.com>
Subject: Re: [QUESTION] oom killed the key system process triggered by a bad
 process alloc memory with MAP_LOCKED


On 2021/11/1 16:24, Michal Hocko wrote:
> Hi,
>
> On Mon 01-11-21 16:05:50, Yongqiang Liu wrote:
> [...]
>> And we found that after the oom_reaper is done, memory usage is still high:
>>
>> [   45.115685] Out of memory: Killed process 2553 (oom) total-vm:953404kB,
>> anon-rss:947748kB, file-rss:388kB, shmem-rss:0kB, UID:0 pgtables:1896kB
>> oom_score_adj:1000
>> [   45.115739] oom_reaper: reaped process 2553 (oom), now anon-rss:947708kB,
>> file-rss:0kB, shmem-rss:0kB
>>
>> This is because the bad process, which received SIGKILL, is still unlocking
>> its memory on exit, which takes more time. The next OOM is then triggered
>> and kills another system process.
> Yes, this is a known limitation of the oom_reaper based OOM killing.
> __oom_reap_task_mm has to skip over mlocked memory areas because
> munlocking requires some locking (or at least that was the case when the
> oom reaper was introduced) and the primary purpose of the oom_reaper is
> to guarantee forward progress.
>
> Addressing that limitation would require the munlock operation to not
> depend on any locking. I am not sure how much work that would be with
> the current code. Until now this was not a high priority because
> processes with a high mlock limit should be really trusted with their
> memory consumption so they shouldn't be really the primary oom killer
> target.
>
> Are you seeing this problem happening with a real workload or is this
> only triggered with some artificial tests? E.g. LTP oom tests are known
> to trigger this situation but they do not represent any real workload.

I haven't seen it in a real workload yet. It's just a test case.
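
For readers who want a picture of what such a test case might look like, here is a
minimal sketch. It is not the reproducer used in this thread, which was not posted.
The helper name map_locked() and the fill byte are assumptions made for
illustration. It maps anonymous memory with MAP_LOCKED and touches every byte, so
the pages are resident and mlocked.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not from the thread): map `len` bytes of anonymous
 * memory with MAP_LOCKED and write to all of it.
 * Returns the mapping on success, or NULL with errno set by mmap().
 */
void *map_locked(size_t len)
{
    void *p;

    assert(len > 0);

    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_LOCKED, -1, 0);
    if (p == MAP_FAILED)
        return NULL; /* e.g. EAGAIN when RLIMIT_MEMLOCK is exceeded */

    /* Touch every byte so the locked pages are really populated. */
    memset(p, 0xaa, len);
    return p;
}
```

A test along the lines discussed above would presumably call this with a length
close to the machine's RAM size, from a process with a high mlock limit and
oom_score_adj set to 1000 as in the quoted log. That should only be done on a
disposable test machine, since it can get unrelated processes killed.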

--

Yongqiang Liu