linux-kernel - Re: [PATCH] doc: cgroup: update note about conditions when oom killer is invoked


Message-ID: <0ddb8e58-5bfd-7754-6979-4276acf5b4c8@yandex-team.ru>
Date:   Mon, 11 May 2020 12:34:00 +0300
From:   Konstantin Khlebnikov <khlebnikov@...dex-team.ru>
To:     Michal Hocko <mhocko@...nel.org>
Cc:     linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        cgroups@...r.kernel.org, Roman Gushchin <guro@...com>
Subject: Re: [PATCH] doc: cgroup: update note about conditions when oom killer
 is invoked



On 11/05/2020 11.39, Michal Hocko wrote:
> On Fri 08-05-20 17:16:29, Konstantin Khlebnikov wrote:
>> Starting from v4.19 commit 29ef680ae7c2 ("memcg, oom: move out_of_memory
>> back to the charge path") cgroup oom killer is no longer invoked only from
>> page faults. Now it implements the same semantics as global OOM killer:
>> allocation context invokes OOM killer and keeps retrying until success.
>>
>> Signed-off-by: Konstantin Khlebnikov <khlebnikov@...dex-team.ru>
> 
> Acked-by: Michal Hocko <mhocko@...e.com>
> 
>> ---
>>   Documentation/admin-guide/cgroup-v2.rst |   17 ++++++++---------
>>   1 file changed, 8 insertions(+), 9 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
>> index bcc80269bb6a..1bb9a8f6ebe1 100644
>> --- a/Documentation/admin-guide/cgroup-v2.rst
>> +++ b/Documentation/admin-guide/cgroup-v2.rst
>> @@ -1172,6 +1172,13 @@ PAGE_SIZE multiple when read back.
>>   	Under certain circumstances, the usage may go over the limit
>>   	temporarily.
>>   
>> +	In default configuration regular 0-order allocation always
>> +	succeed unless OOM killer choose current task as a victim.
>> +
>> +	Some kinds of allocations don't invoke the OOM killer.
>> +	Caller could retry them differently, return into userspace
>> +	as -ENOMEM or silently ignore in cases like disk readahead.
> 
> I would probably add -EFAULT but the less error codes we document the
> better.

Yeah, EFAULT was the most obscure result of a memory shortage.
Fortunately, with the new behaviour this shouldn't happen often.

Actually, where is it still possible? THP always falls back to 0-order.
I mean, EFAULT could arise inside the kernel only if the task has been
killed, so nobody would see it.

> 
>> +
>>   	This is the ultimate protection mechanism.  As long as the
>>   	high limit is used and monitored properly, this limit's
>>   	utility is limited to providing the final safety net.
>> @@ -1228,17 +1235,9 @@ PAGE_SIZE multiple when read back.
>>   		The number of time the cgroup's memory usage was
>>   		reached the limit and allocation was about to fail.
>>   
>> -		Depending on context result could be invocation of OOM
>> -		killer and retrying allocation or failing allocation.
>> -
>> -		Failed allocation in its turn could be returned into
>> -		userspace as -ENOMEM or silently ignored in cases like
>> -		disk readahead.  For now OOM in memory cgroup kills
>> -		tasks iff shortage has happened inside page fault.
>> -
>>   		This event is not raised if the OOM killer is not
>>   		considered as an option, e.g. for failed high-order
>> -		allocations.
>> +		allocations or if caller asked to not retry attempts.
>>   
>>   	  oom_kill
>>   		The number of processes belonging to this cgroup
>
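
For readers who want to watch the "oom" and "oom_kill" counters discussed in the patch above, here is a minimal sketch that parses a cgroup v2 memory.events file and reports how the counters changed between two snapshots. It assumes the file uses flat "key value" lines; the helper names, the sample values, and the cgroup path in the usage note are hypothetical and only serve as illustration.

```python
from pathlib import Path


def parse_memory_events(text):
    """Parse flat "key value" lines, as found in cgroup v2 memory.events."""
    events = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        key, value = parts
        events[key] = int(value)
    return events


def read_memory_events(cgroup_dir):
    """Read and parse memory.events from a cgroup directory.

    The directory is supplied by the caller; nothing here assumes a
    particular mount point.
    """
    return parse_memory_events(Path(cgroup_dir, "memory.events").read_text())


def events_delta(before, after):
    """Return per-key counter increases between two snapshots."""
    return {key: after[key] - before.get(key, 0) for key in after}


if __name__ == "__main__":
    # Sample values are made up for illustration only.
    before = parse_memory_events("low 0\nhigh 12\nmax 3\noom 2\noom_kill 1\n")
    after = parse_memory_events("low 0\nhigh 15\nmax 5\noom 3\noom_kill 2\n")
    print(events_delta(before, after))
```

On a real system one would call read_memory_events() twice on the same cgroup directory (for example a hypothetical /sys/fs/cgroup/mygroup) and compare the results. A rise in "oom" without a matching rise in "oom_kill" suggests the limit was hit but no task was killed for it.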