Message-ID: <6AB7FF12-F855-4D5B-9F75-9F7D64823144@didiglobal.com>
Date: Wed, 17 May 2023 10:01:48 +0000
From: 程垲涛 Chengkaitao Cheng
<chengkaitao@...iglobal.com>
To: Yosry Ahmed <yosryahmed@...gle.com>
CC: "tj@...nel.org" <tj@...nel.org>,
"lizefan.x@...edance.com" <lizefan.x@...edance.com>,
"hannes@...xchg.org" <hannes@...xchg.org>,
"corbet@....net" <corbet@....net>,
"mhocko@...nel.org" <mhocko@...nel.org>,
"roman.gushchin@...ux.dev" <roman.gushchin@...ux.dev>,
"shakeelb@...gle.com" <shakeelb@...gle.com>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"brauner@...nel.org" <brauner@...nel.org>,
"muchun.song@...ux.dev" <muchun.song@...ux.dev>,
"viro@...iv.linux.org.uk" <viro@...iv.linux.org.uk>,
"zhengqi.arch@...edance.com" <zhengqi.arch@...edance.com>,
"ebiederm@...ssion.com" <ebiederm@...ssion.com>,
"Liam.Howlett@...cle.com" <Liam.Howlett@...cle.com>,
"chengzhihao1@...wei.com" <chengzhihao1@...wei.com>,
"pilgrimtao@...il.com" <pilgrimtao@...il.com>,
"haolee.swjtu@...il.com" <haolee.swjtu@...il.com>,
"yuzhao@...gle.com" <yuzhao@...gle.com>,
"willy@...radead.org" <willy@...radead.org>,
"vasily.averin@...ux.dev" <vasily.averin@...ux.dev>,
"vbabka@...e.cz" <vbabka@...e.cz>,
"surenb@...gle.com" <surenb@...gle.com>,
"sfr@...b.auug.org.au" <sfr@...b.auug.org.au>,
"mcgrof@...nel.org" <mcgrof@...nel.org>,
"feng.tang@...el.com" <feng.tang@...el.com>,
"cgroups@...r.kernel.org" <cgroups@...r.kernel.org>,
"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
David Rientjes <rientjes@...gle.com>
Subject: Re: [PATCH v4 0/2] memcontrol: support cgroup level OOM protection
At 2023-05-17 16:09:50, "Yosry Ahmed" <yosryahmed@...gle.com> wrote:
>On Wed, May 17, 2023 at 1:01 AM 程垲涛 Chengkaitao Cheng
><chengkaitao@...iglobal.com> wrote:
>>
>> At 2023-05-17 14:59:06, "Yosry Ahmed" <yosryahmed@...gle.com> wrote:
>> >+David Rientjes
>> >
>> >On Tue, May 16, 2023 at 8:20 PM chengkaitao <chengkaitao@...iglobal.com> wrote:
>> >>
>> >> Establish a new OOM score algorithm that supports a cgroup-level
>> >> OOM protection mechanism. When a global/memcg OOM event occurs, we
>> >> treat all processes in the cgroup as a whole, and the OOM killer
>> >> needs to select the process to kill based on the protection quota
>> >> of the cgroup.
>> >>
>> >
>> >Perhaps this is only slightly relevant, but at Google we do have a
>> >different per-memcg approach to protect against OOM kills, or more
>> >specifically to tell the kernel how we would like the OOM killer to
>> >behave.
>> >
>> >We define an interface called memory.oom_score_badness, and we also
>> >allow it to be specified per-process through a procfs interface,
>> >similar to oom_score_adj.
>> >
>> >These scores essentially tell the OOM killer the order in which we
>> >prefer memcgs to be OOM'd, and the order in which we want processes in
>> >the memcg to be OOM'd. By default, all processes and memcgs start with
>> >the same score. Ties are broken based on the rss of the process or the
>> >usage of the memcg (prefer to kill the process/memcg that will free
>> >more memory) -- similar to the current OOM killer.
>>
>> Thank you for describing a new application scenario. You have outlined
>> a new per-memcg approach, but such a brief introduction cannot make its
>> details clear. If you could analyze my patches for possible defects, or
>> explain any advantages your approach has that mine does not, I would
>> greatly appreciate it.
>
>Sorry if I was not clear; I am not implying in any way that the
>approach I am describing is better than your patches. I am guilty of
>not conducting the proper analysis you are requesting.
No approach is perfect, and I seek your advice in a spirit of learning.
There is no need to apologize; if anything, I should thank you.
>I just saw the thread and thought it might be interesting to you or
>others to know the approach that we have been using in production for
>years. I guess the goal is the same: being able to tell the OOM
>killer which memcgs/processes are more important to protect. The
>fundamental difference is that instead of tuning this based on the
>memory usage of the memcg (your approach), we essentially give the OOM
>killer the ordering in which we want memcgs/processes to be OOM
>killed. This essentially maps to job priorities.
Killing processes purely in order of memory usage cannot effectively
protect important processes. Killing processes purely in a user-defined
priority order can trigger a long series of OOM events while still
failing to release enough memory. I have been searching for a balance
between the two methods, so that the shortcomings of neither become too
pronounced; a sketch of the idea follows below. The biggest advantage of
memcg is its tree topology, and I hope to make good use of it as well.
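
To make that balance concrete, here is a minimal userspace C sketch
(the struct, names, and numbers are hypothetical illustrations, not the
patch code): usage above a per-cgroup protection quota drives the
ranking, so a high-priority cgroup is shielded only up to its quota,
while a runaway low-priority cgroup is still the preferred victim.

/*
 * Hypothetical model, not kernel code: rank cgroups by how far their
 * usage exceeds a per-cgroup protection quota, so both priority
 * (expressed as quota) and usage influence victim selection.
 */
#include <stdio.h>

struct memcg_model {
	const char *name;
	long usage;	/* current memory usage, in pages */
	long protect;	/* protection quota, in pages     */
};

/* Score = unprotected usage; the highest score is killed first. */
static long oom_score(const struct memcg_model *m)
{
	long over = m->usage - m->protect;
	return over > 0 ? over : 0;
}

int main(void)
{
	struct memcg_model cgs[] = {
		{ "batch",    800, 100 },  /* low priority: small quota  */
		{ "database", 900, 700 },  /* high priority: large quota */
	};
	size_t i, n = sizeof(cgs) / sizeof(cgs[0]);
	const struct memcg_model *victim = &cgs[0];

	for (i = 1; i < n; i++)
		if (oom_score(&cgs[i]) > oom_score(victim))
			victim = &cgs[i];

	/*
	 * Picks "batch" (score 700) over "database" (score 200), even
	 * though "database" uses more memory outright.
	 */
	printf("kill processes under %s\n", victim->name);
	return 0;
}

The flat array here deliberately sidesteps the tree topology; in a real
hierarchy the quota would be propagated down the memcg tree, which is
exactly where I think memcg's structure can help.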
>If this approach works for you (or any other audience), that's great;
>I can share more details and perhaps we can reach something that we
>can both use :)
If you have a good idea, please share more details or show some code;
I would greatly appreciate it.
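
In the meantime, here is a rough C sketch of the selection rule as I
currently understand it from your description (the tie-break on usage
follows what you wrote; treating a higher badness value as a preferred
victim is my assumption, by analogy with oom_score_adj):

/*
 * My reading of the ordering-based policy: pick the candidate with
 * the highest memory.oom_score_badness value, and break ties toward
 * the one whose death frees more memory.
 */
#include <stdio.h>

struct candidate {
	int badness;	/* from memory.oom_score_badness (assumed) */
	long usage;	/* memcg usage, or rss for a process       */
};

static const struct candidate *pick_victim(const struct candidate *c,
					   size_t n)
{
	const struct candidate *victim = &c[0];
	size_t i;

	for (i = 1; i < n; i++) {
		if (c[i].badness > victim->badness ||
		    (c[i].badness == victim->badness &&
		     c[i].usage > victim->usage))
			victim = &c[i];
	}
	return victim;
}

int main(void)
{
	struct candidate cands[] = {
		{ 0, 500 }, { 10, 300 }, { 10, 400 },
	};
	const struct candidate *v = pick_victim(cands, 3);

	/* Selects { 10, 400 }: highest badness, larger usage on tie. */
	printf("victim: badness=%d usage=%ld\n", v->badness, v->usage);
	return 0;
}

Please correct me if the direction of the score or the tie-break does
not match what you run in production.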
>>
>> >This has been brought up before in other discussions without much
>> >interest [1], but just thought it may be relevant here.
>> >
>> >[1] https://lore.kernel.org/lkml/CAHS8izN3ej1mqUpnNQ8c-1Bx5EeO7q5NOkh0qrY_4PLqc8rkHA@mail.gmail.com/#t
--
Thanks for your comment!
chengkaitao