Message-ID: <be8cfada-f4bd-4894-848d-1b7706b14035@virtuozzo.com>
Date: Wed, 20 Mar 2024 18:55:05 +0800
From: Pavel Tikhomirov <ptikhomirov@...tuozzo.com>
To: Michal Hocko <mhocko@...e.com>
Cc: Johannes Weiner <hannes@...xchg.org>,
Roman Gushchin <roman.gushchin@...ux.dev>,
Shakeel Butt <shakeel.butt@...ux.dev>, Muchun Song <muchun.song@...ux.dev>,
Andrew Morton <akpm@...ux-foundation.org>,
Vladimir Davydov <vdavydov.dev@...il.com>, cgroups@...r.kernel.org,
linux-mm@...ck.org, linux-kernel@...r.kernel.org, kernel@...nvz.org
Subject: Re: [PATCH] mm/memcontrol: stop resize loop if limit was changed
again
On 20/03/2024 18:28, Michal Hocko wrote:
> On Wed 20-03-24 18:03:30, Pavel Tikhomirov wrote:
>> In memory_max_write() we first set memcg->memory.max and only then
>> try to enforce it in a loop. What if, while we are in the loop, someone
>> else has changed memcg->memory.max and we are still trying to enforce
>> the old value? I believe this can lead to nasty consequences like getting
>> an oom on a perfectly fine cgroup within its limits, or excess reclaim.
>
> I would argue that uncoordinated hard limit configuration can cause
> problems on its own.
Sorry, didn't know that.
> Beside how is this any different from changing
> the high limit while we are inside the reclaim loop?
I believe the reclaim loop rereads the limit on each iteration, e.g. in
reclaim_high(), so it should always enforce the current value.
>
>> We also have exactly the same thing in memory_high_write().
>>
>> So let's stop enforcing the old limits if we already have new ones.
>
> I do not see any reasons why this would be harmful, I just do not see
> why this is a real thing or why the new behavior is any better for
> racing updaters, as those are not deterministic anyway. If you have an
> actual usecase then more details would really help to justify this change.
>
> The existing behavior makes some sense as it enforces the given limit
> deterministically.
I don't have an actual problem, usecase or reproducer at hand, I only
see a potential problem:

Let's imagine that:

a) We set the cgroup max limit to some small value; memory_max_write()
updates memcg->memory.max and starts spinning in the loop, as it wants
to reclaim some memory which does not fit into the new limit.

b) We don't need the small limit anymore and raise it to a big value,
but memory_max_write() from (a) is still spinning. If we are unlucky
and the cgroup's processes keep consuming memory fast enough to
compensate for the reclaim done by memory_max_write() from (a), it will
continue spinning there indefinitely.

Yes, it is not that bad, because memory_max/high_write() also checks
for pending signals on each loop iteration, so they won't actually get
irreversibly stuck. But I just thought it was worth fixing.
>
>> Signed-off-by: Pavel Tikhomirov <ptikhomirov@...tuozzo.com>
>> ---
>> mm/memcontrol.c | 6 ++++++
>> 1 file changed, 6 insertions(+)
>>
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index 61932c9215e7..81b303728491 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -6769,6 +6769,9 @@ static ssize_t memory_high_write(struct kernfs_open_file *of,
>> unsigned long nr_pages = page_counter_read(&memcg->memory);
>> unsigned long reclaimed;
>>
>> + if (memcg->memory.high != high)
>> + break;
>> +
>> if (nr_pages <= high)
>> break;
>>
>> @@ -6817,6 +6820,9 @@ static ssize_t memory_max_write(struct kernfs_open_file *of,
>> for (;;) {
>> unsigned long nr_pages = page_counter_read(&memcg->memory);
>>
>> + if (memcg->memory.max != max)
>> + break;
>> +
>> if (nr_pages <= max)
>> break;
>>
>> --
>> 2.43.0
>
--
Best regards, Tikhomirov Pavel
Senior Software Developer, Virtuozzo.