Message-ID: <8f706bc5-cc9c-01f5-1918-41cd0501f4f0@virtuozzo.com>
Date: Fri, 12 Jan 2018 12:08:12 +0300
From: Andrey Ryabinin <aryabinin@...tuozzo.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Michal Hocko <mhocko@...nel.org>,
Johannes Weiner <hannes@...xchg.org>,
Vladimir Davydov <vdavydov.dev@...il.com>,
cgroups@...r.kernel.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, Shakeel Butt <shakeelb@...gle.com>
Subject: Re: [PATCH v4] mm/memcg: try harder to decrease [memory,memsw].limit_in_bytes
On 01/12/2018 03:21 AM, Andrew Morton wrote:
> On Thu, 11 Jan 2018 14:59:23 +0300 Andrey Ryabinin <aryabinin@...tuozzo.com> wrote:
>
>> On 01/11/2018 01:31 AM, Andrew Morton wrote:
>>> On Wed, 10 Jan 2018 15:43:17 +0300 Andrey Ryabinin <aryabinin@...tuozzo.com> wrote:
>>>
>>>> mem_cgroup_resize_[memsw]_limit() tries to free only 32 (SWAP_CLUSTER_MAX)
>>>> pages on each iteration. This makes it practically impossible to decrease
>>>> the limit of a memory cgroup. Tasks can easily allocate back 32 pages,
>>>> so we can't reduce memory usage, and once retry_count reaches zero we return
>>>> -EBUSY.
>>>>
>>>> The problem is easy to reproduce by running the following commands:
>>>>
>>>> mkdir /sys/fs/cgroup/memory/test
>>>> echo $$ >> /sys/fs/cgroup/memory/test/tasks
>>>> cat big_file > /dev/null &
>>>> sleep 1 && echo $((100*1024*1024)) > /sys/fs/cgroup/memory/test/memory.limit_in_bytes
>>>> -bash: echo: write error: Device or resource busy
>>>>
>>>> Instead of relying on retry_count, keep retrying the reclaim until
>>>> the desired limit is reached or fail if the reclaim doesn't make
>>>> any progress or a signal is pending.
>>>>
>>>
>>> Is there any situation under which that mem_cgroup_resize_limit() can
>>> get stuck semi-indefinitely in a livelockish state? It isn't very
>>> obvious that we're protected from this, so perhaps it would help to
>>> have a comment which describes how loop termination is assured?
>>>
>>
>> We are not protected from this. If tasks in the cgroup *indefinitely* generate reclaimable memory at a high rate
>> and the user asks to set an unreachable limit, like 'echo 4096 > memory.limit_in_bytes', then
>> try_to_free_mem_cgroup_pages() will return non-zero indefinitely.
>>
>> Is that a big deal? At least the loop can be interrupted by a signal, and we don't hold any locks here.
>
> It may be better to detect this condition, give up and return an error?
>
That's basically how v1 worked: the "if (curusage >= oldusage)" check used to be
the way to detect this potential livelock.
So can we just go back to it?
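
To make the termination argument concrete, here is a minimal user-space sketch
of a resize loop guarded by a "curusage >= oldusage" check. It is not the v1
patch and not kernel code: struct fake_memcg, fake_reclaim_pass() and
fake_resize_limit() are hypothetical names, and the fixed per-pass "reclaim"
and "refill" counts are assumptions standing in for
try_to_free_mem_cgroup_pages() and for tasks charging pages back.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a memory cgroup; not a kernel type. */
struct fake_memcg {
	unsigned long usage;	/* pages currently charged */
	unsigned long limit;	/* current limit, in pages */
	unsigned long reclaim;	/* assumed pages freed per reclaim pass */
	unsigned long refill;	/* assumed pages charged back between passes */
};

/* One reclaim pass, followed by whatever the tasks allocate back. */
static void fake_reclaim_pass(struct fake_memcg *cg)
{
	unsigned long freed = cg->reclaim < cg->usage ? cg->reclaim : cg->usage;

	cg->usage -= freed;
	cg->usage += cg->refill;
}

/*
 * Keep reclaiming until usage fits under new_limit. Give up with -EBUSY
 * as soon as a pass fails to lower usage. Every pass that does not give
 * up strictly lowers an unsigned counter, so the loop must terminate.
 */
static int fake_resize_limit(struct fake_memcg *cg, unsigned long new_limit)
{
	assert(cg != NULL);

	while (cg->usage > new_limit) {
		unsigned long oldusage = cg->usage;
		unsigned long curusage;

		fake_reclaim_pass(cg);
		curusage = cg->usage;

		if (curusage >= oldusage)
			return -EBUSY;	/* no net progress: potential livelock */
	}

	cg->limit = new_limit;
	return 0;
}
```

In this model a cgroup whose tasks charge back at least as many pages as each
pass frees gets -EBUSY after a single pass, while any cgroup with net progress
eventually reaches the new limit. The sketch leaves out the signal_pending()
check and does not model reclaim that makes progress only intermittently.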