[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YrynRcW4/B2nl/kK@mtj.duckdns.org>
Date: Thu, 30 Jun 2022 04:25:57 +0900
From: Tejun Heo <tj@...nel.org>
To: Vasily Averin <vvs@...nvz.org>
Cc: Michal Koutný <mkoutny@...e.com>,
Roman Gushchin <roman.gushchin@...ux.dev>,
Shakeel Butt <shakeelb@...gle.com>,
Michal Hocko <mhocko@...e.com>,
Zefan Li <lizefan.x@...edance.com>,
Johannes Weiner <hannes@...xchg.org>, kernel@...nvz.org,
linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
Vlastimil Babka <vbabka@...e.cz>,
Muchun Song <songmuchun@...edance.com>, cgroups@...r.kernel.org
Subject: Re: [PATCH cgroup] cgroup: set the correct return code if hierarchy
limits are reached
On Wed, Jun 29, 2022 at 09:13:02AM +0300, Vasily Averin wrote:
> I experimented on fedora36 node with LXC and centos stream 9 container.
> and I did not noticed any critical systemd troubles with original -EAGAIN.
> When cgroup's limit is reached systemd cannot start new services,
> for example lxc-attach generates following output:
>
> [root@...4-vvs ~]# lxc-attach c9s
> lxc-attach: c9s: cgroups/cgfsng.c: cgroup_attach_leaf: 2084 Resource temporarily unavailable - Failed to create leaf cgroup ".lxc"
> lxc-attach: c9s: cgroups/cgfsng.c: __cgroup_attach_many: 3517 Resource temporarily unavailable - Failed to attach to cgroup fd 11
> lxc-attach: c9s: attach.c: lxc_attach: 1679 Resource temporarily unavailable - Failed to attach cgroup
> lxc-attach: c9s: attach.c: do_attach: 1237 No data available - Failed to receive lsm label fd
> lxc-attach: c9s: attach.c: do_attach: 1375 Failed to attach to container
>
> I did not found any loop in userspace caused by EAGAIN.
> Messages looks unclear, however situation with the patched kernel is not much better:
>
> [root@...4-vvs ~]# lxc-attach c9s
> lxc-attach: c9s: cgroups/cgfsng.c: cgroup_attach_leaf: 2084 No space left on device - Failed to create leaf cgroup ".lxc"
> lxc-attach: c9s: cgroups/cgfsng.c: __cgroup_attach_many: 3517 No space left on device - Failed to attach to cgroup fd 11
> lxc-attach: c9s: attach.c: lxc_attach: 1679 No space left on device - Failed to attach cgroup
> lxc-attach: c9s: attach.c: do_attach: 1237 No data available - Failed to receive lsm label fd
> lxc-attach: c9s: attach.c: do_attach: 1375 Failed to attach to container
I'd say "resource temporarily unavailable" is better fitting than "no
space left on device" and the syscall restart thing isn't handled by
-EAGAIN return value. Grep restart_block for that.
Thanks.
--
tejun
Powered by blists - more mailing lists