Message-ID: <x2zp6vbr5c3oa3xyfctj66y4ikdxtuo7wsqamkqgyt5ppu6ccb@vwxzimqvrhgk>
Date: Fri, 4 Aug 2023 14:59:28 -0400
From: Lucas Karpinski <lkarpins@...hat.com>
To: Johannes Weiner <hannes@...xchg.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Michal Hocko <mhocko@...nel.org>,
Roman Gushchin <roman.gushchin@...ux.dev>,
Shakeel Butt <shakeelb@...gle.com>,
Muchun Song <muchun.song@...ux.dev>, Tejun Heo <tj@...nel.org>,
Zefan Li <lizefan.x@...edance.com>,
Shuah Khan <shuah@...nel.org>, cgroups@...r.kernel.org,
linux-mm@...ck.org, linux-kselftest@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] selftests: cgroup: fix test_kmem_memcg_deletion false positives

On Fri, Aug 04, 2023 at 12:37:16PM -0400, Johannes Weiner wrote:
> On Fri, Aug 04, 2023 at 11:37:33AM -0400, Lucas Karpinski wrote:
> > The test allocates dcache inside a cgroup, then destroys the cgroups and
> > then checks the sanity of numbers on the parent level. The reason it
> > fails is because dentries are freed with an RCU delay - a debugging
> > sleep shows that usage drops as expected shortly after.
> >
> > Insert a 1s sleep after completing the cgroup creation/deletions. This
> > should be good enough, assuming that machines running those tests are
> > otherwise not very busy. This commit is directly inspired by Johannes
> > over at the link below.
> >
> > Link: https://lore.kernel.org/all/20230801135632.1768830-1-hannes@cmpxchg.org/
> >
> > Signed-off-by: Lucas Karpinski <lkarpins@...hat.com>
>
> Maybe I'm missing something, but there isn't a limit set anywhere that
> would cause the dentries to be reclaimed and freed, no? When the
> subgroups are deleted, the objects are just moved to the parent. The
> counters inside the parent (which are hierarchical) shouldn't change.
>
> So this seems to be a different scenario than test_kmem_basic. If the
> test is failing for you, I can't quite see why.
>
You're right, the parent inherits the counters, so it should behave the
same whether I remove the child directly or move it under another
cgroup. I do see the behaviour you described on my x86_64 setup, but the
wrong behaviour on my aarch64 development platform. I'll take a closer
look; for now I just wanted to leave an example of what I see.

Example of slab size pre/post sleep:
slab_pre = 18164688, slab_post = 3360000
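
For anyone who wants to reproduce the comparison, here is a minimal
sketch of how the two samples could be pulled out of memory.stat-style
text and compared. It is not the selftest code: stat_value() and
slab_drop() are hypothetical helper names, and the "key value" line
layout is an assumption about the text being parsed.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the selftest: scan "key value" lines
 * (memory.stat-style text, layout assumed) and return the value for an
 * exact key match, or -1 if the key is absent.
 */
static long stat_value(const char *buf, const char *key)
{
	size_t klen = strlen(key);
	const char *line = buf;

	while (line && *line) {
		/* Require a space after the key so "slab" skips "slab_reclaimable". */
		if (!strncmp(line, key, klen) && line[klen] == ' ')
			return strtol(line + klen + 1, NULL, 10);
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Difference between two samples; 0 if usage did not drop. */
static long slab_drop(long pre, long post)
{
	return pre > post ? pre - post : 0;
}
```

With the numbers above, slab_drop(18164688, 3360000) gives 14804688,
i.e. most of the slab charged to the parent went away during the sleep.
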
Thanks,
Lucas