lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 23 Apr 2022 14:43:13 -0700
From:   Yosry Ahmed <yosryahmed@...gle.com>
To:     Shakeel Butt <shakeelb@...gle.com>
Cc:     Johannes Weiner <hannes@...xchg.org>,
        Michal Hocko <mhocko@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Roman Gushchin <roman.gushchin@...ux.dev>,
        David Rientjes <rientjes@...gle.com>,
        Tejun Heo <tj@...nel.org>, Zefan Li <lizefan.x@...edance.com>,
        Jonathan Corbet <corbet@....net>,
        Shuah Khan <shuah@...nel.org>, Yu Zhao <yuzhao@...gle.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Wei Xu <weixugc@...gle.com>, Greg Thelen <gthelen@...gle.com>,
        Chen Wandun <chenwandun@...wei.com>,
        Vaibhav Jain <vaibhav@...ux.ibm.com>,
        Michal Koutný <mkoutny@...e.com>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Dan Schatzberg <schatzberg.dan@...il.com>,
        cgroups@...r.kernel.org, linux-doc@...r.kernel.org,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Linux-MM <linux-mm@...ck.org>, linux-kselftest@...r.kernel.org
Subject: Re: [PATCH v4 4/4] selftests: cgroup: add a selftest for memory.reclaim

On Sat, Apr 23, 2022 at 7:28 AM Shakeel Butt <shakeelb@...gle.com> wrote:
>
> On Thu, Apr 21, 2022 at 11:44:26PM +0000, Yosry Ahmed wrote:
> > Add a new test for memory.reclaim that verifies that the interface
> > correctly reclaims memory as intended, from both anon and file pages.
> >
> > Signed-off-by: Yosry Ahmed <yosryahmed@...gle.com>
> > Acked-by: Roman Gushchin <roman.gushchin@...ux.dev>
> > ---
> >  .../selftests/cgroup/test_memcontrol.c        | 86 +++++++++++++++++++
> >  1 file changed, 86 insertions(+)
> >
> > diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
> > index f2ffb3a30194..5f7c20de2426 100644
> > --- a/tools/testing/selftests/cgroup/test_memcontrol.c
> > +++ b/tools/testing/selftests/cgroup/test_memcontrol.c
> > @@ -760,6 +760,91 @@ static int test_memcg_max(const char *root)
> >       return ret;
> >  }
> >
> > +/*
> > + * This test checks that memory.reclaim reclaims the given
> > + * amount of memory (from both anon and file).
> > + */
> > +static int test_memcg_reclaim(const char *root)
> > +{
> > +     int ret = KSFT_FAIL, fd, retries;
> > +     char *memcg;
> > +     long current, to_reclaim;
> > +     char buf[64];
> > +
> > +     memcg = cg_name(root, "memcg_test");
> > +     if (!memcg)
> > +             goto cleanup;
> > +
> > +     if (cg_create(memcg))
> > +             goto cleanup;
> > +
> > +     current = cg_read_long(memcg, "memory.current");
> > +     if (current != 0)
> > +             goto cleanup;
> > +
> > +     cg_run_nowait(memcg, alloc_anon_noexit, (void *) MB(50));
>
> Don't you need is_swap_enabled() check before deciding to do the anon
> allocations?

Yes you are right. In the next version will check whether or not swap
is enabled and modify the test accordingly.

> > +     sleep(1);
> > +
> > +     fd = get_temp_fd();
> > +     if (fd < 0)
> > +             goto cleanup;
> > +
> > +     cg_run_nowait(memcg, alloc_pagecache_50M_noexit, (void *)(long)fd);
> > +     sleep(1);
>
> These sleep(1)s do not seem robust. Since kernel keeps the page cache
> around, you can convert anon to use tmpfs and use simple cg_run to
> trigger the allocations of anon (tmpfs) and file which will remain in
> memory even after return from cg_run.

Other tests in the file are also using sleep approach (see
test_memcg_min, although it retries for multiple times until
memory.current reaches an expected amount). In my experience it hasn't
been flaky running for multiple times on different machines, but I
agree it can be flaky (false negative).

I am not sure about the allocating file pages with cg_run, is it
guaranteed that the page cache will remain in memory until the test
ends? If it doesn't, it can also flake, but it would produce false
positives (the test could pass because the kernel drained page cache
for some other reason although the interface is not working
correctly).

In my personal opinion, false negative flakes are better than false
positives. At least currently the test explicitly and clearly fails if
the allocations are not successful. If we rely on the page cache
remaining until the test finishes then it could silently pass if the
interface is not working correctly.

There are a few ways we can go forward with this:
1) Keep everything as-is, but print a message if the test fails due to
memory.current not reaching 100MB to make it clear that it didn't fail
due to a problem with the interface.
2) Add a sleep/retry loop similar to test_memcg_min instead of sleeping once.
3) Send a signal from forked children when they are done with the
allocation, and wait to receive this signal in the test to make sure
the allocation is completed.

In my opinion we should do (1) (and maybe (2)) for now as (3) could be
an overkill if the test is normal passing. Maybe add a comment about
(3) being an option in the future if the test flakes. Let me know what
you think?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ