Message-ID: <ZvrkXj_JSYl9866W@google.com>
Date: Mon, 30 Sep 2024 10:48:14 -0700
From: Namhyung Kim <namhyung@...nel.org>
To: Hyeonggon Yoo <42.hyeyoo@...il.com>
Cc: Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <martin.lau@...ux.dev>,
Eduard Zingerman <eddyz87@...il.com>, Song Liu <song@...nel.org>,
Yonghong Song <yonghong.song@...ux.dev>,
John Fastabend <john.fastabend@...il.com>,
KP Singh <kpsingh@...nel.org>, Stanislav Fomichev <sdf@...ichev.me>,
Hao Luo <haoluo@...gle.com>, Jiri Olsa <jolsa@...nel.org>,
LKML <linux-kernel@...r.kernel.org>, bpf@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
Christoph Lameter <cl@...ux.com>, Pekka Enberg <penberg@...nel.org>,
David Rientjes <rientjes@...gle.com>,
Joonsoo Kim <iamjoonsoo.kim@....com>,
Vlastimil Babka <vbabka@...e.cz>,
Roman Gushchin <roman.gushchin@...ux.dev>, linux-mm@...ck.org,
Arnaldo Carvalho de Melo <acme@...nel.org>
Subject: Re: [RFC/PATCH bpf-next 3/3] selftests/bpf: Add a test for
kmem_cache_iter
On Sun, Sep 29, 2024 at 09:33:05PM -0700, Namhyung Kim wrote:
> On Mon, Sep 30, 2024 at 12:24:52PM +0900, Hyeonggon Yoo wrote:
> > On Mon, Sep 30, 2024 at 11:18 AM Namhyung Kim <namhyung@...nel.org> wrote:
> > >
> > > Hello Hyeonggon,
> > >
> > > On Sun, Sep 29, 2024 at 11:27:25PM +0900, Hyeonggon Yoo wrote:
> > > > On Sun, Sep 29, 2024 at 3:13 PM Namhyung Kim <namhyung@...nel.org> wrote:
> > > > > > +SEC("raw_tp/bpf_test_finish")
> > > > > > +int BPF_PROG(check_task_struct)
> > > > > > +{
> > > > > > + __u64 curr = bpf_get_current_task();
> > > > > > + struct kmem_cache *s;
> > > > > > + char *name;
> > > > > > +
> > > > > > + s = bpf_get_kmem_cache(curr);
> > > > > > + if (s == NULL) {
> > > > > > + found = -1;
> > > > > > + return 0;
> > > > >
> > > > > ... it cannot find a kmem_cache for the current task. This program is
> > > > > run by bpf_prog_test_run_opts() with BPF_F_TEST_RUN_ON_CPU. So I think
> > > > > the curr should point a task_struct in a slab cache.
> > > > >
> > > > > Am I missing something?
> > > >
> > > > Hi Namhyung,
> > > >
> > > > Out of curiosity I've been investigating this issue on my machine and
> > > > running some experiments.
> > >
> > > Thanks a lot for looking at this!
> > >
> > > >
> > > > When the test fails, calling dump_page() for the page the task_struct
> > > > belongs to,
> > > > shows that the page does not have the PGTY_slab flag set which is why
> > > > virt_to_slab(current) returns NULL.
> > > >
> > > > Does the test always fails on your environment? On my machine, the
> > > > test passed sometimes but failed some times.
> > >
> > > I'm using vmtest.sh but it succeeded mostly. I thought I couldn't
> > > reproduce it locally, but I also see the failure sometimes. I'll take a
> > > deeper look.
> > >
> > > >
> > > > Maybe sometimes the value returned by 'current' macro belongs to a
> > > > slab, but sometimes it does not.
> > > > But that doesn't really make sense to me as IIUC task_struct
> > > > descriptors are allocated from slab.
> > >
> > > AFAIK the notable exception is the init_task which lives in the kernel
> > > data. I'm not sure the if the test is running by PID 1.
> >
> > I checked that the test is running under PID 0 (swapper) when it fails and
> > non-0 PID when it succeeds. This makes sense as the task_struct for PID 0
> > should be in the kernel image area, not in a slab.
> >
> > Phew, fortunately, it's not a bug! :)
>
> Thanks for the test, I've seen the same now.
>
> >
> > Any plans on how to adjust the test program?
>
> I thought the test runs in a separate task. I'll think about how to
> test this more reliably.
Oh, I think BPF_F_TEST_RUN_ON_CPU was the problem, since it forces the
test to run on the given CPU (cpu0 in this case). If cpu0 was idle, the
program would run in the context of the swapper task and fail like this.
I think removing the flag will run the test on the current CPU, so it
won't get the swapper task anymore.
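
A rough sketch of the difference, not the actual selftest change. It
calls the raw bpf(2) syscall with BPF_PROG_TEST_RUN so that it only needs
<linux/bpf.h>; the selftest itself goes through bpf_prog_test_run_opts().
The helper name test_run_raw_tp() and its pin_to_cpu0 parameter are made
up for illustration, and prog_fd is assumed to be the fd of an already
loaded raw_tp program.

```c
#include <linux/bpf.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: run a loaded raw_tp program once through
 * BPF_PROG_TEST_RUN.  Returns 0 on success, -1 on error (errno set).
 */
static int test_run_raw_tp(int prog_fd, int pin_to_cpu0, unsigned int *retval)
{
	union bpf_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.test.prog_fd = prog_fd;

	if (pin_to_cpu0) {
		/*
		 * Behaviour discussed in this thread: the program is run on
		 * cpu0, so 'current' can be the swapper task if cpu0 is idle.
		 */
		attr.test.flags = BPF_F_TEST_RUN_ON_CPU;
		attr.test.cpu = 0;
	}
	/* With no flag set, the program runs on the calling task's CPU. */

	if (syscall(__NR_bpf, BPF_PROG_TEST_RUN, &attr, sizeof(attr)) < 0)
		return -1;

	if (retval)
		*retval = attr.test.retval;
	return 0;
}
```

Calling test_run_raw_tp(prog_fd, 0, &ret) would correspond to dropping
the flag, so bpf_get_current_task() should then see the test process
rather than PID 0.
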
Thanks,
Namhyung