Message-ID: <eacd9a90-8c80-4e23-a193-f09d96fe24ee@linux.dev>
Date: Thu, 18 Sep 2025 18:14:33 -0700
From: Martin KaFai Lau <martin.lau@...ux.dev>
To: Kuniyuki Iwashima <kuniyu@...gle.com>
Cc: Alexei Starovoitov <ast@...nel.org>, Andrii Nakryiko <andrii@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
John Fastabend <john.fastabend@...il.com>,
Stanislav Fomichev <sdf@...ichev.me>, Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...nel.org>, Roman Gushchin <roman.gushchin@...ux.dev>,
Shakeel Butt <shakeel.butt@...ux.dev>, "David S. Miller"
<davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Neal Cardwell <ncardwell@...gle.com>, Willem de Bruijn <willemb@...gle.com>,
Mina Almasry <almasrymina@...gle.com>, Kuniyuki Iwashima
<kuni1840@...il.com>, bpf@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH v9 bpf-next/net 6/6] selftest: bpf: Add test for
SK_MEMCG_EXCLUSIVE.
On 9/17/25 6:17 PM, Kuniyuki Iwashima wrote:
>>> +
>>> +close:
>>> + for (i = 0; i < ARRAY_SIZE(sk); i++)
>>> + close(sk[i]);
>>> +
>>> + if (test_case->type == SOCK_DGRAM) {
>>> + /* UDP recv queue is destroyed after RCU grace period.
>>> + * With one kern_sync_rcu(), memory_allocated[0] of the
>>> + * isolated case often matches with memory_allocated[1]
>>> + * of the preceding non-exclusive case.
>>> + */
>> I don't think I understand the double kern_sync_rcu() below.
> With one kern_sync_rcu(), when I added bpf_printk() for memory_allocated,
> I sometimes saw two consecutive non-zero values, meaning memory_allocated[0]
> still sees the previous test case's result (memory_allocated[1]).
> ASSERT_LE() succeeds as expected, but somewhat unintentionally.
>
> bpf_trace_printk: memory_allocated: 0 <-- non exclusive case
> bpf_trace_printk: memory_allocated: 4160
> bpf_trace_printk: memory_allocated: 4160 <-- exclusive case's
> memory_allocated[0]
> bpf_trace_printk: memory_allocated: 0
> bpf_trace_printk: memory_allocated: 0
> bpf_trace_printk: memory_allocated: 0
>
> One kern_sync_rcu() is enough to kick call_rcu() + sk_destruct() but
> does not guarantee that it completes, so if the queue length was too long,
> the memory_allocated does not go down fast enough.
>
> But now I don't see the flakiness with NR_SEND 32, and one
> kern_sync_rcu() might be enough unless the env is too slow...?
Ah, got it. I put you on the wrong path. It needs rcu_barrier() instead.
Is recv() enough? Maybe just recv(MSG_DONTWAIT) to consume the queue, only for
the UDP socket? That will slow down the UDP test... Maybe only read 1 byte, and
the remainder can be MSG_TRUNC?
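
To make the recv() idea concrete, something along these lines might do. This is
only a rough sketch and not code from the patch: drain_dgram_queue() and
drain_demo() are made-up names, and the demo uses an AF_UNIX datagram
socketpair as a stand-in for the test's UDP sockets so that it is
deterministic. The recv() flags would be the same for UDP.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: consume every queued datagram without blocking.
 * Only 1 byte is copied per datagram; the rest of each datagram is
 * discarded by the kernel, and MSG_TRUNC makes recv() report the real
 * datagram length instead of the copied length.
 * Returns the number of datagrams drained, or -1 on an unexpected error.
 */
int drain_dgram_queue(int fd)
{
	char byte;
	int drained = 0;

	for (;;) {
		ssize_t ret = recv(fd, &byte, sizeof(byte),
				   MSG_DONTWAIT | MSG_TRUNC);

		if (ret >= 0) {
			drained++;
			continue;
		}
		if (errno == EINTR)
			continue;
		if (errno == EAGAIN || errno == EWOULDBLOCK)
			return drained;	/* receive queue is empty */
		return -1;
	}
}

/* Hypothetical self-check: queue nr_send datagrams on one end of an
 * AF_UNIX SOCK_DGRAM socketpair (an assumption made for determinism,
 * the selftest itself uses UDP) and drain them from the other end.
 * Returns the number of datagrams drained, or -1 on failure.
 */
int drain_demo(int nr_send)
{
	char payload[1024] = { 0 };
	int sv[2], i, drained;

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv))
		return -1;

	for (i = 0; i < nr_send; i++) {
		if (send(sv[0], payload, sizeof(payload), 0) !=
		    (ssize_t)sizeof(payload)) {
			close(sv[0]);
			close(sv[1]);
			return -1;
		}
	}

	drained = drain_dgram_queue(sv[1]);

	close(sv[0]);
	close(sv[1]);
	return drained;
}
```

In the selftest, a helper like this would presumably run on each SOCK_DGRAM
socket before close(), so the receive queue is already empty by the time the
socket is destroyed after the RCU grace period.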
Btw, does the test need 64 sockets? Is it because of the per-socket snd/rcvbuf
limit?
Another option is to trace SEC("fexit/__sk_destruct") to ensure all the cleanup
is done, but that seems overkill if recv() can do the job.