Message-ID: <eacd9a90-8c80-4e23-a193-f09d96fe24ee@linux.dev>
Date: Thu, 18 Sep 2025 18:14:33 -0700
From: Martin KaFai Lau <martin.lau@...ux.dev>
To: Kuniyuki Iwashima <kuniyu@...gle.com>
Cc: Alexei Starovoitov <ast@...nel.org>, Andrii Nakryiko <andrii@...nel.org>,
 Daniel Borkmann <daniel@...earbox.net>,
 John Fastabend <john.fastabend@...il.com>,
 Stanislav Fomichev <sdf@...ichev.me>, Johannes Weiner <hannes@...xchg.org>,
 Michal Hocko <mhocko@...nel.org>, Roman Gushchin <roman.gushchin@...ux.dev>,
 Shakeel Butt <shakeel.butt@...ux.dev>, "David S. Miller"
 <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
 Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
 Neal Cardwell <ncardwell@...gle.com>, Willem de Bruijn <willemb@...gle.com>,
 Mina Almasry <almasrymina@...gle.com>, Kuniyuki Iwashima
 <kuni1840@...il.com>, bpf@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH v9 bpf-next/net 6/6] selftest: bpf: Add test for
 SK_MEMCG_EXCLUSIVE.

On 9/17/25 6:17 PM, Kuniyuki Iwashima wrote:
>>> +
>>> +close:
>>> +     for (i = 0; i < ARRAY_SIZE(sk); i++)
>>> +             close(sk[i]);
>>> +
>>> +     if (test_case->type == SOCK_DGRAM) {
>>> +             /* UDP recv queue is destroyed after RCU grace period.
>>> +              * With one kern_sync_rcu(), memory_allocated[0] of the
>>> +              * isolated case often matches with memory_allocated[1]
>>> +              * of the preceding non-exclusive case.
>>> +              */
>> I don't think I understand the double kern_sync_rcu() below.
> With one kern_sync_rcu(), when I added bpf_printk() for memory_allocated,
> I sometimes saw two consecutive non-zero values, meaning memory_allocated[0]
> still sees the previous test case result (memory_allocated[1]).
> ASSERT_LE() succeeds as expected, but somewhat unintentionally.
> 
> bpf_trace_printk: memory_allocated: 0 <-- non exclusive case
> bpf_trace_printk: memory_allocated: 4160
> bpf_trace_printk: memory_allocated: 4160 <-- exclusive case's
> memory_allocated[0]
> bpf_trace_printk: memory_allocated: 0
> bpf_trace_printk: memory_allocated: 0
> bpf_trace_printk: memory_allocated: 0
> 
> One kern_sync_rcu() is enough to kick call_rcu() + sk_destruct() but
> does not guarantee that it completes, so if the queue length was too long,
> the memory_allocated does not go down fast enough.
> 
> But now I don't see the flakiness with NR_SEND 32, and one
> kern_sync_rcu() might be enough unless the env is too slow...?

Ah, got it. I sent you down the wrong path. It needs rcu_barrier() instead.

Is recv() enough? Maybe just recv(MSG_DONTWAIT) to drain the queue, and only for
the UDP sockets? That will slow down the UDP test... so read only 1 byte and let
the rest of each datagram be truncated (MSG_TRUNC)?
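
To make the recv() idea concrete, here is a rough sketch of what I have in mind.
drain_udp_queue() is a made-up name, not something in the patch or in the
selftest helpers, and it is untested against this series. It assumes the socket
is a datagram socket, so a 1-byte read dequeues the whole datagram:

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper, not part of the patch: consume every datagram
 * currently queued on a datagram socket without blocking.
 * Returns the number of datagrams consumed, or -1 on an unexpected error.
 */
static int drain_udp_queue(int fd)
{
	char byte;
	int drained = 0;

	for (;;) {
		/* A 1-byte read dequeues the whole datagram; the rest of the
		 * payload is discarded.  MSG_TRUNC only makes recv() report
		 * the real datagram length instead of 1.
		 */
		ssize_t ret = recv(fd, &byte, sizeof(byte),
				   MSG_DONTWAIT | MSG_TRUNC);

		if (ret >= 0) {
			drained++;
			continue;
		}

		if (errno == EINTR)
			continue;

		/* Queue is empty: nothing left to consume. */
		if (errno == EAGAIN || errno == EWOULDBLOCK)
			return drained;

		return -1;
	}
}
```

Calling something like this on each UDP socket before close() would leave the
receive queue empty by the time the socket is destroyed, so the test would not
have to wait for the deferred queue purge.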

Btw, does the test need 64 sockets? Is it because of the per-socket snd/rcvbuf
limit?

Another option is to trace SEC("fexit/__sk_destruct") to ensure all the cleanup
is done, but that seems overkill if recv() is enough.
