Message-ID: <a8ebb0c6-5f67-411a-8513-a82c083abd8c@linux.dev>
Date: Mon, 25 Aug 2025 16:14:35 -0700
From: Martin KaFai Lau <martin.lau@...ux.dev>
To: Kuniyuki Iwashima <kuniyu@...gle.com>
Cc: Alexei Starovoitov <ast@...nel.org>, Andrii Nakryiko <andrii@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
John Fastabend <john.fastabend@...il.com>,
Stanislav Fomichev <sdf@...ichev.me>, Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...nel.org>, Roman Gushchin <roman.gushchin@...ux.dev>,
Shakeel Butt <shakeel.butt@...ux.dev>, "David S. Miller"
<davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Neal Cardwell <ncardwell@...gle.com>, Willem de Bruijn <willemb@...gle.com>,
Mina Almasry <almasrymina@...gle.com>, Kuniyuki Iwashima
<kuni1840@...il.com>, bpf@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH v1 bpf-next/net 2/8] bpf: Add a bpf hook in
__inet_accept().
On 8/25/25 11:14 AM, Kuniyuki Iwashima wrote:
> On Mon, Aug 25, 2025 at 10:57 AM Martin KaFai Lau <martin.lau@...ux.dev> wrote:
>>
>> On 8/22/25 3:17 PM, Kuniyuki Iwashima wrote:
>>> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
>>> index ae83ecda3983..ab613abdfaa4 100644
>>> --- a/net/ipv4/af_inet.c
>>> +++ b/net/ipv4/af_inet.c
>>> @@ -763,6 +763,8 @@ void __inet_accept(struct socket *sock, struct socket *newsock, struct sock *new
>>> kmem_cache_charge(newsk, gfp);
>>> }
>>>
>>> + BPF_CGROUP_RUN_PROG_INET_SOCK_ACCEPT(newsk);
>>> +
>>> if (mem_cgroup_sk_enabled(newsk)) {
>>> int amt;
>>>
>>> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
>>> index 233de8677382..80df246d4741 100644
>>> --- a/tools/include/uapi/linux/bpf.h
>>> +++ b/tools/include/uapi/linux/bpf.h
>>> @@ -1133,6 +1133,7 @@ enum bpf_attach_type {
>>> BPF_NETKIT_PEER,
>>> BPF_TRACE_KPROBE_SESSION,
>>> BPF_TRACE_UPROBE_SESSION,
>>> + BPF_CGROUP_INET_SOCK_ACCEPT,
>>
>> Instead of adding another hook, can the SK_BPF_MEMCG_SOCK_ISOLATED bit be
>> inherited from the listener?
>
> Since e876ecc67db80 and d752a4986532c, we defer memcg allocation to
> accept() because the child socket could be created in irq context with an
> unrelated cgroup. There was another reason: if the listener was created in
> the root cgroup and passed to a process under a cgroup, child sockets would
> never have sk_memcg if sk_memcg were inherited.
>
> So, the child's memcg is not always the same as the listener's, and
> we cannot rely on the listener's sk_memcg.

I didn't mean to inherit the entire sk_memcg pointer. I meant to only inherit
the SK_BPF_MEMCG_SOCK_ISOLATED bit.

If it can only be done at accept, there is already an existing
SEC("lsm_cgroup/socket_accept") hook. Take a look at
tools/testing/selftests/bpf/progs/lsm_cgroup.c. The lsm socket_accept hook
doesn't have access to "newsock->sk", but it should have access to
"sock->sk". The bpf prog could do bpf_setsockopt() on "sock->sk" and the bit
would then be inherited by newsock->sk (?)
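
To make the idea concrete, here is a toy userspace model of "inherit only
the bit, not the pointer". It is not kernel code and not a bpf prog. Every
name in it (toy_sock, toy_clone_child, TOY_MEMCG_SOCK_ISOLATED, ...) is
made up for illustration, and it only assumes what is described above: the
child's memcg is assigned at accept() from the accepting task, while the
flag bit is copied from the listener when the child is created.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model, NOT kernel code. All names are hypothetical
 * stand-ins used only to illustrate the idea discussed in this thread.
 */
#define TOY_MEMCG_SOCK_ISOLATED 0x1u

struct toy_memcg {
	const char *name;
};

struct toy_sock {
	struct toy_memcg *memcg;	/* models sk_memcg */
	unsigned int memcg_flags;	/* models the per-socket flag bits */
};

/* Child creation (may happen in irq context): copy only the flag bit,
 * never the listener's memcg pointer.
 */
static struct toy_sock toy_clone_child(const struct toy_sock *listener)
{
	struct toy_sock child = {
		.memcg = NULL,
		.memcg_flags = listener->memcg_flags & TOY_MEMCG_SOCK_ISOLATED,
	};

	return child;
}

/* accept(): the memcg comes from the accepting task, matching the
 * deferred allocation described in the quoted text.
 */
static void toy_accept(struct toy_sock *child, struct toy_memcg *current_memcg)
{
	child->memcg = current_memcg;
}

/* Returns 1 if the child ends up in the accepting task's memcg and
 * still carries the listener's isolated bit, 0 otherwise.
 */
static int toy_demo(void)
{
	struct toy_memcg root = { "root" }, app = { "app" };
	struct toy_sock listener = { &root, TOY_MEMCG_SOCK_ISOLATED };
	struct toy_sock child = toy_clone_child(&listener);

	toy_accept(&child, &app);

	return child.memcg == &app &&
	       (child.memcg_flags & TOY_MEMCG_SOCK_ISOLATED) != 0;
}
```

In this model the listener lives in "root" and the accepting task in "app";
the child is charged to "app" and keeps the isolated bit, so the two
properties stay independent of each other.
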
There are already quite enough cgroup-sk style hooks. I would prefer not to add
another cgroup attach_type and instead see if some of the existing ones can be
reused. There is also SEC("lsm/sock_graft").