Message-ID: <CAAVpQUDUULCrcTP4AQ31B5bfo-+dtw3H8CQGq9_SQ7d28xXSvA@mail.gmail.com>
Date: Mon, 25 Aug 2025 11:14:22 -0700
From: Kuniyuki Iwashima <kuniyu@...gle.com>
To: Martin KaFai Lau <martin.lau@...ux.dev>
Cc: Alexei Starovoitov <ast@...nel.org>, Andrii Nakryiko <andrii@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>, John Fastabend <john.fastabend@...il.com>,
Stanislav Fomichev <sdf@...ichev.me>, Johannes Weiner <hannes@...xchg.org>, Michal Hocko <mhocko@...nel.org>,
Roman Gushchin <roman.gushchin@...ux.dev>, Shakeel Butt <shakeel.butt@...ux.dev>,
"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Neal Cardwell <ncardwell@...gle.com>, Willem de Bruijn <willemb@...gle.com>,
Mina Almasry <almasrymina@...gle.com>, Kuniyuki Iwashima <kuni1840@...il.com>, bpf@...r.kernel.org,
netdev@...r.kernel.org
Subject: Re: [PATCH v1 bpf-next/net 2/8] bpf: Add a bpf hook in __inet_accept().
On Mon, Aug 25, 2025 at 10:57 AM Martin KaFai Lau <martin.lau@...ux.dev> wrote:
>
> On 8/22/25 3:17 PM, Kuniyuki Iwashima wrote:
> > diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> > index ae83ecda3983..ab613abdfaa4 100644
> > --- a/net/ipv4/af_inet.c
> > +++ b/net/ipv4/af_inet.c
> > @@ -763,6 +763,8 @@ void __inet_accept(struct socket *sock, struct socket *newsock, struct sock *new
> > kmem_cache_charge(newsk, gfp);
> > }
> >
> > + BPF_CGROUP_RUN_PROG_INET_SOCK_ACCEPT(newsk);
> > +
> > if (mem_cgroup_sk_enabled(newsk)) {
> > int amt;
> >
> > diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> > index 233de8677382..80df246d4741 100644
> > --- a/tools/include/uapi/linux/bpf.h
> > +++ b/tools/include/uapi/linux/bpf.h
> > @@ -1133,6 +1133,7 @@ enum bpf_attach_type {
> > BPF_NETKIT_PEER,
> > BPF_TRACE_KPROBE_SESSION,
> > BPF_TRACE_UPROBE_SESSION,
> > + BPF_CGROUP_INET_SOCK_ACCEPT,
>
> Instead of adding another hook, can the SK_BPF_MEMCG_SOCK_ISOLATED bit be
> inherited from the listener?
Since e876ecc67db80 and d752a4986532c, we defer memcg allocation to
accept() because the child socket could be created in IRQ context with an
unrelated cgroup. There was another reason: if the listener was created in
the root cgroup and passed to a process under a cgroup, child sockets would
never have sk_memcg if sk_memcg were inherited.
So the child's memcg is not always the same as the listener's, and
we cannot rely on the listener's sk_memcg.
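
To make the second case concrete, here is a toy user-space model. It is not kernel code: every toy_* name is made up for illustration, and a NULL memcg pointer is assumed to stand in for "no sk_memcg" (listener created in the root cgroup). It only sketches why inheriting from the listener and deciding at accept() time can give different results.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only; these types are hypothetical, not the kernel's. */
struct toy_memcg {
	const char *name;
};

struct toy_sock {
	const struct toy_memcg *memcg;	/* NULL models "no sk_memcg" */
};

/* Option A: the child copies whatever the listener has. */
static struct toy_sock toy_child_inherit(const struct toy_sock *listener)
{
	struct toy_sock child = { .memcg = listener->memcg };

	return child;
}

/* Option B: the memcg is taken from the task that calls accept(). */
static struct toy_sock toy_child_at_accept(const struct toy_memcg *current_memcg)
{
	struct toy_sock child = { .memcg = current_memcg };

	return child;
}

/*
 * Listener created in the root cgroup, then handed to a process
 * running under a (hypothetical) cgroup "app".
 */
static int toy_demo(void)
{
	struct toy_memcg app = { .name = "app" };
	struct toy_sock listener = { .memcg = NULL };

	struct toy_sock inherited = toy_child_inherit(&listener);
	struct toy_sock deferred = toy_child_at_accept(&app);

	/* Inheritance leaves the child uncharged forever ... */
	assert(inherited.memcg == NULL);
	/* ... while deferring to accept() picks up the caller's cgroup. */
	assert(deferred.memcg == &app);

	return inherited.memcg != deferred.memcg;
}
```

In this model the two children end up with different memcgs, which is the situation described above: the listener's state says nothing reliable about the child's.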