Message-ID: <CAO3-Pbqo_bfYsstH47hgqx7GC0CUg1H0xUaewq=MkUvb2BzCZA@mail.gmail.com>
Date: Tue, 18 Jul 2023 22:10:44 -0500
From: Yan Zhai <yan@...udflare.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: David Ahern <dsahern@...nel.org>, Ivan Babrou <ivan@...udflare.com>,
Linux Kernel Network Developers <netdev@...r.kernel.org>, kernel-team <kernel-team@...udflare.com>,
Eric Dumazet <edumazet@...gle.com>, "David S. Miller" <davem@...emloft.net>,
Paolo Abeni <pabeni@...hat.com>, Steven Rostedt <rostedt@...dmis.org>,
Masami Hiramatsu <mhiramat@...nel.org>, Willem de Bruijn <willemdebruijn.kernel@...il.com>
Subject: Re: Stacks leading into skb:kfree_skb
On Tue, Jul 18, 2023 at 5:36 PM Jakub Kicinski <kuba@...nel.org> wrote:
>
> On Fri, 14 Jul 2023 18:54:14 -0600 David Ahern wrote:
> > > I made some aggregations for the stacks we see leading into
> > > skb:kfree_skb endpoint. There's a lot of data that is not easily
> > > digestible, so I lightly massaged the data and added flamegraphs in
> > > addition to raw stack counts. Here's the gist link:
> > >
> > > * https://gist.github.com/bobrik/0e57671c732d9b13ac49fed85a2b2290
> >
> > I see a lot of packet_rcv as the tip before kfree_skb. How many packet
> > sockets do you have running on that box? Can you accumulate the total
> > packet_rcv -> kfree_skb_reasons into 1 count -- regardless of remaining
> > stacktrace?
>
> On a quick look we have 3 branches which can get us to kfree_skb from
> packet_rcv:
>
> if (skb->pkt_type == PACKET_LOOPBACK)
> goto drop;
> ...
> if (!net_eq(dev_net(dev), sock_net(sk)))
> goto drop;
> ...
> res = run_filter(skb, sk, snaplen);
> if (!res)
> goto drop_n_restore;
>
> I'd guess it's the last one? Which we should mark with the SOCKET_FILTER
> drop reason?
So we have multiple packet socket consumers on our edge:
* systemd-networkd: listens on ETH_P_LLDP. This is the role model: it
does not do anything excessive.
* lldpd: I am not sure why we need this one in the presence of
systemd-networkd, but it is running atm, which contributes constant
packet_rcv calls. It listens on ETH_P_ALL because of
https://github.com/lldpd/lldpd/pull/414, but its filter does the right
work, so packets hitting this one are mostly "consumed".
Now the bad kids:
* arping: listens on ETH_P_ALL. This one contributes all of the
skb:kfree_skb spikes, and the reason is that sk_rmem_alloc overflows
rcvbuf. I suspect a poorly constructed filter is letting too many
packets get queued too fast.
* conduit-watcher: a health checker that sends packets on ETH_P_IP in
a non-init netns. The majority of packet_rcv calls for this one go
straight to a drop due to the netns mismatch.
So to conclude, it would be useful to set a reason at least for the
rcvbuf-related drops. On the other hand, almost all packets entering
packet_rcv are shared, so clone failure can probably also happen under
memory pressure.
--
Yan