[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <94c88020-5282-c82b-8f88-a2d012444699@iogearbox.net>
Date: Mon, 30 Oct 2023 15:19:26 +0100
From: Daniel Borkmann <daniel@...earbox.net>
To: Peilin Ye <yepeilin.cs@...il.com>
Cc: "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Alexei Starovoitov <ast@...nel.org>,
Andrii Nakryiko <andrii@...nel.org>, Martin KaFai Lau
<martin.lau@...ux.dev>, Song Liu <song@...nel.org>,
Yonghong Song <yonghong.song@...ux.dev>,
John Fastabend <john.fastabend@...il.com>, KP Singh <kpsingh@...nel.org>,
Stanislav Fomichev <sdf@...gle.com>, Hao Luo <haoluo@...gle.com>,
Jiri Olsa <jolsa@...nel.org>, Jesper Dangaard Brouer <hawk@...nel.org>,
Peilin Ye <peilin.ye@...edance.com>, netdev@...r.kernel.org,
bpf@...r.kernel.org, linux-kernel@...r.kernel.org,
Cong Wang <cong.wang@...edance.com>, Jiang Wang <jiang.wang@...edance.com>,
Youlun Zhang <zhangyoulun@...edance.com>
Subject: Re: [PATCH net] veth: Fix RX stats for bpf_redirect_peer() traffic
On 10/29/23 1:11 AM, Peilin Ye wrote:
> On Sat, Oct 28, 2023 at 09:06:44AM +0200, Daniel Borkmann wrote:
>>>> diff --git a/net/core/filter.c b/net/core/filter.c
>>>> index 21d75108c2e9..7aca28b7d0fd 100644
>>>> --- a/net/core/filter.c
>>>> +++ b/net/core/filter.c
>>>> @@ -2492,6 +2492,7 @@ int skb_do_redirect(struct sk_buff *skb)
>>>> net_eq(net, dev_net(dev))))
>>>> goto out_drop;
>>>> skb->dev = dev;
>>>> + dev_sw_netstats_rx_add(dev, skb->len);
>>>
>>> This assumes that all devices that support BPF_F_PEER (currently only
>>> veth) use tstats (instead of lstats, or dstats) - is that okay?
>>
>> Dumb question, but why all this change and not simply just call ...
>>
>> dev_lstats_add(dev, skb->len)
>>
>> ... on the host dev ?
>
> Since I didn't want to update host-veth's TX counters. If we
> bpf_redirect_peer()ed a packet from NIC TC ingress to Pod-veth TC ingress,
> I think it means we've bypassed host-veth TX?
Yes. So the idea is to transition to tstats replace the location where
we used to bump lstats with tstat's tx counter, and only the peer redirect
would bump the rx counter.. then upon stats traversal we fold the latter into
the rx stats which was populated by the opposite's tx counters. Makes sense.
OT: does cadvisor run inside the Pod to collect the device stats? Just
curious how it gathers them.
>>> If not, should I add another NDO e.g. ->ndo_stats_rx_add()?
>>
>> Definitely no new stats ndo resp indirect call in fast path.
>
> Yeah, I think I'll put a comment saying that all devices that support
> BPF_F_PEER must use tstats (or must use lstats), then.
sgtm.
Thanks,
Daniel
Powered by blists - more mailing lists