Message-ID: <67b3da89bc6c7_c0e25294cb@willemb.c.googlers.com.notmuch>
Date: Mon, 17 Feb 2025 19:55:37 -0500
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>,
Jason Xing <kerneljasonxing@...il.com>,
Martin KaFai Lau <martin.lau@...ux.dev>
Cc: Willem de Bruijn <willemdebruijn.kernel@...il.com>,
davem@...emloft.net,
edumazet@...gle.com,
kuba@...nel.org,
pabeni@...hat.com,
dsahern@...nel.org,
willemb@...gle.com,
ast@...nel.org,
daniel@...earbox.net,
andrii@...nel.org,
eddyz87@...il.com,
song@...nel.org,
yonghong.song@...ux.dev,
john.fastabend@...il.com,
kpsingh@...nel.org,
sdf@...ichev.me,
haoluo@...gle.com,
jolsa@...nel.org,
horms@...nel.org,
bpf@...r.kernel.org,
netdev@...r.kernel.org
Subject: Re: [PATCH bpf-next v11 08/12] bpf: add BPF_SOCK_OPS_TS_HW_OPT_CB callback

Willem de Bruijn wrote:
> Jason Xing wrote:
> > On Sun, Feb 16, 2025 at 6:58 AM Martin KaFai Lau <martin.lau@...ux.dev> wrote:
> > >
> > > On 2/15/25 2:23 PM, Jason Xing wrote:
> > > > On Sun, Feb 16, 2025 at 2:08 AM Willem de Bruijn
> > > > <willemdebruijn.kernel@...il.com> wrote:
> > > >>
> > > >> Jason Xing wrote:
> > > >>> On Sat, Feb 15, 2025 at 11:06 PM Willem de Bruijn
> > > >>> <willemdebruijn.kernel@...il.com> wrote:
> > > >>>>
> > > >>>> Jason Xing wrote:
> > > >>>>> Support hw SCM_TSTAMP_SND case for bpf timestamping.
> > > >>>>>
> > > >>>>> Add a new sock_ops callback, BPF_SOCK_OPS_TS_HW_OPT_CB. This
> > > >>>>> callback will occur at the same timestamping point as the user
> > > >>>>> space's hardware SCM_TSTAMP_SND. The BPF program can use it to
> > > >>>>> get the same SCM_TSTAMP_SND timestamp without modifying the
> > > >>>>> user-space application.
> > > >>>>>
> > > >>>>> To avoid increasing the code complexity, replace SKBTX_HW_TSTAMP
> > > >>>>> with SKBTX_HW_TSTAMP_NOBPF instead of changing numerous callers
> > > >>>>> from driver side using SKBTX_HW_TSTAMP. The new definition of
> > > >>>>> SKBTX_HW_TSTAMP means the combination tests of socket timestamping
> > > >>>>> and bpf timestamping. After this patch, drivers can work under the
> > > >>>>> bpf timestamping.
> > > >>>>>
> > > >>>>> Considering some drivers doesn't assign the skb with hardware
> > > >>>>> timestamp,
> > > >>>>
> > > >>>> This is not for a real technical limitation, like the skb perhaps
> > > >>>> being cloned or shared?
> > > >>>
> > > >>> Agreed on this point. I'm kind of familiar with I40E, so I dare to say
> > > >>> the reason why it doesn't assign the hwtstamp is because the skb will
> > > >>> soon be destroyed, that is to say, it's pointless to assign the
> > > >>> timestamp.
> > > >>
> > > >> Makes sense.
> > > >>
> > > >> But that does not ensure that the skb is exclusively owned. Nor that
> > > >> the same is true for all drivers using this API (which is not small,
> > > >> but small enough to manually review if need be).
> > > >>
> > > >> The first two examples I happened to look at, i40e and bnx2x, both use
> > > >> skb_get() to get a non-exclusive skb reference for their ptp_tx_skb.
> > >
> > > I think the existing __skb_tstamp_tx() function is also assigning to
> > > skb_hwtstamps(skb). The skb may be cloned from the orig_skb first, but they
> > > still share the same shinfo. My understanding is that this patch is assigning to
> > > the shinfo earlier, so it should not have changed the driver's expectation on
> > > the skb_hwtstamps(skb) after calling __skb_tstamp_tx(). If there are drivers
> > > assuming exclusive access to the skb_hwtstamps(skb), probably it is something
> > > that needs to be addressed regardless and should not be the common case?
> >
> > Right, it's also what I was trying to say but missed. Thanks for the
> > supplementary info:)
>
> That existing behavior looks dodgy then, too.
>
> I don't have time to look into it deeply right now. But it seems to go
> back all the way to the introduction of hw timestamping in commit
> ac45f602ee3d in 2009.
>
> I can see how it works in that nothing else holding a clone will
> likely have a reason to touch those fields. But that does not make it
> correct.
>
> Your point that the new code is no worse than today probably is true.
> But when we spot something we prefer to fix it probably. Will need a
> deeper look..

The original commit explains the rationale. It is as I expected: the
field is newly introduced and for every skb it is therefore known that
no other path exists that touches that field.

"
The new semantic for hardware/software time stamping around
ndo_start_xmit() is based on two assumptions about existing
network device drivers which don't support hardware time
stamping and know nothing about it:
- they leave the new skb_shared_tx unmodified
- the keep the connection to the originating socket in skb->sk
  alive, i.e., don't call skb_orphan()

Given that skb_shared_tx is new, the first assumption is safe.
"

I'm not aware of us relying on such soft assurances for other fields
in skb_shared_info when they are accessed while the skb is cloned. But
we can consider that out of scope for this series.
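
To illustrate the sharing concern discussed above, here is a minimal
userspace sketch. It is not kernel code: the model_* names are
hypothetical, and the plain int refcount stands in for the kernel's
atomic dataref. It only models the point that a clone gets its own skb
head but points at the same shared info block, so a timestamp written
through one holder is visible through every other holder.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace model only. All names here are hypothetical stand-ins. */
struct model_shinfo {
	int dataref;       /* how many skbs point at this block (not atomic here) */
	uint64_t hwtstamp; /* stands in for skb_hwtstamps(skb)->hwtstamp */
};

struct model_skb {
	struct model_shinfo *shinfo; /* shared between an skb and its clones */
};

/* Allocate an skb with a fresh shared info block that it owns alone. */
struct model_skb *model_alloc(void)
{
	struct model_skb *skb = malloc(sizeof(*skb));

	if (!skb)
		return NULL;
	skb->shinfo = calloc(1, sizeof(*skb->shinfo));
	if (!skb->shinfo) {
		free(skb);
		return NULL;
	}
	skb->shinfo->dataref = 1;
	return skb;
}

/* Clone: new skb head, same shared info block, one more reference. */
struct model_skb *model_clone(const struct model_skb *orig)
{
	struct model_skb *clone;

	assert(orig && orig->shinfo);
	clone = malloc(sizeof(*clone));
	if (!clone)
		return NULL;
	clone->shinfo = orig->shinfo;
	clone->shinfo->dataref++;
	return clone;
}

/* Drop one reference; free the shared block with the last holder. */
void model_free(struct model_skb *skb)
{
	if (!skb)
		return;
	if (--skb->shinfo->dataref == 0)
		free(skb->shinfo);
	free(skb);
}
```

In this model, storing a value in clone->shinfo->hwtstamp changes what
orig->shinfo->hwtstamp reads, which is why a write is only safe if no
other holder touches that field.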