lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 10 Dec 2021 11:08:23 +0100
From:   Daniel Borkmann <daniel@...earbox.net>
To:     Martin KaFai Lau <kafai@...com>
Cc:     Willem de Bruijn <willemdebruijn.kernel@...il.com>,
        netdev@...r.kernel.org, Alexei Starovoitov <ast@...nel.org>,
        David Miller <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>, kernel-team@...com
Subject: Re: [RFC PATCH net-next 2/2] net: Reset forwarded skb->tstamp before
 delivering to user space

On 12/10/21 2:37 AM, Martin KaFai Lau wrote:
> On Thu, Dec 09, 2021 at 01:58:52PM +0100, Daniel Borkmann wrote:
>>> Daniel, do you have suggestion on where to temporarily store
>>> the forwarded EDT so that the bpf@...ress can access?
>>
>> Hm, was thinking maybe moving skb->skb_mstamp_ns into the shared info as
>> in skb_hwtstamps(skb)->skb_mstamp_ns could work. In other words, as a union
>> with hwtstamp to not bloat it further. And TCP stack as well as everything
>> else (like sch_fq) could switch to it natively (hwtstamp might only be used
>> on RX or TX completion from driver side if I'm not mistaken).
>>
>> But then while this would solve the netns transfer, we would run into the
>> /same/ issue again when implementing a hairpinning LB where we loop from RX
>> to TX given this would have to be cleared somewhere again if driver populates
>> hwtstamp, so not really feasible and bloating shared info with a second
>> tstamp would bump it by one cacheline. :(
> If the edt is set at skb_hwtstamps,
> skb->tstamp probably needs to be re-populated for the bpf@...egress
> but should be minor since there is a skb_at_tc_ingress() test.
> 
> It seems fq does not need shinfo now, so that will be an extra cacheline to
> bring... hmm

Right. :/ The other thing I was wondering (but haven't looked enough into the
code yet whether feasible or not) ... maybe skb_hwtstamps(skb)->hwtstamp could
be changed to cover both hw & sw ingress tstamp (meaning, if nic doesn't provide
it, then we fall back to the sw one and __net_timestamp() stores it there, too)
whereas skb->tstamp would always concern an egress tstamp. However, it might
result in quite a lot of churn given the wider-spread use, but more importantly,
performance implications are also not quite clear as you mentioned above wrt
extra cache miss.

>> A cleaner BUT still non-generic solution compared to the previous diff I could
>> think of might be the below. So no change in behavior in general, but if the
>> bpf@...ress@...h@...t needs to access the original tstamp, it could do so
>> via existing mapping we already have in BPF, and then it could transfer it
>> for all or certain traffic (up to the prog) via BPF code setting ...
>>
>>    skb->tstamp = skb->hwtstamp
>>
>> ... and do the redirect from there to the phys dev with BPF_F_KEEP_TSTAMP
>> flag. Minimal intrusive, but unfortunately only accessible for BPF. Maybe use
>> of skb_hwtstamps(skb)->nststamp could be extended though (?)
> I like the idea of the possibility in temporarily storing a future mono EDT
> in skb_shared_hwtstamps.
> 
> It may open up some possibilities.  Not sure how that may look like yet
> but I will try to develop on this.

Ok! One thing I noticed later in the diff, that for the ingressing direction
aka phys -> host veth -> netns veth, we also do the skb_xfer_tstamp() switch
and might override the one stored from driver with potentially the one from
__net_timestamp(), but maybe for netns'es that's acceptable (perhaps a test
for existing skb->sk owner before skb_xfer_tstamp() could do the trick..).

> I may have to separate the fwd-edt problem from __sk_buff->tstamp accessibility
> @ingress to keep it simple first.
> will try to make it generic also before scaling back to a bpf-specific solution.

Yeah sounds good, if we can solve it generically, even better!

> Thanks for the code and the idea !
Thanks,
Daniel

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ