lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <32c53d3c-8393-c5ba-4a43-6e40bd2ed258@iogearbox.net>
Date:   Fri, 17 Dec 2021 12:13:14 +0100
From:   Daniel Borkmann <daniel@...earbox.net>
To:     Martin KaFai Lau <kafai@...com>
Cc:     Willem de Bruijn <willemdebruijn.kernel@...il.com>,
        netdev@...r.kernel.org, Alexei Starovoitov <ast@...nel.org>,
        David Miller <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>, kernel-team@...com
Subject: Re: [RFC PATCH v2 net-next] net: Preserve skb delivery time during
 forward

On 12/17/21 8:33 AM, Martin KaFai Lau wrote:
> On Fri, Dec 17, 2021 at 12:42:30AM +0100, Daniel Borkmann wrote:
>> On 12/16/21 11:58 PM, Willem de Bruijn wrote:
>>>>>> @@ -530,7 +538,14 @@ struct skb_shared_info {
>>>>>>           /* Warning: this field is not always filled in (UFO)! */
>>>>>>           unsigned short  gso_segs;
>>>>>>           struct sk_buff  *frag_list;
>>>>>> -       struct skb_shared_hwtstamps hwtstamps;
>>>>>> +       union {
>>>>>> +               /* If SKBTX_DELIVERY_TSTAMP is set in tx_flags,
>>>>>> +                * tx_delivery_tstamp is stored instead of
>>>>>> +                * hwtstamps.
>>>>>> +                */
>>>>>
>>>>> Should we just encode the timebase and/or type { timestamp,
>>>>> delivery_time } in th lower bits of the timestamp field? Its
>>>>> resolution is higher than actual clock precision.
>>>> In skb->tstamp ?
>>>
>>> Yes. Arguably a hack, but those bits are in the noise now, and it
>>> avoids the clone issue with skb_shinfo (and scarcity of flag bits
>>> there).
>>>
>>>>> is non-zero skb->tstamp test not sufficient, instead of
>>>>> SKBTX_DELIVERY_TSTAMP_ALLOW_FWD.
>>>>>
>>>>> It is if only called on the egress path. Is bpf on ingress the only
>>>>> reason for this?
>>>> Ah. ic.  meaning testing non-zero skb->tstamp and then call
>>>> skb_save_delivery_time() only during the veth-egress-path:
>>>> somewhere in veth_xmit() => veth_forward_skb() but before
>>>> skb->tstamp was reset to 0 in __dev_forward_skb().
>>>
>>> Right. If delivery_time is the only use of skb->tstamp on egress, and
>>> timestamp is the only use on ingress, then the only time the
>>> delivery_time needs to be cached if when looping from egress to
>>> ingress and this field is non-zero.
>>>
>>>> Keep *_forward() and bpf_out_*() unchanged (i.e. keep skb->tstamp = 0)
>>>> because the skb->tstamp could be stamped by net_timestamp_check().
>>>>
>>>> Then SKBTX_DELIVERY_TSTAMP_ALLOW_FWD is not needed.
>>>>
>>>> Did I understand your suggestion correctly?
>>>
>>> I think so.
>>>
>>> But the reality is complicated if something may be setting a delivery
>>> time on ingress (a BPF filter?)
>>
>> I'm not quite following the 'bpf_out_*() unchanged (i.e. keep skb->tstamp = 0)'
>> part yet; in our case we would need to preserve it as well, for example, we are
>> redirecting via bpf from bpf@...ingress@...t-veth to bpf@...egress@...s-dev in
>> the egress path and fq sits on phys-dev.. (I mean if needed we could easily do
>> that as shown in my prev diff with a flag for the helper).
> Right, we have the same use case:
>      redirecting from bpf@...ingress@...t-veth to bpf@...egress@...s-dev in
>      the egress path and fq sits on phys-dev
> 
> My earlier comment was on having the delivery_time preserved in
> the skb_shared_hwtstamps.  The delivery_time (e.g. EDT) and
> timestamp (timestamp as RX timestamp) are separately stored when
> looping from veth-egress to veth-ingress:
> 
> 	delivery_time in skb_shared_hwtstamps
> 	rx timestamp in skb->tstamp
> 
> Thus, when bpf_redirect_neigh(phys-dev) happens, bpf_out_*() can
> continue to reset skb->tstamp as-is while delivery_time will
> automatically be kept in skb_shared_hwtstamps.  When the skb
> reaches the egress@...s-dev (__dev_queue_xmit), the delivery_time
> in skb_shared_hwtstamps will be restored into skb->tstamp (done
> in skb_restore_delivery_time in this patch).

I think that could probably work, also in particular if it's restored once
on stacked devs e.g. when instead of phys-dev we're dealing with upper
tunnel dev (e.g. vxlan/geneve + BPF with collect_md). Wouldn't we still
need something like SKBTX_DELIVERY_TSTAMP_ALLOW_FWD, e.g. when the phys
driver sets skb_hwtstamps(skb)->hwtstamp on RX, and this gets carried on
the ingress path to the target namespaces' socket?

Thanks,
Daniel

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ