lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3c7c5f25-593f-4b48-9274-a18a9ea61e8f@linux.dev>
Date: Tue, 29 Oct 2024 22:37:18 -0700
From: Martin KaFai Lau <martin.lau@...ux.dev>
To: Jason Xing <kerneljasonxing@...il.com>,
 Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: willemb@...gle.com, davem@...emloft.net, edumazet@...gle.com,
 kuba@...nel.org, pabeni@...hat.com, dsahern@...nel.org, ast@...nel.org,
 daniel@...earbox.net, andrii@...nel.org, eddyz87@...il.com, song@...nel.org,
 yonghong.song@...ux.dev, john.fastabend@...il.com, kpsingh@...nel.org,
 sdf@...ichev.me, haoluo@...gle.com, jolsa@...nel.org, shuah@...nel.org,
 ykolal@...com, bpf@...r.kernel.org, netdev@...r.kernel.org,
 Jason Xing <kernelxing@...cent.com>
Subject: Re: [PATCH net-next v3 02/14] net-timestamp: allow two features to
 work parallelly

On 10/29/24 8:04 PM, Jason Xing wrote:
>>>>>>>    static void skb_tstamp_tx_output(struct sk_buff *orig_skb,
>>>>>>>                                 const struct sk_buff *ack_skb,
>>>>>>>                                 struct skb_shared_hwtstamps *hwtstamps,
>>>>>>> @@ -5549,6 +5575,9 @@ static void skb_tstamp_tx_output(struct sk_buff *orig_skb,
>>>>>>>        u32 tsflags;
>>>>>>>
>>>>>>>        tsflags = READ_ONCE(sk->sk_tsflags);
>>>>>>> +     if (!sk_tstamp_tx_flags(sk, tsflags, tstype))
>>>>>>
>>>>>> I still don't get this part since v2. How does it work with cmsg only
>>>>>> SOF_TIMESTAMPING_TX_*?
>>>>>>
>>>>>> I tried with "./txtimestamp -6 -c 1 -C -N -L ::1" and it does not return any tx
>>>>>> time stamp after this patch.
>>>>>>
>>>>>> I am likely missing something
>>>>>> or v2 concluded that this behavior change is acceptable?
>>>>>
>>>>> Sorry, I submitted this series accidentally removing one important
>>>>> thing which is similar to what Vadim Fedorenko mentioned in the v1
>>>>> [1]:
>>>>> adding another member like sk_flags_bpf to handle the cmsg case.
>>>>>
>>>>> Willem, would it be acceptable to add another field in struct sock to
>>>>> help us recognise the case where BPF and cmsg works parallelly?
>>>>>
>>>>> [1]: https://lore.kernel.org/all/662873cb-a897-464e-bdb3-edf01363c3b2@linux.dev/
>>>>
>>>> The current timestamp flags don't need a u32. Maybe just reserve a bit
>>>> for this purpose?
>>>
>>> Sure. Good suggestion.
>>>
>>> But I think only using one bit to reflect whether the sk->sk_tsflags
>>> is used by normal or cmsg features is not enough. The existing
>>> implementation in tcp_sendmsg_locked() doesn't override the
>>> sk->sk_tsflags even the normal and cmsg features enabled parallelly.
>>> It only overrides sockc.tsflags in tcp_sendmsg_locked(). Based on
>>> that, even if at some point users suddenly remove the cmsg use and
>>> then the prior normal SO_TIMESTAMPING continues to work.
>>>
>>> How about this, please see below:
>>> For now, sk->sk_tsflags only uses 17 bits (see the last one
>>> SOF_TIMESTAMPING_OPT_RX_FILTER). The cmsg feature only uses 4 flags
>>> (see SOF_TIMESTAMPING_TX_RECORD_MASK in __sock_cmsg_send()). With that
>>> said, we could reserve the highest four bits for cmsg use for the
>>> moment. Four bits represents four points where we can record the
>>> timestamp in the tx case.
>>>
>>> Do you agree on this point?
>>
>> I don't follow.
>>
>> I probably miss the entire point.
>>
>> The goal for sockcm fields is to start with the sk field and
>> optionally override based on cmsg. This is what sockcm_init does for
>> tsflags.
>>
>> This information is for the skb, so these are recording flags.
>>
>> Why does the new datapath need to know whether features are enabled
>> through setsockopt or on a per-call basis with a cmsg?
>>
>> The goal was always to keep the reporting flags per socket, but make
>> the recording flag per packet, mainly for sampling.
> 
> If a user uses 1) cmsg feature, 2) bpf feature at the same time, we
> allow each feature to work independently.
> 
> How could it work? It relies on sk_tstamp_tx_flags() function in the
> current patch: when we are in __skb_tstamp_tx(), we cannot know which
> flags in each feature are set without fetching sk->sk_tsflags and
> sk->sk_tsflags_bpf. Then we are able to know what timestamp we want to
> record. To put it in a simple way, we're not sure if the user wants to
> see a SCHED timestamp by using the cmsg feature in __skb_tstamp_tx()
> if we hit this test statement "skb_shinfo(skb)->tx_flags &
> SKBTX_SCHED_TSTAMP)". So we need those two socket tsflag fields to
> help us.

I also don't see how a new bit/integer in a sk can help to tell the per cmsg 
on/off. This cmsg may have tx timestamp on while the next cmsg can have it off.

There is still one bit in skb_shinfo(skb)->tx_flags. How about define a 
SKBTX_BPF for everything. imo, the fine control on 
SOF_TIMESTAMPING_TX_{SCHED,SOFTWARE} is not useful for bpf. Almost all of the 
time the bpf program wants all available time stamps (sched, software, and 
hwtstamp if the NIC has it). Since bpf is in the kernel, it is much cheaper 
because it does not need to do skb_alloc/clone and queue to the error queue.

I think the bpf prog needs to capture a timestamp at the sendmsg() time, so a 
bpf prog needs to be called at sendmsg(). Then it may as well allow the bpf 
prog@...dmsg() to decide if it needs to set the SKBTX_BPF bit in 
skb_shinfo(skb)->tx_flags or not.

TCP_SKB_CB(skb)->txstamp_ack can also work similarly. There is still unused bit 
in "struct tcp_skb_cb", so may be adding TCP_SKB_CB(skb)->bpf_txstamp_ack

Then there is no need to control SOF_TIMESTAMPING_TX_* through bpf_setsockopt(). 
It only needs one bpf specific socket option like bpf_setsockopt(SOL_SOCKET, 
BPF_TX_TIMESTAMPING) to guard if the bpf-prog@...dmsg() needs to be called or 
not. There are already other TCP_BPF_IW,TCP_BPF_SNDCWND_CLAMP,... specific 
socket options.

imo, this is a simpler interface and also gives the bpf prog per packet control 
at the same time.

[ This user space cmsg-only testing has to be in the selftests/bpf to show how 
it can work. ]

> 
> Thanks,
> Jason


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ