Message-ID: <f28de1e7-4a9b-4a97-b4f9-723425725b58@quicinc.com>
Date: Wed, 10 Apr 2024 13:25:12 -0700
From: "Abhishek Chauhan (ABC)" <quic_abchauha@...cinc.com>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>,
        "David S. Miller"
	<davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski
	<kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>, <netdev@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>, Andrew Halaney <ahalaney@...hat.com>,
        "Martin
 KaFai Lau" <martin.lau@...nel.org>,
        Martin KaFai Lau <martin.lau@...ux.dev>,
        Daniel Borkmann <daniel@...earbox.net>, bpf <bpf@...r.kernel.org>
CC: <kernel@...cinc.com>
Subject: Re: [RFC PATCH bpf-next v1 3/3] net: Add additional bit to support
 userspace timestamp type
On 4/10/2024 8:42 AM, Willem de Bruijn wrote:
> Abhishek Chauhan wrote:
>> tstamp_type can be real, mono or userspace timestamp.
>>
>> This commit adds userspace timestamp and sets it if there is
>> valid transmit_time available in socket coming from userspace.
>>
>> To make the design scalable for future needs this commit bring in
>> the change to extend the tstamp_type:1 to tstamp_type:2 to support
>> userspace timestamp.
>>
>> Link: https://lore.kernel.org/netdev/bc037db4-58bb-4861-ac31-a361a93841d3@linux.dev/
>> Signed-off-by: Abhishek Chauhan <quic_abchauha@...cinc.com>
>> ---
>>  include/linux/skbuff.h | 19 +++++++++++++++++--
>>  net/ipv4/ip_output.c   |  2 +-
>>  net/ipv4/raw.c         |  2 +-
>>  net/ipv6/ip6_output.c  |  2 +-
>>  net/ipv6/raw.c         |  2 +-
>>  net/packet/af_packet.c |  6 +++---
>>  6 files changed, 24 insertions(+), 9 deletions(-)
>>
>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>> index 6160185f0fe0..2f91a8a2157a 100644
>> --- a/include/linux/skbuff.h
>> +++ b/include/linux/skbuff.h
>> @@ -705,6 +705,9 @@ typedef unsigned char *sk_buff_data_t;
>>  enum skb_tstamp_type {
>>  	SKB_TSTAMP_TYPE_RX_REAL = 0,    /* A RX (receive) time in real */
>>  	SKB_TSTAMP_TYPE_TX_MONO = 1,    /* A TX (delivery) time in mono */
>> +	SKB_TSTAMP_TYPE_TX_USER = 2,    /* A TX (delivery) time and its clock
>> +									 * is in skb->sk->sk_clockid.
>> +									 */
> 
> Weird indentation?
> 
I will correct it. 
> More fundamentally: instead of defining a type TX_USER, can we use a
> real clockid (e.g., CLOCK_TAI) based on skb->sk->sk_clockid? Rather
> than store an id that means "go look at sk_clockid".
> 
>>  };
>>  
>>  /**
>> @@ -830,6 +833,9 @@ enum skb_tstamp_type {
>>   *		delivery_time in mono clock base (i.e. EDT).  Otherwise, the
>>   *		skb->tstamp has the (rcv) timestamp at ingress and
>>   *		delivery_time at egress.
>> + *		delivery_time in mono clock base (i.e., EDT) or a clock base chosen
>> + *		by SO_TXTIME. If zero, skb->tstamp has the (rcv) timestamp at
>> + *		ingress.
>>   *	@napi_id: id of the NAPI struct this skb came from
>>   *	@sender_cpu: (aka @napi_id) source CPU in XPS
>>   *	@alloc_cpu: CPU which did the skb allocation.
>> @@ -960,7 +966,7 @@ struct sk_buff {
>>  	/* private: */
>>  	__u8			__mono_tc_offset[0];
>>  	/* public: */
>> -	__u8			tstamp_type:1;	/* See SKB_MONO_DELIVERY_TIME_MASK */
>> +	__u8			tstamp_type:2;	/* See SKB_MONO_DELIVERY_TIME_MASK */
>>  #ifdef CONFIG_NET_XGRESS
>>  	__u8			tc_at_ingress:1;	/* See TC_AT_INGRESS_MASK */
>>  	__u8			tc_skip_classify:1;
> 
> With pahole, does this have an effect on sk_buff layout?
> 
I think it does, and it also impacts BPF testing. That is why my cover letter mentions that these
changes will impact BPF. My expertise in BPF is very limited, hence the RFC.
That said, I am trying to learn the BPF instructions to understand this better.
I think we also need to change the offsets in SKB_MONO_DELIVERY_TIME_MASK and TC_AT_INGRESS_MASK:
#ifdef __BIG_ENDIAN_BITFIELD
#define SKB_MONO_DELIVERY_TIME_MASK	(1 << 7) /* suspecting changes here too */
#define TC_AT_INGRESS_MASK		(1 << 6) /* and here */
#else
#define SKB_MONO_DELIVERY_TIME_MASK	(1 << 0)
#define TC_AT_INGRESS_MASK		(1 << 1) /* this might have to change to (1 << 2) */
#endif
#define SKB_BF_MONO_TC_OFFSET		offsetof(struct sk_buff, __mono_tc_offset)
I also suspect a change is needed in /selftests/bpf/prog_tests/ctx_rewrite.c.
I am still trying to figure out what this part of the code is doing:
https://lore.kernel.org/all/20230321014115.997841-4-kuba@kernel.org/
Please correct me if I am wrong here.
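
To make the mask question concrete, here is a small userspace sketch (not kernel code) that asks the compiler where the bits land once tstamp_type is widened to two bits. The struct flags_byte and the helper names are hypothetical stand-ins for the byte at __mono_tc_offset, and only the first two fields are modelled. Bitfield layout is implementation-defined, so this shows what one compiler does and proves nothing about the kernel macros.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the flag byte at __mono_tc_offset.
 * Only the first two bitfields of the real struct are modelled.
 */
struct flags_byte {
	uint8_t tstamp_type:2;		/* widened from :1 by this patch */
	uint8_t tc_at_ingress:1;
};

/* Read back the raw byte that backs the bitfields. */
static uint8_t raw_byte(const struct flags_byte *f)
{
	uint8_t b;

	memcpy(&b, f, sizeof(b));
	return b;
}

/* Mask covering both bits of tstamp_type, as laid out by this compiler. */
static uint8_t tstamp_type_mask(void)
{
	struct flags_byte f;

	memset(&f, 0, sizeof(f));
	f.tstamp_type = 3;
	return raw_byte(&f);
}

/* Mask covering the single tc_at_ingress bit, as laid out by this compiler. */
static uint8_t tc_at_ingress_mask(void)
{
	struct flags_byte f;

	memset(&f, 0, sizeof(f));
	f.tc_at_ingress = 1;
	return raw_byte(&f);
}

/* Count the set bits in a byte. */
static int count_bits(uint8_t v)
{
	int n = 0;

	for (; v; v >>= 1)
		n += v & 1;
	return n;
}
```

If the compiler fills bitfields from the least significant bit, I would expect tstamp_type_mask() to cover the two low bits and tc_at_ingress_mask() to land on the next bit up, which is consistent with the (1 << 2) suspicion above. Whatever masks the BPF rewrite code uses would have to agree with that layout on both endiannesses.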
>> @@ -4274,7 +4280,16 @@ static inline void skb_set_delivery_time(struct sk_buff *skb, ktime_t kt,
>>  					enum skb_tstamp_type tstamp_type)
>>  {
>>  	skb->tstamp = kt;
>> -	skb->tstamp_type = kt && tstamp_type;
>> +
>> +	if (skb->tstamp_type)
>> +		return;
>> +
> 
I can add a warning here in case both MONO and TAI are set,
or just keep it simple, as you suggest below.
> Why bail if a type is already set? And what if
> skb->tstamp_type != tstamp_type? Should skb->tstamp then not be
> written to (i.e., the test moved up), and perhaps a rate limited
> warning.
> 
>> +	if (kt && tstamp_type == SKB_TSTAMP_TYPE_TX_MONO)
>> +		skb->tstamp_type = SKB_TSTAMP_TYPE_TX_MONO;
>> +
>> +	if (kt && tstamp_type == SKB_TSTAMP_TYPE_TX_USER)
>> +		skb->tstamp_type = SKB_TSTAMP_TYPE_TX_USER;
> 
> Simpler
> 
>     if (kt)
>         skb->tstamp_type = tstamp_type;
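
As a rough illustration of the simplification suggested above, here is a userspace mock of the setter. struct mock_skb and mock_set_delivery_time() are hypothetical names, not the kernel's struct sk_buff or skb_set_delivery_time(). The else branch is my assumption: it keeps the behaviour of the old "kt && tstamp_type" expression, which cleared the type when kt was zero.

```c
#include <stdint.h>

enum skb_tstamp_type {
	SKB_TSTAMP_TYPE_RX_REAL = 0,	/* RX time in real */
	SKB_TSTAMP_TYPE_TX_MONO = 1,	/* TX delivery time in mono */
	SKB_TSTAMP_TYPE_TX_USER = 2,	/* TX delivery time, clock from the socket */
};

/* Hypothetical mock holding only the two fields discussed here. */
struct mock_skb {
	int64_t tstamp;
	uint8_t tstamp_type:2;
};

static void mock_set_delivery_time(struct mock_skb *skb, int64_t kt,
				   enum skb_tstamp_type tstamp_type)
{
	skb->tstamp = kt;

	if (kt)
		skb->tstamp_type = tstamp_type;
	else
		/* Assumption: a zero time resets the type, as before. */
		skb->tstamp_type = SKB_TSTAMP_TYPE_RX_REAL;
}
```

With this shape a later call simply overwrites both the time and the type, so there is no early return and no mismatch between skb->tstamp and skb->tstamp_type.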
