Message-ID: <0697db8c-9909-4abb-932d-51413850cdd4@linux.dev>
Date: Fri, 24 Jan 2025 17:09:41 -0800
From: Martin KaFai Lau <martin.lau@...ux.dev>
To: Jason Xing <kerneljasonxing@...il.com>
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, dsahern@...nel.org, willemdebruijn.kernel@...il.com,
willemb@...gle.com, ast@...nel.org, daniel@...earbox.net, andrii@...nel.org,
eddyz87@...il.com, song@...nel.org, yonghong.song@...ux.dev,
john.fastabend@...il.com, kpsingh@...nel.org, sdf@...ichev.me,
haoluo@...gle.com, jolsa@...nel.org, horms@...nel.org, bpf@...r.kernel.org,
netdev@...r.kernel.org
Subject: Re: [RFC PATCH net-next v6 12/13] net-timestamp: introduce cgroup
lock to avoid affecting non-bpf cases
On 1/20/25 5:29 PM, Jason Xing wrote:
> Introduce the lock to avoid affecting applications that are not using
> the bpf timestamping feature.
>
> Signed-off-by: Jason Xing <kerneljasonxing@...il.com>
> ---
> net/core/skbuff.c | 6 ++++--
> net/ipv4/tcp.c | 3 ++-
> net/ipv4/tcp_input.c | 3 ++-
> net/ipv4/tcp_output.c | 3 ++-
> 4 files changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 33340e0b094f..db5b4b653351 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -5605,11 +5605,13 @@ void __skb_tstamp_tx(struct sk_buff *orig_skb,
> return;
>
> /* bpf extension feature entry */
> - if (skb_shinfo(orig_skb)->tx_flags & SKBTX_BPF)
> + if (cgroup_bpf_enabled(CGROUP_SOCK_OPS) &&
I wonder if it is really needed. The caller has just tested the tx_flags.
> + skb_shinfo(orig_skb)->tx_flags & SKBTX_BPF)
> skb_tstamp_tx_bpf(orig_skb, sk, tstype, sw, hwtstamps);
>
> /* application feature entry */
> - if (!skb_enable_app_tstamp(orig_skb, tstype, sw))
> + if (cgroup_bpf_enabled(CGROUP_SOCK_OPS) &&
Same here, and this one also looks wrong. Userspace may get something
unexpected in the error queue: the bpf prog may have already been detached
here after setting SKBTX_BPF earlier.
> + !skb_enable_app_tstamp(orig_skb, tstype, sw))
> return;
>
> tsflags = READ_ONCE(sk->sk_tsflags);
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 49e489c346ea..d88160af00c4 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -493,7 +493,8 @@ static void tcp_tx_timestamp(struct sock *sk, struct sockcm_cookie *sockc)
> shinfo->tskey = TCP_SKB_CB(skb)->seq + skb->len - 1;
> }
>
> - if (SK_BPF_CB_FLAG_TEST(sk, SK_BPF_CB_TX_TIMESTAMPING) && skb) {
> + if (cgroup_bpf_enabled(CGROUP_SOCK_OPS) &&
This one looks ok, considering SK_BPF_CB_FLAG_TEST may touch another cacheline.
> + SK_BPF_CB_FLAG_TEST(sk, SK_BPF_CB_TX_TIMESTAMPING) && skb) {
> struct skb_shared_info *shinfo = skb_shinfo(skb);
> struct tcp_skb_cb *tcb = TCP_SKB_CB(skb);
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index c8945f5be31b..e30607ba41e5 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -3324,7 +3324,8 @@ static void tcp_ack_tstamp(struct sock *sk, struct sk_buff *skb,
>
> /* Avoid cache line misses to get skb_shinfo() and shinfo->tx_flags */
> if (likely(!TCP_SKB_CB(skb)->txstamp_ack &&
> - !TCP_SKB_CB(skb)->txstamp_ack_bpf))
> + !(cgroup_bpf_enabled(CGROUP_SOCK_OPS) &&
Same here. txstamp_ack has just been tested, and txstamp_ack_bpf is the next bit.
> + TCP_SKB_CB(skb)->txstamp_ack_bpf)))
> return;
>
> shinfo = skb_shinfo(skb);
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index fc84ca669b76..483f19c2083e 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -1556,7 +1556,8 @@ static void tcp_adjust_pcount(struct sock *sk, const struct sk_buff *skb, int de
> static bool tcp_has_tx_tstamp(const struct sk_buff *skb)
> {
> return TCP_SKB_CB(skb)->txstamp_ack ||
> - TCP_SKB_CB(skb)->txstamp_ack_bpf ||
> + (cgroup_bpf_enabled(CGROUP_SOCK_OPS) &&
Same here.
> + TCP_SKB_CB(skb)->txstamp_ack_bpf) ||
> (skb_shinfo(skb)->tx_flags & SKBTX_ANY_TSTAMP);
> }
>