Message-ID: <7ffcb4d4-a5b4-4c87-8c92-ef87269bfd07@quicinc.com>
Date: Thu, 24 Jul 2025 11:45:59 +0530
From: Sharath Chandra Vurukala <quic_sharathv@...cinc.com>
To: Eric Dumazet <edumazet@...gle.com>
CC: <davem@...emloft.net>, <dsahern@...nel.org>, <kuba@...nel.org>,
        <pabeni@...hat.com>, <netdev@...r.kernel.org>,
        <quic_kapandey@...cinc.com>, <quic_subashab@...cinc.com>
Subject: Re: [PATCH] net: Add locking to protect skb->dev access in ip_output



On 7/23/2025 8:38 PM, Eric Dumazet wrote:
> On Wed, Jul 23, 2025 at 1:22 AM Sharath Chandra Vurukala
> <quic_sharathv@...cinc.com> wrote:
>>
>> In ip_output(), skb->dev is updated from skb_dst(skb)->dev, which
>> can become invalid when the interface is unregistered and freed.
>>
>> Add RCU locks to ip_output(). This ensures that all the skbs
>> associated with the dev being deregistered are transmitted
>> out first, before the dev is freed.
>>
>> Multiple panic call stacks, in different functions, were observed
>> when UL traffic was run concurrently with device deregistration.
>> One sample is pasted below for reference.
>>
>> [496733.627565][T13385] Call trace:
>> [496733.627570][T13385] bpf_prog_ce7c9180c3b128ea_cgroupskb_egres+0x24c/0x7f0
>> [496733.627581][T13385] __cgroup_bpf_run_filter_skb+0x128/0x498
>> [496733.627595][T13385] ip_finish_output+0xa4/0xf4
>> [496733.627605][T13385] ip_output+0x100/0x1a0
>> [496733.627613][T13385] ip_send_skb+0x68/0x100
>> [496733.627618][T13385] udp_send_skb+0x1c4/0x384
>> [496733.627625][T13385] udp_sendmsg+0x7b0/0x898
>> [496733.627631][T13385] inet_sendmsg+0x5c/0x7c
>> [496733.627639][T13385] __sys_sendto+0x174/0x1e4
>> [496733.627647][T13385] __arm64_sys_sendto+0x28/0x3c
>> [496733.627653][T13385] invoke_syscall+0x58/0x11c
>> [496733.627662][T13385] el0_svc_common+0x88/0xf4
>> [496733.627669][T13385] do_el0_svc+0x2c/0xb0
>> [496733.627676][T13385] el0_svc+0x2c/0xa4
>> [496733.627683][T13385] el0t_64_sync_handler+0x68/0xb4
>> [496733.627689][T13385] el0t_64_sync+0x1a4/0x1a8
>>
>> Signed-off-by: Sharath Chandra Vurukala <quic_sharathv@...cinc.com>
>> ---
>>  net/ipv4/ip_output.c | 17 ++++++++++++-----
>>  1 file changed, 12 insertions(+), 5 deletions(-)
>>
>> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
>> index 10a1d182fd84..95c5e9cfc971 100644
>> --- a/net/ipv4/ip_output.c
>> +++ b/net/ipv4/ip_output.c
>> @@ -425,15 +425,22 @@ int ip_mc_output(struct net *net, struct sock *sk, struct sk_buff *skb)
>>
>>  int ip_output(struct net *net, struct sock *sk, struct sk_buff *skb)
>>  {
>> -       struct net_device *dev = skb_dst_dev(skb), *indev = skb->dev;
>> +       struct net_device *dev, *indev = skb->dev;
>>
>> +       IP_UPD_PO_STATS(net, IPSTATS_MIB_OUT, skb->len);
>> +
>> +       rcu_read_lock();
>> +
>> +       dev = skb_dst(skb)->dev;
> 
> Arg... Please do not remove skb_dst_dev(skb), and instead expand it.
> 
> I recently started to work on this class of problems.
> 
> commit a74fc62eec155ca5a6da8ff3856f3dc87fe24558
> Author: Eric Dumazet <edumazet@...gle.com>
> Date:   Mon Jun 30 12:19:31 2025 +0000
> 
>     ipv4: adopt dst_dev, skb_dst_dev and skb_dst_dev_net[_rcu]
> 
>     Use the new helpers as a first step to deal with
>     potential dst->dev races.
> 
>     Signed-off-by: Eric Dumazet <edumazet@...gle.com>
>     Reviewed-by: Kuniyuki Iwashima <kuniyu@...gle.com>
>     Link: https://patch.msgid.link/20250630121934.3399505-8-edumazet@google.com
>     Signed-off-by: Jakub Kicinski <kuba@...nel.org>
> 
> 
> Adding RCU is not good enough; we still need READ_ONCE() to
> prevent potential load/store tearing.
> 
> I was planning to add skb_dst_dev_rcu() helper and start replacing
> skb_dst_dev() where needed.
> 
> diff --git a/include/net/dst.h b/include/net/dst.h
> index 00467c1b509389a8e37d6e3d0912374a0ff12c4a..692ebb1b3f421210dbb58990b77a200b9189d0f7 100644
> --- a/include/net/dst.h
> +++ b/include/net/dst.h
> @@ -568,11 +568,23 @@ static inline struct net_device *dst_dev(const struct dst_entry *dst)
>         return READ_ONCE(dst->dev);
>  }
> 
> +static inline struct net_device *dst_dev_rcu(const struct dst_entry *dst)
> +{
> +       /* In the future, use rcu_dereference(dst->dev) */
> +       WARN_ON(!rcu_read_lock_held());
> +       return READ_ONCE(dst->dev);
> +}
> +
>  static inline struct net_device *skb_dst_dev(const struct sk_buff *skb)
>  {
>         return dst_dev(skb_dst(skb));
>  }
> 
> +static inline struct net_device *skb_dst_dev_rcu(const struct sk_buff *skb)
> +{
> +       return dst_dev_rcu(skb_dst(skb));
> +}
> +
>  static inline struct net *skb_dst_dev_net(const struct sk_buff *skb)
>  {
>         return dev_net(skb_dst_dev(skb));
> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> index 10a1d182fd848f0d2348f65fde269383f9c07baa..37b982dd53f69247634c67c493c44fa482100dee 100644
> --- a/net/ipv4/ip_output.c
> +++ b/net/ipv4/ip_output.c
> @@ -425,15 +425,20 @@ int ip_mc_output(struct net *net, struct sock *sk, struct sk_buff *skb)
> 
>  int ip_output(struct net *net, struct sock *sk, struct sk_buff *skb)
>  {
> -       struct net_device *dev = skb_dst_dev(skb), *indev = skb->dev;
> +       struct net_device *dev, *indev = skb->dev;
> +       int res;
> 
> +       rcu_read_lock();
> +       dev = skb_dst_dev_rcu(skb);
>         skb->dev = dev;
>         skb->protocol = htons(ETH_P_IP);
> 
> -       return NF_HOOK_COND(NFPROTO_IPV4, NF_INET_POST_ROUTING,
> +       res = NF_HOOK_COND(NFPROTO_IPV4, NF_INET_POST_ROUTING,
>                             net, sk, skb, indev, dev,
>                             ip_finish_output,
>                             !(IPCB(skb)->flags & IPSKB_REROUTED));
> +       rcu_read_unlock();
> +       return res;
>  }
>  EXPORT_SYMBOL(ip_output);
Thanks for the review, Eric. Since this work is already underway on your end, I’ll pause and wait for your changes to become available.
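
For readers following the thread, the pattern discussed above (enter an RCU read-side section, load dst->dev exactly once, then use only the local copy until the section ends) can be sketched in userspace C. This is a minimal model under stated assumptions, not kernel code: the READ_ONCE() macro is a simplified volatile-cast stand-in, the rcu_* functions are a hypothetical depth counter that exists only so the lock-held check has something to test, assert() stands in for WARN_ON(), and output_ifindex() is an invented helper used purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's READ_ONCE(): going
 * through a volatile-qualified lvalue makes the compiler emit a single
 * load instead of re-reading or splitting the access. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Minimal stand-ins for the kernel structures named in the thread. */
struct net_device { int ifindex; };
struct dst_entry  { struct net_device *dev; };

/* Hypothetical model of RCU read-side nesting. Real RCU also delays
 * freeing until readers are done; this counter only tracks nesting. */
static int rcu_depth;

static void rcu_read_lock(void)      { rcu_depth++; }
static void rcu_read_unlock(void)    { assert(rcu_depth > 0); rcu_depth--; }
static int  rcu_read_lock_held(void) { return rcu_depth > 0; }

/* Mirrors the shape of the dst_dev_rcu() helper proposed in the diff:
 * check that the caller is in a read-side section, then load once. */
static struct net_device *dst_dev_rcu(const struct dst_entry *dst)
{
	assert(rcu_read_lock_held());	/* stands in for WARN_ON() */
	return READ_ONCE(dst->dev);
}

/* Invented caller: every use of the device pointer stays between
 * rcu_read_lock() and rcu_read_unlock(), and only the local copy
 * is dereferenced. */
static int output_ifindex(const struct dst_entry *dst)
{
	struct net_device *dev;
	int ifindex;

	rcu_read_lock();
	dev = dst_dev_rcu(dst);
	ifindex = dev ? dev->ifindex : -1;
	rcu_read_unlock();

	return ifindex;
}
```

The point of the local variable is that a second read of dst->dev could observe a different pointer if the device is being unregistered concurrently; loading once and keeping the copy avoids acting on two different values.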
