lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <f9af1c5.1a76.187304726fd.Coremail.windhl@126.com>
Date:   Thu, 30 Mar 2023 10:09:42 +0800 (CST)
From:   "Liang He" <windhl@....com>
To:     "Simon Horman" <simon.horman@...igine.com>
Cc:     "Jakub Kicinski" <kuba@...nel.org>, davem@...emloft.net,
        edumazet@...gle.com, pabeni@...hat.com, netdev@...r.kernel.org
Subject: Re:Re: [PATCH] rionet: Fix refcounting bugs



At 2023-03-30 02:09:42, "Simon Horman" <simon.horman@...igine.com> wrote:
>On Wed, Mar 29, 2023 at 02:01:30PM +0800, Liang He wrote:
>> 
>> Hi, Jakub,
>> 
>> First, thanks for you reviewing our patch again.
>> 
>> At 2023-03-29 10:10:51, "Jakub Kicinski" <kuba@...nel.org> wrote:
>> >On Tue, 28 Mar 2023 12:50:06 +0800 Liang He wrote:
>> >> In rionet_start_xmit(), we should put the refcount_inc()
>> >> before we add *skb* into the queue, otherwise it may cause
>> >> the consumer to prematurely call refcount_dec().
>> >
>> >Are you sure the race can happen? Look around the code, please.
>> 
>> We commit this patch based on the pattern we learned from these commits,
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.3-rc4&id=bb765d1c331f62b59049d35607ed2e365802bef9
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.3-rc4&id=47a017f33943278570c072bc71681809b2567b3a
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.3-rc4&id=b780d7415aacec855e2f2370cbf98f918b224903
>> 
>> If it is indeed in different context which makes our pattern failed, please kindly pointing out it.
>
>The difference is that these are not start_xmit callbacks.
>See below.
>
>> >> Besides, before the next rionet_queue_tx_msg() when we
>> >> meet the 'RIONET_MAC_MATCH', we should also call
>> >> refcount_inc() before the skb is added into the queue.
>> >
>> >And why is that?
>> 
>> We think it should be better to keep the consistent refcounting-operation
>> when we put the *skb* into the queue.
>
>It's subjective.
>Due to the locking of the code in question there is no race.
>The code you modify is protected by &rnet->tx_lock.

Thanks for your reply, we get it.

>As is the code that will decrement (and free) the skb.
>
>> >As far as I can tell your patch reorders something that doesn't matter
>> >and then adds a bug :|
>> 
>> If there is any bug, can you kindly tell us?
>
>We are dealing with a start_xmit NDO callback.
>To simplify things, let us only consider cases where
>NETDEV_TX_OK is returned, which is the case for the code you modify.
>In such cases the callback should consume the skb, which already
>has a reference taken on it. This is done in the driver by a call
>to dev_kfree_skb_irq() which is executed indirectly via
>rionet_queue_tx_msg().
>
>So currently the reference count handling is correct.
>By taking an extra reference, it will never reach zero,
>and thus the resources of the skb will be leaked.

Sorry for our trouble.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ