lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZCR+5p8TGvS+GQsP@corigine.com>
Date:   Wed, 29 Mar 2023 20:09:42 +0200
From:   Simon Horman <simon.horman@...igine.com>
To:     Liang He <windhl@....com>
Cc:     Jakub Kicinski <kuba@...nel.org>, davem@...emloft.net,
        edumazet@...gle.com, pabeni@...hat.com, netdev@...r.kernel.org
Subject: Re: [PATCH] rionet: Fix refcounting bugs

On Wed, Mar 29, 2023 at 02:01:30PM +0800, Liang He wrote:
> 
> Hi, Jakub,
> 
> First, thanks for you reviewing our patch again.
> 
> At 2023-03-29 10:10:51, "Jakub Kicinski" <kuba@...nel.org> wrote:
> >On Tue, 28 Mar 2023 12:50:06 +0800 Liang He wrote:
> >> In rionet_start_xmit(), we should put the refcount_inc()
> >> before we add *skb* into the queue, otherwise it may cause
> >> the consumer to prematurely call refcount_dec().
> >
> >Are you sure the race can happen? Look around the code, please.
> 
> We commit this patch based on the pattern we learned from these commits,
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.3-rc4&id=bb765d1c331f62b59049d35607ed2e365802bef9
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.3-rc4&id=47a017f33943278570c072bc71681809b2567b3a
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.3-rc4&id=b780d7415aacec855e2f2370cbf98f918b224903
> 
> If it is indeed in different context which makes our pattern failed, please kindly pointing out it.

The difference is that these are not start_xmit callbacks.
See below.

> >> Besides, before the next rionet_queue_tx_msg() when we
> >> meet the 'RIONET_MAC_MATCH', we should also call
> >> refcount_inc() before the skb is added into the queue.
> >
> >And why is that?
> 
> We think it should be better to keep the consistent refcounting-operation
> when we put the *skb* into the queue.

It's subjective.
Due to the locking of the code in question there is no race.
The code you modify is protected by &rnet->tx_lock.
As is the code that will decrement (and free) the skb.

> >As far as I can tell your patch reorders something that doesn't matter
> >and then adds a bug :|
> 
> If there is any bug, can you kindly tell us?

We are dealing with a start_xmit NDO callback.
To simplify things, let us only consider cases where
NETDEV_TX_OK is returned, which is the case for the code you modify.
In such cases the callback should consume the skb, which already
has a reference taken on it. This is done in the driver by a call
to dev_kfree_skb_irq() which is executed indirectly via
rionet_queue_tx_msg().

So currently the reference count handling is correct.
By taking an extra reference, it will never reach zero,
and thus the resources of the skb will be leaked.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ