Message-ID: <20150314143820.GA22071@linux>
Date:	Sat, 14 Mar 2015 10:38:20 -0400
From:	"Ahmed S. Darwish" <darwish.07@...il.com>
To:	Marc Kleine-Budde <mkl@...gutronix.de>
Cc:	Olivier Sobrie <olivier@...rie.be>,
	Oliver Hartkopp <socketcan@...tkopp.net>,
	Wolfgang Grandegger <wg@...ndegger.com>,
	Andri Yngvason <andri.yngvason@...el.com>,
	Linux-CAN <linux-can@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>, netdev@...r.kernel.org
Subject: Re: [PATCH v4 1/3] can: kvaser_usb: Fix tx queue start/stop race
 conditions

Hi Marc,

On Sat, Mar 14, 2015 at 02:41:18PM +0100, Marc Kleine-Budde wrote:
> On 03/14/2015 02:02 PM, Ahmed S. Darwish wrote:
> > From: Ahmed S. Darwish <ahmed.darwish@...eo.com>
> > 
> > A number of tx queue wake-up events went missing due to the
> > outlined scenario below. Start state is a pool of 16 tx URBs,
> > active tx_urbs count = 15, with the netdev tx queue open.
> > 
> > CPU #1 [softirq]                         CPU #2 [softirq]
> > start_xmit()                             tx_acknowledge()
> > ................                         ................
> > 
> > atomic_inc(&tx_urbs);
> > if (atomic_read(&tx_urbs) >= 16) {
> >                         -->
> >                                          atomic_dec(&tx_urbs);
> >                                          netif_wake_queue();
> >                                          return;
> >                         <--
> >     netif_stop_queue();
> > }
> > 
> > At the end, the correct state expected is a 15 tx_urbs count
> > value with the tx queue state _open_. Due to the race, we get
> > the same tx_urbs value but with the tx queue state _stopped_.
> > The wake-up event is completely lost.
> > 
> > Thus avoid hand-rolled concurrency mechanisms and use a proper
> > lock for contexts and tx queue protection.
> > 
> > Signed-off-by: Ahmed S. Darwish <ahmed.darwish@...eo.com>
> 
> Applied to can. This will go into David's net tree and finally into
> net-next. Then I'll apply patches 2+3. Nag me, if I forget about them ;)
> 
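
For readers following the race above, here is a minimal userspace sketch
of the "proper lock" idea, not the actual patch. The counter update and the
queue-state change happen under one lock, so a release on another CPU cannot
slip in between the check and the stop. The names tx_state, tx_claim and
tx_release are hypothetical, a pthread mutex stands in for a kernel
spinlock, and a boolean stands in for the netdev queue state.

```c
#include <pthread.h>
#include <stdbool.h>

#define MAX_TX_URBS 16  /* pool size from the scenario above */

/* Hypothetical stand-in for the driver's private state. */
struct tx_state {
	pthread_mutex_t lock;   /* plays the role of a kernel spinlock */
	int active_urbs;        /* URBs currently in flight */
	bool queue_stopped;     /* models netif_stop_queue()/netif_wake_queue() */
};

void tx_state_init(struct tx_state *s)
{
	pthread_mutex_init(&s->lock, NULL);
	s->active_urbs = 0;
	s->queue_stopped = false;
}

/*
 * Transmit path: claim one URB and, if the pool is now full, stop the
 * queue. Both steps happen under the lock. Returns false if no URB is free.
 */
bool tx_claim(struct tx_state *s)
{
	bool claimed = false;

	pthread_mutex_lock(&s->lock);
	if (s->active_urbs < MAX_TX_URBS) {
		s->active_urbs++;
		if (s->active_urbs >= MAX_TX_URBS)
			s->queue_stopped = true;
		claimed = true;
	}
	pthread_mutex_unlock(&s->lock);

	return claimed;
}

/* Acknowledge path: give one URB back and reopen the queue, atomically. */
void tx_release(struct tx_state *s)
{
	pthread_mutex_lock(&s->lock);
	if (s->active_urbs > 0)
		s->active_urbs--;
	s->queue_stopped = false;
	pthread_mutex_unlock(&s->lock);
}
```

With 15 URBs active, a tx_claim() racing against a tx_release() now always
ends with 15 active URBs and an open queue, whichever side takes the lock
first.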

Thanks! :-)

So if I've understood correctly, this patch will go to -rc5 and
the rest will go into net-next?

If so, IMHO patch #2 should also go to -rc5. I know it's usually
frowned upon to add further patches at this late -rc stage, but
here's my logic:

The original driver code just _arbitrarily_ assumed a MAX_TX_URB
value of 16 parallel transmissions. This value was chosen, it seems,
because the driver was heavily based on esd_usb2.c and the esd code
just did so :-(

Meanwhile, on the Kvaser hardware at hand, if I increase the
driver's max parallel transmissions even a little above the
recommended limit reported by the firmware, the firmware breaks
badly, reports a massive list of internal errors, and the candump
traces become a mess hardly related to the actual frames sent
and received.

In my case, I was lucky that the brand we own here (*) had a higher
max outstanding transmissions value than 16. But if there is hardware
out there with a max outstanding tx support < 16 (#), such hardware
will break badly under a heavy transmission load :-(

(*) http://www.kvaser.com/products/kvaser-usb-hs-ii-hsls/

(#) There is a huge list of Kvaser products having the same controller
    but with different performance metrics, so this is quite a
    possibility.
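
To make the argument concrete, here is a small userspace sketch, not the
driver code, of capping in-flight transmissions at a limit reported by the
firmware instead of a hard-coded 16. The names tx_budget,
fw_max_outstanding and the helpers are hypothetical, and falling back to 16
when the firmware reports 0 is an assumption made only for this
illustration.

```c
#include <stdbool.h>

#define FALLBACK_MAX_TX_URBS 16  /* the old hard-coded value */

/* Hypothetical per-device transmission budget. */
struct tx_budget {
	unsigned int max_outstanding;  /* limit taken from the firmware */
	unsigned int in_flight;        /* transmissions not yet acknowledged */
};

/*
 * Initialise the budget from the firmware-reported limit. Treating a
 * reported value of 0 as "unknown" and using the old default is an
 * assumption of this sketch.
 */
void tx_budget_init(struct tx_budget *b, unsigned int fw_max_outstanding)
{
	b->max_outstanding = fw_max_outstanding ? fw_max_outstanding
						: FALLBACK_MAX_TX_URBS;
	b->in_flight = 0;
}

/* Returns true if one more transmission may be started. */
bool tx_budget_try_send(struct tx_budget *b)
{
	if (b->in_flight >= b->max_outstanding)
		return false;
	b->in_flight++;
	return true;
}

/* Called when the device acknowledges one transmission. */
void tx_budget_complete(struct tx_budget *b)
{
	if (b->in_flight > 0)
		b->in_flight--;
}
```

With a reported limit of 8, for example, the ninth tx_budget_try_send()
fails until a transmission completes, so the device never sees more
outstanding frames than it claims to support.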

Thanks,
Darwish