lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20161207232141.GD2250@amd>
Date:   Thu, 8 Dec 2016 00:21:41 +0100
From:   Pavel Machek <pavel@....cz>
To:     Lino Sanfilippo <LinoSanfilippo@....de>
Cc:     bh74.an@...sung.com, ks.giri@...sung.com, vipul.pandya@...sung.com,
        peppe.cavallaro@...com, alexandre.torgue@...com,
        davem@...emloft.net, linux-kernel@...r.kernel.org,
        netdev@...r.kernel.org
Subject: Re: [PATCH 2/2] net: ethernet: stmmac: remove private tx queue lock

On Wed 2016-12-07 23:34:19, Lino Sanfilippo wrote:
> On 07.12.2016 22:43, Lino Sanfilippo wrote:
> > Hi Pavel,
> > 
> > On 07.12.2016 22:37, Pavel Machek wrote:
> >> On Wed 2016-12-07 21:05:38, Lino Sanfilippo wrote:
> >>> The driver uses a private lock for synchronization between the xmit
> >>> function and the xmit completion handler, but since the NETIF_F_LLTX flag
> >>> is not set, the xmit function is also called with the xmit_lock held.
> >>> 
> >>> On the other hand the xmit completion handler first takes the private lock
> >>> and (in case that the tx queue has been stopped) the xmit_lock, leading to
> >>> a reverse locking order and the potential danger of a deadlock.
> >>> 
> >>> Fix this by removing the private lock completely and synchronizing the xmit
> >>> function and completion handler solely by means of the xmit_lock. By doing
> >>> this remove also the now unnecessary double check for a stopped tx queue.
> >>> 
> >> 
> >> FYI, here's modified version. I believe _bh versions are needed, and
> >> I'm testing that version now. (Oh and I also ported it to net-next).
> >> 
> >> It survived 30 minutes of testing so far...
> >> 
> > 
> > First off, thanks for testing.
> > Hmm. I dont understand why _bh would be needed. We call that function from
> > BH context only (napi poll and timer).
> > Any idea?
> > 
> 
> Could this once again be caused by irq coalescing? When the tx queue has been stopped
> the cleanup handler has to wakeup the queue within a certain time span, otherwise the
> watchdog will complain (as it happened in your test). Could you retest this with
> irq coalescing disabled?

I actually had TX coalescing disabled, with

-#define STMMAC_TX_FRAMES       64
+#define STMMAC_TX_FRAMES       0


										Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Download attachment "signature.asc" of type "application/pgp-signature" (182 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ