Message-ID: <c32640d3bd55b826cc8566c0799ecdb0483ce6f5.camel@linux.intel.com>
Date:   Fri, 15 Mar 2019 13:40:10 -0700
From:   Alexander Duyck <alexander.h.duyck@...ux.intel.com>
To:     Heiner Kallweit <hkallweit1@...il.com>,
        VDR User <user.vdr@...il.com>
Cc:     netdev@...r.kernel.org
Subject: Re: r8169 driver from kernel 5.0 crashing - napi_consume_skb

On Fri, 2019-03-15 at 21:26 +0100, Heiner Kallweit wrote:
> On 15.03.2019 21:09, VDR User wrote:
> > > > > > Thanks for the additional info and for testing 4.20.15.
> > > > > > To rule out that the issue is caused by a regression in network or
> > > > > > some other subsystem: Can you take the r8169.c from 4.20.15 and test
> > > > > > it on top of 5.0?
> > > > > > Meanwhile I'll look at the changes in the driver between 4.20 and 5.0.
> > > > > 
> > > > > Sure, no problem! I'll copy the driver & recompile now actually.
> > > > > Hopefully there aren't a ton of changes to r8169.c to sift through and
> > > > > the cause isn't good at hiding itself!
> > > > > 
> > > > I checked the driver changes new in 5.0 and there are very few
> > > > functional changes. You could try to revert the following:
> > > > 
> > > > 5317d5c6d47e ("r8169: use napi_consume_skb where possible")
> > > 
> > > Will do, and fwiw, while I haven't been able to do tons of testing
> > > today, I haven't been able to trigger the crash after replacing
> > > 5.0.0's r8169.c with 4.20.15's r8169.c this morning. I'll restore the
> > > file and revert the change you mentioned, and report back my findings.
> > 
> > Heiner,
> > 
> > After going back to vanilla kernel 5.0 and then reverting 5317d5c6d47e
> > ("r8169: use napi_consume_skb where possible"), I so far have not had
> > any crashes after transferring roughly 30GB back & forth. I'm not
> > completely confident yet the crash is resolved with that revert and
> > will continue to do further testing throughout the weekend as well.
> > What confidence level do you have that 5317d5c6d47e is the culprit at
> > this point?
> > 
> Good, thanks for testing. I simply see no other change since 4.20 that
> could cause these symptoms.
> Using napi_consume_skb() at this place in r8169.c looks safe to me.
> Option 1 is that I'm missing something, option 2 is that there's an issue
> in the NAPI subsystem. However in the latter case I assume at least
> the Mellanox and/or Intel guys would have observed the same issue
> on their respective CI systems.
> Let me add Alexander, maybe he can provide a hint before we go and
> revert the change.

Do you have the crash log? I'd be curious what the issue is we are
seeing.

I agree I can't see anything obvious, but it is possible that we may be
running into something we hadn't seen with the Intel and Mellanox
parts.

- Alex
