Date:   Wed, 25 Mar 2020 17:42:33 +0800
From:   Kai-Heng Feng <kai.heng.feng@...onical.com>
To:     "Brown, Aaron F" <aaron.f.brown@...el.com>
Cc:     "davem@...emloft.net" <davem@...emloft.net>,
        "mkubecek@...e.cz" <mkubecek@...e.cz>,
        "Kirsher, Jeffrey T" <jeffrey.t.kirsher@...el.com>,
        "open list:NETWORKING DRIVERS" <netdev@...r.kernel.org>,
        "moderated list:INTEL ETHERNET DRIVERS" 
        <intel-wired-lan@...ts.osuosl.org>,
        open list <linux-kernel@...r.kernel.org>
Subject: Re: [Intel-wired-lan] [PATCH v3 1/2] igb: Use device_lock() insead of
 rtnl_lock()

Hi Aaron,

> On Mar 20, 2020, at 15:00, Brown, Aaron F <aaron.f.brown@...el.com> wrote:
> 
>> From: Kai-Heng Feng <kai.heng.feng@...onical.com>
>> Sent: Monday, February 24, 2020 3:02 AM
>> To: Brown, Aaron F <aaron.f.brown@...el.com>
>> Cc: davem@...emloft.net; mkubecek@...e.cz; Kirsher, Jeffrey T
>> <jeffrey.t.kirsher@...el.com>; open list:NETWORKING DRIVERS
>> <netdev@...r.kernel.org>; moderated list:INTEL ETHERNET DRIVERS <intel-
>> wired-lan@...ts.osuosl.org>; open list <linux-kernel@...r.kernel.org>
>> Subject: Re: [Intel-wired-lan] [PATCH v3 1/2] igb: Use device_lock() insead of
>> rtnl_lock()
>> 
>> 
>> 
>>> On Feb 22, 2020, at 08:30, Brown, Aaron F <aaron.f.brown@...el.com> wrote:
>>> 
>>> 
>>> 
>>>> -----Original Message-----
>>>> From: Intel-wired-lan <intel-wired-lan-bounces@...osl.org> On Behalf Of
>>>> Kai-Heng Feng
>>>> Sent: Friday, February 7, 2020 2:10 AM
>>>> To: davem@...emloft.net; mkubecek@...e.cz; Kirsher, Jeffrey T
>>>> <jeffrey.t.kirsher@...el.com>
>>>> Cc: open list:NETWORKING DRIVERS <netdev@...r.kernel.org>; Kai-Heng
>>>> Feng <kai.heng.feng@...onical.com>; moderated list:INTEL ETHERNET
>>>> DRIVERS <intel-wired-lan@...ts.osuosl.org>; open list <linux-
>>>> kernel@...r.kernel.org>
>>>> Subject: [Intel-wired-lan] [PATCH v3 1/2] igb: Use device_lock() insead of
>>>> rtnl_lock()
>>>> 
>>>> Commit 9474933caf21 ("igb: close/suspend race in netif_device_detach")
>>>> fixed race condition between close and power management ops by using
>>>> rtnl_lock().
>>>> 
>>>> However we can achieve the same by using device_lock() since all power
>>>> management ops are protected by device_lock().
>>>> 
>>>> This fix is a preparation for the next patch, to prevent a deadlock under
>>>> rtnl_lock() when calling the runtime resume routine.
>>>> 
>>>> Signed-off-by: Kai-Heng Feng <kai.heng.feng@...onical.com>
>>>> ---
>>>> v3:
>>>> - Fix unreleased lock reported by 0-day test bot.
>>>> v2:
>>>> - No change.
>>>> 
>>>> drivers/net/ethernet/intel/igb/igb_main.c | 14 ++++++++------
>>>> 1 file changed, 8 insertions(+), 6 deletions(-)
>>> 
>>> This patch introduces the following call trace / RIP when I sleep / resume (via
>>> rtcwake) a system that has an igb port with link up:  I'm not sure if it introduces
>>> the issue or just exposes / displays it as it only shows up on the first sleep /
>>> resume cycle and the systems I have that were stable for many sleep / resume
>>> cycles (arbitrarily 50+) continue to be so.
>> 
>> I can't reproduce the issue here.
>> 
> 
> I just got back to looking at the igb driver and found a similar call trace /
> RIP with this patch.  Turns out any of my igb systems will freeze if the igb
> driver is unloaded while the interface is logically up with link.  The system
> continues to run if I switch to another console, but any attempt to look at
> the network (ifconfig, ethtool, etc.) makes that other session freeze up.
> Then about 5 minutes later a trace appears on the screen and continues to do
> so every few minutes.  Here's what I pulled out of the system log for this
> instance:

Yes, I can reproduce the bug by removing the module while the link is up.
I am currently working on a fix for this issue.

Kai-Heng
