Message-ID: <87o7wpxb1m.fsf@intel.com>
Date:   Fri, 12 Aug 2022 21:05:41 -0300
From:   Vinicius Costa Gomes <vinicius.gomes@...el.com>
To:     James Hogan <jhogan@...nel.org>
Cc:     Paul Menzel <pmenzel@...gen.mpg.de>,
        Tony Nguyen <anthony.l.nguyen@...el.com>,
        Jesse Brandeburg <jesse.brandeburg@...el.com>,
        netdev@...r.kernel.org, intel-wired-lan@...ts.osuosl.org,
        Sasha Neftin <sasha.neftin@...el.com>,
        Aleksandr Loktionov <aleksandr.loktionov@...el.com>
Subject: Re: [WIP v2] igc: fix deadlock caused by taking RTNL in RPM resume path

Hi James,

James Hogan <jhogan@...nel.org> writes:

> On Thursday, 11 August 2022 21:25:24 BST Vinicius Costa Gomes wrote:
>> An RTNL deadlock was reported in the igc driver that was causing
>> problems during suspend/resume.
>> 
>> The solution is similar to commit ac8c58f5b535 ("igb: fix deadlock
>> caused by taking RTNL in RPM resume path").
>> 
>> Reported-by: James Hogan <jhogan@...nel.org>
>> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@...el.com>
>> ---
>> Sorry for the noise earlier, my kernel config didn't have runtime PM
>> enabled.
>
> Thanks for looking into this.
>
> This is identical to the patch I've been running for the last week. The 
> deadlock is avoided, however I now occasionally see an assertion from 
> netif_set_real_num_tx_queues due to the lock not being taken in some cases via 
> the runtime_resume path, and a suspicious rcu_dereference_protected() warning 
> (presumably due to the same issue of the lock not being taken). See here for 
> details:
> https://lore.kernel.org/netdev/4765029.31r3eYUQgx@saruman/

Oh, sorry. I missed that the RTNL assertion splat you reported was
already happening with code similar or identical to what I copied from
igb.

So what this seems to be telling us is that the "fix" from igb only
hides the issue, and that we would need to remove the need to take the
RTNL in the suspend/resume paths of both igc and igb? (As someone else
said in that igb thread, IIRC.)


Cheers,
-- 
Vinicius
