lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210420122715.2066b537@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Date:   Tue, 20 Apr 2021 12:27:15 -0700
From:   Jakub Kicinski <kuba@...nel.org>
To:     Eric Dumazet <edumazet@...gle.com>,
        AceLan Kao <acelan.kao@...onical.com>
Cc:     "David S. Miller" <davem@...emloft.net>,
        Alexei Starovoitov <ast@...nel.org>,
        Andrii Nakryiko <andriin@...com>, Wei Wang <weiwan@...gle.com>,
        Cong Wang <cong.wang@...edance.com>,
        Taehee Yoo <ap420073@...il.com>,
        Björn Töpel <bjorn@...nel.org>,
        netdev <netdev@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] net: called rtnl_unlock() before runpm resumes devices

On Tue, 20 Apr 2021 10:34:17 +0200 Eric Dumazet wrote:
> On Tue, Apr 20, 2021 at 9:54 AM AceLan Kao <acelan.kao@...onical.com> wrote:
> >
> > From: "Chia-Lin Kao (AceLan)" <acelan.kao@...onical.com>
> >
> > The rtnl_lock() has been called in rtnetlink_rcv_msg(), and then in
> > __dev_open() it calls pm_runtime_resume() to resume devices, and in
> > some devices' resume function(igb_resum,igc_resume) they calls rtnl_lock()
> > again. That leads to a recursive lock.
> >
> > It should leave the devices' resume function to decide if they need to
> > call rtnl_lock()/rtnl_unlock(), so call rtnl_unlock() before calling
> > pm_runtime_resume() and then call rtnl_lock() after it in __dev_open().
> >
> >  
> 
> Hi Acelan
> 
> When was the bugg added ?
> Please add a Fixes: tag

For immediate cause probably:

Fixes: 9474933caf21 ("igb: close/suspend race in netif_device_detach")

> By doing so, you give more chances for reviewers to understand why the
> fix is not risky,
> and help stable teams work.

IMO the driver lacks internal locking. Taking rtnl from resume is just
one example, git history shows many more places that lacked locking and
got papered over with rtnl here.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ