netdev - Re: [PATCH RFC 08/26] locking: Remove spin_unlock

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-Id: <20170704005438.GA19389@linux.vnet.ibm.com>
Date:   Mon, 3 Jul 2017 17:54:38 -0700
From:   "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Will Deacon <will.deacon@....com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        NetFilter <netfilter-devel@...r.kernel.org>,
        Network Development <netdev@...r.kernel.org>,
        Oleg Nesterov <oleg@...hat.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Ingo Molnar <mingo@...hat.com>,
        Davidlohr Bueso <dave@...olabs.net>,
        Manfred Spraul <manfred@...orfullife.com>,
        Tejun Heo <tj@...nel.org>, Arnd Bergmann <arnd@...db.de>,
        "linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Alan Stern <stern@...land.harvard.edu>,
        Andrea Parri <parri.andrea@...il.com>
Subject: Re: [PATCH RFC 08/26] locking: Remove spin_unlock_wait() generic
 definitions

On Mon, Jul 03, 2017 at 05:39:36PM -0700, Paul E. McKenney wrote:
> On Mon, Jul 03, 2017 at 03:49:42PM -0700, Linus Torvalds wrote:
> > On Mon, Jul 3, 2017 at 3:30 PM, Paul E. McKenney
> > <paulmck@...ux.vnet.ibm.com> wrote:
> > >
> > > That certainly is one interesting function, isn't it?  I wonder what
> > > happens if you replace the raw_spin_is_locked() calls with an
> > > unlock under a trylock check?  ;-)
> > 
> > Deadlock due to interrupts again?
> 
> Unless I am missing something subtle, the kgdb_cpu_enter() function in
> question has a local_irq_save() over the "interesting" portion of its
> workings, so interrupt-handler self-deadlock should not happen.
> 
> > Didn't your spin_unlock_wait() patches teach you anything? Checking
> > state is fundamentally different from taking the lock. Even a trylock.
> 
> That was an embarrassing bug, no two ways about it.  :-/
> 
> > I guess you could try with the irqsave versions. But no, we're not doing that.
> 
> Again, no need in this case.
> 
> But I agree with Will's assessment of this function...
> 
> The raw_spin_is_locked() looks to be asking if -any- CPU holds the
> dbg_slave_lock, and the answer could of course change immediately
> on return from raw_spin_is_locked().  Perhaps the theory is that
> if other CPU holds the lock, this CPU is supposed to be subjected to
> kgdb_roundup_cpus().  Except that the CPU that held dbg_slave_lock might
> be just about to release that lock.  Odd.
> 
> Seems like there should be a get_online_cpus() somewhere, but maybe
> that constraint is to be manually enforced.

Except that invoking get_online_cpus() from an exception handler would
be of course be a spectacularly bad idea.  I would feel better if the
num_online_cpus() was under the local_irq_save(), but perhaps this code
is relying on the stop_machine().  Except that it appears we could
deadlock with offline waiting for stop_machine() to complete and kdbg
waiting for all CPUs to report, including those in stop_machine().

Looks like the current situation is "Don't use kdbg if there is any
possibility of CPU-hotplug operations."  Not necessarily an unreasonable
restriction.

But I need to let me eyes heal a bit before looking at this more.

							Thanx, Paul