linux-kernel - Re: [V9fs-developer] Hang triggered by udev coldplug, looks like a race

Message-ID: <20160104192737.GJ6344@twins.programming.kicks-ass.net>
Date:	Mon, 4 Jan 2016 20:27:37 +0100
From:	Peter Zijlstra <peterz@...radead.org>
To:	Andy Lutomirski <luto@...capital.net>
Cc:	Dominique Martinet <dominique.martinet@....fr>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...nel.org>,
	Al Viro <viro@...iv.linux.org.uk>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	V9FS Developers <v9fs-developer@...ts.sourceforge.net>,
	Linux FS Devel <linux-fsdevel@...r.kernel.org>
Subject: Re: [V9fs-developer] Hang triggered by udev coldplug, looks like a race

On Mon, Jan 04, 2016 at 11:07:43AM -0800, Andy Lutomirski wrote:
> On Mon, Jan 4, 2016 at 8:09 AM, Dominique Martinet
> <dominique.martinet@....fr> wrote:
> > Peter Zijlstra wrote on Mon, Jan 04, 2016 at 04:59:15PM +0100:
> >> On Tue, Dec 29, 2015 at 10:43:26PM -0800, Andy Lutomirski wrote:
> >> > [add cc's]
> >> >
> >> > Hi scheduler people:
> >> >
> >> > This is relatively easy for me to reproduce.  Any hints for debugging
> >> > it?  Could we really have a bug in which processes that are
> >> > schedulable as a result of mutex unlock aren't always reliably
> >> > scheduled?
> >>
> >> I would expect that to cause wide-spread fail, then again, virt is known
> >> to tickle timing issues that are improbable on actual hardware so
> >> anything is possible.
> >>
> >> Does it reproduce with DEBUG_MUTEXES set? (I'm not seeing a .config
> >> here).
> >
> > The config has CONFIG_DEBUG_MUTEXES=y
> >
> > It got attached a while ago, reposting it here.
> >
> >> If it's really easy you could start by tracing events/sched/sched_switch
> >> and events/sched/sched_wakeup; those would be the actual scheduling events.
> >
> > I'm sure I've missed something in /Documentation but I'm not aware how
> > to trace these? (I'm happy to save Andy some precious time as I've got a
> > reproducer all set up now)
> 
> My reproducer, at least, would make this tricky -- the system ends up
> mostly hung, so I don't know how I'd read out the result.  Maybe I'd
> try to get something to dump the ftrace buffer to serial console after
> a delay and stick all that in initramfs where it wouldn't get stuck
> behind the same mutex as everything else.
> 
> Or is there a way to tell the kernel to do that for us?

If you can generate a core, I think crash knows how to read the ftrace
buffers from it.

  http://people.redhat.com/anderson/extensions/trace_help_trace.html

But yes, you can use one of the watchdog thingies to dump the buffers
over 'serial' too, but I suspect that will take a little longer, even
with virtual serial ports.
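
For reference, here is a rough sketch of how the two scheduler events named
above could be switched on, and how the kernel could be asked to dump the
ftrace buffer to the console by itself when a task hangs. None of this comes
from the thread: the function names are made up for illustration, the tracefs
mount point is assumed to be /sys/kernel/tracing (older setups expose it as
/sys/kernel/debug/tracing), and the hung_task_panic knob is assumed to be
present, which needs CONFIG_DETECT_HUNG_TASK. It has to run as root, early
enough (for example from the initramfs) that it does not itself end up stuck
behind the mutex.

```python
from pathlib import Path

# The two events suggested earlier in the thread.
SCHED_EVENTS = ("sched/sched_switch", "sched/sched_wakeup")


def enable_sched_events(tracefs_root="/sys/kernel/tracing"):
    """Enable the sched_switch and sched_wakeup events and turn tracing on.

    tracefs_root is an assumption; pass /sys/kernel/debug/tracing if that
    is where tracefs lives on the system under test.
    """
    root = Path(tracefs_root)
    touched = []
    for event in SCHED_EVENTS:
        enable_file = root / "events" / event / "enable"
        enable_file.write_text("1\n")  # same effect as: echo 1 > .../enable
        touched.append(str(enable_file))
    (root / "tracing_on").write_text("1\n")
    return touched


def dump_trace_on_hang(sysctl_root="/proc/sys/kernel"):
    """Ask the kernel to panic on a hung task and dump ftrace on the oops.

    Assumes both sysctl files exist on the running kernel.
    """
    root = Path(sysctl_root)
    (root / "ftrace_dump_on_oops").write_text("1\n")
    (root / "hung_task_panic").write_text("1\n")


# Hypothetical usage, as root, before triggering the reproducer:
#   enable_sched_events()
#   dump_trace_on_hang()
```

With both in place, the hung-task detector should panic once a task has been
blocked for longer than its timeout, and the trace buffer then goes out over
the console, so nothing has to be read back from the hung system by hand.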