lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Sat, 31 Jan 2015 21:16:01 +0100
From:	Peter Zijlstra <peterz@...radead.org>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Oleg Nesterov <oleg@...hat.com>,
	"Rafael J. Wysocki" <rjw@...ysocki.net>,
	Ingo Molnar <mingo@...nel.org>,
	Peter Hurley <peter@...leysoftware.com>,
	Davidlohr Bueso <dave@...olabs.net>,
	Bruno Prémont <bonbons@...ux-vserver.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ilya Dryomov <ilya.dryomov@...tank.com>,
	Mike Galbraith <umgwanakikbuti@...il.com>
Subject: Re: Linux 3.19-rc5

On Sat, Jan 31, 2015 at 10:32:23AM -0800, Linus Torvalds wrote:
> On Fri, Jan 30, 2015 at 7:47 AM, Oleg Nesterov <oleg@...hat.com> wrote:
> >
> > Perhaps sched_annotate_sleep() shouldn't depend on CONFIG_DEBUG_ATOMIC_SLEEP
> > too...
> 
> Ugh. That thing is horrible. The naming doesn't make it obvious at all
> that it's actually making sure that we have state set to TASK_RUNNING,
> and I could easily imagine that it would cause similar "busy-loops
> while scheduling" issues if anybody ever uses it in the wrong context.

The alternative was putting unconditional __set_task_state(TASK_RUNNING)
things in a few code paths. I didn't want to cause the extra code in
case we didn't need them. Particularly the include/net/sock.h one, as I
know the network people are cycle counters.

But this function should indeed be used very rarely. Currently we're at
5 instances.

> So I really think that whole thing is a sign of "the debug
> infrastructure is buggy, and people are introducing fragile things to
> just shut up the false positives".

Aside from this one annotation, which is used in cases where we broke
out of the wait loop but haven't reset task state yet, there have not
really been false positives.

All the stuff it flagged are genuinely wrong, albeit not disastrously
so, things mostly just work.

Should I make the default wait_event() safe against sleeps? It would
make it slightly more expensive, but on the whole that should not really
be a problem.

Alternatively we should provide an alternative to wait_event() that
allows sleeps, but I'm failing to come up with a good name.

> I don't know how to fix it. I really get the feeling that the whole
> new "nested sleep" detection code was a mistake to begin with, since
> it wasn't even a real bug, and it has now created more bugs than it
> ever detected afaik.

I appreciate I caused you some pain here, and I'm very sorry for that.
But I do think we should be little more careful with task::state, on
occasion things do go wrong there and funny stuff happens.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ