Message-ID: <alpine.LFD.2.02.1110172237030.3240@ionos>
Date: Mon, 17 Oct 2011 23:00:35 +0200 (CEST)
From: Thomas Gleixner <tglx@...utronix.de>
To: Peter Zijlstra <a.p.zijlstra@...llo.nl>
cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Simon Kirby <sim@...tway.ca>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Dave Jones <davej@...hat.com>,
Martin Schwidefsky <schwidefsky@...ibm.com>,
Ingo Molnar <mingo@...e.hu>
Subject: Re: Linux 3.1-rc9
On Mon, 17 Oct 2011, Peter Zijlstra wrote:
> On Mon, 2011-10-17 at 11:31 -0700, Linus Torvalds wrote:
> > On Mon, Oct 17, 2011 at 10:54 AM, Peter Zijlstra <a.p.zijlstra@...llo.nl> wrote:
> > >
> > > I could of course propose this... but I really won't since I'm half
> > > retching by now.. ;-)
> >
> > Wow. Is this "ugly and fragile code week" and I just didn't get the memo?
>
> Do I get a prize?
>
> > I do wonder if we might not fix the problem by just taking the
> > *existing* lock in the right order?
> >
> > IOW, how nasty would it be to make "scheduler_tick()" just get the
> > cputimer->lock outside of rq->lock?
> >
> > Sure, we'd hold that lock *much* longer than we need, but how much do
> > we care? Is that a lock that gets contention? It might be the simple
> > solution for now - I *would* like to get 3.1 out..
>
> Ah, sadly the tick isn't the only one with the inverted callchain,
> pretty much every callchain in the scheduler ends up in update_curr()
> one way or another.
>
> The easier way around might be something like this... even when two
> threads in a process race to enable this clock, the wasted time is
> pretty much of the same order as we would otherwise have wasted spinning
> on the lock, and the update_gt_cputime() thing would end up moving the
> clock forward to the latest outcome either way.
>
> Humm, Thomas, anything?
No, that should work. It does not make that call path any more racy
against exit, which is another trainwreck, at least on 32bit machines,
that I discovered while looking into the problems with your patch.
thread_group_cputime() reads task->signal->utime/stime/sum_sched_runtime.
These fields are updated in __exit_signal() without holding
task->signal->cputimer.lock, so nothing prevents these values from
changing while we read them.
All callers of thread_group_cputime() except the scheduler callpath
hold sighand lock, which is also taken in __exit_signal().
So your patch does not make that particular case worse.
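
A minimal userspace model of that pattern (all names made up, this is
not the kernel code): one thread reads the three fields the way the
scheduler path does, the other folds a dying thread's times in the way
__exit_signal() does, with no common lock. On 32bit the 64bit
sum_sched_runtime read can additionally be torn. Build with
gcc -pthread; the counter shows how often the reader sees utime and
stime out of step.

/* Userspace model of the unlocked read vs. the __exit_signal() style
 * update.  Illustration only, not kernel code. */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

struct fake_signal_struct {
	volatile uint64_t utime;
	volatile uint64_t stime;
	volatile uint64_t sum_sched_runtime;	/* 64bit: can tear on 32bit */
};

static struct fake_signal_struct fsig;
static volatile int stop;

/* Models __exit_signal(): fold a dying thread's times into the group
 * totals without taking any lock the reader also takes. */
static void *exit_side(void *arg)
{
	while (!stop) {
		fsig.utime += 1;
		fsig.stime += 1;
		fsig.sum_sched_runtime += 0x100000001ULL; /* both halves */
	}
	return NULL;
}

/* Models the scheduler-path caller of thread_group_cputime(): read the
 * three fields with no protection and check them for consistency. */
static void *reader_side(void *arg)
{
	uint64_t u, s, rt;
	unsigned long mismatch = 0;
	int i;

	for (i = 0; i < 10 * 1000 * 1000; i++) {
		u  = fsig.utime;
		s  = fsig.stime;
		rt = fsig.sum_sched_runtime;
		/* The writer bumps utime and stime together, so u != s
		 * means it ran between our two reads. */
		if (u != s)
			mismatch++;
		(void)rt;
	}
	printf("inconsistent snapshots: %lu\n", mismatch);
	stop = 1;
	return NULL;
}

int main(void)
{
	pthread_t w, r;

	pthread_create(&w, NULL, exit_side, NULL);
	pthread_create(&r, NULL, reader_side, NULL);
	pthread_join(r, NULL);
	pthread_join(w, NULL);
	return 0;
}
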
That said, I really need some sleep before I can make a final
judgement on that horror. The call paths are such an intermingled mess
that it's not funny anymore. I'll do that tomorrow morning, first thing.
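
For the record, the two lock orders as I read them tonight (paths
abbreviated, to be re-checked with awake eyes):

  scheduler_tick()
    raw_spin_lock(&rq->lock)
    task_tick_fair() -> update_curr()
      account_group_exec_runtime()
        spin_lock(&cputimer->lock)        <- rq->lock, then cputimer->lock

  thread_group_cputimer()
    spin_lock_irqsave(&cputimer->lock)
    thread_group_cputime()
      task_sched_runtime()
        task_rq_lock()                    <- cputimer->lock, then rq->lock
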
Thanks,
tglx