linux-kernel - Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]


Message-ID: <20070414190148.GA2844@1wt.eu>
Date:	Sat, 14 Apr 2007 21:01:48 +0200
From:	Willy Tarreau <w@....eu>
To:	"Eric W. Biederman" <ebiederm@...ssion.com>
Cc:	Ingo Molnar <mingo@...e.hu>, Nick Piggin <npiggin@...e.de>,
	linux-kernel@...r.kernel.org,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Con Kolivas <kernel@...ivas.org>,
	Mike Galbraith <efault@....de>,
	Arjan van de Ven <arjan@...radead.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Jiri Slaby <jirislaby@...il.com>,
	Alan Cox <alan@...rguk.ukuu.org.uk>
Subject: Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

On Sat, Apr 14, 2007 at 12:40:15PM -0600, Eric W. Biederman wrote:
> Willy Tarreau <w@....eu> writes:
> 
> > On Sat, Apr 14, 2007 at 07:54:33PM +0200, Ingo Molnar wrote:
> >> 
> >> * Eric W. Biederman <ebiederm@...ssion.com> wrote:
> >> 
> >> > > Thinking about it, I don't know if there are calls to schedule() 
> >> > > while switching from tty1 to tty2. Alt-F2 had no effect anymore, and 
> >> > > "chvt 2" simply blocked. It would have been possible that a 
> >> > > schedule() call somewhere got starved due to the load, I don't know.
> >> > 
> >> > It looks like there is a call to schedule_work.
> >> 
> >> so this goes over keventd, right?
> >> 
> >> > There are two pieces of the path. If you are switching in and out of a 
> >> > tty controlled by something like X.  User space has to grant 
> >> > permission before the operation happens.  Where there isn't a gate 
> >> > keeper I know it is cheaper but I don't know by how much, I suspect 
> >> > there is still a schedule happening in there.
> >> 
> >> Could keventd perhaps be starved? Willy, to exclude this possibility, 
> >> could you perhaps chrt keventd to RT priority? If events/0 is PID 5 then 
> >> the command to set it to SCHED_FIFO:50 would be:
> >> 
> >>   chrt -f -p 50 5
> >> 
> >> but ... events/0 is reniced to -5 by default, so it should definitely 
> >> not be starved.
> >
> > Well, since I merged the fair-fork patch, I cannot reproduce (in fact,
> > bash forks 1000 processes, then progressively execs scheddos, but it
> > takes some time). So I'm rebuilding right now. But I think that Linus
> > has an interesting clue about GPM and notification before switching
> > the terminal. I think it was enabled in console mode. I don't know
> > how that translates to frozen xterms, but let's attack the problems
> > one at a time.
> 
> I think it is a good clue.  However the intention of the mechanism is
> that only processes that change the video mode on a VT are supposed to
> use it.  So I really don't think gpm is the culprit.  However it easily could
> be something else that has similar characteristics.
> 
> I just realized we do have proof that schedule_work is actually working
> because SAK works, and we can't sanely do SAK from interrupt context
> so we call schedule work.

Eric,

Linus, Ingo and you were all on the right track.
I could reproduce it: I got a hung tty at around 1400 running processes.
Fortunately, it was the one with the root shell, which was reniced
to -19.

I was able to strace "chvt 2":

20:44:23.761117 open("/dev/tty", O_RDONLY) = 3 <0.004000>
20:44:23.765117 ioctl(3, KDGKBTYPE, 0xbfa305a3) = 0 <0.024002>
20:44:23.789119 ioctl(3, VIDIOC_G_COMP or VT_ACTIVATE, 0x3) = 0 <0.000000>
20:44:23.789119 ioctl(3, VIDIOC_S_COMP or VT_WAITACTIVE <unfinished ...>
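
For readers following the trace: the switch is a two-step sequence, where VT_ACTIVATE requests the change and VT_WAITACTIVE blocks until the target VT is actually active, which is the call that never returned here. A minimal sketch of that sequence follows. The function name `change_vt` is hypothetical, the request numbers are the values from `<linux/vt.h>`, and the KDGKBTYPE probe seen in the trace is left out.

```python
import fcntl
import os

# ioctl request numbers as defined in <linux/vt.h>
VT_ACTIVATE = 0x5606    # ask the kernel to switch to the given VT
VT_WAITACTIVE = 0x5607  # block until the given VT is the active one


def change_vt(vt, console="/dev/tty"):
    """Hypothetical helper: request a switch to VT `vt` and wait for it."""
    fd = os.open(console, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, VT_ACTIVATE, vt)
        # This is the call shown as <unfinished ...> in the trace above.
        fcntl.ioctl(fd, VT_WAITACTIVE, vt)
    finally:
        os.close(fd)
```

Calling `change_vt(2)` from a console with sufficient privileges would correspond to `chvt 2`; on a file descriptor that is not a console the ioctl fails with an OSError.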

Then I applied Ingo's suggestion of changing keventd's priority:

root@pcw:~# ps auxw|grep event
root         8  0.0  0.0     0    0 ?        SW<  20:31   0:00 [events/0]
root         9  0.0  0.0     0    0 ?        RW<  20:31   0:00 [events/1]

root@pcw:~# rtprio -s 1 -p 50 8 9     (I don't have chrt but it does the same)

My VT immediately switched as soon as I hit Enter. Everything's
working fine again now. So the good news is that it's not a bug
in the tty code, nor a deadlock.
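
As an illustration of what the priority change amounts to: `chrt -f -p 50 <pid>` sets the SCHED_FIFO policy with priority 50 through sched_setscheduler(2), and the rtprio invocation above presumably does the same. The sketch below is an assumption-laden example, not the tool's implementation; the helper names `fifo_param` and `set_fifo` are hypothetical, and the call needs root (or CAP_SYS_NICE) to succeed.

```python
import os


def fifo_param(priority):
    """Hypothetical helper: validate a SCHED_FIFO priority and wrap it."""
    lo = os.sched_get_priority_min(os.SCHED_FIFO)
    hi = os.sched_get_priority_max(os.SCHED_FIFO)
    if not lo <= priority <= hi:
        raise ValueError(f"SCHED_FIFO priority must be in [{lo}, {hi}]")
    return os.sched_param(priority)


def set_fifo(pid, priority=50):
    """Hypothetical helper: move `pid` to SCHED_FIFO at `priority`."""
    os.sched_setscheduler(pid, os.SCHED_FIFO, fifo_param(priority))
```

With the PIDs from the ps output above, this would be `set_fifo(8)` and `set_fifo(9)` for events/0 and events/1.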

Now, maybe keventd should get a higher priority? It worries me
that it can starve when so much depends on it running promptly.

Also, that may explain why I couldn't reproduce with the fork patch.
Since all new processes got no runtime at first, their impact on
existing ones must have been lower. But I think that if I had waited
longer, I would have had the problem again (though I did not see it
even under a load of 7800).

Regards,
Willy
