Date:	Fri, 03 Jun 2011 16:17:54 +1000
From:	Benjamin Herrenschmidt <benh@...nel.crashing.org>
To:	Alan Cox <alan@...rguk.ukuu.org.uk>
Cc:	gregkh@...e.de,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Felipe Balbi <balbi@...com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Tejun Heo <tj@...nel.org>
Subject: Re: tty breakage in X (Was: tty vs workqueue oddities)

On Fri, 2011-06-03 at 10:56 +1000, Benjamin Herrenschmidt wrote:

> I just noticed it doesn't happen (or if it does, it recovers fast enough
> to not be noticeable) on an SMP machine (dual G5). However, if I boot the
> same machine with maxcpus=1, the problem is back. A simple "dmesg" in
> gnome terminal shows it.
> 
> However, on that much faster machine, it also recovers a lot faster. On
> the powerbook, it hangs a few minutes, on the G5 it hangs a few seconds.
> 
> I don't have the bandwidth to dive into the workqueue/tty before this
> week-end, I'll give it a shot next week if nobody beats me to it.

Some more data: it -looks- like the flush_to_ldisc work item constantly
re-queues itself (because the PTY is full?), and the workqueue thread
basically loops forever calling it without ever scheduling, thus
starving the consumer process that could have emptied the PTY.

At least that's a semi half-assed theory. If I add a schedule() to
process_one_work() after dropping the lock, the problem disappears.
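
To make the theory concrete, here is a toy userspace model of it. This
is only a sketch under my assumptions, not kernel code: simulate(),
PTY_CAPACITY and the "flush" item are made-up names, and the model
assumes a single CPU where the consumer only runs when the worker
explicitly yields.

```python
from collections import deque

PTY_CAPACITY = 4  # hypothetical PTY buffer size, chosen for illustration


def simulate(yield_between_items, max_steps=1000):
    """Toy single-CPU, non-preemptive model of the suspected starvation.

    Returns the number of work items the worker ran before the pending
    data was delivered, or None if it was still stuck after max_steps.
    """
    pty_fill = PTY_CAPACITY      # the PTY starts out full
    pending = 1                  # one chunk waiting to be pushed to the PTY
    workqueue = deque(["flush"])
    steps = 0

    while workqueue and steps < max_steps:
        workqueue.popleft()
        steps += 1

        if pty_fill < PTY_CAPACITY:
            # Room in the PTY: push the chunk, no need to requeue.
            pty_fill += 1
            pending -= 1
        else:
            # PTY full: the work item requeues itself and returns.
            workqueue.append("flush")

        # Stand-in for adding schedule() between work items: the
        # consumer gets the CPU and drains one chunk from the PTY.
        if yield_between_items and pty_fill > 0:
            pty_fill -= 1

    return steps if pending == 0 else None


if __name__ == "__main__":
    print("no yield:  ", simulate(yield_between_items=False))
    print("with yield:", simulate(yield_between_items=True))
```

Without the yield, the worker burns through all max_steps re-running
the same item and never delivers the data. With the yield, the consumer
drains the PTY after the first failed attempt and the second attempt
succeeds.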

So there's a combination of things here that are quite interesting:

 - When a lot of work is queued, the kworker will essentially run
without scheduling for as long as it takes to drain all work items.
That doesn't sound very nice latency-wise, at least on a non-PREEMPT
kernel.

 - flush_to_ldisc seems to be nasty: from what I can tell, it requeues
itself over and over again when it can't push the data out. In this
case I suspect that's because the PTY is full, but I don't know for
sure yet.

Cheers,
Ben.

