linux-kernel - Re: excessive kworker activity when idle. (was Re: vma corruption in today's -git)


Message-ID: <AANLkTikvXSZ2NSA7Ar+bTA1H+S3HBs9e5NNb71RPTs32@mail.gmail.com>
Date:	Thu, 31 Mar 2011 08:45:51 -0700
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Dave Jones <davej@...hat.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linux Kernel <linux-kernel@...r.kernel.org>,
	Tejun Heo <tj@...nel.org>
Subject: Re: excessive kworker activity when idle. (was Re: vma corruption in
 today's -git)

On Thu, Mar 31, 2011 at 8:09 AM, Dave Jones <davej@...hat.com> wrote:
>
> I thought that trace looked familiar.
>
> http://lkml.org/lkml/2010/11/30/592
>
> It's the same thing.

Ok, that's before the "tty: stop using "delayed_work" in the tty
layer" commit I just pointed to.

So apparently you've been able to trigger this even with the old code
too - although maybe the lack of delays anywhere has made it easier,
and has made it use more cpu.

I'll have to think about it, but I wonder if it's the crazy "reflush"
case in flush_to_ldisc. We do

                        if (!tty->receive_room || seen_tail) {
                                schedule_work(&tty->buf.work);
                                break;
                        }

inside the routine that is the work itself - basically we're saying
that "if there's no more room to flip, or we've seen a new buffer,
give up now and reschedule ourselves".

Which doesn't really make much sense to me, I have to admit. The code
that actually empties the buffer, or the code that adds one, should
already have scheduled us for a flip _anyway_. So the only thing that
"schedule_work()" is doing is causing infinite work if nothing empties
the buffer, or more likely if we have a flushing bug elsewhere.

So I'm not sure, but my gut feel is that removing that
"schedule_work()" line there is the right thing to do.
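
To make the "infinite work" concern concrete, here is a minimal
user-space sketch of a work item that requeues itself whenever it cannot
make progress. It is only a model: struct tty_model, flush_model() and
run_until_idle() are hypothetical names invented for illustration, not
kernel APIs, and the real flush_to_ldisc() does considerably more.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct tty_model {
    int  receive_room;  /* space left in the line discipline */
    int  pending;       /* bytes waiting in the flip buffer */
    bool work_queued;   /* stands in for a queued work item */
    bool reflush;       /* true: requeue ourselves when stuck */
};

/* One run of the modeled flush work. Returns the number of bytes moved. */
static int flush_model(struct tty_model *t)
{
    int moved = 0;

    t->work_queued = false;          /* we are running, so no longer queued */
    while (t->pending > 0) {
        if (t->receive_room == 0) {
            if (t->reflush)
                t->work_queued = true;   /* the questioned self-reschedule */
            break;
        }
        t->pending--;
        t->receive_room--;
        moved++;
    }
    return moved;
}

/* Run queued work until nothing is queued or max_runs is reached.
 * Returns the number of times the work function executed. */
static int run_until_idle(struct tty_model *t, int max_runs)
{
    int runs = 0;

    assert(max_runs > 0);
    t->work_queued = true;           /* initial kick from a producer */
    while (t->work_queued && runs < max_runs) {
        flush_model(t);
        runs++;
    }
    return runs;
}
```

With receive_room stuck at 0 and reflush set, run_until_idle() only
stops because of the max_runs cap, which is the model's stand-in for a
kworker spinning on an idle machine. With reflush cleared, the work runs
once and then waits for somebody else to queue it.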

At a guess, it was hiding some locking problem - and it's been carried
around even though hopefully we've fixed all the crazy races we used
to have (and it was a mindless "hey, we can retry in one jiffy - it
doesn't really cost us anything").

NOTE! Even if I'm right, and that line is just buggy, the bug may well
have been hiding some other issue - i.e. just some user not flushing
the tty when it made more room available. So I think the "make tty
flush cause a re-flush when it cannot make progress" is wrong, but
removing the line may well expose some other problem.
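
As a rough illustration of that caveat, the sketch below models a flush
that never requeues itself and a reader that frees room in the line
discipline. Everything here is an assumption made for illustration:
struct ldisc_model, flush_once(), reader_consume() and
run_pending_work() are hypothetical names, and the notify flag merely
simulates a code path that does or does not kick the flush after making
room.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct ldisc_model {
    int  receive_room;  /* space left in the line discipline */
    int  pending;       /* bytes waiting in the flip buffer */
    bool work_queued;   /* stands in for a queued work item */
};

/* Flush without any reflush: move what fits, then simply stop. */
static void flush_once(struct ldisc_model *t)
{
    t->work_queued = false;
    while (t->pending > 0 && t->receive_room > 0) {
        t->pending--;
        t->receive_room--;
    }
}

/* A reader drains n bytes from the line discipline, freeing room.
 * notify == false simulates a path that forgets to kick the flush. */
static void reader_consume(struct ldisc_model *t, int n, bool notify)
{
    assert(n >= 0);
    t->receive_room += n;
    if (notify && t->pending > 0)
        t->work_queued = true;
}

/* Run the flush only if somebody queued it. Returns true if it ran. */
static bool run_pending_work(struct ldisc_model *t)
{
    if (!t->work_queued)
        return false;
    flush_once(t);
    return true;
}
```

When the reader notifies, the flush runs exactly when room appears and
no polling is needed. When it does not, pending data sits in the buffer
indefinitely, which is the kind of latent problem a blind reschedule
could have been papering over.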

                             Linus