Date:	Mon, 4 Jul 2016 22:40:21 +0200 (CEST)
From:	Thomas Gleixner <tglx@...utronix.de>
To:	Vladimir Panteleev <thecybershadow@...il.com>
cc:	Ingo Molnar <mingo@...hat.com>, "H. Peter Anvin" <hpa@...or.com>,
	x86@...nel.org, LKML <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>
Subject: Re: Subject: PROBLEM: CPU accounting/scheduling regression in v4.6
 CPU scheduling patchset?

On Sun, 3 Jul 2016, Vladimir Panteleev wrote:
> Since updating my PC to Linux 4.6, I noticed the following problems:
> 
> 1. CPU-bound tasks which use all CPU cores have a severe impact on
>    responsiveness.  For example, the following bash command (which
>    simply starts one busyloop per core) is enough to make the machine
>    almost completely unresponsive:
> 
>    for N in $(seq $(nproc)) ; do while true ; do : ; done & done
> 
> 2. Nearly all tasks in the process listing are shown with 0% CPU
>    usage, even when they're CPU-bound. The only exceptions are the
>    kernel migration and kthreadd tasks, and occasionally the init
>    process.
> 
> I have bisected the problem to commit
> 1cf4f629d9d246519a1e76c021806f2a51ddba4d ("cpu/hotplug: Move online
> calls to hotplugged cpu"), which is part of Thomas Gleixner's CPU
> hotplug refactoring patchset [1]. It introduces both problems
> described above.

I doubt that, but that commit has been a bisect victim before ...

> My system is a GIGABYTE X79S-UP5-WIFI motherboard (F5f BIOS) with an
> i7-4960X CPU, running Arch Linux. I've reproduced with both the
> distro's kernel config [2], as well as a minimal config for my
> system. I can reproduce the problems on the latest rc at the moment,
> v4.7-rc5.
> 
> Comparing dmesg output before and after 1cf4f629, I see no notable
> differences.
> 
> I noticed an existing thread "S3 resume regression" [3] referencing
> this commit, however it describes a different problem. I also found a
> Bugzilla issue for the zero CPU usage problem [4], however it has no
> replies.

That one says:
 * After an hour or less (I have no idea), the top/ps start working
 * I do have exactly the same problem with the LTS branch 4.4.14
 * With 4.5.4 I cannot reproduce the problem just after booting

I tried to reproduce the issue on a couple of machines, but no luck.
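
For anyone else trying to reproduce the zero CPU usage symptom without
relying on top/ps, here is a minimal sketch (not taken from the report
or the Bugzilla entry). It assumes bash and a mounted procfs, and it
reads utime and stime (fields 14 and 15 of /proc/PID/stat, in clock
ticks) for a single busy loop. The helper name cpu_ticks is made up for
illustration.

```shell
#!/bin/bash
# Sketch: check whether per-task CPU accounting advances for a busy loop.
# Assumption: Linux procfs, where fields 14 and 15 of /proc/PID/stat are
# utime and stime in clock ticks.

cpu_ticks() {
    # Print utime + stime (in clock ticks) for the PID given as $1.
    local stat
    stat=$(cat "/proc/$1/stat") || return 1
    # Drop "pid (comm) " so a comm containing spaces cannot shift fields.
    stat=${stat##*) }
    # Field 3 (state) is now $1, so utime is ${12} and stime is ${13}.
    set -- $stat
    echo $(( ${12} + ${13} ))
}

# Start one busy loop in the background and remember its PID.
( while : ; do : ; done ) &
pid=$!

before=$(cpu_ticks "$pid")
sleep 1
after=$(cpu_ticks "$pid")

# Clean up the busy loop; wait returns non-zero for a killed job.
kill "$pid"
wait "$pid" 2>/dev/null || true

echo "ticks consumed in 1s: $(( after - before ))"
```

If accounting works, the difference should be well above zero for a
loop that was actually running. A difference that stays at zero while
the loop is spinning would match the symptom described in the report.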

No idea at the moment, but Cc'ed scheduler folks.

Thanks,

	tglx
