Message-ID: <28a9fabb-c9fe-c865-016a-467a4d5e2a34@molgen.mpg.de>
Date:   Tue, 8 Nov 2016 13:22:28 +0100
From:   Paul Menzel <pmenzel@...gen.mpg.de>
To:     linux-kernel@...r.kernel.org
Cc:     "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        Josh Triplett <josh@...htriplett.org>, dvteam@...gen.mpg.de
Subject: INFO: rcu_sched detected stalls on CPUs/tasks with `kswapd` and
 `mem_cgroup_shrink_node`

Dear Linux folks,


Could you please help me shed some light on the messages below?

With Linux 4.4.X, these messages were not seen. After updating to Linux
4.8.4 and then 4.8.6, they started to appear. In the 4.8 builds, we
also enabled several CGROUP options.
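
To narrow down which cgroup-related options differ between the two
attached configurations, something along these lines could be used. It
is only a minimal sketch: the function names and the substrings it
filters on (`CGROUP`, `MEMCG`) are assumptions chosen for illustration,
not a complete list of cgroup-related Kconfig symbols.

```python
import re


def parse_config(text):
    """Map each CONFIG_* symbol in a kernel .config to its value.

    Symbols written as "# CONFIG_FOO is not set" are recorded as "n".
    """
    options = {}
    for line in text.splitlines():
        line = line.strip()
        unset = re.match(r"# (CONFIG_\w+) is not set$", line)
        if unset:
            options[unset.group(1)] = "n"
            continue
        assigned = re.match(r"(CONFIG_\w+)=(.*)$", line)
        if assigned:
            options[assigned.group(1)] = assigned.group(2)
    return options


def changed_options(old_text, new_text, substrings=("CGROUP", "MEMCG")):
    """Return {symbol: (old_value, new_value)} for symbols that differ.

    Only symbols whose name contains one of `substrings` are reported.
    A value of None means the symbol does not appear in that file.
    """
    old, new = parse_config(old_text), parse_config(new_text)
    changes = {}
    for name in sorted(set(old) | set(new)):
        if not any(s in name for s in substrings):
            continue
        before, after = old.get(name), new.get(name)
        if before != after:
            changes[name] = (before, after)
    return changes
```

Reading `config-4.4.14.mx64.90` and `config-4.8.6.mx64.115` into two
strings and passing them to `changed_options()` would list the symbols
that were added or changed between the two builds.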

> $ dmesg -T
> […]
> [Mon Nov  7 15:09:45 2016] INFO: rcu_sched detected stalls on CPUs/tasks:
> [Mon Nov  7 15:09:45 2016]     3-...: (493 ticks this GP) idle=515/140000000000000/0 softirq=5504423/5504423 fqs=13876
> [Mon Nov  7 15:09:45 2016]     (detected by 5, t=60002 jiffies, g=1363193, c=1363192, q=268508)
> [Mon Nov  7 15:09:45 2016] Task dump for CPU 3:
> [Mon Nov  7 15:09:45 2016] kswapd1         R  running task        0    87      2 0x00000008
> [Mon Nov  7 15:09:45 2016]  ffffffff81aabdfd ffff8810042a5cb8 ffff88080ad34000 ffff88080ad33dc8
> [Mon Nov  7 15:09:45 2016]  ffff88080ad33d00 0000000000003501 0000000000000000 0000000000000000
> [Mon Nov  7 15:09:45 2016]  0000000000000000 0000000000000000 0000000000022316 000000000002bc9f
> [Mon Nov  7 15:09:45 2016] Call Trace:
> [Mon Nov  7 15:09:45 2016]  [<ffffffff81aabdfd>] ? __schedule+0x21d/0x5b0
> [Mon Nov  7 15:09:45 2016]  [<ffffffff81106dcf>] ? shrink_node+0xbf/0x1c0
> [Mon Nov  7 15:09:45 2016]  [<ffffffff81107865>] ? kswapd+0x315/0x5f0
> [Mon Nov  7 15:09:45 2016]  [<ffffffff81107550>] ? mem_cgroup_shrink_node+0x90/0x90
> [Mon Nov  7 15:09:45 2016]  [<ffffffff8106c614>] ? kthread+0xc4/0xe0
> [Mon Nov  7 15:09:45 2016]  [<ffffffff81aaf64f>] ? ret_from_fork+0x1f/0x40
> [Mon Nov  7 15:09:45 2016]  [<ffffffff8106c550>] ? kthread_worker_fn+0x160/0x160
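
For comparing several such reports, the numeric fields of the stall
message can be pulled out mechanically. The following is a minimal
sketch that assumes the message layout shown above; the function names
are hypothetical, and the field meanings in the comments follow my
reading of `stallwarn.txt`. Converting jiffies to seconds needs the
kernel's `CONFIG_HZ`, which is not part of the message and therefore
has to be passed in.

```python
import re

# "3-...: (493 ticks this GP)" -> stalled CPU and its tick count
CPU_RE = re.compile(r"(\d+)-\S*: \((\d+) ticks this GP\)")
# "(detected by 5, t=60002 jiffies, g=..., c=..., q=...)"
DETECT_RE = re.compile(
    r"\(detected by (\d+), t=(\d+) jiffies, "
    r"g=(-?\d+), c=(-?\d+), q=(\d+)\)"
)


def parse_stall(lines):
    """Extract the numeric fields from an rcu_sched stall report."""
    info = {"stalled_cpus": []}
    for line in lines:
        cpu = CPU_RE.search(line)
        if cpu:
            info["stalled_cpus"].append(
                {"cpu": int(cpu.group(1)), "ticks": int(cpu.group(2))}
            )
        det = DETECT_RE.search(line)
        if det:
            info["detected_by"] = int(det.group(1))  # CPU that noticed the stall
            info["jiffies"] = int(det.group(2))      # elapsed in this grace period
            info["g"] = int(det.group(3))            # last grace period started
            info["c"] = int(det.group(4))            # last grace period completed
            info["q"] = int(det.group(5))            # estimate of queued callbacks
    return info


def jiffies_to_seconds(jiffies, hz):
    """Convert jiffies to seconds; hz must match the kernel's CONFIG_HZ."""
    return jiffies / hz
```

Feeding the `dmesg -T` lines above into `parse_stall()` yields CPU 3 as
the stalled CPU and CPU 5 as the detecting one.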

Even after reading `stallwarn.txt` [1], I don’t know what could cause 
this. All items in the backtrace seem to belong to the Linux kernel.

There is also nothing suspicious in the monitoring graphs during that time.


Kind regards,

Paul


[1] https://www.kernel.org/doc/Documentation/RCU/stallwarn.txt

View attachment "config-4.8.6.mx64.115" of type "text/plain" (112117 bytes)

View attachment "config-4.4.14.mx64.90" of type "text/plain" (107630 bytes)
