Message-ID: <6243f7cc-1a80-db0b-4765-fa12bda9b06a@comcast.net>
Date:   Sun, 23 Sep 2018 16:07:12 -0400
From:   Rob Prowel <rprowel@...cast.net>
To:     linux-kernel@...r.kernel.org
Subject: AMD Athlon bogus performance value causing RCU stalls?


Please CC me on comments.

I'm seeing a lot of these errors on my dual core fileserver:
-----------------------------------------------------------------------

Sep 23 01:51:28 files kernel: INFO: rcu_sched detected stalls on CPUs/tasks:
Sep 23 01:51:28 files kernel:         1-...!: (0 ticks this GP) idle=27c/0/0 softirq=35425/35425 fqs=0
Sep 23 01:51:28 files kernel:         (detected by 0, t=60009 jiffies, g=20812, c=20811, q=121)
Sep 23 01:51:28 files kernel: Sending NMI from CPU 0 to CPUs 1:
Sep 23 01:51:28 files kernel: NMI backtrace for cpu 1 skipped: idling at native_safe_halt+0x2/0x10
Sep 23 01:51:28 files kernel: rcu_sched kthread starved for 60009 jiffies! g20812 c20811 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=1
Sep 23 01:51:28 files kernel: RCU grace-period kthread stack dump:
Sep 23 01:51:28 files kernel: rcu_sched       I    0    10      2 0x80000000
Sep 23 01:51:33 files kernel: Call Trace:
Sep 23 01:51:33 files kernel:  ? __schedule+0x25c/0x860
Sep 23 01:51:33 files kernel:  schedule+0x28/0x80
Sep 23 01:51:33 files kernel:  schedule_timeout+0x174/0x370
Sep 23 01:51:33 files kernel:  ? __next_timer_interrupt+0xc0/0xc0
Sep 23 01:51:33 files kernel:  rcu_gp_kthread+0x4b6/0x8c0
Sep 23 01:51:33 files kernel:  ? _synchronize_rcu_expedited.constprop.68+0x310/0x310
Sep 23 01:51:33 files kernel:  kthread+0x113/0x130
Sep 23 01:51:33 files kernel:  ? kthread_create_worker_on_cpu+0x70/0x70
Sep 23 01:51:33 files kernel:  ret_from_fork+0x35/0x40

-----------------------------------------------------------------------

The kernel-reported bogoMIPS values for the two cores are as follows:

$ grep bogo /proc/cpuinfo
bogomips        : 4219.49
bogomips        : 184253.06
$

What is that value for the second Athlon core? It looks extremely bogus. Would/could it be the reason for the schedule_timeouts above? The same bogus value also shows up in the boot log when the second core is activated. This seems to be AMD-specific, as the values are correct on my Xeon machines.
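
To make the mismatch easy to check on other machines, here is a minimal sketch, assuming Python 3 is available. It pulls the bogomips lines out of /proc/cpuinfo-style text and reports how far apart the cores are. The function names (parse_bogomips, spread_ratio) are made up for illustration, and the sample text is just the two lines quoted above. It only quantifies the discrepancy; it says nothing about whether that is what causes the RCU stalls.

```python
import re

# Matches lines such as "bogomips        : 4219.49". Some architectures
# spell the key "BogoMIPS", so the match is case-insensitive.
BOGOMIPS_RE = re.compile(r"^bogomips\s*:\s*([0-9]+(?:\.[0-9]+)?)\s*$",
                         re.IGNORECASE | re.MULTILINE)


def parse_bogomips(cpuinfo_text):
    """Return the bogomips value of each CPU entry, in file order."""
    return [float(m.group(1)) for m in BOGOMIPS_RE.finditer(cpuinfo_text)]


def spread_ratio(values):
    """Ratio of the largest to the smallest value (1.0 = all cores agree)."""
    return max(values) / min(values)


# The two lines from the grep output above, used as sample input.
SAMPLE = """bogomips        : 4219.49
bogomips        : 184253.06
"""

if __name__ == "__main__":
    values = parse_bogomips(SAMPLE)
    for cpu, value in enumerate(values):
        print(f"cpu {cpu}: {value:.2f} bogomips")
    print(f"largest/smallest ratio: {spread_ratio(values):.1f}")
```

To run it against a live system, pass the contents of /proc/cpuinfo to parse_bogomips() instead of SAMPLE. On identical cores the ratio would normally be expected to sit close to 1.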

The kernel is a stock Fedora 4.18.7-100 release. The machine is an old Dell Experion that I've repurposed as a fileserver and PostgreSQL machine.

Other than "RTFM" or "please build a bunch of kernels from source on your slow machine, using differing config options, to help track down the cause of this", any thoughts about a solution?

