Date:   Sun, 30 Apr 2017 15:41:29 -0700
From:   "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:     Mike Galbraith <efault@....de>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...nel.org>, Ingo Molnar <mingo@...e.hu>,
        Thomas Gleixner <tglx@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        Frederic Weisbecker <fweisbec@...il.com>
Subject: Re: [patch] timer: Fix timers_update_migration(), and call it in
 tmigr_init()

On Sat, Apr 29, 2017 at 09:36:37PM -0700, Paul E. McKenney wrote:
> On Sun, Apr 30, 2017 at 06:20:15AM +0200, Mike Galbraith wrote:
> > On Sat, 2017-04-29 at 20:43 -0700, Paul E. McKenney wrote:
> > > On Sun, Apr 30, 2017 at 03:21:58AM +0200, Mike Galbraith wrote:
> > > > On Sat, 2017-04-29 at 14:45 -0700, Paul E. McKenney wrote:
> > > > > On Sat, Apr 29, 2017 at 08:20:33PM +0200, Mike Galbraith wrote:
> > > > > > On Sat, 2017-04-29 at 11:06 -0700, Paul E. McKenney wrote:
> > > > > > 
> > > > > > > If someone will either repost a fresh series or point me at exactly
> > > > > > > the set of patches to use, I will run it through rcutorture again.
> > > > > > 
> > > > > > Patchlet is against x86-tip/master.today.
> > > > > 
> > > > > So today's (as in Saturday April 29) x86-tip/master with the following
> > > > > patch applied?
> > > > 
> > > > Yeah.
> > > 
> > > OK, will fire it up once the current set of overnight tests complete.
> > 
> > I certainly don't want to discourage you from beating hell outta tip,
> > just want to make sure you know that I'm seeing zero RCU woes, only
> > late timer expiry (sharpening rocks/sticks to focus trace).
> 
> I got timer_migration splats from an earlier rcutorture run.  Please see
> message-ID <20170421192853.GD3956@...ux.vnet.ibm.com> on LKML on April
> 21st in reply to Thomas's V2 00/10 cover letter.  So I am curious to
> learn if your patches fix them.

And sadly, the splats are still there.  Please see the following for
the relevant console output and .config files:

http://www2.rdrop.com/users/paulmck/submission/TREE04.2017.04.30a.config
http://www2.rdrop.com/users/paulmck/submission/TREE04.2017.04.30a.console.log
http://www2.rdrop.com/users/paulmck/submission/TREE04.3.2017.04.30a.console.log

http://www2.rdrop.com/users/paulmck/submission/TREE07.2017.04.30a.config
http://www2.rdrop.com/users/paulmck/submission/TREE07.2.2017.04.30a.bzImage
http://www2.rdrop.com/users/paulmck/submission/TREE07.2017.04.30a.console.log
http://www2.rdrop.com/users/paulmck/submission/TREE07.2.2017.04.30a.console.log

Please let me know if you have any trouble accessing these.

Here is the first splat from the first TREE04 run:

[    3.310642] WARNING: CPU: 1 PID: 0 at /home/paulmck/public_git/timer-tip/kernel/time/timer_migration.c:387 tmigr_set_cpu_active+0xc6/0xe0
[    3.313210] Modules linked in:
[    3.313861] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.11.0-rc8+ #1
[    3.315196] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[    3.317196] task: ffff8c1fde9f0000 task.stack: ffff8e4fc0124000
[    3.318433] RIP: 0010:tmigr_set_cpu_active+0xc6/0xe0
[    3.319464] RSP: 0000:ffff8e4fc0127e90 EFLAGS: 00010046
[    3.320598] RAX: 0000000000000004 RBX: 0000000000000001 RCX: 000000000000001f
[    3.322146] RDX: 0000000000000001 RSI: ffff8c1fdfc54cc8 RDI: ffff8c1fdeb26f80
[    3.323652] RBP: ffff8e4fc0127ea8 R08: 0000000000000000 R09: 0000000000000008
[    3.325237] R10: ffff8e4fc0127e80 R11: 0000000000000400 R12: ffff8c1fdeb26f80
[    3.326699] R13: ffff8c1fdfc54cc8 R14: 0000000000000000 R15: 0000000000000000
[    3.328149] FS:  0000000000000000(0000) GS:ffff8c1fdfc40000(0000) knlGS:0000000000000000
[    3.329845] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    3.331078] CR2: ffff8e4fc02f0000 CR3: 0000000015e0a000 CR4: 00000000000006e0
[    3.332509] Call Trace:
[    3.333107]  tmigr_cpu_activate+0x36/0x40
[    3.333972]  tick_nohz_idle_exit+0xd1/0xf0
[    3.334845]  do_idle+0x113/0x170
[    3.335501]  cpu_startup_entry+0x18/0x20
[    3.336338]  start_secondary+0xe8/0xf0
[    3.337147]  secondary_startup_64+0x9f/0x9f
[    3.337998] Code: d0 48 8b 03 48 85 c0 75 eb eb a0 49 8b 7c 24 50 41 89 5c 24 08 48 85 ff 74 8c 49 8d 74 24 20 89 da e8 3f ff ff ff e9 7b ff ff ff <0f> ff 41 c6 04 24 00 5b 41 5c 41 5d 5d c3 66 90 66 2e 0f 1f 8

This is the first WARN_ON() in tmigr_set_cpu_active().  I got four splats
in 12 hours of running the rcutorture TREE04 test scenario, that is, three
runs of four hours each.

The TREE07 runs fared worse, with many more splats, starting with a
page fault.  The scripting claimed a hang, but that looks to have instead
been so many splats that the test failed to terminate itself in time.
I ran two TREE07 runs of four hours each.

							Thanx, Paul
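
For readers who want to tally splats in console logs like the ones linked above, the following is a minimal sketch and not part of the rcutorture scripting. It assumes each splat begins with a "WARNING: CPU: ... PID: ... at file:line func+off/len" header in the format shown in the TREE04 output. The names WARN_RE and find_splats are hypothetical, chosen only for illustration.

```python
import re
import sys
from collections import Counter

# Header line of a kernel WARN splat, in the format seen in the log above:
# "[    3.310642] WARNING: CPU: 1 PID: 0 at <file>:<line> <func>+0xc6/0xe0"
# (Assumption: other kernel versions may format this line differently.)
WARN_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+) (?P<func>[^\s+]+)\+"
)


def find_splats(lines):
    """Return one dict per WARNING header found in an iterable of log lines."""
    splats = []
    for text in lines:
        m = WARN_RE.search(text)
        if m:
            splats.append({
                "ts": float(m.group("ts")),
                "cpu": int(m.group("cpu")),
                "pid": int(m.group("pid")),
                "file": m.group("file"),
                "line": int(m.group("line")),
                "func": m.group("func"),
            })
    return splats


if __name__ == "__main__":
    # Usage: python count_splats.py TREE04.2017.04.30a.console.log
    with open(sys.argv[1], errors="replace") as log:
        found = find_splats(log)
    print(f"{len(found)} splat(s)")
    # Group by warning site so repeated WARN_ON() hits stand out.
    for (func, line), n in Counter((s["func"], s["line"]) for s in found).most_common():
        print(f"{n:4d}  {func} (source line {line})")
```

Run against each console log, this prints a total and a per-site breakdown. It only counts WARNING headers, so other failures such as the page fault seen in TREE07 would need their own pattern.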
