lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Wed, 26 Jun 2019 11:01:28 -0700
From:   Dave Hansen <dave.hansen@...el.com>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     Stephen Rothwell <sfr@...b.auug.org.au>,
        Linux Next Mailing List <linux-next@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: xarray soft lockups on 5.2-rc's

On 6/3/19 3:59 PM, Matthew Wilcox wrote:
>> I'm seeing a similar softlockup:
> I've taken that commit out of my xarray tree.  It's only a performance
> optimisation, so I don't mind dropping it until it's debugged.

I'm also seeing these on mainline pretty regularly (~once a week).  The
latest was on -rc4, but the xarray code doesn't look like it's been
updated in mainline since 5.1.

I only seem to see these for the swap cache path, never the page cache
path, which makes me suspicious (typical dump is below).  The stack is
always the same path, but the exact instruction pointer address seems to
bounce around between find_get_entry() and deeper into the xarray code.

find_get_entry() seems to have some recent changes, and this patch of
backtraces would be consistent with it getting stuck in a "goto repeat"
loop in there.

     12 RIP: 0010:xas_load+0x35/0x80
      9 RIP: 0010:xas_load+0x27/0x80
      7 RIP: 0010:xas_load+0x13/0x80
      4 RIP: 0010:xas_load+0x66/0x80
      4 RIP: 0010:xas_load+0x45/0x80
      2 RIP: 0010:xas_start+0x0/0x90
      2 RIP: 0010:xas_load+0x3f/0x80
      2 RIP: 0010:xas_load+0x31/0x80
      2 RIP: 0010:xas_load+0x20/0x80
      2 RIP: 0010:xas_load+0x0/0x80
      2 RIP: 0010:find_get_entry+0x87/0x1a0
      2 RIP: 0010:find_get_entry+0x74/0x1a0
      1 RIP: 0010:xas_start+0x57/0x90
      1 RIP: 0010:xas_start+0x42/0x90
      1 RIP: 0010:xas_load+0xb/0x80
      1 RIP: 0010:xas_load+0x72/0x80
      1 RIP: 0010:xas_load+0x5e/0x80
      1 RIP: 0010:xas_load+0x43/0x80
      1 RIP: 0010:xas_load+0x39/0x80
      1 RIP: 0010:xas_load+0x24/0x80
      1 RIP: 0010:find_get_entry+0x63/0x1a0
      1 RIP: 0010:dev_watchdog+0x273/0x280


> CPU: 2 PID: 1025 Comm: TaskSchedulerFo Not tainted 5.2.0-rc4 #15
> Hardware name: LENOVO 20F5S7V800/20F5S7V800, BIOS R02ET70W (1.43 ) 01/28/2019
> RIP: 0010:xas_start+0x0/0x90
> Code: 48 85 d2 75 e3 48 8b 17 b8 00 00 80 00 89 f1 d3 e0 8b 7a 18 85 c7 75 07 09 f8 89 42 18 f3 c3 f3 c3 f3 c3 0f 1f 80 00 00 00 00 <48> 8b 47 18 48 89 c2 83 e2 03 74 19 48 83 fa 02 75 26 48 3d 05 c0
> RSP: 0000:ffffc9000996fcf0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
> RAX: 0000000000000000 RBX: ffffc9000996fd08 RCX: 0000000000000000 
> RDX: 0000000000000021 RSI: ffff888168184490 RDI: ffffc9000996fd08
> RBP: 000000000027db21 R08: 000fffffffe00000 R09: ffff8884216fa5c0 
> R10: ffff8884216fa000 R11: 0000000000000822 R12: 0000000000000000
> R13: ffff888409fc9018 R14: 000000000027db21 R15: 0000000000000001 
> FS:  00007f4f44d01700(0000) GS:ffff888411900000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f4f4cb21048 CR3: 000000029751a003 CR4: 00000000003626e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  xas_load+0x9/0x80
>  find_get_entry+0x74/0x1a0
>  pagecache_get_page+0x27/0x250
>  lookup_swap_cache+0x40/0x130
>  do_swap_page+0x84/0x860
>  __handle_mm_fault+0x865/0xfb0
>  handle_mm_fault+0x103/0x210
>  __do_page_fault+0x2f5/0x530
>  ? page_fault+0x8/0x30
>  page_fault+0x1e/0x30
> RIP: 0033:0x7f4ff44cb836
> Code: 39 f7 0f 84 48 fd ff ff 44 89 ca 83 4b 04 04 e9 f3 fe ff ff 48 8b 53 68 49 8d 04 2f 4c 39 62 18 75 30 48 81 3c 24 ff 03 00 00 <4c> 89 60 18 48 89 50 10 48 89 43 68 48 89 42 18 0f 86 55 fe ff ff
> RSP: 002b:00007f4f44cffa50 EFLAGS: 00010212 
> RAX: 00007f4f4cb21030 RBX: 00007f4f4c000020 RCX: 00007f4f4c75c5f0
> RDX: 00007f4f4c000078 RSI: 00000000000030f1 RDI: 00000000007117f0 
> RBP: 00000000001eda40 R08: 00007f4f4c350370 R09: 0000000004422604
> R10: 0000000000184cb5 R11: 0000000000000000 R12: 00007f4f4c000078
> R13: 00007f4f4c75c5f0 R14: 00000000008ff230 R15: 00007f4f4c9335f0
> watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [TaskSchedulerFo:1025]

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ