Message-ID: <20160215213526.GA9766@node.shutemov.name>
Date:	Mon, 15 Feb 2016 23:35:26 +0200
From:	"Kirill A. Shutemov" <kirill@...temov.name>
To:	Gerald Schaefer <gerald.schaefer@...ibm.com>
Cc:	Sebastian Ott <sebott@...ux.vnet.ibm.com>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Christian Borntraeger <borntraeger@...ibm.com>,
	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	"Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Michael Ellerman <mpe@...erman.id.au>,
	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	Paul Mackerras <paulus@...ba.org>,
	linuxppc-dev@...ts.ozlabs.org,
	Catalin Marinas <catalin.marinas@....com>,
	Will Deacon <will.deacon@....com>,
	linux-arm-kernel@...ts.infradead.org,
	Martin Schwidefsky <schwidefsky@...ibm.com>,
	Heiko Carstens <heiko.carstens@...ibm.com>,
	linux-s390@...r.kernel.org
Subject: Re: [BUG] random kernel crashes after THP rework on s390 (maybe also
 on PowerPC and ARM)

On Mon, Feb 15, 2016 at 07:37:02PM +0100, Gerald Schaefer wrote:
> On Mon, 15 Feb 2016 13:31:59 +0200
> "Kirill A. Shutemov" <kirill@...temov.name> wrote:
> 
> > On Sat, Feb 13, 2016 at 12:58:31PM +0100, Sebastian Ott wrote:
> > > 
> > > On Sat, 13 Feb 2016, Kirill A. Shutemov wrote:
> > > > Could you check if revert of fecffad25458 helps?
> > > 
> > > I reverted fecffad25458 on top of 721675fcf277cf - it oopsed with:
> > > 
> > > [ 1851.721062] Unable to handle kernel pointer dereference in virtual kernel address space
> > > [ 1851.721075] failing address: 0000000000000000 TEID: 0000000000000483
> > > [ 1851.721078] Fault in home space mode while using kernel ASCE.
> > > [ 1851.721085] AS:0000000000d5c007 R3:00000000ffff0007 S:00000000ffffa800 P:000000000000003d
> > > [ 1851.721128] Oops: 0004 ilc:3 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> > > [ 1851.721135] Modules linked in: bridge stp llc btrfs mlx4_ib mlx4_en ib_sa ib_mad vxlan xor ip6_udp_tunnel ib_core udp_tunnel ptp pps_core ib_addr ghash_s390 raid6_pq prng ecb aes_s390 mlx4_core des_s390 des_generic genwqe_card sha512_s390 sha256_s390 sha1_s390 sha_common crc_itu_t dm_mod scm_block vhost_net tun vhost eadm_sch macvtap macvlan kvm autofs4
> > > [ 1851.721183] CPU: 7 PID: 256422 Comm: bash Not tainted 4.5.0-rc3-00058-g07923d7-dirty #178
> > > [ 1851.721186] task: 000000007fbfd290 ti: 000000008c604000 task.ti: 000000008c604000
> > > [ 1851.721189] Krnl PSW : 0704d00180000000 000000000045d3b8 (__rb_erase_color+0x280/0x308)
> > > [ 1851.721200]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 EA:3
> > >                Krnl GPRS: 0000000000000001 0000000000000020 0000000000000000 00000000bd07eff1
> > > [ 1851.721205]            000000000027ca10 0000000000000000 0000000083e45898 0000000077b61198
> > > [ 1851.721207]            000000007ce1a490 00000000bd07eff0 000000007ce1a548 000000000027ca10
> > > [ 1851.721210]            00000000bd07c350 00000000bd07eff0 000000008c607aa8 000000008c607a68
> > > [ 1851.721221] Krnl Code: 000000000045d3aa: e3c0d0080024       stg     %r12,8(%r13)
> > >                           000000000045d3b0: b9040039           lgr     %r3,%r9
> > >                          #000000000045d3b4: a53b0001           oill    %r3,1
> > >                          >000000000045d3b8: e33010000024       stg     %r3,0(%r1)
> > >                           000000000045d3be: ec28000e007c       cgij    %r2,0,8,45d3da
> > >                           000000000045d3c4: e34020000004       lg      %r4,0(%r2)
> > >                           000000000045d3ca: b904001c           lgr     %r1,%r12
> > >                           000000000045d3ce: ec143f3f0056       rosbg   %r1,%r4,63,63,0
> > > [ 1851.721269] Call Trace:
> > > [ 1851.721273] ([<0000000083e45898>] 0x83e45898)
> > > [ 1851.721279]  [<000000000029342a>] unlink_anon_vmas+0x9a/0x1d8
> > > [ 1851.721282]  [<0000000000283f34>] free_pgtables+0xcc/0x148
> > > [ 1851.721285]  [<000000000028c376>] exit_mmap+0xd6/0x300
> > > [ 1851.721289]  [<0000000000134db8>] mmput+0x90/0x118
> > > [ 1851.721294]  [<00000000002d76bc>] flush_old_exec+0x5d4/0x700
> > > [ 1851.721298]  [<00000000003369f4>] load_elf_binary+0x2f4/0x13e8
> > > [ 1851.721301]  [<00000000002d6e4a>] search_binary_handler+0x9a/0x1f8
> > > [ 1851.721304]  [<00000000002d8970>] do_execveat_common.isra.32+0x668/0x9a0
> > > [ 1851.721307]  [<00000000002d8cec>] do_execve+0x44/0x58
> > > [ 1851.721310]  [<00000000002d8f92>] SyS_execve+0x3a/0x48
> > > [ 1851.721315]  [<00000000006fb096>] system_call+0xd6/0x258
> > > [ 1851.721317]  [<000003ff997436d6>] 0x3ff997436d6
> > > [ 1851.721319] INFO: lockdep is turned off.
> > > [ 1851.721321] Last Breaking-Event-Address:
> > > [ 1851.721323]  [<000000000045d31a>] __rb_erase_color+0x1e2/0x308
> > > [ 1851.721327]
> > > [ 1851.721329] ---[ end trace 0d80041ac00cfae2 ]---
> > > 
> > > 
> > > > 
> > > > And could you share what the crashes look like? I haven't seen backtraces yet.
> > > > 
> > > 
> > > Sure. I didn't because they really looked random to me. Most of the time
> > > in rcu or list debugging but I thought these have just been the messenger
> > > observing a corruption first. Anyhow, here is an older one that might look
> > > interesting:
> > > 
> > > [   59.851421] list_del corruption. next->prev should be 000000006e1eb000, but was 0000000000000400
> > 
> > This is kinda interesting: 0x400 is TAIL_MAPPING.. Hm..
> > 
> > Could you check if you see the problem on commit 1c290f642101 and its
> > immediate parent?
> > 
> 
> How should the page->mapping poison end up as next->prev in the list of
> pre-allocated THP splitting page tables?

Maybe the pgtable was cast to struct page or something. I don't know.

> Also, commit 1c290f642101 is before the THP rework, at least the
> non-bisectable part, so we should expect not to see the problem there.

Just to make sure: commit 122afea9626a is fine, commit 61f5d698cc97
crashes. Correct?

> 0x400 is also the value of an empty pte on s390, and the thp_deposit/withdraw
> listheads are placed inside the pre-allocated pagetables instead of page->lru,
> because we have 2K pagetables on s390 and cannot use struct page == pgtable_t.

0x400 from an empty pte makes more sense than TAIL_MAPPING. But I guess it's
worth changing TAIL_MAPPING to some other value to make sure.

> So, for example, two concurrent withdraws could produce such a list
> corruption, because the first withdraw will overwrite the listhead at the
> beginning of the pagetable with 2 empty ptes.
> 
> Has anything changed regarding the general THP deposit/withdraw logic?

I don't see any changes in this area.
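
To make the scenario above concrete, here is a rough user-space sketch of
how a listhead stored in the first two entries of a page table could end
up reading 0x400 as next->prev. It is not the s390 code: struct pgtable,
deposit(), reset_listhead_slots() and simulate_double_withdraw() are
hypothetical names, the race is modelled single-threaded, and only the
0x400 empty pte value and the 2K table size come from this thread.

```c
#include <assert.h>
#include <stdint.h>

#define PTRS_PER_PTE 256                 /* 2K table of 8-byte entries (64-bit assumed) */
#define EMPTY_PTE    ((uintptr_t)0x400)  /* empty pte value on s390, per the thread */

/* Hypothetical model: while a table is deposited, pte[0] holds the list
 * "next" pointer and pte[1] the "prev" pointer. */
struct pgtable {
	uintptr_t pte[PTRS_PER_PTE];
};

/* Initialise a two-slot list head so that it points at itself. */
static void list_init(uintptr_t *head)
{
	head[0] = head[1] = (uintptr_t)head;
}

static void pgtable_init(struct pgtable *pt)
{
	for (int i = 0; i < PTRS_PER_PTE; i++)
		pt->pte[i] = EMPTY_PTE;
}

/* Link the table right after head, reusing its first two entries. */
static void deposit(uintptr_t *head, struct pgtable *pt)
{
	uintptr_t *next = (uintptr_t *)head[0];

	assert(pt->pte[0] == EMPTY_PTE);  /* table must not be deposited yet */
	pt->pte[0] = (uintptr_t)next;     /* new->next = next */
	pt->pte[1] = (uintptr_t)head;     /* new->prev = head */
	next[1] = (uintptr_t)pt->pte;     /* next->prev = new */
	head[0] = (uintptr_t)pt->pte;     /* head->next = new */
}

/* Final step of a withdraw in this model: the two entries that held the
 * listhead are turned back into empty ptes. */
static void reset_listhead_slots(struct pgtable *pt)
{
	pt->pte[0] = pt->pte[1] = EMPTY_PTE;
}

/* The value a list_del sanity check would compare against the entry. */
static uintptr_t next_prev_of(const struct pgtable *pt)
{
	const uintptr_t *next = (const uintptr_t *)pt->pte[0];

	return next[1];
}

/* One withdraw has already reset table a while table b still links to it;
 * a second withdraw looking at b then sees next->prev == 0x400. */
uintptr_t simulate_double_withdraw(void)
{
	static struct pgtable a, b;
	uintptr_t head[2];

	list_init(head);
	pgtable_init(&a);
	pgtable_init(&b);
	deposit(head, &a);
	deposit(head, &b);              /* list: head -> b -> a -> head */

	reset_listhead_slots(&a);       /* first withdraw clobbers a's listhead */
	return next_prev_of(&b);        /* second withdraw checks b->next->prev */
}
```

In this model simulate_double_withdraw() returns 0x400 where the address
of b's first entry was expected, which has the same shape as the list_del
report quoted earlier. Whether the real crash follows this path is still
an open question.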

To eliminate one more variable, I would propose disabling the split pmd lock
for testing and checking whether it makes a difference.

Is there any chance that I'll be able to trigger the bug using QEMU?
Does anybody have a QEMU image I can use?

-- 
 Kirill A. Shutemov
