Message-Id: <1530176707.24245.12.camel@abdul.in.ibm.com>
Date: Thu, 28 Jun 2018 14:35:07 +0530
From: Abdul Haleem <abdhalee@...ux.vnet.ibm.com>
To: Michael Ellerman <mpe@...erman.id.au>
Cc: linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
linux-next <linux-next@...r.kernel.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
linux-scsi <linux-scsi@...r.kernel.org>,
Stephen Rothwell <sfr@...b.auug.org.au>,
sachinp <sachinp@...ux.vnet.ibm.com>,
sim <sim@...ux.vnet.ibm.com>,
manvanth <manvanth@...ux.vnet.ibm.com>,
Brian King <brking@...ux.vnet.ibm.com>
Subject: Re: [next-20180601][nvme][ppc] Kernel Oops is triggered when
creating lvm snapshots on nvme disks
On Tue, 2018-06-26 at 23:36 +1000, Michael Ellerman wrote:
> Abdul Haleem <abdhalee@...ux.vnet.ibm.com> writes:
>
> > Greetings,
> >
> > Kernel Oops is seen on 4.17.0-rc7-next-20180601 kernel on a bare-metal
> > machine when running lvm snapshot tests on nvme disks.
> >
> > Machine Type: Power 8 bare-metal
> > kernel : 4.17.0-rc7-next-20180601
> > test:
> > $ pvcreate -y /dev/nvme0n1
> > $ vgcreate avocado_vg /dev/nvme0n1
> > $ lvcreate --size 1.4T --name avocado_lv avocado_vg -y
> > $ mkfs.ext2 /dev/avocado_vg/avocado_lv
> > $ lvcreate --size 1G --snapshot --name avocado_sn /dev/avocado_vg/avocado_lv -y
> > $ lvconvert --merge /dev/avocado_vg/avocado_sn
>
> > the last command results in Oops:
> >
> > Unable to handle kernel paging request for data at address 0x000000d0
> > Faulting instruction address: 0xc0000000002dced4
> > Oops: Kernel access of bad area, sig: 11 [#1]
> > LE SMP NR_CPUS=2048 NUMA PowerNV
> > Dumping ftrace buffer:
> > (ftrace buffer empty)
> > Modules linked in: dm_snapshot dm_bufio nvme bnx2x iptable_mangle
> > ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4
> > nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4
> > xt_tcpudp tun bridge stp llc iptable_filter dm_mirror dm_region_hash
> > dm_log dm_service_time vmx_crypto powernv_rng rng_core dm_multipath
> > kvm_hv binfmt_misc kvm nfsd ip_tables x_tables autofs4 xfs lpfc
> > crc_t10dif crct10dif_generic mdio nvme_fc libcrc32c nvme_fabrics
> > nvme_core crct10dif_common [last unloaded: nvme]
> > CPU: 70 PID: 157763 Comm: lvconvert Not tainted 4.17.0-rc7-next-20180601-autotest-autotest #1
> > NIP: c0000000002dced4 LR: c000000000244d14 CTR: c000000000244cf0
> > REGS: c000001f81d6b5a0 TRAP: 0300 Not tainted (4.17.0-rc7-next-20180601-autotest-autotest)
> > MSR: 900000010280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR: 22442444 XER: 20000000
> > CFAR: c000000000008934 DAR: 00000000000000d0 DSISR: 40000000 SOFTE: 0
> > GPR00: c000000000244d14 c000001f81d6b820 c00000000109c400 c000003c9d080180
> > GPR04: 0000000000000001 c000001fad510000 c000001fad510000 0000000000000001
> > GPR08: 0000000000000000 f000000000000000 f000000000000008 0000000000000000
> > GPR12: c000000000244cf0 c000001ffffc4f80 00007fffa0e31090 00007fffd9d9b470
> > GPR16: 0000000000000000 000000000000005c 00007fffa0e3a5b0 00007fffa0e62040
> > GPR20: 0000010014ad7d50 0000010014ad7d20 00007fffa0e64210 0000000000000001
> > GPR24: 0000000000000000 c00000000081bae0 c000001ed2461b00 d00000000f859d08
> > GPR28: c000003c9d080180 c000000000244d14 0000000000000001 0000000000000000
> > NIP [c0000000002dced4] kmem_cache_free+0x1a4/0x2b0
> > LR [c000000000244d14] mempool_free_slab+0x24/0x40
>
> Are you running with slub debugging enabled?
> Try booting with slub_debug=FZP
I was able to reproduce the Oops again with slub_debug=FZP and DEBUG_INFO
enabled on 4.17.0-rc7-next-20180601, but it produced no traces beyond the
Oops stack trace.
$ cat /proc/cmdline
rw,slub_debug=FZP root=UUID=e62c58bb-2824-4075-a31d-455f1bb62504
.config
CONFIG_SLUB_DEBUG=y
CONFIG_SLUB=y
CONFIG_SLUB_CPU_PARTIAL=y
CONFIG_SLUB_DEBUG_ON=y
CONFIG_SLUB_STATS=y
The faulting instruction points to the code path below:
$ gdb -batch vmlinux -ex 'list *(0xc000000000304fe0)'
0xc000000000304fe0 is in kmem_cache_free (mm/slab.h:231).
226 }
227
228 static inline bool slab_equal_or_root(struct kmem_cache *s,
229 struct kmem_cache *p)
230 {
231 return p == s || p == s->memcg_params.root_cache;
232 }
233
234 /*
235 * We use suffixes to the name in memcg because we can't have caches
Detailed dmesg logs are attached.
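For anyone else trying to hit this, the quoted reproduction steps can be
wrapped in a small script. This is a sketch, not a tested tool: it defaults
to a dry run that only prints the commands, and the device path is the one
from this report -- running it for real destroys all data on $DEV.

```shell
#!/bin/sh
# Reproducer sketch for the lvm-snapshot-on-nvme Oops.
# DRY_RUN=1 (the default) only echoes the commands; set DRY_RUN=0 to run.
set -eu

DEV=${DEV:-/dev/nvme0n1}	# assumed device, as in the report
VG=avocado_vg
LV=avocado_lv
SNAP=avocado_sn

run() {
	echo "+ $*"
	if [ "${DRY_RUN:-1}" = 0 ]; then
		"$@"
	fi
}

run pvcreate -y "$DEV"
run vgcreate "$VG" "$DEV"
run lvcreate --size 1.4T --name "$LV" "$VG" -y
run mkfs.ext2 "/dev/$VG/$LV"
run lvcreate --size 1G --snapshot --name "$SNAP" "/dev/$VG/$LV" -y
run lvconvert --merge "/dev/$VG/$SNAP"	# step that triggered the Oops
```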
--
Regards,
Abdul Haleem
IBM Linux Technology Centre
View attachment "dmesg-slubon.txt" of type "text/plain" (93399 bytes)