Message-ID: <1493652871.30303.15.camel@hpe.com>
Date: Mon, 1 May 2017 15:34:32 +0000
From: "Kani, Toshimitsu" <toshi.kani@....com>
To: "dan.j.williams@...el.com" <dan.j.williams@...el.com>,
"linux-nvdimm@...ts.01.org" <linux-nvdimm@...ts.01.org>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"dave.jiang@...el.com" <dave.jiang@...el.com>,
"vishal.l.verma@...el.com" <vishal.l.verma@...el.com>
Subject: Re: [PATCH] libnvdimm: rework region badblocks clearing
On Sun, 2017-04-30 at 05:39 -0700, Dan Williams wrote:
> Toshi noticed that the new support for a region-level badblocks
> missed the case where errors are cleared due to BTT I/O.
>
> An initial attempt to fix this ran into a "sleeping while atomic"
> warning due to taking the nvdimm_bus_lock() in the BTT I/O path to
> satisfy the locking requirements of __nvdimm_bus_badblocks_clear().
> However, that lock is not needed since we are not acting on any data
> that is subject to change due to a change of state of the bus /
> region. The badblocks instance has its own internal lock to handle
> mutations of the error list.
>
> So, to make it clear that we are just acting on region devices and
> don't need the lock, rename __nvdimm_bus_badblocks_clear() to
> nvdimm_clear_badblocks_regions(). Eliminate the lock and consolidate
> all routines in drivers/nvdimm/bus.c. Also, make some cleanups to
> remove unnecessary casts, make the calling convention of
> nvdimm_clear_badblocks_regions() clearer by replacing struct resource
> with the minimal struct clear_badblocks_context, and use the
> DEVICE_ATTR macro.
Hi Dan,
I was testing the change with CONFIG_DEBUG_ATOMIC_SLEEP set this time,
and hit the following BUG with BTT. This is a separate issue (not
introduced by this patch), but it shows that we have an issue with the
DSM call path as well.
[ 1279.712933] nfit ACPI0012:00: acpi_nfit_ctl:bus cmd: 1: func: 1 input length: 16
[ 1279.721111] nvdimm in 00000000: 60000000 00000002 00001000 00000000 ...`............
[ 1279.729799] BUG: sleeping function called from invalid context at mm/slab.h:432
[ 1279.738005] in_atomic(): 1, irqs_disabled(): 0, pid: 13353, name: dd
[ 1279.745187] INFO: lockdep is turned off.
:
[ 1279.767908] Call Trace:
[ 1279.771116] dump_stack+0x86/0xc3
[ 1279.775201] ___might_sleep+0x17d/0x250
[ 1279.779808] __might_sleep+0x4a/0x80
[ 1279.784214] __kmalloc+0x1c0/0x2e0
[ 1279.788388] acpi_os_allocate_zeroed+0x2d/0x2f
[ 1279.793604] acpi_evaluate_object+0x59/0x3b1
[ 1279.798640] acpi_evaluate_dsm+0xbd/0x10c
[ 1279.803458] acpi_nfit_ctl+0x1ef/0x7c0 [nfit]
[ 1279.808584] ? nsio_rw_bytes+0x152/0x280
[ 1279.813258] nvdimm_clear_poison+0x77/0x140
[ 1279.818193] nsio_rw_bytes+0x18f/0x280
[ 1279.822684] btt_write_pg+0x1d4/0x3d0 [nd_btt]
[ 1279.827869] btt_make_request+0x119/0x2d0 [nd_btt]
[ 1279.833398] ? generic_make_request+0xef/0x3b0
[ 1279.838575] generic_make_request+0x122/0x3b0
[ 1279.843661] ? iov_iter_get_pages+0xbd/0x380
[ 1279.848666] submit_bio+0x73/0x150
[ 1279.852801] ? bio_iov_iter_get_pages+0xd7/0x120
[ 1279.858166] ? __blkdev_direct_IO_simple+0x17b/0x340
[ 1279.863877] __blkdev_direct_IO_simple+0x177/0x340
[ 1279.869453] ? bdput+0x20/0x20
[ 1279.873231] blkdev_direct_IO+0x3b1/0x3c0
[ 1279.877963] ? current_time+0x18/0x70
[ 1279.882344] generic_file_direct_write+0xba/0x180
[ 1279.887765] __generic_file_write_iter+0xc0/0x1c0
[ 1279.893185] ? __clear_user+0x23/0x70
[ 1279.897550] blkdev_write_iter+0x8b/0x100
[ 1279.902258] ? __might_sleep+0x4a/0x80
[ 1279.906699] __vfs_write+0xe8/0x160
[ 1279.910876] vfs_write+0xcb/0x1f0
[ 1279.914867] SyS_write+0x58/0xc0
[ 1279.918773] do_syscall_64+0x6c/0x1f0
[ 1279.923120] entry_SYSCALL64_slow_path+0x25/0x25
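The pattern in the trace (an allocation that may sleep, reached from
a context where in_atomic() is 1) can be modeled in user space. What
follows is only a rough sketch under stated assumptions: every model_*
name is hypothetical and none of it is kernel API, the trace does not
say what made the context atomic so the sketch simply assumes some
atomic section around the write, and the "deferred" variant is one
possible shape, not a proposed fix.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins; none of these are kernel APIs. */
static int atomic_depth;     /* > 0 models "in atomic context" */
static int sleep_violations; /* sleeping calls made while atomic */

static void model_atomic_enter(void)
{
	atomic_depth++;
}

static void model_atomic_exit(void)
{
	assert(atomic_depth > 0);
	atomic_depth--;
}

/* Models a might_sleep()-style check: count and report a violation. */
static void model_might_sleep(const char *who)
{
	if (atomic_depth > 0) {
		sleep_violations++;
		fprintf(stderr, "model BUG: %s may sleep in atomic context\n",
			who);
	}
}

/* Models an allocation that is allowed to sleep. */
static void *model_alloc_may_sleep(size_t n)
{
	model_might_sleep("model_alloc_may_sleep");
	return calloc(1, n);
}

/* Models a firmware call path that allocates, and so may sleep. */
static int model_clear_poison(void)
{
	void *buf = model_alloc_may_sleep(16);

	if (!buf)
		return -1;
	free(buf);
	return 0;
}

/* Assumed shape of the problem: clear poison inside an atomic section. */
static int model_io_write_atomic(void)
{
	int rc;

	model_atomic_enter();
	rc = model_clear_poison();
	model_atomic_exit();
	return rc;
}

/* One possible alternative: leave the atomic section before clearing. */
static int model_io_write_deferred(void)
{
	model_atomic_enter();
	/* the non-sleeping part of the write would go here */
	model_atomic_exit();
	return model_clear_poison();
}
```

In this model, one call to model_io_write_atomic() raises
sleep_violations from 0 to 1, while model_io_write_deferred() leaves
the counter unchanged.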
Thanks,
-Toshi