Message-ID: <1493652871.30303.15.camel@hpe.com>
Date:   Mon, 1 May 2017 15:34:32 +0000
From:   "Kani, Toshimitsu" <toshi.kani@....com>
To:     "dan.j.williams@...el.com" <dan.j.williams@...el.com>,
        "linux-nvdimm@...ts.01.org" <linux-nvdimm@...ts.01.org>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "dave.jiang@...el.com" <dave.jiang@...el.com>,
        "vishal.l.verma@...el.com" <vishal.l.verma@...el.com>
Subject: Re: [PATCH] libnvdimm: rework region badblocks clearing

On Sun, 2017-04-30 at 05:39 -0700, Dan Williams wrote:
> Toshi noticed that the new support for a region-level badblocks
> missed the case where errors are cleared due to BTT I/O.
> 
> An initial attempt to fix this ran into a "sleeping while atomic"
> warning due to taking the nvdimm_bus_lock() in the BTT I/O path to
> satisfy the locking requirements of __nvdimm_bus_badblocks_clear().
> However, that lock is not needed since we are not acting on any data
> that is subject to change due to a change of state of the bus /
> region. The badblocks instance has its own internal lock to handle
> mutations of the error list.
> 
> So, to make it clear that we are just acting on region devices and
> don't need the lock, rename __nvdimm_bus_badblocks_clear() to
> nvdimm_clear_badblocks_regions(). Eliminate the lock and consolidate
> all routines in drivers/nvdimm/bus.c. Also, make some cleanups to
> remove unnecessary casts, make the calling convention of
> nvdimm_clear_badblocks_regions() clearer by replacing struct resource
> with the minimal struct clear_badblocks_context, and use the
> DEVICE_ATTR macro.

Hi Dan,

I was testing the change with CONFIG_DEBUG_ATOMIC_SLEEP set this time,
and hit the following BUG with BTT.  This is a separate issue (not
introduced by this patch), but it shows that we have an issue with the
DSM call path as well.

[ 1279.712933] nfit ACPI0012:00: acpi_nfit_ctl:bus cmd: 1: func: 1 input length: 16
[ 1279.721111] nvdimm in  00000000: 60000000 00000002 00001000 00000000  ...`............
[ 1279.729799] BUG: sleeping function called from invalid context at mm/slab.h:432
[ 1279.738005] in_atomic(): 1, irqs_disabled(): 0, pid: 13353, name: dd
[ 1279.745187] INFO: lockdep is turned off.
 :
[ 1279.767908] Call Trace:
[ 1279.771116]  dump_stack+0x86/0xc3
[ 1279.775201]  ___might_sleep+0x17d/0x250
[ 1279.779808]  __might_sleep+0x4a/0x80
[ 1279.784214]  __kmalloc+0x1c0/0x2e0
[ 1279.788388]  acpi_os_allocate_zeroed+0x2d/0x2f
[ 1279.793604]  acpi_evaluate_object+0x59/0x3b1
[ 1279.798640]  acpi_evaluate_dsm+0xbd/0x10c
[ 1279.803458]  acpi_nfit_ctl+0x1ef/0x7c0 [nfit]
[ 1279.808584]  ? nsio_rw_bytes+0x152/0x280
[ 1279.813258]  nvdimm_clear_poison+0x77/0x140
[ 1279.818193]  nsio_rw_bytes+0x18f/0x280
[ 1279.822684]  btt_write_pg+0x1d4/0x3d0 [nd_btt]
[ 1279.827869]  btt_make_request+0x119/0x2d0 [nd_btt]
[ 1279.833398]  ? generic_make_request+0xef/0x3b0
[ 1279.838575]  generic_make_request+0x122/0x3b0
[ 1279.843661]  ? iov_iter_get_pages+0xbd/0x380
[ 1279.848666]  submit_bio+0x73/0x150
[ 1279.852801]  ? bio_iov_iter_get_pages+0xd7/0x120
[ 1279.858166]  ? __blkdev_direct_IO_simple+0x17b/0x340
[ 1279.863877]  __blkdev_direct_IO_simple+0x177/0x340
[ 1279.869453]  ? bdput+0x20/0x20
[ 1279.873231]  blkdev_direct_IO+0x3b1/0x3c0
[ 1279.877963]  ? current_time+0x18/0x70
[ 1279.882344]  generic_file_direct_write+0xba/0x180
[ 1279.887765]  __generic_file_write_iter+0xc0/0x1c0
[ 1279.893185]  ? __clear_user+0x23/0x70
[ 1279.897550]  blkdev_write_iter+0x8b/0x100
[ 1279.902258]  ? __might_sleep+0x4a/0x80
[ 1279.906699]  __vfs_write+0xe8/0x160
[ 1279.910876]  vfs_write+0xcb/0x1f0
[ 1279.914867]  SyS_write+0x58/0xc0
[ 1279.918773]  do_syscall_64+0x6c/0x1f0
[ 1279.923120]  entry_SYSCALL64_slow_path+0x25/0x25
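
For anyone less familiar with this class of warning, here is a rough
userspace sketch of the shape of the problem in the trace above: an
allocation that may sleep (__kmalloc, reached via acpi_evaluate_dsm) runs
while in_atomic() is 1 in the BTT write path. This is only a model, not
kernel code. Every model_* name is hypothetical, and the report does not
identify which lock or preempt-disabled section makes the BTT path atomic,
so the sketch simply assumes some atomic section around the call.

```c
#include <stdio.h>
#include <stdlib.h>

/* Userspace model only: none of these are the kernel's real definitions. */
static int atomic_depth;	/* stands in for the preempt count */
static int sleep_violations;	/* times a sleeper ran while atomic */

static void model_enter_atomic(void) { atomic_depth++; }
static void model_exit_atomic(void) { atomic_depth--; }

/* Stand-in for might_sleep(): report the violation instead of crashing. */
static void model_might_sleep(const char *caller)
{
	if (atomic_depth > 0) {
		sleep_violations++;
		fprintf(stderr, "BUG (model): %s may sleep, atomic_depth=%d\n",
			caller, atomic_depth);
	}
}

/* Stand-in for an allocation that may sleep, like __kmalloc in the trace. */
static void *model_sleeping_alloc(size_t size)
{
	model_might_sleep("model_sleeping_alloc");
	return calloc(1, size);
}

/* Stand-in for the DSM call path: it allocates, so it may sleep. */
static int model_clear_poison(void)
{
	void *buf = model_sleeping_alloc(16);

	if (!buf)
		return -1;
	free(buf);
	return 0;
}

/* Stand-in for the BTT write path: reaches the DSM path while atomic. */
static int model_btt_write(void)
{
	int rc;

	model_enter_atomic();	/* assumed atomic section, see note above */
	rc = model_clear_poison();
	model_exit_atomic();
	return rc;
}
```

Calling model_clear_poison() on its own leaves sleep_violations at 0,
while calling it through model_btt_write() bumps the counter, which is
the same pattern as the BUG reported above.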

Thanks,
-Toshi
