Date:	Mon, 31 Oct 2011 10:05:29 +0530
From:	Tiju Jacob <jacobtiju@...il.com>
To:	Shaohua Li <shaohua.li@...el.com>
Cc:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: Multi-partition block layer behaviour

On Thu, Oct 27, 2011 at 6:12 AM, Shaohua Li <shaohua.li@...el.com> wrote:
> On Wed, 2011-10-26 at 18:10 +0800, Tiju Jacob wrote:
>> >> 1. When an I/O request is made to the filesystem, process 'A' acquires
>> >> a mutex FS lock and a mutex block driver lock.
>> >>
>> >> 2. Process 'B' tries to acquire the mutex FS lock, which is not
>> >> available. Hence, it goes to sleep. Due to the new plugging mechanism,
>> >> before going to sleep, schedule() is invoked, which disables preemption
>> >> and the context becomes atomic. In schedule(), the newly added
>> >> blk_flush_plug_list() is invoked which unplugs the block driver.
>> >>
>> >> 3. During the unplug operation the block driver tries to acquire the mutex
>> >> lock, which fails because the lock is held by process 'A'. The previous
>> >> invocation of schedule() in step 2 has already made the context
>> >> atomic, hence the error "scheduling while atomic" occurred.
>> > If blk_flush_plug_list() is called in schedule(), it will use
>> > blk_run_queue_async to unplug the queue. This runs in a workqueue. So
>> > how could this happen?
>> >
>>
>> The call stack goes as follows:
>>
>> schedule() calls blk_schedule_flush_plug(), and
>> blk_flush_plug_list() gets invoked.
>>
>> In blk_flush_plug_list(), queue_unplugged() does not get invoked. Hence
>> blk_run_queue_async is not called.
>> Instead, __elv_add_request() is invoked with the ELEVATOR_INSERT_SORT_MERGE
>> flag, and the flag gets reassigned to ELEVATOR_INSERT_BACK.
>>
>> In the ELEVATOR_INSERT_BACK case, __blk_run_queue() gets invoked and calls request_fn().

> This doesn't make sense. Why is the flag changed from
> ELEVATOR_INSERT_SORT_MERGE to ELEVATOR_INSERT_BACK?

In __elv_add_request(), "where" gets reassigned as follows:

	} else if (!(rq->cmd_flags & REQ_ELVPRIV) &&
		    (where == ELEVATOR_INSERT_SORT ||
		     where == ELEVATOR_INSERT_SORT_MERGE))
		where = ELEVATOR_INSERT_BACK;
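
To make the effect of that branch concrete, here is a minimal user-space
sketch of it. It is only a model: the model_* names, the enum values and the
flag bit are placeholders chosen for illustration, not the kernel's
definitions, and only the branch quoted above is reproduced.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder values for illustration only; the real definitions live in
 * the kernel headers and are not reproduced here. */
enum model_where {
	MODEL_INSERT_FRONT = 1,
	MODEL_INSERT_BACK,
	MODEL_INSERT_SORT,
	MODEL_INSERT_SORT_MERGE,
};

#define MODEL_REQ_ELVPRIV (1u << 0)	/* stand-in for REQ_ELVPRIV */

struct model_request {
	unsigned int cmd_flags;
};

/* Mirrors only the quoted branch: a request without the ELVPRIV flag that
 * was queued for SORT or SORT_MERGE is redirected to INSERT_BACK. */
enum model_where model_effective_where(const struct model_request *rq,
				       enum model_where where)
{
	assert(rq != NULL);
	if (!(rq->cmd_flags & MODEL_REQ_ELVPRIV) &&
	    (where == MODEL_INSERT_SORT ||
	     where == MODEL_INSERT_SORT_MERGE))
		where = MODEL_INSERT_BACK;
	return where;
}
```

With this model, a request whose cmd_flags is 0 and that is passed in with
MODEL_INSERT_SORT_MERGE comes back as MODEL_INSERT_BACK, while a request that
carries MODEL_REQ_ELVPRIV keeps the value it was given.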

>
> Can you post a full log? Or does your driver do something special?

Our driver doesn't have anything special. Our FTL driver works fine
with Linux kernel 2.6.38 and earlier 2.6 kernels. The error occurs
from 2.6.39 onwards.
In any case, here's the log.

.....
.....
BUG: scheduling while atomic: fsstress.fork_n/498/0x00000002
Modules linked in: fs_fat(P) fs_glue(P) ftl_driver(P) fsr(P)
[<c0042e30>] (unwind_backtrace+0x0/0xec) from [<c031e234>] (schedule+0x54/0x3ec)
[<c031e234>] (schedule+0x54/0x3ec) from [<c031f884>] (__mutex_lock_slowpath+0x174/0x294)
[<c031f884>] (__mutex_lock_slowpath+0x174/0x294) from [<c031f9b0>] (mutex_lock+0xc/0x20)
[<c031f9b0>] (mutex_lock+0xc/0x20) from [<bf062b50>] (ftl_request+0x264/0x3c0 [ftl_driver])
[<bf062b50>] (ftl_request+0x264/0x3c0 [ftl_driver]) from [<c01c1d6c>] (__blk_run_queue+0x1c/0x24)
[<c01c1d6c>] (__blk_run_queue+0x1c/0x24) from [<c01c11a8>] (__elv_add_request+0x1ec/0x248)
[<c01c11a8>] (__elv_add_request+0x1ec/0x248) from [<c01c3bbc>] (blk_flush_plug_list+0x1b4/0x204)
[<c01c3bbc>] (blk_flush_plug_list+0x1b4/0x204) from [<c031e3a0>] (schedule+0x1c0/0x3ec)
[<c031e3a0>] (schedule+0x1c0/0x3ec) from [<c016acb8>] (start_this_handle+0x318/0x50c)
[<c016acb8>] (start_this_handle+0x318/0x50c) from [<c016b0ac>] (jbd2__journal_start+0xa8/0xd8)
[<c016b0ac>] (jbd2__journal_start+0xa8/0xd8) from [<c0148114>] (ext4_journal_start_sb+0x110/0x128)
[<c0148114>] (ext4_journal_start_sb+0x110/0x128) from [<c013bb54>] (_ext4_get_block+0x74/0x138)
[<c013bb54>] (_ext4_get_block+0x74/0x138) from [<c00f2d5c>] (__blockdev_direct_IO+0x594/0xc1c)
[<c00f2d5c>] (__blockdev_direct_IO+0x594/0xc1c) from [<c013e208>] (ext4_direct_IO+0x120/0x214)
[<c013e208>] (ext4_direct_IO+0x120/0x214) from [<c0097d48>] (generic_file_direct_write+0x120/0x208)
[<c0097d48>] (generic_file_direct_write+0x120/0x208) from [<c00981f0>] (__generic_file_aio_write+0x3c0/0x4f4)
[<c00981f0>] (__generic_file_aio_write+0x3c0/0x4f4) from [<c0098390>] (generic_file_aio_write+0x6c/0xdc)
[<c0098390>] (generic_file_aio_write+0x6c/0xdc) from [<c0135d58>] (ext4_file_write+0x268/0x2dc)
[<c0135d58>] (ext4_file_write+0x268/0x2dc) from [<c00c3ec0>] (do_sync_write+0x9c/0xe8)
[<c00c3ec0>] (do_sync_write+0x9c/0xe8) from [<c00c4704>] (vfs_write+0xb0/0x13c)
[<c00c4704>] (vfs_write+0xb0/0x13c) from [<c00c4c98>] (sys_write+0x3c/0x68)
[<c00c4c98>] (sys_write+0x3c/0x68) from [<c003d4a0>] (ret_fast_syscall+0x0/0x30)
.....
.....
.....
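
The path in the trace (schedule -> blk_flush_plug_list -> __elv_add_request
-> __blk_run_queue -> ftl_request -> mutex_lock) can be modelled in user
space roughly as below. This is only a sketch, assuming, as described in
step 2 earlier in the thread, that the flush runs with preemption disabled.
All model_* names are hypothetical stand-ins, not kernel APIs, and the model
counts the bug instead of blocking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical user-space stand-ins, not kernel APIs. */
struct model_cpu {
	int preempt_count;	/* non-zero means atomic: sleeping is not allowed */
	int bugs;		/* number of "scheduling while atomic" reports */
};

struct model_mutex {
	bool held;
};

/* Returns true if the lock was taken. A contended lock would make the
 * caller sleep; in atomic context that is counted as a bug instead. */
bool model_mutex_lock(struct model_cpu *cpu, struct model_mutex *m)
{
	if (m->held) {
		if (cpu->preempt_count != 0)
			cpu->bugs++;	/* corresponds to the BUG line in the log */
		return false;		/* the model does not actually block */
	}
	m->held = true;
	return true;
}

void model_mutex_unlock(struct model_mutex *m)
{
	m->held = false;
}

/* Stand-in for a request_fn that takes a sleeping lock, like ftl_request
 * does in the trace. */
void model_request_fn(struct model_cpu *cpu, struct model_mutex *drv_lock)
{
	if (model_mutex_lock(cpu, drv_lock))
		model_mutex_unlock(drv_lock);
}

/* Stand-in for the flush done from schedule(): request_fn is called
 * directly while preemption is disabled. */
void model_flush_from_schedule(struct model_cpu *cpu,
			       struct model_mutex *drv_lock)
{
	assert(cpu != NULL && drv_lock != NULL);
	cpu->preempt_count++;
	model_request_fn(cpu, drv_lock);
	cpu->preempt_count--;
}
```

If the driver lock is already held by another process when
model_flush_from_schedule() runs, the model records one bug; if the lock is
free, request_fn takes and releases it and nothing is reported. That matches
the intermittent nature of the failure: it needs contention on the driver
mutex at the moment of the flush.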