Message-ID: <de04afaf-5e17-7999-0f10-3da466c66310@intel.com>
Date: Tue, 4 Jun 2019 17:06:44 +0800
From: Rong Chen <rong.a.chen@...el.com>
To: Ming Lei <ming.lei@...hat.com>
Cc: Jens Axboe <axboe@...nel.dk>, Bart Van Assche <bvanassche@....org>,
Christoph Hellwig <hch@....de>,
LKML <linux-kernel@...r.kernel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>, lkp@...org
Subject: Re: [block] 47cdee29ef: BUG:kernel_NULL_pointer_dereference,address
Hi,
On 6/4/19 12:03 PM, Ming Lei wrote:
> Hi Rong Chen,
>
> Thanks for your test & report!
>
> On Tue, Jun 04, 2019 at 10:09:56AM +0800, kernel test robot wrote:
>> FYI, we noticed the following commit (built with gcc-7):
>>
>> commit: 47cdee29ef9d94e485eb08f962c74943023a5271 ("block: move blk_exit_queue into __blk_release_queue")
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>>
>> in testcase: trinity
>> with following parameters:
>>
>> runtime: 300s
>>
>> test-description: Trinity is a linux system call fuzz tester.
>> test-url: http://codemonkey.org.uk/projects/trinity/
>>
>>
>> on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 2G
>>
>> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>>
>>
>> +-------------------------------------------------+------------+------------+
>> |                                                 | 31cb1d64da | 47cdee29ef |
>> +-------------------------------------------------+------------+------------+
>> | boot_successes                                  | 3          | 0          |
>> | boot_failures                                   | 13         | 8          |
>> | BUG:kernel_reboot-without-warning_in_test_stage | 13         |            |
>> | BUG:kernel_NULL_pointer_dereference,address     | 0          | 8          |
>> | Oops:#[##]                                      | 0          | 8          |
>> | RIP:blk_mq_free_rqs                             | 0          | 8          |
>> | Kernel_panic-not_syncing:Fatal_exception        | 0          | 8          |
>> +-------------------------------------------------+------------+------------+
>>
>>
>> If you fix the issue, kindly add following tag
>> Reported-by: kernel test robot <rong.a.chen@...el.com>
>>
>>
>> [ 6.560544] BUG: kernel NULL pointer dereference, address: 0000000000000020
>> [ 6.561658] #PF: supervisor read access in kernel mode
>> [ 6.562495] #PF: error_code(0x0000) - not-present page
>> [ 6.563277] PGD 0 P4D 0
>> [ 6.563277] Oops: 0000 [#1] PTI
>> [ 6.563277] CPU: 0 PID: 147 Comm: kworker/0:2 Tainted: G T 5.2.0-rc1-00387-g47cdee29 #1
>> [ 6.563277] Workqueue: events __blk_release_queue
>> [ 6.563277] RIP: 0010:blk_mq_free_rqs+0x2c/0xaf
>
> It looks like there is a race between removing the queue and switching
> the elevator, which is probably being triggered by Trinity.
>
> I guess that commit 47cdee29ef9d94e485eb08f962c74943023a5271 just
> changes the timing and makes the race easier to trigger.
>
> Please test the following patch and see whether it makes a difference.
> If the patch doesn't fix the issue, please enable KASAN and reproduce
> it, so that a more useful log can be captured.

The patch doesn't fix the issue. Please find attached the dmesg file with
KASAN enabled.

Best Regards,
Rong Chen

>
>
> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> index 75b5281cc577..400a2102a4e4 100644
> --- a/block/blk-sysfs.c
> +++ b/block/blk-sysfs.c
> @@ -848,11 +848,13 @@ static void blk_exit_queue(struct request_queue *q)
>  	 * perform I/O scheduler exit before disassociating from the block
>  	 * cgroup controller.
>  	 */
> +	mutex_lock(&q->sysfs_lock);
>  	if (q->elevator) {
>  		ioc_clear_queue(q);
>  		elevator_exit(q, q->elevator);
>  		q->elevator = NULL;
>  	}
> +	mutex_unlock(&q->sysfs_lock);
>
>  	/*
>  	 * Remove all references to @q from the block cgroup controller before
>
> Thanks,
> Ming
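
For readers following the thread, the locking pattern that the patch above tries to apply can be sketched in userspace C. This is only an illustration, not kernel code and not a confirmed fix (the reply above reports that the patch did not resolve the oops). All names here (toy_queue, toy_elevator, toy_exit_queue, toy_switch_elevator, toy_race_demo) are hypothetical stand-ins, and the "dying" flag is an assumption added so the toy switch path can refuse to run after teardown; it is not part of the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the elevator and the request queue. */
struct toy_elevator { int tags; };

struct toy_queue {
	pthread_mutex_t sysfs_lock;     /* plays the role of q->sysfs_lock */
	struct toy_elevator *elevator;  /* plays the role of q->elevator */
	int dying;                      /* assumption: set once teardown ran */
};

/* Teardown path: free the elevator while holding the lock. */
static void toy_exit_queue(struct toy_queue *q)
{
	pthread_mutex_lock(&q->sysfs_lock);
	if (q->elevator) {
		free(q->elevator);
		q->elevator = NULL;
	}
	q->dying = 1;
	pthread_mutex_unlock(&q->sysfs_lock);
}

/* Switch path: swap the elevator under the same lock, or refuse. */
static int toy_switch_elevator(struct toy_queue *q)
{
	struct toy_elevator *e = calloc(1, sizeof(*e));
	int ret = -1;

	if (!e)
		return -1;
	pthread_mutex_lock(&q->sysfs_lock);
	if (!q->dying) {
		free(q->elevator);      /* free(NULL) is a no-op */
		q->elevator = e;
		e = NULL;
		ret = 0;
	}
	pthread_mutex_unlock(&q->sysfs_lock);
	free(e);                        /* only non-NULL if we refused */
	return ret;
}

static void *toy_switch_thread(void *arg)
{
	for (int i = 0; i < 1000; i++)
		toy_switch_elevator(arg);
	return NULL;
}

static void *toy_exit_thread(void *arg)
{
	toy_exit_queue(arg);
	return NULL;
}

/* Runs both paths concurrently; returns 0 if the queue ends up torn down. */
static int toy_race_demo(void)
{
	struct toy_queue q = { .elevator = NULL, .dying = 0 };
	pthread_t a, b;

	pthread_mutex_init(&q.sysfs_lock, NULL);
	q.elevator = calloc(1, sizeof(*q.elevator));
	if (!q.elevator)
		return -1;
	if (pthread_create(&a, NULL, toy_switch_thread, &q) != 0)
		return -1;
	if (pthread_create(&b, NULL, toy_exit_thread, &q) != 0)
		return -1;
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	pthread_mutex_destroy(&q.sysfs_lock);
	return (q.elevator == NULL && q.dying) ? 0 : -1;
}
```

Because both paths take the same lock, the teardown either sees the old elevator or the newly installed one, never a half-switched state. Without the lock, the two threads could free or dereference q->elevator at the same time, which is the kind of race suspected in the quoted analysis.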
View attachment "dmesg-yocto-vm-yocto-202:20190604165150:x86_64-randconfig-ne0-06030921+CONFIG_KASAN:5.2.0-rc1-00387-g47cdee29:1" of type "text/plain" (204973 bytes)