Message-ID: <b21e5580-ed75-6150-3c83-43ecbb0292a5@kernel.dk>
Date: Wed, 8 Feb 2017 10:43:59 -0700
From: Jens Axboe <axboe@...nel.dk>
To: Dexuan Cui <decui@...rosoft.com>,
Bart Van Assche <Bart.VanAssche@...disk.com>,
"hare@...e.com" <hare@...e.com>, "hare@...e.de" <hare@...e.de>,
"Martin K. Petersen" <martin.petersen@...cle.com>
Cc: "hch@....de" <hch@....de>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-block@...r.kernel.org" <linux-block@...r.kernel.org>,
"jth@...nel.org" <jth@...nel.org>
Subject: Boot regression (was "Re: [PATCH] genhd: Do not hold event lock when scheduling workqueue elements")

On 02/08/2017 03:48 AM, Dexuan Cui wrote:
>> From: Jens Axboe [mailto:axboe@...nel.dk]
>> Sent: Wednesday, February 8, 2017 00:09
>> To: Dexuan Cui <decui@...rosoft.com>; Bart Van Assche
>> <Bart.VanAssche@...disk.com>; hare@...e.com; hare@...e.de
>> Cc: hch@....de; linux-kernel@...r.kernel.org; linux-block@...r.kernel.org;
>> jth@...nel.org
>> Subject: Re: [PATCH] genhd: Do not hold event lock when scheduling workqueue
>> elements
>>
>> On 02/06/2017 11:29 PM, Dexuan Cui wrote:
>>>> From: linux-block-owner@...r.kernel.org [mailto:linux-block-
>>>> owner@...r.kernel.org] On Behalf Of Dexuan Cui
>>>> with the linux-next kernel.
>>>>
>>>> I can boot the guest with linux-next's next-20170130 without any issue,
>>>> but since next-20170131 I haven't succeeded in booting the guest.
>>>>
>>>> With next-20170203 (mentioned in my mail last Friday), I got the same
>>>> calltrace as Hannes.
>>>>
>>>> With today's linux-next (next-20170206), actually the calltrace changed to
>>>> the below.
>>>> [ 122.023036] ? remove_wait_queue+0x70/0x70
>>>> [ 122.051383] async_synchronize_full+0x17/0x20
>>>> [ 122.076925] do_init_module+0xc1/0x1f9
>>>> [ 122.097530] load_module+0x24bc/0x2980
>>>
>>> I don't know why it hangs here, but this is the same calltrace as in my
>>> mail from last Friday, which contained 2 calltraces. It looks like the other
>>> calltrace has been resolved by some changes between next-20170203 and today.
>>>
>>> Here the kernel is trying to load the Hyper-V storage driver (hv_storvsc), and
>>> the driver's __init and .probe have finished successfully and then the kernel
>>> hangs here.
>>>
>>> I believe something was broken recently, because I didn't have any issue before
>>> Jan 31.
>>
>> Can you try and bisect it?
>>
>> Jens Axboe
>
> I bisected it on the branch for-4.11/next of the linux-block repo and the log shows
> the first bad commit is
> [e9c787e6] scsi: allocate scsi_cmnd structures as part of struct request
>
> # git bisect log
> git bisect start
> # bad: [80c6b15732f0d8830032149cbcbc8d67e074b5e8] blk-mq-sched: (un)register elevator when (un)registering queue
> git bisect bad 80c6b15732f0d8830032149cbcbc8d67e074b5e8
> # good: [309bd96af9e26da3038661bf5cdad780eef49dd9] md: cleanup bio op / flags handling in raid1_write_request
> git bisect good 309bd96af9e26da3038661bf5cdad780eef49dd9
> # bad: [27410a8927fb89bd150de08d749a8ed7f67b7739] nbd: remove REQ_TYPE_DRV_PRIV leftovers
> git bisect bad 27410a8927fb89bd150de08d749a8ed7f67b7739
> # bad: [e9c787e65c0c36529745be47d490d998b4b6e589] scsi: allocate scsi_cmnd structures as part of struct request
> git bisect bad e9c787e65c0c36529745be47d490d998b4b6e589
> # good: [3278255741326b6d66d8ca7d1cb2c57633ee43d9] scsi_dh_rdac: switch to scsi_execute_req_flags()
> git bisect good 3278255741326b6d66d8ca7d1cb2c57633ee43d9
> # good: [0fbc3e0ff623f1012e7c2af96e781eeb26bcc0d7] scsi: remove gfp_flags member in scsi_host_cmd_pool
> git bisect good 0fbc3e0ff623f1012e7c2af96e781eeb26bcc0d7
> # good: [eeff68c5618c8d0920b14533c70b2df007bd94b4] scsi: remove scsi_cmd_dma_pool
> git bisect good eeff68c5618c8d0920b14533c70b2df007bd94b4
> # good: [d48777a633d6fa7ccde0f0e6509f0c01fbfc5299] scsi: remove __scsi_alloc_queue
> git bisect good d48777a633d6fa7ccde0f0e6509f0c01fbfc5299
> # first bad commit: [e9c787e65c0c36529745be47d490d998b4b6e589] scsi: allocate scsi_cmnd structures as part of struct request

Christoph?

I've changed the subject line; this issue has nothing to do with the
issue that Hannes was attempting to fix.

--
Jens Axboe