Message-Id: <04aff6c6-5c04-a2b5-e886-b747cb51f39e@de.ibm.com>
Date:   Wed, 20 Dec 2017 16:47:21 +0100
From:   Christian Borntraeger <borntraeger@...ibm.com>
To:     Stefan Haberland <sth@...ux.vnet.ibm.com>,
        Christoph Hellwig <hch@....de>, Jens Axboe <axboe@...nel.dk>
Cc:     Bart Van Assche <Bart.VanAssche@....com>,
        "linux-block@...r.kernel.org" <linux-block@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        linux-s390 <linux-s390@...r.kernel.org>,
        Martin Schwidefsky <schwidefsky@...ibm.com>
Subject: Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with
 virtio-blk (also 4.12 stable)

On 12/18/2017 02:56 PM, Stefan Haberland wrote:
> On 07.12.2017 00:29, Christoph Hellwig wrote:
>> On Wed, Dec 06, 2017 at 01:25:11PM +0100, Christian Borntraeger wrote:
>>> commit 11b2025c3326f7096ceb588c3117c7883850c068    -> bad
>>>      blk-mq: create a blk_mq_ctx for each possible CPU
>>> does not boot on DASD and
>>> commit 9c6ae239e01ae9a9f8657f05c55c4372e9fc8bcc    -> good
>>>     genirq/affinity: assign vectors to all possible CPUs
>>> does boot with DASD disks.
>>>
>>> Also adding Stefan Haberland if he has an idea why this fails on DASD and adding Martin (for the
>>> s390 irq handling code).
>> That is interesting as it really isn't related to interrupts at all,
>> it just ensures that possible CPUs are set in ->cpumask.
>>
>> I guess we'd really want:
>>
>> e005655c389e3d25bf3e43f71611ec12f3012de0
>> "blk-mq: only select online CPUs in blk_mq_hctx_next_cpu"
>>
>> before this commit, but it seems like the whole stack didn't work for
>> you either.
>>
>> I wonder if there is some weird thing about nr_cpu_ids in s390?
> 
> I tried this on my system and the blk-mq-hotplug-fix branch does not boot for me either.
> The disks get up and running and I/O works fine. At least the partition detection and EXT4-fs mount works.
> 
> But at some point in time the disks do not get any requests.
> 
> I currently have no clue why.
> I took a dump and had a look at the disk states and they are fine. No errors in the logs or in our debug entries. Just empty DASD devices waiting to be called for I/O requests.
> 
> Do you have anything I could have a look at?

Jens, Christoph, so what do we do about this?
To summarize:
- commit 4b855ad37194f7 ("blk-mq: Create hctx for each present CPU") broke CPU hotplug.
- Jens' quick revert did fix the issue and did not break DASD support, but has some issues
with interrupt affinity.
- Christoph's patch set fixes the hotplug issue for virtio-blk but causes I/O hangs on DASDs (even
without hotplug).

Christian
