Message-ID: <33510a1f-e681-2c5b-137e-3986b1f91779@suse.de>
Date: Mon, 16 Jan 2017 09:11:30 +0100
From: Hannes Reinecke <hare@...e.de>
To: Jens Axboe <axboe@...nel.dk>, linux-kernel@...r.kernel.org,
linux-block@...r.kernel.org
Cc: osandov@...ndov.com, bart.vanassche@...disk.com
Subject: Re: [PATCHSET v6] blk-mq scheduling framework
On 01/13/2017 05:02 PM, Jens Axboe wrote:
> On 01/13/2017 09:00 AM, Jens Axboe wrote:
>> On 01/13/2017 08:59 AM, Hannes Reinecke wrote:
>>> On 01/13/2017 04:34 PM, Jens Axboe wrote:
>>>> On 01/13/2017 08:33 AM, Hannes Reinecke wrote:
>>> [ .. ]
>>>>> Ah, indeed.
>>>>> There is an ominous udev rule here, trying to switch to 'deadline'.
>>>>>
>>>>> # cat 60-ssd-scheduler.rules
>>>>> # do not edit this file, it will be overwritten on update
>>>>>
>>>>> ACTION!="add", GOTO="ssd_scheduler_end"
>>>>> SUBSYSTEM!="block", GOTO="ssd_scheduler_end"
>>>>>
>>>>> IMPORT{cmdline}="elevator"
>>>>> ENV{elevator}=="*?", GOTO="ssd_scheduler_end"
>>>>>
>>>>> KERNEL=="sd*[!0-9]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="deadline"
>>>>>
>>>>> LABEL="ssd_scheduler_end"
>>>>>
>>>>> Still shouldn't crash the kernel, though ...
>>>>
>>>> Of course not, and it's not a given that it does, it could just be
>>>> triggering after the device load and failing like expected. But just in
>>>> case, can you try and disable that rule and see if it still crashes with
>>>> MQ_DEADLINE set as the default?
>>>>
>>> Yes, it does.
>>> Same stacktrace as before.
>>
>> Alright, that's as expected. I've tried with your rule and making
>> everything modular, but it still boots fine for me. Very odd. Can you
>> send me your .config? And are all the SCSI disks hanging off ahci? Or
>> sdb specifically, is that ahci or something else?
>
> Also, would be great if you could pull:
>
> git://git.kernel.dk/linux-block blk-mq-sched
>
> into current 'master' and see if it still reproduces. I expect that it
> will, but just want to ensure that it's a problem in the current code
> base as well.
>
Actually, it doesn't. Seems to have resolved itself with the latest drop.
However, now I've got a lockdep splat:
Jan 16 09:05:02 lammermuir kernel: ------------[ cut here ]------------
Jan 16 09:05:02 lammermuir kernel: WARNING: CPU: 29 PID: 5860 at kernel/locking/lockdep.c:3514 lock_release+0x2a7/0x490
Jan 16 09:05:02 lammermuir kernel: DEBUG_LOCKS_WARN_ON(depth <= 0)
Jan 16 09:05:02 lammermuir kernel: Modules linked in: raid0 mpt3sas raid_class rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache e
Jan 16 09:05:02 lammermuir kernel: fb_sys_fops ahci uhci_hcd ttm ehci_pci libahci ehci_hcd serio_raw crc32c_intel drm libata usbcore hpsa
Jan 16 09:05:02 lammermuir kernel: CPU: 29 PID: 5860 Comm: fio Not tainted 4.10.0-rc3+ #540
Jan 16 09:05:02 lammermuir kernel: Hardware name: HP ProLiant ML350p Gen8, BIOS P72 09/08/2013
Jan 16 09:05:02 lammermuir kernel: Call Trace:
Jan 16 09:05:02 lammermuir kernel: dump_stack+0x85/0xc9
Jan 16 09:05:02 lammermuir kernel: __warn+0xd1/0xf0
Jan 16 09:05:02 lammermuir kernel: ? aio_write+0x118/0x170
Jan 16 09:05:02 lammermuir kernel: warn_slowpath_fmt+0x4f/0x60
Jan 16 09:05:02 lammermuir kernel: lock_release+0x2a7/0x490
Jan 16 09:05:02 lammermuir kernel: ? blkdev_write_iter+0x89/0xd0
Jan 16 09:05:02 lammermuir kernel: aio_write+0x138/0x170
Jan 16 09:05:02 lammermuir kernel: do_io_submit+0x4d2/0x8f0
Jan 16 09:05:02 lammermuir kernel: ? do_io_submit+0x413/0x8f0
Jan 16 09:05:02 lammermuir kernel: SyS_io_submit+0x10/0x20
Jan 16 09:05:02 lammermuir kernel: entry_SYSCALL_64_fastpath+0x23/0xc6
Cheers,
Hannes
--
Dr. Hannes Reinecke Teamlead Storage & Networking
hare@...e.de +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)