Message-ID: <5421968B.7080309@kernel.dk>
Date:	Tue, 23 Sep 2014 09:49:31 -0600
From:	Jens Axboe <axboe@...nel.dk>
To:	Tejun Heo <tj@...nel.org>, Christoph Hellwig <hch@....de>
CC:	linux-kernel@...r.kernel.org, linux-scsi@...r.kernel.org
Subject: Re: boot stall regression due to blk-mq: use percpu_ref for mq usage
 count

On 09/23/2014 12:11 AM, Tejun Heo wrote:
> On Tue, Sep 23, 2014 at 08:09:06AM +0200, Christoph Hellwig wrote:
>> On Tue, Sep 23, 2014 at 02:01:41AM -0400, Tejun Heo wrote:
>>> On Tue, Sep 23, 2014 at 07:59:24AM +0200, Christoph Hellwig wrote:
>>>> "[PATCHSET percpu/for-3.18] percpu_ref: implement switch_to_atomic/percpu()"
>>>>
>>>> looks way too big for 3.17, and the regression was introduced in the 3.17
>>>> merge window.  I'm not sure what was broken before, but it definitely
>>>> survived a lot of testing.
>>>
>>> Do we even care about fixing it for 3.17?  scsi-mq isn't enabled by
>>> default even for 3.18.  The open-coded percpu ref thing was subtly
>>> broken there.  It'd be difficult to trigger but I'm fairly sure it'd
>>> crap out in the wild once in a blue moon.
>>
>> It's compiled in by default, and people are extremely eager to test it.
> 
> Ugh, I don't know.  It's not like we have a very good baseline we can
> go back to and reverting it for -stable and then redoing it seems
> kinda excessive for a yet experimental feature.  Jens?

It's not just scsi-mq, there are active users of blk-mq in the current
tree - like virtio_blk, mtip32xx. None of those are affected by the RCU
slowdown due to these changes, so it's not a big deal to them. But it is
a big deal if we can't tell people to test scsi-mq in 3.17; that was the
entire point of having it there but not defaulting to on. So yeah, this
really should be fixed for 3.17.

I'm not aware of any reports on the existing enter count breaking things
for them. So while it may not be perfect, reverting the percpu ref count
changes for 3.17 may be the best option that we have.

-- 
Jens Axboe
