Message-ID: <d01ece91-c147-10a4-9977-6b567cb80c38@fb.com>
Date:   Wed, 26 Oct 2016 16:55:43 -0600
From:   Jens Axboe <axboe@...com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Dave Jones <davej@...emonkey.org.uk>, Chris Mason <clm@...com>,
        Andy Lutomirski <luto@...capital.net>,
        Andy Lutomirski <luto@...nel.org>,
        Al Viro <viro@...iv.linux.org.uk>, Josef Bacik <jbacik@...com>,
        David Sterba <dsterba@...e.com>,
        linux-btrfs <linux-btrfs@...r.kernel.org>,
        Linux Kernel <linux-kernel@...r.kernel.org>,
        Dave Chinner <david@...morbit.com>
Subject: Re: bio linked list corruption.

On 10/26/2016 04:51 PM, Linus Torvalds wrote:
> On Wed, Oct 26, 2016 at 3:40 PM, Dave Jones <davej@...emonkey.org.uk> wrote:
>>
>> I gave it a shot too for shits & giggles.
>> This falls out during boot.
>>
>> [    9.278420] WARNING: CPU: 0 PID: 1 at block/blk-mq.c:1181 blk_sq_make_request+0x465/0x4a0
>
> Hmm. That's the
>
>     WARN_ON_ONCE(rq->mq_ctx != ctx);
>
> that I added to blk_mq_merge_queue_io(), and I really think that
> warning is valid, and the fact that it triggers shows that something
> is wrong with locking.
>
> We just did a
>
>                 spin_lock(&ctx->lock);
>
> and that lock is *supposed* to protect the __blk_mq_insert_request(),
> but that uses rq->mq_ctx.
>
> So if rq->mq_ctx != ctx, then we're locking the wrong context.
>
> Jens - please explain to me why I'm wrong.
>
> Or maybe I actually might have found the problem? In which case please
> send me a patch that fixes it ;)

I think you're pretty close. The two should never differ, and I don't
immediately see how they could. I'll run some tests here; that should be
easier with this knowledge.
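
To make the concern concrete, here is a minimal userspace sketch of the locking mismatch described above. It is not kernel code: `struct sw_ctx`, `struct req`, `model_insert()` and `model_merge_queue_io()` are hypothetical stand-ins, and a plain boolean stands in for the spinlock. It only shows that when the insert works on rq->mq_ctx while the caller locked a different ctx, the list being modified is not covered by the held lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, not the kernel's structures. */
struct sw_ctx {
	bool locked;     /* models ctx->lock being held */
	int  nr_queued;  /* models the per-context request list */
};

struct req {
	struct sw_ctx *mq_ctx;  /* the context this request belongs to */
};

/*
 * Models the insert step: it always operates on rq->mq_ctx, never on
 * the ctx the caller locked. Returns true only if that context's lock
 * was held at the time of the insert.
 */
static bool model_insert(struct req *rq)
{
	bool is_protected = rq->mq_ctx->locked;

	rq->mq_ctx->nr_queued++;
	return is_protected;
}

/* Models the merge path: lock ctx, insert the request, unlock ctx. */
static bool model_merge_queue_io(struct sw_ctx *ctx, struct req *rq)
{
	bool ok;

	ctx->locked = true;
	ok = model_insert(rq);
	ctx->locked = false;
	return ok;
}
```

With two contexts a and b and a request whose mq_ctx is a, calling model_merge_queue_io(&a, &rq) reports a protected insert, while model_merge_queue_io(&b, &rq) modifies a's list with only b locked.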

> Dave: it might be a good idea to split that "WARN_ON_ONCE()" in
> blk_mq_merge_queue_io() into two, since right now it can trigger both
> for the
>
>                 blk_mq_bio_to_request(rq, bio);
>
> path _and_ for the
>
>                 if (!blk_mq_attempt_merge(q, ctx, bio)) {
>                         blk_mq_bio_to_request(rq, bio);
>                         goto insert_rq;

And just in case I can't trigger it, it would be interesting to add a call
to blk_rq_dump_flags() as well, in case this is some special request.
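
As a rough illustration of the suggestion to split the warning in two, the userspace sketch below gives each path its own once-only check, so a report identifies which path saw the mismatch. Everything here is hypothetical: `struct model_ctx`, `struct model_rq`, `warn_once()` and `model_queue_bio()` are invented for the example, and `warn_once()` only imitates the once-per-site behaviour of WARN_ON_ONCE(). It is not the actual blk_mq_merge_queue_io() code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins, not the kernel's structures. */
struct model_ctx { int id; };
struct model_rq  { struct model_ctx *mq_ctx; };

enum warn_site {
	WARN_NONE = 0,
	WARN_DIRECT_PATH,        /* request built straight from the bio */
	WARN_MERGE_FAILED_PATH,  /* merge attempted and failed, then insert */
};

/*
 * Prints at most once per call site (tracked by *fired), but always
 * returns the condition so the caller can act on it.
 */
static bool warn_once(bool cond, bool *fired, const char *site)
{
	if (cond && !*fired) {
		*fired = true;
		fprintf(stderr, "WARNING: rq->mq_ctx != ctx (%s)\n", site);
	}
	return cond;
}

/*
 * Models the two insert paths, each with its own check.
 * try_merge: whether a merge is attempted at all.
 * merged:    whether that merge attempt succeeded.
 */
static enum warn_site model_queue_bio(struct model_ctx *ctx,
				      struct model_rq *rq,
				      bool try_merge, bool merged)
{
	static bool fired_direct, fired_merge_failed;

	if (!try_merge) {
		if (warn_once(rq->mq_ctx != ctx, &fired_direct, "direct path"))
			return WARN_DIRECT_PATH;
		return WARN_NONE;
	}

	if (!merged) {
		if (warn_once(rq->mq_ctx != ctx, &fired_merge_failed,
			      "merge-failed path"))
			return WARN_MERGE_FAILED_PATH;
	}
	return WARN_NONE;
}
```

The return value is there only so the sketch can be checked; in a real debugging patch the two separate warning sites alone would show up as distinct line numbers in the log.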


-- 
Jens Axboe
