lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Wed, 8 Sep 2021 11:57:26 +0000
From:   Niklas Cassel <Niklas.Cassel@....com>
To:     Bart Van Assche <bvanassche@....org>
CC:     Jens Axboe <axboe@...nel.dk>,
        Damien Le Moal <Damien.LeMoal@....com>,
        Paolo Valente <paolo.valente@...aro.org>,
        Ming Lei <ming.lei@...hat.com>,
        "linux-block@...r.kernel.org" <linux-block@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/2] Revert "mq-deadline: Fix request accounting"

On Tue, Sep 07, 2021 at 10:12:49AM -0700, Bart Van Assche wrote:
> On 9/7/21 9:28 AM, Niklas Cassel wrote:
> > On Tue, Sep 07, 2021 at 08:15:03AM -0700, Bart Van Assche wrote:
> > > On 9/7/21 7:21 AM, Niklas Cassel wrote:
> > > > blk-mq will no longer call the I/O scheduler .finish_request() callback
> > > > for requests that were never inserted to the I/O scheduler.
> > > 
> > > I do not agree. Even with patch 1/2 from this series applied, finish_request()
> > > will still be called for requests inserted by blk_insert_cloned_request()
> > > although these requests are never inserted to the I/O scheduler.
> > 
> > Looking at blk_mq_free_request(),
> > e->type->ops.finish_request() will only be called if RQF_ELVPRIV
> > is set.
> > 
> > blk_insert_cloned_request() doesn't seem to allocate a request
> > itself, but instead takes an already cloned request.
> > 
> > So I guess it depends on how the supplied request was cloned.
> > 
> > I would assume if the original request doesn't have RQF_ELVPRIV set,
> > then neither will the cloned request?
> > 
> > I tried to look at blk_rq_prep_clone(), which seems to be a common
> > cloning function, but I don't see req->rq_flags being copied
> > (except for RQF_SPECIAL_PAYLOAD).
> > 
> > Anyway, I don't see how .finish_request() will be called in relation
> > to blk_insert_cloned_request(). Could you please help me out and
> > give me an example of a call chain where this can happen?
> 
> Hi Niklas,
> 
> This is a bit outside my area of expertise. Anyway: map_request() calls
> .clone_and_map_rq(). At least multipath_clone_and_map() calls
> blk_get_request(). I think this shows that blk_insert_cloned_request()
> may insert an entirely new request. Is my understanding correct that
> blk_mq_rq_ctx_init() will set RQF_ELVPRIV for the cloned request if a
> scheduler is associated with the request queue associated with the
> cloned request?
> 
> Bart.

Thank you Bart.


dm-rq.c:map_request() calls .clone_and_map_rq()

one example of a .clone_and_map_rq() implementation is
multipath_clone_and_map().

multipath_clone_and_map(), to get a clone simply does blk_get_request(),
which does blk_mq_alloc_request(), which calls __blk_mq_alloc_request(),
which calls blk_mq_rq_ctx_init(), which will set RQF_ELVPRIV, as long as
the request is not a flush or a passthrough request.

dm-rq.c:dm_dispatch_clone_request() calls blk_insert_cloned_request(),
which will bypass the I/O scheduler.

So, a request cloned by e.g. drivers/md/dm-mpath.c will
bypass the I/O scheduler, but can still have the RQF_ELVPRIV flag set.



This just tells me that if someone wants to clean up this mess,
we have to do something similar to what I proposed in my initial RFC:
https://lore.kernel.org/linux-block/20210827124100.98112-2-Niklas.Cassel@wdc.com/

i.e., set RQF_ELVPRIV flag immediately before calling
e->type->ops.insert_requests(), and only then.

It seems logical to simply set the flag before actually inserting the request
to the scheduler, rather than finding all the corner cases where we don't.

Anyway, I'll leave this potential cleanup to people more familiar with the
blk-mq code for now.


Kind regards,
Niklas

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ