lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20181127221549.GA33981@dennisz-mbp.dhcp.thefacebook.com>
Date:   Tue, 27 Nov 2018 17:15:49 -0500
From:   Dennis Zhou <dennis@...nel.org>
To:     Josef Bacik <josef@...icpanda.com>
Cc:     Dennis Zhou <dennis@...nel.org>, Jens Axboe <axboe@...nel.dk>,
        Tejun Heo <tj@...nel.org>,
        Johannes Weiner <hannes@...xchg.org>, kernel-team@...com,
        linux-block@...r.kernel.org, cgroups@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 00/13 v4] block: always associate blkg and refcount
 cleanup

On Tue, Nov 27, 2018 at 04:10:01PM -0500, Josef Bacik wrote:
> On Mon, Nov 26, 2018 at 04:19:33PM -0500, Dennis Zhou wrote:
> > Hi everyone,
> > 
> > This is respin of v3 [1] with fixes for the errors reported in [2] and
> > [3]. v3 was reverted in [4].
> > 
> > The issue in [3] was that bio->bi_disk->queue and blkg->q were out
> > of sync. So when I changed blk_get_rl() to use blkg->q, the wrong queue
> > was returned and elevator from q->elevator->type threw a NPE. Note, with
> > v4.21, the old block stack was removed and so this patch was dropped. I
> > did backport this to v4.20 and verified this series does not encounter
> > the error.
> > 
> > The biggest changes in v4 are when association occurs and clearly
> > defining the cases where association should happen.
> >   1. Association is now done when the device is set to keep blkg->q and
> >      bio->bi_disk->queue in sync.
> >   2. When a bio is submitted directly to the device, it will not be
> >      associated with a blkg. This is because a blkg represents the
> >      relationship between a blkcg and a request_queue. Going directly to
> >      the device means the request_queue may not exist meaning no blkg
> >      will exist.
> > 
> > The patch updating blk_get_rl() was dropped (v3 10/12). The patch to
> > always associate a blkg from v3 (v3 04/12) was fixed and split into
> > patches 0004 and 0005. 0011 is new removing bio_disassociate_task().
> > 
> > Summarizing the ideas of this series:
> >   1. Gracefully handle blkg failure to create by walking up the blkg
> >      tree rather than fall through to root.
> >   2. Associate a bio with a blkg in core logic rather than per
> >      controller logic.
> >   3. Rather than have a css and blkg reference, hold just a blkg ref
> >      as it also holds a css ref.
> >   4. Switch to percpu ref counting for blkg.
> > 
> 
> Hmm so reading through this series it's made me realize that iolatency is sort
> of broken.  It relies on knowing if it needs to do something with the bio if
> there is a blkg associated with it.  Before this patchset there wouldn't be a

I don't think there is anything wrong with blk-iolatency. blk-iolatency
piggybacks off of the rq_qos hooks which when throttle gets called means
submit has happened on the bio. As a byproduct, all bios that
blk-iolatency sees has a blkcg associated with it from
blkcg_bio_issue_check(). This lets blk-iolatency associate a blkg in the
throttle hook - blkcg_iolatency_throttle().

Order of functions called:
  generic_make_request()
    generic_make_request_checks() <- associates blkcg
    make_request_fn() (eventually blk_mq_make_request)
      rq_qos_throttle()
        blkcg_iolatency_throttle() <- associate blkg based on blkcg

blk-throttle is another story, it kind of does association really high
up, but lets the block layer manage the request_queue and attribute it
all to the first blkg seen. I believe this is just the top level blkg
(same css, first request_queue). Disclaimer, I haven't read through much
of the blk-throttle code.

> blkg on the bio unless it was specifically associated.  I'm going to need to
> figure out a different way to tag bio's to indicate that blk-iolatency should
> care about it.  Probably add a bio flag or something.  Thanks,

I think the interaction gets a little confusing with stacked block
devices, but regardless, shouldn't blk-iolatency care about every bio
that comes through anyway? I think this would just translate to
throttling at each request_queue and not just the entrance queue.

If a bio has a device with no request_queue, I don't think it will ever
reach blk-iolatency because the bio goes straight to device and a
requirement to get to call rq_qos_throttle is to have a request_queue.

Thanks,
Dennis

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ