Message-ID: <20210311105442.GA27754@veeam.com>
Date: Thu, 11 Mar 2021 13:54:42 +0300
From: Sergei Shtepa <sergei.shtepa@...am.com>
To: Christoph Hellwig <hch@...radead.org>, <snitzer@...hat.com>
CC: "snitzer@...hat.com" <snitzer@...hat.com>,
"agk@...hat.com" <agk@...hat.com>, "hare@...e.de" <hare@...e.de>,
"song@...nel.org" <song@...nel.org>,
"axboe@...nel.dk" <axboe@...nel.dk>,
"dm-devel@...hat.com" <dm-devel@...hat.com>,
"linux-block@...r.kernel.org" <linux-block@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-raid@...r.kernel.org" <linux-raid@...r.kernel.org>,
"linux-api@...r.kernel.org" <linux-api@...r.kernel.org>,
Pavel Tide <Pavel.TIde@...am.com>
Subject: Re: [PATCH v6 4/4] dm: add DM_INTERPOSED_FLAG
The 03/10/2021 15:34, Christoph Hellwig wrote:
> On Wed, Mar 10, 2021 at 08:28:12AM +0300, Sergei Shtepa wrote:
> > > So instead of doing this shouldn't the interposer just always submit to the
> > > whole device? But if we keep it, the logic in this function should go
> > > into a block layer helper, passing a block device instead of the
>
> >
> > device-mapper allows to create devices of any size using only part of
> > the underlying device. Therefore, it is not possible to apply the
> > interposer to the whole block device.
> > Perhaps it makes sense to put the blk_partition_unremap() function in the
> > block layer? I'm not sure that's a good thing.
>
> I suspect the answer is to not remap bios that are going to be handled
> by the interposer. In fact much of submit_bio_checks as-is is a bad
> idea for interposed devices. I think what we need to do instead is to
> pass an explicit bdev to submit_bio_checks and use that everywhere,
> including in the subfunctions.
>
> With that we might also be able to remove the separate interpose hook
> and thus struct bdev_interposer entirely as now ->submit_bio of the
> interposer could do all the work:
>
> static noinline blk_qc_t submit_bio_interposed(struct bio *bio)
> {
> struct block_device *orig_bdev = bio->bi_bdev, *interposer;
> struct bio_list bio_list[2] = { };
> blk_qc_t ret = BLK_QC_T_NONE;
>
> if (current->bio_list) {
> bio_list_add(&current->bio_list[0], bio);
> return BLK_QC_T_NONE;
> }
>
> if (unlikely(bio_queue_enter(bio)))
> return BLK_QC_T_NONE;
>
> interposer = orig_bdev->bd_interposer;
> if (unlikely(!interposer)) {
> /* interposer was removed */
> bio_list_add(&current->bio_list[0], bio);
> goto queue_exit;
> }
> if (!submit_bio_checks(bio, interposer))
> goto queue_exit;
>
> bio_set_flag(bio, BIO_INTERPOSED);
>
> current->bio_list = bio_list;
> ret = interposer->bd_disk->fops->submit_bio(bio);
> current->bio_list = NULL;
>
> queue_exit:
> blk_queue_exit(orig_bdev->bd_disk->queue);
>
> /* Resubmit remaining bios */
> while ((bio = bio_list_pop(&bio_list[0])))
> ret = submit_bio_noacct(bio);
> return ret;
> }
>
> blk_qc_t submit_bio_noacct(struct bio *bio)
> {
> if (bio->bi_bdev->bd_interposer && !bio_flagged(bio, BIO_INTERPOSED))
> return submit_bio_interposed(bio);
>
> ...
> }
Your point of view is very interesting. I like it.
I will try to implement it and check how it works.
The one problem I see so far is that the interposer device has to
intercept all bio requests sent to the original device. It will not be
possible to intercept only some of them, whereas device mapper can
create a target that covers only part of a block device.
But maybe it's a good thing. First, there is little real benefit from
being able to intercept bio requests from a part of the block device.
In real use, this may not be necessary. Second, it gets rid of the
problem where part of a bio needs to be intercepted and part does not.
I'd like to know Mike's opinion on this issue.
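To illustrate the "part of the bio" problem: if only a sub-range of the
device were intercepted, every bio would have to be classified against
that range, and the partial case would need splitting. A minimal
userspace sketch follows; toy_sector_t, toy_classify() and the enum
names are hypothetical and are not part of the patch set or of the
block layer.

```c
#include <stdint.h>

/* Hypothetical stand-in for sector_t; sectors are plain 64-bit counts. */
typedef uint64_t toy_sector_t;

enum toy_overlap {
	TOY_OVERLAP_NONE,	/* bio lies entirely outside the region */
	TOY_OVERLAP_FULL,	/* bio lies entirely inside the region */
	TOY_OVERLAP_PARTIAL,	/* bio straddles a region boundary */
};

/*
 * Classify a bio covering [bio_start, bio_start + bio_len) against an
 * intercepted region [reg_start, reg_start + reg_len). Both ranges are
 * half-open. Only the PARTIAL case would force the bio to be split.
 */
static enum toy_overlap toy_classify(toy_sector_t bio_start,
				     toy_sector_t bio_len,
				     toy_sector_t reg_start,
				     toy_sector_t reg_len)
{
	toy_sector_t bio_end = bio_start + bio_len;
	toy_sector_t reg_end = reg_start + reg_len;

	if (bio_len == 0 || bio_end <= reg_start || bio_start >= reg_end)
		return TOY_OVERLAP_NONE;
	if (bio_start >= reg_start && bio_end <= reg_end)
		return TOY_OVERLAP_FULL;
	return TOY_OVERLAP_PARTIAL;
}
```

If the whole device is always intercepted, only the FULL case remains
and no such check is needed.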
>
> Note that both with this and your original code the interposer must
> never resubmit I/O to itself. Is that actually the case for DM? I'm
> trying to think of a good debug check for that, but right now I can't
> think of something that doesn't cause any overhead for n
I believe that the BIO_INTERPOSED flag solves this problem quite well.
When a bio is cloned, the flag is carried over to the clone, which means
that a bio cannot be intercepted twice.
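A rough userspace model of that guard, assuming (as described above)
that cloning copies the flag. struct toy_bio, TOY_BIO_INTERPOSED,
toy_bio_clone() and toy_submit() are hypothetical stand-ins, not the
kernel's struct bio or its helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit standing in for BIO_INTERPOSED. */
enum { TOY_BIO_INTERPOSED = 1u << 0 };

struct toy_bio {
	unsigned int flags;
	int intercept_count;	/* times the interposer saw this bio */
};

/* Assumption: a clone inherits the flags of the bio it was made from. */
static struct toy_bio toy_bio_clone(const struct toy_bio *src)
{
	struct toy_bio clone = { .flags = src->flags, .intercept_count = 0 };

	return clone;
}

/*
 * Returns true if this call diverted the bio to the interposer, false
 * if the bio goes straight to the original device.
 */
static bool toy_submit(struct toy_bio *bio, bool has_interposer)
{
	assert(bio != NULL);

	if (has_interposer && !(bio->flags & TOY_BIO_INTERPOSED)) {
		bio->flags |= TOY_BIO_INTERPOSED;
		bio->intercept_count++;
		return true;
	}
	return false;
}
```

In this model a bio that has been intercepted once, or any clone of it,
passes through toy_submit() without being diverted again.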
Thank you again.
Because of you, I will have to rewrite some code again ;)
But it's all for the best.
--
Sergei Shtepa
Veeam Software developer.