linux-kernel - Re: [dm-devel] [PATCH 1/2] blkdev: fix merge_bvec

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Pine.LNX.4.64.1003041615120.17032@hs20-bc2-1.build.redhat.com>
Date:	Thu, 4 Mar 2010 16:55:29 -0500 (EST)
From:	Mikulas Patocka <mpatocka@...hat.com>
To:	Jens Axboe <jens.axboe@...cle.com>
cc:	device-mapper development <dm-devel@...hat.com>,
	Dmitry Monakhov <dmonakhov@...nvz.org>,
	linux-kernel@...r.kernel.org, Mike Snitzer <snitzer@...hat.com>
Subject: Re: [dm-devel] [PATCH 1/2] blkdev: fix merge_bvec_fn return value
 checks

> > I think it should be redesigned in this way:
> > * the bio should be allowed to have arbitrary size, up to BIO_MAX_PAGES
> > * the code for splitting bios should be just at one place --- in the 
> > generic block layer and it should work with all drivers. I.e.
> > q->make_request_fn will return that the request is too big for the given 
> > driver and the generic code will allocate few another bios from 
> > per-request_queue mempool and split the bio.
> > * then, we would fix the raid1 bug and we would also simplify md and dm 
> > --- remove the splitting code.
> 
> The design certainly isn't perfect, but neither is the one that you
> suggest. For instance, what would the point of building bigger bios just
> to split them _everytime_ be? That's not good design, and it's certainly
> not helping performance much.

The point for building a big bio and splitting it is code size reduction.

You must realize that you need to split the request anyway. If the caller 
needs to read/write M sectors and the device can process at most N 
sectors, where N<M, then the i/o must be split. You can't get rid of that 
split.

Now, the question is, where this splitting happens. Currently, the 
splitting happens in the issuers of the bio (for example dm-io, which is a 
big chunk of code written just to work around the uncertain bio size 
limits) as well as in the consumers of the bio (md, dm and request-based 
drivers, if their maximum transfer size < PAGE_SIZE).

What I'm proposing is to put that bio splitting just in one place and 
remove it from both the issuers and the consumers of the bio.

There is just one case, where it is not desirable to create larger 
requests than the physical transfer size because of performance --- that 
is page cache readahead. That could use the current approach of querying 
the queue for limits and not stuffing more data than the device can 
handle.

In all the other (non-readahead) cases, the caller is waiting for all the 
data anyway, there is no performance problem with creating a larger 
requests and splitting them.

> Alasdair had an idea for doing a map/unmap like interface instead, which
> I think is a lot better. I don't think he ever coded it up, though.
> Perhaps talk to him, if you want to improve our situation in this area.

It would increase code size, not reduce it.

> -- 
> Jens Axboe

BTW. see fs/bio.c:bio_split --- it uses a shared mempool. It can deadlock 
if the devices depend on each other. You need to use per-queue mempool.

Mikulas

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/