lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 4 Mar 2010 06:47:48 -0500 (EST)
From:	Mikulas Patocka <mpatocka@...hat.com>
To:	device-mapper development <dm-devel@...hat.com>,
	Jens Axboe <jens.axboe@...cle.com>
cc:	Dmitry Monakhov <dmonakhov@...nvz.org>,
	linux-kernel@...r.kernel.org, Mike Snitzer <snitzer@...hat.com>
Subject: Re: [dm-devel] [PATCH 1/2] blkdev: fix merge_bvec_fn return value
 checks



On Wed, 3 Mar 2010, Jens Axboe wrote:

> On Wed, Mar 03 2010, Dmitry Monakhov wrote:
> > Mike Snitzer <snitzer@...hat.com> writes:
> > 
> > > Linux has all sorts of internal interfaces that are "odd"... the current
> > > 'q->merge_bvec_fn' interface included.  But odd is not a problem (nor is
> > > it "broken") unless you make changes that don't consider how the current
> > > interface is defined.
> > Ok. then cant you please explain more historical questions
> > 1) Why bio_add_page() can not add less data than requested?
> >    Seems that it doesn't make caller's code much complicate
> >    Off course barrier bio is special case. I don't consider it here.
> 
> Because the caller may not expect that, a partial add may not make any
> sense to the caller. The bio code obviously doesn't care. And it
> certainly could complicate the caller a lot, if they need to now issue
> and wait for several bio's instead of just a single one. Now a single
> completion queue and wait_for_completion() is not enough.

The whole thing is lame by design.

As a consequence, the code for splitting these page-sized bios is being 
duplicated in md, dm and request-based device (and maybe in other drivers).

And there is a long standing unfixable bug --- adding a device to raid1 
may fail if the device has smaller request size than the other raid1 leg.
Think about this:

* thread 1:
bio_add_page() -> returns ok
...
bio_add_page() -> returns ok
* scheduling to thread 2 *
* thread 2:
add a raid1 leg to the block device
* scheduling to thread 1 *
* thread 1:
submit_bio() --- the bio already oversized for the new raid1 leg and there 
is no way to shrink it.


I think it should be redesigned in this way:
* the bio should be allowed to have arbitrary size, up to BIO_MAX_PAGES
* the code for splitting bios should be just at one place --- in the 
generic block layer and it should work with all drivers. I.e.
q->make_request_fn will return that the request is too big for the given 
driver and the generic code will allocate few another bios from 
per-request_queue mempool and split the bio.
* then, we would fix the raid1 bug and we would also simplify md and dm 
--- remove the splitting code.

Mikulas
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ