Date:	Thu, 26 Jun 2008 14:39:36 +0200
From:	Jens Axboe <jens.axboe@...cle.com>
To:	FUJITA Tomonori <fujita.tomonori@....ntt.co.jp>
Cc:	stern@...land.harvard.edu, andi@...stfloor.org,
	linux-kernel@...r.kernel.org, antonio.lin@...ormicro.com,
	david.vrabel@....com
Subject: Re: Scatter-gather list constraints

On Thu, Jun 26 2008, FUJITA Tomonori wrote:
> On Thu, 26 Jun 2008 08:35:59 +0200
> Jens Axboe <jens.axboe@...cle.com> wrote:
> 
> > On Thu, Jun 26 2008, FUJITA Tomonori wrote:
> > > On Thu, 26 Jun 2008 11:06:03 +0900
> > > FUJITA Tomonori <fujita.tomonori@....ntt.co.jp> wrote:
> > > 
> > > > On Wed, 25 Jun 2008 10:23:00 -0400 (EDT)
> > > > Alan Stern <stern@...land.harvard.edu> wrote:
> > > > 
> > > > > On Wed, 25 Jun 2008, FUJITA Tomonori wrote:
> > > > > 
> > > > > > > For example, suppose an I/O request starts out with two S-G elements
> > > > > > > of 1536 bytes and 2048 bytes respectively, and the DMA requirement is
> > > > > > > that all elements except the last must have length divisible by 1024.  
> > > > > > > Then the request could be broken up into three requests of 1024, 512,
> > > > > > > and 2048 bytes.
> > > > > > 
> > > > > > I can't say that it's easy to implement a clean mechanism to break up
> > > > > > a request into multiple requests until I see a patch.
> > > > > 
> > > > > And I can't write a patch without learning a lot more about how the
> > > > > block core works.
> > > > > 
> > > > > > What I said is that you think that this is about extending something
> > > > > > in the block layer but it's about adding a new concept to the block
> > > > > > layer.
> > > > > 
> > > > > Is it?  What does the block layer do when it receives an I/O request
> > > > > that doesn't satisfy the other constraints (max_sectors or
> > > > > dma_alignment_mask, for example)?
> > > > 
> > > > As I explained, you need something new.
> > > > 
> > > > I don't think that max_sectors works as you expect.
> > > 
> > > The block layer looks at max_sectors when merging two things (or adding
> > > one to another). So if the test fails, it doesn't merge them.
> > > 
> > > 
> > > > dma_alignment_mask is not used in the FS path. And I think that
> > > > dma_alignment_mask doesn't solve your problems.
> > > 
> > > If the dma_alignment_mask test fails, the block layer allocates temporary
> > > buffers and does memory copies.
> > 
> > I don't think adding anything in the general IO path makes a lot of
> > sense, this is a really screwy case. I don't mind adding work-arounds to
> > the block layer to cater for hardware weirdness, but this is getting a
> > little silly. We could provide a helper function for 'bouncing' this
> > request and thus reuse the block bounce buffer for this, but I'm not
> > even sure how to simply express this generically. As it is likely of no
> > use outside of this specific case, putting it in the driver (or usb
> > layer, if you expect more of these similar cases) is the best option.
> 
> Yeah, agreed, as I wrote in the first mail:
> 
> http://marc.info/?l=linux-kernel&m=121430416329618&w=2
> 
> I guess that a generic mechanism reserving some buffers in the block
> layer might work for them. I also need such a mechanism to convert sg
> and st to use the block layer (yeah, it's overdue but still on my todo
> list).

On the fs side, just setting a hw block size of 1k should fix the
problem, since that'd be your minimum transfer size AND alignment there
even for O_DIRECT IO.

So that leaves IO issued through SG_IO (and similar), which is typically
really small (and thus not an issue, since it'll be a single sg element).
For the bigger ones, sg elements should be tightly packed (e.g. page size)
except the last one.

Alan, in what specific cases have you observed IO requests that violate
the rules you gave? The example of:

"For example, suppose an I/O request starts out with two S-G elements of
1536 bytes and 2048 bytes respectively, and the DMA requirement is"

really sounds concocted, have you ever seen something like that?
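
To make the rule under discussion concrete, here is a minimal userspace
sketch of the check "every element except the last has a length divisible
by N". It is not kernel code: the function name sg_lengths_ok and the plain
array of lengths are made up for illustration, and it looks only at
lengths, not at buffer alignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not a kernel API: returns true when every element
 * except the last has a length that is a multiple of `gran` bytes.
 * An empty or single-element list trivially satisfies the rule.
 */
bool sg_lengths_ok(const unsigned int *len, size_t n, unsigned int gran)
{
	assert(gran != 0);	/* a zero granularity makes no sense */

	/* i + 1 < n skips the last element, which is exempt from the rule */
	for (size_t i = 0; i + 1 < n; i++) {
		if (len[i] % gran != 0)
			return false;
	}
	return true;
}
```

With the quoted example lengths {1536, 2048} and gran = 1024 this returns
false, because 1536 is not a multiple of 1024. A page-packed list such as
{4096, 4096, 512} returns true, as does any single-element list.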

-- 
Jens Axboe
