linux-kernel - Re: New TRIM/UNMAP tree published (2009-05-02)


Date:	Sun, 03 May 2009 14:47:47 -0500
From:	James Bottomley <James.Bottomley@...senPartnership.com>
To:	Jeff Garzik <jeff@...zik.org>
Cc:	Matthew Wilcox <matthew@....cx>,
	Jens Axboe <jens.axboe@...cle.com>,
	Boaz Harrosh <bharrosh@...asas.com>,
	Hugh Dickins <hugh@...itas.com>,
	Matthew Wilcox <willy@...ux.intel.com>,
	linux-ide@...r.kernel.org, linux-kernel@...r.kernel.org,
	linux-scsi@...r.kernel.org,
	Bartlomiej Zolnierkiewicz <bzolnier@...il.com>,
	Mark Lord <lkml@....ca>
Subject: Re: New TRIM/UNMAP tree published (2009-05-02)

On Sun, 2009-05-03 at 15:20 -0400, Jeff Garzik wrote:
> [tangent...]
> 
> Does make you wonder if a ->init_rq_fn() would be helpful, one that 
> could perform gfp_t allocations rather than GFP_ATOMIC?  The idea being 
> to call ->init_rq_fn() almost immediately after creation of struct 
> request, by the struct request creator.

Isn't that what the current prep_fn actually is?

> I obviously have not thought in depth about this, but it does seem that 
> init_rq_fn(), called earlier in struct request lifetime, could eliminate 
> the need for ->prepare_flush, ->prepare_discard, and perhaps could be a 
> better place for some of the ->prep_rq_fn logic.

It's hard to see how: prep_rq_fn is already called pretty early,
almost as soon as the elevator has decided to spit out the request.

> The creator of struct request generally has more freedom to sleep, and 
> it seems logical to give low-level drivers a "fill in LLD-specific info" 
> hook BEFORE the request is ever added to a request_queue.

Unfortunately it's not really possible to find a sleeping context in
there: the elevators have to operate from the current
elv_next_request() context, which in most drivers can be either user or
interrupt context.

The block layer is designed to pull allocations up the stack, much
closer to the process (usually at the bio creation point), because
that allows the elevators to operate even in memory-starved conditions.
If we pushed the allocation down into the request level, we'd need some
type of threading (bad for performance) and the request processing would
stall when some GFP_KERNEL allocation went out to lunch finding memory.
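
To make the pattern concrete, here is a minimal user-space sketch of
"allocate at the top, only fill in lower down".  It is not kernel code:
struct discard_io and the discard_io_*() helpers are hypothetical names,
malloc()/calloc() stand in for a sleeping GFP_KERNEL allocation, and the
12-byte payload layout is invented purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PAYLOAD_BYTES 4096 /* assumed: one page reserved per discard */

/* Hypothetical stand-in for a discard bio that carries its own payload. */
struct discard_io {
	uint64_t start_sector;
	uint32_t nr_sectors;
	unsigned char *payload; /* reserved at submission time */
};

/*
 * "Top of the stack": runs in process context, so the allocation is
 * allowed to block.  All memory the request will ever need is taken here.
 */
struct discard_io *discard_io_create(uint64_t start, uint32_t nr)
{
	struct discard_io *io = malloc(sizeof(*io));

	if (!io)
		return NULL;
	io->payload = calloc(1, PAYLOAD_BYTES);
	if (!io->payload) {
		free(io);
		return NULL;
	}
	io->start_sector = start;
	io->nr_sectors = nr;
	return io;
}

/*
 * Stand-in for the prep stage: may run in a context that cannot sleep,
 * so it performs no allocation and only fills the reserved buffer.
 * Returns the number of payload bytes written.
 */
size_t discard_io_prep(struct discard_io *io)
{
	size_t off = 0;

	assert(io && io->payload);
	memcpy(io->payload + off, &io->start_sector, sizeof(io->start_sector));
	off += sizeof(io->start_sector);
	memcpy(io->payload + off, &io->nr_sectors, sizeof(io->nr_sectors));
	off += sizeof(io->nr_sectors);
	return off;
}

void discard_io_free(struct discard_io *io)
{
	if (io) {
		free(io->payload);
		free(io);
	}
}
```

The point of the split is that discard_io_prep() can never fail or stall
for lack of memory, whatever context it ends up being called from.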

The ideal for REQ_TYPE_DISCARD seems to be to force a page allocation
tied to a bio when it's issued at the top.  That way everyone has enough
memory when it comes down the stack (both the extents and the WRITE SAME
sector will fit into a page, although only just for WRITE SAME on 4k
sectors).
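
A rough check of the "fits into a page" arithmetic, as a sketch under
stated assumptions rather than anything taken from the tree: a 4096-byte
page, an UNMAP parameter list with an 8-byte header and 16-byte block
descriptors, and 8-byte ATA TRIM range entries.  The sizes are my
recollection of the SBC-3 and ATA formats, and the helper names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum {
	PAGE_BYTES         = 4096, /* assumed page size */
	UNMAP_HEADER_BYTES = 8,    /* assumed UNMAP parameter list header */
	UNMAP_DESC_BYTES   = 16,   /* assumed UNMAP block descriptor */
	TRIM_RANGE_BYTES   = 8     /* assumed ATA TRIM range entry */
};

/* How many UNMAP extents fit in one page after the header. */
size_t unmap_descs_per_page(void)
{
	return (PAGE_BYTES - UNMAP_HEADER_BYTES) / UNMAP_DESC_BYTES;
}

/* How many TRIM range entries fit in one page. */
size_t trim_ranges_per_page(void)
{
	return PAGE_BYTES / TRIM_RANGE_BYTES;
}

/*
 * Bytes left over in the page after one WRITE SAME data block of the
 * given logical sector size.  The sector must not exceed the page.
 */
size_t write_same_slack(size_t sector_bytes)
{
	assert(sector_bytes <= PAGE_BYTES);
	return PAGE_BYTES - sector_bytes;
}
```

Under those assumptions a page holds a few hundred extents, while a
WRITE SAME block for a 4k sector leaves no slack at all, which is the
"only just" case.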

James
