lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-Id: <20070110.180627.41627650.k-ueda@ct.jp.nec.com>
Date:	Wed, 10 Jan 2007 18:06:27 -0500 (EST)
From:	Kiyoshi Ueda <k-ueda@...jp.nec.com>
To:	jens.axboe@...cle.com, linux-kernel@...r.kernel.org,
	linux-scsi@...r.kernel.org
Cc:	dm-devel@...hat.com, j-nomura@...jp.nec.com, k-ueda@...jp.nec.com
Subject: [RFC PATCH 0/3] blk_end_request: full I/O completion handler

Hi Jens and device driver maintainers,

I would like to make block interface to hook in before completing
each chunk of request (for errors) to allow request-based multipath.
The first approach was to add a new hook in __end_that_request_first().
  - http://marc.theaimsgroup.com/?l=linux-scsi&m=115520444515914&w=2
  - http://marc.theaimsgroup.com/?l=linux-kernel&m=116656637425880&w=2
However, Jens pointed out that redesigning ->end_io() as a full
completion handler would be better:

On Thu, 21 Dec 2006 08:49:47 +0100, Jens Axboe <jens.axboe@...cle.com> wrote:
> Ok, I see what you are getting at. The current ->end_io() is called when
> the request has fully completed, you want notification for each chunk
> potentially completed.
> 
> I think a better design here would be to use ->end_io() as the full
> completion handler, similar to how bio->bi_end_io() works. A request
> originating from __make_request() would set something ala:
.....
> instead of calling the functions manually. That would allow you to get
> notification right at the beginning and do what you need, without adding
> a special hook for this.

So I made a tentative patch based on Jens' suggestion.
Are the interface and the design acceptable for block layer and
device drivers, though a little bit big modification of device drivers
is needed to take this approach?

What is changed:
  o blk_end_request() is a helper for drivers to call a full completion
    handler (req->end_io) for the request.
    (end_that_request_* are no longer used from outside of the handlers.)
  o If blk_end_request() returns 0, the request is completed.
    If blk_end_request() returns 1, the request isn't completed and
    the caller should decide reactions for it.
  o __make_request() sets blk_end_io() in req->end_io
  o Added some templates to existing req->end_io users so they can be
    full completion handler.

The interface of blk_end_request():
  o About the 4th argument (int locked)
    blk_end_request() has to get a queue lock before it calls
    end_that_request_last().  On the other hand, some callers need to
    call blk_end_request() with the lock held.
    Callers can tell blk_end_request() by this argument whether a queue
    lock needs to be held in blk_end_request().

  o About the 5th and the 6th arguments (int (callback)(void *), void *arg)
    Most of callers do caller-specific work between
    end_that_request_{first,chunk} and end_that_request_last.
    Callers now pass a callback for such work by these arguments.

    By using the callback feature, critical sections protected by
    single queue lock might need to be divided.
    (e.g. In scsi_end_request(), blk_queue_end_tag() and
          end_that_request_last() was protected by 1 spinlock but 
          1 spinlock is used for each call. See PATCH#3)
    If it can't be divided, the caller has to get the lock before
    calling blk_end_request(), though it may have performance affects.

    To cover callers which wants to call only __end_that_request_first()
    like PIO mode in cdrom_newpc_intr(), the callback can return int value.
    If the return value is non 0, blk_end_request() returns 1 immediately
    without the request completion. (See cdrom_newpc_intr() in PATCH#3)

If this is acceptable approach, I will break down, clean up and repost
the patch.

Thanks,
Kiyoshi Ueda

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ