linux-kernel - Re: [RFC PATCH 1/5] rust: block: introduce `kernel::block::mq` module

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f405ff55-fcb0-4592-ae4b-e1188eae9953@proton.me>
Date: Wed, 03 Apr 2024 19:37:44 +0000
From: Benno Lossin <benno.lossin@...ton.me>
To: Andreas Hindborg <nmi@...aspace.dk>
Cc: Jens Axboe <axboe@...nel.dk>, Christoph Hellwig <hch@....de>, Keith Busch <kbusch@...nel.org>, Damien Le Moal <Damien.LeMoal@....com>, Bart Van Assche <bvanassche@....org>, Hannes Reinecke <hare@...e.de>, "linux-block@...r.kernel.org" <linux-block@...r.kernel.org>, Andreas Hindborg <a.hindborg@...sung.com>, Wedson Almeida Filho <wedsonaf@...il.com>, Niklas Cassel <Niklas.Cassel@....com>, Greg KH <gregkh@...uxfoundation.org>, Matthew Wilcox <willy@...radead.org>, Miguel Ojeda <ojeda@...nel.org>, Alex Gaynor <alex.gaynor@...il.com>, Boqun Feng <boqun.feng@...il.com>, Gary Guo <gary@...yguo.net>, Björn Roy Baron <bjorn3_gh@...tonmail.com>, Alice Ryhl <aliceryhl@...gle.com>, Chaitanya Kulkarni <chaitanyak@...dia.com>, Luis Chamberlain <mcgrof@...nel.org>, Yexuan Yang <1182282462@...t.edu.cn>, Sergio González Collado <sergio.collado@...il.com>, Joel Granados <j.granados@...sung.com>, "Pankaj Raghav (Samsung)" <kernel@...kajraghav.com>, Daniel Gomez
	<da.gomez@...sung.com>, open list <linux-kernel@...r.kernel.org>, "rust-for-linux@...r.kernel.org" <rust-for-linux@...r.kernel.org>, "lsf-pc@...ts.linux-foundation.org" <lsf-pc@...ts.linux-foundation.org>, "gost.dev@...sung.com" <gost.dev@...sung.com>, Ming Lei <ming.lei@...hat.com>
Subject: Re: [RFC PATCH 1/5] rust: block: introduce `kernel::block::mq` module

On 03.04.24 10:46, Andreas Hindborg wrote:
> Benno Lossin <benno.lossin@...ton.me> writes:
> 
>> On 23.03.24 07:32, Andreas Hindborg wrote:
>>> Benno Lossin <benno.lossin@...ton.me> writes:
>>>> On 3/13/24 12:05, Andreas Hindborg wrote:
>>>>> +//! implementations of the `Operations` trait.
>>>>> +//!
>>>>> +//! IO requests are passed to the driver as [`Request`] references. The
>>>>> +//! `Request` type is a wrapper around the C `struct request`. The driver must
>>>>> +//! mark start of request processing by calling [`Request::start`] and end of
>>>>> +//! processing by calling one of the [`Request::end`], methods. Failure to do so
>>>>> +//! can lead to IO failures.
>>>>
>>>> I am unfamiliar with this, what are "IO failures"?
>>>> Do you think that it might be better to change the API to use a
>>>> callback? So instead of calling start and end, you would do
>>>>
>>>>        request.handle(|req| {
>>>>            // do the stuff that would be done between start and end
>>>>        });
>>>>
>>>> I took a quick look at the rnull driver and there you are calling
>>>> `Request::end_ok` from a different function. So my suggestion might not
>>>> be possible, since you really need the freedom.
>>>>
>>>> Do you think that a guard approach might work better? ie `start` returns
>>>> a guard that when dropped will call `end` and you need the guard to
>>>> operate on the request.
>>>
>>> I don't think that would fit, since the driver might not complete the
>>> request immediately. We might be able to call `start` on behalf of the
>>> driver.
>>>
>>> At any rate, since the request is reference counted now, we can
>>> automatically fail a request when the last reference is dropped and it
>>> was not marked successfully completed. I would need to measure the
>>> performance implications of such a feature.
>>
>> Are there cases where you still need access to the request after you
>> have called `end`?
> 
> In general no, there is no need to handle the request after calling end.
> C drivers are not allowed to, because this transfers ownership of the
> request back to the block layer. This patch series defer the transfer of
> ownership to the point when the ARef<Request> refcount goes to zero, so
> there should be no danger associated with touching the `Request` after
> end.
> 
>> If no, I think it would be better for the request to
>> be consumed by the `end` function.
>> This is a bit difficult with `ARef`, since the user can just clone it
>> though... Do you think that it might be necessary to clone requests?
> 
> Looking into the details now I see that calling `Request::end` more than
> once will trigger UAF, because C code decrements the refcount on the
> request. When we have `ARef<Request>` around, that is a problem. It
> probably also messes with other things in C land. Good catch.
> 
> I did implement `Request::end` to consume the request at one point
> before I fell back on reference counting. It works fine for simple
> drivers. However, most drivers will need to use the block layer tag set
> service, that allows conversion of an integer id to a request pointer.
> The abstraction for this feature is not part of this patch set. But the
> block layer manages a mapping of integer to request mapping, and drivers
> typically use this to identify the request that corresponds to
> completion messages that arrive from hardware. When drivers are able to
> turn integers into requests like this, consuming the request in the call
> to `end` makes little sense (because we can just construct more).

How do you ensure that this is fine?:

     let r1 = tagset.get(0);
     let r2 = tagset.get(0);
     r1.end_ok();
     r2.do_something_that_would_only_be_done_while_active();

One thing that comes to my mind would be to only give out `&Request`
from the tag set. And to destroy, you could have a separate operation
that also removes the request from the tag set. (I am thinking of a tag
set as a `HashMap<u64, Request>`.

> 
> What I do now is issue the an `Option<ARef<Request>>` with
> `bindings::req_ref_inc_not_zero(rq_ptr)`, to make sure that the request
> is currently owned by the driver.
> 
> I guess we can check the absolute value of the refcount, and only issue
> a request handle if the count matches what we expect. Then we can be certain
> that the handle is unique, and we can require transfer of ownership of
> the handle to `Request::end` to make sure it can never be called more
> than once.
> 
> Another option is to error out in `Request::end` if the
> refcount is not what we expect.

I am a bit confused, why does the refcount matter in this case? Can't
the user just have multiple `ARef`s?

I think it would be weird to use `ARef<Request>` if you expect the
refcount to be 1. Maybe the API should be different?
As I understand it, a request has the following life cycle (please
correct me if I am wrong):
1. A new request is created, it is given to the driver via `queue_rq`.
2. The driver can now decide what to do with it (theoretically it can
    store it somewhere and later do something with it), but it should at
    some point call `Request::start`.
3. Work happens and eventually the driver calls `Request::end`.

To me this does not seem like something where we need a refcount (we
still might need one for safety, but it does not need to be exposed to
the user).

-- 
Cheers,
Benno