[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <874jf3kflx.fsf@metaspace.dk>
Date: Tue, 23 Jan 2024 19:39:15 +0100
From: "Andreas Hindborg (Samsung)" <nmi@...aspace.dk>
To: Benno Lossin <benno.lossin@...ton.me>
Cc: Jens Axboe <axboe@...nel.dk>, Christoph Hellwig <hch@....de>, Keith
Busch <kbusch@...nel.org>, Damien Le Moal <Damien.LeMoal@....com>, Hannes
Reinecke <hare@...e.de>, lsf-pc@...ts.linux-foundation.org,
rust-for-linux@...r.kernel.org, linux-block@...r.kernel.org, Matthew
Wilcox <willy@...radead.org>, Miguel Ojeda <ojeda@...nel.org>, Alex Gaynor
<alex.gaynor@...il.com>, Wedson Almeida Filho <wedsonaf@...il.com>, Boqun
Feng <boqun.feng@...il.com>, Gary Guo <gary@...yguo.net>, Björn Roy Baron
<bjorn3_gh@...tonmail.com>, linux-kernel@...r.kernel.org,
gost.dev@...sung.com
Subject: Re: [RFC PATCH 03/11] rust: block: introduce `kernel::block::mq`
module
Benno Lossin <benno.lossin@...ton.me> writes:
> Hi Andreas,
>
> just so you know, I received this email today, so it was very late,
> since the send date is January 12.
My mistake. I started drafting Jan 12, but did not get time to finish
the mail until today. I guess that is how mu4e does things, I should be
aware and fix up the date. Thanks for letting me know 👍
>
> On 12.01.24 10:18, Andreas Hindborg (Samsung) wrote:
>>
>> Hi Benno,
>>
>> Benno Lossin <benno.lossin@...ton.me> writes:
>>
>> <...>
>>
>>>> diff --git a/rust/kernel/block/mq/gen_disk.rs b/rust/kernel/block/mq/gen_disk.rs
>>>> new file mode 100644
>>>> index 000000000000..50496af15bbf
>>>> --- /dev/null
>>>> +++ b/rust/kernel/block/mq/gen_disk.rs
>>>> @@ -0,0 +1,133 @@
>>>> +// SPDX-License-Identifier: GPL-2.0
>>>> +
>>>> +//! GenDisk abstraction
>>>> +//!
>>>> +//! C header: [`include/linux/blkdev.h`](../../include/linux/blkdev.h)
>>>> +//! C header: [`include/linux/blk_mq.h`](../../include/linux/blk_mq.h)
>>>> +
>>>> +use crate::block::mq::{raw_writer::RawWriter, Operations, TagSet};
>>>> +use crate::{
>>>> + bindings, error::from_err_ptr, error::Result, sync::Arc, types::ForeignOwnable,
>>>> + types::ScopeGuard,
>>>> +};
>>>> +use core::fmt::{self, Write};
>>>> +
>>>> +/// A generic block device
>>>> +///
>>>> +/// # Invariants
>>>> +///
>>>> +/// - `gendisk` must always point to an initialized and valid `struct gendisk`.
>>>> +pub struct GenDisk<T: Operations> {
>>>> + _tagset: Arc<TagSet<T>>,
>>>> + gendisk: *mut bindings::gendisk,
>>>
>>> Why are these two fields not embedded? Shouldn't the user decide where
>>> to allocate?
>>
>> The `TagSet` can be shared between multiple `GenDisk`. Using an `Arc`
>> seems resonable?
>>
>> For the `gendisk` field, the allocation is done by C and the address
>> must be stable. We are owning the pointee and must drop it when it goes out
>> of scope. I could do this:
>>
>> #[repr(transparent)]
>> struct GenDisk(Opaque<bindings::gendisk>);
>>
>> struct UniqueGenDiskRef {
>> _tagset: Arc<TagSet<T>>,
>> gendisk: Pin<&'static mut GenDisk>,
>>
>> }
>>
>> but it seems pointless. `struct GenDisk` would not be pub in that case. What do you think?
>
> Hmm, I am a bit confused as to how you usually use a `struct gendisk`.
> You said that a `TagSet` might be shared between multiple `GenDisk`s,
> but that is not facilitated by the C side?
>
> Is it the case that on the C side you create a struct containing a
> tagset and a gendisk for every block device you want to represent?
Yes, but the `struct tag_set` can be shared between multiple `struct
gendisk`.
Let me try to elaborate:
In C you would first allocate a `struct tag_set` and partially
initialize it. The allocation can be dynamic, static or part of existing
allocation. You would then partially initialize the structure and finish
the initialization by calling `blk_mq_alloc_tag_set()`. This populates
the rest of the structure which includes more dynamic allocations.
You then allocate a `struct gendisk` by calling `blk_mq_alloc_disk()`,
passing in a pointer to the `struct tag_set` you just created. This
function will return a pointer to a `struct gendisk` on success.
In the Rust abstractions, we allocate the `TagSet`:
#[pin_data(PinnedDrop)]
#[repr(transparent)]
pub struct TagSet<T: Operations> {
#[pin]
inner: Opaque<bindings::blk_mq_tag_set>,
_p: PhantomData<T>,
}
with `PinInit` [^1]. The initializer will partially initialize the struct and
finish the initialization like C does by calling
`blk_mq_alloc_tag_set()`. We now need a place to point the initializer.
`Arc::pin_init()` is that place for now. It allows us to pass the
`TagSet` reference to multiple `GenDisk` if required. Maybe we could be
generic over `Deref<TagSet>` in the future. Bottom line is that we need
to hold on to that `TagSet` reference until the `GenDisk` is dropped.
`struct tag_set` is not reference counted on the C side. C
implementations just take care to keep it alive, for instance by storing
it next to a pointer to `struct gendisk` that it is servicing.
> And you decided for the Rust abstractions that you want to have only a
> single generic struct for any block device, distinguished by the generic
> parameter?
Yes, we have a single generic struct (`GenDisk`) representing the C
`struct gendisk`, and a single generic struct (`TagSet`) representing
the C `struct tag_set`. These are both generic over `T: Operations`.
`Operations` represent a C vtable (`struct blk_mq_ops`) attached to the
`struct tag_set`. This vtable is provided by the driver and holds
function pointers that allow the kernel to perform actions such as queue
IO requests with the driver. A C driver can instantiate multiple `struct
gendisk` and service them with the same `struct tag_set` and thereby the
same vtable. Or it can use separate tag sets and the same vtable. Or a
separate tag_set and vtable for each gendisk.
> I think these kinds of details would be nice to know. Not only for
> reviewers, but also for veterans of the C APIs.
I should write some module level documentation clarifying the use of
these types. The null block driver is a simple example, but it is just
code. I will include more docs in the next version.
Best regards
Andreas
[^1]: This was not `PinInit` in the RFC, I changed this based on your
feedback. The main points are still the same though.
Powered by blists - more mailing lists