lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <874j74x111.fsf@metaspace.dk>
Date: Wed, 28 Aug 2024 12:52:36 +0000
From: Andreas Hindborg <nmi@...aspace.dk>
To: Trevor Gross <tmgross@...ch.edu>
Cc: Luis Chamberlain <mcgrof@...nel.org>, Miguel Ojeda <ojeda@...nel.org>, Alex Gaynor <alex.gaynor@...il.com>, Wedson Almeida Filho <wedsonaf@...il.com>, Andreas Hindborg <a.hindborg@...sung.com>, Adam Bratschi-Kaye <ark.email@...il.com>, Boqun Feng <boqun.feng@...il.com>, Gary Guo <gary@...yguo.net>, Björn Roy Baron <bjorn3_gh@...tonmail.com>, Benno Lossin <benno.lossin@...ton.me>, Alice Ryhl <aliceryhl@...gle.com>, Daniel Gomez <da.gomez@...sung.com>, rust-for-linux@...r.kernel.org, linux-modules@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] rust: add `module_params` macro


Hi Trevor,

Thanks a lot for the thorough review!

"Trevor Gross" <tmgross@...ch.edu> writes:

> On Mon, Aug 19, 2024 at 8:35 AM Andreas Hindborg <nmi@...aspace.dk> wrote:
>>
>> From: Andreas Hindborg <a.hindborg@...sung.com>
>>
>> This patch includes changes required for Rust kernel modules to utilize
>> module parameters. This code implements read only support for integer
>> types without `sysfs` support.
>
>> diff --git a/rust/kernel/module_param.rs b/rust/kernel/module_param.rs
>> new file mode 100644
>> index 000000000000..9dfee0311d65
>> --- /dev/null
>> +++ b/rust/kernel/module_param.rs
>> @@ -0,0 +1,339 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +
>> +//! Types for module parameters.
>> +//!
>> +//! C header: [`include/linux/moduleparam.h`](../../../include/linux/moduleparam.h)
>> +
>> +use crate::prelude::*;
>> +
>> +/// Types that can be used for module parameters.
>> +///
>> +/// Note that displaying the type in `sysfs` will fail if
>> +/// [`core::str::from_utf8`] (as implemented through the [`core::fmt::Display`]
>> +/// trait) writes more than [`PAGE_SIZE`] bytes (including an additional null
>> +/// terminator).
>
> I'm a bit confused where `str::from_utf8` comes into play - maybe just:
>
>     Note that displaying the type in `sysfs` will fail if the [`Display`]
>     (core::fmt::Display) implementation would write more than
>     [`PAGE_SIZE`] - 1 bytes.

I think it is a hint for where to look if things go wrong for `core`
types. But it makes sense to cut it to cut `Display`.

>> +/// [`PAGE_SIZE`]: `bindings::PAGE_SIZE`
>> +pub trait ModuleParam: core::fmt::Display + core::marker::Sized {
>
> I don't think `Sized` should need `core::marker`

Thanks.

>
>> +    /// The `ModuleParam` will be used by the kernel module through this type.
>> +    ///
>> +    /// This may differ from `Self` if, for example, `Self` needs to track
>> +    /// ownership without exposing it or allocate extra space for other possible
>> +    /// parameter values. This is required to support string parameters in the
>> +    /// future.
>> +    type Value: ?Sized;
>
> ^ proof of the above :)
>
> "This is required... in the future" could probably be moved to a
> non-doc comment or just dropped.

OK.

>
>> +/// Write a string representation of the current parameter value to `buf`.
>> +///
>> +/// # Safety
>> +///
>> +/// Must not be called.
>> +///
>> +/// # Note
>> +///
>> +/// This should not be called as we declare all parameters as read only.
>> +#[allow(clippy::extra_unused_type_parameters)]
>> +unsafe extern "C" fn get_param<T>(
>
> I think this could make sense being a safe function. I get that
> calling it would mean something is wrong - but that's a problem of
> broken invariants elsewhere and not just calling this, no?

The "must not be called" requirement will be dropped when we support
parameters writable through sysfs. Once we allow it to be called, it is
going to have safety requirements and it will have to be unsafe. So I
see no point it not making it unsafe now.

>
>> +/// Trait for parsing integers.
>> +///
>> +/// Strings beginning with `0x`, `0o`, or `0b` are parsed as hex, octal, or
>> +/// binary respectively. Strings beginning with `0` otherwise are parsed as
>> +/// octal. Anything else is parsed as decimal. A leading `+` or `-` is also
>> +/// permitted. Any string parsed by [`kstrtol()`] or [`kstrtoul()`] will be
>> +/// successfully parsed.
>> +///
>> +/// [`kstrtol()`]: https://www.kernel.org/doc/html/latest/core-api/kernel-api.html#c.kstrtol
>> +/// [`kstrtoul()`]: https://www.kernel.org/doc/html/latest/core-api/kernel-api.html#c.kstrtoul
>> +trait ParseInt: Sized {
>
> Since this is standalone functionality, it might be better moved to a
> different module for reuse (and probably its own patch).
>
> I'm not sure about the name either, would something like `FromKStr`
> with a method `from_kstr` make more sense? Also because the current
> trait will conflict if `core::str::FromStr` is also in scope.

I see, that makes sense. I can rename it and move it to a separate patch
and module.

>
> The methods could use docs, and the trait could probably get a doctest
> and/or some unit tests.

I will add that.

>
>> +    fn from_str(src: &str) -> Result<Self> {
>> +        match src.bytes().next() {
>> +            None => Err(EINVAL),
>> +            Some(b'-') => Self::from_str_unsigned(&src[1..])
>> +                .map_err(|_| EINVAL)?
>> +                .checked_neg()
>> +                .ok_or(EINVAL),
>> +            Some(b'+') => Self::from_str_unsigned(&src[1..]).map_err(|_| EINVAL),
>> +            Some(_) => Self::from_str_unsigned(src).map_err(|_| EINVAL),
>> +        }
>> +    }
>
> kstrtol returns -ERANGE on overflow, this might need to check the
> `.kind()` of parse errors to match this

Good idea.

>
>> +macro_rules! impl_parse_int {
>> +    ($ty:ident) => {
>
> Nit: `$ty` could be a `:ty` here (makes no difference)

👍

>
>> +macro_rules! impl_module_param {
>
> Nit: maybe `impl_int_module_param` since it doesn't handle other types

Yes.

>
>> +#[doc(hidden)]
>> +#[macro_export]
>> +/// Generate a static [`kernel_param_ops`](../../../include/linux/moduleparam.h) struct.
>> +///
>> +/// # Examples
>> +///
>> +/// ```ignore
>> +/// make_param_ops!(
>> +///     /// Documentation for new param ops.
>> +///     PARAM_OPS_MYTYPE, // Name for the static.
>> +///     MyType // A type which implements [`ModuleParam`].
>> +/// );
>> +/// ```
>
> Nit: move attributes to after the docs

Will do. Is that a style thing we documented somewhere? Could we have
`clippy` warn about this?

>
>> +macro_rules! make_param_ops {
>> +    ($ops:ident, $ty:ty) => {
>> +        $crate::make_param_ops!(
>> +            #[doc=""]
>> +            $ops,
>> +            $ty
>> +        );
>> +    };
>
> I don't think you need this first pattern, since the meta in the below
> pattern is optional. But...
>
>> +    ($(#[$meta:meta])* $ops:ident, $ty:ty) => {
>> +        $(#[$meta])*
>
> ...you could probably just drop the `$meta` and remove docs from the
> invocation by adding the following here
>
>     /// Rust implementation of [`kernel_param_ops`]
>     /// (../../../include/linux/moduleparam.h)
>     #[doc(concat!("for [`", stringify!($ty), "`].))]
>
> Since the docs are the same except the type

I am not sure I think that is more simple. But either way is fine by me.

>
>> +        ///
>> +        /// Static [`kernel_param_ops`](../../../include/linux/moduleparam.h)
>> +        /// struct generated by [`make_param_ops`].
>> +        pub static $ops: $crate::bindings::kernel_param_ops = $crate::bindings::kernel_param_ops {
>> +            flags: 0,
>> +            set: Some(set_param::<$ty>),
>> +            get: Some(get_param::<$ty>),
>> +            free: Some(free::<$ty>),
>> +        };
>
> Could this be a const rather than a static?

It cannot, because we need an address for it. As far as I understand
`const` items are not guaranteed to have an address.

>
>> +impl_module_param!(i8);
>> +impl_module_param!(u8);
>> +impl_module_param!(i16);
>> +impl_module_param!(u16);
>> +impl_module_param!(i32);
>> +impl_module_param!(u32);
>> +impl_module_param!(i64);
>> +impl_module_param!(u64);
>> +impl_module_param!(isize);
>> +impl_module_param!(usize);
>
> Nit: these could probably go above `impl_module_param` to be next to
> their macro definition.

Either that, or we move each of the three macro instances for each type together.

>
>> diff --git a/rust/macros/helpers.rs b/rust/macros/helpers.rs
>> index 563dcd2b7ace..49388907370d 100644
>> --- a/rust/macros/helpers.rs
>> +++ b/rust/macros/helpers.rs
>> @@ -107,6 +107,14 @@ pub(crate) struct Generics {
>>      pub(crate) ty_generics: Vec<TokenTree>,
>>  }
>>
>> +pub(crate) fn expect_string_field(it: &mut token_stream::IntoIter, expected_name: &str) -> String {
>
> Just to make it obvious what it is looking for:
>
>     /// Expect `expected_name: "value",`

Right.

>> diff --git a/rust/macros/module.rs b/rust/macros/module.rs
>> index 411dc103d82e..2fa9ed8e78ff 100644
>> --- a/rust/macros/module.rs
>> +++ b/rust/macros/module.rs
>
>> +    fn emit_params(&mut self, info: &ModuleInfo) {
>> +        if let Some(params) = &info.params {
>
> Switching to
>
>     let Some(params) = &info.params else {
>         return;
>     };
>
> would save an indentation level

I don't mind the indentation level, but I think what you propose with
early abort is easier to read. I was actually not aware of this syntax.


>> +            for param in params {
>> +                let ops = param_ops_path(&param.ptype);
>> +
>> +                self.emit_param("parmtype", &param.name, &param.ptype);
>> +                self.emit_param("parm", &param.name, &param.description);
>> +
>
> (Referring to below) could the template string body be indented
> another level, so it doesn't hang left of the parent `write!` macro?

Sure.

>
> Also - maybe `let pfx = format!("__{name}_{param_name}") here then use
> `{pfx}` in the template, since that sequence is repeated a lot.

Ok.

>
>> +                write!(
>> +                    self.param_buffer,
>> +                    "
>> +                static mut __{name}_{param_name}_value: {param_type} = {param_default};
>
> Ah.. we need to migrate from `static mut` to `UnsafeCell` wrappers at
> some point. Since `module!` already uses `static mut`, this may need
> to happen separately - meaning I don't think we need to block on
> making any change here.
>
> This would mean adding an `UnsafeSyncCell` / `RacyCell` (just a
> wrapper around `UnsafeCell` that always implements `Sync`), using
> `UnsafeSyncCell<{param_type}>` as the type here, and changing from
> `static mut` to just `static`.
>
> (I can take a look at doing this change for existing `static mut` in
> the near future).

It seems this discussion is happening on Zulip. Dropping a link to
discussion there for people not aware [1].

>
>> +                /// Newtype to make `bindings::kernel_param` `Sync`.
>> +                #[repr(transparent)]
>> +                struct __{name}_{param_name}_RacyKernelParam(::kernel::bindings::kernel_param);
>> +
>> +                // SAFETY: C kernel handles serializing access to this type. We
>> +                // never access from Rust module.
>> +                unsafe impl Sync for __{name}_{param_name}_RacyKernelParam {{ }}
>
> Could there just be a type for this in the kernel crate rather than
> creating one as part of the macro?

We can, but I do not think we gain anything? If we move it to the kernel
crate, we will have to mark it as `pub` and it will just clutter the
name space.

>
>> +                #[cfg(not(MODULE))]
>> +                const __{name}_{param_name}_name: *const ::core::ffi::c_char =
>> +                    b\"{name}.{param_name}\\0\" as *const _ as *const ::core::ffi::c_char;
>
> This should work as `c\"{name}.{param_name}\".as_ptr()` now

Cool.

>
>> +
>> +                #[cfg(MODULE)]
>> +                const __{name}_{param_name}_name: *const ::core::ffi::c_char =
>> +                    b\"{param_name}\\0\" as *const _ as *const ::core::ffi::c_char;
>
> Same as above.
>
>> +                #[link_section = \"__param\"]
>> +                #[used]
>> +                static __{name}_{param_name}_struct: __{name}_{param_name}_RacyKernelParam =
>> +                    __{name}_{param_name}_RacyKernelParam(::kernel::bindings::kernel_param {{
>> +                        name: __{name}_{param_name}_name,
>
> You could eliminate the above `__{name}_{param_name}_name` consts by using:
>
>     name: if cfg!(MODULE) {
>         c\"{param_name}\\"
>     } else {
>         c\"{name}.{param_name}\\"
>     }.as_ptr(),

Either way is fine for me.

>
>> +                        // SAFETY: `__this_module` is constructed by the kernel at load time
>> +                        // and will not be freed until the module is unloaded.
>> +                        #[cfg(MODULE)]
>> +                        mod_: unsafe {{ &::kernel::bindings::__this_module as *const _ as *mut _ }},
>
> Prefer `.cast::<T>().cast_mut()`


Changed to `addr_of_mut!`, but this is also subject to the discussion in
[1], `const_mut_refs` being flaky, right?

>
>> +                        #[cfg(not(MODULE))]
>> +                        mod_: ::core::ptr::null_mut(),
>> +                        ops: &{ops} as *const ::kernel::bindings::kernel_param_ops,
>> +                        perm: 0, // Will not appear in sysfs
>
> For my knowledge, what would be needed to make sysfs work?

We would set the flags field to allow read and/or write, and we would
need to apply proper locking around the parameter storage.

>
>> +                        level: -1,
>> +                        flags: 0,
>> +                        __bindgen_anon_1:
>> +                            ::kernel::bindings::kernel_param__bindgen_ty_1 {{
>> +                                // SAFETY: As this is evaluated in const context, it is
>> +                                // safe to take a reference to a mut static.
>> +                                arg: unsafe {{
>> +                                    ::core::ptr::addr_of_mut!(__{name}_{param_name}_value)
>> +                                 }}.cast::<::core::ffi::c_void>(),
>
> A note on this toward the end
>
>> +                            }},
>> +                    }});
>> +                ",
>
>> @@ -216,12 +387,14 @@ pub(crate) fn module(ts: TokenStream) -> TokenStream {
>
> Is a lot of below just some rewrapping and leading `::`? It may be
> good to split off tweaks to the existing code

Yes, absolute type paths were requested in v1. I will split those that
are not directly related to this change set.

>
>>              // SAFETY: `__this_module` is constructed by the kernel at load time and will not be
>>              // freed until the module is unloaded.
>>              #[cfg(MODULE)]
>> -            static THIS_MODULE: kernel::ThisModule = unsafe {{
>> -                kernel::ThisModule::from_ptr(&kernel::bindings::__this_module as *const _ as *mut _)
>> +            static THIS_MODULE: ::kernel::ThisModule = unsafe {{
>> +                ::kernel::ThisModule::from_ptr(
>> +                    &::kernel::bindings::__this_module as *const _ as *mut _
>> +                )
>>              }};
>>              #[cfg(not(MODULE))]
>> -            static THIS_MODULE: kernel::ThisModule = unsafe {{
>> -                kernel::ThisModule::from_ptr(core::ptr::null_mut())
>> +            static THIS_MODULE: ::kernel::ThisModule = unsafe {{
>> +                ::kernel::ThisModule::from_ptr(::core::ptr::null_mut())
>>              }};
>>
>>              // Double nested modules, since then nobody can access the public items inside.
>> @@ -276,7 +449,8 @@ mod __module_init {{
>>                      #[doc(hidden)]
>>                      #[link_section = \"{initcall_section}\"]
>>                      #[used]
>> -                    pub static __{name}_initcall: extern \"C\" fn() -> core::ffi::c_int = __{name}_init;
>> +                    pub static __{name}_initcall: extern \"C\" fn()
>> +                        -> ::core::ffi::c_int = __{name}_init;
>>
>>                      #[cfg(not(MODULE))]
>>                      #[cfg(CONFIG_HAVE_ARCH_PREL32_RELOCATIONS)]
>> @@ -291,7 +465,7 @@ mod __module_init {{
>>                      #[cfg(not(MODULE))]
>>                      #[doc(hidden)]
>>                      #[no_mangle]
>> -                    pub extern \"C\" fn __{name}_init() -> core::ffi::c_int {{
>> +                    pub extern \"C\" fn __{name}_init() -> ::core::ffi::c_int {{
>>                          // SAFETY: This function is inaccessible to the outside due to the double
>>                          // module wrapping it. It is called exactly once by the C side via its
>>                          // placement above in the initcall section.
>> @@ -314,8 +488,8 @@ mod __module_init {{
>>                      /// # Safety
>>                      ///
>>                      /// This function must only be called once.
>> -                    unsafe fn __init() -> core::ffi::c_int {{
>> -                        match <{type_} as kernel::Module>::init(&super::super::THIS_MODULE) {{
>> +                    unsafe fn __init() -> ::core::ffi::c_int {{
>> +                        match <{type_} as ::kernel::Module>::init(&super::super::THIS_MODULE) {{
>>                              Ok(m) => {{
>>                                  // SAFETY: No data race, since `__MOD` can only be accessed by this
>>                                  // module and there only `__init` and `__exit` access it. These
>> @@ -346,14 +520,17 @@ unsafe fn __exit() {{
>>                              __MOD = None;
>>                          }}
>>                      }}
>> -
>>                      {modinfo}
>>                  }}
>>              }}
>> +            mod module_parameters {{
>> +                {params}
>> +            }}
>>          ",
>>          type_ = info.type_,
>>          name = info.name,
>>          modinfo = modinfo.buffer,
>> +        params = modinfo.param_buffer,
>>          initcall_section = ".initcall6.init"
>>      )
>>      .parse()
>
>> diff --git a/samples/rust/rust_minimal.rs b/samples/rust/rust_minimal.rs
>> index 2a9eaab62d1c..d9bc2218d504 100644
>> --- a/samples/rust/rust_minimal.rs
>> +++ b/samples/rust/rust_minimal.rs
>> @@ -10,6 +10,12 @@
>>      author: "Rust for Linux Contributors",
>>      description: "Rust minimal sample",
>>      license: "GPL",
>> +    params: {
>> +        test_parameter: i64 {
>> +            default: 1,
>> +            description: "This parameter has a default of 1",
>> +        },
>> +    },
>>  }
>
> I was going to suggest updating the sample then saw you did that
> already, thanks :)
>
>> diff --git a/scripts/Makefile.build b/scripts/Makefile.build
>> index efacca63c897..a65bd0233843 100644
>> --- a/scripts/Makefile.build
>> +++ b/scripts/Makefile.build
>> @@ -263,7 +263,7 @@ $(obj)/%.lst: $(obj)/%.c FORCE
>>  # Compile Rust sources (.rs)
>>  # ---------------------------------------------------------------------------
>>
>> -rust_allowed_features := new_uninit
>> +rust_allowed_features := new_uninit,const_mut_refs
>
> We shouldn't enable `const_mut_refs`. It is indeed close to
> stabilization, but it is still kind of churny right now and we don't
> want to enable the sharp edges everywhere.
>
> If the change from `static mut` to `UnsafeCell` that I mentioned above
> happens, `addr_of_mut!` turns into a `.get().cast::<...>()` takes the
> place of `addr_of_mut!` and doesn't require this feature (and also
> isn't unsafe).
>
> If you prefer not to make that change, I think
> `addr_of!(...).cast_mut()` might be the best solution.

>From the discussion at [1] it sounds to me like `static UnsafeCell` is the way
forward. I can apply that for the `static mut` instances this patch
introduces, and I can wait for `module` to be patched with that change
before resending this. No rush.

>
> ---
>
> Other thought: would a wrapper type make more sense here? Something like this:
>
> ```
> /* implementation */
> struct ModParam<T>(UnsafeCell<T>);
>
> // `Parameter` is the existing `ModParameter` (could be
> // any name). It could be sealed.
> impl<T: Parameter> ModParam<T> {
>     #[doc(hidden)] // used in the macro
>     fn new(value: T) -> Self { ... }
>
>     fn get(&self) -> T::Value { ... }
>     fn set(&self, v: T::Value) { ... }
> }
> ```
>
> With usage:
>
> ```
> module! {
>     // ...
>     // instantiated as:
>     // `static MY_PARAM: ModParam<i64> = ModParam::new(1);`
>     MY_PARAM: i64 {
>         default: 1,
>         description: "foo",
>     },
> }
>
> fn foo() {
>     pr_info!("My param is {}", MY_PARAM.get());
> }
> ```
>
> Advantages I see:
> - You bring your own name, rather than being scoped and needing to
> remember the module name

The scoping is intentional. We did not want to pollute the name space,
and `pin_init` had some issues when names were clashing. I think maybe
this was discussed in v1.

> - You can check `ModParam` in the docs to see the API, rather than
> needing to refer to source for the exact signatures of `read` and
> `write`

Yes, that is a bit annoying. We could also implement `read` as a trait
impl, so the trait would be documented in the `kernel` crate.

In the original code that this patch descends from, `read` was generated
differently, depending on how the parameter would integrate with sysfs.
We might be able to do that with generics withthe `ModParam` you suggest.

> - The interface gets a bit more like a mutex or atomic

We discussed passing `self` in v1 and dropped it in favor of the
associated method accessor. It would make sense with the `ModParam`
approach you suggest though.


Best regards,
Andreas


[1] https://rust-for-linux.zulipchat.com/#narrow/stream/x/topic/x/with/465040682



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ