linux-kernel - Re: [PATCH 1/3] rust: add untrusted data abstraction

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ZvJy0aZL5I339r6j@phenom.ffwll.local>
Date: Tue, 24 Sep 2024 10:05:37 +0200
From: Simona Vetter <simona.vetter@...ll.ch>
To: Benno Lossin <benno.lossin@...ton.me>
Cc: Greg KH <gregkh@...uxfoundation.org>, Miguel Ojeda <ojeda@...nel.org>,
	Alex Gaynor <alex.gaynor@...il.com>,
	Wedson Almeida Filho <wedsonaf@...il.com>,
	Boqun Feng <boqun.feng@...il.com>, Gary Guo <gary@...yguo.net>,
	Björn Roy Baron <bjorn3_gh@...tonmail.com>,
	Andreas Hindborg <a.hindborg@...sung.com>,
	Alice Ryhl <aliceryhl@...gle.com>, Trevor Gross <tmgross@...ch.edu>,
	rust-for-linux@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/3] rust: add untrusted data abstraction

On Mon, Sep 23, 2024 at 04:56:08PM +0000, Benno Lossin wrote:
> On 23.09.24 18:08, Simona Vetter wrote:
> > On Sat, Sep 21, 2024 at 07:45:44AM +0000, Benno Lossin wrote:
> >> On 16.09.24 17:49, Simona Vetter wrote:
> >>> On Fri, Sep 13, 2024 at 04:49:29PM +0000, Benno Lossin wrote:
> >>>> On 13.09.24 17:33, Simona Vetter wrote:
> >>>>> On Fri, Sep 13, 2024 at 11:26:57AM +0000, Benno Lossin wrote:
> >>>>>>  /// Used to transfer ownership to and from foreign (non-Rust) languages.
> >>>>>> @@ -532,3 +533,248 @@ unsafe impl AsBytes for str {}
> >>>>>>  // does not have any uninitialized portions either.
> >>>>>>  unsafe impl<T: AsBytes> AsBytes for [T] {}
> >>>>>>  unsafe impl<T: AsBytes, const N: usize> AsBytes for [T; N] {}
> >>>>>> +
> >>>>>> +/// Untrusted data of type `T`.
> >>>>>> +///
> >>>>>> +/// When reading data from userspace, hardware or other external untrusted sources, the data must
> >>>>>> +/// be validated before it is used for logic within the kernel. To do so, the [`validate()`]
> >>>>>> +/// function exists and uses the [`Validator`] trait. For raw bytes validation there also is the
> >>>>>> +/// [`validate_bytes()`] function.
> >>>>>> +///
> >>>>>> +///
> >>>>>> +/// [`validate()`]: Self::validate
> >>>>>> +/// [`validate_bytes()`]: Self::validate_bytes
> >>>>>> +#[repr(transparent)]
> >>>>>> +pub struct Untrusted<T: ?Sized>(T);
> >>>>>
> >>>>> I think we could make this a bit more strict and instead of encouraging
> >>>>> people to put all their validation code into a Validator<T>, force
> >>>>> them to. Which means assuming apis wrap all untrusted stuff in
> >>>>> Untrusted<T> reviewers only need to look at validate implementations and
> >>>>> nothing else.
> >>>>>
> >>>>> If I'm not too wrong with my rust I think this would work with slitting
> >>>>> Untrusted<T> into two types:
> >>>>>
> >>>>> - Untrusted<T> is just a box you can't access, and the type returned by
> >>>>>   apis for untrusted data. It doesn't have any of the functions except
> >>>>>   validate and new_untrusted so that you can't get at the data at all.
> >>>>>
> >>>>> - UntrustedUnsafe<T> does have all the accessors needed for validation,
> >>>>>   but you can't construct that outside of this module. Maybe simplest to
> >>>>>   just nest this within the Untrusted<T> box.
> >>>>>
> >>>>> - Untrusted::validate does the unboxing and passes a reference to
> >>>>>   UntrustedUnsafe<T> to Validator::validate, to guarantee that all the
> >>>>>   validation code is in there and nowhere else. Similar for
> >>>>>   validate_bytes.
> >>>>>
> >>>>> Of course people can still bypass this easily, but it gets a bit harder to
> >>>>> do so, and easier for reviewers to verify everything.
> >>>>
> >>>> This is a great idea!
> >>>> I think we should also remove `validate_bytes`, then you really must
> >>>> implement `Validator`. If we figure out later that people need it, we
> >>>> can add it later again. I added it because I thought that implementing
> >>>> `Validator` for something very small might be very annoying, but it
> >>>> seems like you actually want to force people to implement it :)
> >>>
> >>> See further down, I think there's a real use-case for validate_bytes, or
> >>> something really close to it at least.
> >>
> >> Could you point me to it? I wasn't able to find the use-case you are
> >> referring to here.
> >>
> >> If I remove the `untrusted()` function, then I would think that I also
> >> should remove the `validate_bytes` function, since you could just do:
> >>
> >>     untrusted.validate_bytes(|_| true)
> >>
> >> Which would give you access to the untrusted data without validation.
> >> (of course one could also do this in the `Validate` trait, but I feel
> >> like if we have exactly one place where this can happen, it will be a
> >> lot easier to catch. We could even have some tool that looks for
> >> `Validate` implementations that just return `Ok(untrusted)`)
> > 
> > So the use-case I've meant is the in-place validation, without copying
> > stuff around. So it wouldn't be the specific validate_bytes, but a
> > validate_inplace. In my original email I did directly call it that, but it
> > took me a w/e to realize that validate_bytes is really just a special-case
> > of validate_inplace when T = [u8], and so really we only need
> > validate_inplace for every case where we do some in-place validation.
> 
> Ahh I see, then we don't need `validate_bytes`.
> 
> > And yes validate_inplace would be a very easy minimal escape hatch to just
> > unbox the unvalidated data. But fundamentally we can't prevent that, all
> > we can do is make sure that the validation code is concentrated in very
> > specific places reviewers can easily catch, and |_| true probably not
> > being correct validation code is rather easy to spot.
> 
> I don't think that it necessarily has to be an escape hatch. I think I
> can design an API where you can choose if it will be done in-place vs
> copied, while still using the `Validate` trait.

Yeah if you can put both copy and in-place validation into the Validate
trait then might be cleaner, I just didn't come up with a way to do that.

> > I do think we need inplace validation for the fairly common case where the
> > copy_from_user is split from the validation, e.g. when you copy a big
> > chunk of memory which is composed itself of variable-length objects that
> > reside within: You'd do one big copy_from_user and then a pile of inplace
> > validations for each thing that's in there, maybe with the assistance of
> > alice's Range module to make sure you don't split up the overall memory
> > area badly into individual pieces.
> 
> Yes, will definitely add it.

Sounds good!
-Sima
-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch