Message-ID: <20250307165751.GT354511@nvidia.com>
Date: Fri, 7 Mar 2025 12:57:51 -0400
From: Jason Gunthorpe <jgg@...dia.com>
To: Danilo Krummrich <dakr@...nel.org>
Cc: Abdiel Janulgue <abdiel.janulgue@...il.com>, aliceryhl@...gle.com,
robin.murphy@....com, daniel.almeida@...labora.com,
rust-for-linux@...r.kernel.org, Miguel Ojeda <ojeda@...nel.org>,
Alex Gaynor <alex.gaynor@...il.com>,
Boqun Feng <boqun.feng@...il.com>, Gary Guo <gary@...yguo.net>,
Björn Roy Baron <bjorn3_gh@...tonmail.com>,
Benno Lossin <benno.lossin@...ton.me>,
Andreas Hindborg <a.hindborg@...nel.org>,
Trevor Gross <tmgross@...ch.edu>,
Valentin Obst <kernel@...entinobst.de>,
open list <linux-kernel@...r.kernel.org>,
Christoph Hellwig <hch@....de>,
Marek Szyprowski <m.szyprowski@...sung.com>, airlied@...hat.com,
"open list:DMA MAPPING HELPERS" <iommu@...ts.linux.dev>
Subject: Re: [PATCH v12 2/3] rust: add dma coherent allocator abstraction.
On Fri, Mar 07, 2025 at 05:09:17PM +0100, Danilo Krummrich wrote:
> On Fri, Mar 07, 2025 at 08:48:09AM -0400, Jason Gunthorpe wrote:
> > On Fri, Mar 07, 2025 at 09:50:07AM +0100, Danilo Krummrich wrote:
> > > > The actual critical region extends into the HW itself, it is not
> > > > simple to model this with a pure SW construct of bracketing some
> > > > allocation. You need to bracket the *entire lifecycle* of the
> > > > dma_addr_t that has been returned and passed into HW, until the
> > > > dma_addr_t is removed from HW.
> > >
> > > Devres callbacks run after remove(). It's the drivers job to stop operating the
> > > device latest in remove(). Which means that the design is correct.
> >
> > It could be the drivers job to unmap the dma as well if you take that
> > logic.
>
> I really don't understand what you want: *You* brought up that the
> CoherentAllocation is not allowed to out-live driver unbind.

Really? I don't want you to use revoke to solve these problems when
the kernel design pattern is fence.

I thought that was clear.

> > You still didn't answer the question, what is the critical region of
> > the DevRes for a dma_alloc_coherent() actually going to protect?
>
> Devres, just like in C, ensures that an object can't out-live driver unbind. The
> RCU read side critical section is to revoke access to the then invalid pointer
> of the object.
>
> C leaves you with an invalid pointer, whereas Rust revokes the access to the
> invalid pointer for safety reasons. The pointer is never written to, except for
> on driver unbind, hence RCU.
>
> We discussed all this in other threads already.

Why are you explaining very simple concepts as though I do not
understand how RCU or devm works?

I asked you what you intend to protect with the critical region.

I believe you intend to wrap every memcpy/etc of the allocated
coherent memory with an RCU critical section, correct?

Meaning something like:

    mem.ptr = dma_alloc_coherent(&handle)
    make_hw_do_dma(handle)
    start RCU critical section on mem:
        copy_to_user(mem.ptr) // Sleeps! Can't do it!
    dma_free_coherent(mem, handle)

Right?

Further, if the critical section ever fails to obtain mem.ptr the
above code is *BUGGY* because it has left a HW DMA running, UAF'd the
now free'd buffer *and the driver author cannot fix it*.
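
The hole described above can be illustrated with a small userspace
sketch. Everything in it is a hypothetical stand-in invented for
illustration (MockHw, Revocable, try_access, revoke_scenario); none of
it is the kernel's API, and a Mutex stands in for the RCU-based
mechanism under discussion. It only shows the shape of the problem:
once access is revoked, the CPU side fails cleanly while the mock
device is still marked as doing DMA.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for a device that was handed a DMA address.
struct MockHw {
    dma_running: bool,
}

/// Hypothetical revocable wrapper (a userspace stand-in, not the
/// kernel's type): access yields `None` once the value is revoked.
struct Revocable<T> {
    inner: Mutex<Option<T>>,
}

impl<T> Revocable<T> {
    fn new(value: T) -> Self {
        Self { inner: Mutex::new(Some(value)) }
    }

    /// Drops the wrapped value, as driver unbind would free the buffer.
    fn revoke(&self) {
        self.inner.lock().unwrap().take();
    }

    /// Runs `f` on the value if it has not been revoked yet.
    fn try_access<R>(&self, f: impl FnOnce(&T) -> R) -> Option<R> {
        let guard = self.inner.lock().unwrap();
        guard.as_ref().map(f)
    }
}

/// Returns (result of the CPU access, whether the mock HW still runs DMA).
fn revoke_scenario() -> (Option<u8>, bool) {
    let mut hw = MockHw { dma_running: false };
    let buf = Revocable::new(vec![0u8; 16]);

    // make_hw_do_dma(handle): the device now owns a reference to the buffer.
    hw.dma_running = true;

    // Driver unbind: the buffer is revoked (freed) behind the driver's back.
    buf.revoke();

    // The CPU-side access fails safely, but nothing stopped the device.
    let read = buf.try_access(|b| b[0]);
    (read, hw.dma_running)
}
```

Calling revoke_scenario() in this sketch yields a failed CPU access
together with a device that is still flagged as running, which is the
situation the driver code has no way to recover from at that point.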

This is an API design that is impossible for a driver author to use
correctly.

Even worse, it actively discourages the driver author from thinking
about the lifetime issues at work here because it has this magical
critical section that is advertised to provide safety, but actually has a
great big hole in it that the driver author has to understand and
mitigate.

I don't care one bit if the HW UAF issue is in scope or out for Rust -
I *EXPECT* driver authors to prevent it regardless.

> It should prevent all safety related bug, but the one above is impossible to
> solve, so we have to live with it.
You have to live with it, but you should not *ignore* it and should
try to make the problem visible to the driver author and provide
assistance to implement the correct design patterns that do address
it.
Revoke is doing the opposite in my opinion.
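
For contrast, one possible way to make the ordering visible to a
driver author is sketched below in userspace Rust. This is an
assumption-laden illustration, not a proposal from this thread and not
an existing kernel API: MockHw, Quiesced, CoherentBuf and
fence_scenario are all hypothetical names. The idea shown is that the
buffer can only be freed by presenting a token that is produced by
stopping the device first.

```rust
/// Hypothetical stand-in for a device that can run DMA.
struct MockHw {
    dma_running: bool,
}

/// Hypothetical proof token: only `MockHw::fence` can construct it.
struct Quiesced {
    _private: (),
}

impl MockHw {
    /// Models handing the DMA address to the device.
    fn start_dma(&mut self, _handle: u64) {
        self.dma_running = true;
    }

    /// Models stopping the device and waiting for in-flight DMA to end.
    fn fence(&mut self) -> Quiesced {
        self.dma_running = false;
        Quiesced { _private: () }
    }
}

/// Hypothetical stand-in for a coherent allocation.
struct CoherentBuf {
    cpu: Vec<u8>,
    handle: u64,
}

impl CoherentBuf {
    fn alloc(len: usize) -> Self {
        // The handle value is made up; a real one comes from the DMA API.
        Self { cpu: vec![0; len], handle: 0x1000 }
    }

    /// Freeing requires proof that the device was fenced first.
    fn free(self, _proof: Quiesced) {
        drop(self.cpu);
    }
}

/// Returns whether the mock HW is still running DMA after teardown.
fn fence_scenario() -> bool {
    let mut hw = MockHw { dma_running: false };
    let buf = CoherentBuf::alloc(16);
    hw.start_dma(buf.handle);

    // Teardown order is forced by the types: fence first, then free.
    let proof = hw.fence();
    buf.free(proof);
    hw.dma_running
}
```

The sketch deliberately leaves gaps: the token is not tied to a
particular device or buffer, and nothing prevents the buffer from
simply being dropped. It is only meant to show the fence-then-free
ordering being surfaced in the API rather than hidden.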

In any event, I have to leave the keyboard for some travel, so this
will probably be my last posting on this topic.

Regards,
Jason