Message-ID: <aD2Ge8RM1uTT726z@pollux>
Date: Mon, 2 Jun 2025 13:09:47 +0200
From: Danilo Krummrich <dakr@...nel.org>
To: Lyude Paul <lyude@...hat.com>, Alexandre Courbot <acourbot@...dia.com>
Cc: Miguel Ojeda <ojeda@...nel.org>, Alex Gaynor <alex.gaynor@...il.com>,
Boqun Feng <boqun.feng@...il.com>, Gary Guo <gary@...yguo.net>,
Björn Roy Baron <bjorn3_gh@...tonmail.com>,
Benno Lossin <benno.lossin@...ton.me>,
Andreas Hindborg <a.hindborg@...nel.org>,
Alice Ryhl <aliceryhl@...gle.com>, Trevor Gross <tmgross@...ch.edu>,
David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
Maxime Ripard <mripard@...nel.org>,
Thomas Zimmermann <tzimmermann@...e.de>,
John Hubbard <jhubbard@...dia.com>, Ben Skeggs <bskeggs@...dia.com>,
Joel Fernandes <joelagnelf@...dia.com>,
Timur Tabi <ttabi@...dia.com>, Alistair Popple <apopple@...dia.com>,
linux-kernel@...r.kernel.org, rust-for-linux@...r.kernel.org,
nouveau@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org
Subject: Re: [PATCH v4 13/20] gpu: nova-core: register sysmem flush page

On Fri, May 30, 2025 at 05:57:44PM -0400, Lyude Paul wrote:
> On Wed, 2025-05-21 at 15:45 +0900, Alexandre Courbot wrote:
> > Reserve a page of system memory so sysmembar can perform a read on it if
> > a system write occurred since the last flush. Do this early as it can be
> > required to e.g. reset the GPU falcons.
> >
> > Signed-off-by: Alexandre Courbot <acourbot@...dia.com>
> > ---
> > drivers/gpu/nova-core/gpu.rs | 45 +++++++++++++++++++++++++++++++++++++++++--
> > drivers/gpu/nova-core/regs.rs | 10 ++++++++++
> > 2 files changed, 53 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
> > index 50417f608dc7b445958ae43444a13c7593204fcf..a4e2cf1b529cc25fc168f68f9eaa6f4a7a9748eb 100644
> > --- a/drivers/gpu/nova-core/gpu.rs
> > +++ b/drivers/gpu/nova-core/gpu.rs
> > @@ -2,6 +2,7 @@
> >
> > use kernel::{device, devres::Devres, error::code::*, pci, prelude::*};
> >
> > +use crate::dma::DmaObject;
> > use crate::driver::Bar0;
> > use crate::firmware::{Firmware, FIRMWARE_VERSION};
> > use crate::gfw;
> > @@ -158,12 +159,32 @@ fn new(bar: &Bar0) -> Result<Spec> {
> > }
> >
> >  /// Structure holding the resources required to operate the GPU.
> > -#[pin_data]
> > +#[pin_data(PinnedDrop)]
> >  pub(crate) struct Gpu {
> >      spec: Spec,
> >      /// MMIO mapping of PCI BAR 0
> >      bar: Devres<Bar0>,
> >      fw: Firmware,
> > +    /// System memory page required for flushing all pending GPU-side memory writes done through
> > +    /// PCIE into system memory.
> > +    sysmem_flush: DmaObject,
> > +}
> > +
> > +#[pinned_drop]
> > +impl PinnedDrop for Gpu {
> > +    fn drop(self: Pin<&mut Self>) {
> > +        // Unregister the sysmem flush page before we release it.
> > +        let _ = self.bar.try_access_with(|b| {
> > +            regs::NV_PFB_NISO_FLUSH_SYSMEM_ADDR::default()
> > +                .set_adr_39_08(0)
> > +                .write(b);
> > +            if self.spec.chipset >= Chipset::GA102 {
> > +                regs::NV_PFB_NISO_FLUSH_SYSMEM_ADDR_HI::default()
> > +                    .set_adr_63_40(0)
> > +                    .write(b);
> > +            }
> > +        });
> > +    }

Sorry that I haven't noticed this before -- I think this should be
self-contained in a new type (e.g. SysmemFlush).

We should also move this kind of cleanup into the Driver::remove() callback,
where we still have a bound device, to avoid try_access_with().

This has been on my list to implement for quite a while; I wasn't quite sure
what the best approach would be, but I think a simple remove() callback that
performs teardown operations on device resources is fine.

I'll prepare the corresponding patches and subsequently rework those bits
accordingly.

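
A minimal sketch of that direction, not the actual rework. Everything here is
hypothetical: MockBar stands in for the Bar0 mapping and the generated register
types, the SysmemFlush methods are invented names, and the real type would also
own the DmaObject. It only illustrates that an explicit unregister step taking
a live BAR reference (as remove() could provide) needs no fallible access in
Drop.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for the BAR0 MMIO mapping. The real driver writes
/// through the register types in regs.rs instead.
#[derive(Default)]
struct MockBar {
    flush_addr: Cell<u32>,    // models NV_PFB_NISO_FLUSH_SYSMEM_ADDR (adr_39_08)
    flush_addr_hi: Cell<u32>, // models NV_PFB_NISO_FLUSH_SYSMEM_ADDR_HI (adr_63_40)
}

/// Hypothetical owner of the sysmem flush page registration.
/// In the driver this would also hold the DmaObject backing the page.
struct SysmemFlush {
    /// Whether the chipset has the ADDR_HI register (GA102 and later in the patch).
    has_addr_hi: bool,
}

impl SysmemFlush {
    /// Programs the flush page address and returns the registration token.
    fn register(bar: &MockBar, dma_handle: u64, has_addr_hi: bool) -> Self {
        bar.flush_addr.set((dma_handle >> 8) as u32);
        if has_addr_hi {
            bar.flush_addr_hi.set((dma_handle >> 40) as u32);
        }
        Self { has_addr_hi }
    }

    /// Clears the registers. Takes `self` by value so the page cannot be
    /// unregistered twice, and takes the BAR directly, which is what a
    /// remove() callback with a bound device could hand in.
    fn unregister(self, bar: &MockBar) {
        bar.flush_addr.set(0);
        if self.has_addr_hi {
            bar.flush_addr_hi.set(0);
        }
    }
}

fn main() {
    let bar = MockBar::default();

    // "probe": register the page.
    let flush = SysmemFlush::register(&bar, 0x12_3456_7000, true);
    assert_eq!(bar.flush_addr.get(), 0x1234_5670);

    // "remove": tear down while the BAR is still accessible.
    flush.unregister(&bar);
    assert_eq!(bar.flush_addr.get(), 0);
    println!("sysmem flush page registered and unregistered");
}
```

Consuming self in unregister() is one possible design choice; it makes a double
teardown a compile-time error rather than something to guard against at runtime.
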
> > }
> >
> > impl Gpu {
> > @@ -187,10 +208,30 @@ pub(crate) fn new(
> >          gfw::wait_gfw_boot_completion(bar)
> >              .inspect_err(|_| dev_err!(pdev.as_ref(), "GFW boot did not complete"))?;
> >
> > +        // System memory page required for sysmembar to properly flush into system memory.
> > +        let sysmem_flush = {
> > +            let page = DmaObject::new(pdev.as_ref(), kernel::bindings::PAGE_SIZE)?;
> > +
> > +            // Register the sysmem flush page.
> > +            let handle = page.dma_handle();
> > +
> > +            regs::NV_PFB_NISO_FLUSH_SYSMEM_ADDR::default()
> > +                .set_adr_39_08((handle >> 8) as u32)
> > +                .write(bar);
> > +            if spec.chipset >= Chipset::GA102 {
> > +                regs::NV_PFB_NISO_FLUSH_SYSMEM_ADDR_HI::default()
> > +                    .set_adr_63_40((handle >> 40) as u32)
> > +                    .write(bar);
> > +            }
> > +
> > +
>
> Small nit - would it make sense for us to just add a function for initiating a
> sysmem memory flush that you could pass the bar to? Seems like it might be a
> bit less error prone if we end up having to do this elsewhere

Agreed -- but let's solve this with a new type and make it a method instead.
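
Whatever shape the method takes, it will carry the address split from the
quoted hunk. A small worked example of that arithmetic follows; the helper
names split_flush_addr and join_flush_addr are hypothetical and the handle
value is made up. It assumes the field names mean what they suggest: adr_39_08
holds address bits 39:8 and adr_63_40 holds bits 63:40.

```rust
/// Splits a DMA address the way the quoted hunk does (hypothetical helper).
fn split_flush_addr(handle: u64) -> (u32, u32) {
    // `as u32` keeps the low 32 bits of the shifted value, i.e. bits 39:8.
    let lo = (handle >> 8) as u32;
    // After shifting by 40 only 24 bits remain, i.e. bits 63:40.
    let hi = (handle >> 40) as u32;
    (lo, hi)
}

/// Rebuilds the address from both fields (hypothetical helper).
/// Bits 7:0 are not stored in either field, so they come back as zero.
fn join_flush_addr(lo: u32, hi: u32) -> u64 {
    ((hi as u64) << 40) | ((lo as u64) << 8)
}

fn main() {
    // Made-up page-aligned handle: the low 12 bits are zero.
    let handle: u64 = 0x0123_4567_89AB_C000;

    let (lo, hi) = split_flush_addr(handle);
    assert_eq!(lo, 0x6789_ABC0);
    assert_eq!(hi, 0x0001_2345);

    // The round trip is lossless because the low 8 bits were already zero.
    assert_eq!(join_flush_addr(lo, hi), handle);
    println!("lo = {lo:#010x}, hi = {hi:#010x}");
}
```

Since the page comes from a PAGE_SIZE allocation, its low 8 bits are zero and
nothing is lost by the shift. On chipsets where only the low register is
written, bits 63:40 are simply not programmed.
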