lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aD2Ge8RM1uTT726z@pollux>
Date: Mon, 2 Jun 2025 13:09:47 +0200
From: Danilo Krummrich <dakr@...nel.org>
To: Lyude Paul <lyude@...hat.com>, Alexandre Courbot <acourbot@...dia.com>
Cc: Miguel Ojeda <ojeda@...nel.org>, Alex Gaynor <alex.gaynor@...il.com>,
	Boqun Feng <boqun.feng@...il.com>, Gary Guo <gary@...yguo.net>,
	Björn Roy Baron <bjorn3_gh@...tonmail.com>,
	Benno Lossin <benno.lossin@...ton.me>,
	Andreas Hindborg <a.hindborg@...nel.org>,
	Alice Ryhl <aliceryhl@...gle.com>, Trevor Gross <tmgross@...ch.edu>,
	David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
	Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
	Maxime Ripard <mripard@...nel.org>,
	Thomas Zimmermann <tzimmermann@...e.de>,
	John Hubbard <jhubbard@...dia.com>, Ben Skeggs <bskeggs@...dia.com>,
	Joel Fernandes <joelagnelf@...dia.com>,
	Timur Tabi <ttabi@...dia.com>, Alistair Popple <apopple@...dia.com>,
	linux-kernel@...r.kernel.org, rust-for-linux@...r.kernel.org,
	nouveau@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org
Subject: Re: [PATCH v4 13/20] gpu: nova-core: register sysmem flush page

On Fri, May 30, 2025 at 05:57:44PM -0400, Lyude Paul wrote:
> On Wed, 2025-05-21 at 15:45 +0900, Alexandre Courbot wrote:
> > Reserve a page of system memory so sysmembar can perform a read on it if
> > a system write occurred since the last flush. Do this early as it can be
> > required to e.g. reset the GPU falcons.
> > 
> > Signed-off-by: Alexandre Courbot <acourbot@...dia.com>
> > ---
> >  drivers/gpu/nova-core/gpu.rs  | 45 +++++++++++++++++++++++++++++++++++++++++--
> >  drivers/gpu/nova-core/regs.rs | 10 ++++++++++
> >  2 files changed, 53 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
> > index 50417f608dc7b445958ae43444a13c7593204fcf..a4e2cf1b529cc25fc168f68f9eaa6f4a7a9748eb 100644
> > --- a/drivers/gpu/nova-core/gpu.rs
> > +++ b/drivers/gpu/nova-core/gpu.rs
> > @@ -2,6 +2,7 @@
> >  
> >  use kernel::{device, devres::Devres, error::code::*, pci, prelude::*};
> >  
> > +use crate::dma::DmaObject;
> >  use crate::driver::Bar0;
> >  use crate::firmware::{Firmware, FIRMWARE_VERSION};
> >  use crate::gfw;
> > @@ -158,12 +159,32 @@ fn new(bar: &Bar0) -> Result<Spec> {
> >  }
> >  
> >  /// Structure holding the resources required to operate the GPU.
> > -#[pin_data]
> > +#[pin_data(PinnedDrop)]
> >  pub(crate) struct Gpu {
> >      spec: Spec,
> >      /// MMIO mapping of PCI BAR 0
> >      bar: Devres<Bar0>,
> >      fw: Firmware,
> > +    /// System memory page required for flushing all pending GPU-side memory writes done through
> > +    /// PCIE into system memory.
> > +    sysmem_flush: DmaObject,
> > +}
> > +
> > +#[pinned_drop]
> > +impl PinnedDrop for Gpu {
> > +    fn drop(self: Pin<&mut Self>) {
> > +        // Unregister the sysmem flush page before we release it.
> > +        let _ = self.bar.try_access_with(|b| {
> > +            regs::NV_PFB_NISO_FLUSH_SYSMEM_ADDR::default()
> > +                .set_adr_39_08(0)
> > +                .write(b);
> > +            if self.spec.chipset >= Chipset::GA102 {
> > +                regs::NV_PFB_NISO_FLUSH_SYSMEM_ADDR_HI::default()
> > +                    .set_adr_63_40(0)
> > +                    .write(b);
> > +            }
> > +        });
> > +    }

Sorry that I haven't noticed this before -- I think this should be self
contained in a new type (e.g. SysmemFlush).

We should also move this kind of cleanup into the Driver::remove() callback,
where we still have a bound device, to avoid try_access_with().

I already have this on my list to implement for quite a while, because I wasn't
quite sure yet what's the best way to approach this, but I think the simple
remove() callback to perform tear down operations on device resources is fine.

I'll prepare the corresponding patches and subsequently rework those bits
accordingly.

> >  }
> >  
> >  impl Gpu {
> > @@ -187,10 +208,30 @@ pub(crate) fn new(
> >          gfw::wait_gfw_boot_completion(bar)
> >              .inspect_err(|_| dev_err!(pdev.as_ref(), "GFW boot did not complete"))?;
> >  
> > +        // System memory page required for sysmembar to properly flush into system memory.
> > +        let sysmem_flush = {
> > +            let page = DmaObject::new(pdev.as_ref(), kernel::bindings::PAGE_SIZE)?;
> > +
> > +            // Register the sysmem flush page.
> > +            let handle = page.dma_handle();
> > +
> > +            regs::NV_PFB_NISO_FLUSH_SYSMEM_ADDR::default()
> > +                .set_adr_39_08((handle >> 8) as u32)
> > +                .write(bar);
> > +            if spec.chipset >= Chipset::GA102 {
> > +                regs::NV_PFB_NISO_FLUSH_SYSMEM_ADDR_HI::default()
> > +                    .set_adr_63_40((handle >> 40) as u32)
> > +                    .write(bar);
> > +            }
> > +
> 
> Small nit - would it make sense for us to just add a function for initiating a
> sysmem memory flush that you could pass the bar to? Seems like it might be a
> bit less error prone if we end up having to do this elsewhere

Agreed -- but let's solve this with a new type and make it a method instead.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ