Message-Id: <D80AK2CLL4AZ.1G6R7OBHOF08O@nvidia.com>
Date: Mon, 24 Feb 2025 10:40:00 +0900
From: "Alexandre Courbot" <acourbot@...dia.com>
To: "Dave Airlie" <airlied@...il.com>, "Danilo Krummrich" <dakr@...nel.org>,
 "Joel Fernandes" <joel@...lfernandes.org>, "Boqun Feng"
 <boqun.feng@...il.com>
Cc: "John Hubbard" <jhubbard@...dia.com>, "Ben Skeggs" <bskeggs@...dia.com>,
 <linux-kernel@...r.kernel.org>, <rust-for-linux@...r.kernel.org>,
 <nouveau@...ts.freedesktop.org>, <dri-devel@...ts.freedesktop.org>
Subject: Re: [RFC PATCH 0/3] gpu: nova-core: add basic timer subdevice
 implementation

On Tue Feb 18, 2025 at 10:46 AM JST, Dave Airlie wrote:
>> 1. How to avoid unnecessary calls to try_access().
>>
>> This is why I made Boot0.read() take a &RevocableGuard<'_, Bar0> as argument. I
>> think we can just call try_access() once and then propagate the guard through the
>> callchain, where necessary.
>
> Nope, you can't do that, RevocableGuard holds a lock and things
> explode badly in lockdep if you do.
>
> [ 39.960247] =============================
> [ 39.960265] [ BUG: Invalid wait context ]
> [ 39.960282] 6.12.0-rc2+ #151 Not tainted
> [ 39.960298] -----------------------------
> [ 39.960316] modprobe/2006 is trying to lock:
> [ 39.960335] ffffa08dd7783a68
> (drivers/gpu/nova-core/gsp/sharedq.rs:259){....}-{3:3}, at:
> _RNvMs0_NtNtCs6v51TV2h8sK_6nova_c3gsp7sharedqNtB5_26GSPSharedQueuesr535_113_018rpc_push+0x34/0x4c0
> [nova_core]
> [ 39.960413] other info that might help us debug this:
> [ 39.960434] context-{4:4}
> [ 39.960447] 2 locks held by modprobe/2006:
> [ 39.960465] #0: ffffa08dc27581b0 (&dev->mutex){....}-{3:3}, at:
> __driver_attach+0x111/0x260
> [ 39.960505] #1: ffffffffad55ac10 (rcu_read_lock){....}-{1:2}, at:
> rust_helper_rcu_read_lock+0x11/0x80
> [ 39.960545] stack backtrace:
> [ 39.960559] CPU: 8 UID: 0 PID: 2006 Comm: modprobe Not tainted 6.12.0-rc2+ #151
> [ 39.960586] Hardware name: System manufacturer System Product
> Name/PRIME X370-PRO, BIOS 6231 08/31/2024
> [ 39.960618] Call Trace:
> [ 39.960632] <TASK>
>
> was one time I didn't drop a revocable before proceeding to do other things,

This inability to sleep while we are accessing registers seems very
constraining to me, if not dangerous. It is pretty common to have
functions intermingle hardware accesses with other operations that might
sleep, and this constraint means that in such cases the caller would
need to perform guard lifetime management manually:

  let bar_guard = bar.try_access()?;
  /* do something non-sleeping with bar_guard */
  drop(bar_guard);

  /* do something that might sleep */

  let bar_guard = bar.try_access()?;
  /* do something non-sleeping with bar_guard */
  drop(bar_guard);

  ...
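
For illustration, here is a minimal userspace sketch of the same pattern
using block scopes, so that each guard is dropped at its closing brace
rather than by an explicit drop(). Everything in it is an assumption
made for the example: this Revocable, Guard, Bar0 and read_twice() are
hypothetical std-only stand-ins, not the kernel's types. The live-guard
counter only mimics "a read-side section is open"; the mock's revoke()
does not wait for outstanding guards the way a real implementation
would have to.

```rust
use std::ops::Deref;
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

/// Hypothetical userspace mock of a revocable resource.
/// This is NOT the kernel's `Revocable<T>`.
pub struct Revocable<T> {
    data: T,
    revoked: AtomicBool,
    /// Number of live guards; stands in for "a read-side section is open".
    active: AtomicUsize,
}

/// Hypothetical guard; while one is alive the caller must not sleep.
pub struct Guard<'a, T> {
    owner: &'a Revocable<T>,
}

impl<T> Deref for Guard<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.owner.data
    }
}

impl<T> Drop for Guard<'_, T> {
    fn drop(&mut self) {
        // Leaving the (mock) read-side section.
        self.owner.active.fetch_sub(1, Ordering::SeqCst);
    }
}

impl<T> Revocable<T> {
    pub fn new(data: T) -> Self {
        Revocable {
            data,
            revoked: AtomicBool::new(false),
            active: AtomicUsize::new(0),
        }
    }

    /// Returns a guard, or `None` once the resource has been revoked.
    pub fn try_access(&self) -> Option<Guard<'_, T>> {
        if self.revoked.load(Ordering::SeqCst) {
            return None;
        }
        self.active.fetch_add(1, Ordering::SeqCst);
        Some(Guard { owner: self })
    }

    /// Marks the resource as revoked (the mock does not wait for guards).
    pub fn revoke(&self) {
        self.revoked.store(true, Ordering::SeqCst);
    }

    pub fn guards_held(&self) -> usize {
        self.active.load(Ordering::SeqCst)
    }
}

/// Hypothetical register block with a single register.
pub struct Bar0 {
    pub boot0: u32,
}

/// Reads the register twice, with a point in between where sleeping
/// would be allowed. Returns both values and the number of guards that
/// were held at that point.
pub fn read_twice(bar: &Revocable<Bar0>) -> Option<(u32, u32, usize)> {
    let first = {
        let bar_guard = bar.try_access()?;
        bar_guard.boot0
    }; // bar_guard is dropped here

    // A sleeping operation would go here; record that no guard is held.
    let held_while_sleeping = bar.guards_held();

    let second = {
        let bar_guard = bar.try_access()?;
        bar_guard.boot0
    }; // bar_guard is dropped here

    Some((first, second, held_while_sleeping))
}
```

The scoping makes the drop points harder to forget, but it stays a
convention: nothing stops a caller from placing a sleeping call inside
one of the blocks.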

Failing to drop the guard before sleeping potentially introduces a race
condition, one that triggers no compile-time warning and potentially not
even a runtime one unless lockdep is enabled. This problem does not
exist with the equivalent C code AFAICT, which makes the Rust version
actually more error-prone and dangerous: the opposite of what we are
trying to achieve with Rust. Or am I missing something?

