Message-ID: <68094345.d40a0220.1c7d6a.d84e@mx.google.com>
Date: Wed, 23 Apr 2025 12:45:07 -0700
From: Boqun Feng <boqun.feng@...il.com>
To: Yury Norov <yury.norov@...il.com>
Cc: Alice Ryhl <aliceryhl@...gle.com>, Burak Emir <bqe@...gle.com>,
Rasmus Villemoes <linux@...musvillemoes.dk>,
Viresh Kumar <viresh.kumar@...aro.org>,
Miguel Ojeda <ojeda@...nel.org>,
Alex Gaynor <alex.gaynor@...il.com>, Gary Guo <gary@...yguo.net>,
Björn Roy Baron <bjorn3_gh@...tonmail.com>,
Benno Lossin <benno.lossin@...ton.me>,
Andreas Hindborg <a.hindborg@...nel.org>,
Trevor Gross <tmgross@...ch.edu>, rust-for-linux@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v7 0/5] rust: adds Bitmap API, ID pool and bindings
On Wed, Apr 23, 2025 at 02:00:50PM -0400, Yury Norov wrote:
[...]
> > > > > Yeah, and it's not just "flushing of caches", it's making CPU1's memory
> > > > > operations on the object pointed by "mut ref" observable to CPU2. If
> > > > > CPU1 and CPU2 sync with a lock, then the lock guarantees that,
> > >
> > > The problem here is that the object pointed to by the 'mut ref' is the
> > > Rust class Bitmap. The class itself allocates an array, which is used
> > > as the actual storage. The Rust class and the C array will likely not
> > > share cache lines.
> > >
> > > The pointer is returned from a C call, bitmap_zalloc(), so I don't
> > > think it's possible for the Rust compiler to realize that the number
> > > stored in Bitmap is a pointer to data of a certain size, and that it
> > > should be flushed at "mut ref" put... That's why I guessed a global
> > > flush.
> > >
> >
> > You don't do the flush in the C code either, right? You would rely on
> > some existing synchronization between threads to make sure CPU2 observes
> > the memory effect of CPU1 (if that's what you want).
> >
> > > Yeah, would be great to understand how this all works.
> > >
> > > As a side question: in regular C spinlocks, can you point me to the
> > > place where the caches get flushed when a lock moves from CPU1 to
> > > CPU2? I spent some time looking at the code, but found nothing myself.
> > > Or is this implemented in a different way?
> >
> > Oh I see, the simple answer would be "the fact that cache flushing is
> > done is implied". Now let's take a simple example:
> >
> >     CPU 1               CPU 2
> >     =====               =====
> >     spin_lock();
> >     x = 1;
> >     spin_unlock();
> >
> >                         spin_lock();
> >                         r1 = x; // r1 == 1
> >                         spin_unlock();
> >
> > that is, if CPU 2 gets the lock later than CPU 1, r1 is guaranteed to be
> > 1, right? Now let's open the box, with a trivial spinlock implementation:
> >
> >     CPU 1               CPU 2
> >     =====               =====
> >     spin_lock();
> >     x = 1;
> >     spin_unlock():
> >       smp_store_release(lock, 0);
> >
> >                         spin_lock():
> >                           while (cmpxchg_acquire(lock, 0, 1) != 0) { }
> >
> >                         r1 = x; // r1 == 1
> >                         spin_unlock();
> >
> > Now, for CPU2 to acquire the lock, the cmpxchg_acquire() has to succeed,
> > and that means a few things:
> >
> > 1. CPU2 observes the lock value to be 0, i.e. CPU2 observes the
> > store of CPU1 on the lock.
> >
> > 2. Since the smp_store_release() on CPU1 pairs with the
> > cmpxchg_acquire() on CPU2, it's guaranteed that CPU2 has observed
> > the memory effects before the smp_store_release() on CPU1. And
> > this is the "implied" part: in the real hardware cache protocol,
> > what the smp_store_release() does is basically "flush/invalidate
> > the cache and issue the store", therefore since CPU2 observes the
> > store part of the smp_store_release(), it's implied that the
> > cache flush/invalidate is observed by CPU2 already. Of course
> > the actual hardware cache protocol is more complicated, but this
> > is the gist of it.
> >
> > Based on 1 & 2, normally a programmer won't need to reason about where
> > the cache flush is actually issued, but rather about the synchronization
> > built via the shared variables (in this case, it's the "lock").
> >
> > Hope this could help.
>
> Yeah, that helped a lot. Thank you!
>
> So, if this Rust mutable reference is implemented similarly to a
> regular spinlock, I've no more questions.
>
Just to be clear, a mutable reference in Rust is just a pointer (with
special compiler treatment for checking and optimization), so a mutable
reference is not "implemented similarly to a regular spinlock". It's
rather that: if you have shared data and you want to get a mutable
reference to it, you have to use some synchronization, and in maybe 90%
of the cases that's a lock.
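
If it helps, here is a rough userspace Rust sketch of the trivial
spinlock from the quoted example above; it's only an illustration, not
the kernel's implementation. The point is that the acquire/release
pairing lives in the lock itself, not in the mutable reference you get
while holding it:

    use std::sync::atomic::{AtomicU32, Ordering};

    // Toy spinlock mirroring the trivial implementation quoted above:
    // unlock is a store-release, lock is a cmpxchg-acquire loop.
    pub struct SpinLock {
        locked: AtomicU32,
    }

    impl SpinLock {
        pub const fn new() -> Self {
            SpinLock { locked: AtomicU32::new(0) }
        }

        pub fn lock(&self) {
            // cmpxchg_acquire(lock, 0, 1): once it succeeds, every store
            // made before the unlocking CPU's store-release is visible here.
            while self
                .locked
                .compare_exchange(0, 1, Ordering::Acquire, Ordering::Relaxed)
                .is_err()
            {
                std::hint::spin_loop();
            }
        }

        pub fn unlock(&self) {
            // smp_store_release(lock, 0): publishes all the stores before it
            // to the next CPU that acquires the lock.
            self.locked.store(0, Ordering::Release);
        }
    }

A real Rust lock type would of course hand out the mutable reference
through a guard rather than exposing lock()/unlock() directly, but the
ordering story is the same.
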
So here, what Burak did in Bitmap was to define those non-atomic
functions as requiring mutable references (plus getting the Sync and
Send parts right). A real user would, in 90% of the cases, use a lock
to access a mutable reference to `Bitmap`.
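
Again just a rough userspace sketch, with std::sync::Mutex and a
made-up stand-in for `Bitmap` (the names below are invented for
illustration, not the actual kernel API): the non-atomic method takes
`&mut self`, so the only way to call it on shared data is through
something like a lock guard, and the lock is what provides the
ordering discussed above:

    use std::sync::{Arc, Mutex};
    use std::thread;

    // Made-up stand-in for `Bitmap`, just for illustration.
    struct Bitmap {
        words: Vec<u64>,
    }

    impl Bitmap {
        fn new(nbits: usize) -> Self {
            Bitmap { words: vec![0; (nbits + 63) / 64] }
        }

        // Non-atomic set: requiring `&mut self` means the caller has to
        // prove exclusive access (e.g. by holding a lock) to call it.
        fn set_bit(&mut self, n: usize) {
            self.words[n / 64] |= 1 << (n % 64);
        }
    }

    fn main() {
        let bitmap = Arc::new(Mutex::new(Bitmap::new(128)));

        let handles: Vec<_> = (0..4)
            .map(|i| {
                let bitmap = Arc::clone(&bitmap);
                thread::spawn(move || {
                    // The lock is what turns the shared `Bitmap` into a
                    // `&mut Bitmap` for the duration of the critical section.
                    bitmap.lock().unwrap().set_bit(i);
                })
            })
            .collect();

        for h in handles {
            h.join().unwrap();
        }
    }
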
Makes sense?
Regards,
Boqun
> Thanks again for the explanation.