linux-kernel - Re: [PATCH v6 3/3] rust: sync: Add SpinLockIrq

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87a5fgunoc.ffs@tglx>
Date: Mon, 07 Oct 2024 14:01:39 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: Lyude Paul <lyude@...hat.com>, rust-for-linux@...r.kernel.org
Cc: Danilo Krummrich <dakr@...hat.com>, airlied@...hat.com, Ingo Molnar
 <mingo@...hat.com>, Will Deacon <will@...nel.org>, Waiman Long
 <longman@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
 linux-kernel@...r.kernel.org, Benno Lossin <benno.lossin@...ton.me>,
 Daniel
 Almeida <daniel.almeida@...labora.com>, Gary Guo <gary@...yguo.net>, Miguel
 Ojeda <ojeda@...nel.org>, Alex Gaynor <alex.gaynor@...il.com>, Wedson
 Almeida Filho <wedsonaf@...il.com>, Boqun Feng <boqun.feng@...il.com>,
 Björn Roy Baron <bjorn3_gh@...tonmail.com>, Andreas
 Hindborg <a.hindborg@...sung.com>, Alice Ryhl <aliceryhl@...gle.com>,
 Trevor Gross <tmgross@...ch.edu>, Martin Rodriguez Reboredo
 <yakoyoku@...il.com>, Valentin Obst <kernel@...entinobst.de>
Subject: Re: [PATCH v6 3/3] rust: sync: Add SpinLockIrq

On Fri, Oct 04 2024 at 14:48, Lyude Paul wrote:
> On Wed, 2024-10-02 at 22:53 +0200, Thomas Gleixner wrote:
>> At this phase of rust integration there is no need to wrap
>> raw_spinlock_t, so you have two options to solve that:
>> 
>>    1) Map Rust's SpinLockIrq() to spin_lock_irqsave() and
>>       spin_unlock_irqrestore() which does the right thing
>> 
>>    2) Play all the PREEMPT_RT games in the local irq disable abstraction
>
> I would very strongly rather #2. The problem with #1 is that one of the goals
> with the way I designed this abstraction with was to make it so that we could
> have lock guards that share the lifetime of the IrqDisabled token - which
> means the compiler can stop you from holding the lock outside of an
> IrqDisabled context. We have a powerful type system in rust, so IMO we should
> use it.
>
> I don't think this is as difficult to do as it seems either. One thing we
> could do is have two different versions of the with_irqs_disabled functions:
>
> with_irqs_disabled_on_nort
> with_irqs_disabled
>
> And as well, have each respectively return a different token type:
>
> IrqsDisabledNoRt -> Local interrupts are disabled on non-RT kernels
> IrqsDisabled -> Local interrupts are disabled always
>
> I think this actually is a nice solution, because it provides a number of
> benefits:
>
>  * It makes it much more clear that interrupts won't always be disabled. I'll
>    be honest, I've been working on drivers for almost a decade in the upstream
>    kernel and as you can see I don't think any of us actually realized
>    interrupts being turned off here wasn't a given :P. I'm sure it's
>    documented, but when you've been working on this stuff for so long you
>    don't always default to going back to documentation for stuff like this.
>  * Having two different token types would prevent raw spinlocks from being
>    used in contexts where it's not guaranteed local IRQs would be disabled -
>    and vice versa.

You really want to have two distinct lock types: spinlock and
raw_spinlock. On a non-RT kernel spinlock maps to raw_spinlock, but
that's an implementation detail.

>> #1 is the right thing to do because no driver should rely on actually
>> disabling interrupts on the CPU. If there is a driver which does that,
>> then it's not compatible with RT and should use a local lock instead.
>
> FWIW too - that seems reasonable. The reason I still see benefit in with
> with_irqs_disabled_on_nort though is that this feels a bit closer to some of
> the goals of the C API to me. We have spin_lock_irqsave and spin_lock, with
> the intention that on non-RT kernels IRQs should only need to be disabled a
> single time even if multiple spinlocks are acquired within the scope of a
> single function. I'd like to ensure we can still do that on rust since it's
> possible to do.

Sure. That's not the problem. The problem is:

      local_irq_save();
      spin_lock();

instead of

      spin_lock_irqsave();

The latter allows RT kernels to substitute spin_lock_irqsave() with:

      rt_spin_lock();

which maps to a rtmutex variant and does neither disable interrupts nor
preemption. It only disables migration to guarantee that the task stays
on the CPU, which in turn is a prerequisite for protecting per CPU data
with the lock.

The former does not work on RT because then the rtmutex is acquired with
interrupts disabled, which is a nono because the acquire can sleep.

There is another problem with this split. The example in your spinlock
patch is exactly what we don't want:

> +/// // Accessing an `Example` from a context where IRQs may not be disabled already.
> +/// let b = with_irqs_disabled(|irq| {
> +///     noirq_work(&e, irq);
> +///     e.d.lock_with(irq).b
> +/// });

Why? 

This pattern is in 99% of the cases wrong to begin with independent of
RT because noirq_work() can only be safe if it operates strictly on per
CPU data. If it accesses any shared resource including hardware it's
broken on SMP.

Outside of a very narrow part of core code which uses raw spinlocks,
there is absolutely zero reason for such a construct. We've educated
driver writers to avoid this pattern and now Rust tries to reintroduce
it.

Please do not encourage people to do the wrong thing.

I completely understand and agree with the goal of taking advantage of
Rust's safety, but not for the price of misguiding people.

So you want to make this work:

   spin_lock_irqsave(A);
   spin_lock(B);

and let the compiler validate that the nested spin_lock() is invoked in
a context which has interrupts disabled, right?

To do that you split the spin_lock_irqsave() into

   local_irq_save();     #1
     spin_lock(A);       #2
       spin_lock(B);     #3
       spin_unlock(B);
     spin_unlock(A);
   local_irq_restore();

That makes sense as it gives you three distinct guard contexts, but the
outermost guard context (interrupt disable) is an illusion in most cases
as it does not provide a guard for anything. It merely provides the
prerequisite for locking lock A.

The above example really should not end up in 3 guard contexts, but in
two by combining #1 and #2 into one. In C this looks like:

    scoped_guard(spinlock_irqsave)(&A) {
        // Allows to operate on resources which are exclusively
        // protected by A (DataA)
        
    	scoped_guard(spinlock)(&B) {
           // Allows to operate on resources which are exclusively
           // protected by B (DataB)
        }
    }

Nesting B into lock A is required to keep some aspects of DataA and
DataB consistent. But the other parts of DataB require only B to be
held.

For extended fun lock B is not necessarily required to be acquired with
interrupts disabled. The fact that it nests into lock A does not make it
mandatory.

A lock is only required to be acquired with interrupts disabled if it
can be taken in interrupt context. That's a per lock property.

Thanks,

        tglx