netdev - Re: [patch V2 08/15] Documentation: Add lock ordering and nesting documentation

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87mu8apzxr.fsf@nanos.tec.linutronix.de>
Date:   Fri, 20 Mar 2020 20:51:44 +0100
From:   Thomas Gleixner <tglx@...utronix.de>
To:     paulmck@...nel.org
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Ingo Molnar <mingo@...nel.org>, Will Deacon <will@...nel.org>,
        Joel Fernandes <joel@...lfernandes.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Randy Dunlap <rdunlap@...radead.org>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Logan Gunthorpe <logang@...tatee.com>,
        Kurt Schwemmer <kurt.schwemmer@...rosemi.com>,
        Bjorn Helgaas <bhelgaas@...gle.com>, linux-pci@...r.kernel.org,
        Felipe Balbi <balbi@...nel.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        linux-usb@...r.kernel.org, Kalle Valo <kvalo@...eaurora.org>,
        "David S. Miller" <davem@...emloft.net>,
        linux-wireless@...r.kernel.org, netdev@...r.kernel.org,
        Oleg Nesterov <oleg@...hat.com>,
        Davidlohr Bueso <dave@...olabs.net>,
        Michael Ellerman <mpe@...erman.id.au>,
        Arnd Bergmann <arnd@...db.de>, linuxppc-dev@...ts.ozlabs.org
Subject: Re: [patch V2 08/15] Documentation: Add lock ordering and nesting documentation

"Paul E. McKenney" <paulmck@...nel.org> writes:
>
>  - The soft interrupt related suffix (_bh()) still disables softirq
>    handlers.  However, unlike non-PREEMPT_RT kernels (which disable
>    preemption to get this effect), PREEMPT_RT kernels use a per-CPU
>    lock to exclude softirq handlers.

I've made that:

  - The soft interrupt related suffix (_bh()) still disables softirq
    handlers.

    Non-PREEMPT_RT kernels disable preemption to get this effect.

    PREEMPT_RT kernels use a per-CPU lock for serialization. The lock
    disables softirq handlers and prevents reentrancy by a preempting
    task.
    
On non-RT this is implicit through preemption disable, but it's non
obvious for RT as preemption stays enabled.

> PREEMPT_RT kernels preserve all other spinlock_t semantics:
>
>  - Tasks holding a spinlock_t do not migrate.  Non-PREEMPT_RT kernels
>    avoid migration by disabling preemption.  PREEMPT_RT kernels instead
>    disable migration, which ensures that pointers to per-CPU variables
>    remain valid even if the task is preempted.
>
>  - Task state is preserved across spinlock acquisition, ensuring that the
>    task-state rules apply to all kernel configurations.  In non-PREEMPT_RT
>    kernels leave task state untouched.  However, PREEMPT_RT must change
>    task state if the task blocks during acquisition.  Therefore, the
>    corresponding lock wakeup restores the task state.  Note that regular
>    (not lock related) wakeups do not restore task state.

   - Task state is preserved across spinlock acquisition, ensuring that the
     task-state rules apply to all kernel configurations.  Non-PREEMPT_RT
     kernels leave task state untouched.  However, PREEMPT_RT must change
     task state if the task blocks during acquisition.  Therefore, it
     saves the current task state before blocking and the corresponding
     lock wakeup restores it. A regular not lock related wakeup sets the
     task state to RUNNING. If this happens while the task is blocked on
     a spinlock then the saved task state is changed so that correct
     state is restored on lock wakeup.

Hmm?

> But this code failes on PREEMPT_RT kernels because the memory allocator
> is fully preemptible and therefore cannot be invoked from truly atomic
> contexts.  However, it is perfectly fine to invoke the memory allocator
> while holding a normal non-raw spinlocks because they do not disable
> preemption::
>
>> +  spin_lock(&lock);
>> +  p = kmalloc(sizeof(*p), GFP_ATOMIC);
>> +
>> +Most places which use GFP_ATOMIC allocations are safe on PREEMPT_RT as the
>> +execution is forced into thread context and the lock substitution is
>> +ensuring preemptibility.
>
> Interestingly enough, most uses of GFP_ATOMIC allocations are
> actually safe on PREEMPT_RT because the the lock substitution ensures
> preemptibility.  Only those GFP_ATOMIC allocations that are invoke
> while holding a raw spinlock or with preemption otherwise disabled need
> adjustment to work correctly on PREEMPT_RT.
>
> [ I am not as confident of the above as I would like to be... ]

I'd leave that whole paragraph out. This documents the rules and from
the above code examples it's pretty clear what works and what not :)

> And meeting time, will continue later!

Enjoy!

Thanks,

        tglx