Message-ID: <20190123093424.GE15019@brain-police>
Date: Wed, 23 Jan 2019 09:34:25 +0000
From: Will Deacon <will.deacon@....com>
To: Waiman Long <longman@...hat.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Borislav Petkov <bp@...en8.de>,
"H. Peter Anvin" <hpa@...or.com>, linux-kernel@...r.kernel.org,
linux-arch@...r.kernel.org, x86@...nel.org,
Zhenzhong Duan <zhenzhong.duan@...cle.com>,
James Morse <james.morse@....com>,
SRINIVAS <srinivas.eeda@...cle.com>
Subject: Re: [PATCH v2 1/4] locking/qspinlock: Handle > 4 slowpath nesting
levels

On Tue, Jan 22, 2019 at 10:49:08PM -0500, Waiman Long wrote:
> Four queue nodes per CPU are allocated to support up to four nesting
> levels using the per-CPU nodes. Nested NMIs are possible on some
> architectures, but it is still very unlikely that we will ever hit more
> than four nesting levels with contention in the slowpath.
>
> When that rare condition does happen, however, the system is likely to
> hang or crash shortly afterwards. That is not good, so we need to
> handle this exceptional case.
>
> This is done by spinning directly on the lock using repeated trylock
> calls. This alternative code path should only be used when there are
> nested NMIs. Assuming that the locks used by those NMI handlers are not
> heavily contended, a simple test-and-set (TAS) lock should work well.
>
> Suggested-by: Peter Zijlstra <peterz@...radead.org>
> Signed-off-by: Waiman Long <longman@...hat.com>
> ---
> kernel/locking/qspinlock.c | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
> index 8a8c3c2..0875053 100644
> --- a/kernel/locking/qspinlock.c
> +++ b/kernel/locking/qspinlock.c
> @@ -412,6 +412,21 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
> 	idx = node->count++;
> 	tail = encode_tail(smp_processor_id(), idx);

Does the compiler generate better code if we move the tail assignment
further down, closer to the xchg_tail() call?
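
Something along these lines, i.e. an untested sketch with the node
initialisation and the pending-bypass trylock in between elided:

	idx = node->count++;

	if (unlikely(idx >= MAX_NODES)) {
		while (!queued_spin_trylock(lock))
			cpu_relax();
		goto release;
	}

	/* ... pick and initialise node 'idx', retry the trylock ... */

	/*
	 * Hypothetical reordering: compute the tail only once we are
	 * committed to queueing, so it need not stay live across the
	 * node initialisation and is never computed at all on the
	 * fallback path above.
	 */
	tail = encode_tail(smp_processor_id(), idx);

	old = xchg_tail(lock, tail);
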
> +	/*
> +	 * 4 nodes are allocated based on the assumption that there will
> +	 * not be nested NMIs taking spinlocks. That may not be true on
> +	 * some architectures, even though the chance of needing more than
> +	 * 4 nodes is still extremely unlikely. When that happens, we fall
> +	 * back to spinning on the lock directly without using any MCS
> +	 * node. This is not the most elegant solution, but it is simple
> +	 * enough.
> +	 */
> +	if (unlikely(idx >= MAX_NODES)) {
> +		while (!queued_spin_trylock(lock))
> +			cpu_relax();
> +		goto release;
> +	}
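
For context, the bound being tested here comes from the per-CPU MCS node
pool in the same file; roughly (a simplified sketch, see
kernel/locking/qspinlock.c for the real definitions):

	/*
	 * One node per context that can take a spinlock:
	 * task, softirq, hardirq and NMI.
	 */
	#define MAX_NODES	4

	static DEFINE_PER_CPU_ALIGNED(struct mcs_spinlock, mcs_nodes[MAX_NODES]);

so idx >= MAX_NODES means a fifth nesting level found no free node, and
spinning on the lock word itself is the only option that does not corrupt
the MCS queue.
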
Acked-by: Will Deacon <will.deacon@....com>

Will