Message-ID: <aORn/vKfVL88q05w@nvidia.com>
Date: Mon, 6 Oct 2025 18:08:14 -0700
From: Nicolin Chen <nicolinc@...dia.com>
To: Jacob Pan <jacob.pan@...ux.microsoft.com>
CC: <linux-kernel@...r.kernel.org>, "iommu@...ts.linux.dev"
	<iommu@...ts.linux.dev>, Will Deacon <will@...nel.org>, Jason Gunthorpe
	<jgg@...dia.com>, Robin Murphy <robin.murphy@....com>, Zhang Yu
	<zhangyu1@...ux.microsoft.com>, Jean-Philippe Brucker
	<jean-philippe@...aro.org>, Alexander Grest <Alexander.Grest@...rosoft.com>
Subject: Re: [PATCH 2/2] iommu/arm-smmu-v3: Improve CMDQ lock fairness and
 efficiency

On Wed, Sep 24, 2025 at 10:54:38AM -0700, Jacob Pan wrote:
>  static void arm_smmu_cmdq_shared_lock(struct arm_smmu_cmdq *cmdq)
>  {
> -	int val;
> -
>  	/*
> -	 * We can try to avoid the cmpxchg() loop by simply incrementing the
> -	 * lock counter. When held in exclusive state, the lock counter is set
> -	 * to INT_MIN so these increments won't hurt as the value will remain
> -	 * negative.
> +	 * We can simply increment the lock counter. When held in exclusive
> +	 * state, the lock counter is set to INT_MIN so these increments won't
> +	 * hurt as the value will remain negative.

It seems to me that the change to the first sentence of the comment
isn't really necessary.

> This will also signal the
> +	 * exclusive locker that there are shared waiters. Once the exclusive
> +	 * locker releases the lock, the sign bit will be cleared and our
> +	 * increment will make the lock counter positive, allowing us to
> +	 * proceed.
>  	 */
>  	if (atomic_fetch_inc_relaxed(&cmdq->lock) >= 0)
>  		return;
>  
> -	do {
> -		val = atomic_cond_read_relaxed(&cmdq->lock, VAL >= 0);
> -	} while (atomic_cmpxchg_relaxed(&cmdq->lock, val, val + 1) != val);
> +	atomic_cond_read_relaxed(&cmdq->lock, VAL >= 0);

The returned value isn't used for anything. Is this read() necessary?
If so, could you add a line of comments elaborating on it?
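
For my own reference, this is roughly the counter protocol I think the
patch ends up with, written as a small userspace C model so I can reason
about it in isolation. The model_*() names, the explicit memory orders
and the plain spin are mine, not the driver's, so please correct me if
this misreads the intent:

/*
 * Userspace model of the cmdq lock counter as I read the patch.
 * The model_*() names are mine, not the driver's; this is only for
 * reasoning about the counter semantics, not the SMMU code itself.
 */
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int lock;		/* stands in for cmdq->lock */

static void model_shared_lock(void)
{
	/* Increment first; a negative old value means an exclusive holder. */
	if (atomic_fetch_add_explicit(&lock, 1, memory_order_relaxed) >= 0)
		return;

	/*
	 * Our increment is already accounted for, so just wait for the
	 * exclusive holder to clear the sign bit (no cmpxchg retry).
	 */
	while (atomic_load_explicit(&lock, memory_order_relaxed) < 0)
		;
}

static void model_shared_unlock(void)
{
	atomic_fetch_sub_explicit(&lock, 1, memory_order_release);
}

static bool model_exclusive_trylock(void)
{
	/* Exclusive state is INT_MIN plus any pending shared increments. */
	int expected = 0;

	return atomic_compare_exchange_strong_explicit(&lock, &expected,
						       INT_MIN,
						       memory_order_acquire,
						       memory_order_relaxed);
}

static void model_exclusive_unlock(void)
{
	/* Clear only the sign bit so pending shared increments survive. */
	atomic_fetch_and_explicit(&lock, ~INT_MIN, memory_order_release);
}

If that matches the intent, then a one-line comment on the wait in
shared_lock() along the lines of "our increment is already counted,
just wait for the sign bit to clear" would be enough for me.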

> +/*
> + * Only clear the sign bit when releasing the exclusive lock; this will
> + * allow any shared_lock() waiters to proceed without the possibility
> + * of entering the exclusive lock in a tight loop.
> + */
>  #define arm_smmu_cmdq_exclusive_unlock_irqrestore(cmdq, flags)		\
>  ({									\
> -	atomic_set_release(&cmdq->lock, 0);				\
> +	atomic_fetch_and_release(~INT_MIN, &cmdq->lock);				\

From a quick skim, the whole thing looks quite smart to me, but I
need some time to revisit it and perhaps test it as well.
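
If it helps with the testing part, something like the throwaway pthread
loop below is where I would start. It builds on the model sketch above
(so it exercises only the counter protocol, not the SMMU queue itself),
and the thread and iteration counts are arbitrary; the "exclusive grabs"
number is just an informal way to watch how often the exclusive path
gets in while shared lockers are hammering the counter:

#include <pthread.h>
#include <stdio.h>

#define NSHARED	4
#define NITERS	100000

static void *shared_worker(void *arg)
{
	(void)arg;
	for (int i = 0; i < NITERS; i++) {
		model_shared_lock();
		model_shared_unlock();
	}
	return NULL;
}

int main(void)
{
	pthread_t tids[NSHARED];
	long grabs = 0;

	for (int i = 0; i < NSHARED; i++)
		pthread_create(&tids[i], NULL, shared_worker, NULL);

	/* Keep poking the exclusive path while the shared workers run. */
	for (int i = 0; i < 1000; i++) {
		if (model_exclusive_trylock()) {
			grabs++;
			model_exclusive_unlock();
		}
	}

	for (int i = 0; i < NSHARED; i++)
		pthread_join(tids[i], NULL);

	/* Everything released: the counter must be back to zero. */
	printf("exclusive grabs: %ld, final counter: %d\n",
	       grabs, atomic_load(&lock));
	return 0;
}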

It's also important to get feedback from Will. Both patches touch code
he wrote that has been running for years already.

Nicolin
