Message-ID: <aORn/vKfVL88q05w@nvidia.com>
Date: Mon, 6 Oct 2025 18:08:14 -0700
From: Nicolin Chen <nicolinc@...dia.com>
To: Jacob Pan <jacob.pan@...ux.microsoft.com>
CC: <linux-kernel@...r.kernel.org>, "iommu@...ts.linux.dev"
	<iommu@...ts.linux.dev>, Will Deacon <will@...nel.org>, Jason Gunthorpe
	<jgg@...dia.com>, Robin Murphy <robin.murphy@....com>, Zhang Yu
	<zhangyu1@...ux.microsoft.com>, Jean-Philippe Brucker
	<jean-philippe@...aro.org>, Alexander Grest <Alexander.Grest@...rosoft.com>
Subject: Re: [PATCH 2/2] iommu/arm-smmu-v3: Improve CMDQ lock fairness and
 efficiency

On Wed, Sep 24, 2025 at 10:54:38AM -0700, Jacob Pan wrote:
>  static void arm_smmu_cmdq_shared_lock(struct arm_smmu_cmdq *cmdq)
>  {
> -	int val;
> -
>  	/*
> -	 * We can try to avoid the cmpxchg() loop by simply incrementing the
> -	 * lock counter. When held in exclusive state, the lock counter is set
> -	 * to INT_MIN so these increments won't hurt as the value will remain
> -	 * negative.
> +	 * We can simply increment the lock counter. When held in exclusive
> +	 * state, the lock counter is set to INT_MIN so these increments won't
> +	 * hurt as the value will remain negative.

It seems to me that the change to the first sentence of the comment
isn't really necessary.

> This will also signal the
> +	 * exclusive locker that there are shared waiters. Once the exclusive
> +	 * locker releases the lock, the sign bit will be cleared and our
> +	 * increment will make the lock counter positive, allowing us to
> +	 * proceed.
>  	 */
>  	if (atomic_fetch_inc_relaxed(&cmdq->lock) >= 0)
>  		return;
>  
> -	do {
> -		val = atomic_cond_read_relaxed(&cmdq->lock, VAL >= 0);
> -	} while (atomic_cmpxchg_relaxed(&cmdq->lock, val, val + 1) != val);
> +	atomic_cond_read_relaxed(&cmdq->lock, VAL >= 0);

The returned value isn't used for anything. Is this read() necessary?
If so, could you add a line of comments elaborating on it?
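
For my own reference, this is roughly the counter protocol I think the
patch ends up with, written as a small userspace C model so I can reason
about it in isolation. The model_*() names, the explicit memory orders
and the plain spin are mine, not the driver's, so please correct me if
this misreads the intent:

/*
 * Userspace model of the cmdq lock counter as I read the patch.
 * The model_*() names are mine, not the driver's; this is only for
 * reasoning about the counter semantics, not the SMMU code itself.
 */
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int lock;		/* stands in for cmdq->lock */

static void model_shared_lock(void)
{
	/* Increment first; a negative old value means an exclusive holder. */
	if (atomic_fetch_add_explicit(&lock, 1, memory_order_relaxed) >= 0)
		return;

	/*
	 * Our increment is already accounted for, so just wait for the
	 * exclusive holder to clear the sign bit (no cmpxchg retry).
	 */
	while (atomic_load_explicit(&lock, memory_order_relaxed) < 0)
		;
}

static void model_shared_unlock(void)
{
	atomic_fetch_sub_explicit(&lock, 1, memory_order_release);
}

static bool model_exclusive_trylock(void)
{
	/* Exclusive state is INT_MIN plus any pending shared increments. */
	int expected = 0;

	return atomic_compare_exchange_strong_explicit(&lock, &expected,
						       INT_MIN,
						       memory_order_acquire,
						       memory_order_relaxed);
}

static void model_exclusive_unlock(void)
{
	/* Clear only the sign bit so pending shared increments survive. */
	atomic_fetch_and_explicit(&lock, ~INT_MIN, memory_order_release);
}

If that matches the intent, then a one-line comment on the wait in
shared_lock() along the lines of "our increment is already counted,
just wait for the sign bit to clear" would be enough for me.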

> +/*
> + * Only clear the sign bit when releasing the exclusive lock; this will
> + * allow any shared_lock() waiters to proceed without the possibility
> + * of entering the exclusive lock in a tight loop.
> + */
>  #define arm_smmu_cmdq_exclusive_unlock_irqrestore(cmdq, flags)		\
>  ({									\
> -	atomic_set_release(&cmdq->lock, 0);				\
> +	atomic_fetch_and_release(~INT_MIN, &cmdq->lock);				\

From a quick skim, the whole thing looks quite smart to me, but I
need some time to revisit it and perhaps test it as well.
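
If it helps with the testing part, something like the throwaway pthread
loop below is where I would start. It builds on the model sketch above
(so it exercises only the counter protocol, not the SMMU queue itself),
and the thread and iteration counts are arbitrary; the "exclusive grabs"
number is just an informal way to watch how often the exclusive path
gets in while shared lockers are hammering the counter:

#include <pthread.h>
#include <stdio.h>

#define NSHARED	4
#define NITERS	100000

static void *shared_worker(void *arg)
{
	(void)arg;
	for (int i = 0; i < NITERS; i++) {
		model_shared_lock();
		model_shared_unlock();
	}
	return NULL;
}

int main(void)
{
	pthread_t tids[NSHARED];
	long grabs = 0;

	for (int i = 0; i < NSHARED; i++)
		pthread_create(&tids[i], NULL, shared_worker, NULL);

	/* Keep poking the exclusive path while the shared workers run. */
	for (int i = 0; i < 1000; i++) {
		if (model_exclusive_trylock()) {
			grabs++;
			model_exclusive_unlock();
		}
	}

	for (int i = 0; i < NSHARED; i++)
		pthread_join(tids[i], NULL);

	/* Everything released: the counter must be back to zero. */
	printf("exclusive grabs: %ld, final counter: %d\n",
	       grabs, atomic_load(&lock));
	return 0;
}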

It's also important to get feedback from Will. Both patches touch code
he wrote that has been running for years already.

Nicolin
