[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aORieLYckU9YLdVF@nvidia.com>
Date: Mon, 6 Oct 2025 17:44:40 -0700
From: Nicolin Chen <nicolinc@...dia.com>
To: Jacob Pan <jacob.pan@...ux.microsoft.com>
CC: <linux-kernel@...r.kernel.org>, "iommu@...ts.linux.dev"
<iommu@...ts.linux.dev>, Will Deacon <will@...nel.org>, Jason Gunthorpe
<jgg@...dia.com>, Robin Murphy <robin.murphy@....com>, Zhang Yu
<zhangyu1@...ux.microsoft.com>, Jean Philippe-Brucker
<jean-philippe@...aro.org>, Alexander Grest <Alexander.Grest@...rosoft.com>
Subject: Re: [PATCH 1/2] iommu/arm-smmu-v3: Fix CMDQ timeout warning
On Wed, Sep 24, 2025 at 10:54:37AM -0700, Jacob Pan wrote:
> While polling for n spaces in the cmdq, the current code instead checks
> if the queue is full. If the queue is almost full but not enough space
> (<n), then the CMDQ timeout warning is never triggered even if the
> polling has exceeded timeout limit.
This does sound like an issue that is missing a warning print.
> This patch polls for the availability of exact space instead of full and
> emit timeout warning accordingly.
And the solution sounds plausible as well.
> @@ -806,10 +769,28 @@ int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu,
> do {
> u64 old;
>
> + queue_poll_init(smmu, &qp);
> while (!queue_has_space(&llq, n + sync)) {
> local_irq_restore(flags);
> - if (arm_smmu_cmdq_poll_until_not_full(smmu, cmdq, &llq))
> - dev_err_ratelimited(smmu->dev, "CMDQ timeout\n");
> + /*
> + * Try to update our copy of cons by grabbing exclusive cmdq access. If
> + * that fails, spin until somebody else updates it for us.
> + */
> + if (arm_smmu_cmdq_exclusive_trylock_irqsave(cmdq, flags)) {
> + WRITE_ONCE(cmdq->q.llq.cons, readl_relaxed(cmdq->q.cons_reg));
> + arm_smmu_cmdq_exclusive_unlock_irqrestore(cmdq, flags);
> + llq.val = READ_ONCE(cmdq->q.llq.val);
> + local_irq_save(flags);
> + continue;
> + }
> +
> + ret = queue_poll(&qp);
> + if (ret == -ETIMEDOUT) {
> + dev_err_ratelimited(smmu->dev, "CPU %d CMDQ Timeout, C: %08x, CW:%x P: 0x%08x PW: %x cmdq.lock 0x%x\n",
> + smp_processor_id(), Q_IDX(&llq, llq.cons), Q_WRP(&llq, llq.cons), Q_IDX(&llq, llq.prod), Q_WRP(&llq, llq.prod), atomic_read(&cmdq->lock));
> + queue_poll_init(smmu, &qp);
> + }
> + llq.val = READ_ONCE(cmdq->q.llq.val);
> local_irq_save(flags);
But, couldn't we write a new arm_smmu_cmdq_poll_until_enough_space()
simply replacing arm_smmu_cmdq_exclusive_unlock_irqrestore()?
This whole unwrapped piece is really not easy to read :(
Also, the new error message has too much debugging info, which could
be trimmed away, IMHO. Though kernel coding now does allow a higher
limit per line, that 200-ish-character line is a bit overdone :-/
Nicolin
Powered by blists - more mailing lists