linux-kernel - Re: [PATCH 1/2 V2] iommu/amd: Add workaround for ERBT1312

Message-ID: <51768B25.1060501@redhat.com>
Date:	Tue, 23 Apr 2013 09:22:45 -0400
From:	Don Dutile <ddutile@...hat.com>
To:	Joerg Roedel <joro@...tes.org>
CC:	Suravee Suthikulanit <suravee.suthikulpanit@....com>,
	iommu@...ts.linux-foundation.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2 V2] iommu/amd: Add workaround for ERBT1312

On 04/18/2013 12:28 PM, Joerg Roedel wrote:
> On Thu, Apr 18, 2013 at 11:13:19AM -0500, Suravee Suthikulanit wrote:
>> This workaround is required for both event log and ppr log.  Your
>> patch is only taking care of the event log.
>
> Right, thanks for the notice. Here is the updated patch.
>
>  From cebe04596989c4b9001e2c1571c4fb219ea37b99 Mon Sep 17 00:00:00 2001
> From: Joerg Roedel <joro@...tes.org>
> Date: Thu, 18 Apr 2013 17:55:04 +0200
> Subject: [PATCH] iommu/amd: Workaround for ERBT1312
>
> Work around an IOMMU hardware bug where clearing the
> EVT_INT or PPR_INT bit in the status register may race with
> the hardware trying to set it again. When not handled the
> bit might not be cleared and we lose all future event or ppr
> interrupts.
>
> Reported-by: Suravee Suthikulpanit <suravee.suthikulpanit@....com>
> Cc: stable@...r.kernel.org
> Signed-off-by: Joerg Roedel <joro@...tes.org>
> ---
>   drivers/iommu/amd_iommu.c |   34 ++++++++++++++++++++++++++--------
>   1 file changed, 26 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> index f42793d..27792f8 100644
> --- a/drivers/iommu/amd_iommu.c
> +++ b/drivers/iommu/amd_iommu.c
> @@ -700,14 +700,23 @@ retry:
>
>   static void iommu_poll_events(struct amd_iommu *iommu)
>   {
> -	u32 head, tail;
> +	u32 head, tail, status;
>   	unsigned long flags;
>
> -	/* enable event interrupts again */
> -	writel(MMIO_STATUS_EVT_INT_MASK, iommu->mmio_base + MMIO_STATUS_OFFSET);
> -
>   	spin_lock_irqsave(&iommu->lock, flags);
>
> +	/* enable event interrupts again */
> +	do {
> +		/*
> +		 * Workaround for Erratum ERBT1312
> +		 * Clearing the EVT_INT bit may race in the hardware, so read
> +		 * it again and make sure it was really cleared
> +		 */
> +		status = readl(iommu->mmio_base + MMIO_STATUS_OFFSET);
> +		writel(MMIO_STATUS_EVT_INT_MASK,
> +		       iommu->mmio_base + MMIO_STATUS_OFFSET);
> +	} while (status & MMIO_STATUS_EVT_INT_MASK);
> +
>   	head = readl(iommu->mmio_base + MMIO_EVT_HEAD_OFFSET);
>   	tail = readl(iommu->mmio_base + MMIO_EVT_TAIL_OFFSET);
>
> @@ -744,16 +753,25 @@ static void iommu_handle_ppr_entry(struct amd_iommu *iommu, u64 *raw)
>   static void iommu_poll_ppr_log(struct amd_iommu *iommu)
>   {
>   	unsigned long flags;
> -	u32 head, tail;
> +	u32 head, tail, status;
>
>   	if (iommu->ppr_log == NULL)
>   		return;
>
> -	/* enable ppr interrupts again */
> -	writel(MMIO_STATUS_PPR_INT_MASK, iommu->mmio_base + MMIO_STATUS_OFFSET);
> -
>   	spin_lock_irqsave(&iommu->lock, flags);
>
> +	/* enable ppr interrupts again */
> +	do {
> +		/*
> +		 * Workaround for Erratum ERBT1312
> +		 * Clearing the PPR_INT bit may race in the hardware, so read
> +		 * it again and make sure it was really cleared
> +		 */
> +		status = readl(iommu->mmio_base + MMIO_STATUS_OFFSET);
> +		writel(MMIO_STATUS_PPR_INT_MASK,
> +		       iommu->mmio_base + MMIO_STATUS_OFFSET);
> +	} while (status & MMIO_STATUS_PPR_INT_MASK);
> +
>   	head = readl(iommu->mmio_base + MMIO_PPR_HEAD_OFFSET);
>   	tail = readl(iommu->mmio_base + MMIO_PPR_TAIL_OFFSET);
>
Other threads on this mailing list report (and I have seen crashes with the
same problem) that this type of logging during a flood of IOMMU errors can
lock up the machine. Is there something that can be done to break out of the
do-while loop after n iterations, so the kernel can make progress during a crash?
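
To make the question concrete, here is a minimal user-space sketch of a
bounded version of the loop. It is not a proposed patch: struct fake_iommu,
fake_readl(), fake_writel(), the STATUS_EVT_INT_MASK value and the
MAX_CLEAR_RETRIES bound are all hypothetical stand-ins that only simulate a
write-1-to-clear status bit whose clear can be lost to the hardware race.

```c
#include <stdbool.h>
#include <stdint.h>

#define STATUS_EVT_INT_MASK  (1u << 1)  /* stand-in value for this sketch */
#define MAX_CLEAR_RETRIES    16         /* hypothetical bound, not tuned */

/* Simulated status register; nothing here touches real MMIO. */
struct fake_iommu {
	uint32_t status;
	unsigned int sticky_writes;  /* clear attempts the simulated race defeats */
};

static uint32_t fake_readl(const struct fake_iommu *io)
{
	return io->status;
}

static void fake_writel(struct fake_iommu *io, uint32_t val)
{
	if (io->sticky_writes > 0) {
		/* Simulated race: hardware sets the bit again, the clear is lost. */
		io->sticky_writes--;
		return;
	}
	io->status &= ~val;  /* write-1-to-clear semantics */
}

/*
 * Same read/write/re-check shape as the loop in the patch, plus an
 * iteration cap. Returns true if the bit was observed clear, false if
 * the cap was hit while the bit was still set.
 */
static bool clear_evt_int_bounded(struct fake_iommu *io)
{
	unsigned int tries = 0;
	uint32_t status;

	do {
		status = fake_readl(io);
		fake_writel(io, STATUS_EVT_INT_MASK);
	} while ((status & STATUS_EVT_INT_MASK) && ++tries < MAX_CLEAR_RETRIES);

	return !(status & STATUS_EVT_INT_MASK);
}
```

With a cap like this the caller has to decide what to do when the function
returns false, for example warn once and carry on polling the log, since the
interrupt bit may still be set at that point.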

