Message-ID: <baf17040-70e8-d850-30cd-74944e41285d@acm.org>
Date:   Mon, 13 Sep 2021 09:23:07 -0700
From:   Bart Van Assche <bvanassche@....org>
To:     Kiwoong Kim <kwmad.kim@...sung.com>, linux-scsi@...r.kernel.org,
        linux-kernel@...r.kernel.org, alim.akhtar@...sung.com,
        avri.altman@....com, jejb@...ux.ibm.com,
        martin.petersen@...cle.com, beanhuo@...ron.com,
        cang@...eaurora.org, adrian.hunter@...el.com, sc.suh@...sung.com,
        hy50.seo@...sung.com, sh425.lee@...sung.com,
        bhoon95.kim@...sung.com
Subject: Re: [PATCH v2 3/3] scsi: ufs: ufs-exynos: implement exynos isr

On 9/13/21 12:55 AM, Kiwoong Kim wrote:
> This patch is to raise recovery in some abnormal
> conditions using a vendor-specific interrupt
> for some cases, such as a situation that some
> contexts of a pending request in the host isn't
> the same with those of its corresponding UPIUs
> if they should have been the same exactly.
> 
> The representative case is shown like below.
> In the case, a broken UTRD entry, for internal
> coherent problem or whatever, that had smaller value
> of PRDT length than expected was transferred to the host.
> So, the host raised an interrupt of transfer complete
> even if device didn't finish its data transfer because
> the host sees a fetched version of UTRD to determine
> if data transfer is over or not. Then the application level
> seemed to recognize this as a sort of corruption and this
> symptom led to boot failure.

How can a UTRD entry be broken? Does that perhaps indicate memory
corruption at the host side? Working around host-side memory
corruption in a driver seems wrong to me. I think the root cause
of the memory corruption should be fixed.

> +static irqreturn_t exynos_ufs_isr(struct ufs_hba *hba)
> +{
> +	struct exynos_ufs *ufs = ufshcd_get_variant(hba);
> +	u32 status;
> +	unsigned long flags;
> +
> +	if (!hba->priv) return IRQ_HANDLED;

Please verify patches with checkpatch before posting these on the
linux-scsi mailing list. The above if-statement does not follow the
Linux kernel coding style.
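
For reference, a minimal sketch of how that early return could look once
it follows the kernel coding style (one statement per line, body indented
by one tab). The typedef and the struct below are stand-ins so that the
fragment compiles outside the kernel tree, and the function name is
hypothetical; none of them are the real definitions.

```c
/*
 * Stand-in definitions (assumptions). In the kernel these come from
 * <linux/irqreturn.h> and the ufshcd headers.
 */
typedef enum { IRQ_NONE = 0, IRQ_HANDLED = 1 } irqreturn_t;
struct ufs_hba { void *priv; };

/* Hypothetical name; only the early-return check is shown. */
static irqreturn_t exynos_ufs_isr_style_sketch(struct ufs_hba *hba)
{
	/* Condition and statement on separate lines. */
	if (!hba->priv)
		return IRQ_HANDLED;

	return IRQ_NONE;
}
```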

> +	if (status & RX_UPIU_HIT_ERROR) {
> +		pr_err("%s: status: 0x%08x\n", __func__, status);
> +		hba->force_reset = true;
> +		hba->force_requeue = true;
> +		scsi_schedule_eh(hba->host);
> +		spin_unlock_irqrestore(hba->host->host_lock, flags);
> +		return IRQ_HANDLED;
> +	}
> +	return IRQ_NONE;
> +}

So the above code unlocks the host_lock depending on whether or not
status & RX_UPIU_HIT_ERROR is true? Yikes ...

Additionally, in the above code I found the following pattern:

	unsigned long flags;
	[ ... ]
	spin_unlock_irqrestore(hba->host->host_lock, flags);

Such code is ALWAYS wrong. The value of the 'flags' argument passed to
spin_unlock_irqrestore() must come from spin_lock_irqsave().
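
A minimal sketch of the balanced pattern, assuming the handler needs
host_lock at all: 'flags' is filled in by spin_lock_irqsave() and the
unlock runs on every return path. The spinlock type, the IRQ state, the
RX_UPIU_HIT_ERROR bit value and the function name are user-space
stand-ins invented for illustration, not the kernel or driver
definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* User-space stand-ins (assumptions), only so the sketch compiles and runs. */
typedef uint32_t u32;
typedef enum { IRQ_NONE = 0, IRQ_HANDLED = 1 } irqreturn_t;
typedef struct { bool locked; } spinlock_t;

#define RX_UPIU_HIT_ERROR (1u << 0)	/* placeholder bit, not the real value */

static unsigned long mock_irq_state = 1;	/* 1 = "interrupts enabled" */

/* Like the kernel macro, this writes the saved state into 'flags'. */
#define spin_lock_irqsave(lock, flags)		\
	do {					\
		(flags) = mock_irq_state;	\
		mock_irq_state = 0;		\
		assert(!(lock)->locked);	\
		(lock)->locked = true;		\
	} while (0)

static void spin_unlock_irqrestore(spinlock_t *lock, unsigned long flags)
{
	assert(lock->locked);
	lock->locked = false;
	mock_irq_state = flags;
}

/* Hypothetical handler body: one lock, one unlock, a single exit path. */
static irqreturn_t isr_lock_sketch(spinlock_t *host_lock, u32 status,
				   bool *force_reset)
{
	unsigned long flags;
	irqreturn_t ret = IRQ_NONE;

	spin_lock_irqsave(host_lock, flags);	/* initializes 'flags' */
	if (status & RX_UPIU_HIT_ERROR) {
		*force_reset = true;
		ret = IRQ_HANDLED;
	}
	spin_unlock_irqrestore(host_lock, flags);	/* runs on both paths */

	return ret;
}
```

With a single exit path the lock is released whether or not the error
bit is set, and the value handed to spin_unlock_irqrestore() is always
the one saved by the matching spin_lock_irqsave().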

Bart.
