lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20250204143124.1252912-1-avri.altman@wdc.com>
Date: Tue,  4 Feb 2025 16:31:24 +0200
From: Avri Altman <avri.altman@....com>
To: "Martin K . Petersen" <martin.petersen@...cle.com>
Cc: linux-scsi@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	Bart Van Assche <bvanassche@....org>,
	Avri Altman <avri.altman@....com>
Subject: [PATCH v2] scsi: ufs: critical health condition

Martin hi,

The UFS4.1 standard, released on January 8, 2025, is adding several new
features. Among them is a new exception event: HEALTH_CRITICAL, which
notifies the host of a device's critical health condition. This
notification implies that the device is approaching the end of its
lifetime based on the amount of performed program/erase cycles.

Once an EOL (End-of-Life) exception event is received, we toggle a
`dev_info` designated member, which is exposed via a `sysfs` entry.
This new `sysfs` entry, will indicate whether a critical health
condition has been detected (1) or not (0).

To handle this new `sysfs` entry, `udev` rules should be configured to
monitor changes in the `critical_health` attribute. When the attribute
value changes to 1, indicating a critical health condition, `udev` can
trigger a user-space script or application to notify the user or take
appropriate actions.

The host can gain further insight into the specific issue by reading one
of the following attributes: bPreEOLInfo, bDeviceLifeTimeEstA,
bDeviceLifeTimeEstB, bWriteBoosterBufferLifeTimeEst, and
bRPMBLifeTimeEst. All those are available for reading via the driver's
sysfs entries or through an applicable utility. It is up to user-space
to read these attributes if needed.

Please consider this for the next merge window.

Thanks,
Avri

Signed-off-by: Avri Altman <avri.altman@....com>

---
Changes in v2:
 - withdraw from using hw-monitor subsystem (Guenter)
---
 Documentation/ABI/testing/sysfs-driver-ufs | 13 +++++++++++++
 drivers/ufs/core/ufs-sysfs.c               | 10 ++++++++++
 drivers/ufs/core/ufshcd.c                  |  6 ++++++
 include/ufs/ufs.h                          |  4 ++++
 4 files changed, 33 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-driver-ufs b/Documentation/ABI/testing/sysfs-driver-ufs
index 5fa6655aee84..a74ea70bcee4 100644
--- a/Documentation/ABI/testing/sysfs-driver-ufs
+++ b/Documentation/ABI/testing/sysfs-driver-ufs
@@ -1559,3 +1559,16 @@ Description:
 		Symbol - HCMID. This file shows the UFSHCD manufacturer id.
 		The Manufacturer ID is defined by JEDEC in JEDEC-JEP106.
 		The file is read only.
+
+What:		/sys/bus/platform/drivers/ufshcd/*/critical_health
+What:		/sys/bus/platform/devices/*.ufs/critical_health
+Date:		February 2025
+Contact:	Avri Altman <avri.altman@....com>
+Description:	indicate whether a critical health condition has been detected
+		(1) or not (0). further insight into the specific issue can be
+		gained by reading one of: bPreEOLInfo, bDeviceLifeTimeEstA,
+		bDeviceLifeTimeEstB, bWriteBoosterBufferLifeTimeEst, and
+		bRPMBLifeTimeEst.
+
+		The file is read only.
+
diff --git a/drivers/ufs/core/ufs-sysfs.c b/drivers/ufs/core/ufs-sysfs.c
index 3438269a5440..84de3e558a55 100644
--- a/drivers/ufs/core/ufs-sysfs.c
+++ b/drivers/ufs/core/ufs-sysfs.c
@@ -458,6 +458,14 @@ static ssize_t pm_qos_enable_store(struct device *dev,
 	return count;
 }
 
+static ssize_t critical_health_show(struct device *dev,
+				    struct device_attribute *attr, char *buf)
+{
+	struct ufs_hba *hba = dev_get_drvdata(dev);
+
+	return sysfs_emit(buf, "%d\n", hba->dev_info.critical_health);
+}
+
 static DEVICE_ATTR_RW(rpm_lvl);
 static DEVICE_ATTR_RO(rpm_target_dev_state);
 static DEVICE_ATTR_RO(rpm_target_link_state);
@@ -470,6 +478,7 @@ static DEVICE_ATTR_RW(enable_wb_buf_flush);
 static DEVICE_ATTR_RW(wb_flush_threshold);
 static DEVICE_ATTR_RW(rtc_update_ms);
 static DEVICE_ATTR_RW(pm_qos_enable);
+static DEVICE_ATTR_RO(critical_health);
 
 static struct attribute *ufs_sysfs_ufshcd_attrs[] = {
 	&dev_attr_rpm_lvl.attr,
@@ -484,6 +493,7 @@ static struct attribute *ufs_sysfs_ufshcd_attrs[] = {
 	&dev_attr_wb_flush_threshold.attr,
 	&dev_attr_rtc_update_ms.attr,
 	&dev_attr_pm_qos_enable.attr,
+	&dev_attr_critical_health.attr,
 	NULL
 };
 
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index cd404ade48dc..950878c0bb01 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -6216,6 +6216,9 @@ static void ufshcd_exception_event_handler(struct work_struct *work)
 	if (status & hba->ee_drv_mask & MASK_EE_URGENT_TEMP)
 		ufshcd_temp_exception_event_handler(hba, status);
 
+	if (status & hba->ee_drv_mask & MASK_EE_HEALTH_CRITICAL)
+		hba->dev_info.critical_health = true;
+
 	ufs_debugfs_exception_event(hba, status);
 }
 
@@ -8308,6 +8311,9 @@ static int ufs_get_device_desc(struct ufs_hba *hba)
 
 	ufshcd_temp_notif_probe(hba, desc_buf);
 
+	if (dev_info->wspecversion >= 0x410)
+		ufshcd_enable_ee(hba, MASK_EE_HEALTH_CRITICAL);
+
 	ufs_init_rtc(hba, desc_buf);
 
 	/*
diff --git a/include/ufs/ufs.h b/include/ufs/ufs.h
index 89672ad8c3bb..d25083098f66 100644
--- a/include/ufs/ufs.h
+++ b/include/ufs/ufs.h
@@ -419,6 +419,7 @@ enum {
 	MASK_EE_TOO_LOW_TEMP		= BIT(4),
 	MASK_EE_WRITEBOOSTER_EVENT	= BIT(5),
 	MASK_EE_PERFORMANCE_THROTTLING	= BIT(6),
+	MASK_EE_HEALTH_CRITICAL		= BIT(9),
 };
 #define MASK_EE_URGENT_TEMP (MASK_EE_TOO_HIGH_TEMP | MASK_EE_TOO_LOW_TEMP)
 
@@ -589,6 +590,9 @@ struct ufs_dev_info {
 	u32 rtc_update_period;
 
 	u8 rtt_cap; /* bDeviceRTTCap */
+
+	/* HEALTH_CRITICAL exception reported */
+	bool critical_health;
 };
 
 /*
-- 
2.25.1


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ