[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20210713075226.11094-1-ogabbay@kernel.org>
Date: Tue, 13 Jul 2021 10:52:16 +0300
From: Oded Gabbay <ogabbay@...nel.org>
To: linux-kernel@...r.kernel.org
Cc: Ofir Bitton <obitton@...ana.ai>
Subject: [PATCH 01/11] habanalabs/gaudi: trigger state dump in case of SM errors
From: Ofir Bitton <obitton@...ana.ai>
State dump is relevant to the user in case of Sync Manager error, so
we need to trigger it in that case as well.
Signed-off-by: Ofir Bitton <obitton@...ana.ai>
Reviewed-by: Oded Gabbay <ogabbay@...nel.org>
Signed-off-by: Oded Gabbay <ogabbay@...nel.org>
---
drivers/misc/habanalabs/gaudi/gaudi.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/misc/habanalabs/gaudi/gaudi.c b/drivers/misc/habanalabs/gaudi/gaudi.c
index fdbe8155ef3c..6cbedeee15d1 100644
--- a/drivers/misc/habanalabs/gaudi/gaudi.c
+++ b/drivers/misc/habanalabs/gaudi/gaudi.c
@@ -7894,8 +7894,9 @@ static void gaudi_handle_eqe(struct hl_device *hdev,
u32 ctl = le32_to_cpu(eq_entry->hdr.ctl);
u16 event_type = ((ctl & EQ_CTL_EVENT_TYPE_MASK)
>> EQ_CTL_EVENT_TYPE_SHIFT);
- u8 cause;
bool reset_required;
+ u8 cause;
+ int rc;
gaudi->events_stat[event_type]++;
gaudi->events_stat_aggregate[event_type]++;
@@ -8081,6 +8082,10 @@ static void gaudi_handle_eqe(struct hl_device *hdev,
gaudi_print_irq_info(hdev, event_type, false);
gaudi_print_sm_sei_info(hdev, event_type,
&eq_entry->sm_sei_data);
+ rc = hl_state_dump(hdev);
+ if (rc)
+ dev_err(hdev->dev,
+ "Error during system state dump %d\n", rc);
hl_fw_unmask_irq(hdev, event_type);
break;
--
2.25.1
Powered by blists - more mailing lists