Message-ID: <aAeosU3V02vWxD7Z@LouisNoVo>
Date: Tue, 22 Apr 2025 16:33:21 +0200
From: Louis Peens <louis.peens@...igine.com>
To: Michael Chan <michael.chan@...adcom.com>
Cc: davem@...emloft.net, netdev@...r.kernel.org, edumazet@...gle.com,
kuba@...nel.org, pabeni@...hat.com, andrew+netdev@...n.ch,
pavan.chebbi@...adcom.com, andrew.gospodarek@...adcom.com,
Kalesh AP <kalesh-anakkur.purayil@...adcom.com>
Subject: Re: [PATCH net-next v2 1/4] bnxt_en: Change FW message timeout warning
On Thu, Apr 17, 2025 at 10:24:45AM -0700, Michael Chan wrote:
> The firmware advertises a "hwrm_cmd_max_timeout" value to the driver
> for NVRAM and coredump related functions that can take tens of seconds
> to complete. The driver polls for the operation to complete under
> mutex and may trigger hung task watchdog warning if the wait is too long.
> To warn the user about this, the driver currently prints a warning if
> this advertised value exceeds 40 seconds:
>
> Device requests max timeout of %d seconds, may trigger hung task watchdog
>
> Initially, we chose 40 seconds, well below the kernel's default
> CONFIG_DEFAULT_HUNG_TASK_TIMEOUT (120 seconds) to avoid triggering
> the hung task watchdog. But 60 seconds is the timeout on most
> production FW and cannot be reduced further. Change the driver's warning
> threshold to 60 seconds to avoid triggering this warning on all
> production devices. We also print the warning if the value exceeds
> CONFIG_DEFAULT_HUNG_TASK_TIMEOUT which may be set to architecture
> specific defaults as low as 10 seconds.
>
> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@...adcom.com>
> Reviewed-by: Pavan Chebbi <pavan.chebbi@...adcom.com>
> Reviewed-by: Andy Gospodarek <andrew.gospodarek@...adcom.com>
> Signed-off-by: Michael Chan <michael.chan@...adcom.com>
> ---
> v2: Add check for CONFIG_DEFAULT_HUNG_TASK_TIMEOUT
Hi. Sorry if this is noise, but I have not seen this reported yet. I
think this change introduced a config dependency on 'DEBUG_KERNEL'. As far as I
can trace it, the dependency chain is:
DEFAULT_HUNG_TASK_TIMEOUT -> DETECT_HUNG_TASK -> DEBUG_KERNEL.
I have a 'local_defconfig' file which I regularly use for compiles,
and I had to add all three of these CONFIG settings to it to be able to
compile again; otherwise I encounter this error:
  drivers/net/ethernet/broadcom/bnxt/bnxt.c:10188:21: \
  error: 'CONFIG_DEFAULT_HUNG_TASK_TIMEOUT' undeclared (first use in this function)
      max_tmo_secs > CONFIG_DEFAULT_HUNG_TASK_TIMEOUT) {
                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Perhaps this was on purpose, but from what I can tell on a quick scan I don't
think it was.
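To illustrate what I mean, here is a rough standalone sketch (not the
actual driver code, and not a proposed patch) of one way such a check
could avoid the hard dependency: only reference the symbol when Kconfig
has defined it. The names 'WARN_TMO_SECS' and 'tmo_should_warn' are made
up for this example; the 60 second threshold is taken from the commit
message.

```c
#include <stdbool.h>

/* Hypothetical name; 60 seconds is the threshold from the commit message. */
#define WARN_TMO_SECS 60

/*
 * CONFIG_DEFAULT_HUNG_TASK_TIMEOUT is only defined when DETECT_HUNG_TASK
 * (and therefore DEBUG_KERNEL) is enabled, so it is referenced only inside
 * an #ifdef. Without the symbol, only the fixed threshold is checked.
 */
static bool tmo_should_warn(unsigned int max_tmo_secs)
{
	if (max_tmo_secs > WARN_TMO_SECS)
		return true;
#ifdef CONFIG_DEFAULT_HUNG_TASK_TIMEOUT
	if (max_tmo_secs > CONFIG_DEFAULT_HUNG_TASK_TIMEOUT)
		return true;
#endif
	return false;
}
```

With a guard like this the file would build whether or not the hung task
detector is configured, at the cost of skipping the second comparison
when it is not.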
Regards
Louis
>
> v1: https://lore.kernel.org/netdev/20250415174818.1088646-2-michael.chan@broadcom.com/
> ---
> --
> 2.30.1
>
>