[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ac157481-9cd3-47be-92f1-67a67b1b726d@acm.org>
Date: Wed, 4 Jun 2025 03:19:54 +0800
From: Bart Van Assche <bvanassche@....org>
To: "Masami Hiramatsu (Google)" <mhiramat@...nel.org>,
"James E . J . Bottomley" <James.Bottomley@...senPartnership.com>,
"Martin K . Petersen" <martin.petersen@...cle.com>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>,
linux-scsi@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] sd: Add timeout_sec and max_retries module parameter
for sd
On 6/3/25 1:40 PM, Masami Hiramatsu (Google) wrote:
> For example, enabling CONFIG_DETECT_HUNG_TASK_BLOCKER, I got an
> error message something like below (Note that this is 6.1 kernel
> example, so the function names are a bit different.);
>
> INFO: task udevd:5301 blocked for more than 122 seconds.
> ...
> INFO: task udevd:5301 is blocked on a mutex likely owned by task kworker/u4:1:11.
> task:kworker/u4:1state:D stack:0 pid:11ppid:2 flags:0x00004000
> Workqueue: events_unbound async_run_entry_fn
> Call Trace:
> <TASK>
> schedule+0x438/0x1490
> ? blk_mq_do_dispatch_ctx+0x70/0x1c0
> schedule_timeout+0x253/0x790
> ? try_to_del_timer_sync+0xb0/0xb0
> io_schedule_timeout+0x3f/0x80
> wait_for_common_io+0xb4/0x160
> blk_execute_rq+0x1bd/0x210
> __scsi_execute+0x156/0x240
> sd_revalidate_disk+0xa2a/0x2360
> ? kobject_uevent_env+0x158/0x430
> sd_probe+0x364/0x47
> really_probe+0x15a/0x3b0
> __driver_probe_device+0x78/0xc0
> driver_probe_device+0x24/0x1a0
> __device_attach_driver+0x131/0x160
> ? coredump_store+0x50/0x50
> bus_for_each_drv+0x9d/0xf0
> __device_attach_async_helper+0x7e/0xd0 <=== device_lock()
> ...
How can this happen? The following code should prevent that a hung task
complaint appears while blk_execute_rq() is waiting:
static inline void blk_wait_io(struct completion *done)
{
/* Prevent hang_check timer from firing at us during very long I/O */
unsigned long timeout = sysctl_hung_task_timeout_secs * HZ / 2;
if (timeout)
while (!wait_for_completion_io_timeout(done, timeout))
;
else
wait_for_completion_io(done);
}
> +/* timeout_sec defines the default value of the SCSI command timeout in second. */
> +static int sd_timeout_sec = SD_TIMEOUT / HZ;
> +module_param_named(timeout_sec, sd_timeout_sec, int, 0644);
> +
> +/*
> + * write_same_timeout_sec defines the default value of the WRITE SAME SCSI
> + * command timeout in second.
> + */
> +static int sd_write_same_timeout_sec = SD_WRITE_SAME_TIMEOUT / HZ;
> +module_param_named(write_same_timeout_sec, sd_write_same_timeout_sec, int, 0644);
> +
> +/* max_retries defines the default value of the max of SCSI command retries.*/
> +static int sd_max_retries = SD_MAX_RETRIES;
> +module_param_named(max_retries, sd_max_retries, int, 0644);
Shouldn't these parameters come from the SCSI host template rather than
introducing new kernel module parameters? It is impossible to make a
good choice for these kernel module parameters if multiple types of SCSI
devices are present (USB, HDD, ...).
Bart.
Powered by blists - more mailing lists