lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ac157481-9cd3-47be-92f1-67a67b1b726d@acm.org>
Date: Wed, 4 Jun 2025 03:19:54 +0800
From: Bart Van Assche <bvanassche@....org>
To: "Masami Hiramatsu (Google)" <mhiramat@...nel.org>,
 "James E . J . Bottomley" <James.Bottomley@...senPartnership.com>,
 "Martin K . Petersen" <martin.petersen@...cle.com>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>,
 linux-scsi@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] sd: Add timeout_sec and max_retries module parameter
 for sd

On 6/3/25 1:40 PM, Masami Hiramatsu (Google) wrote:
> For example, enabling CONFIG_DETECT_HUNG_TASK_BLOCKER, I got an
> error message something like below (Note that this is 6.1 kernel
> example, so the function names are a bit different.);
> 
>   INFO: task udevd:5301 blocked for more than 122 seconds.
> ...
>   INFO: task udevd:5301 is blocked on a mutex likely owned by task kworker/u4:1:11.
>   task:kworker/u4:1state:D stack:0 pid:11ppid:2  flags:0x00004000
>   Workqueue: events_unbound async_run_entry_fn
>   Call Trace:
>    <TASK>
>    schedule+0x438/0x1490
>    ? blk_mq_do_dispatch_ctx+0x70/0x1c0
>    schedule_timeout+0x253/0x790
>    ? try_to_del_timer_sync+0xb0/0xb0
>    io_schedule_timeout+0x3f/0x80
>    wait_for_common_io+0xb4/0x160
>    blk_execute_rq+0x1bd/0x210
>    __scsi_execute+0x156/0x240
>    sd_revalidate_disk+0xa2a/0x2360
>    ? kobject_uevent_env+0x158/0x430
>    sd_probe+0x364/0x47
>    really_probe+0x15a/0x3b0
>    __driver_probe_device+0x78/0xc0
>    driver_probe_device+0x24/0x1a0
>    __device_attach_driver+0x131/0x160
>    ? coredump_store+0x50/0x50
>    bus_for_each_drv+0x9d/0xf0
>    __device_attach_async_helper+0x7e/0xd0  <=== device_lock()
> ...

How can this happen? The following code should prevent that a hung task
complaint appears while blk_execute_rq() is waiting:

static inline void blk_wait_io(struct completion *done)
{
	/* Prevent hang_check timer from firing at us during very long I/O */
	unsigned long timeout = sysctl_hung_task_timeout_secs * HZ / 2;

	if (timeout)
		while (!wait_for_completion_io_timeout(done, timeout))
			;
	else
		wait_for_completion_io(done);
}

> +/* timeout_sec defines the default value of the SCSI command timeout in second. */
> +static int sd_timeout_sec = SD_TIMEOUT / HZ;
> +module_param_named(timeout_sec, sd_timeout_sec, int, 0644);
> +
> +/*
> + * write_same_timeout_sec defines the default value of the WRITE SAME SCSI
> + * command timeout in second.
> + */
> +static int sd_write_same_timeout_sec = SD_WRITE_SAME_TIMEOUT / HZ;
> +module_param_named(write_same_timeout_sec, sd_write_same_timeout_sec, int, 0644);
> +
> +/* max_retries defines the default value of the max of SCSI command retries.*/
> +static int sd_max_retries = SD_MAX_RETRIES;
> +module_param_named(max_retries, sd_max_retries, int, 0644);

Shouldn't these parameters come from the SCSI host template rather than
introducing new kernel module parameters? It is impossible to make a 
good choice for these kernel module parameters if multiple types of SCSI
devices are present (USB, HDD, ...).

Bart.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ