[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAH2r5msm+xFyMPCJV7Luf6aQ04cHOhevmjJi8J-2Bf711T1ziQ@mail.gmail.com>
Date: Sat, 13 Apr 2024 04:02:15 -0500
From: Steve French <smfrench@...il.com>
To: Rickard Andersson <rickaran@...s.com>
Cc: linux-cifs@...r.kernel.org, samba-technical@...ts.samba.org,
sfrench@...ba.org, pc@...guebit.com, ronniesahlberg@...il.com,
sprasad@...rosoft.com, tom@...pey.com, linux-kernel@...r.kernel.org,
rickard314.andersson@...il.com, kernel@...s.com
Subject: Re: [PATCH v2 1/1] smb: client: Fix hang in smb2_reconnect
Shyam and I tried some experiments and there are two cases where the
patch breaks:
1) ChangeNotify will time out
2) Certainly byte range lock calls (they can be allowed to block) will timeout
An obvious alternative would be to not make this change for the
commands like ChangeNotify and blocking locks but allow it for the
others.
On Tue, Apr 9, 2024 at 6:29 AM Rickard Andersson <rickaran@...s.com> wrote:
>
> From: Rickard x Andersson <rickaran@...s.com>
>
> Test case:
> mount -t cifs //192.168.0.1/test y -o
> port=19999,ro,vers=2.1,sec=none,echo_interval=1
> kill smbd with SIGSTOP
> umount /tmp/y
>
> Gives the following error:
> INFO: task umount:466 blocked for more than 122 seconds.
> Not tainted 6.8.2-axis9-devel #1
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
> message.
> task:umount state:D stack:0 pid:466 tgid:466 ppid:464 flags:0x00000004
> __schedule from schedule+0x34/0x108
> schedule from schedule_preempt_disabled+0x24/0x34
> schedule_preempt_disabled from __mutex_lock.constprop.0+0x330/0x8b0
> __mutex_lock.constprop.0 from smb2_reconnect+0x278/0x8fc [cifs]
> smb2_reconnect [cifs] from SMB2_open_init+0x54/0x9f4 [cifs]
> SMB2_open_init [cifs] from smb2_query_info_compound+0x1a0/0x500[cifs]
> smb2_query_info_compound [cifs] from smb2_queryfs+0x64/0x134 [cifs]
> smb2_queryfs [cifs] from cifs_statfs+0xc8/0x318 [cifs]
> cifs_statfs [cifs] from statfs_by_dentry+0x60/0x84
> statfs_by_dentry from fd_statfs+0x30/0x74
> fd_statfs from sys_fstatfs64+0x40/0x6c
> sys_fstatfs64 from ret_fast_syscall+0x0/0x54
>
> The umount task is blocked waiting on the session mutex. The reason it
> never gets the session mutex is because 'kworker/0:3' is holding the
> mutex and is waiting for response (see line 1209 in
> fs/smb/client/smb2pdu.c.
>
> Stack trace of 'kworker/0:3' just before calling wait_for_response:
> CPU: 0 PID: 220 Comm: kworker/0:3 Not tainted 6.8.2-axis9-devel #1
> Hardware name: Freescale i.MX6 SoloX (Device Tree)
> Workqueue: cifsiod smb2_reconnect_server [cifs]
> unwind_backtrace from show_stack+0x18/0x1c
> show_stack from dump_stack_lvl+0x24/0x2c
> dump_stack_lvl from compound_send_recv+0x7bc/0xac8 [cifs]
> compound_send_recv [cifs] from cifs_send_recv+0x34/0x3c [cifs]
> cifs_send_recv [cifs] from SMB2_negotiate+0x410/0x13dc [cifs]
> SMB2_negotiate [cifs] from smb2_negotiate+0x4c/0x58 [cifs]
> smb2_negotiate [cifs] from cifs_negotiate_protocol+0x9c/0x100 [cifs]
> cifs_negotiate_protocol [cifs] from smb2_reconnect+0x418/0x8fc [cifs]
> smb2_reconnect [cifs] from smb2_reconnect_server+0x1dc/0x514 [cifs]
> smb2_reconnect_server [cifs] from process_one_work+0x188/0x3ec
> process_one_work from worker_thread+0x1fc/0x430
> worker_thread from kthread+0x110/0x130
> kthread from ret_from_fork+0x14/0x28
>
> Change-Id: I53439ffb007c9c51d77ce40fb655a34e5ca291ec
> Signed-off-by: Rickard x Andersson <rickaran@...s.com>
> ---
> fs/smb/client/transport.c | 15 +++++++++------
> 1 file changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/fs/smb/client/transport.c b/fs/smb/client/transport.c
> index 994d70193432..96b8f8757ddc 100644
> --- a/fs/smb/client/transport.c
> +++ b/fs/smb/client/transport.c
> @@ -32,6 +32,8 @@
> /* Max number of iovectors we can use off the stack when sending requests. */
> #define CIFS_MAX_IOV_SIZE 8
>
> +#define RESPONSE_TIMEOUT_SECS 50
> +
> void
> cifs_wake_up_task(struct mid_q_entry *mid)
> {
> @@ -735,13 +737,14 @@ static int allocate_mid(struct cifs_ses *ses, struct smb_hdr *in_buf,
> static int
> wait_for_response(struct TCP_Server_Info *server, struct mid_q_entry *midQ)
> {
> - int error;
> + int ret;
> +
> + ret = wait_event_killable_timeout(server->response_q,
> + midQ->mid_state != MID_REQUEST_SUBMITTED &&
> + midQ->mid_state != MID_RESPONSE_RECEIVED,
> + RESPONSE_TIMEOUT_SECS*HZ);
>
> - error = wait_event_state(server->response_q,
> - midQ->mid_state != MID_REQUEST_SUBMITTED &&
> - midQ->mid_state != MID_RESPONSE_RECEIVED,
> - (TASK_KILLABLE|TASK_FREEZABLE_UNSAFE));
> - if (error < 0)
> + if ((ret < 0) || (ret == 0))
> return -ERESTARTSYS;
>
> return 0;
> --
> 2.30.2
>
>
--
Thanks,
Steve
Powered by blists - more mailing lists