[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <2c8985b8.107370.1839e591d73.Coremail.duoming@zju.edu.cn>
Date: Mon, 3 Oct 2022 22:56:23 +0800 (GMT+08:00)
From: duoming@....edu.cn
To: "Dmitry Bogdanov" <d.bogdanov@...ro.com>
Cc: linux-kernel@...r.kernel.org, martin.petersen@...cle.com,
kuba@...nel.org, davem@...emloft.net, andrii@...nel.org,
gregkh@...uxfoundation.org, linux-scsi@...r.kernel.org,
target-devel@...r.kernel.org
Subject: Re: [PATCH] scsi: target: iscsi: cxgbit: fix
sleep-in-atomic-context bug in cxgbit_abort_conn
Hello,
On Mon, 3 Oct 2022 17:46:02 +0300 Dmitry Bogdanov wrote:
> On Sun, Oct 02, 2022 at 09:40:47AM +0800, Duoming Zhou wrote:
> >
> > The function iscsit_handle_time2retain_timeout() is a timer handler that
> > runs in an atomic context, but it calls "alloc_skb(0, GFP_KERNEL | ...)"
> > that may sleep. As a result, the sleep-in-atomic-context bug will happen.
> > The process is shown below:
> >
> > iscsit_handle_time2retain_timeout()
> > iscsit_close_session()
> > iscsit_free_connection_recovery_entries()
> > iscsit_free_cmd()
> > __iscsit_free_cmd()
> > cxgbit_unmap_cmd()
> > cxgbit_abort_conn()
> > alloc_skb(0, GFP_KERNEL | ...) //may sleep
> >
> > This patch changes the gfp_t parameter of alloc_skb() from GFP_KERNEL to
> > GFP_ATOMIC in order to mitigate the bug.
> >
> > Fixes: 1ae01724ae92 ("cxgbit: Abort the TCP connection in case of data out timeout")
> > Signed-off-by: Duoming Zhou <duoming@....edu.cn>
> > ---
> > drivers/target/iscsi/cxgbit/cxgbit_cm.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/target/iscsi/cxgbit/cxgbit_cm.c b/drivers/target/iscsi/cxgbit/cxgbit_cm.c
> > index 3336d2b78bf..eb3da6d2c62 100644
> > --- a/drivers/target/iscsi/cxgbit/cxgbit_cm.c
> > +++ b/drivers/target/iscsi/cxgbit/cxgbit_cm.c
> > @@ -697,7 +697,7 @@ __cxgbit_abort_conn(struct cxgbit_sock *csk, struct sk_buff *skb)
> >
> > void cxgbit_abort_conn(struct cxgbit_sock *csk)
> > {
> > - struct sk_buff *skb = alloc_skb(0, GFP_KERNEL | __GFP_NOFAIL);
> > + struct sk_buff *skb = alloc_skb(0, GFP_ATOMIC | __GFP_NOFAIL);
> >
> > cxgbit_get_csk(csk);
> > cxgbit_init_wr_wait(&csk->com.wr_wait);
> > --
> > 2.17.1
> >
>
> The last line in cxgbit_abort_conn is cxgbit_wait_for_reply() which
> also should not be called in interrupt context.
I agree with you.
> Anyway this issue is not due to cxgbit, it is common for iSCSI itself:
> iscsit_close_session()
> iscsit_free_connection_recovery_entries()
> iscsit_free_cmd()
> transport_generic_free_cmd()
> target_wait_free_cmd()
> wait_for_completion_timeout()
I understand.
> IMHO, there is no reason to call iscsit_close_session in an atomic context.
> I have two patches relaited Time2Retain timer. I will share them today.
That's great, thank you!
Best regards,
Duoming Zhou
Powered by blists - more mailing lists