Message-ID: <CANn89i+ooMT_G9aL8keZ-WOcAKqpC44OLQNGvfUtjA6PW-yxcA@mail.gmail.com>
Date: Tue, 28 Feb 2023 12:23:46 +0100
From: Eric Dumazet <edumazet@...gle.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: syzbot <syzbot+9c0268252b8ef967c62e@...kaller.appspotmail.com>,
borisp@...dia.com, bpf@...r.kernel.org, davem@...emloft.net,
john.fastabend@...il.com, linux-kernel@...r.kernel.org,
netdev@...r.kernel.org, pabeni@...hat.com,
syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [net?] INFO: task hung in tls_sw_sendpage (3)
On Tue, Feb 28, 2023 at 12:53 AM Jakub Kicinski <kuba@...nel.org> wrote:
>
> On Mon, 27 Feb 2023 21:35:41 +0100 Eric Dumazet wrote:
> > This looks suspicious to me
> >
> > commit 79ffe6087e9145d2377385cac48d0d6a6b4225a5
> > Author: Jakub Kicinski <kuba@...nel.org>
> > Date: Tue Nov 5 14:24:35 2019 -0800
> >
> > net/tls: add a TX lock
> >
> >
> > If tls_sw_sendpage() has to call sk_stream_wait_memory(),
> > sk_stream_wait_memory() is properly releasing the socket lock,
> > but knows nothing about mutex_{un}lock(&tls_ctx->tx_lock);
>
> That's supposed to be the point of the lock, prevent new writers from
> messing with the partially pushed records when the original writer
> is waiting for write space.
>
> Obvious hack but the async crypto support makes TLS a bit of a mess :|
>
> sendpage_lock not taking tx_lock may lead to obvious problems, I'm not
> seeing where the deadlock is, tho..
>
This report mentions sendpage, but sendmsg() would have the same issue.
A thread might be blocked in sk_stream_wait_memory() with tx_lock held
for an arbitrary amount of time, say if the remote peer stays in RWIN 0
for hours.
This prevents tx_work from making progress, and tls_sw_cancel_work_tx()
would be stuck for as long as the writer stays blocked.
The consensus is that the kernel shouts a warning if a thread has been
waiting on a mutex for more than 120 seconds (the default timeout, see
check_hung_uninterruptible_tasks()).