lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 28 Feb 2023 12:23:46 +0100
From:   Eric Dumazet <edumazet@...gle.com>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     syzbot <syzbot+9c0268252b8ef967c62e@...kaller.appspotmail.com>,
        borisp@...dia.com, bpf@...r.kernel.org, davem@...emloft.net,
        john.fastabend@...il.com, linux-kernel@...r.kernel.org,
        netdev@...r.kernel.org, pabeni@...hat.com,
        syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [net?] INFO: task hung in tls_sw_sendpage (3)

On Tue, Feb 28, 2023 at 12:53 AM Jakub Kicinski <kuba@...nel.org> wrote:
>
> On Mon, 27 Feb 2023 21:35:41 +0100 Eric Dumazet wrote:
> > This looks suspicious to me
> >
> > commit 79ffe6087e9145d2377385cac48d0d6a6b4225a5
> > Author: Jakub Kicinski <kuba@...nel.org>
> > Date:   Tue Nov 5 14:24:35 2019 -0800
> >
> >     net/tls: add a TX lock
> >
> >
> > If tls_sw_sendpage() has to call sk_stream_wait_memory(),
> > sk_stream_wait_memory() is properly releasing the socket lock,
> > but knows nothing about mutex_{un}lock(&tls_ctx->tx_lock);
>
> That's supposed to be the point of the lock, prevent new writers from
> messing with the partially pushed records when the original writer
> is waiting for write space.
>
> Obvious hack but the async crypto support makes TLS a bit of a mess :|
>
> sendpage_lock not taking tx_lock may lead to obvious problems, I'm not
> seeing where the deadlock is, tho..
>

This report mentions sendpage, but sendmsg() would have the same issue.

A thread might be blocked in sk_stream_wait_memory() with the mutex
held, for an arbitrary amount of time,
say if the remote peer stays in RWIN 0 for hours.

This prevents tx_work from making progress, and
tls_sw_cancel_work_tx() would be stuck forever.

The consensus is that the kernel shouts a warning if a thread has been
waiting on a mutex
more than 120 seconds (check_hung_uninterruptible_tasks())

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ