Date:   Wed, 13 Apr 2022 10:14:22 -0700
From:   Eric Dumazet <edumazet@...gle.com>
To:     David Howells <dhowells@...hat.com>
Cc:     netdev@...r.kernel.org, Marc Dionne <marc.dionne@...istor.com>,
        linux-afs@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net] rxrpc: Restore removed timer deletion

On Wed, Apr 13, 2022 at 3:16 AM David Howells <dhowells@...hat.com> wrote:
>
> A recent patch[1] from Eric Dumazet flipped the order in which the
> keepalive timer and the keepalive worker were cancelled in order to fix a
> syzbot reported issue[2].  Unfortunately, this enables the mirror image bug
> whereby the timer races with rxrpc_exit_net(), restarting the worker after
> it has been cancelled:
>
>         CPU 1           CPU 2
>         =============== =====================
>                         if (rxnet->live)
>                         <INTERRUPT>
>         rxnet->live = false;
>         cancel_work_sync(&rxnet->peer_keepalive_work);
>                         rxrpc_queue_work(&rxnet->peer_keepalive_work);
>         del_timer_sync(&rxnet->peer_keepalive_timer);
>
> Fix this by restoring the removed del_timer_sync() so that we try to remove
> the timer twice.  If the timer runs again, it should see ->live == false
> and not restart the worker.
>
> Fixes: 1946014ca3b1 ("rxrpc: fix a race in rxrpc_exit_net()")
> Signed-off-by: David Howells <dhowells@...hat.com>
> cc: Eric Dumazet <edumazet@...gle.com>
> cc: Marc Dionne <marc.dionne@...istor.com>
> cc: linux-afs@...ts.infradead.org
> Link: https://lore.kernel.org/r/20220404183439.3537837-1-eric.dumazet@gmail.com/ [1]
> Link: https://syzkaller.appspot.com/bug?extid=724378c4bb58f703b09a [2]
> ---
>
>  net/rxrpc/net_ns.c |    2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/net/rxrpc/net_ns.c b/net/rxrpc/net_ns.c
> index f15d6942da45..cc7e30733feb 100644
> --- a/net/rxrpc/net_ns.c
> +++ b/net/rxrpc/net_ns.c
> @@ -113,7 +113,9 @@ static __net_exit void rxrpc_exit_net(struct net *net)
>         struct rxrpc_net *rxnet = rxrpc_net(net);
>
>         rxnet->live = false;
> +       del_timer_sync(&rxnet->peer_keepalive_timer);
>         cancel_work_sync(&rxnet->peer_keepalive_work);
> +       /* Remove the timer again as the worker may have restarted it. */
>         del_timer_sync(&rxnet->peer_keepalive_timer);
>         rxrpc_destroy_all_calls(rxnet);
>         rxrpc_destroy_all_connections(rxnet);
>
>
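To make the interleaving in the changelog concrete, here is a minimal userspace sketch in plain C. It is not kernel code: struct model, model_del_timer_sync(), model_cancel_work_sync(), leftover_after_exit() and check_orderings() are hypothetical names that only mimic the "cancel, then wait for whatever is already running" behaviour of del_timer_sync() and cancel_work_sync(). It also assumes the worker, like the timer handler, can be caught after it has already decided to re-arm its counterpart.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model {
	bool live;           /* stands in for rxnet->live */
	bool timer_pending;  /* keepalive timer is armed */
	bool work_pending;   /* keepalive work is queued */
	bool handler_midway; /* timer handler saw live == true, work not queued yet */
	bool worker_midway;  /* assumed: worker is running and about to re-arm the timer */
};

enum order {
	TIMER_THEN_WORK,  /* order before 1946014ca3b1 */
	WORK_THEN_TIMER,  /* order after 1946014ca3b1 */
	TIMER_WORK_TIMER  /* order proposed in this patch */
};

/* Mimics del_timer_sync(): disarm the timer, then wait for a running handler. */
static void model_del_timer_sync(struct model *m)
{
	m->timer_pending = false;
	if (m->handler_midway) {
		m->handler_midway = false;
		m->work_pending = true;  /* the handler finishes by queueing the work */
	}
}

/* Mimics cancel_work_sync(): dequeue the work, then wait for a running worker. */
static void model_cancel_work_sync(struct model *m)
{
	m->work_pending = false;
	if (m->worker_midway) {
		m->worker_midway = false;
		m->timer_pending = true; /* the worker finishes by re-arming the timer */
	}
}

/* Runs one teardown order on a copy of the state; true if anything is left pending. */
bool leftover_after_exit(enum order o, struct model m)
{
	m.live = false;
	switch (o) {
	case TIMER_THEN_WORK:
		model_del_timer_sync(&m);
		model_cancel_work_sync(&m);
		break;
	case WORK_THEN_TIMER:
		model_cancel_work_sync(&m);
		model_del_timer_sync(&m);
		break;
	case TIMER_WORK_TIMER:
		model_del_timer_sync(&m);
		model_cancel_work_sync(&m);
		model_del_timer_sync(&m);
		break;
	}
	return m.timer_pending || m.work_pending;
}

/* Usage example: walk each order through both mid-flight states. */
void check_orderings(void)
{
	struct model handler_caught = { .live = true, .handler_midway = true };
	struct model worker_caught = { .live = true, .worker_midway = true };

	assert(!leftover_after_exit(TIMER_THEN_WORK, handler_caught));
	assert(leftover_after_exit(TIMER_THEN_WORK, worker_caught));  /* timer left armed */
	assert(leftover_after_exit(WORK_THEN_TIMER, handler_caught)); /* work left queued */
	assert(!leftover_after_exit(WORK_THEN_TIMER, worker_caught));
	assert(!leftover_after_exit(TIMER_WORK_TIMER, handler_caught));
	assert(!leftover_after_exit(TIMER_WORK_TIMER, worker_caught));
}
```

Under those assumptions, only the timer/work/timer order from the patch leaves nothing pending in both mid-flight cases. The model does not cover the re-armed timer firing before the second deletion; per the changelog, the handler should then see ->live == false and not restart the worker.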

OK... so we have a timer and a work item, each activating the other
in a kind of ping-pong?

Any particular reason for not using delayed work?

Thanks.
