lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJwJo6byPNeA_K3kgx-xtEpNMNja3+GrfwzhxtAxE4QE4S6-OA@mail.gmail.com>
Date: Thu, 25 Jul 2024 09:50:06 +0100
From: Dmitry Safonov <0x7f454c46@...il.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: "David S. Miller" <davem@...emloft.net>, David Ahern <dsahern@...nel.org>, 
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, netdev@...r.kernel.org, 
	linux-kernel@...r.kernel.org, stable@...nel.org
Subject: Re: [PATCH net] net/tcp: Disable TCP-AO static key after RCU grace period

Hi Eric, thanks for looking into this,

On Thu, 25 Jul 2024 at 08:48, Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Thu, Jul 25, 2024 at 7:00 AM Dmitry Safonov via B4 Relay
> <devnull+0x7f454c46.gmail.com@...nel.org> wrote:
> >
> > From: Dmitry Safonov <0x7f454c46@...il.com>
> >
> > The lifetime of TCP-AO static_key is the same as the last
> > tcp_ao_info. On the socket destruction tcp_ao_info ceases to be
> > with RCU grace period, while tcp-ao static branch is currently deferred
> > destructed. The static key definition is
> > : DEFINE_STATIC_KEY_DEFERRED_FALSE(tcp_ao_needed, HZ);
> >
> > which means that if RCU grace period is delayed by more than a second
> > and tcp_ao_needed is in the process of disablement, other CPUs may
> > yet see tcp_ao_info which atent dead, but soon-to-be.
>
> > And that breaks the assumption of static_key_fast_inc_not_disabled().
>
> I am afraid I do not understand this changelog at all.
>
> What is "the assumption of static_key_fast_inc_not_disabled()"  you
> are referring to ?
>
> I think it would help to provide more details.

Sorry, my bad, I'm referring here to the comment/description for the function:

 * static_key_fast_inc_not_disabled - adds a user for a static key
 * @key: static key that must be already enabled
 *
 * The caller must make sure that the static key can't get disabled while
 * in this function. It doesn't patch jump labels, only adds a user to
 * an already enabled static key.

Originally it was introduced in commit eb8c507296f6 ("jump_label:
Prevent key->enabled int overflow"), which is needed for the atomic
contexts, one of which would be the creation of a full socket from a
request socket. In that atomic context, we know by the presence of the
key (md5/ao) that the static branch is already enabled. So, we can
just increment the ref counter for that static branch instead of
holding the proper mutex. static_key_fast_inc_not_disabled() is just a
helper for that usage case. But it must not be used if the static
branch could get disabled in parallel as it's not protected by
jump_label_mutex and as a result, races with jump_label_update()
implementation details.

Specifically, from the log in [1], I see that jump_label_type()
wrongly tells arch_jump_label_transform_queue() to enable the
static_brach, when the caller was, in fact,
__static_key_slow_dec_cpuslocked() - requesting to disable that. And
then, the x86-specific code produces:
: jump_label: Fatal kernel bug, unexpected op at
tcp_inbound_hash+0x1a7/0x870 [ffffffffa8c4e9b7] (eb 50 0f 1f 44 != 66
90 0f 1f 00)) size:2 type:1

when it tries to enable the static key; but the op-code is not no-op,
it's 2-byte jump. The reason for that of course is that intended
operation was to disable the branch, but it has raced with this
increment helper.

Hopefully, that clarifies somewhat the situation here.

Thankfully, for TCP-MD5 I did a better job: tcp_md5sig_info_free_rcu()
and tcp_md5_twsk_free_rcu() are RCU callbacks.

Also, please note that I intentionally call
static_branch_slow_dec_deferred() variant in the RCU callback, rather
than synchronous. The reason for that:

 * When the control is directly exposed to userspace, it is prudent to delay the
 * decrement to avoid high frequency code modifications which can (and do)
 * cause significant performance degradation. Struct static_key_deferred and
 * static_key_slow_dec_deferred() provide for this.

>
> >
> > Happened on netdev test-bot[1], so not a theoretical issue:
> >
> > [] jump_label: Fatal kernel bug, unexpected op at tcp_inbound_hash+0x1a7/0x870 [ffffffffa8c4e9b7] (eb 50 0f 1f 44 != 66 90 0f 1f 00)) size:2 type:1
> > [] ------------[ cut here ]------------
> > [] kernel BUG at arch/x86/kernel/jump_label.c:73!
> > [] Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
> > [] CPU: 3 PID: 243 Comm: kworker/3:3 Not tainted 6.10.0-virtme #1
> > [] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
> > [] Workqueue: events jump_label_update_timeout
> > [] RIP: 0010:__jump_label_patch+0x2f6/0x350
> > ...
> > [] Call Trace:
> > []  <TASK>
> > []  arch_jump_label_transform_queue+0x6c/0x110
> > []  __jump_label_update+0xef/0x350
> > []  __static_key_slow_dec_cpuslocked.part.0+0x3c/0x60
> > []  jump_label_update_timeout+0x2c/0x40
> > []  process_one_work+0xe3b/0x1670
> > []  worker_thread+0x587/0xce0
> > []  kthread+0x28a/0x350
> > []  ret_from_fork+0x31/0x70
> > []  ret_from_fork_asm+0x1a/0x30
> > []  </TASK>
> > [] Modules linked in: veth
> > [] ---[ end trace 0000000000000000 ]---
> > [] RIP: 0010:__jump_label_patch+0x2f6/0x350
> >
> > [1]: https://netdev-3.bots.linux.dev/vmksft-tcp-ao-dbg/results/696681/5-connect-deny-ipv6/stderr
> >
> > Cc: stable@...nel.org
> > Fixes: 67fa83f7c86a ("net/tcp: Add static_key for TCP-AO")
> > Signed-off-by: Dmitry Safonov <0x7f454c46@...il.com>
> > ---
> > ---
> >  net/ipv4/tcp_ao.c | 12 +++++++++---
> >  1 file changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c
> > index 85531437890c..5ce914d3e3db 100644
> > --- a/net/ipv4/tcp_ao.c
> > +++ b/net/ipv4/tcp_ao.c
> > @@ -267,6 +267,14 @@ static void tcp_ao_key_free_rcu(struct rcu_head *head)
> >         kfree_sensitive(key);
> >  }
> >
> > +static void tcp_ao_info_free_rcu(struct rcu_head *head)
> > +{
> > +       struct tcp_ao_info *ao = container_of(head, struct tcp_ao_info, rcu);
> > +
> > +       kfree(ao);
> > +       static_branch_slow_dec_deferred(&tcp_ao_needed);
> > +}
> > +
> >  void tcp_ao_destroy_sock(struct sock *sk, bool twsk)
> >  {
> >         struct tcp_ao_info *ao;
> > @@ -290,9 +298,7 @@ void tcp_ao_destroy_sock(struct sock *sk, bool twsk)
> >                         atomic_sub(tcp_ao_sizeof_key(key), &sk->sk_omem_alloc);
> >                 call_rcu(&key->rcu, tcp_ao_key_free_rcu);
> >         }
> > -
> > -       kfree_rcu(ao, rcu);
> > -       static_branch_slow_dec_deferred(&tcp_ao_needed);
> > +       call_rcu(&ao->rcu, tcp_ao_info_free_rcu);
> >  }
> >
> >  void tcp_ao_time_wait(struct tcp_timewait_sock *tcptw, struct tcp_sock *tp)
> >
> > ---
> > base-commit: c33ffdb70cc6df4105160f991288e7d2567d7ffa
> > change-id: 20240725-tcp-ao-static-branch-rcu-85ede7b3a1a5
> >
> > Best regards,
> > --
> > Dmitry Safonov <0x7f454c46@...il.com>
> >
> >


Thanks,
             Dmitry

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ