Date:   Wed, 10 Nov 2021 18:06:28 +0100
From:   Ard Biesheuvel <ardb@...nel.org>
To:     Eric Dumazet <edumazet@...gle.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Eric Dumazet <eric.dumazet@...il.com>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Jason Baron <jbaron@...mai.com>,
        "Steven Rostedt (VMware)" <rostedt@...dmis.org>
Subject: Re: [PATCH 2/2] jump_label: refine placement of static_keys

On Wed, 10 Nov 2021 at 16:22, Eric Dumazet <edumazet@...gle.com> wrote:
>
> On Wed, Nov 10, 2021 at 2:24 AM Ard Biesheuvel <ardb@...nel.org> wrote:
> >
> > On Wed, 10 Nov 2021 at 09:36, Peter Zijlstra <peterz@...radead.org> wrote:
> > >
> > > On Tue, Nov 09, 2021 at 05:09:06PM -0800, Eric Dumazet wrote:
> > > > From: Eric Dumazet <edumazet@...gle.com>
> > > >
> > > > With CONFIG_JUMP_LABEL=y, "struct static_key" content is only
> > > > used for the control path.
> > > >
> > > > Marking them __read_mostly is only needed when CONFIG_JUMP_LABEL=n.
> > > > Otherwise we place them out of the way to increase data locality.
> > > >
> > > > This patch adds __static_key to centralize this new policy.
> > > >
> > > > Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> > > > ---
> > > >  arch/x86/kvm/lapic.c       |  4 ++--
> > > >  arch/x86/kvm/x86.c         |  2 +-
> > > >  include/linux/jump_label.h | 25 +++++++++++++++++--------
> > > >  kernel/events/core.c       |  2 +-
> > > >  kernel/sched/fair.c        |  2 +-
> > > >  net/core/dev.c             |  8 ++++----
> > > >  net/netfilter/core.c       |  2 +-
> > > >  net/netfilter/x_tables.c   |  2 +-
> > > >  8 files changed, 28 insertions(+), 19 deletions(-)
> > > >
> > >
> > > Hurmph, it's a bit cumbersome to always have to add this __static_key
> > > attribute to every definition, and in fact you seem to have missed some.
> > >
> > > Would something like:
> > >
> > >         typedef struct static_key __static_key static_key_t;
> > >
> > > work? I forever seem to forget the exact things you can make a typedef
> > > do :/
> >
> > No, that doesn't work. Section placement is an attribute of the symbol
> > not of its type. So we'll need to macro'ify this.
>
> Yes, this is also why I chose a short __static_key (initially I was
> using something more descriptive but longer)
>
> >
> > But I'm not sure I understand why we need different policies here.
> > Static keys are inherently __read_mostly (unless they are not writable
> > to begin with), so keeping them all together in one place in the
> > binary should be sufficient, no?
>
> It is not optimal for CONFIG_JUMP_LABEL=n cases.
>
> For instance, networking will prefer having rps_needed / rfs_needed in
> the same cache lines as other hot read_mostly stuff,
> instead of being far away in other locations.
>
> ffffffff830e0f80 D dev_weight_tx_bias
> ffffffff830e0f84 D dev_rx_weight
> ffffffff830e0f88 D dev_tx_weight
> ffffffff830e0f8c D gro_normal_batch
> ffffffff830e0f90 D rps_sock_flow_table
> ffffffff830e0f98 D rps_cpu_mask
> ffffffff830e0f9c D rps_needed
> ffffffff830e0fa0 D rfs_needed
> ffffffff830e0fa4 D netdev_flow_limit_table_len
> ffffffff830e0fa8 d netif_napi_add.__print_once
> ffffffff830e0fac D netdev_unregister_timeout_secs
> ffffffff830e0fb0 D ptype_base
>
>
> When CONFIG_JUMP_LABEL=y, rps_needed/xps_needed being in a remote
> location is a win because it 'saves' 32 bytes that can be used better.

I understand that you want the keys out of the way for
CONFIG_JUMP_LABEL=y, but the question was why we shouldn't do that
unconditionally. If we put all the keys together in one section, they
will only share cache lines with each other.
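
To make the idea concrete, here is a rough userspace sketch, not the
kernel code and not the proposed patch. It assumes GCC or Clang and a
linker that emits __start_/__stop_ symbols for sections named like C
identifiers (GNU ld and lld do). struct demo_key, DEFINE_DEMO_KEY, the
"demo_keys" section name and the helper functions are all made up for
illustration. It shows that the section attribute has to sit on each
variable definition (hence the macro), and that keys defined this way
land together, apart from ordinary data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct static_key; not the kernel type. */
struct demo_key {
	int enabled;
};

/*
 * Section placement attaches to the variable, not to its type, so the
 * attribute has to be applied at each definition; a macro hides that.
 * "demo_keys" is an illustrative section name.
 */
#define DEFINE_DEMO_KEY(name) \
	struct demo_key name __attribute__((used, section("demo_keys"))) = { 0 }

DEFINE_DEMO_KEY(key_a);
DEFINE_DEMO_KEY(key_b);

/* An ordinary variable; it stays in .data, away from the keys. */
int hot_counter = 1;

/* Linker-provided bounds of the demo_keys section (assumes GNU ld/lld). */
extern struct demo_key __start_demo_keys[];
extern struct demo_key __stop_demo_keys[];

/* Return 1 if p points into the demo_keys section, 0 otherwise. */
int in_demo_keys(const void *p)
{
	uintptr_t a = (uintptr_t)p;

	return a >= (uintptr_t)__start_demo_keys &&
	       a < (uintptr_t)__stop_demo_keys;
}

/* Number of keys the linker collected into the section. */
size_t demo_key_count(void)
{
	uintptr_t start = (uintptr_t)__start_demo_keys;
	uintptr_t stop = (uintptr_t)__stop_demo_keys;

	assert(stop >= start);
	return (size_t)(stop - start) / sizeof(struct demo_key);
}
```

With something along these lines, in_demo_keys(&key_a) and
in_demo_keys(&key_b) should both hold while in_demo_keys(&hot_counter)
should not, so the keys can only share cache lines with each other,
whatever the config.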

Also, what is the performance impact of this change on a real-world use case?
