Message-ID: <CANn89iKAhZx3xRBO4gH-4SCZzUJoZy0HwkB8d5-zcA_uGQ4b1g@mail.gmail.com>
Date:   Wed, 10 Nov 2021 09:43:39 -0800
From:   Eric Dumazet <edumazet@...gle.com>
To:     Ard Biesheuvel <ardb@...nel.org>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Eric Dumazet <eric.dumazet@...il.com>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Jason Baron <jbaron@...mai.com>,
        "Steven Rostedt (VMware)" <rostedt@...dmis.org>
Subject: Re: [PATCH 2/2] jump_label: refine placement of static_keys

On Wed, Nov 10, 2021 at 9:06 AM Ard Biesheuvel <ardb@...nel.org> wrote:
>
> On Wed, 10 Nov 2021 at 16:22, Eric Dumazet <edumazet@...gle.com> wrote:
> >
> > On Wed, Nov 10, 2021 at 2:24 AM Ard Biesheuvel <ardb@...nel.org> wrote:
> > >
> > > On Wed, 10 Nov 2021 at 09:36, Peter Zijlstra <peterz@...radead.org> wrote:
> > > >
> > > > On Tue, Nov 09, 2021 at 05:09:06PM -0800, Eric Dumazet wrote:
> > > > > From: Eric Dumazet <edumazet@...gle.com>
> > > > >
> > > > > With CONFIG_JUMP_LABEL=y, "struct static_key" content is only
> > > > > used for the control path.
> > > > >
> > > > > Marking them __read_mostly is only needed when CONFIG_JUMP_LABEL=n.
> > > > > Otherwise we place them out of the way to increase data locality.
> > > > >
> > > > > This patch adds __static_key to centralize this new policy.
> > > > >
> > > > > Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> > > > > ---
> > > > >  arch/x86/kvm/lapic.c       |  4 ++--
> > > > >  arch/x86/kvm/x86.c         |  2 +-
> > > > >  include/linux/jump_label.h | 25 +++++++++++++++++--------
> > > > >  kernel/events/core.c       |  2 +-
> > > > >  kernel/sched/fair.c        |  2 +-
> > > > >  net/core/dev.c             |  8 ++++----
> > > > >  net/netfilter/core.c       |  2 +-
> > > > >  net/netfilter/x_tables.c   |  2 +-
> > > > >  8 files changed, 28 insertions(+), 19 deletions(-)
> > > > >
> > > >
> > > > Hurmph, it's a bit cumbersome to always have to add this __static_key
> > > > attribute to every definition, and in fact you seem to have missed some.
> > > >
> > > > Would something like:
> > > >
> > > >         typedef struct static_key __static_key static_key_t;
> > > >
> > > > work? I forever seem to forget the exact things you can make a typedef
> > > > do :/
> > >
> > > No, that doesn't work. Section placement is an attribute of the symbol
> > > not of its type. So we'll need to macro'ify this.
> >
> > Yes, this is also why I chose a short __static_key (initially I was
> > using something more descriptive but longer)
> >
> > >
> > > But I'm not sure I understand why we need different policies here.
> > > Static keys are inherently __read_mostly (unless they are not writable
> > > to begin with), so keeping them all together in one place in the
> > > binary should be sufficient, no?
> >
> > It is not optimal for CONFIG_JUMP_LABEL=n cases.
> >
> > For instance, networking will prefer having rps_needed / rfs_needed in
> > the same cache lines as other hot read_mostly stuff,
> > instead of being far away in other locations.
> >
> > ffffffff830e0f80 D dev_weight_tx_bias
> > ffffffff830e0f84 D dev_rx_weight
> > ffffffff830e0f88 D dev_tx_weight
> > ffffffff830e0f8c D gro_normal_batch
> > ffffffff830e0f90 D rps_sock_flow_table
> > ffffffff830e0f98 D rps_cpu_mask
> > ffffffff830e0f9c D rps_needed
> > ffffffff830e0fa0 D rfs_needed
> > ffffffff830e0fa4 D netdev_flow_limit_table_len
> > ffffffff830e0fa8 d netif_napi_add.__print_once
> > ffffffff830e0fac D netdev_unregister_timeout_secs
> > ffffffff830e0fb0 D ptype_base
> >
> >
> > When CONFIG_JUMP_LABEL=y, rps_needed/xps_needed being in a remote
> > location is a win because it 'saves' 32 bytes that can be used better
>
> I understand that you want the key out of the way for
> CONFIG_JUMP_LABEL=n, but the question was why we shouldn't do that
> unconditionally. If we put all the keys together in a section, they
> will only share cachelines with each other.
>
> Also, what is the performance impact on a real world use case of this change?

Yes, this matters for low latency stuff, mostly.

For CONFIG_JUMP_LABEL=n, I suggest we do not change the current layout,
there is no need to. I do not want to risk performance regressions for
no good reason.
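
To make the policy concrete, here is a minimal userspace sketch of the
kind of attribute macro being discussed. It is not the patch itself:
every DEMO_*/demo_* name and both section names are hypothetical, and it
assumes GCC or Clang on an ELF target.

```c
/* Userspace sketch only; all DEMO_* and demo_* names are hypothetical. */

/* Stand-in for the kernel's struct static_key. */
struct demo_static_key {
	int enabled;
};

#define DEMO_READ_MOSTLY_SECTION "demo_read_mostly"
#define DEMO_STATIC_KEY_SECTION  "demo_static_keys"

#ifdef DEMO_JUMP_LABEL
/* Keys are only touched on the control path: move them out of the way. */
#define DEMO_KEY_SECTION DEMO_STATIC_KEY_SECTION
#else
/* Keys are read on the fast path: leave them with hot read-mostly data. */
#define DEMO_KEY_SECTION DEMO_READ_MOSTLY_SECTION
#endif

/* Placement belongs to each definition, hence an attribute macro. */
#define DEMO_STATIC_KEY __attribute__((__section__(DEMO_KEY_SECTION)))

struct demo_static_key demo_rps_needed DEMO_STATIC_KEY = { 0 };
struct demo_static_key demo_rfs_needed DEMO_STATIC_KEY = { 0 };

/* Report which section the keys were assigned to in this build. */
const char *demo_key_section(void)
{
	return DEMO_KEY_SECTION;
}
```

Building with -DDEMO_JUMP_LABEL switches both keys to the separate
section; without it they stay next to the other read-mostly data, which
is the layout I would rather not disturb.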

Unless you have something in mind _requiring_ all these atomic_t to be
grouped together?
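
For reference, the cache-line argument behind the symbol listing quoted
above can be checked with a little arithmetic. This is only a sketch: it
assumes 64-byte cache lines, and the demo_* helpers and struct layouts
are hypothetical userspace stand-ins, not kernel code.

```c
#include <stdint.h>

/* Assumption: 64-byte cache lines, as is typical on x86-64. */
#define DEMO_CACHE_LINE 64u

/* Index of the cache line that contains addr. */
static inline uint64_t demo_cache_line(uint64_t addr)
{
	return addr / DEMO_CACHE_LINE;
}

/* Nonzero if both addresses fall in the same cache line. */
static inline int demo_same_line(uint64_t a, uint64_t b)
{
	return demo_cache_line(a) == demo_cache_line(b);
}

/* Assumed shape of a key without jump labels: just the counter. */
struct demo_key_jump_label_off {
	int enabled;
};

/* Assumed shape of a key with jump labels: counter plus a pointer. */
struct demo_key_jump_label_on {
	int enabled;
	void *entries;
};
```

With these helpers, dev_weight_tx_bias (ffffffff830e0f80) and ptype_base
(ffffffff830e0fb0) start in the same 64-byte line, so rps_needed and
rfs_needed sit among the hot networking variables in that listing. On an
LP64 target two of the larger keys would take 32 bytes, which is one
plausible reading of the figure mentioned earlier.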
