lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAADnVQLE8HHU1GReeiou-S6PNJrK=E6NuZE-Pp68YTx6pDFnYg@mail.gmail.com>
Date: Thu, 16 Jan 2025 14:58:05 -0800
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: "Paul E. McKenney" <paulmck@...nel.org>
Cc: rcu@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>, 
	Kernel Team <kernel-team@...a.com>, Steven Rostedt <rostedt@...dmis.org>, 
	Alexei Starovoitov <ast@...nel.org>, Andrii Nakryiko <andrii@...nel.org>, 
	Peter Zijlstra <peterz@...radead.org>, Kent Overstreet <kent.overstreet@...ux.dev>, 
	bpf <bpf@...r.kernel.org>
Subject: Re: [PATCH rcu 13/17] srcu: Add SRCU-fast readers

On Thu, Jan 16, 2025 at 1:55 PM Paul E. McKenney <paulmck@...nel.org> wrote:
>
> On Thu, Jan 16, 2025 at 01:00:24PM -0800, Alexei Starovoitov wrote:
> > On Thu, Jan 16, 2025 at 12:21 PM Paul E. McKenney <paulmck@...nel.org> wrote:
> > >
> > > +/*
> > > + * Counts the new reader in the appropriate per-CPU element of the
> > > + * srcu_struct.  Returns a pointer that must be passed to the matching
> > > + * srcu_read_unlock_fast().
> > > + *
> > > + * Note that this_cpu_inc() is an RCU read-side critical section either
> > > + * because it disables interrupts, because it is a single instruction,
> > > + * or because it is a read-modify-write atomic operation, depending on
> > > + * the whims of the architecture.
> > > + */
> > > +static inline struct srcu_ctr __percpu *__srcu_read_lock_fast(struct srcu_struct *ssp)
> > > +{
> > > +       struct srcu_ctr __percpu *scp = READ_ONCE(ssp->srcu_ctrp);
> > > +
> > > +       RCU_LOCKDEP_WARN(!rcu_is_watching(), "RCU must be watching srcu_read_lock_fast().");
> > > +       this_cpu_inc(scp->srcu_locks.counter); /* Y */
> > > +       barrier(); /* Avoid leaking the critical section. */
> > > +       return scp;
> > > +}
> >
> > This doesn't look fast.
> > If I'm reading this correctly,
> > even with debugs off RCU_LOCKDEP_WARN() will still call
> > rcu_is_watching() and this doesn't look cheap or fast.
>
> Here is the CONFIG_PROVE_RCU=n definition:
>
>         #define RCU_LOCKDEP_WARN(c, s) do { } while (0 && (c))
>
> The "0" in the "0 && (c)" should prevent that call to rcu_is_watching().
>
> But why not see what the compiler thinks?  I added the following function
> to kernel/rcu/srcutree.c:
>
> struct srcu_ctr __percpu *test_srcu_read_lock_fast(struct srcu_struct *ssp)
> {
>         struct srcu_ctr __percpu *p;
>
>         p = srcu_read_lock_fast(ssp);
>         return p;
> }
>
> This function compiles to the following code:
>
> Dump of assembler code for function test_srcu_read_lock_fast:
>    0xffffffff811220c0 <+0>:     endbr64
>    0xffffffff811220c4 <+4>:     sub    $0x8,%rsp
>    0xffffffff811220c8 <+8>:     mov    0x8(%rdi),%rax
>    0xffffffff811220cc <+12>:    add    %gs:0x7eef3944(%rip),%rax        # 0x15a18 <this_cpu_off>
>    0xffffffff811220d4 <+20>:    mov    0x20(%rax),%eax
>    0xffffffff811220d7 <+23>:    test   $0x8,%al
>    0xffffffff811220d9 <+25>:    je     0xffffffff811220eb <test_srcu_read_lock_fast+43>
>    0xffffffff811220db <+27>:    mov    (%rdi),%rax
>    0xffffffff811220de <+30>:    incq   %gs:(%rax)
>    0xffffffff811220e2 <+34>:    add    $0x8,%rsp
>    0xffffffff811220e6 <+38>:    jmp    0xffffffff81f5fe60 <__x86_return_thunk>
>    0xffffffff811220eb <+43>:    mov    $0x8,%esi
>    0xffffffff811220f0 <+48>:    mov    %rdi,(%rsp)
>    0xffffffff811220f4 <+52>:    call   0xffffffff8111fb90 <__srcu_check_read_flavor>
>    0xffffffff811220f9 <+57>:    mov    (%rsp),%rdi
>    0xffffffff811220fd <+61>:    jmp    0xffffffff811220db <test_srcu_read_lock_fast+27>
>
> The first call to srcu_read_lock_fast() invokes __srcu_check_read_flavor(),
> but after that the "je" instruction will fall through.  So the common-case
> code path executes only the part of this function up to and including the
> "jmp 0xffffffff81f5fe60 <__x86_return_thunk>".
>
> Does that serve?

Thanks for checking.
I was worried that in case of:
while (0 && (c))
the compiler might need to still emit (c) since it cannot prove
that there are no side effects there.
I vaguely recall now that C standard allows to ignore 2nd expression
when the first expression satisfies the boolean condition.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ