Message-ID: <CALPaoCgc13hS64mSWvn6zYQWwVKzcyF8xueWsaP62ZZJiv+nog@mail.gmail.com>
Date: Tue, 5 Nov 2024 12:25:09 +0100
From: Peter Newman <peternewman@...gle.com>
To: Fenghua Yu <fenghua.yu@...el.com>
Cc: Reinette Chatre <reinette.chatre@...el.com>, Thomas Gleixner <tglx@...utronix.de>, 
	Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>, 
	Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org, 
	"H . Peter Anvin" <hpa@...or.com>, Tony Luck <tony.luck@...el.com>, Babu Moger <babu.moger@....com>, 
	James Morse <james.morse@....com>, Martin Kletzander <nert.pinx@...il.com>, 
	Shaopeng Tan <tan.shaopeng@...itsu.com>, linux-kernel@...r.kernel.org, eranian@...gle.com
Subject: Re: [PATCH 2/2] x86/resctrl: Don't workqueue local event counter reads

Hi Fenghua,

On Mon, Nov 4, 2024 at 11:36 PM Fenghua Yu <fenghua.yu@...el.com> wrote:
>
> Hi, Peter,
>
> On 10/31/24 07:25, Peter Newman wrote:

> > diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> > index 200d89a640270..daaff1cfd3f24 100644
> > --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> > +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> > @@ -541,6 +541,31 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
> >               return;
> >       }
> >
> > +     /*
> > +      * If a performance-conscious caller has gone to the trouble of binding
> > +      * their thread to the monitoring domain of the event counter, ensure
> > +      * that the counters are read directly. smp_call_on_cpu()
> > +      * unconditionally uses a work queue to read the counter, substantially
> > +      * increasing the cost of the read.
> > +      *
> > +      * Preemption must be disabled to prevent a migration out of the domain
> > +      * after the CPU is checked, which would result in reading the wrong
> > +      * counters. Note that this makes the (slow) remote path a little slower
> > +      * by requiring preemption to be reenabled when redirecting the request
> > +      * to another domain was in fact necessary.
> > +      *
> > +      * In the case where all eligible target CPUs are nohz_full and
> > +      * smp_call_function_any() is used, keep preemption disabled to avoid
> > +      * the cost of reenabling it twice in the same read.
> > +      */
> > +     cpu = get_cpu();
> > +     if (cpumask_test_cpu(cpu, cpumask)) {
> > +             mon_event_count(rr);
> > +             resctrl_arch_mon_ctx_free(r, evtid, rr->arch_mon_ctx);
> > +             put_cpu();
> > +             return;
> > +     }
>
> This fast path code duplicates part of smp_call_function_any().
>
> In the nohz_full() case, the fast path doesn't gain much and even hurts
> remote domain performance:
> 1. On the local domain, it may gain a little because it executes a few
> lines less than calling smp_call_function_any() directly. But the gain
> comes from a few lines less code, not from avoiding heavyweight queued work.
>
> 2. On a remote domain, it degrades performance because get_cpu() and
> put_cpu() are both called twice: once in the fast path code and once in
> smp_call_function_any(). As you mentioned earlier, put_cpu() impacts
> performance. I think get_cpu() has the same impact too.

get_cpu() and put_cpu() nest, so only the put_cpu() that reduces the
preempt count to 0 will call into the scheduler. See the source
comment I had added below.
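
To illustrate the nesting (a simplified sketch, not the patch itself;
the preempt_count values are only indicative):

	cpu = get_cpu();	/* preempt_count 0 -> 1 */
	...
	cpu = get_cpu();	/* preempt_count 1 -> 2 (nested, cheap) */
	...
	put_cpu();		/* preempt_count 2 -> 1, no reschedule */
	...
	put_cpu();		/* preempt_count 1 -> 0, may reschedule */

Only the outermost put_cpu() can end up calling into the scheduler.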

But... note that below, smp_call_on_cpu() is now called with preemption
disabled. (Looks like I only benchmarked and never ran a debug
build...) I will have to change the patch to make sure put_cpu() is
called before smp_call_on_cpu().
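
Roughly like this (untested sketch, just to show the intended ordering;
not the final patch):

	if (tick_nohz_full_cpu(cpu)) {
		/* Keep preemption disabled across the IPI-based read. */
		smp_call_function_any(cpumask, mon_event_count, rr, 1);
		put_cpu();
	} else {
		/* smp_call_on_cpu() can sleep, so reenable preemption first. */
		put_cpu();
		smp_call_on_cpu(cpu, smp_mon_event_count, rr, false);
	}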


>
> The fast path only gains in the non-nohz_full() case.
>
> So maybe it's better to move the fast path code into the non-nohz_full()
> case? With this change, you would get the following benefits:
>
> 1. No performance impact in the nohz_full() case (either local or remote
> domain).
> 2. Improved performance in the non-nohz_full() case, as you intended in
> this patch.
> 3. The fast path focuses on fixing the right performance bottleneck.

The consequence of reusing the current-cpu-in-mask check in
smp_call_function_any() is that, if the check fails, it invokes
resctrl_arch_rmid_read() in an IPI handler, where it can fail when it
would have succeeded if invoked on a kernel worker, undoing James's
original work.
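
To illustrate the failure mode with a hypothetical read callback (not
the actual resctrl_arch_rmid_read() implementation):

	static int hypothetical_rmid_read(void *info)
	{
		/*
		 * A read that may need to block (e.g. waiting for hardware
		 * to finish a counter capture) cannot do so from the IPI
		 * path used by smp_call_function_any():
		 */
		if (in_hardirq())
			return -EIO;	/* fails here ... */

		might_sleep();		/* ... but is fine on the kworker
					 * reached via smp_call_on_cpu() */
		/* ... read the counter ... */
		return 0;
	}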

-Peter


>
> > +
> >       cpu = cpumask_any_housekeeping(cpumask, RESCTRL_PICK_ANY_CPU);
> >
> >       /*
> > @@ -554,6 +579,9 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
> >       else
> >               smp_call_on_cpu(cpu, smp_mon_event_count, rr, false);
> >
> > +     /* If smp_call_function_any() was used, preemption is reenabled here. */
> > +     put_cpu();
> > +
> >       resctrl_arch_mon_ctx_free(r, evtid, rr->arch_mon_ctx);
> >   }
> >
>
> Thanks.
>
> -Fenghua
