linux-kernel - Re: [PATCH v2 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock


Message-ID: <20190403160112.GK4038@hirez.programming.kicks-ass.net>
Date:   Wed, 3 Apr 2019 18:01:12 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Alex Kogan <alex.kogan@...cle.com>
Cc:     Waiman Long <longman@...hat.com>, linux@...linux.org.uk,
        mingo@...hat.com, will.deacon@....com, arnd@...db.de,
        linux-arch@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        linux-kernel@...r.kernel.org, tglx@...utronix.de, bp@...en8.de,
        hpa@...or.com, x86@...nel.org, steven.sistare@...cle.com,
        daniel.m.jordan@...cle.com, dave.dice@...cle.com,
        rahul.x.yadav@...cle.com
Subject: Re: [PATCH v2 3/5] locking/qspinlock: Introduce CNA into the slow
 path of qspinlock

On Wed, Apr 03, 2019 at 11:39:09AM -0400, Alex Kogan wrote:

> >> The patch that I am looking for is to have a separate
> >> numa_queued_spinlock_slowpath() that coexists with
> >> native_queued_spinlock_slowpath() and
> >> paravirt_queued_spinlock_slowpath(). At boot time, we select the most
> >> appropriate one for the system at hand.
> Is this how this selection works today for paravirt?
> I see a PARAVIRT_SPINLOCKS config option, but IIUC you are talking about a different mechanism here.
> Can you, please, elaborate or give me a link to a page that explains that?

Oh man, you ask us to explain how paravirt patching works... that's
magic :-)

Basically, the compiler will emit a bunch of indirect calls to the
various pv_ops.*.* functions.

Then, at alternative_instructions() <- apply_paravirt() it will rewrite
all these indirect calls to direct calls to the function pointers that
are in the pv_ops structure at that time (+- more magic).

So we initialize the pv_ops.lock.* methods to the normal
native_queued_spin*() stuff; if the KVM/Xen/whatever setup code detects
pv spinlock support, it changes the methods to the paravirt_queued_*()
stuff.

If you want more details, you'll just have to read
arch/x86/include/asm/paravirt*.h and arch/x86/kernel/paravirt*.c, I
don't think there's a coherent writeup of all that.
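
As a rough userspace illustration of the default-then-override part only
(not the call-site patching itself), something like the sketch below
captures the idea. Every name in it (demo_pv_ops, demo_hypervisor_setup,
the enum) is made up for illustration and is not the kernel's actual
definition:

```c
#include <stdbool.h>

/* Hypothetical tags so the example can report which variant ran. */
enum slowpath_kind { SLOWPATH_NATIVE, SLOWPATH_PARAVIRT };

/* Hypothetical stand-in for the lock part of pv_ops. */
struct lock_ops {
	enum slowpath_kind (*queued_spin_lock_slowpath)(void);
};

static enum slowpath_kind native_slowpath(void)
{
	return SLOWPATH_NATIVE;
}

static enum slowpath_kind paravirt_slowpath(void)
{
	return SLOWPATH_PARAVIRT;
}

/* The table starts out pointing at the native implementation. */
static struct lock_ops demo_pv_ops = {
	.queued_spin_lock_slowpath = native_slowpath,
};

/* Stand-in for hypervisor setup code that detects pv spinlock support. */
static void demo_hypervisor_setup(bool pv_spinlocks_supported)
{
	if (pv_spinlocks_supported)
		demo_pv_ops.queued_spin_lock_slowpath = paravirt_slowpath;
}

/*
 * Call site: an indirect call through the table. In the kernel this is
 * the kind of call that later gets rewritten into a direct call; this
 * sketch does not attempt that rewriting.
 */
static enum slowpath_kind demo_lock_slowpath(void)
{
	return demo_pv_ops.queued_spin_lock_slowpath();
}
```

Whatever pointer is in the table when the call sites get patched is the
one that ends up being called directly, which is why the ordering of the
setup code relative to the patching matters.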

> > Agreed; and until we have static_call, I think we can abuse the paravirt
> > stuff for this.
> > 
> > By the time we patch the paravirt stuff:
> > 
> >  check_bugs()
> >    alternative_instructions()
> >      apply_paravirt()
> > 
> > we should already have enumerated the NODE topology and so nr_node_ids
> > should be set.
> > 
> > So if we frob pv_ops.lock.queued_spin_lock_slowpath to
> > numa_queued_spin_lock_slowpath before that, it should all get patched
> > just right.
> > 
> > That of course means the whole NUMA_AWARE_SPINLOCKS thing depends on
> > PARAVIRT_SPINLOCKS, which is a bit awkward…

> Just to mention here, the patch so far does not address paravirt, but
> our goal is to add this support once we address all the concerns for
> the native version.  So we will end up with four variants for the
> queued_spinlock_slowpath() — one for each combination of
> native/paravirt and NUMA/non-NUMA.  Or perhaps we do not need a
> NUMA/paravirt variant?

I wouldn't bother with a pv version of the numa aware code at all. If
you have overcommitted guests, topology is likely irrelevant anyway. If
you have 1:1 pinned guests, they'll not use pv spinlocks anyway.

So keep it to a three-way choice:

 - native
 - native/numa
 - paravirt
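
The selection could then look roughly like the sketch below. It is only
an illustration: pick_slowpath() and the enum are hypothetical names,
and both "paravirt wins when pv spinlocks are active" and "more than one
node means use the NUMA variant" are assumptions about the policy, not
something the patch implements today:

```c
#include <stdbool.h>

/* Hypothetical labels for the three slowpath variants listed above. */
enum slowpath_variant {
	VARIANT_NATIVE,
	VARIANT_NATIVE_NUMA,
	VARIANT_PARAVIRT,
};

/*
 * Hypothetical boot-time policy:
 *  - pv spinlocks active: use the paravirt slowpath, since there is no
 *    NUMA-aware paravirt variant;
 *  - otherwise, more than one NUMA node: use the NUMA-aware slowpath;
 *  - otherwise: plain native.
 */
static enum slowpath_variant pick_slowpath(bool pv_spinlocks_active,
					   unsigned int nr_nodes)
{
	if (pv_spinlocks_active)
		return VARIANT_PARAVIRT;
	if (nr_nodes > 1)
		return VARIANT_NATIVE_NUMA;
	return VARIANT_NATIVE;
}
```

The result would be used to set the slowpath pointer before the call
sites are patched, so there is never a fourth (paravirt + NUMA) case to
handle.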