linux-kernel - Re: [PATCH v9 09/17] x86/split_lock: Handle #AC exception for split lock

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20190626203637.GC245468@romley-ivt3.sc.intel.com>
Date:   Wed, 26 Jun 2019 13:36:37 -0700
From:   Fenghua Yu <fenghua.yu@...el.com>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        H Peter Anvin <hpa@...or.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Dave Hansen <dave.hansen@...el.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Radim Krcmar <rkrcmar@...hat.com>,
        Christopherson Sean J <sean.j.christopherson@...el.com>,
        Ashok Raj <ashok.raj@...el.com>,
        Tony Luck <tony.luck@...el.com>,
        Dan Williams <dan.j.williams@...el.com>,
        Xiaoyao Li <xiaoyao.li@...el.com>,
        Sai Praneeth Prakhya <sai.praneeth.prakhya@...el.com>,
        Ravi V Shankar <ravi.v.shankar@...el.com>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        x86 <x86@...nel.org>, kvm@...r.kernel.org
Subject: Re: [PATCH v9 09/17] x86/split_lock: Handle #AC exception for split
 lock

On Wed, Jun 26, 2019 at 10:20:05PM +0200, Thomas Gleixner wrote:
> On Tue, 18 Jun 2019, Fenghua Yu wrote:
> > +
> > +static atomic_t split_lock_debug;
> > +
> > +void split_lock_disable(void)
> > +{
> > +	/* Disable split lock detection on this CPU */
> > +	this_cpu_and(msr_test_ctl_cached, ~MSR_TEST_CTL_SPLIT_LOCK_DETECT);
> > +	wrmsrl(MSR_TEST_CTL, this_cpu_read(msr_test_ctl_cached));
> > +
> > +	/*
> > +	 * Use the atomic variable split_lock_debug to ensure only the
> > +	 * first CPU hitting split lock issue prints one single complete
> > +	 * warning. This also solves the race if the split-lock #AC fault
> > +	 * is re-triggered by NMI of perf context interrupting one
> > +	 * split-lock warning execution while the original WARN_ONCE() is
> > +	 * executing.
> > +	 */
> > +	if (atomic_cmpxchg(&split_lock_debug, 0, 1) == 0) {
> > +		WARN_ONCE(1, "split lock operation detected\n");
> > +		atomic_set(&split_lock_debug, 0);
> 
> What's the purpose of this atomic_set()?

atomic_set() releases the split_lock_debug flag after WARN_ONCE() is done.
The same split_lock_debug flag will be used in sysfs write for atomic
operation as well, as proposed by Ingo in https://lkml.org/lkml/2019/4/25/48
So that's why the flag needs to be cleared, right?

> 
> > +dotraplinkage void do_alignment_check(struct pt_regs *regs, long error_code)
> > +{
> > +	unsigned int trapnr = X86_TRAP_AC;
> > +	char str[] = "alignment check";
> > +	int signr = SIGBUS;
> > +
> > +	RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
> > +
> > +	if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, signr) == NOTIFY_STOP)
> > +		return;
> > +
> > +	cond_local_irq_enable(regs);
> > +	if (!user_mode(regs) && static_cpu_has(X86_FEATURE_SPLIT_LOCK_DETECT)) {
> > +		/*
> > +		 * Only split locks can generate #AC from kernel mode.
> > +		 *
> > +		 * The split-lock detection feature is a one-shot
> > +		 * debugging facility, so we disable it immediately and
> > +		 * print a warning.
> > +		 *
> > +		 * This also solves the instruction restart problem: we
> > +		 * return the faulting instruction right after this it
> 
> we return the faulting instruction ... to the store so we get our deposit
> back :)
> 
>   the fault handler returns to the faulting instruction which will be then
>   executed without ....
> 
> Don't try to impersonate code, cpus or whatever. It doesn't make sense and
> confuses people.
> 
> > +		 * will be executed without generating another #AC fault
> > +		 * and getting into an infinite loop, instead it will
> > +		 * continue without side effects to the interrupted
> > +		 * execution context.
> 
> That last part 'instead .....' is redundant. It's entirely clear from the
> above that the faulting instruction is reexecuted ....
> 
> Please write concise comments and do try to repeat the same information
> with a different painting.

I copied the comment completely from Ingo's comment on v8:
https://lkml.org/lkml/2019/4/25/40

Thanks.

-Fenghua