linux-kernel - Re: [PATCH] x86/mm: determine whether the fault address is canonical


Message-ID: <20191007143255.GA59713@gmail.com>
Date:   Mon, 7 Oct 2019 16:32:55 +0200
From:   Ingo Molnar <mingo@...nel.org>
To:     Sean Christopherson <sean.j.christopherson@...el.com>
Cc:     Dave Hansen <dave.hansen@...el.com>,
        Changbin Du <changbin.du@...il.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Andy Lutomirski <luto@...nel.org>, x86@...nel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] x86/mm: determine whether the fault address is canonical


* Sean Christopherson <sean.j.christopherson@...el.com> wrote:

> On Fri, Oct 04, 2019 at 07:39:08AM -0700, Dave Hansen wrote:
> > On 10/4/19 6:45 AM, Changbin Du wrote:
> > > +static inline bool is_canonical_addr(u64 addr)
> > > +{
> > > +#ifdef CONFIG_X86_64
> > > +	int shift = 64 - boot_cpu_data.x86_phys_bits;
> > 
> > I think you mean to check the virtual bits member, not "phys_bits".
> > 
> > BTW, I also prefer the IS_ENABLED(CONFIG_) checks to explicit #ifdefs.
> > Would one of those work in this case?
> > 
> > As for the error message:
> > 
> > >  {
> > > -	WARN_ONCE(trapnr == X86_TRAP_GP, "General protection fault in user access. Non-canonical address?");
> > > +	WARN_ONCE(trapnr == X86_TRAP_GP, "General protection fault at %s address in user access.",
> > > +		  is_canonical_addr(fault_addr) ? "canonical" : "non-canonical");
> > 
> > I've always read that as "the GP might have been caused by a
> > non-canonical access".  The main nit I'd have with the change is that I
> > don't think every #GP in a user access function that was given a
> > non-canonical address was *necessarily* caused by that address.
> > 
> > There are a billion ways you can get a #GP and I bet canonical
> > violations aren't the only way you can get one in a user copy function.
> 
> All the other reasons would require a fairly egregious kernel bug, hence
> the speculation that the #GP is due to a non-canonical address.  Something
> like the following would be more precise, though highly unlikely to ever
> be exercised, e.g. KVM had a fatal bug related to injecting a non-zero
> error code that went unnoticed for years.
> 
> 	WARN_ONCE(trapnr == X86_TRAP_GP, "General protection fault in user access. %s?\n",
> 		  (IS_ENABLED(CONFIG_X86_64) && !error_code) ? "Non-canonical address" :
> 		  					       "Segmentation bug");

Instead of trying to guess the reason for the #GP (a guess that might be 
wrong), please state a non-canonical address as the cause only when we 
are sure of it. Otherwise provide a best guess, but clearly signal that 
it's a guess.

I.e. if I understood all the cases correctly we'd have three types of 
messages generated:

 !error_code:
	"General protection fault in user access, due to non-canonical address."

 error_code && !is_canonical_addr(fault_addr):
	"General protection fault in user access. Non-canonical address?"

 error_code && is_canonical_addr(fault_addr):
	"General protection fault in user access. Segmentation bug?"

Only the first one is declarative, because we know we got a #GP with a 
zero error code which should denote a non-canonical address access.

The second and third ones are guesses with question marks to communicate 
the uncertainty.
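
The three cases above can be sketched as a small selector. This is only an 
illustration, not part of any posted patch: gp_user_access_msg() is a 
hypothetical helper, and its first branch relies on the still-unconfirmed 
assumption that a zero error code implies a non-canonical address.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not from the patch): choose the warning text for a
 * #GP taken in a user access function.
 *
 * Assumption: error_code == 0 means the fault was caused by a
 * non-canonical address, so only that message is declarative.
 */
const char *gp_user_access_msg(unsigned long error_code, bool addr_is_canonical)
{
	if (!error_code)
		return "General protection fault in user access, due to non-canonical address.";

	/* Non-zero error code: both remaining messages are guesses. */
	if (!addr_is_canonical)
		return "General protection fault in user access. Non-canonical address?";

	return "General protection fault in user access. Segmentation bug?";
}
```

In the kernel the selected string would be handed to WARN_ONCE(); it is 
returned here so that the selection logic can be checked in isolation.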

This assumes that !error_code always means a non-canonical access. Is 
that correct?

And hopefully "!error_code && is_canonical_addr(fault_addr)" is not 
possible?
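
For reference, a canonical check keyed on the number of virtual address 
bits (as Dave pointed out, not the physical bits) could look roughly like 
the sketch below. addr_is_canonical() and its vaddr_bits parameter are 
hypothetical names for a userspace illustration; in the kernel the width 
would come from the CPU data rather than from an argument.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace sketch: an address is canonical when bits 63
 * down to (vaddr_bits - 1) are all copies of the same bit, i.e. the
 * address is the sign extension of its low vaddr_bits bits.
 *
 * Assumption: 1 <= vaddr_bits <= 64 (48 with 4-level paging, 57 with
 * 5-level paging).
 */
bool addr_is_canonical(uint64_t addr, unsigned int vaddr_bits)
{
	/* Keep the top implemented bit and everything above it. */
	uint64_t upper = addr >> (vaddr_bits - 1);

	/* Canonical means those bits are all zeros or all ones. */
	return upper == 0 || upper == (UINT64_MAX >> (vaddr_bits - 1));
}
```

With vaddr_bits == 48, 0x00007fffffffffff and 0xffff800000000000 are 
canonical, while 0x0000800000000000 is not.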

Thanks,

	Ingo