lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.02.1208241412040.15568@kaball.uk.xensource.com>
Date:	Fri, 24 Aug 2012 14:32:38 +0100
From:	Stefano Stabellini <stefano.stabellini@...citrix.com>
To:	Thomas Gleixner <tglx@...utronix.de>
CC:	Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
	Borislav Petkov <bp@...en8.de>,
	Attilio Rao <attilio.rao@...rix.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"mingo@...hat.com" <mingo@...hat.com>,
	"hpa@...or.com" <hpa@...or.com>, "x86@...nel.org" <x86@...nel.org>,
	Stefano Stabellini <Stefano.Stabellini@...citrix.com>
Subject: Re: [PATCH v4 1/2] XEN/X86: Improve semantic support for
 x86_init.mapping.pagetable_reserve

On Fri, 24 Aug 2012, Thomas Gleixner wrote:
> I looked at the whole pgt_buf_* mess and it's amazingly stupid. We
> could avoid all that dance and make all of that pgt_buf_* stuff static
> and provide proper accessor functions and hand start, end, top to the
> reserve function instead of fiddling with global variables all over
> the place. That'd be a real cleanup and progress.
> 
> But we can't do that easily. And why? Because XEN is making magic
> decisions based on those globals in mask_rw_pte().
> 
> 	/*
> 	 * If the new pfn is within the range of the newly allocated
> 	 * kernel pagetable, and it isn't being mapped into an
> 	 * early_ioremap fixmap slot as a freshly allocated page, make sure
> 	 * it is RO.
> 	 */
> 	if (((!is_early_ioremap_ptep(ptep) &&
> 			pfn >= pgt_buf_start && pfn < pgt_buf_top)) ||
> 			(is_early_ioremap_ptep(ptep) && pfn != (pgt_buf_end - 1)))
> 
> This comment along with the implementation is really a master piece of
> obfuscation. Let's see what this is doing. RO is enforced when:
> 
> This is not an early ioreamp AND
> 
>       pfn >= pgt_buf_start && pfn < pgt_buf_top
> 
> So why is this checking pgt_buf_top? The early stuff is installed
> within pgt_buf_start and pgt_buf_end. Anything which is >=
> pgt_buf_end at this point is completely wrong.

Unfortunately pgt_buf_end only marks the current end of the pagetable
pages (pgt_buf_end keeps increasing during
kernel_physical_mapping_init).  However at some point
kernel_physical_mapping_init is going to start mapping the pagetable
pages themselves, when that happens some of them are not pagetable pages
yet (pgt_buf_end <= page < pgt_buf_top) but they are going to be in the
near future.
On Xen they all need to be mapped RO because otherwise as soon as they
are hooked into the pagetable the kernel goes boom.
In fact this problem only happens if the pagetable pages are allocated in high
memory, we started to see it after:

commit 4b239f458c229de044d6905c2b0f9fe16ed9e01e
Author: Yinghai Lu <yinghai@...nel.org>
Date:   Fri Dec 17 16:58:28 2010 -0800

    x86-64, mm: Put early page table high



> Now the second check is even more interesting:
> 
> If this is an early ioremap AND
> 
>       pfn != (pgt_buf_end -1 )
> 
> then it's forced RO as well.
> 
> So this checks whether the early ioremap is happening on the last
> allocated pfn from the pgt_buf range.
> 
> OMG, really great design! And the comment above that if() obfuscation
> is not really helping much.
> 
> If anything is missing a semantic documentation and analysis then
> definitely code like this which is just a cobbled together steaming
> pile of ....
 
I agree it is terrible and fragile, I would be happy to get rid of it.
However it is a difficult problem to solve and even if we had lengthy
discussions on the LKML we couldn't find any better solutions at the
time.
The comment is not much, but there quite a bit on context in the
commit message:

commit 279b706bf800b5967037f492dbe4fc5081ad5d0f
Author: Stefano Stabellini <stefano.stabellini@...citrix.com>
Date:   Thu Apr 14 15:49:41 2011 +0100

    x86,xen: introduce x86_init.mapping.pagetable_reserve

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ