lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 3 Dec 2014 10:31:44 +0100
From:	Lars Persson <lars.persson@...s.com>
To:	Leonid Yegoshin <Leonid.Yegoshin@...tec.com>
CC:	"linux-mips@...ux-mips.org" <linux-mips@...ux-mips.org>,
	"james.hogan@...tec.com" <james.hogan@...tec.com>,
	"keescook@...omium.org" <keescook@...omium.org>,
	"paul.burton@...tec.com" <paul.burton@...tec.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"ralf@...ux-mips.org" <ralf@...ux-mips.org>,
	"manuel.lauss@...il.com" <manuel.lauss@...il.com>,
	"pbonzini@...hat.com" <pbonzini@...hat.com>,
	"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
	"blogic@...nwrt.org" <blogic@...nwrt.org>,
	"markos.chandras@...tec.com" <markos.chandras@...tec.com>
Subject: Re: [PATCH] Revert "MIPS: Remove race window in page fault handling"

Hi Leonid

First let me describe the mechanism of this race condition, which was a
fault in the kernel's MIPS architecture code. Specifically in its
implementation of lazy dcache flushing. AFAIK, it would only hit on
systems where the pagein code path writes to the page from the CPU.

The order of calls is:
flush_dcache_page() (from the FS's readpage)
set_pte_at()
update_mmu_cache()

The thread number one has executed the set_pte_at() when thread number
two hits the same page. It finds a valid PTE and proceeds to execute
code from a page that is not yet flushed to the point of I/D coherency.
That flush would happen in update_mmu_cache().

My patch does increase number of cache flushes for CoW yes and there
could be an optimization opportunity by playing tricks with the pte_t to
include information about executability of the mapping. 

Reverting the patch is a big no-no, then we go back to a state of
undefined CPU behavior.

- Lars


On Wed, 2014-12-03 at 04:25 +0100, Leonid Yegoshin wrote:
> This reverts commit 64f23ab30b1fec86b91a48bf1f088840e5299971
> (commit 2a4a8b1e5d9d343e13ff22e19af7b353f7b52d6f in Linux 'master')
> 
> Unfortunately the original commit effectively kills Imagination MIPS CPUs
> performance because it enforces page cache flush each time then PTE is changed
> for our CPUs. Basically - each CoW, each process start, each library attachment
> or detachment. And it happens even in "fully-coherent" systems (in MIPS sense -
> we have non coherent I-cache). Last tendency for increasing page size to 16K-64K
> even exaggerate the performance loss.
> 
> Original commit discussion says:
> 
>     Kernel call stacks showed one thread handling an illegal instruction
>     exception while another thread was somewhere around the
>     set_pte_at/update_mmu_cache calls for the same page.
> 
> If this is taken correct then it points to a major problem unrelated to MIPS -
> if by chance a first thread hits the page just before set_pte_at then it should
> get a non-present PTE: set_pte_at with _PAGE_PRESENT/_PAGE_VALID flags can be
> used only on "disactivated" PTE, after pte_clear* functions, effectively -
> non-present, non-valid. And application can get SIGSEGV. It is a major race
> condition and kernel should not offer it to applications. And in my
> understanding set_pte_at is used in cases then page is available after kernel
> finishes page handling, at least in theory.
> 
> Test note: I ran MIPS 1004K with 8 VPEs on 4 cores around 1.5 month on SOAK test
> and never seen that problem on 3.10 kernel.
> 
> Signed-off-by: Leonid Yegoshin <Leonid.Yegoshin@...tec.com>
>  
> 



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ