Date:	Mon, 6 Oct 2014 13:03:43 -0700
From:	Leonid Yegoshin <Leonid.Yegoshin@...tec.com>
To:	David Daney <ddaney.cavm@...il.com>
CC:	<linux-mips@...ux-mips.org>, <Zubair.Kakakhel@...tec.com>,
	<david.daney@...ium.com>, <peterz@...radead.org>,
	<paul.gortmaker@...driver.com>, <davidlohr@...com>,
	<macro@...ux-mips.org>, <chenhc@...ote.com>, <zajec5@...il.com>,
	<james.hogan@...tec.com>, <keescook@...omium.org>,
	<alex@...x-smith.me.uk>, <tglx@...utronix.de>,
	<blogic@...nwrt.org>, <jchandra@...adcom.com>,
	<paul.burton@...tec.com>, <qais.yousef@...tec.com>,
	<linux-kernel@...r.kernel.org>, <ralf@...ux-mips.org>,
	<markos.chandras@...tec.com>, <manuel.lauss@...il.com>,
	<akpm@...ux-foundation.org>, <lars.persson@...s.com>
Subject: Re: [PATCH 2/3] MIPS: Setup an instruction emulation in VDSO protected
 page instead of user stack

On 10/06/2014 11:05 AM, David Daney wrote:
> On 10/03/2014 08:17 PM, Leonid Yegoshin wrote:
>> Historically, during FPU emulation MIPS runs live BD-slot instruction 
>> in stack.
>> This is needed because it was the only way to correctly handle branch
>> exceptions with unknown COP2 instructions in BD-slot. Now there is
>> an eXecuteInhibit feature and it is desirable to protect stack from 
>> execution
>> for security reasons.
>> This patch moves FPU emulation from stack area to VDSO-located page 
>> which is set
>> write-protected for application access. VDSO page itself is now 
>> per-thread and
>> it's addresses and offsets are stored in thread_info.
>> Small stack of emulation blocks is supported because nested traps are 
>> possible
>> in MIPS32/64 R6 emulation mix with FPU emulation.
>>
>
> Can you explain how this per-thread mapping works.
>
> I am especially interested in what happens when a different thread 
> from the thread using the special mapping, issues flush_tlb_mm(), and 
> invalidates the TLBs on all CPUs.  How does the TLB entry for the 
> special mapping survive this?
>
>
This patch works as long as install_special_mapping() does not populate
the PTE itself but only installs a page fault handler. That is the only
hidden dependency on common Linux code.

MIPS code allocates a page (a copy of the standard 'VDSO' page), links it
to thread_info, and handles all allocation, deallocation and thread
creation via arch hooks. It does so only for threads which have a memory
map, not for kernel threads. Also, it does all of this only if the CPU has
RI/XI capability (the HW execute inhibit feature); otherwise it works as
it does today.

It still attaches the standard 'VDSO' page to the memory map for
accounting purposes, so /proc/.../maps shows a [VDSO] page. However, the
new (per-thread) page is actually a shadow.

When a TLB refill happens it loads an empty PTE, and the subsequent TLBL
(TLB load page fault) exception comes to MIPS C code, which recognizes the
'VDSO' address, asks install_vdso_tlb() to fill the TLB directly, and
marks the ASID used for it in the memory map for this CPU.

At process (read: thread) reschedule there is a check, done by comparing
ASIDs, whether some previous thread of the same memory map has loaded this
TLB entry on this CPU. If that happened and the ASIDs are the same, then
local_flush_tlb_page() is called to eliminate the TLB entry, because it
has the same ASID but may point to a different per-thread page.

Because the PTE stays 0x00..00 and never changes, this activity starts
again after the TLB entry is evicted for any reason (flush_tlb_mm(), some
other flush, or eviction due to HW or SW replacement in the TLB array),
but only if the page is demanded again.
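
To make the sequence above concrete, here is a toy user-space model of it.
It is only a sketch and not code from the patch: every toy_* name, the
single TLB slot per CPU, the VA constant and the use of ASID 0 as a
"nothing loaded" marker are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NR_CPUS  8            /* assumption: small fixed CPU count */
#define TOY_VDSO_VA  0x7fff0000u  /* hypothetical user VA of the 'VDSO' page */
#define TOY_NO_ASID  0            /* hypothetical "nothing loaded" marker */

struct toy_mm {
    uint8_t vdso_asid[TOY_NR_CPUS]; /* ASID used for the VDSO entry, per CPU */
};

struct toy_thread {
    struct toy_mm *mm;  /* NULL for kernel threads */
    int vdso_page;      /* id of this thread's private page copy */
};

struct toy_tlb_entry {
    bool valid;
    uint8_t asid;
    int page;
};

/* One VDSO TLB slot per CPU is enough for the model. */
static struct toy_tlb_entry toy_tlb[TOY_NR_CPUS];

/* TLBL fault path: the PTE is empty, so fill the TLB directly. */
static bool toy_tlbl_fault(int cpu, const struct toy_thread *t,
                           uint32_t vaddr, uint8_t asid)
{
    assert(cpu >= 0 && cpu < TOY_NR_CPUS);
    if (vaddr != TOY_VDSO_VA || t->mm == NULL)
        return false;              /* not ours: normal fault handling */
    toy_tlb[cpu].valid = true;
    toy_tlb[cpu].asid = asid;
    toy_tlb[cpu].page = t->vdso_page;
    t->mm->vdso_asid[cpu] = asid;  /* remember that it was loaded here */
    return true;
}

/* Reschedule path: drop an entry a sibling thread may have loaded. */
static void toy_switch_to(int cpu, const struct toy_thread *next, uint8_t asid)
{
    assert(cpu >= 0 && cpu < TOY_NR_CPUS);
    if (next->mm == NULL || next->mm->vdso_asid[cpu] != asid)
        return;
    toy_tlb[cpu].valid = false;    /* stands in for local_flush_tlb_page() */
    next->mm->vdso_asid[cpu] = TOY_NO_ASID;
}

/* Any other eviction, e.g. flush_tlb_mm() issued by another thread. */
static void toy_flush_all(void)
{
    for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
        toy_tlb[cpu].valid = false;
}
```

In this model nothing has to "survive" a flush: after toy_flush_all() the
next access to TOY_VDSO_VA simply faults again and toy_tlbl_fault()
reinstalls the entry for whichever thread is running.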

Now, the emulation part: a small stack of emulation blocks can be used
from the top of the page. Each time an FPU instruction from a BD-slot is
emulated, it takes the kernel VA of the page, writes the block into the
stack through it, and changes the thread's EPC to the user VA of that
block. In case of cache aliasing it flushes the caches via different
addresses (D-cache via the kernel VA and I-cache via the user VA), and a
new function is needed to avoid the huge performance loss of
flush_cache_page(). In the absence of cache aliasing it uses the regular
flush_cache_sigtramp(), because on some systems that can be much faster
(via SYNCI).
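
As a rough illustration of the block stack, here is a user-space sketch.
It is not the patch code: the toy_* names, the block size and the nesting
limit are all hypothetical, and the cache maintenance is only marked by a
comment because it cannot be reproduced outside the kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE   4096u
#define TOY_BLOCK_SIZE  16u  /* hypothetical size of one emulation block */
#define TOY_MAX_DEPTH   4u   /* hypothetical nesting limit */

struct toy_emupage {
    uint8_t  *kernel_va;  /* kernel alias of the page, used for writing */
    uintptr_t user_va;    /* user mapping, write-protected for the app */
    unsigned  depth;      /* number of blocks currently in use */
};

/* Push one block; returns the user VA to put in EPC, or 0 if full. */
static uintptr_t toy_emu_push(struct toy_emupage *p, uint32_t insn)
{
    if (p->depth >= TOY_MAX_DEPTH)
        return 0;
    p->depth++;
    /* Blocks grow downwards from the top of the page. */
    size_t off = TOY_PAGE_SIZE - p->depth * TOY_BLOCK_SIZE;
    memcpy(p->kernel_va + off, &insn, sizeof(insn));
    /*
     * Real code would now write back the D-cache via the kernel VA and
     * invalidate the I-cache via the user VA (or use SYNCI if possible).
     */
    return p->user_va + off;
}

/* Pop the most recent block once its emulation has completed. */
static void toy_emu_pop(struct toy_emupage *p)
{
    assert(p->depth > 0);
    p->depth--;
}
```

The write goes through kernel_va while the returned address is based on
user_va, which mirrors the split between the two mappings described above.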

The stack of emulation blocks is needed because I work on a MIPS32/64 R6
architecture kernel and there is a need to emulate some removed MIPS R2
instructions. Reentry into emulation may happen in some rare cases,
because the FPU emulation and MIPS R2 emulation subsystems are separate
pieces.


Note: After Peter Zijlstra's note about performance I am thinking about
adding a check for the situation when the same single thread is
rescheduled on the same CPU, and not flushing the TLB in that case. It
just requires yet another array of process IDs or 'VDSO' pages, one
element per CPU, and I am weighing that against the schedule time
interval. Today the array is at most 8 elements for MIPS but that can
change in the future. There is also the possibility of writing a special
TLB flush function which compares the TLB entry's address with the page
address and skips the eviction if the addresses match.
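
The first idea could look roughly like the sketch below. This is only an
illustration of the per-CPU array, not something that exists in the patch:
toy_last_vdso_page, toy_need_vdso_flush() and TOY_NR_CPUS are hypothetical
names, and the check is deliberately conservative (any other thread
running in between forces a flush).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_CPUS 8  /* assumption: one array element per CPU */

/* Per-thread 'VDSO' page of the last thread that ran on each CPU. */
static const void *toy_last_vdso_page[TOY_NR_CPUS];

/* Returns true if the VDSO TLB entry has to be flushed on this CPU. */
static bool toy_need_vdso_flush(int cpu, const void *next_page)
{
    assert(cpu >= 0 && cpu < TOY_NR_CPUS);
    bool same = (toy_last_vdso_page[cpu] == next_page);
    toy_last_vdso_page[cpu] = next_page;
    return !same;  /* same page as last time: the entry is still correct */
}
```

The cost is one pointer compare and one store per reschedule, which is the
part that has to be weighed against the flush it saves.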

- Leonid.

