linux-kernel - Re: [PATCH 26/31] sched, numa, mm: Add fault driven placement and migration policy

Message-ID: <20121026135024.GA11640@gmail.com>
Date:	Fri, 26 Oct 2012 15:50:24 +0200
From:	Ingo Molnar <mingo@...nel.org>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Rik van Riel <riel@...hat.com>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Mel Gorman <mgorman@...e.de>,
	Johannes Weiner <hannes@...xchg.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH 26/31] sched, numa, mm: Add fault driven placement and
 migration policy


* Ingo Molnar <mingo@...nel.org> wrote:

> [
>   task_numa_work() performance side note:
> 
>   We are also *very* close to being able to use down_read() instead
>   of down_write() in the sampling-unmap code in 
>   task_numa_work(), as it should be safe in theory to call 
>   change_protection(PROT_NONE) in parallel - but there's one 
>   regression that disagrees with this theory so we use 
>   down_write() at the moment.
> 
>   Maybe you could help us there: can you see a reason why the
>   change_prot_none()->change_protection() call in
>   task_numa_work() can not occur in parallel to a page fault in
>   another thread on another CPU? It should be safe - yet if we 
>   change it I can see occasional corruption of user-space state: 
>   segfaults and register corruption.
> ]

Oh, just found the reason:

the ptep_modify_prot_start()/modify()/commit() sequence is 
SMP-unsafe - it has to be done with mmap_sem write-locked.

It is safe against *hardware* updates to the PTE, but not safe 
against itself.
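
To illustrate what "not safe against itself" could look like, here is a
toy user-space sketch in C. It is not kernel code: the pte_t layout, the
flag values and the toy_* helpers are all hypothetical, and it assumes the
start step behaves like a fetch-and-clear of the PTE. Under that
assumption, a second updater that enters the sequence before the first
one commits reads a cleared entry and later commits a value with the page
frame bits gone. The interleaving is replayed sequentially so the outcome
is deterministic.

```c
#include <assert.h>
#include <stdint.h>

/* Toy PTE: low bits are protection flags, the rest is the page frame.
 * Layout and flag values are hypothetical, chosen for illustration. */
typedef uint64_t pte_t;
#define PTE_PRESENT  0x1ULL
#define PTE_PROTNONE 0x2ULL

/* Assumed model of the start step: fetch the PTE and leave it cleared. */
static pte_t toy_modify_prot_start(pte_t *ptep)
{
    pte_t old = *ptep;
    *ptep = 0;
    return old;
}

/* The "modify" step: drop the present bit, mark the entry PROT_NONE. */
static pte_t toy_make_prot_none(pte_t pte)
{
    return (pte & ~PTE_PRESENT) | PTE_PROTNONE;
}

/* The commit step: write the new value back. */
static void toy_modify_prot_commit(pte_t *ptep, pte_t pte)
{
    *ptep = pte;
}

/* Two updaters run one after the other, as an exclusive lock would force. */
static pte_t serialized_update(pte_t initial)
{
    pte_t pte = initial;
    for (int i = 0; i < 2; i++) {
        pte_t v = toy_modify_prot_start(&pte);
        toy_modify_prot_commit(&pte, toy_make_prot_none(v));
    }
    return pte;
}

/* Updater B enters the sequence before updater A has committed. */
static pte_t interleaved_update(pte_t initial)
{
    pte_t pte = initial;
    assert(initial != 0);
    pte_t a = toy_modify_prot_start(&pte); /* A sees the real PTE */
    pte_t b = toy_modify_prot_start(&pte); /* B sees the cleared PTE: 0 */
    toy_modify_prot_commit(&pte, toy_make_prot_none(a));
    toy_modify_prot_commit(&pte, toy_make_prot_none(b)); /* frame lost */
    return pte;
}
```

In the serialized case the frame bits survive both passes; in the
interleaved case the final entry carries only the PROT_NONE flag.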

This is apparently a hidden cost of paravirt: it forces that 
weird sequence and thus the down_write() ...

Thanks,

	Ingo