lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 23 Sep 2008 16:49:32 +1000
From:	Benjamin Herrenschmidt <benh@...nel.crashing.org>
To:	Jeremy Fitzhardinge <jeremy@...p.org>
Cc:	Nick Piggin <npiggin@...e.de>,
	Linux Memory Management List <linux-mm@...ck.org>,
	Linux Kernel list <linux-kernel@...r.kernel.org>,
	Hugh Dickins <hugh@...itas.com>
Subject: Re: PTE access rules & abstraction


> A good first step might be to define some conventions.  For example,
> define that set_pte*() *always* means setting a non-valid pte to either
> a new non-valid state (like a swap reference) or to a valid state. 
> modify_pte() would modify the flags of a valid
> pte, giving a new valid pte.  etc...

Yup. Or make it clear that ptep_set_access_flags() should only be used
to -relax- access (ie, set dirty, writeable, accessed, ... but not
remove any of them).

> It may be that a given architecture collapses some or all of these down
> to the same underlying functionality, but it would allow the core intent
> to be clearly expressed.
> 
> What is the complete set of primitives we need?  I also noticed that a
> number of the existing pagetable operations are used only once or twice
> in the core code; I wonder if we really need such special cases, or
> whether we can make each arch pte operation carry a bit more weight?

Yes, that was some of my concern. It's getting close to having one API
per call site :-)

> Also, rather than leaving all the rule enforcing to documentation and a
> maintainer, we should also consider having a debug mode which adds
> enough paranoid checks to each operation so that any rule breakage will
> fail obviously on all architectures.

We could do both.

Now, regarding operations, let's first find the major call sites, see
what I miss. I'm omitting free_* in memory.c as those are for freeing
pte pages, not accessing PTEs themselves. I'm also ignoring read-only
call sites and hugetlb for now.

* None-iterative accessors

 - handle_pte_fault in memory.c, on "fixup" faults (pte is present and
it's not a COW), for fixing up DIRTY and ACCESSED (btw, could we make
that also fixup EXEC ? I would like this for some stuff I'm working on
at the moment, ie set it if the vma has VM_EXEC and it was lost from the
PTE as I might want to mask it out of PTEs under some circumstances).
Textbook usage of ptep_set_access_flags(), so that's fine.

 - do_wp_page() in memory.c for COW or fixup of shared writeable mapping
writeable-ness. Doesn't overwrite existing PTE for COW anymore, it uses
clear_flush nowadays and fixup of shared writeable mapping uses
ptep_set_access_flags() as it should, so that's all good.

 - insert_pfn() and insert_page() still in memory.c for fancy page
faults. Just a trivial set_pte_at() of a !present one, no big deal here

  - RMAP ones ? Some ad-hoc stuff due to _notify thingies.

* Iterative accessors (some don't batch, maybe they could/should).

 - zapping a mapping (zap_p*) in memory.c
 - fork (copy_p*) in memory.c could batch better maybe ?
 - setting linear user mappings (remap_p*) in memory.c, trivial
set_pte_at() on a range, pte's should be !present I think.
 - mprotect (change_p*) in memory.c, which has the problem I mentioned
 - moving page tables (move_p*), pretty trivial clear_flush + set_pte_at
 - clear_regs_pte_range via walk_page_range in fs/proc/task_mmu.c, does
a test_and_clear_young, flushes mm afterward, could use some lazy stuff
so we can batch properly on ppc64.
 - vmalloc, that's a bit special and kernel only, doesn't have nasty
races between creating/tearing down mappings vs. using them
 - highmem I leave alone for now, it's mostly trivial set_pte_at &
flushing for normal kmap but kmap_atomic can be nasty, though it's arch
specific.
 - some stuff in fremap I'm not too familiar with and I need to run...

What did I miss ?

Cheers,
Ben.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ