lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 31 Mar 2022 10:13:07 -0700
From:   Ira Weiny <ira.weiny@...el.com>
To:     Dave Hansen <dave.hansen@...ux.intel.com>,
        "H. Peter Anvin" <hpa@...or.com>,
        Dan Williams <dan.j.williams@...el.com>
Cc:     Fenghua Yu <fenghua.yu@...el.com>,
        Rick Edgecombe <rick.p.edgecombe@...el.com>,
        "Shankar, Ravi V" <ravi.v.shankar@...el.com>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH V9 00/45] PKS/PMEM: Add Stray Write Protection

On Thu, Mar 10, 2022 at 09:19:34AM -0800, Ira wrote:
> From: Ira Weiny <ira.weiny@...el.com>
> 
> 
> I'm looking for Intel acks on the series prior to submitting to maintainers.
> Most of the changes from V8 to V9 was in getting the tests straightened out.
> But there are some improvements in the actual code.

Is there any feedback on this?

Ira

> 
> 
> Changes for V9
> 
> Review and update all commit messages.
> Update cover letter below
> 
> PKS Core
> 	Separate user and supervisor pkey code in the headers
> 		create linux/pks.h for supervisor calls
> 		This facilitated making the pmem code more efficient 
> 	Completely rearchitect the test code
> 		[After Dave Hansen and Rick Edgecombe found issues in the test
> 			code it was easier to rearchitect the code completely
> 			rather than attempt to fix it.]
> 		Remove pks_test_callback in favor of using fault hooks
> 			Fault hooks also isolate the fault callbacks from being
> 			false positives if non-test consumers are running
> 		Make additional PKS_TEST_RUN_ALL Kconfig option which is
> 			mutually exclusive to any non-test PKS consumer
> 			PKS_TEST_RUN_ALL takes over all pkey callbacks
> 		Ensure that each test runs within it's own context and is
> 			mutually exclusive from running while any other test is
> 			running.
> 		Ensure test session and context memory is cleaned up on file
> 			close
> 		Use pr_debug() and dynamic debug for in kernel debug messages
> 		Enhance test_pks selftest
> 			Add the ability to run all tests not just the context
> 				switch test
> 			Standardize output [PASS][FAIL][SKIP]
> 			Add '-d' option enables dynamic debug to see the kernel
> 				debug messages
> 
> 	Incorporate feedback from Rick Edgecombe
> 		Update all pkey types to u8
> 		Fix up test code barriers
> 	Move patch declaring PKS_INIT_VALUE ahead of the patch which enables
> 		PKS so that PKS_INIT_VALUE can be used when pks_setup() is
> 		first created
> 	From Dan Williams
> 		Use macros instead of an enum for a pkey allocation scheme
> 			which is predicated on the config options of consumers
> 			This almost worked perfectly.  It required a bit of
> 			tweeking to be able to allocate all of the keys.
> 
> 	From Dave Hansen
> 		Reposition some code to be near/similar to user pkeys
> 			s/pks_write_current/x86_pkrs_load
> 			s/pks_saved_pkrs/pkrs
> 		Update Documentation
> 		s/PKR_{RW,AD,WD}_KEY/PKR_{RW,AD,WD}_MASK
> 		Consistently use lower case for pkey
> 		Update commit messages
> 		Add Acks
> 
> PMEM Stray Write
> 	Building on the change to the pks_mk_*() function rename
> 		s/pgmap_mk_*/pgmap_set_*/
> 		s/dax_mk_*/dax_set_*/
> 	From Dan Williams
> 		Avoid adding new dax operations by teaching dax_device about pgmap
> 		Remove pgmap_protection_flag_invalid() patch (Just let
> 			kmap'ings fail)
> 
> 
> PKS/PMEM Stray write protection
> ===============================
> 
> This series is broken into 2 parts.
> 
> 	1) Introduce Protection Key Supervisor (PKS), testing, and
> 	   documentation
> 	2) Use PKS to protect PMEM from stray writes
> 
> Introduce Protection Key Supervisor (PKS)
> -----------------------------------------
> 
> PKS enables protections on 'domains' of supervisor pages to limit supervisor
> mode access to pages beyond the normal paging protections.  PKS works in a
> similar fashion to user space pkeys, PKU.  As with PKU, supervisor pkeys are
> checked in addition to normal paging protections.  And page mappings are
> assigned to a domain by setting a 4 bit pkey in the PTE of that mapping.
> 
> Unlike PKU, permissions are changed via a MSR update.  This update avoids TLB
> flushes making this an efficient way to alter protections vs PTE updates.
> 
> Also, unlike PTE updates PKS permission changes apply only to the current
> processor.  Therefore changing permissions apply only to that thread and not
> any other cpu/process.  This allows protections to remain in place on other
> cpus for additional protection and isolation.
> 
> Even though PKS updates are thread local, XSAVE is not supported for the PKRS
> MSR.  Therefore this implementation saves and restores the MSR across context
> switches and during exceptions within software.  Nested exceptions are
> supported by each exception getting a new PKS state.
> 
> For consistent behavior with current paging protections, pkey 0 is reserved and
> configured to allow full access via the pkey mechanism, thus preserving the
> default paging protections because PTEs naturally have a pkey value of 0.
> 
> Other keys, (1-15) are statically allocated by kernel consumers when
> configured.  This is done by adding the appropriate PKS_NEW_KEY and
> PKS_DECLARE_INIT_VALUE macros to pks-keys.h.
> 
> Two PKS consumers, PKS_TEST and PMEM stray write protection, are included in
> this series.  When the number of users grows larger the sharing of keys will
> need to be resolved depending on the needs of the users at that time.  Many
> methods have been contemplated but the number of kernel users and use cases
> envisioned is still quite small, much less than the 15 available keys.
> 
> To summarize, the following are key attributes of PKS.
> 
> 	1) Fast switching of permissions
> 		1a) Prevents access without page table manipulations
> 		1b) No TLB flushes required
> 	2) Works on a per thread basis, thus allowing protections to be
> 	   preserved on threads which are not actively accessing data through
> 	   the mapping.
> 
> PKS is available with 4 and 5 level paging.  For this and simplicity of
> implementation, the feature is restricted to x86_64.
> 
> 
> Use PKS to protect PMEM from stray writes
> -----------------------------------------
> 
> DAX leverages the direct-map to enable 'struct page' services for PMEM.  Given
> that PMEM capacity may be an order of magnitude higher capacity than System RAM
> it presents a large vulnerability surface to stray writes.  Such a stray write
> becomes a silent data corruption bug.
> 
> Stray pointers to System RAM may result in a crash or other undesirable
> behavior which, while unfortunate, are usually recoverable with a reboot.
> Stray writes to PMEM are permanent in nature and thus are more likely to result
> in permanent user data loss.  Given that PMEM access from the kernel is limited
> to a constrained set of locations (PMEM driver, Filesystem-DAX, direct-I/O, and
> any properly kmap'ed page), it is amenable to PKS protection.
> 
> Set up an infrastructure for extra device access protection. Then implement the
> protection using the new Protection Keys Supervisor (PKS) on architectures
> which support it.
> 
> Because PMEM pages are all associated with a struct dev_pagemap and flags in
> struct page are valuable the flag of protecting memory can be stored in struct
> dev_pagemap.  All PMEM is protected by the same pkey.  So a single flag is all
> that is needed in each dev_pagemap to indicate protection.
> 
> General access in the kernel is supported by modifying the kmap infrastructure
> which can detect if a page is pks protected and enable access until the
> corresponding unmap is called.
> 
> Because PKS is a thread local mechanism and because kmap was never really
> intended to create a long term mapping, this implementation does not support
> the kmap()/kunmap() calls.  Calling kmap() on a PMEM protected page is allowed
> but accessing that mapping will cause a fault.
> 
> Originally this series modified many of the kmap call sites to indicate they
> were thread local.[1]  And an attempt to support kmap()[2] was made.  But now
> that kmap_local_page() has been developed[3] and in more wide spread use,
> kmap() can safely be left unsupported.
> 
> How the fault is handled is configurable via a new module parameter
> memremap.pks_fault_mode.  Two modes are supported.
> 
> 	'relaxed' (default) -- WARN_ONCE, disable the protection and allow
> 	                       access
> 
> 	'strict' -- prevent any unguarded access to a protected dev_pagemap
> 		    range
> 
> This 'safety valve' feature has already been useful in the development of this
> feature.
> 
> 
> [1] https://lore.kernel.org/lkml/20201009195033.3208459-1-ira.weiny@intel.com/
> 
> [2] https://lore.kernel.org/lkml/87mtycqcjf.fsf@nanos.tec.linutronix.de/
> 
> [3] https://lore.kernel.org/lkml/20210128061503.1496847-1-ira.weiny@intel.com/
>     https://lore.kernel.org/lkml/20210210062221.3023586-1-ira.weiny@intel.com/
>     https://lore.kernel.org/lkml/20210205170030.856723-1-ira.weiny@intel.com/
>     https://lore.kernel.org/lkml/20210217024826.3466046-1-ira.weiny@intel.com/
> 
> 
> ----------------------------------------------------------------------------
> Changes for V8
> 
> Feedback from Thomas
> 	* clean up noinstr mess
> 	* Fix static PKEY allocation mess
> 	* Ensure all functions are consistently named.
> 	* Split up patches to do 1 thing per patch
> 	* pkey_update_pkval() implementation
> 	* Streamline the use of pks_write_pkrs() by not disabling preemption
> 		- Leave this to the callers who require it.
> 		- Use documentation and lockdep to prevent errors
> 	* Clean up commit messages to explain in detail _why_ each patch is
> 		there.
> 
> Feedback from Dave H.
> 	* Leave out pks_mk_readonly() as it is not used by the PMEM use case
> 
> Feedback from Peter Anvin
> 	* Replace pks_abandon_pkey() with pks_update_exception()
> 		This is an even greater simplification in that it no longer
> 		attempts to shield users from faults.  As the main use case for
> 		abandoning a key was to allow a system to continue running even
> 		with an error.  This should be a rare event so the performance
> 		should not be an issue.
> 
> * Simplify ARCH_ENABLE_SUPERVISOR_PKEYS
> 
> * Update PKS Test code
> 	- Add default value test
> 	- Split up the test code into patches which follow each feature
> 	  addition
> 	- simplify test code processing
> 	- ensure consistent reporting of errors.
> 
> * Ensure all entry points to the PKS code are protected by
> 	cpu_feature_enabled(X86_FEATURE_PKS)
> 	- At the same time make sure non-entry points or sub-functions to the
> 	  PKS code are not _unnecessarily_ protected by the feature check
> 
> * Update documentation
> 	- Use kernel docs to place the docs with the code for easier internal
> 	  developer use
> 
> * Adjust the PMEM use cases for the core changes
> 
> * Split the PMEM patches up to be 1 change per patch and help clarify review
> 
> * Review all header files and remove those no longer needed
> 
> * Review/update/clarify all commit messages
> 
> Fenghua Yu (1):
> mm/pkeys: Define PKS page table macros
> 
> Ira Weiny (43):
> entry: Create an internal irqentry_exit_cond_resched() call
> Documentation/protection-keys: Clean up documentation for User Space
> pkeys
> x86/pkeys: Clarify PKRU_AD_KEY macro
> x86/pkeys: Make PKRU macros generic
> x86/fpu: Refactor arch_set_user_pkey_access()
> mm/pkeys: Add Kconfig options for PKS
> x86/pkeys: Add PKS CPU feature bit
> x86/fault: Adjust WARN_ON for pkey fault
> Documentation/pkeys: Add initial PKS documentation
> mm/pkeys: Provide for PKS key allocation
> x86/pkeys: Enable PKS on cpus which support it
> mm/pkeys: PKS testing, add initial test code
> x86/selftests: Add test_pks
> x86/pkeys: Introduce pks_write_pkrs()
> x86/pkeys: Preserve the PKS MSR on context switch
> mm/pkeys: Introduce pks_set_readwrite()
> mm/pkeys: Introduce pks_set_noaccess()
> mm/pkeys: PKS testing, add a fault call back
> mm/pkeys: PKS testing, add pks_set_*() tests
> mm/pkeys: PKS testing, test context switching
> x86/entry: Add auxiliary pt_regs space
> entry: Split up irqentry_exit_cond_resched()
> entry: Add calls for save/restore auxiliary pt_regs
> x86/entry: Define arch_{save|restore}_auxiliary_pt_regs()
> x86/pkeys: Preserve PKRS MSR across exceptions
> x86/fault: Print PKS MSR on fault
> mm/pkeys: PKS testing, Add exception test
> mm/pkeys: Introduce pks_update_exception()
> mm/pkeys: PKS testing, test pks_update_exception()
> mm/pkeys: PKS testing, add test for all keys
> mm/pkeys: Add pks_available()
> memremap_pages: Add Kconfig for DEVMAP_ACCESS_PROTECTION
> memremap_pages: Introduce pgmap_protection_available()
> memremap_pages: Introduce a PGMAP_PROTECTION flag
> memremap_pages: Introduce devmap_protected()
> memremap_pages: Reserve a PKS pkey for eventual use by PMEM
> memremap_pages: Set PKS pkey in PTEs if requested
> memremap_pages: Define pgmap_set_{readwrite|noaccess}() calls
> memremap_pages: Add memremap.pks_fault_mode
> kmap: Make kmap work for devmap protected pages
> dax: Stray access protection for dax_direct_access()
> nvdimm/pmem: Enable stray access protection
> devdax: Enable stray access protection
> 
> Rick Edgecombe (1):
> mm/pkeys: Introduce PKS fault callbacks
> 
> .../admin-guide/kernel-parameters.txt | 12 +
> Documentation/core-api/protection-keys.rst | 130 ++-
> arch/x86/Kconfig | 6 +
> arch/x86/entry/calling.h | 20 +
> arch/x86/entry/common.c | 2 +-
> arch/x86/entry/entry_64.S | 22 +
> arch/x86/entry/entry_64_compat.S | 6 +
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/include/asm/disabled-features.h | 8 +-
> arch/x86/include/asm/entry-common.h | 15 +
> arch/x86/include/asm/msr-index.h | 1 +
> arch/x86/include/asm/pgtable_types.h | 22 +
> arch/x86/include/asm/pkeys.h | 2 +
> arch/x86/include/asm/pkeys_common.h | 18 +
> arch/x86/include/asm/pkru.h | 20 +-
> arch/x86/include/asm/pks.h | 46 ++
> arch/x86/include/asm/processor.h | 15 +-
> arch/x86/include/asm/ptrace.h | 21 +
> arch/x86/include/uapi/asm/processor-flags.h | 2 +
> arch/x86/kernel/asm-offsets_64.c | 15 +
> arch/x86/kernel/cpu/common.c | 2 +
> arch/x86/kernel/dumpstack.c | 32 +-
> arch/x86/kernel/fpu/xstate.c | 22 +-
> arch/x86/kernel/head_64.S | 6 +
> arch/x86/kernel/process_64.c | 3 +
> arch/x86/mm/fault.c | 17 +-
> arch/x86/mm/pkeys.c | 320 +++++++-
> drivers/dax/device.c | 2 +
> drivers/dax/super.c | 59 ++
> drivers/md/dm-writecache.c | 8 +-
> drivers/nvdimm/pmem.c | 26 +
> fs/dax.c | 8 +
> fs/fuse/virtio_fs.c | 2 +
> include/linux/dax.h | 5 +
> include/linux/entry-common.h | 15 +-
> include/linux/highmem-internal.h | 4 +
> include/linux/memremap.h | 1 +
> include/linux/mm.h | 72 ++
> include/linux/pgtable.h | 4 +
> include/linux/pks-keys.h | 92 +++
> include/linux/pks.h | 73 ++
> include/linux/sched.h | 7 +
> include/uapi/asm-generic/mman-common.h | 1 +
> init/init_task.c | 3 +
> kernel/entry/common.c | 44 +-
> kernel/sched/core.c | 40 +-
> lib/Kconfig.debug | 33 +
> lib/Makefile | 3 +
> lib/pks/Makefile | 3 +
> lib/pks/pks_test.c | 755 ++++++++++++++++++
> mm/Kconfig | 32 +
> mm/memremap.c | 132 +++
> tools/testing/selftests/x86/Makefile | 2 +-
> tools/testing/selftests/x86/test_pks.c | 514 ++++++++++++
> 54 files changed, 2617 insertions(+), 109 deletions(-)
> create mode 100644 arch/x86/include/asm/pkeys_common.h
> create mode 100644 arch/x86/include/asm/pks.h
> create mode 100644 include/linux/pks-keys.h
> create mode 100644 include/linux/pks.h
> create mode 100644 lib/pks/Makefile
> create mode 100644 lib/pks/pks_test.c
> create mode 100644 tools/testing/selftests/x86/test_pks.c
> 
> --
> 2.35.1
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ