Message-Id: <1402655819-14325-1-git-send-email-dh.herrmann@gmail.com>
Date: Fri, 13 Jun 2014 12:36:52 +0200
From: David Herrmann <dh.herrmann@...il.com>
To: linux-kernel@...r.kernel.org
Cc: Michael Kerrisk <mtk.manpages@...il.com>,
Ryan Lortie <desrt@...rt.ca>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
linux-fsdevel@...r.kernel.org, linux-api@...r.kernel.org,
Greg Kroah-Hartman <greg@...ah.com>, john.stultz@...aro.org,
Lennart Poettering <lennart@...ttering.net>,
Daniel Mack <zonque@...il.com>, Kay Sievers <kay@...y.org>,
Hugh Dickins <hughd@...gle.com>,
Tony Battersby <tonyb@...ernetics.com>,
Andy Lutomirski <luto@...capital.net>,
David Herrmann <dh.herrmann@...il.com>
Subject: [PATCH v3 0/7] File Sealing & memfd_create()
Hi
This is v3 of the File-Sealing and memfd_create() patches. You can find v1 with
a longer introduction at gmane:
http://thread.gmane.org/gmane.comp.video.dri.devel/102241
An LWN article about memfd+sealing is available, too:
https://lwn.net/Articles/593918/
v2 with some more discussions can be found here:
http://thread.gmane.org/gmane.linux.kernel.mm/115713
This series introduces two new APIs:
memfd_create(): Think of this syscall as malloc(), except that it returns a
file-descriptor instead of a pointer. That file-descriptor is
backed by anonymous memory and can be memory-mapped for access.
sealing: The sealing API can be used to prevent a specific set of operations
on a file-descriptor. You 'seal' the file and thereby guarantee
that it can no longer be modified in those specific ways.
A short high-level introduction is also available here:
http://dvdhrm.wordpress.com/2014/06/10/memfd_create2/
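To make the two APIs a bit more concrete, here is a rough user-space sketch of
how they fit together. It assumes the names as they appear in current kernel
and glibc headers (memfd_create(), MFD_ALLOW_SEALING, F_ADD_SEALS, F_GET_SEALS,
F_SEAL_*); details may differ from this revision of the series. The helper
make_sealed_memfd() is a made-up name for illustration only, not part of the
patches.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a memfd holding a copy of `data`, then seal
 * it against shrinking, growing and writing. Returns the fd, or -1 on error.
 */
int make_sealed_memfd(const char *name, const void *data, size_t size)
{
	int fd;
	void *p;

	assert(size > 0);

	/* Sealing has to be requested explicitly at creation time. */
	fd = memfd_create(name, MFD_CLOEXEC | MFD_ALLOW_SEALING);
	if (fd < 0)
		return -1;

	/* A new memfd is empty; give it a size before mapping it. */
	if (ftruncate(fd, (off_t)size) < 0)
		goto err;

	p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		goto err;
	memcpy(p, data, size);

	/* Drop the writable shared mapping before asking for F_SEAL_WRITE. */
	munmap(p, size);

	if (fcntl(fd, F_ADD_SEALS,
		  F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_WRITE) < 0)
		goto err;

	return fd;
err:
	close(fd);
	return -1;
}
```

A receiver that gets such an fd (for example over a unix socket) can check
fcntl(fd, F_GET_SEALS) and then map it read-only, knowing that the contents
and the size cannot change underneath it.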
Changed in v3:
- fcntl() now returns EINVAL if the FD does not support sealing. We used to
return EBADF like pipe_fcntl() does, but that is really weird and I don't
like repeating that.
- seals are now saved as "unsigned int" instead of "u32".
- i_mmap_writable is now an atomic so we can deny writable mappings just like
i_writecount does.
- SHMEM_ALLOW_SEALING is dropped. We initialize all objects with F_SEAL_SEAL
and only unset it for memfds that shall support sealing.
- memfd_create() no longer has a size argument. It was redundant; use
ftruncate() or fallocate() instead.
- memfd_create() flags are "unsigned int" now, instead of "u64".
- NAME_MAX off-by-one fix
- several cosmetic changes
- Added AIO/Direct-IO page-pinning protection
The last point is the most important change in this version: We now bail out if
any page-refcount is elevated while setting SEAL_WRITE. This prevents parallel
GUP users from writing to sealed files _after_ they were sealed. There is also a
new FUSE-based test-case to trigger such situations.
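For reference, two of the user-visible points from the changelog above can be
poked at from user-space roughly like this. This is only a sketch, assuming the
interface as exposed by current kernel and glibc headers: querying seals on an
fd that does not support sealing is expected to fail with EINVAL, and adding
F_SEAL_WRITE while a shared writable mapping exists is expected to be refused
(EBUSY in mainline). Both helper names are hypothetical. The page-pinning case
itself needs a GUP user such as the FUSE test and is not covered here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: errno of F_GET_SEALS on fd, or 0 if the query works. */
int seal_query_errno(int fd)
{
	return fcntl(fd, F_GET_SEALS) < 0 ? errno : 0;
}

/*
 * Hypothetical helper: try to add F_SEAL_WRITE while a shared writable
 * mapping of `size` bytes is still alive. Returns the errno of the refused
 * fcntl(), 0 if sealing unexpectedly succeeded, or -1 on setup failure.
 */
int seal_write_while_mapped(size_t size)
{
	int fd, err = -1;
	void *p;

	assert(size > 0);

	fd = memfd_create("mapped", MFD_CLOEXEC | MFD_ALLOW_SEALING);
	if (fd < 0)
		return -1;

	/* memfd_create() takes no size; set it explicitly. */
	if (ftruncate(fd, (off_t)size) < 0)
		goto out;

	p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		goto out;

	/* Record the result before munmap()/close() can touch errno. */
	err = fcntl(fd, F_ADD_SEALS, F_SEAL_WRITE) < 0 ? errno : 0;

	munmap(p, size);
out:
	close(fd);
	return err;
}
```

Calling seal_query_errno() on a pipe end and seal_write_while_mapped() with
one page is enough to see both error paths.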
The last 2 patches try to improve the page-pinning handling. I included both in
this series, but obviously only one of them is needed (or we could stack them):
- 6/7: This waits for up to 150ms for pages to be unpinned
- 7/7: This isolates pinned pages and replaces them with a fresh copy
Hugh, patch 6 is basically your code. In case that gets merged, can I put your
Signed-off-by on it?
I hope I didn't miss anything. Further comments welcome!
Thanks
David
David Herrmann (7):
mm: allow drivers to prevent new writable mappings
shm: add sealing API
shm: add memfd_create() syscall
selftests: add memfd_create() + sealing tests
selftests: add memfd/sealing page-pinning tests
shm: wait for pins to be released when sealing
shm: isolate pinned pages when sealing files
arch/x86/syscalls/syscall_32.tbl | 1 +
arch/x86/syscalls/syscall_64.tbl | 1 +
fs/fcntl.c | 5 +
fs/inode.c | 1 +
include/linux/fs.h | 29 +-
include/linux/shmem_fs.h | 17 +
include/linux/syscalls.h | 1 +
include/uapi/linux/fcntl.h | 15 +
include/uapi/linux/memfd.h | 8 +
kernel/fork.c | 2 +-
kernel/sys_ni.c | 1 +
mm/mmap.c | 24 +-
mm/shmem.c | 320 ++++++++-
mm/swap_state.c | 1 +
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/memfd/.gitignore | 4 +
tools/testing/selftests/memfd/Makefile | 40 ++
tools/testing/selftests/memfd/fuse_mnt.c | 110 +++
tools/testing/selftests/memfd/fuse_test.c | 311 +++++++++
tools/testing/selftests/memfd/memfd_test.c | 913 +++++++++++++++++++++++++
tools/testing/selftests/memfd/run_fuse_test.sh | 14 +
21 files changed, 1807 insertions(+), 12 deletions(-)
create mode 100644 include/uapi/linux/memfd.h
create mode 100644 tools/testing/selftests/memfd/.gitignore
create mode 100644 tools/testing/selftests/memfd/Makefile
create mode 100755 tools/testing/selftests/memfd/fuse_mnt.c
create mode 100644 tools/testing/selftests/memfd/fuse_test.c
create mode 100644 tools/testing/selftests/memfd/memfd_test.c
create mode 100755 tools/testing/selftests/memfd/run_fuse_test.sh
--
2.0.0