linux-kernel - Re: [PATCHv3 1/1] Documentation: describe how to add a system call

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Fri, 31 Jul 2015 08:15:26 -0700
From:	Josh Triplett <josh@...htriplett.org>
To:	David Drysdale <drysdale@...gle.com>
Cc:	linux-api@...r.kernel.org,
	Michael Kerrisk <mtk.manpages@...il.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Arnd Bergmann <arnd@...db.de>,
	Shuah Khan <shuahkh@....samsung.com>,
	Jonathan Corbet <corbet@....net>,
	Eric B Munson <emunson@...mai.com>,
	Randy Dunlap <rdunlap@...radead.org>,
	Cyril Hrubis <chrubis@...e.cz>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>, Oleg Nesterov <oleg@...hat.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Andy Lutomirski <luto@...capital.net>,
	Al Viro <viro@...iv.linux.org.uk>,
	Rusty Russell <rusty@...tcorp.com.au>,
	Peter Zijlstra <peterz@...radead.org>,
	Vivek Goyal <vgoyal@...hat.com>,
	Alexei Starovoitov <ast@...mgrid.com>,
	David Herrmann <dh.herrmann@...il.com>,
	Theodore Ts'o <tytso@....edu>,
	Kees Cook <keescook@...omium.org>,
	Milosz Tanski <milosz@...in.com>, Fam Zheng <famz@...hat.com>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCHv3 1/1] Documentation: describe how to add a system call

On Fri, Jul 31, 2015 at 03:35:46PM +0100, David Drysdale wrote:
> Add a document describing the process of adding a new system call,
> including the need for a flags argument for future compatibility, and
> covering 32-bit/64-bit concerns (albeit in an x86-centric way).
> 
> Signed-off-by: David Drysdale <drysdale@...gle.com>
> Reviewed-by: Michael Kerrisk <mtk.manpages@...il.com>
> Reviewed-by: Eric B Munson <emunson@...mai.com>
> Reviewed-by: Kees Cook <keescook@...omium.org>
> Reviewed-by: Randy Dunlap <rdunlap@...radead.org>

Other than the change to the mention of capabilities, this now looks
ready to me.  With that item fixed:

Reviewed-by: Josh Triplett <josh@...htriplett.org>

>  Documentation/adding-syscalls.txt | 531 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 531 insertions(+)
>  create mode 100644 Documentation/adding-syscalls.txt
> 
> diff --git a/Documentation/adding-syscalls.txt b/Documentation/adding-syscalls.txt
> new file mode 100644
> index 000000000000..633691c5fb45
> --- /dev/null
> +++ b/Documentation/adding-syscalls.txt
> @@ -0,0 +1,531 @@
> +Adding a New System Call
> +========================
> +
> +This document describes what's involved in adding a new system call to the
> +Linux kernel, over and above the normal submission advice in
> +Documentation/SubmittingPatches.
> +
> +
> +System Call Alternatives
> +------------------------
> +
> +The first thing to consider when adding a new system call is whether one of
> +the alternatives might be suitable instead.  Although system calls are the
> +most traditional and most obvious interaction points between userspace and the
> +kernel, there are other possibilities -- choose what fits best for your
> +interface.
> +
> + - If the operations involved can be made to look like a filesystem-like
> +   object, it may make more sense to create a new filesystem or device.  This
> +   also makes it easier to encapsulate the new functionality in a kernel module
> +   rather than requiring it to be built into the main kernel.
> +     - If the new functionality involves operations where the kernel notifies
> +       userspace that something has happened, then returning a new file
> +       descriptor for the relevant object allows userspace to use
> +       poll/select/epoll to receive that notification.
> +     - However, operations that don't map to read(2)/write(2)-like operations
> +       have to be implemented as ioctl(2) requests, which can lead to a
> +       somewhat opaque API.
> + - If you're just exposing runtime system information, a new node in sysfs
> +   (see Documentation/filesystems/sysfs.txt) or the /proc filesystem may be
> +   more appropriate.  However, access to these mechanisms requires that the
> +   relevant filesystem is mounted, which might not always be the case (e.g.
> +   in a namespaced/sandboxed/chrooted environment).  Avoid adding any API to
> +   debugfs, as this is not considered a 'production' interface to userspace.
> + - If the operation is specific to a particular file or file descriptor, then
> +   an additional fcntl(2) command option may be more appropriate.  However,
> +   fcntl(2) is a multiplexing system call that hides a lot of complexity, so
> +   this option is best for when the new function is closely analogous to
> +   existing fcntl(2) functionality, or the new functionality is very simple
> +   (for example, getting/setting a simple flag related to a file descriptor).
> + - If the operation is specific to a particular task or process, then an
> +   additional prctl(2) command option may be more appropriate.  As with
> +   fcntl(2), this system call is a complicated multiplexor so is best reserved
> +   for near-analogs of existing prctl() commands or getting/setting a simple
> +   flag related to a process.
> +
> +
> +Designing the API: Planning for Extension
> +-----------------------------------------
> +
> +A new system call forms part of the API of the kernel, and has to be supported
> +indefinitely.  As such, it's a very good idea to explicitly discuss the
> +interface on the kernel mailing list, and it's important to plan for future
> +extensions of the interface.
> +
> +(The syscall table is littered with historical examples where this wasn't done,
> +together with the corresponding follow-up system calls -- eventfd/eventfd2,
> +dup2/dup3, inotify_init/inotify_init1,  pipe/pipe2, renameat/renameat2 -- so
> +learn from the history of the kernel and plan for extensions from the start.)
> +
> +For simpler system calls that only take a couple of arguments, the preferred way
> +to allow for future extensibility is to include a flags argument to the system
> +call.  To make sure that userspace programs can safely use flags between kernel
> +versions, check whether the flags value holds any unknown flags, and reject the
> +sycall (with EINVAL) if it does:
> +
> +    if (flags & ~(THING_FLAG1 | THING_FLAG2 | THING_FLAG3))
> +        return -EINVAL;
> +
> +(If no flags values are used yet, check that the flags argument is zero.)
> +
> +For more sophisticated system calls that involve a larger number of arguments,
> +it's preferred to encapsulate the majority of the arguments into a structure
> +that is passed in by pointer.  Such a structure can cope with future extension
> +by including a size argument in the structure:
> +
> +    struct xyzzy_params {
> +        u32 size; /* userspace sets p->size = sizeof(struct xyzzy_params) */
> +        u32 param_1;
> +        u64 param_2;
> +        u64 param_3;
> +    };
> +
> +As long as any subsequently added field, say param_4, is designed so that a zero
> +value gives the previous behaviour, then this allows both directions of version
> +mismatch:
> +
> + - To cope with a later userspace program calling an older kernel, the kernel
> +   code should check that any memory beyond the size of the structure that it
> +   expects is zero (effectively checking that param_4 == 0).
> + - To cope with an older userspace program calling a newer kernel, the kernel
> +   code can zero-extend a smaller instance of the structure (effectively setting
> +   param_4 = 0).
> +
> +See perf_event_open(2) and the perf_copy_attr() function (in
> +kernel/events/core.c) for an example of this approach.
> +
> +
> +Designing the API: Other Considerations
> +---------------------------------------
> +
> +If your new system call allows userspace to refer to a kernel object, it
> +should use a file descriptor as the handle for that object -- don't invent a
> +new type of userspace object handle when the kernel already has mechanisms and
> +well-defined semantics for using file descriptors.
> +
> +If your new xyzzy(2) system call does return a new file descriptor, then the
> +flags argument should include a value that is equivalent to setting O_CLOEXEC
> +on the new FD.  This makes it possible for userspace to close the timing
> +window between xyzzy() and calling fcntl(fd, F_SETFD, FD_CLOEXEC), where an
> +unexpected fork() and execve() in another thread could leak a descriptor to
> +the exec'ed program. (However, resist the temptation to re-use the actual value
> +of the O_CLOEXEC constant, as it is architecture-specific and is part of a
> +numbering space of O_* flags that is fairly full.)
> +
> +If your system call returns a new file descriptor, you should also consider
> +what it means to use the poll(2) family of system calls on that file
> +descriptor. Making a file descriptor ready for reading or writing is the
> +normal way for the kernel to indicate to userspace that an event has
> +occurred on the corresponding kernel object.
> +
> +If your new xyzzy(2) system call involves a filename argument:
> +
> +    int sys_xyzzy(const char __user *path, ..., unsigned int flags);
> +
> +you should also consider whether an xyzzyat(2) version is more appropriate:
> +
> +    int sys_xyzzyat(int dfd, const char __user *path, ..., unsigned int flags);
> +
> +This allows more flexibility for how userspace specifies the file in question;
> +in particular it allows userspace to request the functionality for an
> +already-opened file descriptor using the AT_EMPTY_PATH flag, effectively giving
> +an fxyzzy(3) operation for free:
> +
> + - xyzzyat(AT_FDCWD, path, ..., 0) is equivalent to xyzzy(path,...)
> + - xyzzyat(fd, "", ..., AT_EMPTY_PATH) is equivalent to fxyzzy(fd, ...)
> +
> +(For more details on the rationale of the *at() calls, see the openat(2) man
> +page; for an example of AT_EMPTY_PATH, see the statat(2) man page.)
> +
> +If your new xyzzy(2) system call involves a parameter describing an offset
> +within a file, make its type loff_t so that 64-bit offsets can be supported
> +even on 32-bit architectures.
> +
> +If your new xyzzy(2) system call involves privileged functionality, it
> +needs to be governed by the appropriate Linux capability bit (checked with a
> +call to capable()), as described in the capabilities(7) man page.
> +
> + - If there is an existing capability that governs related functionality, then
> +   use that.  However, avoid combining lots of only vaguely related functions
> +   together under the same bit, as this goes against capabilities' purpose of
> +   splitting the power of root.  In particular, avoid adding new uses of the
> +   already overly-general CAP_SYS_ADMIN capability.
> + - If there is no related capability, then consider adding a new capability
> +   bit -- but bear in mind that the numbering space is limited, and each new
> +   bit needs to be understood and administered by sysadmins.
> +
> +If your new xyzzy(2) system call manipulates a process other than the calling
> +process, it should be restricted (using a call to ptrace_may_access()) so that
> +only a calling process with the same permissions as the target process, or
> +with the necessary capabilities, can manipulate the target process.
> +
> +Finally, be aware that some non-x86 architectures have an easier time if
> +system call parameters that are explicitly 64-bit fall on odd-numbered
> +arguments (i.e. parameter 1, 3, 5), to allow use of contiguous pairs of 32-bit
> +registers.  (This concern does not apply if the arguments are part of a
> +structure that's passed in by pointer.)
> +
> +
> +Proposing the API
> +-----------------
> +
> +To make new system calls easy to review, it's best to divide up the patchset
> +into separate chunks.  These should include at least the following items as
> +distinct commits (each of which is described further below):
> +
> + - The core implementation of the system call, together with prototypes,
> +   generic numbering, Kconfig changes and fallback stub implementation.
> + - Wiring up of the new system call for one particular architecture, usually
> +   x86 (including all of x86_64, x86_32 and x32).
> + - A demonstration of the use of the new system call in userspace via a
> +   selftest in tools/testing/selftests/.
> + - A draft man-page for the new system call, either as plain text in the
> +   cover letter, or as a patch to the (separate) man-pages repository.
> +
> +New system call proposals, like any change to the kernel's API, should always
> +be cc'ed to linux-api@...r.kernel.org
> +
> +
> +Generic System Call Implementation
> +----------------------------------
> +
> +The main entry point for your new xyzzy(2) system call will be called
> +sys_xyzzy(), but you add this entry point with the appropriate
> +SYSCALL_DEFINEn() macro rather than explicitly.  The 'n' indicates the number
> +of arguments to the system call, and the macro takes the system call name
> +followed by the (type, name) pairs for the parameters as arguments.  Using
> +this macro allows metadata about the new system call to be made available for
> +other tools.
> +
> +The new entry point also needs a corresponding function prototype, in
> +include/linux/syscalls.h, marked as asmlinkage to match the way that system
> +calls are invoked:
> +
> +    asmlinkage long sys_xyzzy(...);
> +
> +Some architectures (e.g. x86) have their own architecture-specific syscall
> +tables, but several other architectures share a generic syscall table. Add your
> +new system call to the generic list by adding an entry to the list in
> +include/uapi/asm-generic/unistd.h:
> +
> +    #define __NR_xyzzy 292
> +    __SYSCALL(__NR_xyzzy, sys_xyzzy)
> +
> +Also update the __NR_syscalls count to reflect the additional system call, and
> +note that if multiple new system calls are added in the same merge window,
> +your new syscall number may get adjusted to resolve conflicts.
> +
> +The file kernel/sys_ni.c provides a fallback stub implementation of each system
> +call, returning -ENOSYS.  Add your new system call here too:
> +
> +    cond_syscall(sys_xyzzy);
> +
> +Your new kernel functionality, and the system call that controls it, should
> +normally be optional, so add a CONFIG option (typically to init/Kconfig) for
> +it. As usual for new CONFIG options:
> +
> + - Include a description of the new functionality and system call controlled
> +   by the option.
> + - Make the option depend on EXPERT if it should be hidden from normal users.
> + - Make any new source files implementing the function dependent on the CONFIG
> +   option in the Makefile (e.g. "obj-$(CONFIG_XYZZY_SYSCALL) += xyzzy.c").
> + - Double check that the kernel still builds with the new CONFIG option turned
> +   off.
> +
> +To summarize, you need a commit that includes:
> +
> + - CONFIG option for the new function, normally in init/Kconfig
> + - SYSCALL_DEFINEn(xyzzy, ...) for the entry point
> + - corresponding prototype in include/linux/syscalls.h
> + - generic table entry in include/uapi/asm-generic/unistd.h
> + - fallback stub in kernel/sys_ni.c
> +
> +
> +x86 System Call Implementation
> +------------------------------
> +
> +To wire up your new system call for x86 platforms, you need to update the
> +master syscall tables.  Assuming your new system call isn't special in some
> +way (see below), this involves a "common" entry (for x86_64 and x32) in
> +arch/x86/entry/syscalls/syscall_64.tbl:
> +
> +    333   common   xyzzy     sys_xyzzy
> +
> +and an "i386" entry in arch/x86/entry/syscalls/syscall_32.tbl:
> +
> +    380   i386     xyzzy     sys_xyzzy
> +
> +Again, these numbers are liable to be changed if there are conflicts in the
> +relevant merge window.
> +
> +
> +Compatibility System Calls (Generic)
> +------------------------------------
> +
> +For most system calls the same 64-bit implementation can be invoked even when
> +the userspace program is itself 32-bit; even if the system call's parameters
> +include an explicit pointer, this is handled transparently.
> +
> +However, there are a couple of situations where a compatibility layer is
> +needed to cope with size differences between 32-bit and 64-bit.
> +
> +The first is if the 64-bit kernel also supports 32-bit userspace programs, and
> +so needs to parse areas of (__user) memory that could hold either 32-bit or
> +64-bit values.  In particular, this is needed whenever a system call argument
> +is:
> +
> + - a pointer to a pointer
> + - a pointer to a struct containing a pointer (e.g. struct iovec __user *)
> + - a pointer to a varying sized integral type (time_t, off_t, long, ...)
> + - a pointer to a struct containing a varying sized integral type.
> +
> +The second situation that requires a compatibility layer is if one of the
> +system call's arguments has a type that is explicitly 64-bit even on a 32-bit
> +architecture, for example loff_t or __u64.  In this case, a value that arrives
> +at a 64-bit kernel from a 32-bit application will be split into two 32-bit
> +values, which then need to be re-assembled in the compatibility layer.
> +
> +(Note that a system call argument that's a pointer to an explicit 64-bit type
> +does *not* need a compatibility layer; for example, splice(2)'s arguments of
> +type loff_t __user * do not trigger the need for a compat_ system call.)
> +
> +The compatibility version of the system call is called compat_sys_xyzzy(), and
> +is added with the COMPAT_SYSCALL_DEFINEn() macro, analogously to
> +SYSCALL_DEFINEn.  This version of the implementation runs as part of a 64-bit
> +kernel, but expects to receive 32-bit parameter values and does whatever is
> +needed to deal with them.  (Typically, the compat_sys_ version converts the
> +values to 64-bit versions and either calls on to the sys_ version, or both of
> +them call a common inner implementation function.)
> +
> +The compat entry point also needs a corresponding function prototype, in
> +include/linux/compat.h, marked as asmlinkage to match the way that system
> +calls are invoked:
> +
> +    asmlinkage long compat_sys_xyzzy(...);
> +
> +If the system call involves a structure that is laid out differently on 32-bit
> +and 64-bit systems, say struct xyzzy_args, then the include/linux/compat.h
> +header file should also include a compat version of the structure (struct
> +compat_xyzzy_args) where each variable-size field has the appropriate compat_
> +type that corresponds to the type in struct xyzzy_args.  The
> +compat_sys_xyzzy() routine can then use this compat_ structure to parse the
> +arguments from a 32-bit invocation.
> +
> +For example, if there are fields:
> +
> +    struct xyzzy_args {
> +        const char __user *ptr;
> +        __kernel_long_t varying_val;
> +        u64 fixed_val;
> +        /* ... */
> +    };
> +
> +in struct xyzzy_args, then struct compat_xyzzy_args would have:
> +
> +    struct compat_xyzzy_args {
> +        compat_uptr_t ptr;
> +        compat_long_t varying_val;
> +        u64 fixed_val;
> +        /* ... */
> +    };
> +
> +The generic system call list also needs adjusting to allow for the compat
> +version; the entry in include/uapi/asm-generic/unistd.h should use
> +__SC_COMP rather than __SYSCALL:
> +
> +    #define __NR_xyzzy 292
> +    __SC_COMP(__NR_xyzzy, sys_xyzzy, compat_sys_xyzzy)
> +
> +To summarize, you need:
> +
> + - a COMPAT_SYSCALL_DEFINEn(xyzzy, ...) for the compat entry point
> + - corresponding prototype in include/linux/compat.h
> + - (if needed) 32-bit mapping struct in include/linux/compat.h
> + - instance of __SC_COMP not __SYSCALL in include/uapi/asm-generic/unistd.h
> +
> +
> +Compatibility System Calls (x86)
> +--------------------------------
> +
> +To wire up the x86 architecture of a system call with a compatibility version,
> +the entries in the syscall tables need to be adjusted.
> +
> +First, the entry in arch/x86/entry/syscalls/syscall_32.tbl gets an extra
> +column to indicate that a 32-bit userspace program running on a 64-bit kernel
> +should hit the compat entry point:
> +
> +    380   i386     xyzzy     sys_xyzzy    compat_sys_xyzzy
> +
> +Second, you need to figure out what should happen for the x32 ABI version of
> +the new system call.  There's a choice here: the layout of the arguments
> +should either match the 64-bit version or the 32-bit version.
> +
> +If there's a pointer-to-a-pointer involved, the decision is easy: x32 is
> +ILP32, so the layout should match the 32-bit version, and the entry in
> +arch/x86/entry/syscalls/syscall_64.tbl is split so that x32 programs hit the
> +compatibility wrapper:
> +
> +    333   64       xyzzy     sys_xyzzy
> +    ...
> +    555   x32      xyzzy     compat_sys_xyzzy
> +
> +If no pointers are involved, then it is preferable to re-use the 64-bit system
> +call for the x32 ABI (and consequently the entry in
> +arch/x86/entry/syscalls/syscall_64.tbl is unchanged).
> +
> +In either case, you should check that the types involved in your argument
> +layout do indeed map exactly from x32 (-mx32) to either the 32-bit (-m32) or
> +64-bit (-m64) equivalents.
> +
> +
> +System Calls Returning Elsewhere
> +--------------------------------
> +
> +For most system calls, once the system call is complete the user program
> +continues exactly where it left off -- at the next instruction, with the same
> +stack and registers as before the system call, and with the same virtual
> +memory space.
> +
> +However, a few system calls do things differently.  They might return to a
> +different location (rt_sigreturn) or change the memory space (fork/vfork/clone)
> +or even architecture (execve/execveat) of the program.
> +
> +To allow for this, the kernel implementation of the system call may need to
> +save and restore additional registers to the kernel stack, allowing complete
> +control of where and how execution continues after the system call.
> +
> +This is arch-specific, but typically involves defining assembly entry points
> +that save/restore additional registers and invoke the real system call entry
> +point.
> +
> +For x86_64, this is implemented as a stub_xyzzy entry point in
> +arch/x86/entry/entry_64.S, and the entry in the syscall table
> +(arch/x86/entry/syscalls/syscall_64.tbl) is adjusted to match:
> +
> +    333   common   xyzzy     stub_xyzzy
> +
> +The equivalent for 32-bit programs running on a 64-bit kernel is normally
> +called stub32_xyzzy and implemented in arch/x86/entry/entry_64_compat.S,
> +with the corresponding syscall table adjustment in
> +arch/x86/entry/syscalls/syscall_32.tbl:
> +
> +    380   i386     xyzzy     sys_xyzzy    stub32_xyzzy
> +
> +If the system call needs a compatibility layer (as in the previous section)
> +then the stub32_ version needs to call on to the compat_sys_ version of the
> +system call rather than the native 64-bit version.  Also, if the x32 ABI
> +implementation is not common with the x86_64 version, then its syscall
> +table will also need to invoke a stub that calls on to the compat_sys_
> +version.
> +
> +For completeness, it's also nice to set up a mapping so that user-mode Linux
> +still works -- its syscall table will reference stub_xyzzy, but the UML build
> +doesn't include arch/x86/entry/entry_64.S implementation (because UML
> +simulates registers etc).  Fixing this is as simple as adding a #define to
> +arch/x86/um/sys_call_table_64.c:
> +
> +    #define stub_xyzzy sys_xyzzy
> +
> +
> +Other Details
> +-------------
> +
> +Most of the kernel treats system calls in a generic way, but there is the
> +occasional exception that may need updating for your particular system call.
> +
> +The audit subsystem is one such special case; it includes (arch-specific)
> +functions that classify some special types of system call -- specifically
> +file open (open/openat), program execution (execve/exeveat) or socket
> +multiplexor (socketcall) operations. If your new system call is analogous to
> +one of these, then the audit system should be updated.
> +
> +More generally, if there is an existing system call that is analogous to your
> +new system call, it's worth doing a kernel-wide grep for the existing system
> +call to check there are no other special cases.
> +
> +
> +Testing
> +-------
> +
> +A new system call should obviously be tested; it is also useful to provide
> +reviewers with a demonstration of how user space programs will use the system
> +call.  A good way to combine these aims is to include a simple self-test
> +program in a new directory under tools/testing/selftests/.
> +
> +For a new system call, there will obviously be no libc wrapper function and so
> +the test will need to invoke it using syscall(); also, if the system call
> +involves a new userspace-visible structure, the corresponding header will need
> +to be installed to compile the test.
> +
> +Make sure the selftest runs successfully on all supported architectures.  For
> +example, check that it works when compiled as an x86_64 (-m64), x86_32 (-m32)
> +and x32 (-mx32) ABI program.
> +
> +For more extensive and thorough testing of new functionality, you should also
> +consider adding tests to the Linux Test Project, or to the xfstests project
> +for filesystem-related changes.
> + - https://linux-test-project.github.io/
> + - git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git
> +
> +
> +Man Page
> +--------
> +
> +All new system calls should come with a complete man page, ideally using groff
> +markup, but plain text will do.  If groff is used, it's helpful to include a
> +pre-rendered ASCII version of the man page in the cover email for the
> +patchset, for the convenience of reviewers.
> +
> +The man page should be cc'ed to linux-man@...r.kernel.org
> +For more details, see https://www.kernel.org/doc/man-pages/patches.html
> +
> +References and Sources
> +----------------------
> +
> + - LWN article from Michael Kerrisk on use of flags argument in system calls:
> +   https://lwn.net/Articles/585415/
> + - LWN article from Michael Kerrisk on how to handle unknown flags in a system
> +   call: https://lwn.net/Articles/588444/
> + - LWN article from Jake Edge describing constraints on 64-bit system call
> +   arguments: https://lwn.net/Articles/311630/
> + - Pair of LWN articles from David Drysdale that describe the system call
> +   implementation paths in detail for v3.14:
> +    - https://lwn.net/Articles/604287/
> +    - https://lwn.net/Articles/604515/
> + - Architecture-specific requirements for system calls are discussed in the
> +   syscall(2) man-page:
> +   http://man7.org/linux/man-pages/man2/syscall.2.html#NOTES
> + - Collated emails from Linus Torvalds discussing the problems with ioctl():
> +   http://yarchive.net/comp/linux/ioctl.html
> + - "How to not invent kernel interfaces", Arnd Bergmann,
> +   http://www.ukuug.org/events/linux2007/2007/papers/Bergmann.pdf
> + - LWN article from Michael Kerrisk on avoiding new uses of CAP_SYS_ADMIN:
> +   https://lwn.net/Articles/486306/
> + - Recommendation from Andrew Morton that all related information for a new
> +   system call should come in the same email thread:
> +   https://lkml.org/lkml/2014/7/24/641
> + - Recommendation from Michael Kerrisk that a new system call should come with
> +   a man page: https://lkml.org/lkml/2014/6/13/309
> + - Suggestion from Thomas Gleixner that x86 wire-up should be in a separate
> +   commit: https://lkml.org/lkml/2014/11/19/254
> + - Suggestion from Greg Kroah-Hartman that it's good for new system calls to
> +   come with a man-page & selftest: https://lkml.org/lkml/2014/3/19/710
> + - Discussion from Michael Kerrisk of new system call vs. prctl(2) extension:
> +   https://lkml.org/lkml/2014/6/3/411
> + - Suggestion from Ingo Molnar that system calls that involve multiple arguments
> +   should encapsulate those arguments in a struct, which includes a size field
> +   for future extensibility: https://lkml.org/lkml/2015/7/30/117
> + - Numbering oddities arising from (re-)use of O_* numbering space flags:
> +    - commit 75069f2b5bfb ("vfs: renumber FMODE_NONOTIFY and add to uniqueness
> +      check")
> +    - commit 12ed2e36c98a ("fanotify: FMODE_NONOTIFY and __O_SYNC in sparc
> +      conflict")
> +    - commit bb458c644a59 ("Safer ABI for O_TMPFILE")
> + - Discussion from Matthew Wilcox about restrictions on 64-bit arguments:
> +   https://lkml.org/lkml/2008/12/12/187
> + - Recommendation from Greg Kroah-Hartman that unknown flags should be
> +   policed: https://lkml.org/lkml/2014/7/17/577
> + - Recommendation from Linus Torvalds that x32 system calls should prefer
> +   compatibility with 64-bit versions rather than 32-bit versions:
> +   https://lkml.org/lkml/2011/8/31/244
> --
> 2.5.0.rc2.392.g76e840b
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/