lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230425135429.GQ3390869@ZenIV>
Date:   Tue, 25 Apr 2023 14:54:29 +0100
From:   Al Viro <viro@...iv.linux.org.uk>
To:     Christian Brauner <brauner@...nel.org>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [GIT PULL] pidfd updates

On Tue, Apr 25, 2023 at 02:34:15PM +0200, Christian Brauner wrote:

> It is rife with misunderstandings just looking at what we did in
> kernel/fork.c earlier:
> 
> 	retval = get_unused_fd_flags(O_RDWR | O_CLOEXEC);
>         [...]
>         pidfile = anon_inode_getfile("[pidfd]", &pidfd_fops, pid,
>                                      O_RDWR | O_CLOEXEC);
> 
> seeing where both get_unused_fd_flags() and both *_getfile() take flag
> arguments. Sure, for us this is pretty straightforward since we've seen
> that code a million times. For others it's confusing why there's two
> flag arguments. Sure, we could use one flags argument but it still be
> weird to look at.

First of all, get_unused_fd_flags() doesn't give a damn about O_RDWR and
anon_inode_getfile() - about O_CLOEXEC.  Duplicating the expression in
places like that is cargo-culting.
 
> But with this api we also force all users to remember that they need to
> cleanup the fd and the file - but definitely one - depending on where
> they fail.
> 
> But conceptually the fd and the file belong together. After all it's the
> file we're about to make that reserved fd refer to.

Why?  The common pattern is
	* reserve the descriptor
	* do some work that might fail
	* get struct file set up (which also might fail)
	* do more work (possibly including copyout, etc.)
	* install file into descriptor table

We want to reserve the descriptor early, since it's about the easiest
thing to undo.  Why bother doing it next to file setup?  That can be
pretty deep in call chain and doing it that way comes with headache
about passing the damn thing around.  And cleanup rules with your
variant end up being more complicated.

Keep the manipulations of descriptor table as close to the surface
as possible.  The real work is with file references; descriptors
belong in userland and passing them around kernel-side is asking
for headache.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ