lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date: Tue, 18 Jul 2023 14:03:02 +0200
From: Alejandro Colomar <alx@...nel.org>
To: Matthew House <mattlloydhouse@...il.com>
Cc: linux-man@...r.kernel.org, linux-api@...r.kernel.org,
 netdev@...r.kernel.org, Ulrich Drepper <drepper@...il.com>,
 Ulrich Drepper <drepper@...hat.com>, "David S. Miller"
 <davem@...emloft.net>, Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH v2] recv.2: Document MSG_CMSG_CLOEXEC as returned in
 msg_flags

Hi Matthew,

On 2023-07-18 08:00, Matthew House wrote:
> On Mon, Jul 17, 2023 at 7:10 PM Alejandro Colomar <alx@...nel.org> wrote:
>> Hi Matthew,
>>
>> I don't understand what's the purpose of this.  The kernel sets a bit
>> just to report to the caller that it set a bit?  No other purpose?
>> It feels very weird.  Of course, the caller already has that info,
>> doesn't it?
> 
> The main reason I posted this patch was because I was confused by the
> flag's presence in the msg_flags when I was looking at some strace logs, so
> I figured that it would be a good idea to document it.

Makes sense.

> As for the original
> purpose of the behavior, it's not really clear, and it may well have been
> an implementation artifact that got enshrined in the user space ABI. (Even
> io_uring is careful to replicate this behavior!)

This is what worries me.  I've CCd a bunch of people to see if they can
bring some light.

> 
> This behavior began when the MSG_CMSG_CLOEXEC flag was first added in Linux
> 2.6.23, with Ulrich Drepper's commit 4a19542e5f69 ("O_CLOEXEC for
> SCM_RIGHTS"). Per the commit message, the flag was designed to be
> "passe[d]... just like the existing MSG_CMSG_COMPAT flag". Since it was
> added to the msg_flags at the start of sys_recvmsg(), the
> scm_detach_fds[_compat]() functions in net/core/scm.c and net/compat.c
> could read the flag off of msg->msg_flags without having to thread the
> recvmsg() flags through.
> 
> This was indeed similar to the behavior of MSG_CMSG_COMPAT. That flag was
> added in Linux 2.5.65, with commit 3225fc8a85f4 ("[NET]: Simplify scm
> handling and sendmsg/recvmsg invocation, consolidate net compat
> syscalls."), in which put_cmsg() and scm_detach_fds() in net/core/scm.c
> read it off of msg->msg_flags. (It wouldn't actually be set in msg_flags
> until Linux 2.5.67, with commit 7e8d06bc1d90, "[COMPAT]: Fix
> MSG_CMSG_COMPAT flag passing, kill cmsg_compat_recvmsg_fixup." Both of
> these commits are from history/history.git.)
> 
> However, the MSG_CMSG_COMPAT flag has been scrubbed from the output
> msg_flags since Linux 2.6.14, with commit 37f7f421cce1 ("[NET]: Do not leak
> MSG_CMSG_COMPAT into userspace."). That's what I find so unclear:
> MSG_CMSG_CLOEXEC was added after the kernel started scrubbing
> MSG_CMSG_COMPAT from the output, but the new flag was never written to be
> similarly scrubbed.
> 
> Later, in Linux 3.10, with commits 1be374a0518a ("net: Block
> MSG_CMSG_COMPAT in send(m)msg and recv(m)msg") and a7526eb5d06b ("net:
> Unbreak compat_sys_{send,recv}msg"), MSG_CMSG_COMPAT was banned from being
> passed to the *msg() syscalls' flags from user space, with the rationale
> that they were "not intended to be part of the API". Then, in Linux 4.0, we
> reached the current status quo with commit d720d8cec563 ("net: compat:
> Ignore MSG_CMSG_COMPAT in compat_sys_{send, recv}msg"), where
> MSG_CMSG_COMPAT is allowed (and a no-op) in compat syscalls, but banned
> from non-compat syscalls.
> 
> So I agree that it's very weird that this flag gets returned to user space,
> even while the internal flag that it's modeled after doesn't. I suppose I
> could spin up a nice story, where the user-space function calling recvmsg()
> is totally separate from the function processing the returned struct
> msghdr, and the latter function would really like to know whether the fds
> in that message are close-on-exec without having to call fcntl(F_GETFD).
> But that's all just a total guess. If you want to know for sure, perhaps
> cc'ing Drepper may be worthwhile?
> 
> A cursory look hasn't shown me any existing user-space code that depends on
> this behavior. Though one library appears to be aware of this behavior,
> actively filtering MSG_CMSG_CLOEXEC out of the result flags:
> <https://github.com/dutchanddutch/node-socket-calls/blob/ca759a0da87cb112875d158f4a81b45b31f4a871/src/socket_calls.cc#L417>
> 
> Also, only somewhat relatedly, some libraries incorrectly attempt to
> request MSG_CMSG_CLOEXEC by passing it into the msg_flags field instead of
> the flags argument:
> <https://git.samba.org/samba.git/?p=samba.git;a=blob;f=lib/messaging/messages_dgm.c;hb=refs/tags/samba-4.17.9#l1272>
> <https://github.com/genodelabs/genode/blob/23.05/repos/base-linux/src/lib/base/ipc.cc#L132>
> <https://github.com/proxmox/pve-lxc-syscalld/blob/a14430f3e75c2b695332ad712164e599464177fc/src/io/seq_packet.rs#L123>

In any case, all of this mail has been very interesting,
and it would be useful to have it in the commit message of the patch.
Please send an v2 with it, and add 'Cc:' tags for all these people:

Cc: <linux-api@...r.kernel.org>
Cc: <netdev@...r.kernel.org>
Cc: Ulrich Drepper <drepper@...il.com>
Cc: Ulrich Drepper <drepper@...hat.com>
Cc: "David S. Miller" <davem@...emloft.net>
Cc: Andrew Morton <akpm@...ux-foundation.org>

(I don't know if <drepper@...hat.com> still works.  Does anyone know?)

Thanks!
Alex

> 
> Thank you,
> Matthew House

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5



Download attachment "OpenPGP_signature" of type "application/pgp-signature" (834 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ