[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGXJAmw6XpNoAt=tTPACsJVjPD+i9wwnouifk0ym5vDb-xf6MQ@mail.gmail.com>
Date: Fri, 20 Dec 2024 15:42:53 -0800
From: John Ousterhout <ouster@...stanford.edu>
To: Arnd Bergmann <arnd@...db.de>
Cc: Jakub Kicinski <kuba@...nel.org>, Netdev <netdev@...r.kernel.org>,
Paolo Abeni <pabeni@...hat.com>, Eric Dumazet <edumazet@...gle.com>, Simon Horman <horms@...nel.org>
Subject: Re: [PATCH net-next v4 01/12] inet: homa: define user-visible API for Homa
On Fri, Dec 20, 2024 at 1:13 PM Arnd Bergmann <arnd@...db.de> wrote:
>
> On Fri, Dec 20, 2024, at 20:31, Jakub Kicinski wrote:
> > On Fri, 20 Dec 2024 09:59:53 -0800 John Ousterhout wrote:
> >> > > I see that "void *" is used in the declaration for struct msghdr
> >> > > (along with some other pointer types as well) and struct msghdr is
> >> > > part of several uAPI interfaces, no?
> >> >
> >> > Off the top off my head this use is a source of major pain, grep around
> >> > for compat_msghdr.
> >>
> >> How should I go about confirming that this __aligned_u64 is indeed the
> >> expected convention (sounds like you aren't certain)?
> >
> > Let me add Arnd Bergmann to the CC list, he will correct me if
> > I'm wrong. Otherwise you can trust my intuition :)
>
> You are right that for the purposes of the user API, structures
> should use __u64 or __aligned_u64 in place of pointers, there are
> some more details on this in Documentation/driver-api/ioctl.rst.
I have now changed the type from void * to __u64.
> What worries me more in this particular case is the way that
> this pointer is passed through setsockopt(), which really doesn't
> take any pointers in other protocols.
>
> I have not fully understood what is behind the pointer, but
> it looks like this gets stored in the kernel in a per-socket
> structure that is annotated as a kernel pointer, not a user
> pointer, which may cause additional problems.
It is actually a user pointer; I had forgotten the __user annotation.
I have fixed this now as well.
> I don't know if the same pointer ever points to a kernel
> structure, but if it does, that needs to be fixed first.
It doesn't.
> Assuming this is actually meant as a persistent __user
> pointer, I'm still unsure what this means if the socket is
> available to more than one process, e.g. through a fork()
> or explicit file descriptor passing, or if the original
> process dies while there is still a transfer in progress.
> I realize that there is a lot of information already out
> there that I haven't all read, so this is probably explained
> somewhere, but it would be nice to point to that documentation
> somewhere near the code to clarify the corner cases.
I hadn't considered this, but the buffering mechanism prevents the
same socket from being shared across processes. I'm okay with that:
I'm not sure that sharing between processes adds much value for Homa,
and the performance benefit from the buffer mechanism is quite large.
I will document this. Is there a way to prevent a socket from being
shared across processes (e.g. can I set close-on-exec from within the
kernel?) I don't think there is any risk to kernel integrity if the
socket does end up shared; the worst that will happen is that the
memory of one of the processes will get trashed because Homa will
write to memory that isn't actually buffer space in that process.
> That probably also explains what type of memory the
> __user buffer can point to, but I would like to make
> sure that this has well-defined behavior e.g. if that
> buffer is an mmap()ed file on NFS that was itself
> mounted over a homa socket. Is there any guarantee that
> this is either prohibited or is free of deadlocks and
> recursion?
Given the API incompatibilities between Homa and TCP, I don't think it
is possible to have NFS mounted over a Homa socket. But you raise the
issue of whether some kinds of addresses might not be suitable for
Homa's buffer use this way. I don't know enough about the various
possible kinds of memory to know what kinds of problems could occur.
My assumption is that the buffer area will be a simple mmap()ed
region. The only use Homa makes of the buffer address is to call
import_ubuf with addresses in the buffer region, followed by
skb_copy_datagram_iter with the resulting iov_iter.
Is there some way I can check the "kind" of memory behind the buffer
pointer, so Homa could reject anything other than the simple case?
However, even if Homa checks when it sets up the buffer region, the
application could always rearrange its memory later, so I'm not sure
this would be effective. Also, if there is some sort of weird memory
that would cause problems for Homa, couldn't an application pass this
same memory into a call to recvmsg on a normal socket? Given that
Homa's use of the address is similar to what already happens in other
sockets, I don't think that the fact that Homa keeps the address
around in the kernel for a long time should cause additional problems,
no?
-John-
Powered by blists - more mailing lists