Message-ID: <CAGXJAmzN4_y2fzHdm+tVncbSso4ZMYOV5WmE+A9sUm6by=rhwQ@mail.gmail.com>
Date: Wed, 30 Oct 2024 13:13:12 -0700
From: John Ousterhout <ouster@...stanford.edu>
To: akpm@...ux-foundation.org
Cc: Andrew Lunn <andrew@...n.ch>, netdev@...r.kernel.org
Subject: Re: [PATCH net-next 04/12] net: homa: create homa_pool.h and homa_pool.c

Hi Andrew,

Andrew Lunn suggested that I write to you in the hopes that you could
identify someone to review this patch series from the standpoint of
memory management:
https://patchwork.kernel.org/project/netdevbpf/list/?series=903993&state=*

The patch series introduces a new transport protocol, Homa, into the
kernel. Homa uses an unusual approach for managing receive buffers for
incoming messages. The traditional approach, where the application
provides a receive buffer in the recvmsg kernel call, results in poor
performance for Homa because it prevents Homa from overlapping the
copying of data to user space with receiving data over the network.
Instead, a Homa application mmaps a large region of memory and passes
its virtual address range to Homa. Homa associates the memory with a
particular socket, retains it for the life of the socket, and
allocates buffer space for incoming messages out of this region. The
recvmsg kernel call returns the location of the buffer(s), and can
also be used to return buffers back to Homa once the application has
finished processing messages. The code for managing this buffer space
is in the files homa_pool.c and homa_pool.h.
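
To make the flow concrete, here is a rough userspace sketch of the
setup step. The constant and struct names below (SOL_HOMA,
SO_HOMA_RCVBUF, struct homa_rcvbuf_args) are illustrative placeholders
rather than the exact interface defined in the series; the point is
just the shape of the mechanism: mmap one large region, register it
with the socket, then receive into it.

#include <stddef.h>
#include <sys/mman.h>
#include <sys/socket.h>

#define HOMA_POOL_BYTES (64 * 1024 * 1024)

/* Placeholder values; the series defines the real option and layout. */
#define SOL_HOMA        0xFD
#define SO_HOMA_RCVBUF  10

struct homa_rcvbuf_args {
        void   *start;      /* base of the mmapped region */
        size_t  length;     /* size of the region in bytes */
};

int homa_setup_rx_pool(int sockfd, void **pool_out)
{
        struct homa_rcvbuf_args args;
        void *pool;

        /* One big region, mapped once for the life of the socket. */
        pool = mmap(NULL, HOMA_POOL_BYTES, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (pool == MAP_FAILED)
                return -1;

        /*
         * Hand the virtual address range to Homa; from here on recvmsg
         * returns locations inside this region, and the same call is
         * used to give buffers back once a message has been processed.
         */
        args.start = pool;
        args.length = HOMA_POOL_BYTES;
        if (setsockopt(sockfd, SOL_HOMA, SO_HOMA_RCVBUF,
                       &args, sizeof(args)) < 0) {
                munmap(pool, HOMA_POOL_BYTES);
                return -1;
        }

        *pool_out = pool;
        return 0;
}
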
I gave a talk on this mechanism at Netdev last year, which may provide
some useful background. Slides and video are available here:
https://netdevconf.info/0x17/sessions/talk/kernel-managed-user-buffers-in-homa.html

Thanks in advance for any help you can provide.

-John-

On Wed, Oct 30, 2024 at 9:03 AM Andrew Lunn <andrew@...n.ch> wrote:
>
> On Wed, Oct 30, 2024 at 08:46:33AM -0700, John Ousterhout wrote:
> > On Wed, Oct 30, 2024 at 5:54 AM Andrew Lunn <andrew@...n.ch> wrote:
> > > > I think this is a different problem from what page pools solve. Rather
> > > > than the application providing a buffer each time it calls recvmsg, it
> > > > provides a large region of memory in its virtual address space in
> > > > advance;
> > >
> > > Ah, O.K. Yes, page pool is for kernel memory. However, is the virtual
> > > address space mapped to pages and pinned? Or do you allocate pages
> > > into that VM range as you need them? And then free them once the
> > > application says it has completed? If you are allocating and freeing
> > > pages, the page pool might be useful for these allocations.
> >
> > Homa doesn't allocate or free pages for this: the application mmap's a
> > region and passes the virtual address range to Homa. Homa doesn't need
> > to pin the pages. This memory is used in a fashion similar to how a
> > buffer passed to recvmsg would be used, except that Homa maintains
> > access to the region for the lifetime of the associated socket. When
> > the application finishes processing an incoming message, it notifies
> > Homa so that Homa can reuse the message's buffer space for future
> > messages; there's no page allocation or freeing in this process.
>
> I clearly don't know enough about memory management! I would have
> expected the kernel to do lazy allocation of pages to VM addresses as
> needed. Maybe it is, and when you actually access one of these missing
> pages, you get a page fault and the MM code is kicking in to put an
> actual page there? This could all be hidden inside the copy_to_user()
> call.
>
> > > Taking a step back here, the kernel already has a number of allocators
> > > and ideally we don't want to add yet another one unless it is really
> > > required. So it would be good to get some reviews from the MM people.
> >
> > I'm happy to do that if you still think it's necessary; how do I do that?
>
> Reach out to Andrew Morton <akpm@...ux-foundation.org>, the main
> Memory Management Maintainer. Ask who a good person would be to review
> this code.
>
> Andrew
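
P.S. On the demand-paging question in the exchange above: the sketch
below is just a quick userspace illustration, not part of the patch
series. An anonymous mmap only reserves virtual addresses; physical
pages are faulted in on first touch, which for a region registered
with Homa could happen inside copy_to_user() when message data is
copied into it.

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Count how many pages of [addr, addr+len) are currently resident. */
static long resident_pages(void *addr, size_t len, size_t page)
{
        size_t npages = (len + page - 1) / page;
        unsigned char vec[npages];
        long count = 0;

        if (mincore(addr, len, vec) != 0)
                return -1;
        for (size_t i = 0; i < npages; i++)
                count += vec[i] & 1;
        return count;
}

int main(void)
{
        size_t page = sysconf(_SC_PAGESIZE);
        size_t len = 256 * page;
        void *region = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        if (region == MAP_FAILED)
                return 1;

        /* Nothing has touched the region yet: expect 0 resident pages. */
        printf("resident after mmap:  %ld pages\n",
               resident_pages(region, len, page));

        /* Touch half of the region; only those pages get faulted in. */
        memset(region, 0xab, len / 2);
        printf("resident after touch: %ld pages\n",
               resident_pages(region, len, page));

        munmap(region, len);
        return 0;
}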