Message-ID: <CAGXJAmxKM5a95uhBwbmm1Z427=bGyZhcCUopycLMTEfc4dHnew@mail.gmail.com>
Date:   Sun, 13 Nov 2022 21:37:24 -0800
From:   John Ousterhout <ouster@...stanford.edu>
To:     Andrew Lunn <andrew@...n.ch>
Cc:     Jiri Pirko <jiri@...nulli.us>,
        Stephen Hemminger <stephen@...workplumber.org>,
        netdev@...r.kernel.org
Subject: Re: Upstream Homa?

On Sun, Nov 13, 2022 at 12:38 PM Andrew Lunn <andrew@...n.ch> wrote:
>
> On Sun, Nov 13, 2022 at 12:10:22PM -0800, John Ousterhout wrote:
> > On Sun, Nov 13, 2022 at 9:10 AM Andrew Lunn <andrew@...n.ch> wrote:
> > >
> > > > Homa implements RPCs rather than streams like TCP or messages like
> > > > UDP. An RPC consists of a request message sent from client to server,
> > > > followed by a response message from server back to client. This requires
> > > > additional information in the API beyond what is provided in the arguments to
> > > > sendto and recvfrom. For example, when sending a request message, the
> > > > kernel returns an RPC identifier back to the application; when waiting for
> > > > a response, the application can specify that it wants to receive the reply for
> > > > a specific RPC identifier (or, it can specify that it will accept any
> > > > reply, or any request, or both).
> > >
> > > This sounds like the ancillary data you can pass to sendmsg(). I've
> > > not checked the code; it might be that the current plumbing is only
> > > into the kernel, but I don't see why you cannot extend it to also allow
> > > data to be passed back to user space. If this is new functionality,
> > > maybe add a new flags argument to control it.
> > >
> > > recvmsg() also has ancillary data.
> >
> > Whoah! I'd never noticed the msg_control and msg_controllen fields before.
> > These may be sufficient to do everything Homa needs. Thanks for pointing
> > this out.
>
> Is zero copy also required? https://lwn.net/Articles/726917/ talks
> about this. But rather than doing the transmit complete notification
> via MSG_ERRORQUEUE, maybe you could make it part of the ancillary data
> for a later message? That could save you some system calls? Or is the
> latency low enough that the RPC reply acts as an implicit indication
> that the transmit buffer can be recycled?
>
> If your aim is to offload Homa to the NIC, it seems like zero copy is
> something you want, so even if you are not implementing it now, you
> probably should consider what the uAPI looks like.

I know that zero copy is all the rage these days, but I've become somewhat of
a skeptic. We spent quite a bit of time in the RAMCloud project implementing
zero copy (and we were using kernel-bypass NICs, which make it about as
efficient as possible); we found that it is very difficult to get a real
performance benefit. Managing the space so you know when you can reclaim it
adds a lot of complexity and overhead. My current thinking is that zero copy
only makes sense when you have really large blocks of data. I'm inclined to
let others experiment with zero-copy for a while and see if they can achieve
sustainable benefits over a meaningful range of operating conditions.

-John-
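
For readers unfamiliar with the ancillary data mechanism discussed earlier in
this message (the msg_control and msg_controllen fields of sendmsg() and
recvmsg()), the following is a minimal sketch of control data travelling in
both directions: into the kernel on send, and back out to user space on
receive. It does not use Homa or any Homa-specific control message type, since
none is defined in this thread. It uses the standard SCM_RIGHTS type on a Unix
domain socket pair purely as a stand-in. The helper names (send_with_fd,
recv_with_fd, demo) are hypothetical and chosen only for illustration.

```python
import array
import os
import socket


def send_with_fd(sock, payload, fd):
    """Send payload plus one file descriptor as SCM_RIGHTS ancillary data."""
    fds = array.array("i", [fd])
    # The second argument is the ancillary data list: (level, type, data).
    return sock.sendmsg([payload], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_with_fd(sock, bufsize=1024):
    """Receive a message and any file descriptors returned as ancillary data."""
    fds = array.array("i")
    # Reserve room for one descriptor in the control buffer.
    data, ancdata, _flags, _addr = sock.recvmsg(
        bufsize, socket.CMSG_LEN(fds.itemsize)
    )
    for level, ctype, cdata in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore any trailing partial item in the control data.
            usable = len(cdata) - (len(cdata) % fds.itemsize)
            fds.frombytes(cdata[:usable])
    return data, list(fds)


def demo():
    """Pass the write end of a pipe alongside a message, then use it."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    r, w = os.pipe()
    try:
        send_with_fd(a, b"request", w)
        data, fds = recv_with_fd(b)
        # The received descriptor refers to the same pipe as w.
        os.write(fds[0], b"via passed fd")
        result = os.read(r, 64)
        for fd in fds:
            os.close(fd)
        return data, result
    finally:
        a.close()
        b.close()
        os.close(r)
        os.close(w)
```

A protocol such as Homa would presumably define its own control message level
and types (for example, to carry an RPC identifier) rather than reuse
SCM_RIGHTS; that part is an assumption and is not shown here.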
