Date:	Sat, 27 Mar 2010 11:26:58 -0300
From:	Arnaldo Carvalho de Melo <>
To:	Brandon Black <>
Cc:	Chris Friesen <>,
Subject: Re: behavior of recvmmsg() on blocking sockets

Em Sat, Mar 27, 2010 at 08:19:09AM -0500, Brandon Black escreveu:
> On Wed, Mar 24, 2010 at 2:55 PM, Brandon Black <> wrote:
> > On Wed, Mar 24, 2010 at 2:36 PM, Chris Friesen <> wrote:
> >> Consider the case where you want to do some other useful work in
> >> addition to running your network server.  Every cpu cycle spent on the
> >> network server is robbed from the other work.  In this scenario you want
> >> to handle packets as efficiently as possible, so the timeout-based
> >> behaviour is better since it is more likely to give you multiple packets
> >> per syscall.
> >
> > That's a good point, I tend to tunnelvision on the dedicated server
> > scenario.  I should probably have a user-level option for
> > timeout-based operation as well, since the decision here gets to the
> > systems admin/engineering level and will be situational.
> I've been playing with the timeout argument to recvmmsg as well now,
> and I'm struggling to see how one would ever use it correctly with the
> current implementation.  It seems to rely on the assumption of a
> never-ending stream of tightly-spaced input packets?  It seems like it

As somebody else said earlier in this discussion (perhaps Chris), the
choice of timeout is based on the maximum acceptable latency.

If minimum latency is desired, use a zero timeout and collect however
many packets were queued up while the application was processing the
previous batch.

If instead more packets are desired per batch and some latency is
acceptable, use a timeout.

10 Gbit/s interfaces were the target, but the results from a simple test
app, published when the syscall was initially posted, showed that this
helped even on 1 Gbit/s ethernet.

> was meant for usage on blocking sockets.  Given a blocking socket with
> timeout 0 (infinite), and a recvmmsg timeout of 100us, if you had a
> very steady stream of input packets, recvmmsg would pull in all of
> them that it could within a max timeframe of (100us +
> time_to_execute_one_recvmsg).  However, any disruption to the input
> stream for a time-window of N would result in delaying some
> already-received packets by N.  For example, consider the case that 2
> packets are already queued when you invoke recvmmsg(), but then the
> next packet doesn't arrive for another 300ms.  In this scenario, you'd
> end up with recvmmsg() blocking for 300ms and then returning all 3
> packets, two of which have been delayed way beyond the specified
> timeout.

And that is a use case that is fixed by your patch, thanks, now we cover
more use cases :-)

- Arnaldo