Message-ID: <49FEAF6B.5090308@cosmosbay.com>
Date: Mon, 04 May 2009 11:03:39 +0200
From: Eric Dumazet <dada1@...mosbay.com>
To: Elad Lahav <elahav@...terloo.ca>
CC: linux-kernel@...r.kernel.org,
Linux Netdev List <netdev@...r.kernel.org>
Subject: Re: [PATCH] Implementation of the sendgroup() system call
Eric Dumazet wrote:
> Elad Lahav wrote:
>> The attached patch contains an implementation of sendgroup(), a system
>> call that allows a UDP packet to be transmitted efficiently to multiple
>> recipients. Use cases for this system call include live-streaming and
>> multi-player online games.
>> The basic idea is that the caller maintains a group - a list of IP
>> addresses and UDP ports - and calls sendgroup() with the group list and
>> a common payload. Optionally, the call allows for per-recipient data to
>> be prepended or appended to the shared block. The data is copied once in
>> the kernel into an allocated page, and the per-recipient socket buffers
>> point to that page. Savings come from avoiding both the multiple calls
>> and the multiple copies of the data required with regular socket
>> operations. We have measured an improvement of 42% in CPU utilisation
>> when using this system call with the Helix multimedia server (reference:
>> http://simula.no/~griff/nossdav2008/27-32.pdf).
>>
>> The patch includes two implementations: one as described above and one
>> that uses the udp_sendmsg() function in a tight loop inside the kernel
>> (and thus saves on mode switches, but not on data copies). The latter is
>> provided for reference and benchmarking only.
>>
>> Feedback is welcome.
>>
>
> Hi Elad
>
> The patch is not inlined, which is really asking for trouble; I doubt many people
> will actually read it...
>
> My comments are:
>
> 1) Lack of latency checks. Sending UDP to 1000 destinations is expensive.
> A syscall is not preemptable unless special conditions are met.
>
> 2) Lack of a 32/64-bit aware API. A 64-bit kernel should be able to
> run a 32-bit application using a sendgroup() syscall.
>
> 3) Are the footers/headers different for each call? Maybe you need
> something better to avoid extra copies of them at each sendgroup() system call.
>
> 4) One expensive thing on UDP sends is the route cache lookups. You could avoid
> this cost using 'connected' group setup (see point 3)
>
> i.e. using a different syscall to set up the group (and compute/look up all needed routes).
> (This syscall would be able to add/delete members, with their footer/header, to/from the socket group.)
>
> Then sendgroup() would be really light, since it would provide a group identifier
> (can be a file descriptor -> mapping one group), and the UDP message to send.
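
A rough userspace approximation of the "connected group" idea in points 3 and 4, assuming IPv4 and one connected UDP socket per recipient. The names struct group, group_add() and group_send(), and the GROUP_MAX/HDR_MAX limits, are hypothetical and not part of the patch. connect() lets the kernel cache the route on the socket, and the per-recipient header is stored once at setup time:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define GROUP_MAX 64	/* hypothetical limit, for illustration only */
#define HDR_MAX   16	/* hypothetical maximum per-recipient header size */

struct member {
	int fd;			/* UDP socket connected to this recipient */
	char hdr[HDR_MAX];	/* per-recipient header, stored at setup time */
	size_t hdr_len;
};

struct group {
	struct member m[GROUP_MAX];
	size_t count;
};

/* Add a recipient; connect() fixes the destination once, at setup time. */
static int group_add(struct group *g, const char *ip, unsigned short port,
		     const void *hdr, size_t hdr_len)
{
	struct sockaddr_in dst;
	struct member *mb;
	int fd;

	if (g->count >= GROUP_MAX || hdr_len > HDR_MAX)
		return -1;

	memset(&dst, 0, sizeof(dst));
	dst.sin_family = AF_INET;
	dst.sin_port = htons(port);
	if (inet_pton(AF_INET, ip, &dst.sin_addr) != 1)
		return -1;

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;
	if (connect(fd, (struct sockaddr *)&dst, sizeof(dst)) < 0) {
		close(fd);
		return -1;
	}

	mb = &g->m[g->count++];
	mb->fd = fd;
	mb->hdr_len = hdr_len;
	if (hdr_len)
		memcpy(mb->hdr, hdr, hdr_len);
	return 0;
}

/* Send header + shared payload to every member; returns datagrams sent. */
static size_t group_send(struct group *g, const void *payload, size_t len)
{
	size_t i, sent = 0;

	for (i = 0; i < g->count; i++) {
		struct iovec iov[2];
		struct msghdr msg;

		iov[0].iov_base = g->m[i].hdr;
		iov[0].iov_len = g->m[i].hdr_len;
		iov[1].iov_base = (void *)payload;	/* sendmsg() only reads it */
		iov[1].iov_len = len;

		memset(&msg, 0, sizeof(msg));
		msg.msg_iov = iov;
		msg.msg_iovlen = 2;

		if (sendmsg(g->m[i].fd, &msg, 0) >= 0)
			sent++;
	}
	return sent;
}
```

This still costs one syscall and one payload copy per recipient, which is exactly what an in-kernel sendgroup() would remove; it only shows the setup/send split.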

Ah, some other points: you forgot to include netdev (CCed on my message),
as some network guys don't read lkml every day :)

In your experiments, did you change the NIC tx queue length? (The default is 1000.)
Using sendgroup() or sendmsg(), you'll hit the NIC queue limit pretty fast anyway...

Also, since 2.6.25 added memory accounting on UDP sockets, you'll probably need to
increase SO_SNDBUF to avoid blocking in the sendmsg()/sendgroup() call.
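
A minimal sketch of raising the send buffer, assuming an already created socket; set_sndbuf() is a hypothetical helper name. On Linux the kernel doubles the requested value to account for bookkeeping overhead and caps the request at net.core.wmem_max, so the helper reads the granted size back instead of trusting the request:

```c
#include <sys/socket.h>

/* Request a larger send buffer; return the size the kernel granted, or -1. */
static int set_sndbuf(int fd, int bytes)
{
	int granted = 0;
	socklen_t len = sizeof(granted);

	if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) < 0)
		return -1;
	if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &granted, &len) < 0)
		return -1;
	return granted;
}
```

If the granted size is smaller than expected, net.core.wmem_max is the likely limit.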