Message-ID: <CAJPywTLpLjFXNBJnNB2puaDKe0Ku_afSE-sRZiQ+ZGdmFQhaDA@mail.gmail.com>
Date: Thu, 13 Dec 2018 12:25:20 +0100
From: Marek Majkowski <marek@...udflare.com>
To: netdev@...r.kernel.org
Subject: splice() performance for TCP socket forwarding
Hi!
I'm basically trying to do TCP splicing in Linux. I'm focusing on
performance of the simplest case: receive data from one TCP socket,
write data to another TCP socket. I get poor performance with splice.
First, the naive code, pretty much:
    while (1) {
        n = read(rs, buf, sizeof(buf));
        write(ws, buf, n);
    }
With GRO enabled, this code runs at roughly the 10Gbps line rate, with
the application hovering at ~50% CPU (mostly sys).
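For reference, the naive loop can be written out in full roughly as
follows. This is only a sketch: the helper name copy_forward and the
64 KiB buffer size are assumptions, not taken from the actual test
program. It adds the pieces the pseudocode leaves out: EOF detection,
EINTR retries and short writes.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Copy bytes from rs to ws until EOF on rs.
 * Returns the number of bytes forwarded, or -1 on error (errno is set).
 * copy_forward is a hypothetical helper name. */
ssize_t copy_forward(int rs, int ws)
{
    static char buf[64 * 1024];   /* buffer size is an assumption */
    ssize_t total = 0;

    assert(rs >= 0 && ws >= 0);
    for (;;) {
        ssize_t n = read(rs, buf, sizeof(buf));
        if (n == 0)
            return total;         /* peer closed the connection */
        if (n < 0) {
            if (errno == EINTR)
                continue;         /* interrupted, retry the read */
            return -1;
        }
        /* write() may accept fewer bytes than requested, so loop. */
        for (ssize_t off = 0; off < n; ) {
            ssize_t m = write(ws, buf + off, (size_t)(n - off));
            if (m < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += m;
        }
        total += n;
    }
}
```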
When replaced with a splice version:
    pipe(pfd);
    fcntl(pfd[0], F_SETPIPE_SZ, 1024 * 1024);
    while (1) {
        n = splice(rd, NULL, pfd[1], NULL, 1024 * 1024,
                   SPLICE_F_MOVE);
        splice(pfd[0], NULL, wd, NULL, n, SPLICE_F_MOVE);
    }
Full code:
https://gist.github.com/majek/c58a97b9be7d9217fe3ebd6c1328faaa#file-proxy-splice-c-L59
I get 100% CPU (sys) and dramatically worse performance (1.5x slower).
A naive run of perf record ./proxy-splice shows:
5.73% [k] queued_spin_lock_slowpath
5.23% [k] ipt_do_table
4.72% [k] __splice_segment.part.59
4.72% [k] do_tcp_sendpages
3.47% [k] _raw_spin_lock_bh
3.36% [k] __x86_indirect_thunk_rax
(kernel 4.14.71)
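For reference, the two-step splice loop shown earlier can be written
with error handling roughly like this. It is a sketch, not the code from
the gist: the helper name splice_forward and its chunk parameter are
assumptions. The main difference from the short snippet is that the
second splice() may move fewer than n bytes, so the pipe is drained in
an inner loop before more data is pulled from the socket.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Move bytes from rd to wd through the pipe pfd until EOF on rd.
 * Returns the number of bytes forwarded, or -1 on error (errno is set).
 * splice_forward and chunk are hypothetical names. */
ssize_t splice_forward(int rd, int wd, int pfd[2], size_t chunk)
{
    ssize_t total = 0;

    assert(chunk > 0);
    for (;;) {
        /* Step 1: source fd -> pipe. Returns whatever is available. */
        ssize_t n = splice(rd, NULL, pfd[1], NULL, chunk, SPLICE_F_MOVE);
        if (n == 0)
            return total;         /* EOF on the source */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        /* Step 2: pipe -> destination fd, repeated until the pipe is
         * empty, because one call may move only part of the data. */
        ssize_t left = n;
        while (left > 0) {
            ssize_t m = splice(pfd[0], NULL, wd, NULL, (size_t)left,
                               SPLICE_F_MOVE);
            if (m < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            if (m == 0) {
                errno = EIO;      /* unexpected: pipe should hold data */
                return -1;
            }
            left -= m;
        }
        total += n;
    }
}
```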
Is it possible to squeeze more from splice? Is it possible to force
splice() to block until more data is queued instead of returning
quickly? (SO_RCVLOWAT doesn't work.)
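For context, SO_RCVLOWAT is set with an ordinary setsockopt() call, as
in the sketch below. The helper name set_rcvlowat is hypothetical and
this is not necessarily how the test program set it. The option raises
the threshold at which a blocking receive wakes up; as noted above, it
did not make splice() wait for more data in this test.

```c
#include <assert.h>
#include <sys/socket.h>

/* Ask the kernel to wake a blocking receive on fd only once at least
 * `bytes` bytes are queued. Returns 0 on success, -1 on error.
 * set_rcvlowat is a hypothetical helper name. */
int set_rcvlowat(int fd, int bytes)
{
    assert(bytes > 0);
    return setsockopt(fd, SOL_SOCKET, SO_RCVLOWAT, &bytes, sizeof(bytes));
}
```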
Is there another way of doing TCP splicing? I'm aware of TCP ZEROCOPY
that landed in 4.19.
Cheers,
Marek