[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <303c8600e4964d1593b038239779ba4b@AcuMS.aculab.com>
Date: Thu, 27 Feb 2020 13:19:15 +0000
From: David Laight <David.Laight@...LAB.COM>
To: netdev <netdev@...r.kernel.org>
Subject: sys_sendto() spinning for 1.6ms
I'm looking into unexpected delays in some real time (RTP audio) processing.
Mostly they are a few 100 usecs in the softint code.
However I've just triggered in sendto() on a raw IPV4 socket taking 1.6ms.
Most of the sends take less than 32us.
The process isn't sleeping and there are no interrupts.
Any idea where it might be spinning?
I'm running ftrace monitoring scheduler events, system calls and
interrupts (hard and soft) but can't turn on anything other than
selective function trace (the trace cost is far too high).
The ftrace at the time of the sendto() is (I'm only running 4 cpu):
pid-28219 [002] .... 1979891.159912: sys_sendto(fd: 394, buff: 7ffbcb2e2940, len: c8, flags: 0, addr: 7ffbccdc9490, addr_len: 10)
pid-28217 [003] .... 1979891.159912: sys_recvfrom(fd: 23c, ubuf: 7ffbcb1db100, size: 748, flags: 0, addr: 7ffbcddcb5b0, addr_len: 7ffbcddcb5ac)
pid-28218 [001] .... 1979891.159912: sys_futex(uaddr: 113d8e0, op: 80, val: 2, utime: 0, uaddr2: 113d8e0, val3: 6e3a)
pid-28216 [000] .... 1979891.159912: sys_futex(uaddr: 113d8e0, op: 81, val: 1, utime: 7ffbce5cc920, uaddr2: 113d8e0, val3: 6e38)
pid-28217 [003] .... 1979891.159912: sys_recvfrom -> 0xac
Cpu 2 trace is:
pid-28219 [002] .... 1979891.159910: sys_futex(uaddr: 113d8e0, op: 81, val: 1, utime: 7ffbccdc9920, uaddr2: 113d8e0, val3: 6e3b)
pid-28219 [002] .... 1979891.159911: sys_futex -> 0x0
pid-28219 [002] .... 1979891.159912: sys_sendto(fd: 394, buff: 7ffbcb2e2940, len: c8, flags: 0, addr: 7ffbccdc9490, addr_len: 10)
pid-28219 [002] .... 1979891.161647: sys_sendto -> 0xc8
pid-28219 [002] .... 1979891.161648: sys_pread64(fd: 3ae, buf: 7ffbccdc9420, count: 1, pos: 0)
pid-28219 [002] .... 1979891.161650: sys_pread64 -> 0x1
pid-28219 [002] .... 1979891.161651: sys_pwrite64(fd: 3ae, buf: d495e8, count: 1, pos: 0)
The previous send on fd 394 was much earlier and that socket is only used
by that process, and only for sends (to avoid sleeping on the socket lock).
It can't be an problem with the socket send buffer - because that would sleep.
The other 3 cpu carry on processing (the futex calls return immediately).
In fact they all sleep 1.4ms before the sento() finishes.
At the moment the trace is a sample of 1 - so I don't know if the concurrent
sento() and recvfrom() are significant.
They will both be the same remote IP, and might be the same port.
Ideas before I start trying to bisect the call graph?
A rough histogram of the time (ns) for the sendto() call.
0k: 71866866
48k: 99213
80k: 3583
112k: 371
144k: 245
176k: 694
208k: 382
240k: 214
272k: 20
304k: 13
336k: 15
368k: 13
400k: 11
432k: 14
464k: 12
496k: 7
528k: 10
560k: 2
592k: 1
624k: 2
656k: 1
688k: 0 (twice)
752k: 1
784k: 0 (twice)
848k: 1
880k: 1
912k: 0 (24 times)
1680k: 1
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Powered by blists - more mailing lists