[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Mon, 12 Dec 2016 17:24:26 -0500
From: Josef Bacik <jbacik@...com>
To: Hannes Frederic Sowa <hannes@...essinduktion.org>
CC: Eric Dumazet <eric.dumazet@...il.com>,
Tom Herbert <tom@...bertland.com>,
Linux Kernel Network Developers <netdev@...r.kernel.org>,
Josef Bacik <jbacik@...com>
Subject: Re: Soft lockup in inet_put_port on 4.6
On Mon, Dec 12, 2016 at 1:44 PM, Hannes Frederic Sowa
<hannes@...essinduktion.org> wrote:
> On 12.12.2016 19:05, Josef Bacik wrote:
>> On Fri, Dec 9, 2016 at 11:14 PM, Eric Dumazet
>> <eric.dumazet@...il.com>
>> wrote:
>>> On Fri, 2016-12-09 at 19:47 -0800, Eric Dumazet wrote:
>>>
>>>>
>>>> Hmm... Is your ephemeral port range includes the port your load
>>>> balancing app is using ?
>>>
>>> I suspect that you might have processes doing bind( port = 0) that
>>> are
>>> trapped into the bind_conflict() scan ?
>>>
>>> With 100,000 + timewaits there, this possibly hurts.
>>>
>>> Can you try the following loop breaker ?
>>
>> It doesn't appear that the app is doing bind(port = 0) during normal
>> operation. I tested this patch and it made no difference. I'm
>> going to
>> test simply restarting the app without changing to the SO_REUSEPORT
>> option. Thanks,
>
> Would it be possible to trace the time the function uses with trace?
> If
> we don't see the number growing considerably over time we probably can
> rule out that we loop somewhere in there (I would instrument
> inet_csk_bind_conflict, __inet_hash_connect and inet_csk_get_port).
>
> __inet_hash_connect -> __inet_check_established also takes a lock
> (inet_ehash_lockp) which can be locked from inet_diag code path during
> socket diag info dumping.
>
> Unfortunately we couldn't reproduce it so far. :/
So I had a bcc script running to time how long we spent in
inet_csk_bind_conflict, __inet_hash_connect and inet_csk_get_port, but
of course I'm an idiot and didn't actually separate out the stats so I
could tell _which_ one was taking forever. But anyway here's a normal
distribution on the box
Some shit : count distribution
0 -> 1 : 0 |
|
2 -> 3 : 0 |
|
4 -> 7 : 0 |
|
8 -> 15 : 0 |
|
16 -> 31 : 0 |
|
32 -> 63 : 0 |
|
64 -> 127 : 0 |
|
128 -> 255 : 0 |
|
256 -> 511 : 0 |
|
512 -> 1023 : 0 |
|
1024 -> 2047 : 74 |
|
2048 -> 4095 : 10537
|****************************************|
4096 -> 8191 : 8497 |********************************
|
8192 -> 16383 : 3745 |**************
|
16384 -> 32767 : 300 |*
|
32768 -> 65535 : 250 |
|
65536 -> 131071 : 180 |
|
131072 -> 262143 : 71 |
|
262144 -> 524287 : 18 |
|
524288 -> 1048575 : 5 |
|
With the times in nanoseconds, and here's the distribution during the
problem
Some shit : count distribution
0 -> 1 : 0 |
|
2 -> 3 : 0 |
|
4 -> 7 : 0 |
|
8 -> 15 : 0 |
|
16 -> 31 : 0 |
|
32 -> 63 : 0 |
|
64 -> 127 : 0 |
|
128 -> 255 : 0 |
|
256 -> 511 : 0 |
|
512 -> 1023 : 0 |
|
1024 -> 2047 : 21 |
|
2048 -> 4095 : 21820
|****************************************|
4096 -> 8191 : 11598 |*********************
|
8192 -> 16383 : 4337 |*******
|
16384 -> 32767 : 290 |
|
32768 -> 65535 : 59 |
|
65536 -> 131071 : 23 |
|
131072 -> 262143 : 12 |
|
262144 -> 524287 : 6 |
|
524288 -> 1048575 : 19 |
|
1048576 -> 2097151 : 1079 |*
|
2097152 -> 4194303 : 0 |
|
4194304 -> 8388607 : 1 |
|
8388608 -> 16777215 : 0 |
|
16777216 -> 33554431 : 0 |
|
33554432 -> 67108863 : 1192 |**
|
Some shit : count distribution
0 -> 1 : 0 |
|
2 -> 3 : 0 |
|
4 -> 7 : 0 |
|
8 -> 15 : 0 |
|
16 -> 31 : 0 |
|
32 -> 63 : 0 |
|
64 -> 127 : 0 |
|
128 -> 255 : 0 |
|
256 -> 511 : 0 |
|
512 -> 1023 : 0 |
|
1024 -> 2047 : 48 |
|
2048 -> 4095 : 14714
|********************|
4096 -> 8191 : 6769 |*********
|
8192 -> 16383 : 2234 |***
|
16384 -> 32767 : 422 |
|
32768 -> 65535 : 208 |
|
65536 -> 131071 : 61 |
|
131072 -> 262143 : 10 |
|
262144 -> 524287 : 416 |
|
524288 -> 1048575 : 826 |*
|
1048576 -> 2097151 : 598 |
|
2097152 -> 4194303 : 10 |
|
4194304 -> 8388607 : 0 |
|
8388608 -> 16777215 : 1 |
|
16777216 -> 33554431 : 289 |
|
33554432 -> 67108863 : 921 |*
|
67108864 -> 134217727 : 74 |
|
134217728 -> 268435455 : 75 |
|
268435456 -> 536870911 : 48 |
|
536870912 -> 1073741823 : 25 |
|
1073741824 -> 2147483647 : 3 |
|
2147483648 -> 4294967295 : 2 |
|
4294967296 -> 8589934591 : 1 |
|
As you can see we start getting tail latencies of up to 4-8 seconds.
Tomorrow I'll separate out the stats so we can know which function is
the problem child. Sorry about not doing that first. Thanks,
Josef
Powered by blists - more mailing lists