Message-ID: <aec45003-3354-e49f-b032-5297e98722eb@gmail.com>
Date: Mon, 30 Apr 2018 08:43:50 -0700
From: Eric Dumazet <eric.dumazet@...il.com>
To: David Miller <davem@...emloft.net>, soheil.kdev@...il.com
Cc: netdev@...r.kernel.org, ycheng@...gle.com, ncardwell@...gle.com,
edumazet@...gle.com, willemb@...gle.com, soheil@...gle.com
Subject: Re: [PATCH V2 net-next 1/2] tcp: send in-queue bytes in cmsg upon read

On 04/30/2018 08:38 AM, David Miller wrote:
> From: Soheil Hassas Yeganeh <soheil.kdev@...il.com>
> Date: Fri, 27 Apr 2018 14:57:32 -0400
>
>> Since the socket lock is not held when calculating the size of
>> receive queue, TCP_INQ is a hint. For example, it can overestimate
>> the queue size by one byte, if FIN is received.
>
> I think it is even worse than that.
>
> If another application comes in and does a recvmsg() in parallel with
> these calculations, you could even report a negative value.
>
> These READ_ONCE() make it look like some of these issues are being
> addressed but they are not.
>
> You could freeze the values just by taking sk->sk_lock.slock, but I
> don't know if that cost is considered acceptable or not.
>
> Another idea is to sample both values in a loop, similar to a sequence
> lock sequence:
>
> again:
> 	tmp1 = A;
> 	tmp2 = B;
> 	barrier();
> 	tmp3 = A;
> 	if (tmp1 != tmp3)
> 		goto again;
>
> But the current state of affairs is not going to work well.
>

We want a hint, and max_t(int, 0, ...) does not return a negative value?

If the hint is wrong in 0.1% of the cases, we really do not care; it is
not meant to replace the existing precise (well, sort of) mechanism.

I say "sort of" because, by the time we have any number, TCP might have
received more packets anyway.
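
To illustrate the clamping argument, here is a minimal userspace sketch,
not the patch itself. It assumes the hint is the difference of two 32-bit
sequence counters sampled without the socket lock. The names rcv_nxt,
copied_seq, inq_hint() and clamp_nonneg() are assumptions made for this
example (the message above names none of them), and clamp_nonneg() is
only a stand-in for the kernel's max_t(int, 0, x).

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's max_t(int, 0, x). */
static int clamp_nonneg(int x)
{
	return x > 0 ? x : 0;
}

/*
 * Illustrative only: derive a queue-size hint from two samples that were
 * read without holding any lock. If a parallel reader advanced copied_seq
 * past the rcv_nxt value we sampled, the raw difference is negative.
 */
static int inq_hint(uint32_t rcv_nxt, uint32_t copied_seq)
{
	/*
	 * Unsigned subtraction wraps modulo 2^32, so it stays correct across
	 * sequence-number wraparound. Converting to a signed 32-bit value
	 * exposes the "reader is ahead" case as a negative number.
	 */
	int32_t diff = (int32_t)(rcv_nxt - copied_seq);

	/* Clamp so the caller never sees a negative hint. */
	return clamp_nonneg(diff);
}
```

With consistent samples, inq_hint(1500, 1000) gives 500. With a stale
rcv_nxt, inq_hint(1000, 1500) gives 0 rather than -500. The clamp only
hides the negative case; it does not make the hint exact, which is the
trade-off being argued for here.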