Message-ID: <dc5ddf13-b524-42a8-ed7a-5db91aaee4ef@gmail.com>
Date: Mon, 27 Sep 2021 14:33:53 -0700
From: Eric Dumazet <eric.dumazet@...il.com>
To: Johannes Lundberg <jlundberg@...w.com>,
Eric Dumazet <edumazet@...gle.com>
Cc: LKML <linux-kernel@...r.kernel.org>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
David Ahern <dsahern@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Florian Westphal <fw@...len.de>,
Alexander Aring <aahringo@...hat.com>,
Tonghao Zhang <xiangxia.m.yue@...il.com>,
Yangbo Lu <yangbo.lu@....com>,
Thomas Gleixner <tglx@...utronix.de>,
netdev <netdev@...r.kernel.org>,
Willem de Bruijn <willemb@...gle.com>
Subject: Re: [PATCH] fs: eventpoll: add empty event
On 9/27/21 2:17 PM, Johannes Lundberg wrote:
>
> On 9/27/21 1:47 PM, Eric Dumazet wrote:
>> On Mon, Sep 27, 2021 at 1:30 PM Johannes Lundberg <jlundberg@...w.com> wrote:
>>> The EPOLLEMPTY event will trigger when the TCP write buffer becomes
>>> empty, i.e., when all outgoing data have been ACKed.
>>>
>>> The need for this functionality comes from a business requirement
>>> of measuring with higher precision how much time is spent
>>> transmitting data to a client. For reference, similar functionality
>>> was previously added to FreeBSD as the kqueue event EVFILT_EMPTY.
>>
>> Adding yet another indirect call [1] in TCP fast path, for something
>> (measuring with higher precision..)
>> which is already implemented differently in TCP stack [2] is not desirable.
>>
>> Our timestamping infrastructure should be ported to FreeBSD instead :)
>>
>> [1] CONFIG_RETPOLINE=y
>>
>> [2] Refs :
>> commit e1c8a607b28190cd09a271508aa3025d3c2f312e
>> net-timestamp: ACK timestamp for bytestreams
>> tools/testing/selftests/net/txtimestamp.c
>
> Hi Eric
>
> Thanks for the feedback! If there's a way to achieve the same thing with current Linux I'm all for it. I'll look into how to use timestamps for this.
>
You are welcome!
Note that timestamping can generate many events, even while the write queue is not empty.
This is particularly useful when an application does not want the write queue to drain,
since draining it would add transmit stalls.
Also, because the events are timestamped exactly when the relevant ACKs are processed,
they are more accurate than anything based on epoll, where I guess you would only
take a timestamp after a thread wakeup.