[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b4f06d11-4761-dabb-f641-5fc05c1c34fc@redhat.com>
Date: Wed, 2 Jan 2019 11:28:43 +0800
From: Jason Wang <jasowang@...hat.com>
To: "Michael S. Tsirkin" <mst@...hat.com>
Cc: linux-kernel@...r.kernel.org, maxime.coquelin@...hat.com,
tiwei.bie@...el.com, wexu@...hat.com, jfreimann@...hat.com,
"David S. Miller" <davem@...emloft.net>,
virtualization@...ts.linux-foundation.org, netdev@...r.kernel.org
Subject: Re: [PATCH RFC 1/2] virtio-net: bql support
On 2018/12/31 上午2:45, Michael S. Tsirkin wrote:
> On Thu, Dec 27, 2018 at 06:00:36PM +0800, Jason Wang wrote:
>> On 2018/12/26 下午11:19, Michael S. Tsirkin wrote:
>>> On Thu, Dec 06, 2018 at 04:17:36PM +0800, Jason Wang wrote:
>>>> On 2018/12/6 上午6:54, Michael S. Tsirkin wrote:
>>>>> When use_napi is set, let's enable BQLs. Note: some of the issues are
>>>>> similar to wifi. It's worth considering whether something similar to
>>>>> commit 36148c2bbfbe ("mac80211: Adjust TSQ pacing shift") might be
>>>>> benefitial.
>>>> I've played a similar patch several days before. The tricky part is the mode
>>>> switching between napi and no napi. We should make sure when the packet is
>>>> sent and trakced by BQL, it should be consumed by BQL as well. I did it by
>>>> tracking it through skb->cb. And deal with the freeze by reset the BQL
>>>> status. Patch attached.
>>>>
>>>> But when testing with vhost-net, I don't very a stable performance,
>>> So how about increasing TSQ pacing shift then?
>>
>> I can test this. But changing default TCP value is much more than a
>> virtio-net specific thing.
> Well same logic as wifi applies. Unpredictable latencies related
> to radio in one case, to host scheduler in the other.
>
>>>> it was
>>>> probably because we batch the used ring updating so tx interrupt may come
>>>> randomly. We probably need to implement time bounded coalescing mechanism
>>>> which could be configured from userspace.
>>> I don't think it's reasonable to expect userspace to be that smart ...
>>> Why do we need time bounded? used ring is always updated when ring
>>> becomes empty.
>>
>> We don't add used when means BQL may not see the consumed packet in time.
>> And the delay varies based on the workload since we count packets not bytes
>> or time before doing the batched updating.
>>
>> Thanks
> Sorry I still don't get it.
> When nothing is outstanding then we do update the used.
> So if BQL stops userspace from sending packets then
> we get an interrupt and packets start flowing again.
Yes, but how about the cases of multiple flows. That's where I see
unstable results.
>
> It might be suboptimal, we might need to tune it but I doubt running
> timers is a solution, timer interrupts cause VM exits.
Probably not a timer but a time counter (or event byte counter) in vhost
to add used and signal guest if it exceeds a value instead of waiting
the number of packets.
Thanks
>
>>>> Btw, maybe it's time just enable napi TX by default. I get ~10% TCP_RR
>>>> regression on machine without APICv, (haven't found time to test APICv
>>>> machine). But consider it was for correctness, I think it's acceptable? Then
>>>> we can do optimization on top?
>>>>
>>>>
>>>> Thanks
>>>>
>>>>
>>>>> Signed-off-by: Michael S. Tsirkin <mst@...hat.com>
>>>>> ---
>>>>> drivers/net/virtio_net.c | 27 +++++++++++++++++++--------
>>>>> 1 file changed, 19 insertions(+), 8 deletions(-)
>>>>>
>>>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>>>> index cecfd77c9f3c..b657bde6b94b 100644
>>>>> --- a/drivers/net/virtio_net.c
>>>>> +++ b/drivers/net/virtio_net.c
>>>>> @@ -1325,7 +1325,8 @@ static int virtnet_receive(struct receive_queue *rq, int budget,
>>>>> return stats.packets;
>>>>> }
>>>>> -static void free_old_xmit_skbs(struct send_queue *sq)
>>>>> +static void free_old_xmit_skbs(struct send_queue *sq, struct netdev_queue *txq,
>>>>> + bool use_napi)
>>>>> {
>>>>> struct sk_buff *skb;
>>>>> unsigned int len;
>>>>> @@ -1347,6 +1348,9 @@ static void free_old_xmit_skbs(struct send_queue *sq)
>>>>> if (!packets)
>>>>> return;
>>>>> + if (use_napi)
>>>>> + netdev_tx_completed_queue(txq, packets, bytes);
>>>>> +
>>>>> u64_stats_update_begin(&sq->stats.syncp);
>>>>> sq->stats.bytes += bytes;
>>>>> sq->stats.packets += packets;
>>>>> @@ -1364,7 +1368,7 @@ static void virtnet_poll_cleantx(struct receive_queue *rq)
>>>>> return;
>>>>> if (__netif_tx_trylock(txq)) {
>>>>> - free_old_xmit_skbs(sq);
>>>>> + free_old_xmit_skbs(sq, txq, true);
>>>>> __netif_tx_unlock(txq);
>>>>> }
>>>>> @@ -1440,7 +1444,7 @@ static int virtnet_poll_tx(struct napi_struct *napi, int budget)
>>>>> struct netdev_queue *txq = netdev_get_tx_queue(vi->dev, vq2txq(sq->vq));
>>>>> __netif_tx_lock(txq, raw_smp_processor_id());
>>>>> - free_old_xmit_skbs(sq);
>>>>> + free_old_xmit_skbs(sq, txq, true);
>>>>> __netif_tx_unlock(txq);
>>>>> virtqueue_napi_complete(napi, sq->vq, 0);
>>>>> @@ -1505,13 +1509,15 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
>>>>> struct send_queue *sq = &vi->sq[qnum];
>>>>> int err;
>>>>> struct netdev_queue *txq = netdev_get_tx_queue(dev, qnum);
>>>>> - bool kick = !skb->xmit_more;
>>>>> + bool more = skb->xmit_more;
>>>>> bool use_napi = sq->napi.weight;
>>>>> + unsigned int bytes = skb->len;
>>>>> + bool kick;
>>>>> /* Free up any pending old buffers before queueing new ones. */
>>>>> - free_old_xmit_skbs(sq);
>>>>> + free_old_xmit_skbs(sq, txq, use_napi);
>>>>> - if (use_napi && kick)
>>>>> + if (use_napi && !more)
>>>>> virtqueue_enable_cb_delayed(sq->vq);
>>>>> /* timestamp packet in software */
>>>>> @@ -1552,7 +1558,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
>>>>> if (!use_napi &&
>>>>> unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
>>>>> /* More just got used, free them then recheck. */
>>>>> - free_old_xmit_skbs(sq);
>>>>> + free_old_xmit_skbs(sq, txq, false);
>>>>> if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
>>>>> netif_start_subqueue(dev, qnum);
>>>>> virtqueue_disable_cb(sq->vq);
>>>>> @@ -1560,7 +1566,12 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
>>>>> }
>>>>> }
>>>>> - if (kick || netif_xmit_stopped(txq)) {
>>>>> + if (use_napi)
>>>>> + kick = __netdev_tx_sent_queue(txq, bytes, more);
>>>>> + else
>>>>> + kick = !more || netif_xmit_stopped(txq);
>>>>> +
>>>>> + if (kick) {
>>>>> if (virtqueue_kick_prepare(sq->vq) && virtqueue_notify(sq->vq)) {
>>>>> u64_stats_update_begin(&sq->stats.syncp);
>>>>> sq->stats.kicks++;
Powered by blists - more mailing lists