[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <581BC852.7010304@gmail.com>
Date: Thu, 3 Nov 2016 16:29:22 -0700
From: John Fastabend <john.fastabend@...il.com>
To: "Michael S. Tsirkin" <mst@...hat.com>
Cc: Shrijeet Mukherjee <shm@...ulusnetworks.com>,
Jesper Dangaard Brouer <brouer@...hat.com>,
Thomas Graf <tgraf@...g.ch>,
Alexei Starovoitov <alexei.starovoitov@...il.com>,
Jakub Kicinski <kubakici@...pl>,
David Miller <davem@...emloft.net>, alexander.duyck@...il.com,
shrijeet@...il.com, tom@...bertland.com, netdev@...r.kernel.org,
Roopa Prabhu <roopa@...ulusnetworks.com>,
Nikolay Aleksandrov <nikolay@...ulusnetworks.com>,
aconole@...hat.com
Subject: Re: [PATCH net-next RFC WIP] Patch for XDP support for virtio_net
[...]
>>> - when XDP is attached disable all LRO using VIRTIO_NET_CTRL_GUEST_OFFLOADS_SET
>>> (not used by driver so far, designed to allow dynamic LRO control with
>>> ethtool)
>>
>> I see there is a UAPI bit for this but I guess we also need to add
>> support to vhost as well? Seems otherwise we may just drop a bunch
>> of packets on the floor out of handle_rx() when recvmsg returns larger
>> than a page size. Or did I read this wrong...
>
> It's already supported host side. However you might
> get some packets that were in flight when you attached.
>
Really I must have missed it I don't see any *GUEST_FEATURES* flag in
./drivers/vhost/?
>>> - start adding page-sized buffers
>>
>> I started to mangle add_recvbuf_big() and receive_big() here and this
>> didn't seem too bad.
>
> I imagine it won't be ATM but I think we'll need to support
> mrg buffers with time and then it will be messy.
> Besides, it's not an architectural thing that receive_big
> uses page sized buffers, it could use any size.
> So a separate path just for xdp would be better imho.
>
>>> - do something with non-page-sized buffers added previously - what
>>> exactly? copy I guess? What about LRO packets that are too large -
>>> can we drop or can we split them up?
>>
>> hmm not sure I understand this here. With LRO disabled and mergeable
>> buffers disabled all packets should fit in a page correct?
>
> Assuing F_MTU is negotiated and MTU field is small enough, yes.
> But if you disable mrg buffers dynamically you will get some packets
> in buffers that were added before the disable.
> Similarly for disabling LRO dynamically.
>
>> With LRO enabled case I guess to start with we block XDP from being
>> loaded for the same reason we don't allow jumbo frames on physical
>> nics.
>
> If you ask that host disables the capability, then yes, it's easy.
> Let's do that for now, it's a start.
>
>
>>>
>>> I'm fine with disabling XDP for some configurations as the first step,
>>> and we can add that support later.
>>>
>>
>> In order for this to work though I guess we need to be able to
>> dynamically disable mergeable buffers at the moment I just commented
>> it out of the features list and fixed up virtio_has_features so it
>> wont bug_on.
>
> For now we can just set mrg_rxbuf=off on qemu command line, and
> fail XDP attach if not there. I think we'll be able to support it
> long term but you will need host side changes, or fully reset
> device and reconfigure it.
see question below. I agree disabling mrg_rxbuff=off lro=off and an
xdp receive path makes this relatively straight forward and clean with
the MTU patch noted below as well.
>
>>> Ideas about mergeable buffers (optional):
>>>
>>> At the moment mergeable buffers can't be disabled dynamically.
>>> They do bring a small benefit for XDP if host MTU is large (see below)
>>> and aren't hard to support:
>>> - if header is by itself skip 1st page
>>> - otherwise copy all data into first page
>>> and it's nicer not to add random limitations that require guest reboot.
>>> It might make sense to add a command that disables/enabled
>>> mergeable buffers dynamically but that's for newer hosts.
>>
>> Yep it seems disabling mergeable buffers solves this but didn't look at
>> it too closely. I'll look closer tomorrow.
>>
>>>
>>> Spec does not require it but in practice most hosts put all data
>>> in the 1st page or all in the 2nd page so the copy will be nop
>>> for these cases.
>>>
>>> Large host MTU - newer hosts report the host MTU, older ones don't.
>>> Using mergeable buffers we can at least detect this case
>>> (and then what? drop I guess).
>>>
>>
>> The physical nics just refuse to load XDP with large MTU.
>
> So let's do the same for now, unfortunately you don't know
> the MTU unless _F_MTU is negitiated and QEMU does not
> implement that yet, but it's easy to add.
> In fact I suspect Aaron (cc) has an implementation since
> he posted a patch implementing that.
> Aaron could you post it pls?
>
Great! Aaron if you want me to review/test at all let me know I have
a few systems setup running this now so can help if needed.
>> Any reason
>> not to negotiate the mtu with the guest so that the guest can force
>> this?
>
> There are generally many guests and many NICs on the host.
> A big packet arrives, what do you want to do with it?
Drop it just like a physical nic would do if packet is larger
than MTU. Maybe splat a message in the log so user has some
clue something got misconfigured.
> We probably want to build propagating MTU across all VMs and NICs
> but let's get a basic thing merged first.
That feels like an orchestration/QEMU type problem to me. Just because
some NIC has jumbo frames enabled doesn't necessarily mean they would
ever get to any specific VM based on forwarding configuration.
And if I try to merge the last email I sent out here. In mergeable and
big_packets modes if LRO is off and MTU < PAGE_SIZE it seems we should
always get physically contiguous data on a single page correct? It
may be at some offset in a page however. But the offset should not
matter to XDP. If I read this right we wouldn't need to add a new
XDP mode and could just use the existing merge or big modes. This would
seem cleaner to me than adding a new mode and requiring a qemu option.
Thanks for all the pointers by the way its very helpful.
Powered by blists - more mailing lists