Date:   Wed, 3 Jan 2018 10:28:03 -0800
From:   Alexander Duyck <alexander.duyck@...il.com>
To:     "Samudrala, Sridhar" <sridhar.samudrala@...el.com>
Cc:     Jakub Kicinski <kubakici@...pl>,
        "Brandeburg, Jesse" <jesse.brandeburg@...el.com>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        Stephen Hemminger <stephen@...workplumber.org>,
        Netdev <netdev@...r.kernel.org>,
        virtualization@...ts.linux-foundation.org,
        virtio-dev@...ts.oasis-open.org,
        Alexander Duyck <alexander.h.duyck@...el.com>
Subject: Re: [PATCH net-next 0/2] Enable virtio to act as a master for a
 passthru device

On Wed, Jan 3, 2018 at 10:14 AM, Samudrala, Sridhar
<sridhar.samudrala@...el.com> wrote:
>
> On 1/3/2018 8:59 AM, Alexander Duyck wrote:
>>
>> On Tue, Jan 2, 2018 at 6:16 PM, Jakub Kicinski <kubakici@...pl> wrote:
>>>
>>> On Tue,  2 Jan 2018 16:35:36 -0800, Sridhar Samudrala wrote:
>>>>
>>>> This patch series enables virtio to switch over to a VF datapath when
>>>> a VF netdev is present with the same MAC address. It allows live
>>>> migration of a VM with a directly attached VF without the need to set
>>>> up a bond/team between the VF and virtio net device in the guest.
>>>>
>>>> The hypervisor needs to unplug the VF device from the guest on the
>>>> source host and reset the MAC filter of the VF to initiate failover of
>>>> the datapath to virtio before starting the migration. After the
>>>> migration is completed, the destination hypervisor sets the MAC filter
>>>> on the VF and plugs it back into the guest to switch back to the VF
>>>> datapath.
>>>>
>>>> It is based on the netvsc implementation, and it may be possible to
>>>> make this code generic and move it to a common location that can be
>>>> shared by netvsc and virtio.
>>>>
>>>> This patch series is based on the discussion initiated by Jesse in
>>>> this thread:
>>>> https://marc.info/?l=linux-virtualization&m=151189725224231&w=2
>>>
>>> How does the notion of a device which is both a bond and a leg of a
>>> bond fit with Alex's recent discussions about feature propagation?
>>> Which propagation rules will apply to the VirtIO master?  The meaning
>>> of the flags on a software upper device may be different.  Why muddy
>>> the architecture like this and not introduce a synthetic bond device?
>>
>> It doesn't really fit with the notion I had. I think there may have
>> been a bit of a disconnect, as I have been out for the last week or so
>> for the holidays.
>>
>> My thought on this was that the feature bit should spawn a new
>> para-virtual bond device, and that bond should have the virtio device
>> and the VF as slaves. Also, I thought there was some discussion about
>> trying to reuse as much of the netvsc code as possible for this so that
>> we could avoid duplication of effort and have the two drivers use the
>> same approach. It seems like it should be pretty straightforward, since
>> you would have the feature bit in the case of virtio, and netvsc just
>> does this sort of thing by default if I am not mistaken.
>
> This patch is mostly based on the netvsc implementation. The only change
> is avoiding the explicit dev_open() call of the VF netdev after a delay.
> I am assuming that the guest userspace will bring up the VF netdev and
> that the hypervisor will update the MAC filters to switch to the right
> datapath.
> We could commonize the code and make it shared between netvsc and
> virtio. Do we want to do this right away or later? If so, what would be
> a good location for these shared functions? Is it net/core/dev.c?

No, I would think about starting a new driver file in drivers/net/. The
idea is that this driver would be used to create the bond automatically
and set up the appropriate registration hooks. If nothing else you could
probably just call it something generic like virt-bond or vbond or
whatever.
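
Just to illustrate (purely a sketch, not working code -- the virtbond_*
names are made up and the actual enslave/failover steps are elided), the
core of such a driver could be a netdevice notifier that auto-enslaves a
newly registered netdev whose permanent MAC matches the upper device,
which is roughly what netvsc does today:

#include <linux/module.h>
#include <linux/netdevice.h>
#include <linux/etherdevice.h>
#include <linux/if_arp.h>

/* The upper device; created elsewhere when the feature bit is seen. */
static struct net_device *virtbond_dev;

static int virtbond_event(struct notifier_block *nb,
                          unsigned long event, void *ptr)
{
        struct net_device *dev = netdev_notifier_info_to_dev(ptr);

        if (!virtbond_dev || dev == virtbond_dev ||
            dev->type != ARPHRD_ETHER)
                return NOTIFY_DONE;

        switch (event) {
        case NETDEV_REGISTER:
                /* A netdev with a matching permanent MAC just showed up;
                 * this is where it would be enslaved under the bond. */
                if (ether_addr_equal(virtbond_dev->perm_addr,
                                     dev->perm_addr))
                        netdev_info(dev, "candidate slave appeared\n");
                break;
        case NETDEV_UNREGISTER:
                /* VF hot-unplug: fail the datapath back to virtio here. */
                break;
        }
        return NOTIFY_DONE;
}

static struct notifier_block virtbond_notifier = {
        .notifier_call = virtbond_event,
};

static int __init virtbond_init(void)
{
        return register_netdevice_notifier(&virtbond_notifier);
}
module_init(virtbond_init);

static void __exit virtbond_exit(void)
{
        unregister_netdevice_notifier(&virtbond_notifier);
}
module_exit(virtbond_exit);

MODULE_LICENSE("GPL");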

> Also, if we want to go with a solution that creates a bond device, do we
> want the virtio_net/netvsc drivers to create an upper device?  Such a
> solution is already possible via config scripts that can create a bond
> with virtio and a VF net device as slaves.  netvsc and this patch series
> are trying to make it as simple as possible for the VM to use directly
> attached devices and support live migration by switching to the virtio
> datapath as a backup during the migration process when the VF device is
> unplugged.

We all understand that. But you are making the solution very virtio
specific. We want to see this be usable for other interfaces such as
netvsc and whatever other virtual interfaces are floating around out
there.

Also, I haven't seen us address how we will handle this on the host. My
thought was that we should have a paired interface. Something like veth,
but made up of a bond on each end. So on the host we would have one bond
that has a tap/vhost interface and a VF port representor, and on the
other end we would have the virtio interface and the VF. Attaching the
tap/vhost to the bond could be a way of triggering the feature bit to be
set in virtio.

That way communication between the guest and the host won't get too
confusing, as you will see all traffic from the bonded MAC address
always show up on the host-side bond instead of potentially showing up
on two unrelated interfaces. It would also be a good way to resolve the
east/west traffic problem on hosts, since you could just send the
broadcast/multicast traffic via the tap/vhost/virtio channel instead of
having to send it back through the port representor and eat up all that
PCIe bus traffic.
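
On the datapath side, the switching we keep describing really boils down
to picking the active lower device at transmit time. A rough sketch of
what the xmit hook of such a paired/bonded device could look like
(virtbond_priv and its fields are made up for illustration; locking and
refcounting are elided):

#include <linux/netdevice.h>
#include <linux/skbuff.h>

struct virtbond_priv {
        struct net_device __rcu *vf_netdev;     /* direct-attached VF */
        struct net_device __rcu *backup_netdev; /* virtio/tap-vhost leg */
};

static netdev_tx_t virtbond_start_xmit(struct sk_buff *skb,
                                       struct net_device *dev)
{
        struct virtbond_priv *priv = netdev_priv(dev);
        struct net_device *slave;

        /* Prefer the VF datapath when it is present and has link;
         * otherwise fall back to the para-virtual leg, which is what
         * carries the traffic for the duration of a live migration. */
        slave = rcu_dereference_bh(priv->vf_netdev);
        if (!slave || !netif_carrier_ok(slave))
                slave = rcu_dereference_bh(priv->backup_netdev);

        if (!slave) {
                dev_kfree_skb_any(skb);
                return NETDEV_TX_OK;
        }

        skb->dev = slave;
        return dev_queue_xmit(skb);
}

That is more or less what netvsc already does on transmit when a VF is
bound, so commonizing this part should not be a huge amount of work.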
