lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9dbc644f-4e4c-7119-8f99-99850fc67b73@nvidia.com>
Date:   Fri, 8 Mar 2019 00:34:25 +0530
From:   Kirti Wankhede <kwankhede@...dia.com>
To:     Parav Pandit <parav@...lanox.com>,
        Jakub Kicinski <jakub.kicinski@...ronome.com>
CC:     Or Gerlitz <gerlitz.or@...il.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "michal.lkml@...kovi.net" <michal.lkml@...kovi.net>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
        Jiri Pirko <jiri@...lanox.com>,
        Alex Williamson <alex.williamson@...hat.com>
Subject: Re: [RFC net-next 0/8] Introducing subdev bus and devlink extension

CC += Alex

On 3/6/2019 11:12 AM, Parav Pandit wrote:
> Hi Kirti,
> 
>> -----Original Message-----
>> From: Kirti Wankhede <kwankhede@...dia.com>
>> Sent: Tuesday, March 5, 2019 9:51 PM
>> To: Parav Pandit <parav@...lanox.com>; Jakub Kicinski
>> <jakub.kicinski@...ronome.com>
>> Cc: Or Gerlitz <gerlitz.or@...il.com>; netdev@...r.kernel.org; linux-
>> kernel@...r.kernel.org; michal.lkml@...kovi.net; davem@...emloft.net;
>> gregkh@...uxfoundation.org; Jiri Pirko <jiri@...lanox.com>
>> Subject: Re: [RFC net-next 0/8] Introducing subdev bus and devlink extension
>>
>>
>>
>> On 3/6/2019 6:14 AM, Parav Pandit wrote:
>>> Hi Greg, Kirti,
>>>
>>>> -----Original Message-----
>>>> From: Parav Pandit
>>>> Sent: Tuesday, March 5, 2019 5:45 PM
>>>> To: Parav Pandit <parav@...lanox.com>; Kirti Wankhede
>>>> <kwankhede@...dia.com>; Jakub Kicinski
>> <jakub.kicinski@...ronome.com>
>>>> Cc: Or Gerlitz <gerlitz.or@...il.com>; netdev@...r.kernel.org; linux-
>>>> kernel@...r.kernel.org; michal.lkml@...kovi.net;
>> davem@...emloft.net;
>>>> gregkh@...uxfoundation.org; Jiri Pirko <jiri@...lanox.com>
>>>> Subject: RE: [RFC net-next 0/8] Introducing subdev bus and devlink
>>>> extension
>>>>
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: linux-kernel-owner@...r.kernel.org <linux-kernel-
>>>>> owner@...r.kernel.org> On Behalf Of Parav Pandit
>>>>> Sent: Tuesday, March 5, 2019 5:17 PM
>>>>> To: Kirti Wankhede <kwankhede@...dia.com>; Jakub Kicinski
>>>>> <jakub.kicinski@...ronome.com>
>>>>> Cc: Or Gerlitz <gerlitz.or@...il.com>; netdev@...r.kernel.org;
>>>>> linux- kernel@...r.kernel.org; michal.lkml@...kovi.net;
>>>>> davem@...emloft.net; gregkh@...uxfoundation.org; Jiri Pirko
>>>>> <jiri@...lanox.com>
>>>>> Subject: RE: [RFC net-next 0/8] Introducing subdev bus and devlink
>>>>> extension
>>>>>
>>>>> Hi Kirti,
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Kirti Wankhede <kwankhede@...dia.com>
>>>>>> Sent: Tuesday, March 5, 2019 4:40 PM
>>>>>> To: Parav Pandit <parav@...lanox.com>; Jakub Kicinski
>>>>>> <jakub.kicinski@...ronome.com>
>>>>>> Cc: Or Gerlitz <gerlitz.or@...il.com>; netdev@...r.kernel.org;
>>>>>> linux- kernel@...r.kernel.org; michal.lkml@...kovi.net;
>>>>>> davem@...emloft.net; gregkh@...uxfoundation.org; Jiri Pirko
>>>>>> <jiri@...lanox.com>
>>>>>> Subject: Re: [RFC net-next 0/8] Introducing subdev bus and devlink
>>>>>> extension
>>>>>>
>>>>>>
>>>>>>
>>>>>>> I am novice at mdev level too. mdev or vfio mdev.
>>>>>>> Currently by default we bind to same vendor driver, but when it
>>>>>>> was
>>>>>> created as passthrough device, vendor driver won't create netdevice
>>>>>> or rdma device for it.
>>>>>>> And vfio/mdev or whatever mature available driver would bind at
>>>>>>> that
>>>>>> point.
>>>>>>>
>>>>>>
>>>>>> Using mdev framework, if you want to partition a physical device
>>>>>> into multiple logic devices, you can bind those devices to same
>>>>>> vendor driver through vfio-mdev, where as if you want to
>>>>>> passthrough the device bind it to vfio-pci. If I understand
>>>>>> correctly, that is what you are
>>>>> looking for.
>>>>>>
>>>>>>
>>>>> We cannot bind a whole PCI device to vfio-pci, reason is, A given
>>>>> PCI device has existing protocol devices on it such as netdevs and rdma
>> dev.
>>>>> This device is partitioned while those protocol devices exist and
>>>>> mlx5_core, mlx5_ib drivers are loaded on it.
>>>>> And we also need to connect these objects rightly to eswitch exposed
>>>>> by devlink interface (net/core/devlink.c) that supports eswitch
>>>>> binding, health, registers, parameters, ports support.
>>>>> It also supports existing PCI VFs.
>>>>>
>>>>> I don’t think we want to replicate all of this again in mdev subsystem [1].
>>>>>
>>>>> [1]
>>>>> https://www.kernel.org/doc/Documentation/vfio-mediated-device.txt
>>>>>
>>>>> So devlink interface to migrate users from managing VFs to non_VF
>>>>> sub device is natural progression.
>>>>>
>>>>> However, in future, I believe we would be creating mediated devices
>>>>> on user request, to use mdev modules and map them to VM.
>>>>>
>>>>> Also 'mdev_bus' is created as a class and not as a bus. This limits
>>>>> to not use devlink interface whose handle is bus+device name.
>>>>>
>>>>> So one option is to change mdev from class to bus.
>>>>> devlink will create mdevs on the bus, mdev driver can probe these
>>>>> devices on host system by default.
>>>>> And if told to do passthrough, a different driver exposes them to VM.
>>>>> How feasible is this?
>>>>>
>>>> Wait, I do see a mdev bus and mdevs are created on this bus using
>>>> mdev_device_create().
>>>> So how about we create mdevs on this bus using devlink, instead of sysfs?
>>>> And driver side on host gets the mdev_register_driver()->probe()?
>>>>
>>>
>>> Thinking more and reviewing more mdev code, I believe mdev fits this
>>> need a lot better than new subdev bus, mfd, platform device, or devlink
>> subport.
>>> For coming future, to map this sub device (mdev) to VM will also be easier
>> by using mdev bus.
>>>
>>
>> Thanks for taking close look at mdev code.
>>
>> Assigning mdev to VM support is already in place, QEMU and libvirt have
>> support to assign mdev device to VM.
>>
>>> I also believe we can use the sysfs interface for mdev life cycle.
>>> Here when mdev are created it will register as devlink instance and
>>> will be able to query/config parameters before driver probe the device.
>>> (instead of having life cycle via devlink)
>>>
>>> Few enhancements would be needed for mdev side.
>>> 1. making iommu optional.
>>
>> Currently mdev devices are not IOMMU aware, vendor driver is responsible
>> for programming IOMMU for mdev device, if required.
>> IOMMU aware mdev device patch set is almost reviewed and ready to get
>> pulled. This is optional, vendor driver have to decide whether mdev device
>> should be associated with its parents IOMMU or not. I'm testing it and I
>> think Alex is on vacation and this will get pulled when Alex will be back from
>> vacation.
>> https://lwn.net/Articles/779650/
>>
>>> 2. configuring mdev device parameters during creation time
>>>
>>
>> Mdev framework provides a way to define multiple types for creation
>> through sysfs. You can define multiple types rather than having creation
>> time parameter and on creation accordingly update 'available_instances'.
>> Mdev also provides a way to provide vendor-specific-attributes for parent
>> physical device as well as for created mdev device. You can add sysfs
>> interface to get input parameters for a mdev device which can be used by
>> vendor driver when open() on that mdev device is called.
>>
>> Thanks,
>> Kirti
> 
> Yes. I got my patches to adapt to mdev way. Will be posting RFC v2 soon.
> Will wait for a day to receive more comments/views from Greg and others.
> 
> As I explained in this cover-letter and discussion,
> First use case is to create and use mdevs in the host (and not in VM).
> Later on, I am sure once we have mdevs available, VM users will likely use it.
> 
> So, mlx5_core driver will have two components as starting point.
> 
> 1. drivers/net/ethernet/mellanox/mlx5/core/mdev/mdev.c
> This is mdev device life cycle driver which will do, mdev_register_device() and implements mlx5_mdev_ops.
> 
Ok. I would suggest not use mdev.c file name, may be add device name,
something like mlx_mdev.c or vfio_mlx.c

> 2. drivers/net/ethernet/mellanox/mlx5/core/mdev/mdev_driver.c
> This is mdev device driver which does mdev_register_driver() 
> and probe() creates netdev by heavily reusing existing code of the PF device.
> These drivers will not be placed under drivers/vfio/mdev, because this is not a vfio driver.
> This is fine, right?
>

I'm not too familiar with netdev, but can you create netdev on open()
call on mlx mdev device? Then you don't have to write mdev device driver.


> Given that this is net driver, we will be submitting patches,
> through netdev mailing list through Dave Miller's net-next tree.
> And CC kvm@...r.kernel.org, you and others as usual.
> Are you ok, merging code this way as mdev device creator and mdev driver.
> Yes?
> 

Keep Alex and me in loop.

Thanks,
Kirti

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ