Message-ID: <56B885FD.2050305@stressinduktion.org>
Date: Mon, 8 Feb 2016 13:11:41 +0100
From: Hannes Frederic Sowa <hannes@...essinduktion.org>
To: Jiri Pirko <jiri@...nulli.us>
Cc: Alexei Starovoitov <alexei.starovoitov@...il.com>,
Daniel Borkmann <daniel@...earbox.net>,
Jesper Dangaard Brouer <brouer@...hat.com>,
netdev@...r.kernel.org, davem@...emloft.net, idosch@...lanox.com,
eladr@...lanox.com, yotamg@...lanox.com, ogerlitz@...lanox.com,
yishaih@...lanox.com, dledford@...hat.com, sean.hefty@...el.com,
hal.rosenstock@...il.com, eugenia@...lanox.com,
roopa@...ulusnetworks.com, nikolay@...ulusnetworks.com,
hadarh@...lanox.com, jhs@...atatu.com, john.fastabend@...il.com,
jeffrey.t.kirsher@...el.com, jbenc@...hat.com
Subject: Re: [patch net-next RFC 0/6] Introduce devlink interface and first
drivers to use it
Hi,
On 08.02.2016 11:55, Jiri Pirko wrote:
> Mon, Feb 08, 2016 at 11:15:38AM CET, hannes@...essinduktion.org wrote:
>> Hello,
>>
>> On 06.02.2016 20:40, Jiri Pirko wrote:
>>> Fri, Feb 05, 2016 at 06:38:42PM CET, alexei.starovoitov@...il.com wrote:
>>>> On Fri, Feb 05, 2016 at 11:01:22AM +0100, Hannes Frederic Sowa wrote:
>>>>>
>>>>> Okay. I see it more as changing mode of operation of hardware and thus has
>>>>> not really anything to do with networking. If you say you change ethernet to
>>>>> infiniband it has something to do with networking, sure. But I am fine with
>>>>> this, I just thought the code size could be reduced by adding this to sysfs
>>>>> quite a lot. I don't have a strong opinion on this.
>>>>
>>>> there is already a way to change eth/ib via
>>>> echo 'eth' > /sys/bus/pci/drivers/mlx4_core/0000:02:00.0/mlx4_port1
>>>>
>>>> sounds like this is another way to achieve the same?
>>>
>>> It is. However the current way is driver-specific, not correct.
>>
>> Why is driver specific not correct? Actually it is very much a device
>> specific thing, isn't it?
>
> Well, adding driver specific sysfs file called "driver_name_port_type"
> does not seem correct to me.

Why? PHYs are debugged like that, aren't they? I thought sysfs in
particular is the right thing, because it makes sure we can correctly
identify a device. The logic in devlink_alloc, which just increments a
counter and lets the naming be decided at driver registration time, will
introduce the same problems that identifying devices by interfaces had
before.

>>> For mlx5, we need the same, it cannot be done in this way. So devlink is
>>> the correct way to go.
>>
>> Do two drivers already justify a new complete netlink api? Doesn't this
>> create the same problems like netdevice naming problems which needed multiple
>> years to become stable in case we have multiple cards or some administrator
>
> The thing is, other drivers would use it as well, but there's no way to
> do it :) So vendors have their proprietary configuration utils. Devlink
> objective is to avoid those, to introduce vendor-neutral interface.

Ok, agreed. But multiple drivers reuse the phy-sysfs routines, too. I
didn't see this as a problem.

Anyway, I don't care if it is sysfs or something else, I am concerned
about the atomic_inc_return based identification of those devices.
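
To make the concern concrete, here is a minimal userspace sketch. It is not the code from the patch set: struct fake_dev, register_by_counter(), stable_id() and reset_counter() are made-up names, a plain int stands in for the atomic counter, and the "devlink%d" name pattern is an assumption for illustration only.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical device record, NOT the devlink structures from the patch set. */
struct fake_dev {
	const char *bus_name;	/* e.g. "pci" */
	const char *dev_name;	/* e.g. "0000:02:00.0" */
	char counter_name[32];	/* name derived from registration order */
};

/* Plain int standing in for the atomic counter (assumption). */
static int next_index;

/* Simulate a reboot: the counter starts from zero again. */
void reset_counter(void)
{
	next_index = 0;
}

/* Counter-based naming: the result depends on who registers first. */
void register_by_counter(struct fake_dev *d)
{
	snprintf(d->counter_name, sizeof(d->counter_name),
		 "devlink%d", next_index++);
}

/* Bus name plus device name: independent of registration order. */
void stable_id(const struct fake_dev *d, char *buf, size_t len)
{
	snprintf(buf, len, "%s/%s", d->bus_name, d->dev_name);
}
```

With two devices a (pci, 0000:02:00.0) and b (pci, 0000:03:00.0), registering a and then b names a "devlink0". After reset_counter() and registration in the opposite order, the same device a is named "devlink1", while stable_id() yields "pci/0000:02:00.0" for it in both cases.
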
>> reorders the cards (biosdevorder, systemd/udev issues)? Are ports always
>> stable? How can we have a 1:1 relationship with ifindexes and how can they be
>> stable? It is impossible to use that in scripts?
>
> Port index is always set up by the driver, they have stable internal
> numbering. devlink device name is not stable (as for example netdev
> name), but can be easily identified by bus name and device name. I don't
> see a reason why udev cannot rename it according to some rules. By the
> way, this is very similar to phyX wireless devices.

Ok, understood. It just seems to be a duplication of code under another
name.

>>>> Why not hide echo/cat in iproute2 instead of adding parallel netlink api?
>>>> Or is this for switches instead of NICs?
>>>> Then why isn't it added to switchdev?
>>>
>>> Note this is not specific to switch ASICs. This is for all network devices.
>>
>> That's actually my fear. The relationship from "devlink-names" to ifindexes I
>> didn't understand at all architecturally.
>
> Again, this is very similar to phyX wireless devices.
> I don't understand the reason for your fear :)

If, as you said, this gets integrated with systemd/udev, which will
change names to stable ones before ports are switched (so we don't
accidentally switch the wrong port), I am fine with it. This is basically
how net_devices are handled.

Then my only argument is that this is too complex, but I can live with that.

Thanks,
Hannes