Message-ID: <9c4f0583-3e5d-5684-87d5-1eaadfdb75b5@mellanox.com>
Date:   Sun, 31 Dec 2017 12:52:25 +0200
From:   Arkadi Sharshevsky <arkadis@...lanox.com>
To:     David Ahern <dsa@...ulusnetworks.com>,
        Yuval Mintz <yuvalm@...lanox.com>,
        Roopa Prabhu <roopa@...ulusnetworks.com>,
        Jiri Pirko <jiri@...nulli.us>
Cc:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        David Miller <davem@...emloft.net>, mlxsw <mlxsw@...lanox.com>,
        Andrew Lunn <andrew@...n.ch>,
        Vivien Didelot <vivien.didelot@...oirfairelinux.com>,
        Florian Fainelli <f.fainelli@...il.com>,
        Michael Chan <michael.chan@...adcom.com>,
        "ganeshgr@...lsio.com" <ganeshgr@...lsio.com>,
        Saeed Mahameed <saeedm@...lanox.com>,
        Matan Barak <matanb@...lanox.com>,
        Leon Romanovsky <leonro@...lanox.com>,
        Ido Schimmel <idosch@...lanox.com>,
        "jakub.kicinski@...ronome.com" <jakub.kicinski@...ronome.com>,
        "ast@...nel.org" <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Simon Horman <simon.horman@...ronome.com>,
        "pieter.jansenvanvuuren@...ronome.com" 
        <pieter.jansenvanvuuren@...ronome.com>,
        "john.hurley@...ronome.com" <john.hurley@...ronome.com>,
        Alexander Duyck <alexander.h.duyck@...el.com>,
        "John W. Linville" <linville@...driver.com>,
        Andy Gospodarek <gospo@...adcom.com>,
        Steve Lin <steven.lin1@...adcom.com>,
        Or Gerlitz <ogerlitz@...lanox.com>,
        Shrijeet Mukherjee <shm@...ulusnetworks.com>,
        Andy Roulin <aroulin@...ulusnetworks.com>
Subject: Re: [patch net-next v2 00/10] Add support for resource abstraction



On 12/30/2017 11:15 PM, David Ahern wrote:
> On 12/28/17 1:21 AM, Yuval Mintz wrote:
>> I think it goes the other way around. The dpipe tables are the ones that
>> can be translated to functionality; The resources are internal and HW-specific
>> representing the possible internal division of resources -
>> but a given resource isn't necessarily mapped to a single networking feature.
>> [It might be in some cases, but not in the general case]
> 
> This is what I am getting at -- a single resource /kvd/linear is used
> for multiple networking features, and those networking features do map
> to well known entities -- fdb entries, ACL entries, ipv4/v6 host
> entries, LPM entries, etc.
> 
> Nothing about the output from devlink helps the user in any way to
> understand how to change the resource values. Saying that these

The current patchset adds the following dpipe table <--> resource
relations:

host4 -- hash single
host6 -- hash double
adj -- linear

By dumping the resources via 'resource show' you can see the tree-like
structure, which shows that there is a tradeoff between those subparts.
So, for example, if a user would like to increase the number of nexthops
at the expense of neighbors, it is pretty clear how to do so. As more
dpipe tables are introduced, these relations will become more complete
and the user will get the complete view of the ASIC.

Just to summarize, the user gets the following info:
1. Constraints/trade-offs on setting the sizes - these you get
   from the tree structure.
2. Each hardware process which uses a resource is mapped to it.
By combining those two you can get the most accurate information
about what your change will do. Partitioning of the KVD is a very
delicate process, because the hardware is complex. Many hardware
processes point to this memory, and size changes affect the whole ASIC.
As I mentioned, as more of the pipeline is exposed via dpipe, the user
will get a more precise view of the hardware.
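
To make the two points above concrete, here is a rough sketch of how
the tree constraint and the table mapping combine. It is only an
illustration, not driver code: the sizes are invented, the Resource
class and try_resize() helper are hypothetical, and the rule that the
children must fit inside the parent is an assumption about what the
tree structure expresses.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """One node of a hypothetical resource tree."""
    name: str
    size: int
    children: list = field(default_factory=list)

    def child(self, name):
        return next(c for c in self.children if c.name == name)


# Mapping stated in this thread: dpipe table -> resource it lives in.
TABLE_TO_RESOURCE = {
    "host4": "/kvd/hash_single",
    "host6": "/kvd/hash_double",
    "adj": "/kvd/linear",
}


def tables_using(path):
    """List the dpipe tables mapped to a given resource path."""
    return sorted(t for t, p in TABLE_TO_RESOURCE.items() if p == path)


def try_resize(parent, name, new_size):
    """Assumed rule: the children sizes must fit inside the parent."""
    others = sum(c.size for c in parent.children if c.name != name)
    if new_size <= 0 or others + new_size > parent.size:
        return False
    parent.child(name).size = new_size
    return True


# Sizes below are invented for illustration only.
kvd = Resource("kvd", 1000, [
    Resource("linear", 200),
    Resource("hash_single", 500),
    Resource("hash_double", 300),
])

# More nexthops (adj -> linear) only fits after shrinking a hash part.
grow_alone = try_resize(kvd, "linear", 300)
shrink_first = try_resize(kvd, "hash_single", 400)
grow_after = try_resize(kvd, "linear", 300)
```

In this toy model growing linear on its own is refused, while the same
request succeeds once hash_single (the host4 table) has given up space.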

We will provide some recommended and tested configurations of the whole
mlxsw resource tree for different user scenarios. A more experienced
user can do it himself if he has some very special scenario.


> resources, what they mean and how they are used is MLX proprietary and
> is known only to MLX employees and those with MLX agreements is not
> acceptable. Likewise, requiring some network admin to deep dive into the
> mlxsw driver to piece together how kvd/linear (for example) is used is
> not acceptable.
> 
> The cover letter touts "Many of the ASIC's internal resources are
> limited and are shared between several hardware procedures. For example,
> unified hash-based memory can be used for many lookup purposes, like FDB
> and LPM. In many cases the user can provide a partitioning scheme for
> such a resource in order to perform fine tuning for his application."
> 
> Great, now give the user some indication of how to do that. Is setting
> /kvd/linear to 0 acceptable? If not, why? What functionality is lost?
> (Apparently, everything [1].)
>
> The dpipe tables list some correlation between the kvd resources and
> tables but that is not a complete list and again there is nothing to
> tell a user that it is only a partial list of how a kvd resource is

This is work in progress; the LPM block will be exposed as the last
L3 part. Then we will start on the L2 part of the ASIC.

> used. For example, it shows ipv4 host is in /kvd/hash_single and that is
> all it shows. So if I have an ipv6 only deployment can I conclude that I
> can set /kvd/hash_single to 0? Or the reverse, can I set hash_double to
> 0 for an ipv4 only deployment? From the limited information given, it is
> reasonable for a user to assume yes and has to learn through trial and
> error what can be done. [2]
> 

So you want to add a min/max size attribute? I think this is not needed.

> -----
> 
> [1] This is allowed by the current patch set and perhaps it should not be:
> 
> $ ip ro ls vrf vrf1101
> unreachable default metric 8192
> 11.2.51.0/24 dev swp1s0.51 proto kernel scope link src 11.2.51.1 offload
> 11.3.51.0/24 dev swp1s1.51 proto kernel scope link src 11.3.51.1 offload
> 11.4.51.0/24 dev swp1s2.51 proto kernel scope link src 11.4.51.1 offload
> 11.5.51.0/24 dev swp1s3.51 proto kernel scope link src 11.5.51.1 offload
> 11.6.51.0/24 dev swp3s0.51 proto kernel scope link src 11.6.51.1 offload
> 11.7.51.0/24 dev swp3s1.51 proto kernel scope link src 11.7.51.1 offload
> 11.8.51.0/24 dev swp3s2.51 proto kernel scope link src 11.8.51.1 offload
> 11.9.51.0/24 dev swp3s3.51 proto kernel scope link src 11.9.51.1 offload
> 
> $ devlink resource set pci/0000:03:00.0 path /kvd/linear size 0

This line actually did nothing, because size zero is not acceptable;
see patch 6. It is purely a userspace problem that the error is not
shown.

You can verify it by dumping the resources and seeing that there is no
pending change (only size and not size_new).
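
As a rough model of that check (not the actual devlink code): the
field names size and size_new come from the dump mentioned above, but
the dict layout, the request_size() helper and the value 200 are all
hypothetical.

```python
def request_size(resource, new_size):
    """Record a pending size, rejecting zero as described above."""
    if new_size <= 0:
        return False  # rejected: no size_new is recorded
    resource["size_new"] = new_size
    return True


def has_pending_change(resource):
    """A change is pending only if size_new is present in the dump."""
    return "size_new" in resource


# Hypothetical dump entry for /kvd/linear; the size is invented.
linear = {"name": "linear", "size": 200}

rejected = request_size(linear, 0)
pending_after_reject = has_pending_change(linear)

accepted = request_size(linear, 150)
pending_after_accept = has_pending_change(linear)
```

The point is that a rejected request leaves the entry with size only,
which is what the dump should show after the 'size 0' command.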

> $ devlink reload pci/0000:03:00.0
> $ ip ro ls vrf vrf1101
> unreachable default metric 8192
>

So you just performed a full reload of the driver, which includes
unregistration of all the netdevs and a full init. A KVD update requires
a full teardown of the driver.

The system will not get back to the same state after reloading;
it should be done on init. But it doesn't have to be like this:
each driver provides its own reload devlink op implementation,
and in our case a full-blown reset is required.


> [2] Same exact result for setting hash_double to 0:
> $ ip ro ls vrf vrf1101
> unreachable default metric 8192
> 11.2.51.0/24 dev swp1s0.51 proto kernel scope link src 11.2.51.1 offload
> 11.3.51.0/24 dev swp1s1.51 proto kernel scope link src 11.3.51.1 offload
> 11.4.51.0/24 dev swp1s2.51 proto kernel scope link src 11.4.51.1 offload
> 11.5.51.0/24 dev swp1s3.51 proto kernel scope link src 11.5.51.1 offload
> 11.6.51.0/24 dev swp3s0.51 proto kernel scope link src 11.6.51.1 offload
> 11.7.51.0/24 dev swp3s1.51 proto kernel scope link src 11.7.51.1 offload
> 11.8.51.0/24 dev swp3s2.51 proto kernel scope link src 11.8.51.1 offload
> 11.9.51.0/24 dev swp3s3.51 proto kernel scope link src 11.9.51.1 offload
> 
> $ devlink resource set pci/0000:03:00.0 path /kvd/hash_double size 0
> $ devlink reload pci/0000:03:00.0
> $ ip ro ls vrf vrf1101
> unreachable default metric 8192
> 
