netdev - Re: [PATCH net-next 00/14] mlxsw: Various trap changes


Message-ID: <20200527125017.1c960f70@kicinski-fedora-PC1C0HJN.hsd1.ca.comcast.net>
Date:   Wed, 27 May 2020 12:50:17 -0700
From:   Jakub Kicinski <kuba@...nel.org>
To:     Ido Schimmel <idosch@...sch.org>
Cc:     netdev@...r.kernel.org, davem@...emloft.net, jiri@...lanox.com,
        mlxsw@...lanox.com, Ido Schimmel <idosch@...lanox.com>
Subject: Re: [PATCH net-next 00/14] mlxsw: Various trap changes - part 2

On Wed, 27 May 2020 10:38:57 +0300 Ido Schimmel wrote:
> There is no special sauce required to get a DHCP daemon working nor BFD.
> It is supposed to Just Work. Same for IGMP / MLD snooping, STP etc. This
> is enabled by the ASIC trapping the required packets to the CPU.
> 
> However, having a 3.2/6.4/12.8 Tbps ASIC (it keeps growing all the time)
> send traffic to the CPU can very easily result in denial of service. You
> need to have hardware policers and classification to different traffic
> classes ensuring the system remains functional regardless of the havoc
> happening in the offloaded data path.

I don't see how that's only applicable to a switch ASIC, though.
Ingress classification and rate limiting apply to any network
system.
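
For readers unfamiliar with the term, the kind of rate limiting discussed here can be illustrated with a token-bucket policer that also counts what it drops. This is only a minimal software sketch, not how mlxsw or any ASIC implements it; the class name `Policer` and the parameters `rate_pps` and `burst` are hypothetical names chosen for illustration.

```python
class Policer:
    """Token-bucket policer (illustrative sketch, not a driver API).

    Admits packets at up to `rate_pps` packets per second, allowing
    bursts of up to `burst` packets, and counts every dropped packet.
    """

    def __init__(self, rate_pps, burst):
        self.rate_pps = rate_pps      # tokens (packets) refilled per second
        self.burst = burst            # bucket depth, in packets
        self.tokens = float(burst)    # start with a full bucket
        self.last = 0.0               # timestamp of the previous packet
        self.dropped = 0              # visibility: packets dropped so far

    def admit(self, now):
        """Return True if a packet arriving at time `now` (seconds) passes."""
        elapsed = now - self.last
        self.last = now
        # Refill according to elapsed time, capped at the bucket depth.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate_pps)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        self.dropped += 1
        return False


# Eight back-to-back packets against a burst of 5: the last three are dropped.
p = Policer(rate_pps=10, burst=5)
results = [p.admit(0.0) for _ in range(8)]
```

The same logic applies whether the bucket sits in a switch ASIC, a NIC, or software; only the place where `admit` runs changes.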

> This control plane policy has been hard coded in mlxsw for a few years
> now (based on sane defaults), but it obviously does not fit everyone's
> needs. Different users have different use cases and different CPUs
> connected to the ASIC. Some have Celeron / Atom while others have more
> high-end Xeon CPUs, which are obviously capable of handling more packets
> per second. You also have zero visibility into how many packets were
> dropped by these hardware policers.

There are embedded Atom systems out there with multi-gig interfaces;
they obviously can't ingest peak traffic, regardless of whether they
are connected to a switch ASIC or a NIC.

> By exposing these traps we allow users to tune these policers and get
> visibility into how many packets they dropped. In the future also
> changing their traffic class, so that (for example), packets hitting
> local routes are scheduled towards the CPU before packets dropped due to
> ingress VLAN filter.
> 
> If you don't have any special needs you are probably OK with the
> defaults, in which case you don't need to do anything (no special
> sauce).
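
The traffic-class idea in the quoted text (packets hitting local routes reaching the CPU before packets dropped by the ingress VLAN filter) can be sketched as a strict-priority queue. This is an assumption-laden illustration: the quoted text only says one class is scheduled before the other, so strict priority, the class name `CpuQueue`, and the traffic-class numbers below are all hypothetical.

```python
import heapq
import itertools


class CpuQueue:
    """Strict-priority queue towards the CPU (illustrative sketch)."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker: FIFO within a class

    def enqueue(self, traffic_class, packet):
        # heapq is a min-heap, so negate the class to serve higher classes first.
        heapq.heappush(self._heap, (-traffic_class, next(self._seq), packet))

    def dequeue(self):
        _, _, packet = heapq.heappop(self._heap)
        return packet


# Hypothetical class numbers: 5 for local-route traffic, 0 for VLAN-filter drops.
q = CpuQueue()
q.enqueue(0, "vlan_filter_drop_1")
q.enqueue(5, "local_route_1")
q.enqueue(0, "vlan_filter_drop_2")
q.enqueue(5, "local_route_2")
order = [q.dequeue() for _ in range(4)]
```

With these inputs the local-route packets leave the queue first, and arrival order is preserved within each class.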

As much as traps which forward traffic to the CPU fit the switch
programming model, we'd rather see a solution that offloads constructs
which are also applicable to the software world.

Sniffing dropped frames to troubleshoot is one thing, but IMHO traps
which default to "trap" are a bad smell.