lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2b3cabca-4710-0a71-69c7-cc433e2b3062@ti.com>
Date:   Sat, 2 Jun 2018 18:28:22 -0500
From:   Grygorii Strashko <grygorii.strashko@...com>
To:     Ilias Apalodimas <ilias.apalodimas@...aro.org>,
        Ivan Vecera <ivecera@...hat.com>, Andrew Lunn <andrew@...n.ch>
CC:     Jiri Pirko <jiri@...nulli.us>, <netdev@...r.kernel.org>,
        <ivan.khoronzhuk@...aro.org>, <nsekhar@...com>,
        <francois.ozog@...aro.org>, <yogeshs@...com>, <spatton@...com>,
        Jiri Pirko <jiri@...nulli.us>
Subject: Re: [PATCH 0/4] RFC CPSW switchdev mode

Hi All,

Sry, for delayed reply.

On 05/24/2018 09:08 AM, Ilias Apalodimas wrote:
> On Thu, May 24, 2018 at 03:44:54PM +0200, Ivan Vecera wrote:
>> On 24.5.2018 14:54, Andrew Lunn wrote:
>>> On Thu, May 24, 2018 at 11:48:31AM +0300, Ilias Apalodimas wrote:
>>>> On Thu, May 24, 2018 at 10:05:28AM +0200, Jiri Pirko wrote:
>>>>> Thu, May 24, 2018 at 08:56:20AM CEST, ilias.apalodimas@...aro.org wrote:
>>>>> Any reason you need cpu port? We don't need it in mlxsw and also in dsa.
>>>> Yes i've seen that on mlxsw/rocker drivers and i was reluctant adding one here.
>>>> The reason is that TI wants this configured differently from customer facing
>>>> ports. Apparently there are existing customers already using the "feature".
>>>> So OR'ing and adding the cpu port on every operation (add/del vlans add
>>>> ucast/mcast entries etc) was less favoured.
>>>
>>> Hi Ilias
>>>
>>> Nice to see this device moving away from its custom model and towards
>>> the switchdev model.
>> +1
> Thanks. To be honest it opens up so many posibilities for common configuration
> from userspace across vendors that doing something new without it doesn't make
> any sense (at least to me).
> 
>>
>>> Did you consider making a clean break from the existing code and write
>>> a new driver. Let the existing customers using the existing
>>> driver. Have the new switchdev driver fully conform to switchdev.
>>
>> I would also prefer fresh new driver. The existing one can be marked as
>> 'bugfix-only' and later pertinently deprecated/removed.
> Yes, but given the driver and the platforms it's used at, we ended up patching
> the existing driver. I am not opposed to the idea, but Grygorii is more suited
> to reply on that.

Correct, we considered two options - start from scratch or hack existing driver
to get working prototype as fast as possible.
Hacking of existing driver was just faster way to go and i agree that new driver 
might be better approach for the future.

> 
>>>
>>> I don't like having this 'cpu' interface. As you say, it breaks the
>>> switchhdev model. If we need to extend the switchdev model to support
>>> some use case, lets do that. Please can you fully describe the use
>>> cases, so we can discuss how to implement them cleanly within the
>>> switchdev model.
>> +1
> There's configuration needs from customers adding or not adding a VLAN to the
> CPU port. In my configuration examples for instance, if the cpu port is not
> added to the bridge, you cannot get an ip address on it.
> Similar cases exist for customers on adding MDBs as far as i know. So they want
> the "customer facing ports" to have the MDBs present but not the cpu port.
> In some cases (where the CPE/device that has the switch) participates in the
> traffic they want the cpu port to have the samne MDBs installed.
> This is just two simple cases that come in mind, again Grygorii is more suited
> to answer and explain existing/more complex use cases better than me.
> 
> Adding a cpu port that cannot transmit or receive traffic is a bit "weird", on
> the other hand you can access it's configuration using the same userspace tools
> and the same commands you do for the "normal" ports. Extending switchdev might
> be the proper solution here.

I'd try to provide more information about this switch and why we end up adding eth0 port.
Please, be patient to me as this is new area for me and I might not be right in some
conclusions or can missing smth. 

*Before this patch set*: Current CPSW driver in switch mode can be described as below
          +----------------------------------------+
          |Linux Host 0                            |
          |                                        |
          |   TI tool (ioctl)/netdev               |
          |  + +                                   |
          +-----------------+----------------------+
             | |            |
Control MMIO | |            | Data DMA
             | |            |
         +------------+-----+-----+-----------------+
         |   | |      |   CPDMA   |        CPSW     |
         |   v |      |           |                 |
         |     |      +-----------+                 |
         |     |      |    P0     |                 |
         |     |      |           |                 |
         |     |      +-----+-----+                 |
         |     |            |  FDB: static MAC Port=0
         |     |      +-----+-----+                 |
         |     +------>    ALE    +-----+           |
         |     +------+           |     |           |
         |     |      +-----------+     |           |
         | +---+----+              +----+---+       |
         | |        |              |        |       |
         | |  P1    |              |  P2    |       |
         +------------------------------------------+
           |        |              |        |
           |  PHY   |              |  PHY   |
           +---+----+              +----+---+
               |                        |
           +---+----+               +---+----+
           |        |               |        |
           |  Host 1|               | Host 2 |
           +--------+               +--------+

Note. This is embedded world and in many cases network configuration is
static, everything which is not allowed - drop (means no such things like
IGMP or even ARP).

The core Part of CPSW is ALE module which perform switching ops according 
to ALE table (fdb/mdb/vlan). From ALE point of view *all* port absolutely equal.
And Linux Host 0 in most of customer use cases is the no much different from Hosts 1 or 2,
so allowed net traffic to Linux Host 0 have to be very carefully configured as Apps running
on it must continue working even if network failed or overloaded (packet storms).

So, as per my understanding, P0 meaning here is absolutely not the same as CPU port in DSA.
It just another switch port with bad luck to be connected directly to CPU.

Current CPSW driver offloads basic switch configuration in HW and additional configuration
can be done using TI custom tool (kernel updated to accept additional IOCTL).
Current driver creates one netdev ETH0 and all net traffic enter/exit through this device
- it possible to distinguish which external port p1/2 received packet or send packet directly
to external port p1/2, but current CPSW doesn't do this.
Static Unicast FDB entries created in ALE table for Port 0 to specify which 
unicast traffic need to be accepted by Linux Host 0.  By default only registered multicast 
or broadcast packets forwarded to Linux Host 0.



*After this patch set*: goal keep things working the same as max as possible and
get rid of TI custom tool.

        +-------------------------------------------------------+
        | Linux Host 0                                          |
        |                            +------------------+       |
        |                            |            br0   |       |
        |                            |                  |       |
        |                            +---+--------------+       |
        |                                |             |data    |
        |     +-------+--------------+---------------+ |        |
        |     |       ^              ^   |           | |        |
        |     |      ++-----+      +-+---+-+    +----+-+-+      |
        |     |      |sw0p0 |      | sw0p1 |    |sw0p2   |      |
        |     |      +------+      +----^--+    +-----^--+      |
        |     |                         |             |         |
        |     |             +-----------+-------------+         |
        |     |             |                                   |
        +----+v+------------------------------------------------+
             | |            |
Control MMIO | |            | Data DMA
             | |            |
         +------------+-----+-----+-----------------------------+
         |   | |      |   CPDMA   |        CPSW                 |
         |   v |      |           |                             |
         |     |      +-----------+                             |
         |     |      |    P0     |                             |
         |     |      |           |                             |
         |     |      +-----+-----+                             |
         |     |            |        FDB: static MACP1 port=0   |
         |     |      +-----+-----+  FDB: static MACP2 port=0   |
         |     +------>    ALE    +-----+                       |
         |     +------+           |     |                       |
         |     |      +-----------+     |                       |
         | +---+----+              +----+---+                   |
         | |        |              |        |                   |
         | |  P1    |              |  P2    |                   |
         +------------------------------------------------------+
           |        |              |        |
           |  PHY   |              |  PHY   |
           +--------+              +--------+

In this implementation switch egress traffic to Linux Host 0 always
split between sw0p1 and sw0p2 depending on which external port P1/P2
packet was received, sw0p0 doesn't produce any traffic.
On switch ingress from Linux Host 0 packets are always sent directly to
external ports P1/P2, with assumption that Linux Bridge knows where packet
should go, and ALE bypassed in this case. sw0p0 doesn't allows to
send any traffic and serves for configuration purposes only.

Below I've described some tested use cases (not include full static configuration),
but regarding sw0p0 - there is work done by Ivan Khoronzhuk [1] which enables
adds MQPRIO and CBS Qdisc and targets AVB network features. It required to
offload MQPRIO and CBS parameters on all ports including P0. In case of P0,
CPDMA TX channels shapers need to be configured, and in case 
of sw0p1/sw0p2 internal FIFOS. 
sw0p0 also expected to be used to configure CPDMA interface in general -
number of tx/rx channels, rates, ring sizes.
In addition there is set of global CPSW parameters (not related to P1/P2, like
MAC Authorization Mode, OUI Deny Mode, crc ) which I've 
thought can be added to sw0p0 (using ethtool -priv-flags).

Additional headache is PTP: we have on PHC, but both external interfaces P1/P2
can timestamp packets.

[1] https://lkml.org/lkml/2018/5/18/1134


Below use cases were tried with this approach tried with current LKML,
so possible changes to Net/Bridge framework not considered:

 
1) boot, ping no vlan

# ip link add name br0 type bridge
# echo 0 > /sys/class/net/br0/bridge/default_pvid
# ip link set dev eth2 master br0
# ip link set dev eth0 master br0
# ip link set dev eth1 master br0
# ifconfig br0 192.168.1.2

*Note*: I've had to disable default_pvid as otherwise linux Bridge adds
and offloads default vlan 1, but default configuration for CPSW driver is vid 0.
+  CPSW specific - it can't untag packets for P0.
Another option I've found:
# ip link set dev br0 type bridge vlan_filtering 1.
but anyway, I've found it confusing that Linux bridge adds default vlan when vlan_filtering == 0


2) add vlans.

Host 1 - not vlan capable,
Host 2/ Linux Host 0 can handle vlans

# bridge vlan add dev sw0p1 vid 100 pvid untagged master
# bridge vlan add dev sw0p2 vid 100 master
# bridge vlan add dev sw0p0 vid 100 master

Any combination expected to work.

*Note*. Ilias reused IFF_ALLMULTI and IFF_MULTICAST on each interface to
 configure registered and unregistered multicast ports masks for each ALE VLAN
entries. So, 

# ifconfig eth0 -multicast
# ifconfig eth0 --allmulti
# bridge vlan add dev sw0p0 vid 100 master
expected to stop forwarding  any multicast traffic to Linux Host 0

3) MDB

Allow mcast 239.1.1.1 between P1/P2
# bridge mdb add dev br0 port sw0p1 grp 239.1.1.1 permanent
# bridge mdb add dev br0 port sw0p2 grp 239.1.1.1 permanent

Allow mcast 239.1.1.2 between P0/P2
# bridge mdb add dev br0 port sw0p0 grp 239.1.1.2 permanent
# bridge mdb add dev br0 port sw0p2 grp 239.1.1.2 permanent

*Note !!!!!*. I've found no proper way to add L2 mcast addresses as
01-80-C2-00-00-00 or 01-1B-19-00-00-00. probably I missing smth.

4) stp

I did some updates (added port stp state offload/stp mcats addr configuration
- will post diff on Monday) and tried stp (kernel STP) by connecting P1/P2
ports to the same switch:

# ip link add name br0 type bridge
# echo 0 > /sys/class/net/br0/bridge/default_pvid
# ip link set dev sw0p2 master br0
# ip link set dev sw0p0 master br0
# ip link set dev sw0p1 master br0
# brctl show
# ifconfig br0 192.168.1.2
# brctl stp br0 on
# brctl showstp br0

sw0p1 (3)
 port id                8003                    state                  blocking
sw0p2 (1)
 port id                8001                    state                forwarding


Don't know howto:
1) add FDB entry with "blocked" flag - ALE can discard all packets with SRC/DST
address = blocked MAC
2) add multicast MAC address with Supervisory Packet flag set. 
Such packets will bypass most of checks inside ALE and will be forwarded in all port's
states except "disabled".
3) add "unknown vlan configuration" : ALE provides possibility to configure
default behavior for tagged packets with "unknown vlan" by configuring 
- Unknown VLAN Force Untagged Egress ports Mask.
- Unknown VLAN Registered Multicast Flood Ports Mask
- Unknown VLAN Multicast Flood ports Mask
- Unknown VLAN Member ports List
4) The way to detect "brctl stp br0 on/off"


If you are here. Thanks for reading this and your patience.

-- 
regards,
-grygorii

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ