Message-ID: <Z2FGjeyawnhABnRb@pengutronix.de>
Date: Tue, 17 Dec 2024 10:38:21 +0100
From: Oleksij Rempel <o.rempel@...gutronix.de>
To: Andrew Lunn <andrew@...n.ch>
Cc: Lorenzo Bianconi <lorenzo@...nel.org>,
	Vladimir Oltean <olteanv@...il.com>,
	Oleksij Rempel <linux@...pel-privat.de>, netdev@...r.kernel.org,
	davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
	pabeni@...hat.com, horms@...nel.org, nbd@....name,
	sean.wang@...iatek.com, Mark-MC.Lee@...iatek.com,
	lorenzo.bianconi83@...il.com
Subject: Re: [RFC net-next 0/5] Add ETS and TBF Qdisc offload for Airoha
 EN7581 SoC

On Tue, Dec 17, 2024 at 12:24:13AM +0100, Andrew Lunn wrote:
> > Considering patch [0], we are still offloading the Qdisc on the provided
> > DSA switch port (e.g. LANx) via the port_setup_tc() callback available in
> > dsa_user_setup_qdisc(), but we are introducing even the ndo_setup_tc_conduit()
> > callback in order to use the hw Qdisc capabilities available on the mac chip
> > (e.g. EN7581) for the routed traffic from WAN to LANx. We will still apply
> > the Qdisc defined on LANx for L2 traffic from LANy to LANx. Agree?
> 
> I've not read all the details, so i could be getting something
> wrong. But let me point out the basics. Offloading is used to
> accelerate what Linux already supports in software. So forget about
> your hardware. How would i configure a bunch of e1000e cards connected
> to a software bridge to do what you want?
> 
> There is no conduit interface in this, so i would not expect to
> explicitly configure a conduit interface. Maybe the offloading needs
> to implicitly configure the conduit, but that should be all hidden
> away from the user. But given the software bridge has no concept of a
> conduit, i doubt it.
> 
> It could well be our model does not map to the hardware too well,
> leaving some bits unusable, but there is not much you can do about
> that, that is the Linux model, accelerate what Linux supports in
> software.

Hi,

You are absolutely correct that offloading should accelerate what Linux already
supports in software, and we need to respect this model. However, I’d like to
step back for a moment to clarify the underlying problem before focusing too
much on solutions.

### The Core Problem: Flow Control Limitations

1. **QoS and Flow Control:** 

   At the heart of proper QoS implementation lies flow control. Flow control
   mechanisms exist at various levels:

   - MAC-level signaling (e.g., pause frames)

   - Queue management (e.g., stopping queues when the hardware is congested)

   A typical Linux driver uses flow control signaling from the MAC (e.g.,
   stopping its TX queues) to coordinate traffic, and, depending on the Qdisc,
   this flow control can propagate all the way up to user-space applications
   (see the short driver sketch after this list).

2. **Challenges with DSA:**
   In DSA, we lose direct **flow control communication** between:

   - The host MAC

   - The MAC of a DSA user port.

   While internal flow control within the switch may still work, it does not
   extend to the host. Specifically:

   - Pause frames often affect **all priorities** and are not granular enough
     for low-latency applications.

   - The signaling from the MAC of the DSA user port to the host is either
     **not supported** or is **disabled** (often through device tree
     configuration).
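
To make the first point concrete, below is a minimal sketch of the usual
backpressure pattern in an Ethernet driver. It is not the Airoha or DSA code;
all example_* names and the ring helpers are placeholders, and only
netif_stop_queue()/netif_wake_queue(), netdev_priv() and the ndo_start_xmit()
return codes are real kernel API.

#include <linux/netdevice.h>
#include <linux/skbuff.h>

struct example_priv {
        struct net_device *ndev;
        /* ... TX ring state ... */
};

/* Placeholder ring helpers; a real driver has its own. */
bool example_tx_ring_full(struct example_priv *priv);
void example_tx_map_and_kick(struct example_priv *priv, struct sk_buff *skb);
void example_tx_reclaim(struct example_priv *priv);

static netdev_tx_t example_start_xmit(struct sk_buff *skb,
                                      struct net_device *ndev)
{
        struct example_priv *priv = netdev_priv(ndev);

        if (example_tx_ring_full(priv)) {
                /* Congested: stop the queue so the Qdisc above starts
                 * queueing and, depending on Qdisc and socket, throttles
                 * the application. */
                netif_stop_queue(ndev);
                return NETDEV_TX_BUSY;
        }

        example_tx_map_and_kick(priv, skb);
        return NETDEV_TX_OK;
}

static void example_tx_completion(struct example_priv *priv)
{
        example_tx_reclaim(priv);

        /* Room again: let the Qdisc resume dequeueing towards us. */
        if (netif_queue_stopped(priv->ndev) && !example_tx_ring_full(priv))
                netif_wake_queue(priv->ndev);
}

In DSA this chain stops at the conduit; the user port's own congestion is
invisible to it.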

### Why This Matters for QoS

For traffic flowing **from the host** to DSA user ports:

- Without proper flow control, congestion cannot be communicated back to the
  host, leading to buffer overruns and degraded QoS.  

- To address this, we need to compensate for the lack of flow control signaling
  by applying traffic limits (or shaping).

### Approach: Applying Limits on the Conduit Interface

One way to solve this is to apply traffic shaping or limits directly on the
**conduit MAC**. However, this approach comes with significant complexity:

1. **Hardware-Specific Details:**

   We would need deep hardware knowledge to set up traffic filters or dissectors
   at the conduit level (see the illustrative snippet after this list). This
   includes:

   - Parsing **CPU tags** specific to the switch in use.  

   - Applying port-specific rules, some of which depend on **user port link
     speed**.

2. **Admin Burden:**

   Forcing network administrators to configure conduit-specific filters
   manually increases complexity and goes against the existing DSA abstractions,
   which are already well-integrated into the kernel.
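
For illustration of the first point, this is roughly what "parsing the CPU tag"
would mean for a conduit-level classifier. The tag position and bit layout
below are invented for the example; every switch family (net/dsa/tag_*.c)
defines its own format.

#include <linux/etherdevice.h>
#include <linux/skbuff.h>

/* Illustration only: extract the destination user port from a CPU tag on a
 * frame handed to the conduit for transmission.  The tag offset (right after
 * the MAC addresses) and the 3-bit port field are made up. */
static int example_cpu_tag_dest_port(const struct sk_buff *skb)
{
        const u8 *tag = skb->data + 2 * ETH_ALEN;

        return tag[3] & 0x07;
}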


### How Things Can Be Implemented

To address QoS for host-to-user port traffic in DSA, I see two possible
approaches:

#### 1. Apply Rules on the Conduit Port (Using `dst_port`)

In this approach, rules are applied to the **conduit interface**, and specific
user ports are matched using **port indices**.

# Conduit interface
tc qdisc add dev conduit0 clsact

# Match traffic for user port 1 (e.g., lan0)
tc filter add dev conduit0 egress flower dst_port 1 \
    action police rate 50mbit burst 5k drop

# Match traffic for user port 2 (e.g., lan1)
tc filter add dev conduit0 egress flower dst_port 2 \
    action police rate 30mbit burst 3k drop

#### 2. Apply Rules Directly on the User Ports (With Conduit Marker)

In this approach, rules are applied **directly to the user-facing DSA ports**
(e.g., `lan0`, `lan1`) with a **conduit-specific marker**. The kernel resolves
the mapping internally.

# Apply rules with conduit marker for user ports
tc qdisc add dev lan0 root tbf rate 50mbit burst 5k conduit-only
tc qdisc add dev lan1 root tbf rate 30mbit burst 3k conduit-only

Here:  
- **`conduit-only`**: A marker (flag) indicating that the rule applies
specifically to **host-to-port traffic** and not to L2-forwarded traffic within
the switch.  
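
For what it's worth, the kernel-side dispatch could look roughly like the
hypothetical sketch below. Every example_* name and the argument layout are
placeholders, not the RFC code; the real plumbing would be the
dsa_user_setup_qdisc()/port_setup_tc() path and the proposed
ndo_setup_tc_conduit() discussed above.

#include <linux/netdevice.h>
#include <net/pkt_cls.h>

/* Placeholder lookups/dispatchers for the sketch. */
struct net_device *example_conduit_of(struct net_device *user_dev);
int example_setup_tc_on_conduit(struct net_device *conduit,
                                struct net_device *user_dev,
                                enum tc_setup_type type, void *type_data);
int example_port_setup_tc(struct net_device *user_dev,
                          enum tc_setup_type type, void *type_data);

/* Hypothetical only: a TBF request on a user port that carries the
 * "conduit-only" marker is programmed on the conduit (host MAC) instead of
 * the switch port, so only host-to-LANx traffic is shaped while
 * L2-forwarded traffic inside the switch is left untouched. */
static int example_user_setup_tbf(struct net_device *user_dev,
                                  struct tc_tbf_qopt_offload *qopt,
                                  bool conduit_only)
{
        if (conduit_only) {
                struct net_device *conduit = example_conduit_of(user_dev);

                /* Would end up in the proposed ndo_setup_tc_conduit(). */
                return example_setup_tc_on_conduit(conduit, user_dev,
                                                   TC_SETUP_QDISC_TBF, qopt);
        }

        /* Default: today's behaviour, offload on the switch port itself. */
        return example_port_setup_tc(user_dev, TC_SETUP_QDISC_TBF, qopt);
}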

### Recommendation

The second approach (**user port-based with `conduit-only` marker**) is cleaner
and more intuitive. It avoids exposing hardware details like port indices while
letting the kernel handle conduit-specific behavior transparently.

Best regards,  
Oleksij
-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
