Message-ID: <20230905133312.6a29b654@wsk>
Date:   Tue, 5 Sep 2023 13:33:12 +0200
From:   Lukasz Majewski <lukma@...x.de>
To:     Vladimir Oltean <olteanv@...il.com>
Cc:     Eric Dumazet <edumazet@...gle.com>, Andrew Lunn <andrew@...n.ch>,
        davem@...emloft.net, Paolo Abeni <pabeni@...hat.com>,
        Woojung Huh <woojung.huh@...rochip.com>,
        Tristram.Ha@...rochip.com, Florian Fainelli <f.fainelli@...il.com>,
        Jakub Kicinski <kuba@...nel.org>, UNGLinuxDriver@...rochip.com,
        George McCollister <george.mccollister@...il.com>,
        Oleksij Rempel <o.rempel@...gutronix.de>,
        netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 RFC 2/4] net: dsa: Extend ksz9477 TAG setup to
 support HSR frames duplication
Hi Vladimir,
> On Tue, Sep 05, 2023 at 12:44:09PM +0200, Lukasz Majewski wrote:
> > > Not to mention that there are other problems with the
> > > "dev->hsr_ports" concept. For example, having a hsr0 over lan0
> > > and lan1, and a hsr1 over lan2 and lan3, would set dev->hsr_ports
> > > to GENMASK(3, 0).  
> > 
> > I doubt that having two hsr{01} interfaces is possible with current
> > kernel.  
> 
> You mean 2 hsr{01} interfaces not being able to coexist in general,
> or just "offloaded" ones?
The KSZ9477 IC allows only two of its five ports to be configured as
HSR ports (so that HW offloading works).

A single hsr0 over lan[12] is the use case I'm focused on (with
offloading or pure SW).
> 
> > The KSZ9477 allows only 2 of its 5 ports to be configured as HSR
> > ones.
> > 
> > The same applies to the earlier xrs700x chip (which has an even
> > bigger constraint: only ports 1 and 2 can support HSR).
> 
> > > > +	if (dev->features & NETIF_F_HW_HSR_DUP) {
> > > > +		val &= ~KSZ9477_TAIL_TAG_LOOKUP;    
> > > 
> > > No need to unset a bit which was never set.  
> > 
> > I've explicitly followed the vendor's guidelines - the TAG_LOOKUP
> > needs to be cleared.
> > 
> > But if we can assure that it is not set here I can remove it.  
> 
> Let's look at ksz9477_xmit(), filtering only for changes to "u16 val".
> 
> static struct sk_buff *ksz9477_xmit(struct sk_buff *skb,
> 				    struct net_device *dev)
> {
> 	u16 val;
> 
> 	val = BIT(dp->index);
> 
> 	val |= FIELD_PREP(KSZ9477_TAIL_TAG_PRIO, prio);
> 
> 	if (is_link_local_ether_addr(hdr->h_dest))
> 		val |= KSZ9477_TAIL_TAG_OVERRIDE;
> 
> 	if (dev->features & NETIF_F_HW_HSR_DUP) {
> 		val &= ~KSZ9477_TAIL_TAG_LOOKUP;
> 		val |= ksz_hsr_get_ports(dp->ds);
> 	}
> }
> 
> Is KSZ9477_TAIL_TAG_LOOKUP ever set in "val", or am I missing
> something?
No, you are not missing anything. KSZ9477_TAIL_TAG_LOOKUP is never set
in "val", so clearing it is overkill.
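
A minimal userspace sketch of why the clear is a no-op, modelled on the filtered ksz9477_xmit() quoted above. The bit positions and the helper names build_tail_tag() and lookup_clear_is_noop() are assumptions for illustration only; the real definitions live in the tag driver.

```c
#include <stdint.h>

/* Bit layout assumed for illustration; verify against the tag driver. */
#define TAG_BIT(n)                 (1U << (n))
#define KSZ9477_TAIL_TAG_PRIO_SHIFT 7           /* hypothetical helper */
#define KSZ9477_TAIL_TAG_OVERRIDE  TAG_BIT(9)
#define KSZ9477_TAIL_TAG_LOOKUP    TAG_BIT(10)

/* Build "val" with the same three steps as the filtered xmit path. */
static uint16_t build_tail_tag(unsigned int port, unsigned int prio,
			       int link_local)
{
	uint16_t val = (uint16_t)TAG_BIT(port);

	val |= (uint16_t)((prio & 0x3) << KSZ9477_TAIL_TAG_PRIO_SHIFT);
	if (link_local)
		val |= KSZ9477_TAIL_TAG_OVERRIDE;

	return val;
}

/* Returns 1 when clearing the LOOKUP bit leaves "val" unchanged. */
static int lookup_clear_is_noop(uint16_t val)
{
	return (uint16_t)(val & ~KSZ9477_TAIL_TAG_LOOKUP) == val;
}
```

For any of the five port indices, any priority and either link-local state, lookup_clear_is_noop(build_tail_tag(...)) holds, because none of the three steps touches the LOOKUP bit.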
> 
> > > > +		val |= ksz_hsr_get_ports(dp->ds);
> > > > +	}    
> > > 
> > > Would this work instead?
> > > 
> > > 	struct net_device *hsr_dev = dp->hsr_dev;
> > > 	struct dsa_port *other_dp;
> > > 
> > > 	dsa_hsr_foreach_port(other_dp, dp->ds, hsr_dev)
> > > 		val |= BIT(other_dp->index);
> > >   
> > 
> > I thought about this solution as well, but I was afraid that
> > looping over all 5 ports each time we want to send a single
> > packet would reduce performance.
> > 
> > Hence, the idea with having the "hsr_ports" set once during join
> > function and then use this cached value afterwards.  
> 
> There was a quote about "premature optimization" which I can't quite
> remember...
Yes, caching by default instead of iterating over the list is the
"premature optimization" ... :-)
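
To make the difference concrete, here is a userspace sketch of the per-packet loop next to a single switch-wide mask. The stub types, hsr_port_mask(), any_hsr_port_mask() and the example topology are hypothetical stand-ins, not the DSA API; the hsr1 case is the hypothetical one from the quoted discussion and is not something the KSZ9477 hardware supports.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_PORTS 5

struct hsr_dev_stub { const char *name; };  /* stand-in for struct net_device */

struct port_stub {                          /* stand-in for struct dsa_port */
	unsigned int index;
	const struct hsr_dev_stub *hsr_dev; /* NULL when not an HSR member */
};

/* Per-packet loop: OR in only the ports that belong to the given HSR device. */
static uint16_t hsr_port_mask(const struct port_stub *ports, size_t n,
			      const struct hsr_dev_stub *hsr)
{
	uint16_t mask = 0;
	size_t i;

	if (!hsr)
		return 0;
	for (i = 0; i < n; i++)
		if (ports[i].hsr_dev == hsr)
			mask |= (uint16_t)(1U << ports[i].index);

	return mask;
}

/* Switch-wide mask: every port that is a member of any HSR device. */
static uint16_t any_hsr_port_mask(const struct port_stub *ports, size_t n)
{
	uint16_t mask = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (ports[i].hsr_dev)
			mask |= (uint16_t)(1U << ports[i].index);

	return mask;
}

/* Example topology: hsr0 over lan0/lan1, hsr1 over lan2/lan3, port 4 plain. */
static const struct hsr_dev_stub example_hsr0 = { "hsr0" };
static const struct hsr_dev_stub example_hsr1 = { "hsr1" };
static const struct port_stub example_ports[NUM_PORTS] = {
	{ 0, &example_hsr0 }, { 1, &example_hsr0 },
	{ 2, &example_hsr1 }, { 3, &example_hsr1 },
	{ 4, NULL },
};
```

With this topology the per-device loop yields 0x3 for hsr0 and 0xc for hsr1, while the switch-wide mask is 0xf, which is the GENMASK(3, 0) problem raised earlier in the thread.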
> 
> If you can see a measurable performance difference, then the list
> traversal can be converted to something more efficient.
> 
> In this case, struct dsa_port :: hsr_dev can be converted to a larger
> struct dsa_hsr structure, similar to struct dsa_port :: bridge.
> That structure could look like this:
> 
> struct dsa_hsr {
> 	struct net_device *dev;
> 	unsigned long port_mask;
> 	refcount_t refcount;
> };
> 
> and you could replace the list traversal with "val |=
> dp->hsr->port_mask". But a more complex solution requires a
> justification, which in this case is performance-related. So
> performance data must be gathered.
> 
> FWIW, dsa_master_find_slave() also performs a list traversal.
> But similar discussions about performance improvements didn't lead
> anywhere.
Iterating over the HSR ports would simplify the code. I will use it and
report back if I see a performance drop.
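
For comparison, should measurements ever justify it, the cached variant described above could look roughly like this in userspace. The struct is only modelled on the quoted struct dsa_hsr; hsr_group_stub, its plain integer counter (in place of refcount_t) and the join/leave helpers are hypothetical names, not existing kernel functions.

```c
#include <stdint.h>

struct hsr_group_stub {
	unsigned long port_mask; /* one bit per member port */
	unsigned int refcount;   /* plain counter standing in for refcount_t */
};

/* Called once when a port joins the HSR device. */
static void hsr_group_join(struct hsr_group_stub *g, unsigned int port)
{
	if (!(g->port_mask & (1UL << port))) {
		g->port_mask |= 1UL << port;
		g->refcount++;
	}
}

/* Called once when a port leaves the HSR device. */
static void hsr_group_leave(struct hsr_group_stub *g, unsigned int port)
{
	if (g->port_mask & (1UL << port)) {
		g->port_mask &= ~(1UL << port);
		g->refcount--;
	}
}

/* Per-packet path: no loop, just reuse the mask kept up to date above. */
static uint16_t hsr_group_tag_bits(const struct hsr_group_stub *g)
{
	return (uint16_t)g->port_mask;
}
```

The mask is per HSR device rather than per switch, so a second HSR device would get its own hsr_group_stub and the two masks would never be merged.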
Thanks for the feedback.
Best regards,
Lukasz Majewski
--
DENX Software Engineering GmbH,      Managing Director: Erika Unter
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@...x.de