Date:   Thu, 9 Dec 2021 17:33:17 +0000
From:   Vladimir Oltean <vladimir.oltean@....com>
To:     Ansuel Smith <ansuelsmth@...il.com>
CC:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>, Andrew Lunn <andrew@...n.ch>,
        Vivien Didelot <vivien.didelot@...il.com>,
        Florian Fainelli <f.fainelli@...il.com>
Subject: Re: [RFC PATCH net-next 0/7] DSA master state tracking

On Thu, Dec 09, 2021 at 03:44:14PM +0100, Ansuel Smith wrote:
> On Thu, Dec 09, 2021 at 02:28:30PM +0000, Vladimir Oltean wrote:
> > On Thu, Dec 09, 2021 at 04:05:59AM +0100, Ansuel Smith wrote:
> > > On Thu, Dec 09, 2021 at 12:32:23AM +0200, Vladimir Oltean wrote:
> > > > This patch set is provided solely for review purposes (therefore not to
> > > > be applied anywhere) and for Ansuel to test whether they resolve the
> > > > slowdown reported here:
> > > > https://patchwork.kernel.org/project/netdevbpf/cover/20211207145942.7444-1-ansuelsmth@gmail.com/
> > > > 
> > > > It does conflict with net-next due to other patches that are in my tree,
> > > > and which were also posted here and would need to be picked ("Rework DSA
> > > > bridge TX forwarding offload API"):
> > > > https://patchwork.kernel.org/project/netdevbpf/cover/20211206165758.1553882-1-vladimir.oltean@nxp.com/
> > > > 
> > > > Additionally, for Ansuel's work there is also a logical dependency with
> > > > this series ("Replace DSA dp->priv with tagger-owned storage"):
> > > > https://patchwork.kernel.org/project/netdevbpf/cover/20211208200504.3136642-1-vladimir.oltean@nxp.com/
> > > > 
> > > > To get both dependency series, the following commands should be sufficient:
> > > > git b4 20211206165758.1553882-1-vladimir.oltean@....com
> > > > git b4 20211208200504.3136642-1-vladimir.oltean@....com
> > > > 
> > > > where "git b4" is an alias in ~/.gitconfig:
> > > > [b4]
> > > > 	midmask = https://lore.kernel.org/r/%25s
> > > > [alias]
> > > > 	b4 = "!f() { b4 am -t -o - $@ | git am -3; }; f"
> > > > 
> > > > The patches posted here are mainly to offer a consistent
> > > > "master_up"/"master_going_down" chain of events to switches, without
> > > > duplicates, and always starting with "master_up" and ending with
> > > > "master_going_down". This way, drivers should know when they can perform
> > > > Ethernet-based register access.
> > > > 
> > > > Vladimir Oltean (7):
> > > >   net: dsa: only bring down user ports assigned to a given DSA master
> > > >   net: dsa: refactor the NETDEV_GOING_DOWN master tracking into separate
> > > >     function
> > > >   net: dsa: use dsa_tree_for_each_user_port in
> > > >     dsa_tree_master_going_down()
> > > >   net: dsa: provide switch operations for tracking the master state
> > > >   net: dsa: stop updating master MTU from master.c
> > > >   net: dsa: hold rtnl_mutex when calling dsa_master_{setup,teardown}
> > > >   net: dsa: replay master state events in
> > > >     dsa_tree_{setup,teardown}_master
> > > > 
> > > >  include/net/dsa.h  |  8 +++++++
> > > >  net/dsa/dsa2.c     | 52 ++++++++++++++++++++++++++++++++++++++++++++--
> > > >  net/dsa/dsa_priv.h | 11 ++++++++++
> > > >  net/dsa/master.c   | 29 +++-----------------------
> > > >  net/dsa/slave.c    | 32 +++++++++++++++-------------
> > > >  net/dsa/switch.c   | 29 ++++++++++++++++++++++++++
> > > >  6 files changed, 118 insertions(+), 43 deletions(-)
> > > > 
> > > > -- 
> > > > 2.25.1
> > > > 
> > > 
> > > I applied this patch and it does work correctly. Sadly, the problem is
> > > not solved and the packets are still not tracked correctly. What I notice
> > > is that everything starts to work as soon as the master is set to
> > > promiscuous mode. I wonder if we should track that event instead of a
> > > simple up?
> > > 
> > > Here is a bootlog [0]. I added some logging for when the function times
> > > out and when master up is actually called.
> > > The current implementation for this is just a bool that is set to true on
> > > master up and false on master going down. (The final version should use
> > > locking to check if an Ethernet transaction is in progress.)
> > > 
> > > [0] https://pastebin.com/7w2kgG7a
> > 
> > This is strange. What MAC DA do the ack packets have? Could you give us
> > a pcap with the request and reply packets (not necessarily now)?
> 
> If you want, I can give you a pcap from a router bootup to the setup with
> no Ethernet cable attached. I notice the switch sends some packets at
> bootup for some reason, but they are not Ethernet mdio packets or other
> types. It seems they are not even tagged (they don't have the qca tag), as
> the header mode is disabled by default.
> Let me know if you need just a pcap for the Ethernet mdio transaction or
> one from a bootup. I assume it would be better from a bootup? (There are
> not tons of packets and the mdio Ethernet ones are easy to notice.)

Anything that contains some request and response packets should do, as
long as they're relatively easy to spot. But as stated, this can wait
for a while: after your second reply, I don't think that promiscuity is
the issue.

> > Can you try to set ".promisc_on_master = true" in qca_netdev_ops?
> 
> I already tried and here [0] is a log. I notice with promisc_on_master
> the "eth0 entered promiscuous mode" is missing. Is that correct?
> Unless I was tired and misread the code, the info should be printed
> anyway. Also looking at the comments for promisc_on_master I don't think
> that should be applied to this tagger.
> 
> [0] https://pastebin.com/MN2ttVpr

It isn't missing, it's right there on line 11.
I think the problem is that we also need to track the operstate of the
master (netif_oper_up via NETDEV_CHANGE) before declaring it as good to go.
You can see that this is exactly the line after which the timeouts disappear:

[    7.146901] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready

I didn't really want to go there, because now I'm not sure how to
synthesize the information for the switch drivers to consume it.
Anyway I've prepared a v2 patchset and I'll send it out very soon.
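
To make the operstate idea above concrete, here is a minimal standalone
sketch, not the actual v2 code. It assumes the master only counts as
"operational" when it is both administratively up (NETDEV_UP seen, no
NETDEV_GOING_DOWN since) and netif_oper_up() would return true, and that the
switch driver should be told only when that combined state really changes.
The struct and both function names are hypothetical and exist only for this
illustration; nothing here is kernel API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a DSA master; not the kernel's struct net_device. */
struct master_model {
	bool admin_up;      /* set on NETDEV_UP, cleared on NETDEV_GOING_DOWN */
	bool oper_up;       /* what netif_oper_up() would report on NETDEV_CHANGE */
	bool operational;   /* last combined state reported to the switch driver */
	int notifications;  /* number of state changes reported so far */
};

/* Stand-in for a switch driver callback; the name is an assumption. */
static void model_notify_switch(struct master_model *m, bool operational)
{
	m->operational = operational;
	m->notifications++;
}

/*
 * Record the latest admin and oper state of the master, then notify the
 * switch only if the combined state differs from what was last reported.
 * This keeps the event chain free of duplicates.
 */
static void model_master_state_change(struct master_model *m,
				      bool admin_up, bool oper_up)
{
	bool was, now;

	assert(m);

	was = m->operational;
	m->admin_up = admin_up;
	m->oper_up = oper_up;
	now = admin_up && oper_up;

	if (now != was)
		model_notify_switch(m, now);
}
```

With this model, bringing the master up without link produces no
notification, the later link-up produces exactly one, and a repeated
identical event produces none. That matches the bootlog, where the timeouts
only stop once the link is reported ready.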
