Message-ID: <3e37228e-c0a9-4198-98d3-a35cc77dbd94@lunn.ch>
Date: Thu, 24 Apr 2025 22:34:50 +0200
From: Andrew Lunn <andrew@...n.ch>
To: Alexander Duyck <alexander.duyck@...il.com>
Cc: Jakub Kicinski <kuba@...nel.org>, netdev@...r.kernel.org,
	linux@...linux.org.uk, hkallweit1@...il.com, davem@...emloft.net,
	pabeni@...hat.com
Subject: Re: [net-next PATCH 0/2] net: phylink: Fix issue w/ BMC link flap

Sorry for the delay, busy with $DAY_JOB

> > > > There are not 4 host MACs connected to a 5 port switch. Rather, each
> > > > host gets its own subset of queues, DMA engines etc, for one shared
> > > > MAC. Below the MAC you have all the usual PCS, SFP cage, gpios, I2C
> > > > bus, and blinky LEDs. Plus you have the BMC connected via an RMII like
> > > > interface.
> > >
> > > Yeah, that is the setup so far. Basically we are using one QSFP cable
> > > and slicing it up. So instead of having a 100CR4 connection we might
> > > have 2x50CR2 operating on the same cable, or 4x25CR.
> >
> > But for 2x50CR2 you have two MACs? And for 4x25CR 4 MACs?
> 
> Yes. Some confusion here may be that our hardware always has 4
> MAC/PCS/PMA setups, one for each host. Depending on the NIC
> configuration we may have either 4 hosts or 2 hosts present with 2
> disabled.

So with 2 hosts, each host has two netdevs? If you were to dedicate
the whole card to one host, you would have 4 netdevs? It is up to
whatever is above to perform load balancing over those?

If you always have 4 MAC/PCS instances, then the PCS is only ever used
with a single lane? The MAC does not support 100000baseKR4, for
example, but 25000baseKR1?

> The general idea is that we have to cache the page and bank in the
> driver and pass those as arguments to the firmware when we perform a
> read. Basically it will take a lock on the I2C, set the page and bank,
> perform the read, and then release the lock. With that all 4 hosts can
> read the I2C from the QSFP without causing any side effects.

I assume your hardware team has not actually implemented I2C; they
have licensed it. Hence there is probably already a driver for it in
drivers/i2c/busses, maybe one of the i2c-designware-? However, you are
not going to use it; you are going to reinvent the wheel so you can
parse the transactions going over it, looking for reads and writes to
address 127? Hmm, I suppose you could have a virtual I2C driver doing
this stacked on top of the real I2C driver. Is this something other
network drivers are going to need? Should it be somewhere in
drivers/net/phy? The hard bit is how you do the mutex in an agnostic
way. But it looks like hardware spinlocks would work:
https://docs.kernel.org/locking/hwspinlock.html
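The lock/set-page/read/unlock sequence described above could look
roughly like this user-space sketch. Everything here is an assumption
for illustration: struct qsfp_module, qsfp_locked_read() and the plain
flag standing in for the firmware-held bus lock are hypothetical names,
not from any real driver or firmware interface.

```c
#include <stdint.h>
#include <string.h>

#define QSFP_PAGE_SELECT 127   /* Page Select byte in the lower memory map */

/* Hypothetical model of one shared QSFP module: pages 00h-02h of the
 * upper memory map (bytes 128-255), plus the bus lock and cached page. */
struct qsfp_module {
	int locked;            /* stands in for the firmware-held I2C lock */
	uint8_t page;          /* cached Page Select value */
	uint8_t upper[3][128]; /* upper memory, pages 00h-02h */
};

/* One firmware-mediated read: take the lock, program the Page Select
 * byte, do the read, release.  Because the whole sequence happens under
 * the lock, none of the 4 hosts sees another host's page selection. */
static int qsfp_locked_read(struct qsfp_module *m, uint8_t page,
			    uint8_t off, uint8_t *val)
{
	if (page > 2 || off < 128)
		return -1;     /* sketch only models paged upper memory */
	if (m->locked)
		return -1;     /* real firmware would block or queue instead */
	m->locked = 1;         /* take the I2C lock */
	m->page = page;        /* write the Page Select byte */
	*val = m->upper[page][off - 128]; /* the actual I2C read */
	m->locked = 0;         /* release; no side effects remain visible */
	return 0;
}
```

The point of the sketch is only the ordering: the page write and the
data read are one atomic unit from the other hosts' point of view.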

And actually, it is more complex than caching the page.

  This specification defines functions in Pages 00h-02h. Pages 03-7Fh
  are reserved for future use. Writing the value of a non-supported
  Page shall not be accepted by the transceiver. The Page Select byte
  shall revert to 0 and read / write operations shall be to the
  unpaged A2h memory map.

So I expect the SFP driver to do a write followed by a read to know if
it needs to return EOPNOTSUPP to user space because the SFP does not
implement the page.
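A minimal sketch of that write-then-read-back check, against a mock
module that implements only pages 00h-02h. The mock itself and the
names (qsfp_mock, qsfp_select_page, QSFP_MAX_PAGE) are assumptions for
illustration, not SFP-core code:

```c
#include <errno.h>
#include <stdint.h>

#define QSFP_PAGE_SELECT 127
#define QSFP_MAX_PAGE    2  /* this mock implements pages 00h-02h only */

struct qsfp_mock {
	uint8_t page_select;
};

/* Mock register write: per the spec text quoted above, writing an
 * unsupported page is not accepted and Page Select reverts to 0. */
static void qsfp_write(struct qsfp_mock *m, uint8_t reg, uint8_t val)
{
	if (reg == QSFP_PAGE_SELECT)
		m->page_select = (val <= QSFP_MAX_PAGE) ? val : 0;
}

static uint8_t qsfp_read(struct qsfp_mock *m, uint8_t reg)
{
	return reg == QSFP_PAGE_SELECT ? m->page_select : 0;
}

/* The write-then-read-back probe the SFP driver could perform: if the
 * module reverted the Page Select byte, the page is unsupported. */
static int qsfp_select_page(struct qsfp_mock *m, uint8_t page)
{
	qsfp_write(m, QSFP_PAGE_SELECT, page);
	if (qsfp_read(m, QSFP_PAGE_SELECT) != page)
		return -EOPNOTSUPP;
	return 0;
}
```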

	Andrew
