lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <trinity-1d4cc306-d1a4-4ccf-b853-d315553515ce-1666543305596@3c-app-gmx-bs01>
Date:   Sun, 23 Oct 2022 18:41:45 +0200
From:   Frank Wunderlich <frank-w@...lic-files.de>
To:     "Russell King (Oracle)" <linux@...linux.org.uk>
Cc:     Frank Wunderlich <linux@...web.de>,
        linux-mediatek@...ts.infradead.org,
        Alexander Couzens <lynxis@...0.eu>,
        Felix Fietkau <nbd@....name>, John Crispin <john@...ozen.org>,
        Sean Wang <sean.wang@...iatek.com>,
        Mark Lee <Mark-MC.Lee@...iatek.com>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        Matthias Brugger <matthias.bgg@...il.com>,
        netdev@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        linux-kernel@...r.kernel.org
Subject: Aw: Re: [PATCH v2] net: mtk_sgmii: implement mtk_pcs_ops

> Gesendet: Sonntag, 23. Oktober 2022 um 17:46 Uhr
> Von: "Russell King (Oracle)" <linux@...linux.org.uk>
> On Sun, Oct 23, 2022 at 05:05:30PM +0200, Frank Wunderlich wrote:
> > > Gesendet: Sonntag, 23. Oktober 2022 um 11:43 Uhr
> > > Von: "Russell King (Oracle)" <linux@...linux.org.uk>
> >
> > > On Sun, Oct 23, 2022 at 09:26:39AM +0200, Frank Wunderlich wrote:
> > > > > Gesendet: Samstag, 22. Oktober 2022 um 21:18 Uhr
> > > > > Von: "Russell King (Oracle)" <linux@...linux.org.uk>
> > > > > Hi,
> > > > >
> > > > > On Sat, Oct 22, 2022 at 07:53:16PM +0200, Frank Wunderlich wrote:
> > > > > > > Gesendet: Samstag, 22. Oktober 2022 um 19:05 Uhr
> > > > > > > Von: "Russell King (Oracle)" <linux@...linux.org.uk>
> > > > > > > On Sat, Oct 22, 2022 at 12:52:00PM +0200, Frank Wunderlich wrote:
> > > > > > > > > Gesendet: Samstag, 22. Oktober 2022 um 11:11 Uhr
> > > > > > > > > Von: "Russell King (Oracle)" <linux@...linux.org.uk>
> > > > > >
> > > > > > > > this patch breaks connectivity at least on the sfp-port (eth1).
> > > > > >
> > > > > > > > pcs_get_state
> > > > > > > > [   65.522936] offset:0 0x2c1140
> > > > > > > > [   65.522950] offset:4 0x4d544950
> > > > > > > > [   65.525914] offset:8 0x40e041a0
> > > > > > > > [  177.346183] offset:0 0x2c1140
> > > > > > > > [  177.346202] offset:4 0x4d544950
> > > > > > > > [  177.349168] offset:8 0x40e041a0
> > > > > > > > [  177.352477] offset:0 0x2c1140
> > > > > > > > [  177.356952] offset:4 0x4d544950
> > > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > Thanks. Well, the results suggest that the register at offset 8 is
> > > > > > > indeed the advertisement and link-partner advertisement register. So
> > > > > > > we have a bit of progress and a little more understanding of this
> > > > > > > hardware.
> > > > > > >
> > > > > > > Do you know if your link partner also thinks the link is up?
> > > > > >
> > > > > > yes link is up on my switch, cannot enable autoneg for fibre-port, so port is fixed to 1000M/full flowcontrol enabled.
> > > > > >
> > > > > > > What I notice is:
> > > > > > >
> > > > > > > mtk_soc_eth 15100000.ethernet eth1: Link is Up - 1Gbps/Unknown - flow control off
> > > > > > >
> > > > > > > The duplex is "unknown" which means you're not filling in the
> > > > > > > state->duplex field in your pcs_get_state() function. Given the
> > > > > > > link parter adverisement is 0x00e0, this means the link partner
> > > > > > > supports PAUSE, 1000base-X/Half and 1000base-X/Full. The resolution
> > > > > > > is therefore full duplex, so can we hack that in to your
> > > > > > > pcs_get_state() so we're getting that right for this testing please?
> > > > > >
> > > > > > 0xe0 is bits 5-7 are set (in lower byte from upper word)..which one is for duplex?
> > > > > >
> > > > > > so i should set state->duplex/pause based on this value (maybe compare with own caps)?
> > > > > >
> > > > > > found a documentation where 5=full,6=half, and bits 7+8 are for pause (symetric/asymetric)
> > > > > >
> > > > > > regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1+8, &val);
> > > > > > partner_advertising = (val & 0x00ff0000) >> 16;
> > > > >
> > > > > Not quite :) When we have the link partner's advertisement and the BMSR,
> > > > > we have a helper function in phylink to do all the gritty work:
> > > > >
> > > > > 	regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1, &bm);
> > > > > 	regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1 + 8, &adv);
> > > > >
> > > > > 	phylink_mii_c22_pcs_decode_state(state, bm >> 16, adv >> 16);
> > > > >
> > > > > will do all the work for you without having to care about whether
> > > > > you're operating at 2500base-X, 1000base-X or SGMII mode.
> > > > >
> > > > > > > Now, I'm wondering what SGMII_IF_MODE_BIT0 and SGMII_IF_MODE_BIT5 do
> > > > > > > in the SGMSYS_SGMII_MODE register. Does one of these bits set the
> > > > > > > format for the 16-bit control word that's used to convey the
> > > > > > > advertisements. I think the next step would be to play around with
> > > > > > > these and see what effect setting or clearing these bits has -
> > > > > > > please can you give that a go?
> > > > > >
> > > > > > these is not clear to me...should i blindly set these and how to
> > > > > > verify what they do?
> > > > >
> > > > > Yes please - I don't think anyone knows what they do.
> > > >
> > > > i guess BIT0 is the SGMII_EN flag like in other sgmii implementations.
> > > > Bit5 is "reserved" in all docs i've found....maybe it is related to HSGMII
> > > > or for 1G vs. 2G5.
> > >
> > > "other sgmii implementations" ?
> >
> > yes i googled for sgmii and found register definition for different vendor...
> > i don't know if sgmii is a standard for each vendor, afair trgmii was not.
> >
> > > If this is the SGMII_EN flag, maybe SGMII_IF_MODE_BIT0 should be
> > > renamed to SGMII_IF_SGMII_EN ? Maybe it needs to be set for SGMII
> > > and clear for base-X ?
> > >
> > > > but how to check what has changed...i guess only the register itself changed
> > > > and i have to readout another to check whats changed.
> > > >
> > > > do we really need these 2 bits? reading/setting duplex/pause from/to the register
> > > > makes sense, but digging into undocumented bits is much work and we still only guess.
> > >
> > > I don't know - I've no idea about this hardware, or what the PCS is,
> > > and other people over the years I've talked to have said "we're not
> > > using it, we can't help". The mediatek driver has been somewhat of a
> > > pain for phylink as a result.
> > >
> > > > so i would first want to get sgmii working again and then strip the pause/duplex from it.
> > >
> > > I think I'd need more information on your setup - is this dev 0? Are
> > > you using in-band mode or fixed-link mode?
> >
> > i only test with dev1 which is the sfp-port/eth1/gmac1...dev0 is the fixed-link to switch-chip.
>
> Okay, so when you're using SGMII, how are you testing it? With a copper
> SFP plugged in?
>
> > > I don't think you've updated me with register values for this since
> > > the patch. With the link timer adjusted back to 1.6ms, that should
> > > result in it working again, but if not, I think there's some
> > > possibilities.
> >
> > sorry for that, have debugged timing and it was wrong because if-
> > condition had not included 1000baseX and 2500baseX. only sgmii
>
> SGMII's link timer is specified to be 1.6ms - the SGMII v1.8 spec
> doesn't specify the margins for this.
>
> 802.3z (1000base-X) is 10ms +10ms -0s.
>
> This is what we should be using, and what I tried to implement.
>
> The hex values programmed into the register should be 0x186A0 for
> SGMII and 0x98968 for 1000base-X and 2500base-X - both values
> should fit because the link timer is apparently 20 bits wide.
>
> > > The addition of SGMII_AN_ENABLE for SGMSYS_PCS_CONTROL_1 could have
> > > broken your setup if there is no actual in-band signalling, which
> > > basically means that your firmware description is wrong - and you've
> > > possibly been led astray by the poor driver implementation.
> >
> > disabled it, but makes no change.
> >
> > but i've noticed that timing is wrong
> >
> > old value: 0x186a0
> > new value: 0x98968
> >
> > so it takes the 10000000 and not the 1600000. so it looks like interface-mode is not (yet) SGMII.
> >
> > debugged it and got 1000baseX (21),in dts i have
> > phy-mode = "2500base-x";
> > but SFP only supports 1G so mode 1000baseX is right
> >
> > set the old value with your calculation, but still not working, also with disabled AN_ENABLE-flag ;(
>
> I'm getting the impression that there's some confusing terminology going
> on here... can we clear this up please?
>
> SGMII is a proprietary modification of the 802.3z 1000base-X standard
> which:
> - reduces the link timer from 10ms to 1.6ms
> - implements data replication by 10x and 100x to achieve 100M and 10M
>   speeds over a link operating at a fixed speed of 1.25Gbaud.
> - changes the control word format to allow a SGMII PHY to signal to the
>   MAC in a timely manner which speed it is operating at.
>
> So, if you're using a fibre SFP to another device that is operating in
> 1000base-X mode, then you're wanting 1000base-X and not SGMII, and
> referring to this as SGMII is technically misleading.
>
> > root@...-r3:~# ip link set eth1 up
> > [   44.287442] mtk_soc_eth 15100000.ethernet eth1: configuring for inband/1000be
> > [   44.295902] interface-mode 21 (sgmii:4)
> > root@...-r3:~# [   44.295907] timer 0x186a0
> > [   44.352872] offset:0 0x2c1140
> > [   44.355507] offset:4 0x4d544950
> > [   44.358462] offset:8 0x40e041a0
> > [   44.361609] mtk_soc_eth 15100000.ethernet eth1: Link is Up - 1Gbps/Full - flf
> > [   44.373042] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
> >
> > root@...-r3:~# ip a a 192.168.0.19/24 dev eth1
> > root@...-r3:~# ping 192.168.0.10
> > PING 192.168.0.10 (192.168.0.10) 56(84) bytes of data.
> > ^C
>
> This is where I postulated that the PCS is trying to interpret the
> advertisements as if they are SGMII formatted control words rather than
> 1000base-X formatted control words - and by doing so, it is trying to
> operate at 10Mbps (100x data replication) with the remote end trying to
> operate at 1000Mbps. If that is what it is doing, then you will have
> link-up but no communication.
>
> The solution to this is likely trying to find a bit that tells the
> PCS whether it should be expecting a 1000base-X (or 802.3z) formatted
> control word (aka 1000base-X mode) or a SGMII formatted control word.
>
> You mentioned that bit 0 in SGMSYS_SGMII_MODE is a "SGMII_EN" bit.
> Any ideas exactly what this bit does? Does it enable the PCS as a
> whole, or could that be the bit which switches between 1000base-X
> mode and SGMII mode? (More on this below).
>
> Note that the "old way" used to work because even in 1000base-X
> mode, the code would (technically incorrectly) force the PCS to
> use a fixed configuration of 1000Mbps and force the duplex bit -
> basically no 802.3 specified autonegotiation.
>
> However, 1000base-X with in-band signalling _should_ be using
> the autonegotiation - as everything else that uses phylink does.
>
> > > Can you confirm that the device on the other end for dev 0 does in
> > > actual fact use in-band signalling please?
> > >
> > > > > If it's interpreting a link partner advertisement of 0x00e0 using
> > > > > SGMII rules, then it will be looking at bits 11 and 10 for the
> > > > > speed, both of which are zero, which means 10Mbps - and 1000base-X
> > > > > doesn't operate at 10Mbps!
> > > >
> > > > so maybe this breaks sgmii? have you changed this behaviour with your Patch?
> > >
> > > Nope, but not setting the duplex properly is yet another buggy and poor
> > > quality of implementation that afficts this driver. I've said about
> > > setting the duplex value when reviewing your patch to add .pcs_get_state
> > > and I'm disappointed that you seemingly haven't yet corrected it in the
> > > code you're testing despite those review comments.
> >
> > sorry, i thought we want to read out the values from registers to set it based on them.
> >
> > currently i test only with the dev 1 (in-band-managed with 1GBit/s SFP)
> >
> > [    1.088310] dev: 0 offset:0 0x40140
> > [    1.088331] dev: 0 offset:4 0x4d544950
> > [    1.091827] dev: 0 offset:8 0x1
> > [    1.095607] dev: 1 offset:0 0x81140
> > [    1.098739] dev: 1 offset:4 0x4d544950
> > [    1.102214] dev: 1 offset:8 0x1
> >
> > after bring device up (disabled AN and set duplex to full):
> >
> > [   34.615926] timer 0x98968
> > [   34.672888] offset:0 0x2c1140
> > [   34.675518] offset:4 0x4d544950
> > [   34.678473] offset:8 0x40e041a0
> >
> > codebase:
> >
> > https://github.com/frank-w/BPI-R2-4.14/commits/6.1-r3-sgmii
>
> I think it would also be useful to print the register at offset 32
> as well, which is the SGMSYS_SGMII_MODE register, so we can discover
> what the initial and current values of these IF_MODE_BITs are. I
> may then be able to provide you an updated patch.
>
> > > If duplex remains as "unknown", then the MAC will be programmed to
> > > operate in _half_ _duplex_ mode (read mtk_mac_link_up()) which is not
> > > what you likely want. Many MACs don't support half duplex at 1G speed,
> > > so it's likely that without setting state->duplex, the result is that
> > > the MAC hardware is programmed incorrectly.
> >
> > wonder why it was working with only my patch which had duplex also not set.
>
> It depends entirely on the MAC implementation and why the manufacturer
> decides to state that 1000 half-duplex isn't supported by the hardware!
> I don't think we can guess. However, configuring the hardware correctly
> eliminates potential issues.
>
> It is in entirely possible for devices configured with dissimilar
> duplex settings to communicate, but there will be packet loss - since
> the end operating in full duplex will transmit while the receiving
> end could also be transmitting, and the receving end could interpret
> that as a collision.
>
> > > > > So my hunch is that one of those two IF_MODE_BIT{0,5} _might_ change
> > > > > the way the PCS interprets the control word, but as we don't have
> > > > > any documentation to go on, only experimentation will answer this
> > > > > question.
> >
> > the bits were in offset 0/4/8? are they now different than before?
> > if yes maybe these break it.
> >
> > as offset 4 is the phy-id and 8 is the advertisement from local and far
> > interface i guesss IF_MODE_* is in offset 0.
>
> They're in the register at offset 32:
>
> /* Register to control remote fault */
> #define SGMSYS_SGMII_MODE               0x20
> #define SGMII_IF_MODE_BIT0              BIT(0)
> ...
> #define SGMII_IF_MODE_BIT5              BIT(5)
>
> So, I think the first step should be to print the value of this register
> along side the other three you've been providing me and update me with
> its value. I'll then provide you a replacement patch to try.

here it is (AN again enabled, changes pushed to tree above):

bootup:

[    1.098876] dev: 1 offset:0 0x81140
[    1.102699] dev: 1 offset:4 0x4d544950
[    1.106180] dev: 1 offset:8 0x1
[    1.109914] dev: 1 offset:32 0x3112001b

after putting eth1 up:

[   32.566099] timer 0x186a0
[   32.623021] offset:0 0x2c1140
[   32.625653] offset:4 0x4d544950
[   32.628614] offset:8 0x40e041a0
[   32.631746] offset:32 0x3112011b

regards Frank

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ