Message-ID: <2259373.iZASKD2KPV@n95hx1g2>
Date: Fri, 25 Aug 2023 17:49:45 +0200
From: Christian Eggers <ceggers@...i.de>
To: Brian Hutchinson <b.hutchman@...il.com>
CC: <netdev@...r.kernel.org>, Vladimir Oltean <OlteanV@...il.com>,
<arun.ramadoss@...rochip.com>, <rakesh.sankaranarayanan@...rochip.com>
Subject: Re: Microchip net DSA with ptp4l getting tx_timeout failed msg using 6.3.12 kernel and KSZ9567 switch
Hi Brian,
On Thursday, 24 August 2023, 21:03:32 CEST, Brian Hutchinson wrote:
> Update. Top posting because I think this is my issue.
>
> I dug further into my problem. I'm using E2E and it looks like the
> mainlined Microchip KSZ DSA PTP code is only supporting P2P.
>
> The 5.10.69 kernel that I was first able to get working with
> Christian's early pre-mainlined patches had:
> 0016-net-dsa-microchip-ksz9477-add-E2E-support.patch
Sorry about this, I forgot that you use E2E. Unfortunately I
have no up-to-date patches for this, so you may want to try porting
the old patch yourself.

regards
Christian
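
To check whether the mainline driver timestamps correctly when peer
delay is used, a ptp4l configuration along these lines could serve as
a rough sketch. It assumes the grandmaster (or the link partner) can
also run P2P, since both ends must use the same delay mechanism. The
file name ptp4l-p2p.conf and the timeout value are illustrative, not
taken from this thread; lan1 is the port label from the device tree
quoted below.

    # ptp4l-p2p.conf (hypothetical file name)
    [global]
    # Use the peer delay mechanism instead of the default E2E.
    delay_mechanism         P2P
    # Milliseconds to wait for the tx timestamp; illustrative value.
    tx_timestamp_timeout    100

It could then be run with something like:

    ptp4l -f ptp4l-p2p.conf -i lan1 -m

If tx timestamps arrive in P2P mode but not in E2E mode, that would be
consistent with the missing E2E support described in the quoted text.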
>
> ... which gets into the "sticky" bits of why these patches weren't
> accepted in the first place due to some Microchip specific
> implementation if I recall correctly.
>
> Regards,
>
> Brian
>
>
> On Thu, Aug 24, 2023 at 2:26 PM Brian Hutchinson <b.hutchman@...il.com> wrote:
> >
> > Hi Christian,
> >
> >
> > On Wed, Aug 23, 2023 at 9:29 AM Brian Hutchinson <b.hutchman@...il.com> wrote:
> > >
> > >
> > >
> > > On Wed, Aug 23, 2023 at 4:22 AM Christian Eggers <ceggers@...i.de> wrote:
> > >>
> > >> Hi Brian,
> > >>
> > >> I just returned from my holidays...
> > >
> > >
> > > Hope you had a good one ... I need one too!
> > >
> > >>
> > >>
> > >> On Tuesday, 22 August 2023, 23:49:33 CEST, you wrote:
> > >> > Getting this tx_timestamp_timeout error over and over when I try to run ptp4l:
> > >> >
> > >> > ptp4l[1366.143]: selected best master clock 001747.fffe.70151b
> > >> > ptp4l[1366.143]: updating UTC offset to 37
> > >> > ptp4l[1366.143]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
> > >> > ptp4l[1366.860]: port 1: delay timeout
> > >> > ptp4l[1376.871]: timed out while polling for tx timestamp
> > >> > ptp4l[1376.871]: increasing tx_timestamp_timeout may correct this
> > >> > issue, but it is likely caused by a driver bug
> > >> > ptp4l[1376.871]: port 1: send delay request failed
> > >> >
> > >> > I was using 5.10.69 with Christian's patches before they were mainlined
> > >> > and had everything working with the help of Christian, Vladimir and
> > >> > others.
> > >> >
> > >> > Now I need to update the kernel, so I tried 6.3.12, which contains Christian's
> > >> > upstream patches. I also backported v8 of the upstreamed patches
> > >> > to 6.1.38 and I'm getting the same results with that kernel too.
> > >> >
> > >>
> > >> I am also in the process of upgrading to 6.1.38 (but not really tested).
> > >> I cherry-picked all necessary patches from the latest master (see attached
> > >> archive). Maybe you would like to compare this with your patch series.
> > >
> > >
> > > Excellent, I will check it out! Yeah, we needed to be on a LTS kernel so that's why I'm focusing on 6.1.38 as it's the latest in the yocto/oe universe.
> >
> > So I checked all of your patches for 6.1.38 vs the ones I had. I had
> > all except 0002 and 0003. I didn't have all of 0001 but I got a build
> > error on diff_by_scaled_ppm and backported that function from 6.3.12
> > to make things build.
> >
> > I applied the missing patches I got from you and rebuilt everything
> > and still have the same result with tx_timestamp_timeout. Which
> > didn't surprise me, since as I mentioned before I tried 6.3.12 mainline
> > and get the same result there too.
> >
> > Regards,
> >
> > Brian
> >
> > >
> > >>
> > >>
> > >> > [...]
> > >> >
> > >> > I tried increasing tx_timestamp_timeout and it doesn't appear to matter. I
> > >> > feel like I had this problem before when first starting to work with
> > >> > 5.10.69 but can't remember if another patch resolved it. With 5.10.69
> > >> > I've got quite a few more patches than just the 13 that were mainlined
> > >> > in 6.3. Looking through old emails I want to say it might have been
> > >> > resolved with net-dsa-ksz9477-avoid-PTP-races-with-the-data-path-l.patch
> > >> > that Vladimir gave me but looking at the code it doesn't appear
> > >> > mainline has that one.
> > >>
> > >> How is the IRQ line of your switch attached? I remember there was a problem
> > >> with the IRQ type (edge vs. level), but I think the fix has already been
> > >> applied to 6.1.38 (via -stable).
> > >
> > >
> > > So that's one of the first things I thought of which is why I provided cat of /proc/interrupts.
> > >
> > > I also do have a /dev/ptp1 (/dev/ptp0 is imx8mm)
> > >
> > > My device tree node is the same as before:
> > >
> > > i2c_ksz9567: ksz9567@5f {
> > >     compatible = "microchip,ksz9567";
> > >     reg = <0x5f>;
> > >     phy-mode = "rgmii-id";
> > >     status = "okay";
> > >     interrupt-parent = <&gpio1>;
> > >     interrupts = <10 IRQ_TYPE_LEVEL_LOW>;
> > >
> > >     ports {
> > >         #address-cells = <1>;
> > >         #size-cells = <0>;
> > >         port@0 {
> > >             reg = <0>;
> > >             label = "lan1";
> > >         };
> > >         port@1 {
> > >             reg = <1>;
> > >             label = "lan2";
> > >         };
> > >         port@6 {
> > >             reg = <6>;
> > >             label = "cpu";
> > >             ethernet = <&fec1>;
> > >             phy-mode = "rgmii-id";
> > >             fixed-link {
> > >                 speed = <100>;
> > >                 full-duplex;
> > >             };
> > >         };
> > >     };
> > > };
> > >
> > > And I have the same pinmux setup as before. I double checked all of that.
> > >
> > > I noticed new kernel /proc/interrupts now has a bunch of ksz lines in addition to "gpio-mxc 10 Level" which is IRQ from the ksz switch.
> > >
> > > Here is what the old 5.10.69 /proc/interrupts looked like:
> > >
> > > cat /proc/interrupts
> > >         CPU0    CPU1    CPU2    CPU3
> > >  11:   46141     127     127     124  GICv3        30 Level  arch_timer
> > >  14:    5260       0       0       0  GICv3        79 Level  timer@...a0000
> > >  15:       0       0       0       0  GICv3        23 Level  arm-pmu
> > >  20:       0       0       0       0  GICv3       127 Level  sai
> > >  21:       0       0       0       0  GICv3        82 Level  sai
> > >  32:       0       0       0       0  GICv3       110 Level  30280000.watchdog
> > >  33:       0       0       0       0  GICv3       135 Level  sdma
> > >  34:       0       0       0       0  GICv3        66 Level  sdma
> > >  35:       0       0       0       0  GICv3        52 Level  caam-snvs
> > >  36:       0       0       0       0  GICv3        51 Level  rtc alarm
> > >  37:       0       0       0       0  GICv3        36 Level  30370000.snvs:snvs-powerkey
> > >  39:       0       0       0       0  GICv3        64 Level  30830000.spi
> > >  40:    1412       0       0       0  GICv3        59 Level  30890000.serial
> > >  42:   55291       0       0       0  GICv3        67 Level  30a20000.i2c
> > >  43:       0       0       0       0  GICv3        68 Level  30a30000.i2c
> > >  44:       0       0       0       0  GICv3        69 Level  30a40000.i2c
> > >  45:       0       0       0       0  GICv3        70 Level  30a50000.i2c
> > >  47:       0       0       0       0  GICv3        55 Level  mmc1
> > >  48:    3003       0       0       0  GICv3        56 Level  mmc2
> > >  49:    2565       0       0       0  GICv3       139 Level  30bb0000.spi
> > >  50:       0       0       0       0  GICv3        34 Level  sdma
> > >  51:       0       0       0       0  GICv3       150 Level  30be0000.ethernet
> > >  52:       0       0       0       0  GICv3       151 Level  30be0000.ethernet
> > >  53:    1417       0       0       0  GICv3       152 Level  30be0000.ethernet
> > >  54:       0       0       0       0  GICv3       153 Level  30be0000.ethernet
> > >  56:       0       0       0       0  GICv3       130 Level  imx8_ddr_perf_pmu
> > >  60:       0       0       0       0  gpio-mxc      3 Level  bd718xx-irq
> > >  67:      23       0       0       0  gpio-mxc     10 Level  0-005f
> > >  72:       0       0       0       0  gpio-mxc     15 Edge   30b50000.mmc cd
> > > 217:       0       0       0       0  bd718xx-irq   5 Edge   gpio_keys
> > > IPI0:     29      14      13      13  Rescheduling interrupts
> > > IPI1:      0      41      41      41  Function call interrupts
> > > IPI2:      0       0       0       0  CPU stop interrupts
> > > IPI3:      0       0       0       0  CPU stop (for crash dump) interrupts
> > > IPI4:      0       0       0       0  Timer broadcast interrupts
> > > IPI5:   7959       0       0       0  IRQ work interrupts
> > > IPI6:      0       0       0       0  CPU wake-up interrupts
> > > Err:       0
> > >
> > > I'll check out your 6.1.38 changes compared to what I did.
> > >
> > > Thanks,
> > >
> > > Brian
> > >
> > >>
> > >>
>