Message-ID: <CAFZh4h_ueji_KucLdPx9PtTQP1g29PbcjNDFGzLBJYpYK8Rt3w@mail.gmail.com>
Date: Thu, 24 Aug 2023 15:03:32 -0400
From: Brian Hutchinson <b.hutchman@...il.com>
To: Christian Eggers <ceggers@...i.de>
Cc: netdev@...r.kernel.org, Vladimir Oltean <OlteanV@...il.com>, arun.ramadoss@...rochip.com,
rakesh.sankaranarayanan@...rochip.com
Subject: Re: Microchip net DSA with ptp4l getting tx_timeout failed msg using
6.3.12 kernel and KSZ9567 switch

Update: top-posting because I think I have found my issue.

I dug further into the problem. I'm using E2E, and it looks like the
mainlined Microchip KSZ DSA PTP code only supports P2P.

The 5.10.69 kernel that I first got working with Christian's early
pre-mainline patches included:

0016-net-dsa-microchip-ksz9477-add-E2E-support.patch

... which gets into the "sticky" bits of why those patches weren't
accepted in the first place: some Microchip-specific implementation
details, if I recall correctly.
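
For anyone hitting the same timeout, here is a minimal ptp4l configuration
sketch that switches the delay mechanism from E2E to P2P. It assumes the
port is the lan1 interface from the device tree quoted below, that hardware
timestamping is in use, and that the link partner also supports the peer
delay mechanism. The file name ptp4l.conf is only an example, and this is
untested against the setup in this thread.

```ini
# ptp4l.conf (sketch; interface name and values are assumptions)
[global]
# Use the peer delay mechanism instead of E2E delay request/response
delay_mechanism        P2P
# Take timestamps from the switch PTP hardware, not from software
time_stamping          hardware

# One section per port that ptp4l should run on
[lan1]
```

It could be started with "ptp4l -f ptp4l.conf -m", or the same switch can be
tried from the command line by passing -P in place of -E.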
Regards,
Brian
On Thu, Aug 24, 2023 at 2:26 PM Brian Hutchinson <b.hutchman@...il.com> wrote:
>
> Hi Christian,
>
>
> On Wed, Aug 23, 2023 at 9:29 AM Brian Hutchinson <b.hutchman@...il.com> wrote:
> >
> >
> >
> > On Wed, Aug 23, 2023 at 4:22 AM Christian Eggers <ceggers@...i.de> wrote:
> >>
> >> Hi Brian,
> >>
> >> I just returned from my holidays...
> >
> >
> > Hope you had a good one ... I need one too!
> >
> >>
> >>
> >> On Tuesday, 22 August 2023, 23:49:33 CEST, you wrote:
> >> > Getting this tx_timestamp_timeout error over and over when I try to run ptp4l:
> >> >
> >> > ptp4l[1366.143]: selected best master clock 001747.fffe.70151b
> >> > ptp4l[1366.143]: updating UTC offset to 37
> >> > ptp4l[1366.143]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
> >> > ptp4l[1366.860]: port 1: delay timeout
> >> > ptp4l[1376.871]: timed out while polling for tx timestamp
> >> > ptp4l[1376.871]: increasing tx_timestamp_timeout may correct this
> >> > issue, but it is likely caused by a driver bug
> >> > ptp4l[1376.871]: port 1: send delay request failed
> >> >
> >> > I was using 5.10.69 with Christian's patches before they were mainlined
> >> > and had everything working with the help of Christian, Vladimir and
> >> > others.
> >> >
> >> > Now I need to update the kernel, so I tried 6.3.12, which contains
> >> > Christian's upstream patches. I also backported v8 of the upstreamed
> >> > patches to 6.1.38, and I get the same results with that kernel too.
> >> >
> >>
> >> I am also in the process of upgrading to 6.1.38 (but not really tested).
> >> I cherry-picked all necessary patches from the latest master (see attached
> >> archive). Maybe you would like to compare this with your patch series.
> >
> >
> > Excellent, I will check it out! Yeah, we need to be on an LTS kernel, which is why I'm focusing on 6.1.38: it's the latest in the Yocto/OE universe.
>
> So I checked all of your patches for 6.1.38 vs the ones I had. I had
> all except 0002 and 0003. I didn't have all of 0001 but I got a build
> error on diff_by_scaled_ppm and back ported that function from 6.3.12
> to make things build.
>
> I applied the missing patches I got from you and rebuilt everything
> and still have the same result with tx_timestamp_timeout. Which
> didn't surprise me as I mentioned before I tried 6.3.12 mainline and
> get same result there too.
>
> Regards,
>
> Brian
>
> >
> >>
> >>
> >> > [...]
> >> >
> >> > I tried increasing tx_timestamp_timeout and it doesn't appear to matter. I
> >> > feel like I had this problem before when first starting to work with
> >> > 5.10.69 but can't remember if another patch resolved it. With 5.10.69
> >> > I've got quite a few more patches than just the 13 that were mainlined
> >> > in 6.3. Looking through old emails I want to say it might have been
> >> > resolved with net-dsa-ksz9477-avoid-PTP-races-with-the-data-path-l.patch
> >> > that Vladimir gave me but looking at the code it doesn't appear
> >> > mainline has that one.
> >>
> >> How is the IRQ line of your switch attached? I remember there was a problem
> >> with the IRQ type (edge vs. level), but I think this has already been
> >> applied to 6.1.38 (via -stable).
> >
> >
> > So that's one of the first things I thought of which is why I provided cat of /proc/interrupts.
> >
> > I also do have a /dev/ptp1 (/dev/ptp0 is imx8mm)
> >
> > My device tree node is the same as before:
> >
> > i2c_ksz9567: ksz9567@5f {
> >         compatible = "microchip,ksz9567";
> >         reg = <0x5f>;
> >         phy-mode = "rgmii-id";
> >         status = "okay";
> >         interrupt-parent = <&gpio1>;
> >         interrupts = <10 IRQ_TYPE_LEVEL_LOW>;
> >
> >         ports {
> >                 #address-cells = <1>;
> >                 #size-cells = <0>;
> >                 port@0 {
> >                         reg = <0>;
> >                         label = "lan1";
> >                 };
> >                 port@1 {
> >                         reg = <1>;
> >                         label = "lan2";
> >                 };
> >                 port@6 {
> >                         reg = <6>;
> >                         label = "cpu";
> >                         ethernet = <&fec1>;
> >                         phy-mode = "rgmii-id";
> >                         fixed-link {
> >                                 speed = <100>;
> >                                 full-duplex;
> >                         };
> >                 };
> >         };
> > };
> >
> > And I have same pinmux setup as before. I double checked all of that.
> >
> > I noticed new kernel /proc/interrupts now has a bunch of ksz lines in addition to "gpio-mxc 10 Level" which is IRQ from the ksz switch.
> >
> > Here is what the old 5.10.69 /proc/interrupts looked like:
> >
> > cat /proc/interrupts
> >             CPU0       CPU1       CPU2       CPU3
> >   11:      46141        127        127        124  GICv3        30 Level  arch_timer
> >   14:       5260          0          0          0  GICv3        79 Level  timer@...a0000
> >   15:          0          0          0          0  GICv3        23 Level  arm-pmu
> >   20:          0          0          0          0  GICv3       127 Level  sai
> >   21:          0          0          0          0  GICv3        82 Level  sai
> >   32:          0          0          0          0  GICv3       110 Level  30280000.watchdog
> >   33:          0          0          0          0  GICv3       135 Level  sdma
> >   34:          0          0          0          0  GICv3        66 Level  sdma
> >   35:          0          0          0          0  GICv3        52 Level  caam-snvs
> >   36:          0          0          0          0  GICv3        51 Level  rtc alarm
> >   37:          0          0          0          0  GICv3        36 Level  30370000.snvs:snvs-powerkey
> >   39:          0          0          0          0  GICv3        64 Level  30830000.spi
> >   40:       1412          0          0          0  GICv3        59 Level  30890000.serial
> >   42:      55291          0          0          0  GICv3        67 Level  30a20000.i2c
> >   43:          0          0          0          0  GICv3        68 Level  30a30000.i2c
> >   44:          0          0          0          0  GICv3        69 Level  30a40000.i2c
> >   45:          0          0          0          0  GICv3        70 Level  30a50000.i2c
> >   47:          0          0          0          0  GICv3        55 Level  mmc1
> >   48:       3003          0          0          0  GICv3        56 Level  mmc2
> >   49:       2565          0          0          0  GICv3       139 Level  30bb0000.spi
> >   50:          0          0          0          0  GICv3        34 Level  sdma
> >   51:          0          0          0          0  GICv3       150 Level  30be0000.ethernet
> >   52:          0          0          0          0  GICv3       151 Level  30be0000.ethernet
> >   53:       1417          0          0          0  GICv3       152 Level  30be0000.ethernet
> >   54:          0          0          0          0  GICv3       153 Level  30be0000.ethernet
> >   56:          0          0          0          0  GICv3       130 Level  imx8_ddr_perf_pmu
> >   60:          0          0          0          0  gpio-mxc      3 Level  bd718xx-irq
> >   67:         23          0          0          0  gpio-mxc     10 Level  0-005f
> >   72:          0          0          0          0  gpio-mxc     15 Edge   30b50000.mmc cd
> >  217:          0          0          0          0  bd718xx-irq   5 Edge   gpio_keys
> > IPI0:         29         14         13         13  Rescheduling interrupts
> > IPI1:          0         41         41         41  Function call interrupts
> > IPI2:          0          0          0          0  CPU stop interrupts
> > IPI3:          0          0          0          0  CPU stop (for crash dump) interrupts
> > IPI4:          0          0          0          0  Timer broadcast interrupts
> > IPI5:       7959          0          0          0  IRQ work interrupts
> > IPI6:          0          0          0          0  CPU wake-up interrupts
> >  Err:          0
> >
> > I'll check out your 6.1.38 changes compared to what I did.
> >
> > Thanks,
> >
> > Brian
> >
> >>
> >>