Message-ID: <CAFcVEC++NGG2fieiZ4TnG1smhyuGLqhKw88PKfUkzmqUvRTtHw@mail.gmail.com>
Date: Sat, 9 Mar 2019 11:08:35 +0530
From: Harini Katakam <harinik@...inx.com>
To: Paul Thomas <pthomas8589@...il.com>
Cc: "linuxptp-devel@...ts.sourceforge.net"
<linuxptp-devel@...ts.sourceforge.net>, netdev@...r.kernel.org
Subject: Re: [Linuxptp-devel] strangeness
Hi Paul,
On Sat, Mar 9, 2019 at 3:13 AM Paul Thomas <pthomas8589@...il.com> wrote:
>
> On Fri, Mar 8, 2019 at 1:07 PM Paul Thomas <pthomas8589@...il.com> wrote:
> >
> > Hi Harini,
> >
> > On Fri, Mar 8, 2019 at 1:08 AM Harini Katakam <harinik@...inx.com> wrote:
> > >
> > > Hi Paul,
> > > On Fri, Mar 8, 2019 at 12:33 AM Paul Thomas <pthomas8589@...il.com> wrote:
> > > >
> > > > On Thu, Mar 7, 2019 at 12:32 AM Harini Katakam <harinik@...inx.com> wrote:
> > > > >
> > > > > Hi Paul,
<snip>
> > > >
> > > > OK, I think things are becoming more clear. After just doing ioctl(fd,
> > > > SIOCSHWTSTAMP, &ifreq) from userspace (tx_bd_control =
> > > > TSTAMP_ALL_FRAMES in macb_ptp.c) then with the nc experiment some udp
> > > > transmits do not make it to macb_start_xmit() until receive traffic on
> > > > the nc connection comes in (one-to-one, one new rx packet means one
> > > > old tx packet goes out).
> > >
> > > Could you please share any wireshark log or dump for what is being
> > > received here?
> >
> > Here are two wireshark captures, the thing to note in the bad one is
> > that packets No. 5, 7, 9 from .102 to .103 were actually sent just
> > after packet No. 2 but they don't show up on the wire until the
> > packets the other way (one for one).
> >
> > >
> > > >
> > > > Working setup:
> > > > Before the tx_bd_control = TSTAMP_ALL_FRAMES.
> > > > Every time I hit "sN Enter" from nc I see a macb_start_xmit
> > > > print_hex_dump() and I see the packet on the nc client side:
> > > > # nc -l -u -p 9999
> > > > ...
> > > > s11
> > > > [ 347.517080] macb_start_xmit data: 00000000: 20 b0 f7 04 0a 29 20 b0 f7 04 0a 26 08 00 45 00 ....) ....&..E.
> > > > s12
> > > > [ 348.964369] macb_start_xmit data: 00000000: 20 b0 f7 04 0a 29 20 b0 f7 04 0a 26 08 00 45 00 ....) ....&..E.
> > > > ...
> > > >
> > > > Broken setup:
> > > > After the tx_bd_control = TSTAMP_ALL_FRAMES.
> > > > Not the first nc packet, but many of the subsequent ones never make it
> > > > to macb_start_xmit()
> > > > # nc -l -u -p 9999
> > > > ...
> > > > s3
> > > > s4
> > > > s5
> > > > ...
> > > > Eventually after I send data from the client nc I do see the
> > > > macb_start_xmit() lines.
> > >
> > > Thanks for this debug. If macb_start_xmit is never called, one of
> > > the preceding checks (such as whether the skb is present or whether
> > > the TX queues are off, etc.) should fail. I'm still tracing this but
> > > I'm not sure under what circumstances only some UDP packets will be
> > > prevented from being transmitted.
> > In this specific test the first tx packet always goes through, and
> > the subsequent ones don't until rx packets arrive. So it's not random
> > when they go through; I could have been clearer about that.
> >
> > > Just to be sure, could you please confirm you are not seeing any
> > > "buffer exhausted" messages from TX error tasks?
> > Correct, I'm not seeing any "buffer exhausted" errors.
> >
> > thanks,
> > Paul
>
> And one more piece that may be helpful. I think I narrowed down what's
> happening in the receive that finally flushes out a pending tx packet.
> It seems to be the netif_receive_skb(skb); line in gem_rx() (line
> 1067). I tested with an mdelay before and after this call:
> mdelay(1000); // mdelay here is slow to flush the pending tx packet (as seen by nc client)
> netif_receive_skb(skb);
> // mdelay(1000); // mdelay here is fast to flush the pending tx packet (as seen by nc client)
> This seems very strange to me, I quickly glanced at what
> netif_receive_skb() is doing and didn't see anything connected with
> the TX path, but those are the symptoms.
Thanks for the logs and debug.
I'm afraid I can't think of how this receive affects the TX path.
Even if the IP somehow has a dependency between the TX and RX paths
(which it doesn't, on ZynqMP, to my knowledge), it wouldn't explain
why packets don't reach the _xmit function at all.
Let me debug a little more.
Regards,
Harini