netdev - Re: [Linuxptp-devel] strangeness

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAD56B7dh=NaQLU4Z_-iBdeWfO6gTjCY9o12eoFGAMv4MCeh+aQ@mail.gmail.com>
Date:   Tue, 12 Mar 2019 16:00:15 -0400
From:   Paul Thomas <pthomas8589@...il.com>
To:     "Keller, Jacob E" <jacob.e.keller@...el.com>,
        netdev@...r.kernel.org, Harini Katakam <harinik@...inx.com>
Cc:     "linuxptp-devel@...ts.sourceforge.net" 
        <linuxptp-devel@...ts.sourceforge.net>
Subject: Re: [Linuxptp-devel] strangeness

On Tue, Mar 12, 2019 at 3:22 PM Keller, Jacob E
<jacob.e.keller@...el.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Paul Thomas [mailto:pthomas8589@...il.com]
> > Sent: Tuesday, March 12, 2019 10:50 AM
> > To: Keller, Jacob E <jacob.e.keller@...el.com>
> > Cc: linuxptp-devel@...ts.sourceforge.net
> > Subject: Re: [Linuxptp-devel] strangeness
> >
> > On Tue, Mar 12, 2019 at 1:11 PM Keller, Jacob E
> > <jacob.e.keller@...el.com> wrote:
> > >
> > > > -----Original Message-----
> > > > From: Paul Thomas [mailto:pthomas8589@...il.com]
> > > > Sent: Tuesday, March 12, 2019 10:00 AM
> > > > To: Keller, Jacob E <jacob.e.keller@...el.com>
> > > > Subject: Re: [Linuxptp-devel] strangeness
> > > >
> > > > On Tue, Mar 12, 2019 at 12:26 PM Keller, Jacob E
> > > > <jacob.e.keller@...el.com> wrote:
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: netdev-owner@...r.kernel.org [mailto:netdev-owner@...r.kernel.org]
> > On
> > > > > > Behalf Of Harini Katakam
> > > > > > Sent: Tuesday, March 12, 2019 3:10 AM
> > > > > > To: Paul Thomas <pthomas8589@...il.com>
> > > > > > Cc: linuxptp-devel@...ts.sourceforge.net; netdev@...r.kernel.org
> > > > > > Subject: Re: [Linuxptp-devel] strangeness
> > > > > >
> > > > > > Hi Paul,
> > > > > > On Tue, Mar 12, 2019 at 8:26 AM Paul Thomas <pthomas8589@...il.com>
> > > > wrote:
> > > > > > >
> > > > > > > Hi All,
> > > > > > >
> > > > > > > Let me do a quick clean recap of this issue.
> > > > > > >
> > > > > > > On a Debian arm64 system with a 5.0rc8 kernel using the macb driver on
> > > > > > > zynqmp, enabling tx timestamping (1) breaks networking! The first and
> > > > > > > most noticeable way is that you can no longer connect with ssh. This
> > > > > > > is a serious bug somewhere and merits some attention.
> > > > > > >
> > > > > > > Trying to debug ssh is a possibility, but I was trying to debug with
> > > > > > > something easier and thus the netcat testing. The specific issue can
> > > > > > > be seen in the following strace. In this setup nc just connects to a
> > > > > > > server and tries to send two packets (2). The first packet goes
> > > > > > > through fine, but the second doesn't because nc is stuck forever
> > > > > > > trying to read from the socket.
> > > > > > > pselect6(4, [0 3], NULL, NULL, NULL, NULL) = 1 (in [0]) <-- waiting on
> > > > > > > stdin and UDP sock
> > > > > > > read(0, "c1\n", 8192) = 3 <-- read three chars from stdin
> > > > > > > write(3, "c1\n", 3) = 3 <-- write those out on the UDP sock
> > > > > > > pselect6(4, [0 3], NULL, NULL, NULL, NULL) = 1 (in [3])  <-- waiting
> > > > > > > on stdin and UDP sock
> > > > > > > read(3, <-- waits forever here as there is no data to read
> > > > > > >
> > > > > > > I've been reading more, an old patch and the timestamping.txt doc
> > > > > > > helped me understand a little more of what's going on:
> > > > > > > https://lore.kernel.org/netdev/20130328211925.7644.15781.stgit@jekeller-
> > > > > > hub.jf.intel.com/
> > > > > > > https://www.kernel.org/doc/Documentation/networking/timestamping.txt
> > > > > > >
> > > > > > > So it is clear that if the SO_SELECT_ERR_QUEUE flag is set then in
> > > > > > > fact the select should return, but it is not set in this case. I can
> > > > > > > see everything that is going on in datagram_poll() in datagram.c. The
> > > > > > > main difference being that in the broken case the mask is 0x30c and in
> > > > > > > the working case it is 0x304. The difference is EPOLLERR, which is
> > > > > > > there clearly in the code if !skb_queue_empty(&sk->sk_error_queue).
> > > > > > >
> > > > > > > Then in select.c POLLIN_SET includes EPOLLERR. It almost looks as if
> > > > > > > it's behaving as it should (except that things break). My first
> > > > > > > question is should the sk_error_queue be empty if there is a tx
> > > > > > > timestamp available (in datagram_poll() in datagram.c)? If it's not
> > > > > > > empty I don't see what else SO_SELECT_ERR_QUEUE flag is doing for the
> > > > > > > select() and I don't see what would be different about the macb/arm64
> > > > > > > setup?
> > > > > >
> > > > > > Thanks for the summary.
> > > > > > I think sk_error_queue should be empty because packets are queued to
> > > > > > that via skb_complete_timestamp (sock_queue_err_skb) and this should
> > > > > > not be called in this flow. I'm sorry if I'm missing something - I'll let others
> > > > > > from netdev comment.
> > > > > > I'm not sure why EPOLLERR in being set in this case.
> > > > > >
> > > > >
> > > > > I believe at least historically that the Tx timestamps were notified to the
> > > > applications using the error queue.
> > > > The documentation (Documentation/networking/timestamping.txt) says
> > > > "recvmsg() with flag MSG_ERRQUEUE", so it's returned to userspace
> > > > through the error queue, but at datagram_poll() in datagram.c:
> > > > /* exceptional events? */
> > > > if (sk->sk_err || !skb_queue_empty(&sk->sk_error_queue))
> > > >
> > > > Should this already be non-empty and EPOLLERR be returned as part of
> > > > mask? Or should _only_ EPOLLPRI be returned when SOCK_SELECT_ERR_QUEUE
> > > > is true? Surly someone on this list knows what should be happening? It
> > > > will help direct where we are looking. Should we be looking at the
> > > > queues? Or should be looking at do_select() in fs/select.c or
> > > > somewhere else?
> > > >
> > >
> >
> > HI Jake,
> > Thanks for looking at this.
> >
> > > It's been quite some time since I looked at this. Hmm. Can you reproduce this with
> > another board? I haven't seen the issue before.
> > >
> > > I remember adding SO_SELECT_ERR_QUEUE as an option, and it made the socket
> > wake up when a Tx timestamp occurred. I believe this was eventually fixed in a more
> > direct way, and now the socket option doesn't really do much.
> > >
> > > See Linux commit 7d4c04fc1700 ("net: add option to enable error queue packets
> > waking select", 2013-03-31)
> > That looks like your netdev patch, and the text makes sense.
> > >
> > > Basically, the option now just adds POLLPRI along with POLLERR when the Tx
> > timestamp is ready, and the socket will wake up when there is a Timestamp.
> > >
> > > This was eventually fixed so that you could select on just POLLERR, and have it
> > actually wake up on the error messages, and the option was no longer necessary.
> > Ah, so is selecting on POLLERR a seperate flag?
> >
> > >
> > > That's what I recall about SO_SELECT_ERR_QUEUE at least.
> > >
> > > So.. to clarify the behavior you're seeing is that the socket wakes up on pselect, and
> > then you go to read it, but it never returns because there's no data available? And the
> > socket woke up because there was an error, but the error queue is also empty?
> > The tx timestams are enabled globally across the system with
> > hwstamp_ctl, there is just one option on or off. The issue seems to be
> > with applications that are not expecting select to return on POLLERR.
> > This certainly the case with netcat, and I suspect it is a similar
> > thing that breaks ssh. I haven't checked if there is actually a tx
> > timestamp in the error queue for the nc case (probably is). But maybe
> > this points us in the right direction for a specific socket if the
> > SOF_TIMESTAMPING_TX_HARDWARE option hasn't been requested should the
> > timestamps be pushed to the error queue in the first place?
> >
> > thanks,
> > Paul
>
> Hmm. I haven't been able to reproduce this using other device drivers, so I wonder if the specific device driver is doing something weird and indicating that it can Tx timestamp every packet, in its hard_xmit routine... But it should only be doing that for packets which are requested to timestamp from the socket.

Yes! I think this could be it. I just submitted this patch so that we
can talk about it:
https://lore.kernel.org/netdev/20190312195053.21741-1-pthomas8589@gmail.com/

This fixes the issue I'm having with nc and ssh.

-Paul

>
> Thanks,
> Jake
>
> >
> > >
> > > Thanks,
> > > Jake
> > >
> > > > thanks,
> > > > Paul
> > > >
> > > > > Thanks,
> > > > > Jake