Message-ID: <20200720221314.xkdbw25nsjsyvgbv@skbuf>
Date:   Tue, 21 Jul 2020 01:13:14 +0300
From:   Vladimir Oltean <olteanv@...il.com>
To:     Jacob Keller <jacob.e.keller@...el.com>
Cc:     kuba@...nel.org, davem@...emloft.net, netdev@...r.kernel.org,
        richardcochran@...il.com, sorganov@...il.com,
        linux-doc@...r.kernel.org
Subject: Re: [PATCH net-next 3/3] docs: networking: timestamping: add a set
 of frequently asked questions

On Mon, Jul 20, 2020 at 02:45:03PM -0700, Jacob Keller wrote:
> 
> 
> On 7/20/2020 2:05 PM, Vladimir Oltean wrote:
> > On Mon, Jul 20, 2020 at 11:54:30AM -0700, Jacob Keller wrote:
> >> On 7/18/2020 4:35 AM, Vladimir Oltean wrote:
> >>> On Fri, Jul 17, 2020 at 04:12:07PM -0700, Jacob Keller wrote:
> >>>> On 7/17/2020 9:10 AM, Vladimir Oltean wrote:
> >>>>> +When the interface they represent offers both ``SOF_TIMESTAMPING_TX_HARDWARE``
> >>>>> +and ``SOF_TIMESTAMPING_TX_SOFTWARE``.
> >>>>> +Originally, the network stack could deliver either a hardware or a software
> >>>>> +time stamp, but not both. This flag prevents software timestamp delivery.
> >>>>> +This restriction was eventually lifted via the ``SOF_TIMESTAMPING_OPT_TX_SWHW``
> >>>>> +option, but still the original behavior is preserved as the default.
> >>>>> +
> >>>>
> >>>> So, this implies that we set this only if both are supported? I thought
> >>>> the intention was to set this flag whenever we start a HW timestamp.
> >>>>
> >>>
> >>> It's only _required_ when SOF_TIMESTAMPING_TX_SOFTWARE is used, it
> >>> seems. I had also thought of setting 'SKBTX_IN_PROGRESS' as good
> >>> practice, but there are many situations where it can do more harm than
> >>> good.
> >>>
> >>
> >> I guess I've only ever implemented a driver with software timestamping
> >> enabled as an option. What sort of issues arise when you have this set?
> >> I'm guessing that it's some configuration of stacked devices as in the
> >> other cases? If the issue can't be fixed I'd at least like more
> >> explanation here, since the prevailing convention is that we set this
> >> flag, so understanding when and why it's problematic would be useful.
> >>
> >> Thanks,
> >> Jake
> > 
> > Yes, the problematic cases have to do with stacked PHCs (DSA, PHY). The
> > pattern is that:
> > - DSA sets SKBTX_IN_PROGRESS
> > - calls dev_queue_xmit towards the MAC driver
> > - MAC driver sees SKBTX_IN_PROGRESS, thinks it's the one who set it
> > - MAC driver delivers TX timestamp
> > - DSA ends poll or receives TX interrupt, collects its timestamp, and
> >   delivers a second TX timestamp
> > In fact this is explained in a bit more detail in the current
> > timestamping.rst file.
> > Not only are there existing in-tree drivers that do that (and various
> > subtle variations of it), but new code also has this tendency to take
> > shortcuts and interpret any SKBTX_IN_PROGRESS flag set as being set
> > locally. Good thing it's caught during review most of the time these
> > days. It's an error-prone design.
> > On the DSA front, 1 driver sets this flag (sja1105) and 3 don't (felix,
> > mv88e6xxx, hellcreek). The driver who had trouble because of this flag?
> > sja1105.
> > On the PHY front, 2 drivers set this flag (mscc_phy, dp83640) and 1
> > doesn't (ptp_ines). The driver who had trouble? dp83640.
> > So it's very far from obvious that setting this flag is 'the prevailing
> > convention'. For a MAC driver, that might well be, but for DSA/PHY,
> > there seem to be risks associated with doing that, and driver writers
> > should know what they're signing up for.
> > 
> 
> Perhaps the issue is that the MAC driver using SKBTX_IN_PROGRESS as the
> mechanism for telling if it should deliver a timestamp. Shouldn't they
> be relying on SKBTX_HW_TSTAMP for the "please timestamp" notification,
> and then using their own mechanism for forwarding that timestamp once
> it's complete?
> 
> I see a handful of drivers do rely on checking this, but I think that's
> the real bug here.
> 
> > -Vladimir
> > 

Yes, indeed. A lot of them check only
"skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS", without any further
verification that they have hardware timestamping enabled in the first
place. There are a lot more of them than I remembered, and some of the
occurrences are actually new.
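
To make the difference concrete, here is a minimal userspace sketch, not
kernel code. The mock_skb and mock_mac types and all function names are
made up for illustration; only the SKBTX_* names come from the kernel,
and their bit values are assumed to match include/linux/skbuff.h as of
this thread. It models one skb sent through a DSA port stacked on a MAC
that has hardware timestamping disabled, and counts how many TX
timestamps would reach the socket.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values assumed to mirror include/linux/skbuff.h; illustrative only. */
#define SKBTX_HW_TSTAMP   (1u << 0)
#define SKBTX_IN_PROGRESS (1u << 2)

/* Hypothetical stand-ins for an skb and for MAC driver private state. */
struct mock_skb {
	uint8_t tx_flags;
};

struct mock_mac {
	bool hwtstamp_tx_enabled; /* would be set via SIOCSHWTSTAMP in a real driver */
	bool tstamp_armed;        /* driver-private record: this MAC armed a timestamp */
};

/* Problematic pattern: any SKBTX_IN_PROGRESS is assumed to be set locally. */
static bool mac_wants_tstamp_loose(const struct mock_skb *skb)
{
	return skb->tx_flags & SKBTX_IN_PROGRESS;
}

/* Stricter pattern: act only if timestamping is enabled on this device and
 * the stack requested a hardware timestamp, then remember that privately. */
static bool mac_wants_tstamp_strict(struct mock_mac *mac, struct mock_skb *skb)
{
	mac->tstamp_armed = mac->hwtstamp_tx_enabled &&
			    (skb->tx_flags & SKBTX_HW_TSTAMP);
	if (mac->tstamp_armed)
		skb->tx_flags |= SKBTX_IN_PROGRESS;
	return mac->tstamp_armed;
}

/* Returns the number of TX timestamps delivered for one stacked transmit. */
static int stacked_tx_tstamps(bool mac_is_loose)
{
	struct mock_mac mac = { .hwtstamp_tx_enabled = false, .tstamp_armed = false };
	struct mock_skb skb = { .tx_flags = SKBTX_HW_TSTAMP };
	int delivered = 0;

	/* The DSA driver arms its own timestamp and marks the skb. */
	skb.tx_flags |= SKBTX_IN_PROGRESS;

	/* The skb reaches the MAC driver (dev_queue_xmit in the kernel). */
	if (mac_is_loose ? mac_wants_tstamp_loose(&skb)
			 : mac_wants_tstamp_strict(&mac, &skb))
		delivered++;

	/* The DSA driver later collects and delivers its own timestamp. */
	delivered++;

	return delivered;
}
```

With the loose check the socket would see two timestamps for a single
packet; with the stricter check it would see only the one from DSA.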

I think at least part of the reason why this keeps going on is that
there aren't any hard and fast rules that say you shouldn't do it. When
there isn't even a convincing percentage of DSA/PHY drivers that do set
SKBTX_IN_PROGRESS, the chances are pretty low that you'll get a stacked
PHC driver that sets the flag, plus a MAC driver that checks for it
incorrectly. So people tend to ignore this case. That said, if stacked
DSA drivers started supporting software TX timestamping (which is not
unlikely, given that this would also give you compatibility
with PHY timestamping), I'm sure things would change, because more
people would become aware of the issue once mv88e6xxx starts getting
affected.

What I've been trying to do is at least get people (especially
people who have a lot of experience with 1588 drivers) to agree on a
common set of guidelines that are explicitly written down. I think
that's step #1.

-Vladimir
