lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87365wgyae.fsf@osv.gnss.ru>
Date:   Mon, 13 Jul 2020 01:32:09 +0300
From:   Sergey Organov <sorganov@...il.com>
To:     Vladimir Oltean <olteanv@...il.com>
Cc:     netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
        Fugang Duan <fugang.duan@....com>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Richard Cochran <richardcochran@...il.com>
Subject: Re: [PATCH v2 net] net: fec: fix hardware time stamping by external
 devices

Vladimir Oltean <olteanv@...il.com> writes:

> On Sun, Jul 12, 2020 at 08:29:46PM +0300, Sergey Organov wrote:
>> Vladimir Oltean <olteanv@...il.com> writes:
>> 
>> > As far as I understand. the reason why SKBTX_IN_PROGRESS exists is for
>> > skb_tx_timestamp() to only provide a software timestamp if the hardware
>> > timestamping isn't going to. So hardware timestamping logic must signal
>> > its intention. With SO_TIMESTAMPING, this should not be strictly
>> > necessary, as this UAPI supports multiple sources of timestamping
>> > (including software and hardware together),
>> 
>> As a side note, I tried, but didn't find a way to get 2 timestamps,
>> software and hardware, for the same packet. It looks like once upon a
>> time it was indeed supported by:
>> 
>> SOF_TIMESTAMPING_SYS_HARDWARE: This option is deprecated and ignored.
>> 
>
> With SO_TIMESTAMPING, it is supported through
> SOF_TIMESTAMPING_OPT_TX_SWHW.

This one I overlooked, -- thanks for pointing!

> With APIs pre-SO_TIMESTAMPING (such as SO_TIMESTAMPNS), my understanding
> is that it is not supported. One timestamp would overwrite the other. So
> this needs to be avoided somehow, and this is how SKBTX_IN_PROGRESS came
> to be.
>
>
>> > but I think
>> > SKBTX_IN_PROGRESS predates this UAPI and timestamping should continue to
>> > work with older socket options.
>> 
>> <rant>The UAPI to all this is rather messy, starting with ugly tricks to
>> convert PTP file descriptors to clock IDs, followed by strange ways to
>> figure correct PTP clock for given Ethernet interface, followed by
>> entirely different methods of getting time stamping capabilities and
>> configuring them, and so forth.</rant>
>> 
>
> Yes, but I don't think this really matters in any way here.

Surprisingly it does, as you've got a wrong premise that ethtool doesn't
support external PTP PHY on FEC. And I think it's due to complexities of
the current implementation, where ethtool has entirely separate path of
doing things, with zero help from the FEC driver, see below.

[...]

>> I'll do as you suggest, but I want to say that I didn't question your
>> claim that my proposed changes may fix some existing PTP/DSA setup.
>> 
>> What I still find doubtful is that this fact necessarily means that the
>> part of the patch that fixes some other bug must be submitted
>> separately. If it were the rule, almost no patch that needs to fix 2
>> separate places would be accepted, as there might be some bug somewhere
>> that could be fixed by 1 change, no?
>> 
>
> So, I am asking you to flag this as a separate bugfix for DSA
> timestamping for multiple reasons:

OK, you want it, I'll do it, however:

>
> - because you are not "fixing PHY timestamping", more on that
> separately

I believe I do, more on that below.

> - because I am looking at this problem from the perspective of a user
>   who has a problem with DSA timestamping (aka code that already exists,
>   and that is already claimed to work). They're going to be searching
>   through the git log for potentially relevant changes, and even if they
>   might notice a patch that "adds support for PHY timestamping in FEC"*,
>   it might not register as something relevant to them, and skip it.
>   Just as you don't care about DSA, neither does that hypothetical user
>   care about PHY timestamping.

I don't think it's fair. I believe I do care about both, please re-read
the subject and description of the patch:

"net: fec: fix hardware time stamping by external devices"

that sure must be relevant for any external device, be it PHY or DSA,
and then:

"Fix support for external PTP-aware devices such as DSA or PTP PHY:"

And while formally I do add "support for external PTP PHY in FEC", as
you mention, I still consider it a bug-fix, as ethtool already correctly
supports this, see below for more background.

[...]

>> > Nope. The PHY timestamping support will go to David's net-next, this
>> > common PHY/DSA bugfix to net, and they'll meet sooner rather than
>> > later.
>> 
>> I'll do as you suggest, separating the patches, yet I fail to see why
>> PHY /time stamping bug fix/ should go to another tree than PHY/DSA /time
>> stamping bug fix/? What's the essential difference? Could you please
>> clarify?
>> 
>
> The essential difference is that for PHY timestamping, there is no
> feature that is claimed to work which doesn't work.

I believe this is a feature that is supposed to work, yet it doesn't,
see below.

>
> If you run "ethtool -T eth0" on a FEC network interface, it will always
> report its own PHC, and never the PHC of a PHY. So, you cannot claim
> that you are fixing PHY timestamping, since PHY timestamping is not
> advertised. That's not what a bug fix is, at least not around here, with
> its associated backporting efforts.

You can't actually try it as you don't have the hardware, right? As for
me, rather than running exactly ethtool, I do corresponding ioctl() in
my program, and the kernel does report features of my external PTP PHY,
not of internal one of the FEC, without my patches!

> The only way you could have claimed that this was fixing PHY
> timestamping was if "ethtool -T eth0" was reporting a PHY PHC, however
> timestamps were not coming from the PHY.

That's /exactly/ the case! Moreover, my original work is on 4.9.146
kernel, so ethtool works correctly at least since then. Here is quote from
my original question that I already gave reference to:

<quote>
Almost everything works fine out of the box, except hardware
timestamping. The problems are that I apparently get timestamps from fec
built-in PTP instead of external PHY, and that

  ioctl(fd, SIOCSHWTSTAMP, &ifr)

ends up being executed by fec1 built-in PTP code instead of being
forwarded to the external PHY, and that this happens despite the call to

   info.cmd = ETHTOOL_GET_TS_INFO;                                                                             
   ioctl(fd, SIOCETHTOOL, &ifr);                                                                     

returning phc_index = 1 that corresponds to external PHY, and reports
features of the external PHY, leading to major inconsistency as seen
from user-space.
</quote>

You see? This is exactly the case where I could claim fixing PHY time
stamping even according to your own expertise!

> From the perspective of the mainline kernel, that can never happen.

Yet in happened to me, and in some way because of the UAPI deficiencies
I've mentioned, as ethtool has entirely separate code path, that happens
to be correct for a long time already.

> From your perspective as a developer, in your private work tree, where
> _you_ added the necessary wiring for PHY timestamping, I fully
> understand that this is exactly what happened _to_you_.
> I am not saying that PHY timestamping doesn't need this issue fixed. It
> does, and if it weren't for DSA, it would have simply been a "new
> feature", and it would have been ok to have everything in the same
> patch.

Except that it's not a "new feature", but a bug-fix of an existing one,
as I see it.

>
> The fact that you figured out so quickly what's going on just means
> you're very smart. It took me 2 months to find out the same bug coming
> from the DSA side of things.

Thanks, but I'd rather attribute this to the fact that in my case there
was the code that worked correctly, so I was lucky to get support point
to rely upon.

Thanks,
-- Sergey

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ