netdev - Re: [PATCH] net: mscc: ocelot: fix hardware timestamp dequeue logic

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <20200727194600.pbtzcyrbg33fots5@skbuf>
Date:   Mon, 27 Jul 2020 22:46:00 +0300
From:   Vladimir Oltean <olteanv@...il.com>
To:     David Miller <davem@...emloft.net>
Cc:     yangbo.lu@....com, netdev@...r.kernel.org, laurent.brando@....com,
        vladimir.oltean@....com, claudiu.manoil@....com,
        alexandre.belloni@...tlin.com, kuba@...nel.org,
        UNGLinuxDriver@...rochip.com
Subject: Re: [PATCH] net: mscc: ocelot: fix hardware timestamp dequeue logic

On Mon, Jul 27, 2020 at 12:06:08PM -0700, David Miller wrote:
> From: Yangbo Lu <yangbo.lu@....com>
> Date: Mon, 27 Jul 2020 18:26:14 +0800
> 
> > From: laurent brando <laurent.brando@....com>
> > 
> > The next hw timestamp should be snapshoot to the read registers
> > only once the current timestamp has been read.
> > If none of the pending skbs matches the current HW timestamp
> > just gracefully flush the available timestamp by reading it.
> > 
> > Signed-off-by: laurent brando <laurent.brando@....com>
> > Signed-off-by: Vladimir Oltean <vladimir.oltean@....com>
> > Signed-off-by: Yangbo Lu <yangbo.lu@....com>
> 
> Applied and queued up for -stable, thanks.
> 
> So you have to read the hwtimestamp, even if you won't use it for
> any packets?

That's not the main point, but rather that you must read the timestamp
corresponding to the _current_ packet, and not the _next_ one in the
FIFO.

I slightly disagree with the commit description, in that it paints only
half of the story, and that can raise eyebrows about how this ever
worked. When in reality, it isn't fixing a problem we ever saw in real
life (although, nonetheless, it is a valid fix).

The ocelot_get_hwtimestamp function reads the 30-bit SYS::PTP_TXSTAMP
register in order to reconstruct a full 64-bit TX timestamp that will be
delivered to user space.

But the VSC7514 datasheet specifies this:

    Writing to the one-shot register SYS::PTP_NXT removes the current
    head-of-line entry and advances the pointer to the next entry in the
    time stamp queue.

So, if ocelot_get_hwtimestamp is called after writing to SYS_PTP_NXT,
then there is a chance that the TX timestamp that has been collected
actually doesn't correspond to the expected timestamp ID which was
matched to skb->cb[0], but it corresponds to the next timestamp in the
FIFO. In fact, this chance should be 100% and this should raise the
question of why this wasn't caught earlier. But it looks like if the TX
timestamp FIFO contains a single entry, then SYS_PTP_TXSTAMP does not
get updated when writing to SYS_PTP_NXT. And it looks like our IRQ
handler is always processing the TX timestamps early enough that
there's a single entry in the FIFO, no matter how hard we tried to
stress it.

Because the logic in ocelot_get_txtstamp tries to match skb's awaiting a
TX timestamp with actual TX timestamps once they become available, there
needs to be logic for iterating through the entire TX FIFO, and still do
something in case the TX timestamp could not be matched. That
"something" is to acknowledge the current TX timestamp, and go on. So
that's the reason why writing to SYS_PTP_NXT is done early. But it is
_too_ early.

So, to fix the bug, first we read the TX timestamp, even if skb_match is
going to be NULL, and only _then_ should we ack the timestamp. In
practice, skb_match being NULL is probably a big fat bug. There's
nothing 'graceful' about that, we should be issuing a WARN_ON, because
if nothing was matched, it means that the hardware has raised an
interrupt for a TX timestamp corresponding to a frame that the operating
system has no idea about. But the driver happily ignores it and carries
on.

Thanks,
-Vladimir