linux-kernel - Re: [PATCH] serial: sc16is7xx: address RX timeout interrupt errata

Message-ID: <938c06e4-ce48-49af-ac3a-61f524796e26@zonque.org>
Date:   Wed, 15 Nov 2023 16:38:01 +0100
From:   Daniel Mack <daniel@...que.org>
To:     Hugo Villeneuve <hugo@...ovil.com>
Cc:     gregkh@...uxfoundation.org, jirislaby@...nel.org,
        lech.perczak@...lingroup.com, u.kleine-koenig@...gutronix.de,
        linux-serial@...r.kernel.org, linux-kernel@...r.kernel.org,
        Maxim Popov <maxim.snafu@...il.com>, stable@...r.kernel.org
Subject: Re: [PATCH] serial: sc16is7xx: address RX timeout interrupt errata

Hi Hugo,

On 11/15/23 16:31, Hugo Villeneuve wrote:
> On Wed, 15 Nov 2023 15:57:38 +0100
> Daniel Mack <daniel@...que.org> wrote:
> 
>> On 11/15/23 15:47, Hugo Villeneuve wrote:
>>> On Tue, 14 Nov 2023 16:55:33 +0100
>>> Daniel Mack <daniel@...que.org> wrote:
>>>
>>>> Hi Hugo,
>>>>
>>>> On 11/14/23 16:20, Hugo Villeneuve wrote:
>>>>> On Tue, 14 Nov 2023 08:49:04 +0100
>>>>> Daniel Mack <daniel@...que.org> wrote:
>>>>>> This device has a silicon bug that makes it report a timeout interrupt
>>>>>> but no data in FIFO.
>>>>>>
>>>>>> The datasheet states the following in the errata section 18.1.4:
>>>>>>
>>>>>>   "If the host reads the receive FIFO at the same time as a
>>>>>>   time-out interrupt condition happens, the host might read 0xCC
>>>>>>   (time-out) in the Interrupt Indication Register (IIR), but bit 0
>>>>>>   of the Line Status Register (LSR) is not set (means there is not
>>>>>>   data in the receive FIFO)."
>>>>>>
>>>>>> When this happens, the loop in sc16is7xx_irq() will run forever,
>>>>>> which effectively blocks the i2c bus and breaks the functionality
>>>>>> of the UART.
>>>>>>
>>>>>> From the information above, it is assumed that when the bug is
>>>>>> triggered, the FIFO does in fact have payload in its buffer, but the
>>>>>> fill level reporting is off-by-one. Hence this patch fixes the issue
>>>>>> by reading one byte from the FIFO when that condition is detected.
>>>>>
>>>>> From what I understand from the errata, when the problem occurs, it
>>>>> affects bit 0 of the LSR register. I see no mention that it
>>>>> also affects the RX FIFO level register (SC16IS7XX_RXLVL_REG)?
>>>>
>>>> True, the errata doesn't explicitly mention that, but tests have shown
>>>> that the RXLVL register is equally affected.
>>>
>>> Hi Daniel,
>>> ok, now it makes more sense if RXLVL is affected.
>>>
>>> Have you contacted NXP about this? If not, I suggest you do open a
>>> support case and let them know about your findings, because it is very
>>> strange that it is not mentioned in the errata. And doing so may lead to
>>> an updated and better documentation on their side about this errata.
>>
> 
> Hi Daniel,
> 
>> The errata is also wrong in other regards - the IIR register cannot
>> yield 0xcc according to their own documentation. It also makes no
>> suggestion on how to recover from that situation, which is common
>> practice usually.
> 
> 0xcc is valid according to the datasheet. Bits 7:6 are a mirror copy of
> FCR[0], so bits 5:0 are 0x0c, which is documented in table 14?

Ah, right. I was looking at the masked value.
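
To make the bit layout concrete, here is a minimal sketch of how an IIR
value of 0xcc splits into the two fields discussed above. The macro and
function names are made up for illustration and are not the driver's own
definitions; the only facts taken from this thread are that bits 7:6
mirror FCR[0] and that bits 5:0 read 0x0c for the RX timeout source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, chosen for this sketch only. */
#define IIR_FIFO_MIRROR_MASK 0xc0 /* bits 7:6, mirror of FCR[0] */
#define IIR_SOURCE_MASK      0x3f /* bits 5:0, interrupt source */
#define IIR_SRC_RX_TIMEOUT   0x0c /* RX timeout, per the datasheet table */

/* Return only the interrupt source field (the "masked value"). */
static uint8_t iir_source(uint8_t iir)
{
	return iir & IIR_SOURCE_MASK;
}

/* True when both mirror bits are set, i.e. FCR[0] (FIFO enable) is 1. */
static bool iir_fifo_enabled(uint8_t iir)
{
	return (iir & IIR_FIFO_MIRROR_MASK) == IIR_FIFO_MIRROR_MASK;
}

/* True when the source field reports an RX timeout. */
static bool iir_is_rx_timeout(uint8_t iir)
{
	return iir_source(iir) == IIR_SRC_RX_TIMEOUT;
}
```

With the FIFO enabled, a raw read of 0xcc and a masked value of 0x0c
describe the same interrupt source, which is where the confusion came
from.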

> But you are right about the recovery procedure, it should be documented
> in the errata.
> 
> 
>> We'll let them know through our FAE channels, but the latest datasheet
>> for this chip was released over a decade ago, and I don't expect any
>> update to the errata wording.
> 
> You cannot assume they would not update the datasheet, especially with
> your findings about RXLVL which add a whole new dimension to this
> errata. The fact that the latest release was long ago is irrelevant.

Yes, we will give them a note. Let's see what happens.

>>> And incorporate this new info into your commit log for an eventual
>>> patch V2.
>>
>> It makes no sense IMO to have all users of this chip suffer from an
>> issue that was clearly identified to be present and which has an evident
>> fix. Why would we do that?
> 
> I don't know what you mean by that...
> 
> My suggestion was simply to incorporate your findings about RXLVL
> register into your commit log for patch V2...

My apologies, I misread your message then. I thought you suggested to
wait for a new errata statement from NXP so we can quote from that in
the commit.

The RXLVL detail is part of the commit log in the v2 submission.
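
For readers following along, the idea behind the workaround can be
sketched roughly as below. This is not the patch itself: it is a
standalone model with a simulated FIFO, and every name in it (struct
fake_uart, fake_rxlvl(), rx_bytes_to_read() and so on) is hypothetical.
It also bakes in the assumption stated in the commit message, namely
that the FIFO does hold a byte while the reported fill level is off by
one, plus a further assumption of this sketch that reading the byte
clears the condition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IIR_SOURCE_MASK    0x3f /* bits 5:0 of IIR */
#define IIR_SRC_RX_TIMEOUT 0x0c /* RX timeout interrupt source */

/* Simulated chip state; stands in for real register accesses. */
struct fake_uart {
	uint8_t fifo[64];
	size_t head;          /* index of the next byte to read */
	size_t tail;          /* one past the last stored byte */
	int level_off_by_one; /* models the assumed errata condition */
};

/* Reported fill level; one too low while the errata condition is active. */
static size_t fake_rxlvl(const struct fake_uart *u)
{
	size_t n = u->tail - u->head;

	if (u->level_off_by_one && n > 0)
		n--;
	return n;
}

/* Read one byte; assumed (in this model) to clear the errata condition. */
static uint8_t fake_read_rhr(struct fake_uart *u)
{
	assert(u->head < u->tail);
	u->level_off_by_one = 0;
	return u->fifo[u->head++];
}

/* Core of the workaround: a timeout with level 0 still means one byte. */
static size_t rx_bytes_to_read(uint8_t iir, size_t rxlvl)
{
	if ((iir & IIR_SOURCE_MASK) == IIR_SRC_RX_TIMEOUT && rxlvl == 0)
		return 1;
	return rxlvl;
}

/* Drain what the (corrected) level says is pending into out[]. */
static size_t handle_rx(struct fake_uart *u, uint8_t iir, uint8_t *out)
{
	size_t n = rx_bytes_to_read(iir, fake_rxlvl(u));

	for (size_t i = 0; i < n; i++)
		out[i] = fake_read_rhr(u);
	return n;
}
```

Without the special case in rx_bytes_to_read(), a handler that keeps
seeing the timeout source but a level of zero would never read anything,
which matches the endless loop described in the commit message.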


Thanks,
Daniel