linux-kernel - Re: Data corruption on serial interface under load


Date:	Fri, 5 Feb 2016 00:24:41 +0200
From:	Andy Shevchenko <andy.shevchenko@...il.com>
To:	Peter Hurley <peter@...leysoftware.com>
Cc:	Russell King <linux@....linux.org.uk>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-serial@...r.kernel.org" <linux-serial@...r.kernel.org>
Subject: Re: Data corruption on serial interface under load

On Thu, Feb 4, 2016 at 10:06 PM, Peter Hurley <peter@...leysoftware.com> wrote:
> Hi Andy,
>
> On 02/04/2016 10:55 AM, Andy Shevchenko wrote:
>> Hi!
>>
>> Today I observed an interesting bug / feature of the uart layer in the
>> kernel. I have a setup which connects two identical devices by a serial
>> line. I ran a data transfer in one direction and got data corruption on
>> the receiver side (in the uart layer, not the driver).
>>
>> Here is the dump from test suite and real data from 8250 registers:
>>
>> === 8< ===
>>
>> Needed 16 reads 0 writes Oh oh, inconsistency at pos 1 (0x1).
>>
>> Original sample:
>> 00000000: 7f 45 4c 46 01 01 01 00  00 00 00 00 00 00 00 00   .ELF............
>> 00000010: 02 00 03 00 01 00 00 00  19 8d 04 08 34 00 00 00   ............4...
>> 00000020: 2c f2 00 00 00 00 00 00  34 00 20 00 04 00 28 00   ,.......4. ...(.
>>
>> Received sample:
>> 00000000: 7f 00 45 00 4c 00 46 00  01 00 01 00 01 00 00 00   ..E.L.F.........
>> 00000010: 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ................
>> 00000020: 02 00 00 00 03 00 00 00  01 00 00 00 00 19 8d 04   ................
>> loops 1 / 1
>>
>> cts: 0 dsr: 0 rng: 0 dcd: 0 rx: 53434 tx: 0 frame 0 ovr 34201 par: 0
>> brk: 0 buf_ovrr: 0
>>
>> === 8< ===
>>
>> R 356.360109 IIR 0xc4           RDI interrupt
>> R 356.360114 LSR 0x63           DR + OE
>> R 356.360119 RX 0x7f
>> R 356.360124 LSR 0x63           DR + still OE
>> R 356.360128 RX 0x45
>> R 356.360133 LSR 0x63           DR + still OE
>> R 356.360137 RX 0x4c
>> R 356.360142 LSR 0x63           DR + still OE
>> R 356.360147 RX 0x46
>> R 356.360151 LSR 0x63           DR + still OE
>> R 356.360156 RX 0x01
>> R 356.360160 LSR 0x63           DR + still OE
>> R 356.360165 RX 0x01
>> R 356.360169 LSR 0x63
>> R 356.360174 RX 0x01
>>
>> As we can see the data is corrupted on Linux side. Can we somehow fix
>> this bug/feature?
>
> Not quite sure what you see as the issue.
>
> 1) That is a lot of overruns. Is that part of the test or are the overruns
>    a regression?

This is part of the other problem which I'm investigating.

> 2) If you mean the NUL bytes for overruns, I could have some functional mode
>    mis-branched in the N_TTY line discipline.

Yeah, this one.
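
To make the pattern concrete, here is a minimal sketch in Python. It decodes the LSR value from the trace using the standard 16550 bit layout, then applies an assumed model in which every byte read while OE is set is followed by one NUL in the delivered stream. The model and the helper names (decode_lsr, deliver) are illustrative assumptions, not the actual kernel code path.

```python
# Standard 16550 Line Status Register bit assignments.
LSR_BITS = {
    0x01: "DR",    # data ready
    0x02: "OE",    # overrun error
    0x04: "PE",    # parity error
    0x08: "FE",    # framing error
    0x10: "BI",    # break indicator
    0x20: "THRE",  # transmit holding register empty
    0x40: "TEMT",  # transmitter empty
}


def decode_lsr(lsr):
    """Return the names of the bits set in an LSR value, lowest bit first."""
    return [name for bit, name in LSR_BITS.items() if lsr & bit]


def deliver(trace):
    """Assumed model: emit each received byte, plus one NUL if OE was set.

    trace is a list of (lsr, rx_byte) pairs as read from the registers.
    """
    out = bytearray()
    for lsr, rx in trace:
        out.append(rx)
        if lsr & 0x02:  # overrun flagged alongside this byte
            out.append(0x00)
    return bytes(out)


# The seven LSR/RX pairs from the register trace above.
trace = [(0x63, b) for b in (0x7F, 0x45, 0x4C, 0x46, 0x01, 0x01, 0x01)]

print(decode_lsr(0x63))
print(deliver(trace).hex(" "))
```

Under that assumption the output starts 7f 00 45 00 4c 00 46 00, which is the same interleaving as the first bytes of the received sample.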

> What are the termios settings
>    on the rx side?

I'm using this [1] tool with a small patch applied that enables internal
loopback (TIOCM_LOOP).

[1] https://git.breakpoint.cc/cgit/bigeasy/serialcheck.git/
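
For the termios question, a rough sketch of how the receive-side input flags could be dumped from Python's termios module. The choice of flags to report and the helper names (describe_iflag, rx_settings) are my assumptions, and the device path is whatever port is under test.

```python
import os
import termios

# Input flags that influence how breaks and receive errors reach the reader.
RX_ERROR_FLAGS = ("IGNBRK", "BRKINT", "IGNPAR", "PARMRK", "INPCK", "ISTRIP")


def describe_iflag(iflag):
    """Map each flag name to True/False depending on whether it is set."""
    return {name: bool(iflag & getattr(termios, name)) for name in RX_ERROR_FLAGS}


def rx_settings(path):
    """Read c_iflag from the tty at path (hypothetical helper, needs a real port)."""
    # O_NONBLOCK avoids blocking in open() while waiting for carrier.
    fd = os.open(path, os.O_RDONLY | os.O_NOCTTY | os.O_NONBLOCK)
    try:
        iflag = termios.tcgetattr(fd)[0]  # index 0 of the attribute list is iflag
    finally:
        os.close(fd)
    return describe_iflag(iflag)
```

Calling rx_settings() on the receiving port would show, for example, whether IGNPAR or PARMRK is set.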

-- 
With Best Regards,
Andy Shevchenko