Message-ID: <19bbd87d-75d7-3d9f-d7c1-629d1cc961e8@eurek.it>
Date:   Tue, 7 Apr 2020 11:01:08 +0200
From:   gianluca <gianlucarenzi@...ek.it>
To:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc:     jslaby@...e.com, linux-serial@...r.kernel.org,
        linux-kernel@...r.kernel.org, Gianluca Renzi <icjtqr@...il.com>,
        dimka@...eddedalley.com, linux@...pel-privat.de
Subject: Re: Serial data loss

Hello,
I am very pleased that Mr. Greg Kroah-Hartman is writing to me in person!

I appreciate it a lot, sir!

On 04/07/2020 10:24 AM, Greg Kroah-Hartman wrote:
> On Tue, Apr 07, 2020 at 09:30:21AM +0200, gianluca wrote:
>> I have a BIG problem with data loss when using two internal serial ports of
>> my boards, based on the NXP/Freescale i.MX28 SoC (ARMv5TE, ARM926EJ-S core).
>>
>> It runs at 454 MHz.
>>
>> Kernel used 4.9.x
>
> That's a very old kernel, you are going to have to get support for that
> from the vendor you bought it from :(
>

We are the vendor. ;-)

Joking aside, I can try the latest kernel, 5.6, and see how it behaves
there, but at first glance the driver looks exactly the same as in
kernel 4.9.

>> When using my test case software between two serial ports connected to each
>> other by a null modem cable, it fails when the speed rates are different,
>
> Of course, how would that work?
>

I am not a native English speaker, so my wording led to a
misunderstanding: my test case is a program with two pthreads, in which
the main thread works at a different baud rate than the other pthread.
Running the same program on two different machines, with the same baud
rate on each corresponding port, it should work.

i.e. /dev/ttyAPP1 is running at 9600 and /dev/ttyAPP2 is running at 38400

The same on the other machine. Both ports are null-modem connected:

	9600  /dev/ttyAPP1 <----> /dev/ttyAPP1 9600
	38400 /dev/ttyAPP2 <----> /dev/ttyAPP2 38400

I hope this is clear now. ;-)
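
To make the setup concrete, here is a minimal sketch of how each thread
could open its port: raw 8N1, no hardware or software flow control, at a
per-port baud rate. The helper names baud_to_speed() and open_raw_port()
are hypothetical and are not taken from the actual test program; only
the device names and baud rates come from the description above.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for cfmakeraw() and CRTSCTS on glibc */
#endif
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

/* Map a numeric baud rate to its termios constant; B0 means "unsupported". */
static speed_t baud_to_speed(unsigned int baud)
{
    switch (baud) {
    case 9600:   return B9600;
    case 19200:  return B19200;
    case 38400:  return B38400;
    case 57600:  return B57600;
    case 115200: return B115200;
    default:     return B0;
    }
}

/* Open a port in raw 8N1 mode without flow control; returns fd or -1. */
static int open_raw_port(const char *path, unsigned int baud)
{
    struct termios tio;
    speed_t speed = baud_to_speed(baud);
    int fd;

    if (speed == B0)
        return -1;

    fd = open(path, O_RDWR | O_NOCTTY);
    if (fd < 0)
        return -1;

    if (tcgetattr(fd, &tio) < 0)
        goto fail;

    cfmakeraw(&tio);                        /* 8 data bits, no parity, no echo */
    tio.c_cflag |= CLOCAL | CREAD;          /* ignore modem lines, enable RX */
    tio.c_cflag &= ~CRTSCTS;                /* no RTS/CTS flow control */
    tio.c_iflag &= ~(IXON | IXOFF | IXANY); /* no XON/XOFF flow control */
    tio.c_cc[VMIN] = 1;                     /* read() blocks for >= 1 byte */
    tio.c_cc[VTIME] = 0;

    if (cfsetispeed(&tio, speed) < 0 || cfsetospeed(&tio, speed) < 0)
        goto fail;
    if (tcsetattr(fd, TCSANOW, &tio) < 0)
        goto fail;
    return fd;

fail:
    close(fd);
    return -1;
}
```

With this sketch, one thread would call open_raw_port("/dev/ttyAPP1", 9600)
and the other open_raw_port("/dev/ttyAPP2", 38400), on both machines.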

>> and
>> the data loss increases as the speed rate gets higher.
>
> What type of flow control are you using?
>

Unfortunately, no flow control, because I cannot use it. On the real
hardware those two ports are connected to a microcontroller unit which
does not have flow control; only RX and TX are connected (i.e. no
RTS/CTS or DTR/DSR lines).

>> I suppose I am getting overruns (I am now modifying my software to check for
>> them too), and I think they are due to the way the ISR is called and all data
>> is passed to the UART circular buffer within the interrupt routine.
>
> Are you using flow control?
>

As above, no [ unfortunately ]
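
For the overrun check mentioned in the quoted text, one option on Linux
is the TIOCGICOUNT ioctl, which returns the per-port error counters kept
by the serial core. The sketch below assumes the driver in use reports
those counters; the rx_errors struct and both helper names are
hypothetical, not part of the test program.

```c
#include <linux/serial.h>
#include <sys/ioctl.h>

/* Snapshot of the receive-side error counters of one port. */
struct rx_errors {
    int overrun;     /* UART receiver overruns (data lost in hardware) */
    int buf_overrun; /* tty buffer overruns (data lost in software) */
    int frame;       /* framing errors */
    int parity;      /* parity errors */
};

/* Read the current counters for an open serial port; 0 on success, -1 on error. */
static int read_rx_errors(int fd, struct rx_errors *out)
{
    struct serial_icounter_struct ic;

    if (ioctl(fd, TIOCGICOUNT, &ic) < 0)
        return -1;

    out->overrun = ic.overrun;
    out->buf_overrun = ic.buf_overrun;
    out->frame = ic.frame;
    out->parity = ic.parity;
    return 0;
}

/* Number of overruns (hardware plus tty buffer) between two snapshots. */
static int new_overruns(const struct rx_errors *before,
                        const struct rx_errors *after)
{
    return (after->overrun - before->overrun) +
           (after->buf_overrun - before->buf_overrun);
}
```

Taking one snapshot before a test run and one after, a non-zero
new_overruns() would suggest that the missing bytes were dropped on the
receive side rather than never sent.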


>> I am talking about the high latency from the IRQ to the service routine
>> that flushes the FIFO when, at the same time, another IRQ is raised by
>> another UART running at a different speed.
>>
>> The code I was looking at is drivers/tty/serial/mxs-auart.c, __but__ all other
>> serial drivers act in the same way: they read one character at a
>> time from the FIFO (if it exists) and put it into the circular buffer so that
>> the serial/tty driver can pass it to the user read routine.
>>
>> Each function call has some overhead and is time-consuming, and if
>> another interrupt is raised by the same UART core but from another serial
>> port (different context), the FIFO, which the UART hardware keeps filling,
>> cannot be served fast enough to avoid an overrun. I think this
>> applies to __almost__ every serial driver, as they are written in the same
>> way.
>>
>> And it is __NOT__ an issue with the CPU and its speed! Using two
>> USB serial converters (FTDI and Prolific PL2303 based) on each board, the
>> problem does not appear at all, even after 24 hours running at more than 115200!!!
>
> usb-serial devices are totally different and send data to the host in a
> completely different way.
>
> Your hardware might just not be able to handle really high baud rates at
> a continuous stream, what baud rate were you using?
>

I suppose so, but the same issue can be reproduced on all single-core
processors (UART without FIFO) using two ports on the same UART core,
running the Linux kernel at 450 MHz or less.

The IRQ latency is the same.
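
To put a number on that latency budget, here is a rough model (a sketch,
not a measurement): with 8N1 framing each character takes 10 bit times,
and after the RX interrupt fires the handler has to drain the receiver
before the remaining FIFO space, plus one more character, is used up.
The FIFO depth and trigger level are parameters here; any values passed
in are assumptions and are not taken from the i.MX28 documentation.

```c
#include <stdint.h>

/* Time on the wire for one character, in nanoseconds (truncated). */
static uint64_t char_time_ns(uint32_t baud, uint32_t bits_per_frame)
{
    return (uint64_t)bits_per_frame * 1000000000ULL / baud;
}

/*
 * Worst-case time between the RX interrupt and the first overrun.
 * The interrupt fires when trigger_level characters are queued; the
 * FIFO then has (fifo_depth - trigger_level) free slots, and the next
 * character completed after those slots are used is lost.
 * A UART without FIFO is modelled as fifo_depth = trigger_level = 1.
 */
static uint64_t overrun_deadline_ns(uint32_t baud, uint32_t bits_per_frame,
                                    uint32_t fifo_depth, uint32_t trigger_level)
{
    return (uint64_t)(fifo_depth - trigger_level + 1) *
           char_time_ns(baud, bits_per_frame);
}
```

In this model a FIFO-less UART at 115200 8N1 must be served within about
86.8 us of each interrupt, against about 1.04 ms at 9600, which fits the
observation that the loss grows with the baud rate.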

> And again, this is what flow control was designed for, please use it.
>

I know, and usually I use a protocol that checks the correctness of
each packet and, if the check fails, has the packet requested again and
resent. In this case the microcontroller board I am connected to is not
built by us, and its software speaks a custom protocol (and I do not
know whether a transfer error can be recovered by issuing another
request).

So flow control __CANNOT_BE_USED_AT_ALL__...
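
For reference, the kind of packet check I usually rely on can be
sketched as below. This is only an illustration: the frame layout
(payload followed by a big-endian CRC-16/CCITT-FALSE) and the function
names are assumptions, and it is not the custom protocol spoken by the
microcontroller board.

```c
#include <stddef.h>
#include <stdint.h>

/* CRC-16/CCITT-FALSE: polynomial 0x1021, initial value 0xFFFF, no reflection. */
static uint16_t crc16_ccitt(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFF;

    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000)
                crc = (uint16_t)((crc << 1) ^ 0x1021);
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/*
 * Hypothetical frame layout: payload bytes followed by the CRC of the
 * payload, most significant byte first. Returns 1 if the CRC matches,
 * 0 if the frame is too short or corrupted (the receiver would then
 * ask the sender to resend it).
 */
static int frame_is_valid(const uint8_t *frame, size_t len)
{
    uint16_t expected;

    if (len < 3)
        return 0;

    expected = (uint16_t)(((uint16_t)frame[len - 2] << 8) | frame[len - 1]);
    return crc16_ccitt(frame, len - 2) == expected;
}
```

A byte dropped by an overrun shifts the payload, so the check fails and
the receiver knows to ask for the packet again, which is exactly what I
cannot count on with this board.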

>> It does work fine if I am using two different serial devices: one internal
>> uart (mxs-auart) and an external uart (ttyUSB).
>
> Again, different interrupt and protocols being used for the USB stuff.
>

...and in our case it works better than the internal UART driver on
the same board. It is a real pity...

> thanks,
>

Thank you, Mr. greg k-h!

> greg k-h


P.S.: I am a very close friend of Andrea Arcangeli; we grew up in the
same place and went to the same school here in Italy (Imola, Bologna).

We talked about you during the last Christmas holidays, when Andrea
came to Italy from NY.

Regards,
Gianluca Renzi
-- 
Eurek s.r.l.                          |
Electronic Engineering                | http://www.eurek.it
via Celletta 8/B, 40026 Imola, Italy  | Phone: +39-(0)542-609120
p.iva 00690621206 - c.f. 04020030377  | Fax:   +39-(0)542-609212
