Message-ID: <yw1xbnhsk9ow.fsf@unicorn.mansr.com>
Date:	Sun, 10 May 2015 11:29:03 +0100
From:	Måns Rullgård <mans@...sr.com>
To:	Mason <slash.tmp@...e.fr>
Cc:	One Thousand Gnomes <gnomes@...rguk.ukuu.org.uk>,
	linux-serial@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
	Peter Hurley <peter@...leysoftware.com>
Subject: Re: Hardware spec prevents optimal performance in device driver

Mason <slash.tmp@...e.fr> writes:

> One Thousand Gnomes wrote:
>
>> Mason wrote:
>> 
>>> I'm writing a device driver for a serial-ish kind of device.
>>> I'm interested in the TX side of the problem. (I'm working on
>>> an ARM Cortex A9 system by the way.)
>>>
>>> There's a 16-byte TX FIFO. Data is queued to the FIFO by writing
>>> {1,2,4} bytes to a TX{8,16,32} memory-mapped register.
>>> Reading the TX_DEPTH register returns the current queue depth.
>>>
>>> The TX_READY IRQ is asserted when (and only when) TX_DEPTH
>>> transitions from 1 to 0.
>> 
>> If the last statement is correct then your performance is probably always
>> going to suck unless there is additional invisible queueing beyond the
>> visible FIFO.
>
> Do you agree with my assessment that the current semantics for
> TX_READY lead to a race condition, unless we limit ourselves
> to a single (atomic) write between interrupts?

No.  To get best throughput, you can simply busy-wait until TX_DEPTH
indicates the FIFO is almost empty, then write a few words, but no more
than you know fit in the FIFO.  Repeat until all data has been written.
Use the IRQ only to signal completion of the entire packet.
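
A minimal sketch of that polling loop, runnable on a host, might look
like the following. Everything except the 16-byte FIFO size is an
assumption made for illustration: struct sim_dev, read_tx_depth(),
write_tx8() and the LOW_WATER threshold are hypothetical stand-ins for
the real TX_DEPTH and TX8 register accesses, and the simulated device
drains one byte per poll. Because the depth can only shrink between the
TX_DEPTH read and the writes that follow, the computed room is a safe
lower bound, so the FIFO cannot overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_SIZE 16u   /* from the thread: 16-byte TX FIFO */
#define LOW_WATER 4u    /* assumption: what counts as "almost empty" */

/* Hypothetical host-side model of the device, not the real hardware. */
struct sim_dev {
    uint8_t  fifo[FIFO_SIZE];
    unsigned head, depth;   /* ring buffer state */
    uint8_t  wire[256];     /* bytes that have left the FIFO, in order */
    size_t   sent;
};

/* Models a TX_DEPTH read; each poll lets one queued byte drain. */
static unsigned read_tx_depth(struct sim_dev *d)
{
    if (d->depth > 0) {
        assert(d->sent < sizeof d->wire);
        d->wire[d->sent++] = d->fifo[d->head];
        d->head = (d->head + 1) % FIFO_SIZE;
        d->depth--;
    }
    return d->depth;
}

/* Models a one-byte write to the TX8 register. */
static void write_tx8(struct sim_dev *d, uint8_t b)
{
    assert(d->depth < FIFO_SIZE);   /* an overflow would lose data */
    d->fifo[(d->head + d->depth) % FIFO_SIZE] = b;
    d->depth++;
}

/* Busy-wait until the FIFO is almost empty, then top it up. */
static void tx_busy_wait(struct sim_dev *d, const uint8_t *buf, size_t len)
{
    while (len > 0) {
        unsigned depth;

        do {
            depth = read_tx_depth(d);
        } while (depth > LOW_WATER);

        /* Write no more than is known to fit. */
        size_t room = FIFO_SIZE - depth;
        size_t n = len < room ? len : room;

        for (size_t i = 0; i < n; i++)
            write_tx8(d, buf[i]);
        buf += n;
        len -= n;
    }
}
```

In a real driver the two accessors would be MMIO reads and writes, and
the TX_READY IRQ would be used only after the last chunk to learn that
the whole packet has gone out.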

If the transmit rate is low, you can save some CPU time by filling the
FIFO, sleeping until it should be almost empty, then filling it again,
and so on.
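
As a rough illustration of how long to sleep, the interval can be
estimated from the line rate. The helper below is a hypothetical sketch:
the function name, the low-water threshold, and the idea that the device
has a fixed bit rate and a fixed number of bits per byte on the wire are
all assumptions, since the thread does not say how fast the FIFO drains.
The result is rounded down so the driver wakes slightly early and polls
TX_DEPTH briefly, rather than waking late and letting the FIFO run dry.

```c
#include <stdint.h>

/*
 * Estimate, in microseconds, how long a FIFO holding `depth` bytes
 * takes to drain down to `low_water` bytes.
 * Assumption: the device sends `bits_per_byte` bits per queued byte
 * at a constant `bits_per_sec`.
 */
static uint64_t refill_sleep_us(unsigned depth, unsigned low_water,
                                uint32_t bits_per_sec,
                                unsigned bits_per_byte)
{
    if (depth <= low_water || bits_per_sec == 0)
        return 0;   /* already low enough, or rate unknown: do not sleep */

    uint64_t bits = (uint64_t)(depth - low_water) * bits_per_byte;

    /* Integer division truncates, so the estimate errs on the early side. */
    return bits * 1000000u / bits_per_sec;
}
```

For example, with a full 16-byte FIFO, a threshold of 4 bytes, and an
assumed 115200 bit/s link with 10 bits per byte, the estimate is 1041
microseconds.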

Whether busy-waiting or sleeping, this approach keeps the data flowing
as fast as possible.

With the hardware you describe, there is unfortunately a trade-off
between throughput and CPU efficiency.  You'll have to decide which is
more important to you.

-- 
Måns Rullgård
mans@...sr.com