Message-ID: <Ylkv/s7hRfXK07hD@kroah.com>
Date: Fri, 15 Apr 2022 10:42:38 +0200
From: Greg KH <gregkh@...uxfoundation.org>
To: Jiri Slaby <jslaby@...e.cz>
Cc: linux-serial@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/3] tty: serial: introduce uart_port_tx{,_limit}() helpers

On Fri, Apr 15, 2022 at 09:47:26AM +0200, Jiri Slaby wrote:
> On 14. 04. 22, 18:32, Greg KH wrote:
> > On Mon, Apr 11, 2022 at 12:54:03PM +0200, Jiri Slaby wrote:
> > > Many serial drivers do the same thing:
> > > * send x_char if set
> > > * keep sending from the xmit circular buffer until either
> > > - the loop reaches the end of the xmit buffer
> > > - TX is stopped
> > > - HW fifo is full
> > > * check for pending characters and:
> > > - wake up tty writers to fill for more data into xmit buffer
> > > - stop TX if there is nothing in the xmit buffer
> > >
> > > The only differences are:
> > > * how to write the character to the HW fifo
> > > * the check of the end condition:
> > > - is the HW fifo full?
> > > - is limit of the written characters reached?
> > >
> > > So unify the above into two helpers:
> > > * uart_port_tx_limit() -- the generic one, it performs the above taking
> > > into account the written characters limit
> > > * uart_port_tx() -- calls the above with ~0 as the limit. So it only
> > > checks the HW fullness.
> > >
> > > We need three more hooks in struct uart_ops for all this to work:
> > > * tx_ready() -- returns true if HW can accept more data.
> > > * put_char() -- write a character to the device.
> > > * tx_done() -- when the write loop is done, perform arbitrary action
> > > before potential invocation of ops->stop_tx() happens.
> > >
> > > NOTE1: Maybe the three hooks in uart_ops above are overkill. We can
> > > instead pass pointers to the three functions directly to the new helpers
> > > as they are not used elsewhere. Similar to uart_console_write() and its
> > > putchar().
> > >
> > > NOTE2: These two new helper functions call the hooks per every character
> > > processed. I was unable to measure any difference, provided most time is
> > > spent by readb (or alike) in the hooks themselves. First, LTO might
> > > help to eliminate these explicit calls (we might need NOTE1 to be
> > > implemented for this to be true). Second, if this turns out to be a
> > > problem, we can introduce a macro to build the helper in the driver's
> > > code instead of serial_core. That is, similar to wait_event().
> > >
> > > Signed-off-by: Jiri Slaby <jslaby@...e.cz>
> > > ---
> > > Documentation/driver-api/serial/driver.rst | 28 ++++++++++++
> > > drivers/tty/serial/serial_core.c | 53 ++++++++++++++++++++++
> > > include/linux/serial_core.h | 9 ++++
> > > 3 files changed, 90 insertions(+)
> > >
> > > diff --git a/Documentation/driver-api/serial/driver.rst b/Documentation/driver-api/serial/driver.rst
> > > index 06ec04ba086f..7dc3791addeb 100644
> > > --- a/Documentation/driver-api/serial/driver.rst
> > > +++ b/Documentation/driver-api/serial/driver.rst
> > > @@ -80,6 +80,34 @@ hardware.
> > > This call must not sleep
> > > + tx_ready(port)
> > > + The driver returns true if the HW can accept more data to be sent.
> > > +
> > > + Locking: port->lock taken.
> > > +
> > > + Interrupts: locally disabled.
> > > +
> > > + This call must not sleep.
> > > +
> > > + put_char(port, ch)
> > > + The driver is asked to write ch to the device.
> > > +
> > > + Locking: port->lock taken.
> > > +
> > > + Interrupts: locally disabled.
> > > +
> > > + This call must not sleep.
> > > +
> > > + tx_done(port)
> > > + When the write loop is done, the driver can perform arbitrary action
> > > + here before potential invocation of ops->stop_tx() happens.
> > > +
> > > + Locking: port->lock taken.
> > > +
> > > + Interrupts: locally disabled.
> > > +
> > > + This call must not sleep.
> > > +
> > > set_mctrl(port, mctrl)
> > > This function sets the modem control lines for port described
> > > by 'port' to the state described by mctrl. The relevant bits
> > > diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
> > > index 6a8963caf954..1be14e90066c 100644
> > > --- a/drivers/tty/serial/serial_core.c
> > > +++ b/drivers/tty/serial/serial_core.c
> > > @@ -107,6 +107,59 @@ void uart_write_wakeup(struct uart_port *port)
> > > }
> > > EXPORT_SYMBOL(uart_write_wakeup);
> > > +static bool uart_port_tx_always_ready(struct uart_port *port)
> > > +{
> > > + return true;
> > > +}
> > > +
> > > +/**
> > > + * uart_port_tx_limit -- transmit helper for uart_port
> > > + * @port: from which port to transmit
> > > + * @count: limit count
> > > + *
> > > + * uart_port_tx_limit() transmits characters from the xmit buffer to the
> > > + * hardware using @uart_port::ops::put_char(). It does so until @count
> > > + * characters are sent and while @uart_port::ops::tx_ready() still returns
> > > + * non-zero (if non-NULL).
> > > + *
> > > + * Return: number of characters in the xmit buffer when done.
> > > + */
> > > +unsigned int uart_port_tx_limit(struct uart_port *port, unsigned int count)
> > > +{
> > > + struct circ_buf *xmit = &port->state->xmit;
> > > + bool (*tx_ready)(struct uart_port *) = port->ops->tx_ready ? :
> > > + uart_port_tx_always_ready;
> > > + unsigned int pending;
> > > +
> > > + for (; count && tx_ready(port); count--, port->icount.tx++) {
> > > + if (port->x_char) {
> > > + port->ops->put_char(port, port->x_char);
> > > + port->x_char = 0;
> > > + continue;
> > > + }
> > > +
> > > + if (uart_circ_empty(xmit) || uart_tx_stopped(port))
> > > + break;
> > > +
> > > + port->ops->put_char(port, xmit->buf[xmit->tail]);
> >
> > That's a lot of redirection and function pointer mess per each character
> > sent now. With the spectre overhead here (and only getting worse), this
> > feels like a step backwards.
> >
> > I doubt throughput matters here given cpu speeds now, _but_ the cpu load
> > should go up.
> >
> > Although on smaller cpus with slower Mhz and faster line rates, this
> > feels like a lot of extra work happening for no real good reason.
>
> I know… Did you miss NOTE2 in the commit log? Any idea on that?

I did see it, but LTO cannot handle function pointer redirection. I
was wondering if you ran any benchmarks to see if this is noticeable.

I am all for making the drivers smaller, but not at the increased
overhead of every character being sent :(
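
To make the trade-off concrete, here is a rough userspace model, not
kernel code, of the two shapes being discussed: the helper calling
hooks through an ops table for every character, and a macro along the
lines of NOTE2 that expands the same loop inside the driver so the
compiler can inline the driver's own expressions. Every name in it
(model_port, model_ops, MODEL_TX_LIMIT, the fake_* functions) is made
up for illustration, and tx_done()/stop_tx() handling is left out to
keep it short.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define XMIT_SIZE 16	/* hypothetical ring size, must be a power of two */

struct model_port;

/* Hypothetical stand-in for the two per-character hooks from the patch. */
struct model_ops {
	bool (*tx_ready)(struct model_port *);
	void (*put_char)(struct model_port *, unsigned char);
};

/* Hypothetical, heavily reduced stand-in for uart_port plus its xmit ring. */
struct model_port {
	const struct model_ops *ops;
	unsigned char buf[XMIT_SIZE];
	unsigned int head, tail;
	unsigned char x_char;
	unsigned int fifo_room;		/* free slots in the fake HW FIFO */
	unsigned char out[64];		/* what the fake HW received */
	unsigned int out_len;
	unsigned int tx_count;
};

static bool fake_tx_ready(struct model_port *p)
{
	return p->fifo_room > 0;
}

static void fake_put_char(struct model_port *p, unsigned char c)
{
	assert(p->out_len < sizeof(p->out));
	p->out[p->out_len++] = c;
	p->fifo_room--;
}

static const struct model_ops fake_ops = {
	.tx_ready = fake_tx_ready,
	.put_char = fake_put_char,
};

static void model_init(struct model_port *p, unsigned int fifo_room)
{
	memset(p, 0, sizeof(*p));
	p->ops = &fake_ops;
	p->fifo_room = fifo_room;
}

/* Queue a short string; the caller must keep it below XMIT_SIZE bytes. */
static void model_queue(struct model_port *p, const char *s)
{
	for (; *s; s++) {
		p->buf[p->head] = (unsigned char)*s;
		p->head = (p->head + 1) & (XMIT_SIZE - 1);
	}
}

/* Variant 1: two indirect calls per character, as in the proposed helper. */
static unsigned int model_tx_limit(struct model_port *port, unsigned int count)
{
	for (; count && port->ops->tx_ready(port); count--, port->tx_count++) {
		if (port->x_char) {
			port->ops->put_char(port, port->x_char);
			port->x_char = 0;
			continue;
		}
		if (port->head == port->tail)
			break;
		port->ops->put_char(port, port->buf[port->tail]);
		port->tail = (port->tail + 1) & (XMIT_SIZE - 1);
	}
	return (port->head - port->tail) & (XMIT_SIZE - 1);
}

/* Variant 2: same loop as a macro, expanded in the driver with no pointers. */
#define MODEL_TX_LIMIT(port, ch, count, tx_ready, put_char, pending)	\
do {									\
	unsigned int left_ = (count);					\
	for (; left_ && (tx_ready); left_--, (port)->tx_count++) {	\
		if ((port)->x_char) {					\
			(ch) = (port)->x_char;				\
			(port)->x_char = 0;				\
			put_char;					\
			continue;					\
		}							\
		if ((port)->head == (port)->tail)			\
			break;						\
		(ch) = (port)->buf[(port)->tail];			\
		(port)->tail = ((port)->tail + 1) & (XMIT_SIZE - 1);	\
		put_char;						\
	}								\
	(pending) = ((port)->head - (port)->tail) & (XMIT_SIZE - 1);	\
} while (0)

static unsigned int model_tx_limit_inlined(struct model_port *port,
					   unsigned int count)
{
	unsigned char ch;
	unsigned int pending;

	MODEL_TX_LIMIT(port, ch, count, port->fifo_room > 0,
		       fake_put_char(port, ch), pending);
	return pending;
}
```

Both variants are meant to behave identically; the only difference is
whether the per-character work goes through a function pointer. The
model says nothing about the actual cost on real hardware, so it is no
substitute for a benchmark.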
thanks,
greg k-h