[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20130912121726.GO10105@radagast>
Date: Thu, 12 Sep 2013 07:17:27 -0500
From: Felipe Balbi <balbi@...com>
To: Alexey Pelykh <alexey.pelykh@...il.com>
CC: <balbi@...com>, Tony Lindgren <tony@...mide.com>,
Greg KH <gregkh@...uxfoundation.org>,
Linux OMAP Mailing List <linux-omap@...r.kernel.org>,
<linux-serial@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: commit 5fe212364 causes division by zero with large bauds
Hi,
On Thu, Sep 12, 2013 at 07:32:54AM +0300, Alexey Pelykh wrote:
> >> >> > On Wed, Sep 11, 2013 at 09:22:26AM +0300, Alexey Pelykh wrote:
> >> >> >> Hi Felipe,
> >> >> >>
> >> >> >> Thanks for finding this issue. Indeed, there is a bug on 3M+ baud
> >> >> >> rates. First patch is close to a complete fix, but still contains
> >> >> >> div-by-zero issue. Here is my version:
> >> >> >>
> >> >> >> diff --git a/drivers/tty/serial/omap-serial.c b/drivers/tty/serial/omap-serial.c
> >> >> >> index 816d1a2..808a880 100644
> >> >> >> --- a/drivers/tty/serial/omap-serial.c
> >> >> >> +++ b/drivers/tty/serial/omap-serial.c
> >> >> >> @@ -240,8 +240,8 @@ serial_omap_baud_is_mode16(struct uart_port *port,
> >> >> >> unsigned int baud)
> >> >> >> {
> >> >> >> unsigned int n13 = port->uartclk / (13 * baud);
> >> >> >> unsigned int n16 = port->uartclk / (16 * baud);
> >> >> >> - int baudAbsDiff13 = baud - (port->uartclk / (13 * n13));
> >> >> >> - int baudAbsDiff16 = baud - (port->uartclk / (16 * n16));
> >> >> >> + int baudAbsDiff13 = n13 ? (baud - (port->uartclk / (13 * n13))) : INT_MAX;
> >> >> >> + int baudAbsDiff16 = n16 ? (baud - (port->uartclk / (16 * n16))) : INT_MAX;
> >> >> >
> >> >> > IOW:
> >> >> >
> >> >> > int baudAbsDiff13 = 0;
> >> >> >
> >> >> > if (n13)
> >> >> > baudAbsDiff13 = (baud - (port->uartclk / (13 * n13)));
> >> >>
> >> >> Not quite same code, INT_MAX instead of 0. With 0 a div-by-zero
> >> >> exception will still occur on 3686400.
> >> >
> >> > why, there's no division after that point, right ? Besides,
> >> > serial_omap_baud_is_mode16() is supposed to return a boolean value.
> >> >
> >> > Setting baudAbsDiff1[36] to 0 will cause no problems, you're only using
> >> > that value with a boolean expression, no divisions whatsoever. Division
> >> > is done after using serial_omap_baud_is_mode16() to initialize divisor
> >> > to 13 or 16 (which is a misnamer, since that's the oversampling
> >> > parameter).
> >> >
> >>
> >> Yes, variables are a bit misnamed, that should be fixed someday.
> >> Regarding 0 vs INT_MAX, in case of 0 values will be
> >> 300: divisor = 12307 (13)
> >> 600: divisor = 6153 (13)
> >> 1200: divisor = 3076 (13)
> >> 2400: divisor = 1538 (13)
> >> 4800: divisor = 625 (16)
> >> 9600: divisor = 384 (13)
> >> 14400: divisor = 256 (13)
> >> 19200: divisor = 192 (13)
> >> 28800: divisor = 128 (13)
> >> 38400: divisor = 96 (13)
> >> 57600: divisor = 64 (13)
> >> 115200: divisor = 32 (13)
> >> 230400: divisor = 16 (13)
> >> 460800: divisor = 8 (13)
> >> 921600: divisor = 4 (13)
> >> 1000000: divisor = 3 (16)
> >> 1843200: divisor = 2 (13)
> >> 3000000: divisor = 1 (16)
> >> 3686400: divisor = 0 (16) << error here, should be 1 (13), as it is with INT_MAX
> >
> > I get it now... your boolean check wants to use the closer baud to
> > requested baud, so it's mode16 if the delta between baudAbsDiff16 and
> > requested rate is less than delta between baudAbsDiff13 and requested
> > baud.
> >
> >> >> > which is exactly what my patch did. I fail to see where division by zero
> >> >> > would be coming from.
> >> >> >
> >> >> >> if(baudAbsDiff13 < 0)
> >> >> >> baudAbsDiff13 = -baudAbsDiff13;
> >> >> >> if(baudAbsDiff16 < 0)
> >> >> >>
> >> >> >>
> >> >> >> With 48MHz UART clock, it will give
> >> >> >> 300: divisor = 12307 (13), real rate 300 (0.000000%)
> >> >> >> 600: divisor = 6153 (13), real rate 600 (0.000000%)
> >> >> >> 1200: divisor = 3076 (13), real rate 1200 (0.000000%)
> >> >> >> 2400: divisor = 1538 (13), real rate 2400 (0.000000%)
> >> >> >
> >> >> > TRM has these all set with oversampling of 16. In fact only 460800,
> >> >> > 921600, 1843200 and 3686400 should be using oversampling of 13.
> >> >> >
> >> >>
> >> >> That's true, but TRM anyways does not contain all possible baud rates
> >> >> (1M e.g.). IMO, as long as error rate is the same as in TRM,
> >> >> it makes no difference what combination of (mode, divisor) to use.
> >> >>
> >> >> > --
> >> >> > balbi
> >> >>
> >> >> A complex solution may be implemented: use LUT for baud rates that TRM
> >> >> defines explicitly, and use calculation if lookup failed.
> >> >
> >> > why would you try calculating anything if there is nothing in the table
> >> > which can support it ? Whatever is in the lookup table, are the only
> >> > baud rates the SoC supports, right ?
> >> >
> >>
> >> Actually, I haven't found any statement in TRM, which would mention
> >> that listed baudrates in referenced table are the only supported baud
> >> rates,
> >> and all others are illegal.
> >
> > "The UART clocks are connected to produce a baud rate of up to 3.6 Mbps.
> > Table 24-122 lists the *supported* baud rates, requested divisor, and
> > corresponding error versus the standard baud rate."
> >
> >> At least 1M which I use extensively works perfectly, and I can not
> >> figure out any idea why it would not do so.
> >
> > it might very well work, but it's not officially *supported* by the IP.
>
> That's true, but I don't see any reason why driver should disallow
> usage of baud rates that are not supported, but possible by hardware:
> "The UART clocks are connected to produce a baud rate of up to 3.6M bits/s."
fair point.
> I've changed calculation a bit to give priority to mode16, and now it
> gives TRM table as-is + extra baud rates
> 300: divisor = 10000 (16), real rate 300 (0.000000%)
> 600: divisor = 5000 (16), real rate 600 (0.000000%)
> 1200: divisor = 2500 (16), real rate 1200 (0.000000%)
> 2400: divisor = 1250 (16), real rate 2400 (0.000000%)
> 4800: divisor = 625 (16), real rate 4800 (0.000000%)
> 9600: divisor = 312 (16), real rate 9615 (0.156250%)
> 14400: divisor = 208 (16), real rate 14423 (0.159722%)
> 19200: divisor = 156 (16), real rate 19230 (0.156250%)
> 28800: divisor = 104 (16), real rate 28846 (0.159722%)
> 38400: divisor = 78 (16), real rate 38461 (0.158854%)
> 57600: divisor = 52 (16), real rate 57692 (0.159722%)
> 115200: divisor = 26 (16), real rate 115384 (0.159722%)
> 230400: divisor = 13 (16), real rate 230769 (0.160156%)
> 460800: divisor = 8 (13), real rate 461538 (0.160156%)
> 921600: divisor = 4 (13), real rate 923076 (0.160156%)
> 1000000: divisor = 3 (16), real rate 1000000 (0.000000%)
> 1843200: divisor = 2 (13), real rate 1846153 (0.160211%)
> 3000000: divisor = 1 (16), real rate 3000000 (0.000000%)
> 3686400: divisor = 1 (13), real rate 3692307 (0.160238%)
>
> If that's acceptable behavior, I'll prepare a patch.
Sure but I'd suggest splitting the patch so we can get the bugfix in
mainline during this -rc. So first patch would only fix the
division-by-zero bug and second patch would 'fix' baudrate calculation.
Sounds good ?
--
balbi
Download attachment "signature.asc" of type "application/pgp-signature" (837 bytes)
Powered by blists - more mailing lists