Message-ID: <499DB1DC.1020704@goop.org>
Date: Thu, 19 Feb 2009 11:24:12 -0800
From: Jeremy Fitzhardinge <jeremy@...p.org>
To: Ian Jackson <Ian.Jackson@...citrix.com>
CC: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Anders Kaseorg <andersk@....EDU>,
"xen-devel@...ts.xensource.com" <xen-devel@...ts.xensource.com>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH] IRQ handling race and spurious IIR read in serial/8250.c
Added cc:
Ian Jackson wrote:
> Anders Kaseorg writes ("Re: Serial console hangs with Linux 2.6.20 HVM guest"):
>
>> Yes, I took Linux v2.6.20 on amd64, ran `make defconfig`, then ran `make
>> menuconfig` and turned off CONFIG_HOTPLUG_CPU (Processor type and features
>> -> Support for hot-pluggable CPUs).
>>
>
> Thanks. I think I have tracked down the bug. In
> drivers/serial/8250.c in Linux there are two bugs:
> 1. UART_BUG_TXEN can be spuriously set, due to an IRQ race
> 2. The workaround then applied by the kernel is itself buggy
>
> Anders: can you try two tests for me? Firstly, in
> serial8250_startup, delete the section which sets UART_BUG_TXEN (see
> 2nd patch below); I think this will fix the symptoms for you.
> Secondly, in serial8250_start_tx delete the read from the IIR and the
> relevant branch of the test (see 3rd patch below); I think this will
> also in itself fix your symptoms. I haven't compiled either patch (so
> you may find that, e.g., I missed deleting some variables).
>
>
> The bugs in detail (this discussion applies to 2.6.20 and also to
> 2.6.28.4):
>
> 1. The hunk of serial8250_startup I quote below attempts to discover
> whether writing the IER re-asserts the THRI (transmit ready)
> interrupt. However the spinlock that it has taken out,
> port->lock, is not the one that the IRQ service routine takes
> before reading the IIR (i->lock). As a result, on an SMP system
> the generated interrupt races with the straight-line code in
> serial8250_startup.
>
> If serial8250_startup loses the race (perhaps because the system
> is a VM and its VCPU got preempted), UART_BUG_TXEN is spuriously
> added to bugs. This is quite unlikely in a normal system but in
> certain Xen configurations, particularly ones where there is CPU
> pressure, we may lose the race every time.
>
> It is not exactly clear to me how this ought to be resolved. One
> possibility is that the UART_BUG_TXEN problem might be worked
> around perfectly well by the new and very similar workaround
> UART_BUG_THRE[1] (added in 2.6.21ish), in which case
> UART_BUG_TXEN could just be removed.
>
> 2. UART_BUG_TXEN's workaround appears to be intended to be harmless.
> However what it actually does is to read the IIR, thus clearing
> any actual interrupt (including incidentally non-THRI), and then
> only perform the intended servicing if the interrupt was _not_
> asserted. That is, it breaks on any serial port with the bug.
>
> As far as I can see there is not much use in UART_BUG_TXEN reading
> IIR at all, so a suitable change if we want to keep UART_BUG_TXEN
> might be the first patch I enclose below (again, not compiled
> or tested).
>
> If UART_BUG_TXEN is retained something along these lines should be
> done at the very least.
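
To make the two points above concrete, here is a minimal user-space
sketch of the race in point 1. It is not kernel code: the struct, the
helper names and model_startup_probe() are all hypothetical, and the
UART is reduced to the one behaviour that matters here, namely that
reading the IIR acknowledges a pending THRI. The irq_handler_wins_race
flag stands in for the interrupt handler on another CPU getting to the
IIR first because it holds i->lock rather than port->lock.

```c
/* User-space model only; every name below is hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define IER_THRI   0x02 /* enable "transmit holding register empty" irq */
#define IIR_NO_INT 0x01 /* IIR bit 0 set: nothing pending */
#define IIR_THRI   0x02 /* IIR id: transmit holding register empty */
#define LSR_THRE   0x20
#define LSR_TEMT   0x40
#define BUG_TXEN   0x01 /* stand-in for UART_BUG_TXEN */

struct model_uart {
	unsigned char ier;
	bool thri_pending; /* THRI latched and not yet acknowledged */
	bool tx_empty;
};

static void write_ier(struct model_uart *u, unsigned char val)
{
	assert(u != NULL);
	/* A well-behaved port raises THRI when it is enabled while idle. */
	if ((val & IER_THRI) && !(u->ier & IER_THRI) && u->tx_empty)
		u->thri_pending = true;
	if (!(val & IER_THRI))
		u->thri_pending = false;
	u->ier = val;
}

static unsigned char read_iir(struct model_uart *u)
{
	assert(u != NULL);
	if (u->thri_pending) {
		u->thri_pending = false; /* the read acknowledges THRI */
		return IIR_THRI;
	}
	return IIR_NO_INT;
}

static unsigned char read_lsr(const struct model_uart *u)
{
	return u->tx_empty ? (LSR_THRE | LSR_TEMT) : 0;
}

/* Mirrors the probe sequence quoted from serial8250_startup. */
unsigned int model_startup_probe(bool irq_handler_wins_race)
{
	struct model_uart u = { .ier = 0, .thri_pending = false,
				.tx_empty = true };
	unsigned int bugs = 0;
	unsigned char lsr, iir;

	write_ier(&u, IER_THRI);
	if (irq_handler_wins_race)
		(void)read_iir(&u); /* handler on another CPU reads IIR first */
	lsr = read_lsr(&u);
	iir = read_iir(&u);
	write_ier(&u, 0);

	if ((lsr & LSR_TEMT) && (iir & IIR_NO_INT))
		bugs |= BUG_TXEN; /* probe concludes the port is buggy */
	return bugs;
}
```

In this model the port is perfectly healthy, yet the probe reports
BUG_TXEN whenever the handler's IIR read lands between the IER write
and the probe's own IIR read, which is the spurious detection
described in point 1.
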
>
> Ian.
>
> [1] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=40b36daad0ac704e6d5c1b75789f371ef5b053c1
>
>
> Proposed initial band-aid fix (against 2.6.28.4):
>
>
> Do not read IIR in serial8250_start_tx when UART_BUG_TXEN
>
> Reading the IIR clears some outstanding interrupts so it is not safe.
> Instead, simply transmit immediately if the buffer is empty without
> regard to IIR.
>
> Signed-off-by: Ian Jackson <ian.jackson@...citrix.com>
>
> --- ../linux-2.6.28.4/drivers/serial/8250.c~ 2009-02-06 21:47:45.000000000 +0000
> +++ ../linux-2.6.28.4/drivers/serial/8250.c 2009-02-11 15:55:24.000000000 +0000
> @@ -1257,14 +1257,12 @@
> serial_out(up, UART_IER, up->ier);
>
> if (up->bugs & UART_BUG_TXEN) {
> - unsigned char lsr, iir;
> + unsigned char lsr;
> lsr = serial_in(up, UART_LSR);
> up->lsr_saved_flags |= lsr & LSR_SAVE_FLAGS;
> - iir = serial_in(up, UART_IIR) & 0x0f;
> if ((up->port.type == PORT_RM9000) ?
> - (lsr & UART_LSR_THRE &&
> - (iir == UART_IIR_NO_INT || iir == UART_IIR_THRI)) :
> - (lsr & UART_LSR_TEMT && iir & UART_IIR_NO_INT))
> + (lsr & UART_LSR_THRE) :
> + (lsr & UART_LSR_TEMT))
> transmit_chars(up);
> }
> }
>
>
> Anders - first patch to try (against 2.6.20):
>
> --- drivers/serial/8250.c~ 2007-02-04 18:44:54.000000000 +0000
> +++ drivers/serial/8250.c 2009-02-11 15:39:43.000000000 +0000
> @@ -1645,25 +1645,6 @@
>
> serial8250_set_mctrl(&up->port, up->port.mctrl);
>
> - /*
> - * Do a quick test to see if we receive an
> - * interrupt when we enable the TX irq.
> - */
> - serial_outp(up, UART_IER, UART_IER_THRI);
> - lsr = serial_in(up, UART_LSR);
> - iir = serial_in(up, UART_IIR);
> - serial_outp(up, UART_IER, 0);
> -
> - if (lsr & UART_LSR_TEMT && iir & UART_IIR_NO_INT) {
> - if (!(up->bugs & UART_BUG_TXEN)) {
> - up->bugs |= UART_BUG_TXEN;
> - pr_debug("ttyS%d - enabling bad tx status workarounds\n",
> - port->line);
> - }
> - } else {
> - up->bugs &= ~UART_BUG_TXEN;
> - }
> -
> spin_unlock_irqrestore(&up->port.lock, flags);
>
> /*
>
>
> Anders - second patch to try (against 2.6.20):
> Fix should be suitable for distribution IMO.
>
> Signed-off-by: Ian Jackson <ian.jackson@...citrix.com>
>
> --- drivers/serial/8250.c~ 2007-02-04 18:44:54.000000000 +0000
> +++ drivers/serial/8250.c 2009-02-11 15:41:51.000000000 +0000
> @@ -1136,10 +1136,9 @@
> serial_out(up, UART_IER, up->ier);
>
> if (up->bugs & UART_BUG_TXEN) {
> - unsigned char lsr, iir;
> + unsigned char lsr;
> lsr = serial_in(up, UART_LSR);
> - iir = serial_in(up, UART_IIR);
> - if (lsr & UART_LSR_TEMT && iir & UART_IIR_NO_INT)
> + if (lsr & UART_LSR_TEMT)
> transmit_chars(up);
> }
> }
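
For comparison, a rough user-space model of what the 2.6.20 hunk above
changes. Again this is only a sketch under assumptions: the struct and
the function names (start_tx_old, start_tx_patched,
stalls_after_start_tx) are made up for illustration, and the scenario
assumed is a healthy port on which UART_BUG_TXEN was set spuriously,
so THRI really is pending when the workaround runs.

```c
/* User-space model only; every name below is hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define IIR_NO_INT 0x01 /* IIR bit 0 set: nothing pending */
#define IIR_THRI   0x02 /* IIR id: transmit holding register empty */
#define LSR_THRE   0x20
#define LSR_TEMT   0x40

struct model_uart {
	bool thri_pending; /* THRI latched and not yet acknowledged */
	bool tx_empty;
	int chars_sent;
};

static unsigned char read_iir(struct model_uart *u)
{
	assert(u != NULL);
	if (u->thri_pending) {
		u->thri_pending = false; /* the read acknowledges THRI */
		return IIR_THRI;
	}
	return IIR_NO_INT;
}

static unsigned char read_lsr(const struct model_uart *u)
{
	return u->tx_empty ? (LSR_THRE | LSR_TEMT) : 0;
}

/* Workaround as in the '-' lines: reads IIR, sends only if NO_INT. */
static void start_tx_old(struct model_uart *u)
{
	unsigned char lsr = read_lsr(u);
	unsigned char iir = read_iir(u);

	if ((lsr & LSR_TEMT) && (iir & IIR_NO_INT))
		u->chars_sent++; /* stands in for transmit_chars() */
}

/* Workaround as in the '+' lines: IIR is left alone. */
static void start_tx_patched(struct model_uart *u)
{
	unsigned char lsr = read_lsr(u);

	if (lsr & LSR_TEMT)
		u->chars_sent++; /* stands in for transmit_chars() */
}

/*
 * Returns true if, after start_tx, nothing was sent and no interrupt
 * is left pending to send it later, i.e. output would hang.
 */
bool stalls_after_start_tx(bool patched)
{
	struct model_uart u = { .thri_pending = true, .tx_empty = true,
				.chars_sent = 0 };

	if (patched)
		start_tx_patched(&u);
	else
		start_tx_old(&u);

	return u.chars_sent == 0 && !u.thri_pending;
}
```

With the old logic the IIR read swallows the pending THRI and the
NO_INT test then fails, so nothing is sent and nothing is left to
trigger the handler; with the patched logic the character goes out and
the pending interrupt is untouched.
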
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>