[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <53333438.3070107@hurleysoftware.com>
Date: Wed, 26 Mar 2014 16:10:32 -0400
From: Peter Hurley <peter@...leysoftware.com>
To: Geert Uytterhoeven <geert@...ux-m68k.org>
CC: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Jiri Slaby <jslaby@...e.cz>, linux-serial@...r.kernel.org,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Geert Uytterhoeven <geert+renesas@...ux-m68k.org>
Subject: Re: [PATCH 2/2] [RFC] serial_core: Avoid NULL pointer dereference
in uart_close()
On 03/26/2014 02:58 PM, Geert Uytterhoeven wrote:
> Hi Peter,
>
> Thanks for your comments!
Not a problem; just wanted to save you some time and frustration :)
> On Fri, Mar 21, 2014 at 9:29 PM, Peter Hurley <peter@...leysoftware.com> wrote:
>> On 03/17/2014 09:10 AM, Geert Uytterhoeven wrote:
>>> From: Geert Uytterhoeven <geert+renesas@...ux-m68k.org>
>>> When unbinding a serial driver that's being used as a serial console,
>>> the kernel may crash with a NULL pointer dereference in a uart_*()
>>> function
>>> called from uart_close () (e.g. uart_flush_buffer() or
>>> uart_chars_in_buffer()).
>>>
>>> To fix this, let uart_close() check for port->count == 0. If this is the
>>> case, bail out early. Else tty_port_close_start() will make the port
>>> counts inconsistent, printing out warnings like
>>>
>>> tty_port_close_start: tty->count = 1 port count = 0.
>>>
>>> and
>>>
>>> tty_port_close_start: count = -1
>>
>> As you noted in the patch comments below, the tty core always closes
>> a failed open.
>>
>> So the reason for the port->count mismatch is because tty_port_close_start()
>> only handles the tty hangup condition -- all other failed opens are assumed
>> to carry a port->count.
>>
>> A similar mismatch will occur if the mutex_lock in uart_open() is
>> interrupted.
>>
>> This means that the port->count mismatch can occur even if port->count != 0;
>> so the bug here is that uart_open() and uart_close() don't agree on
>> who does what cleanup under what error conditions.
>>
>> So with respect to the port count mismatches, the conditions need careful
>> auditing and fixing, separate from the tty console teardown problem.
>
> Indeed. Currently uart_open() always decrements port->count again
> in any error condition, which is clearly wrong.
I started looking at this problem only to realize that the
tty_hung_up_p() condition in uart_open() can't actually happen.
Which has lead me to a bunch of cleanups and fixes that I'm still
working on. It's just slow going because tty code audit takes
forever with legacy intentions that no longer apply and some of
the bit-rotting tty drivers that I doubt even run.
>>> and once uport == NULL, it will also crash.
>>
>> Ok, but so will basically every serial core function after
>> uart_remove_one_port() assigns state->uart_port = NULL
>>
>> IOW, uart_close() is not the only serial core function that could be
>> using state->uart_port after uart_remove_one_port() finishes.
>>
>>> Also fix the related crash in pr_debug() by checking for a non-NULL uport
>>> first.
>>
>> So my point is that checking for NULL state->uart_port is simply not the
>> fix,
>> here or anywhere else.
>>
>> AFAICT, either reference-counting the uart_port structure or RCU
>> are the only viable solutions to safe ll driver uart_port removal.
>
> I gave it a quick try, but failed. It seems the driver needs some big surgery
> to fix this.
I figured that would be true.
What are the circumstances of device removal in your case?
Regards,
Peter Hurley
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists