Message-Id: <20231221111337.2c1af5bbe4920268dac25e8f@hugovil.com>
Date: Thu, 21 Dec 2023 11:13:37 -0500
From: Hugo Villeneuve <hugo@...ovil.com>
To: Hugo Villeneuve <hugo@...ovil.com>
Cc: Andy Shevchenko <andriy.shevchenko@...el.com>,
gregkh@...uxfoundation.org, jirislaby@...nel.org, jringle@...dpoint.com,
kubakici@...pl, phil@...pberrypi.org, bo.svangard@...eddedart.se,
linux-kernel@...r.kernel.org, linux-serial@...r.kernel.org, Hugo Villeneuve
<hvilleneuve@...onoff.com>, stable@...r.kernel.org, Yury Norov
<yury.norov@...il.com>
Subject: Re: [PATCH 02/18] serial: sc16is7xx: fix invalid sc16is7xx_lines bitfield in case of probe error

On Thu, 21 Dec 2023 10:56:39 -0500
Hugo Villeneuve <hugo@...ovil.com> wrote:
> On Wed, 20 Dec 2023 17:40:42 +0200
> Andy Shevchenko <andriy.shevchenko@...el.com> wrote:
>
> > On Tue, Dec 19, 2023 at 12:18:46PM -0500, Hugo Villeneuve wrote:
> > > From: Hugo Villeneuve <hvilleneuve@...onoff.com>
> > >
> > > If an error occurs during probing, the sc16is7xx_lines bitfield may be left
> > > in a state that doesn't represent the correct state of lines allocation.
> > >
> > > For example, in a system with two SC16 devices, if an error occurs only
> > > during probing of channel (port) B of the second device, sc16is7xx_lines
> > > final state will be 00001011b instead of the expected 00000011b.
> > >
> > > This is caused in part because of the "i--" in the for/loop located in
> > > the out_ports: error path.
> > >
> > > Fix this by checking the return value of uart_add_one_port() and set line
> > > allocation bit only if this was successful. This allows the refactor of
> > > the obfuscated for(i--...) loop in the error path, and properly call
> > > uart_remove_one_port() only when needed, and properly unset line allocation
> > > bits.
> > >
> > > Also use same mechanism in remove() when calling uart_remove_one_port().
> >
> > Yes, this seems to be the correct one to fix the problem described in
> > the patch 1. I dunno why the patch 1 even exists.
>
> Hi,
> this will indeed fix the problem described in patch 1.
>
> However, if I remove patch 1, and I simulate the same probe error as
> described in patch 1, now we get stuck forever when trying to
> remove the driver. This is something that I observed before and
> that patch 1 also corrected.
>
> The problem is caused in sc16is7xx_remove() when calling this function
>
> kthread_flush_worker(&s->kworker);
>
> I am not sure how best to handle that without patch 1.

Also, if we manage to get past kthread_flush_worker() and
kthread_stop() (by commenting both out for testing purposes), we get
another bug:

  # rmmod sc16is7xx
  ...
  crystal-duart-24m already disabled
  WARNING: CPU: 2 PID: 340 at drivers/clk/clk.c:1090 clk_core_disable+0x1b0/0x1e0
  ...
  Call trace:
   clk_core_disable+0x1b0/0x1e0
   clk_disable+0x38/0x60
   sc16is7xx_remove+0x1e4/0x240 [sc16is7xx]

This one is caused by sc16is7xx_remove() calling clk_disable_unprepare()
when it has already been called in the probe error handling code.
Patch 1 also fixed this.
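
For reference, the line-allocation problem described in the quoted patch
can be reproduced with a small userspace model. This is only a sketch
and not driver code: `lines`, MAX_LINES, first_free_line(), probe_buggy()
and probe_fixed() are hypothetical names, and the "buggy" flow assumes
the allocation bit was set before the port was registered, which is
inferred from the patch description rather than taken from the source.

```c
#include <assert.h>

#define MAX_LINES 8	/* illustrative size, not the driver's real limit */

/* Stand-in for the global sc16is7xx_lines bitfield. */
static unsigned long lines;

/* Hypothetical helper: index of the lowest clear bit, or -1 if full. */
static int first_free_line(void)
{
	for (int i = 0; i < MAX_LINES; i++)
		if (!(lines & (1UL << i)))
			return i;
	return -1;
}

/*
 * Assumed old flow: the bit is set before registration, and the error
 * path starts unwinding at i - 1, so the failing port's bit leaks.
 * fail_at is the channel whose registration fails (-1 for none).
 */
static int probe_buggy(int nr_ports, int fail_at)
{
	int line[MAX_LINES];
	int i;

	assert(nr_ports <= MAX_LINES);

	for (i = 0; i < nr_ports; i++) {
		line[i] = first_free_line();
		if (line[i] < 0)
			goto out_ports;
		lines |= 1UL << line[i];	/* set before registration */
		if (i == fail_at)		/* simulated registration error */
			goto out_ports;
	}
	return 0;

out_ports:
	for (i--; i >= 0; i--)
		lines &= ~(1UL << line[i]);
	return -1;
}

/*
 * Flow described in the patch: set the bit only after registration
 * succeeded, and on error clear exactly the bits this device set.
 */
static int probe_fixed(int nr_ports, int fail_at)
{
	int line[MAX_LINES];
	int registered = 0;

	assert(nr_ports <= MAX_LINES);

	for (int i = 0; i < nr_ports; i++) {
		int l = first_free_line();

		if (l < 0 || i == fail_at)	/* simulated registration error */
			goto out_ports;
		lines |= 1UL << l;		/* set only on success */
		line[registered++] = l;
	}
	return 0;

out_ports:
	while (registered--)
		lines &= ~(1UL << line[registered]);
	return -1;
}
```

Starting from an empty bitfield, probing a first two-port device
successfully and then a second one whose channel B fails leaves
00001011b with probe_buggy() and 00000011b with probe_fixed(), matching
the values given in the patch description.
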
Hugo Villeneuve