Message-ID: <CAL_Jsq+GFEqP6yJcPB9-G-7f=fqZJ2C=L=djg8KeX0W_5Bi8Vg@mail.gmail.com>
Date: Fri, 24 Mar 2017 08:59:26 -0500
From: Rob Herring <robh@...nel.org>
To: lkml@...garu.com
Cc: linux-kernel <linux-kernel@...r.kernel.org>,
Peter Hurley <peter@...leysoftware.com>
Subject: Re: [BUG] 4.11.0-rc3 xterm hung in D state on exit, wchan is tty_release_struct
+Peter

On Thu, Mar 23, 2017 at 4:22 PM, <lkml@...garu.com> wrote:
> On Thu, Mar 23, 2017 at 11:57:22AM -0500, Rob Herring wrote:
>> On Thu, Mar 23, 2017 at 08:46:03AM -0500, Rob Herring wrote:
>> > On Thu, Mar 23, 2017 at 12:30:18AM -0700, lkml@...garu.com wrote:
>> > > On Wed, Mar 22, 2017 at 11:44:18PM -0700, lkml@...garu.com wrote:
>> > > > On Wed, Mar 22, 2017 at 07:08:46PM -0700, lkml@...garu.com wrote:
>> > > > > Hello list,
>> > > > >
>> > > > > After approximately one day of running 4.11.0-rc3 with 7e54d9d reverted to
>> > > > > enable regular use, this happened upon destroying an xterm:
>> > > > >
>>
>> [...]
>>
>> > > >
>> > > > Added Rob Herring, author of c3485ee to CC list.
>> > > >
>> > >
>> > > I suspect this part was a mistake:
>> > >
>> > > - tty = READ_ONCE(port->itty);
>> > > - if (tty == NULL)
>> > > - return;
>> > >
>> > > Note that in release_tty(), tty->port->itty is assigned NULL before
>> > > calling tty_buffer_cancel_work():
>> >
>> > The READ_ONCE should still handle that.
>> >
>> > Anyway, the changes were purely to try to remove the need for a ldisc in
>> > the serdev case and avoid referencing it. In fact we still have an
>> > ldisc, it's just not used. So we can restore the original ordering.
>> >
>> > Can you try this patch:
>> >
>>
>> Please try this one instead. It passes the tty struct around instead of
>> the ldisc.
>>
>
> Happy to test the fix, except reproducing the bug without changing anything
> at all has proven elusive. AFAIK we just have my single experience
> described in the report, so I'm not confident in my ability to validate any
> fixes for this specific bug.
>
> If you think you understand the root cause, maybe you can conceive of a
> more reliable reproducer than me just using my machine? It appears I may
> have just been very (un)lucky.
Honestly, I don't understand why my change caused a problem. The fix
just changes things back to how things were ordered before.

Looks like the need for READ_ONCE was originally found with ktsan, so
maybe it could help here. But I've never used it.

Rob
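
For readers following the READ_ONCE discussion above, here is a minimal
userspace sketch of the snapshot-then-check pattern that the quoted three
lines (tty = READ_ONCE(port->itty); if (tty == NULL) return;) express.
It is not the kernel code and not the patch from this thread. The structs,
the function name flush_sketch(), and the READ_ONCE stand-in macro are all
assumptions made for illustration; the macro relies on the GCC/Clang
__typeof__ extension.

```c
#include <stddef.h>

/*
 * Userspace stand-in for the kernel's READ_ONCE(): forces a single
 * volatile load. Assumption: GCC/Clang __typeof__ is available.
 */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical types, NOT the kernel's tty_struct / tty_port. */
struct tty {
    int id;
};

struct port {
    struct tty *itty;   /* may be set to NULL by another context */
};

/*
 * Hypothetical consumer: returns the tty id, or -1 if the port's
 * tty pointer has already been cleared.
 */
static int flush_sketch(struct port *port)
{
    /* Load the shared pointer exactly once into a local. */
    struct tty *tty = READ_ONCE(port->itty);

    if (tty == NULL)
        return -1;

    /* From here on, use only the local snapshot, never port->itty. */
    return tty->id;
}
```

With a port whose itty points at a tty with id 7, flush_sketch() returns 7;
after itty is set to NULL it returns -1. The single load only guarantees
that the NULL check and the later use see the same pointer value. It does
not by itself keep the pointed-to object alive, which is a separate
lifetime question.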