Message-ID: <20190701172825.7d861e85@gandalf.local.home>
Date:   Mon, 1 Jul 2019 17:28:25 -0400
From:   Steven Rostedt <rostedt@...dmis.org>
To:     Corey Minyard <cminyard@...sta.com>
Cc:     Corey Minyard <minyard@....org>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        linux-rt-users@...r.kernel.org, linux-kernel@...r.kernel.org,
        Peter Zijlstra <peterz@...radead.org>, tglx@...utronix.de
Subject: Re: [PATCH RT v2] Fix a lockup in wait_for_completion() and friends

On Mon, 1 Jul 2019 17:13:33 -0400
Steven Rostedt <rostedt@...dmis.org> wrote:

> On Mon, 1 Jul 2019 17:06:02 -0400
> Steven Rostedt <rostedt@...dmis.org> wrote:
> 
> > On Mon, 1 Jul 2019 15:43:25 -0500
> > Corey Minyard <cminyard@...sta.com> wrote:
> > 
> >   
> > > I show that patch is already applied at
> > > 
> > >     1921ea799b7dc561c97185538100271d88ee47db
> > >     sched/completion: Fix a lockup in wait_for_completion()
> > > 
> > > git describe --contains 1921ea799b7dc561c97185538100271d88ee47db
> > > v4.19.37-rt20~1
> > > 
> > > So I'm not sure what is going on.    
> > 
> > Bah, I'm replying to the wrong commit that I'm having issues with.
> > 
> > I searched your name to find the patch that is of trouble, and picked
> > this one.
> > 
> > I'll go find the problem patch, sorry for the noise on this one.
> >   
> 
> No, I did reply to the right email, but it wasn't the top patch I was
> having issues with. It was the patch I replied to:
> 
> This change below that Sebastian marked as stable-rt is what is causing
> me an issue. Not the patch that started the thread.
> 

In fact, my system doesn't boot with this commit in 5.0-rt.

If I revert 90e1b18eba2ae4a729 ("swait: Delete the task from after a
wakeup occured") the machine boots again.

Sebastian, I think that's a bad commit, please revert it.

Thanks!

-- Steve

> 
> 
> > Now.. that will fix it, but I think it is also wrong.
> > 
> > The problem being that it violates FIFO, something that might be more
> > important on -RT than elsewhere.
> > 
> > The regular wait API seems confused/inconsistent when it uses
> > autoremove_wake_function and default_wake_function, which doesn't help,
> > but we can easily support this with swait -- the problematic thing is
> > > the custom wake functions, we mustn't do that.
> > 
> > (also, mingo went and renamed a whole bunch of wait_* crap and didn't do
> > > the same to swait_ so now it's named all different :/)
> > 
> > Something like the below perhaps.
> > 
> > ---
> > diff --git a/include/linux/swait.h b/include/linux/swait.h
> > index 73e06e9986d4..f194437ae7d2 100644
> > --- a/include/linux/swait.h
> > +++ b/include/linux/swait.h
> > @@ -61,11 +61,13 @@ struct swait_queue_head {
> >  struct swait_queue {
> >  	struct task_struct	*task;
> >  	struct list_head	task_list;
> > +	unsigned int		remove;
> >  };
> >  
> >  #define __SWAITQUEUE_INITIALIZER(name) {				\
> >  	.task		= current,					\
> >  	.task_list	= LIST_HEAD_INIT((name).task_list),		\
> > +	.remove		= 1,						\
> >  }
> >  
> >  #define DECLARE_SWAITQUEUE(name)					\
> > diff --git a/kernel/sched/swait.c b/kernel/sched/swait.c
> > index e83a3f8449f6..86974ecbabfc 100644
> > --- a/kernel/sched/swait.c
> > +++ b/kernel/sched/swait.c
> > @@ -28,7 +28,8 @@ void swake_up_locked(struct swait_queue_head *q)
> >  
> >  	curr = list_first_entry(&q->task_list, typeof(*curr), task_list);
> >  	wake_up_process(curr->task);
> > -	list_del_init(&curr->task_list);
> > +	if (curr->remove)
> > +		list_del_init(&curr->task_list);
> >  }
> >  EXPORT_SYMBOL(swake_up_locked);
> >  
> > @@ -57,7 +58,8 @@ void swake_up_all(struct swait_queue_head *q)
> >  		curr = list_first_entry(&tmp, typeof(*curr), task_list);
> >  
> >  		wake_up_state(curr->task, TASK_NORMAL);
> > -		list_del_init(&curr->task_list);
> > +		if (curr->remove)
> > +			list_del_init(&curr->task_list);
> >  
> >  		if (list_empty(&tmp))
> >  			break;  
> 
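
For readers following the quoted diff, here is a minimal userspace sketch of
the semantics it proposes: a FIFO wait queue whose entries carry a `remove`
flag, where a wakeup only dequeues the head entry if that flag is set. This is
not the kernel code. `struct waiter`, `struct wait_queue`, `enqueue()`,
`wake_one()` and `dequeue_self()` are hypothetical names invented for this
illustration, a plain singly linked list stands in for the kernel's
`list_head`, and an integer counter stands in for `wake_up_process()`. It only
shows the queue bookkeeping and says nothing about the boot failure reported
above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct swait_queue. */
struct waiter {
	int id;               /* stands in for the task pointer */
	int woken;            /* number of wakeups delivered so far */
	unsigned int remove;  /* 1: dequeue on wakeup, 0: stay queued */
	struct waiter *next;
};

/* Hypothetical stand-in for struct swait_queue_head (no locking here). */
struct wait_queue {
	struct waiter *head;
	struct waiter *tail;
};

static void queue_init(struct wait_queue *q)
{
	q->head = NULL;
	q->tail = NULL;
}

/* Append at the tail so waiters are served in arrival (FIFO) order. */
static void enqueue(struct wait_queue *q, struct waiter *w)
{
	w->next = NULL;
	if (q->tail)
		q->tail->next = w;
	else
		q->head = w;
	q->tail = w;
}

/*
 * Wake the first waiter. Mirrors the shape of the quoted change to
 * swake_up_locked(): the entry is unlinked only when its remove flag is set.
 * Returns the id of the woken waiter, or -1 if the queue is empty.
 */
static int wake_one(struct wait_queue *q)
{
	struct waiter *curr = q->head;

	if (!curr)
		return -1;

	curr->woken++;                  /* stand-in for wake_up_process() */
	if (curr->remove) {
		q->head = curr->next;
		if (!q->head)
			q->tail = NULL;
		curr->next = NULL;
	}
	return curr->id;
}

/* The waiter unlinks itself once it is done waiting. */
static void dequeue_self(struct wait_queue *q, struct waiter *w)
{
	struct waiter **pp = &q->head;
	struct waiter *prev = NULL;

	while (*pp && *pp != w) {
		prev = *pp;
		pp = &(*pp)->next;
	}
	if (!*pp)
		return;                 /* not queued, nothing to do */

	*pp = w->next;
	if (q->tail == w)
		q->tail = prev;
	w->next = NULL;
}
```

In this sketch, a waiter queued with `remove = 0` keeps its place at the head
across repeated `wake_one()` calls until it calls `dequeue_self()`, so a waiter
behind it is not served first. A waiter with `remove = 1` is unlinked by the
wakeup itself, which matches the behaviour before the quoted change.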
