Message-ID: <Zsw8FEPMHFe4yoaA@chenyu5-mobl2>
Date: Mon, 26 Aug 2024 16:25:56 +0800
From: Chen Yu <yu.c.chen@...el.com>
To: Peter Zijlstra <peterz@...radead.org>
CC: Oliver Sang <oliver.sang@...el.com>, <oe-lkp@...ts.linux.dev>,
	<lkp@...el.com>, <linux-kernel@...r.kernel.org>, <aubrey.li@...ux.intel.com>
Subject: Re: [peterz-queue:sched/core] [sched/fair]  420356c350:
 WARNING:at_kernel/sched/core.c:#__might_sleep

On 2024-08-22 at 17:49:23 +0200, Peter Zijlstra wrote:
> On Mon, Aug 19, 2024 at 12:44:39PM +0800, Chen Yu wrote:
> > On 2024-08-17 at 11:33:29 +0200, Peter Zijlstra wrote:
> > > On Fri, Aug 16, 2024 at 05:15:12PM +0800, kernel test robot wrote:
> > > > kernel test robot noticed "WARNING:at_kernel/sched/core.c:#__might_sleep" on:
> > > > 
> > > > commit: 420356c3504091f0f6021974389df7c58f365dad ("sched/fair: Implement delayed dequeue")
> > > > https://git.kernel.org/cgit/linux/kernel/git/peterz/queue.git sched/core
> > > 
> > > > [   86.252370][  T674] ------------[ cut here ]------------
> > > > [ 86.252945][ T674] do not call blocking ops when !TASK_RUNNING; state=1 set at kthread_worker_fn (kernel/kthread.c:?) 
> > > > [ 86.254001][ T674] WARNING: CPU: 1 PID: 674 at kernel/sched/core.c:8469 __might_sleep (kernel/sched/core.c:8465) 
> > > 
> > > > [ 86.283398][ T674] ? handle_bug (arch/x86/kernel/traps.c:239) 
> > > > [ 86.283995][ T674] ? exc_invalid_op (arch/x86/kernel/traps.c:260) 
> > > > [ 86.284787][ T674] ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:621) 
> > > > [ 86.285682][ T674] ? __might_sleep (kernel/sched/core.c:8465) 
> > > > [ 86.286380][ T674] ? __might_sleep (kernel/sched/core.c:8465) 
> > > > [ 86.287116][ T674] kthread_worker_fn (include/linux/kernel.h:73 include/linux/freezer.h:53 kernel/kthread.c:851) 
> > > > [ 86.287701][ T674] ? kthread_worker_fn (kernel/kthread.c:?) 
> > > > [ 86.288138][ T674] kthread (kernel/kthread.c:391) 
> > > > [ 86.288482][ T674] ? __cfi_kthread_worker_fn (kernel/kthread.c:803) 
> > > > [ 86.288951][ T674] ? __cfi_kthread (kernel/kthread.c:342) 
> > > > [ 86.289560][ T674] ret_from_fork (arch/x86/kernel/process.c:153) 
> > > > [ 86.290162][ T674] ? __cfi_kthread (kernel/kthread.c:342) 
> > > > [ 86.291465][ T674] ret_from_fork_asm (arch/x86/entry/entry_64.S:254) 
> > > 
> > > AFAICT this is a pre-existing issue. Notably that all transcribes to:
> > > 
> > > kthread_worker_fn()
> > >   ...
> > > repeat:
> > >   set_current_state(TASK_INTERRUPTIBLE);
> > >   ...
> > >   if (work) { // false
> > >     __set_current_state(TASK_RUNNING);
> > >     ...
> > >   } else if (!freezing(current)) // false -- we are freezing
> > >     schedule();
> > > 
> > >   // so state really is still TASK_INTERRUPTIBLE here
> > >   try_to_freeze()
> > >     might_sleep() <--- boom, per the above.
> > >
> > 
> > Would the following fix make sense?
> 
> Yeah, that looks fine. Could you write it up as a proper patch please?
>

Yes, in theory this is a race condition. I've sent a patch here:
https://lore.kernel.org/lkml/20240819141551.111610-1-yu.c.chen@intel.com/
Andrew has given some comments on it.
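
For reference, the theoretical race in the quoted kthread_worker_fn()
outline can be modelled in userspace roughly as below. This is only a toy
sketch, not kernel code: struct toy_task, toy_might_sleep_warns(),
toy_worker_iteration() and toy_self_check() are hypothetical stand-ins, the
state values are illustrative, and schedule() is modelled simply as "the
task is TASK_RUNNING again once it has been woken".

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the kernel task states; values are illustrative only. */
enum toy_state { TASK_RUNNING = 0, TASK_INTERRUPTIBLE = 1 };

struct toy_task {
	enum toy_state state;
	bool freezing;	/* models freezing(current) */
	bool has_work;	/* models a queued work item */
};

/* Models the __might_sleep() check: true means the warning would fire. */
static bool toy_might_sleep_warns(const struct toy_task *t)
{
	return t->state != TASK_RUNNING;
}

/*
 * One pass through the loop body from the quoted outline. Returns true
 * if the check at try_to_freeze() would warn. With apply_fix set, the
 * state is reset before try_to_freeze(), as in the proposed diff.
 */
static bool toy_worker_iteration(struct toy_task *t, bool apply_fix)
{
	t->state = TASK_INTERRUPTIBLE;	/* set_current_state(TASK_INTERRUPTIBLE) */

	if (t->has_work) {
		t->state = TASK_RUNNING;	/* __set_current_state(TASK_RUNNING) */
		t->has_work = false;
	} else if (!t->freezing) {
		t->state = TASK_RUNNING;	/* schedule(): running again after wakeup */
	}

	if (apply_fix)
		t->state = TASK_RUNNING;	/* the added __set_current_state() */

	return toy_might_sleep_warns(t);	/* try_to_freeze() -> might_sleep() */
}

/* Checks the two cases discussed in the thread: no work while freezing. */
static void toy_self_check(void)
{
	struct toy_task t = { TASK_RUNNING, true, false };

	assert(toy_worker_iteration(&t, false));	/* would warn */
	assert(!toy_worker_iteration(&t, true));	/* fix avoids it */
}
```

In this model the warning can only fire when there is no work and the task
is freezing, which is the window the patch above closes.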

However, after some further investigation, this warning does not seem to
be directly related to task freezing; it appears to be connected to the
delayed dequeue. I plan to add a debug patch and investigate the symptom
in 0day's environment, and will send my findings later.

thanks,
Chenyu

> > 
> > diff --git a/kernel/kthread.c b/kernel/kthread.c
> > index f7be976ff88a..09850b2109c9 100644
> > --- a/kernel/kthread.c
> > +++ b/kernel/kthread.c
> > @@ -848,6 +848,12 @@ int kthread_worker_fn(void *worker_ptr)
> >  	} else if (!freezing(current))
> >  		schedule();
> >  
> > +	/*
> > +	 * Explicitly set the running state in case we are being frozen
> > +	 * and skip the schedule() above. try_to_freeze() expects the
> > +	 * current task to be in running state.
> > +	 */
> > +	__set_current_state(TASK_RUNNING);
> >  	try_to_freeze();
> >  	cond_resched();
> >  	goto repeat;
> > -- 
> > 2.25.1
> > 
> > Hi Oliver,
> > Could you please help check if above change would make the warning go away?
> > 
> > thanks,
> > Chenyu
