lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 18 Oct 2018 14:36:05 +0200
From:   luca abeni <luca.abeni@...tannapisa.it>
To:     Juri Lelli <juri.lelli@...hat.com>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Juri Lelli <juri.lelli@...il.com>,
        Peter Zijlstra <peterz@...radead.org>,
        syzbot <syzbot+385468161961cee80c31@...kaller.appspotmail.com>,
        Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        LKML <linux-kernel@...r.kernel.org>, mingo@...hat.com,
        nstange@...e.de, syzkaller-bugs@...glegroups.com, henrik@...tad.us,
        Tommaso Cucinotta <tommaso.cucinotta@...tannapisa.it>,
        Claudio Scordino <claudio@...dence.eu.com>,
        Daniel Bristot de Oliveira <bristot@...hat.com>
Subject: Re: INFO: rcu detected stall in do_idle

Hi Juri,

On Thu, 18 Oct 2018 14:21:42 +0200
Juri Lelli <juri.lelli@...hat.com> wrote:
[...]
> > > > I missed the original emails, but maybe the issue is that the
> > > > task blocks before the tick, and when it wakes up again
> > > > something goes wrong with the deadline and runtime assignment?
> > > > (maybe because the deadline is in the past?)    
> > > 
> > > No, the problem is that the task won't be throttled at all,
> > > because its replenishing instant is always way in the past when
> > > tick occurs. :-/  
> > 
> > Ok, I see the issue now: the problem is that the "while
> > (dl_se->runtime <= 0)" loop is executed at replenishment time, but
> > the deadline should be postponed at enforcement time.
> > 
> > I mean: in update_curr_dl() we do:
> > 	dl_se->runtime -= scaled_delta_exec;
> > 	if (dl_runtime_exceeded(dl_se) || dl_se->dl_yielded) {
> > 		...
> > 		enqueue replenishment timer at dl_next_period(dl_se)
> > But dl_next_period() is based on a "wrong" deadline!
> > 
> > 
> > I think that inserting a
> >         while (dl_se->runtime <= -pi_se->dl_runtime) {
> >                 dl_se->deadline += pi_se->dl_period;
> >                 dl_se->runtime += pi_se->dl_runtime;
> >         }
> > immediately after "dl_se->runtime -= scaled_delta_exec;" would fix
> > the problem, no?  
> 
> Mmm, I also thought of letting the task "pay back" its overrunning.
> But, doesn't this get us quite far from what one would expect. I mean,
> enforcement granularity will be way different from task period, no?

Yes, the granularity will be what the kernel can provide (due to the HZ
value and to the hrtick on/off state). But at least the task will not
starve non-deadline tasks (which is bug that originated this
discussion, I think).

If I understand well, there are two different (and orthogonal) issues
here:
1) Due to a bug in the accounting / enforcement mechanisms
   (the wrong placement of the while() loop), the tasks consumes 100% of
   the CPU time, starving non-deadline tasks
2) Due to the large HZ value, the small runtime (and period) and the
   fact that hrtick is disabled, the kernel cannot provide the
   requested scheduling granularity

The second issue can be fixed by imposing limits on minimum and maximum
runtime and the first issue can be fixed by changing the code as I
suggested in my previous email.

I would suggest to address both the two issues, with separate changes
(the current replenishment code looks strange anyway).



				Luca

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ