Message-ID: <20180723141322.GZ2458@hirez.programming.kicks-ass.net>
Date:   Mon, 23 Jul 2018 16:13:22 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Patrick Bellasi <patrick.bellasi@....com>
Cc:     Alessio Balsini <alessio.balsini@...il.com>,
        linux-kernel@...r.kernel.org,
        Joel Fernandes <joel@...lfernandes.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Tommaso Cucinotta <tommaso.cucinotta@...tannapisa.it>,
        Luca Abeni <luca.abeni@...tannapisa.it>,
        Claudio Scordino <claudio@...dence.eu.com>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Ingo Molnar <mingo@...hat.com>
Subject: Re: [RFC PATCH] sched/deadline: sched_getattr() returns absolute
 dl-task information

On Mon, Jul 23, 2018 at 01:49:46PM +0100, Patrick Bellasi wrote:
> On 23-Jul 11:49, Peter Zijlstra wrote:
> 
> [...]
> 
> > > -void __getparam_dl(struct task_struct *p, struct sched_attr *attr)
> > > +void __getparam_dl(struct task_struct *p, struct sched_attr *attr,
> > > +		   unsigned int flags)
> > >  {
> > >  	struct sched_dl_entity *dl_se = &p->dl;
> > >  
> > >  	attr->sched_priority = p->rt_priority;
> > > -	attr->sched_runtime = dl_se->dl_runtime;
> > > -	attr->sched_deadline = dl_se->dl_deadline;
> > > +
> > > +	if (flags & SCHED_GETATTR_FLAGS_DL_ABSOLUTE) {
> > > +		/*
> > > +		 * If the task is not running, its runtime is already
> > > +		 * properly accounted. Otherwise, update clocks and the
> > > +		 * statistics for the task.
> > > +		 */
> > > +		if (task_running(task_rq(p), p)) {
> > > +			struct rq_flags rf;
> > > +			struct rq *rq;
> > > +
> > > +			rq = task_rq_lock(p, &rf);
> > > +			sched_clock_tick();
> > > +			update_rq_clock(rq);
> > > +			task_tick_dl(rq, p, 0);
> > 
> > Do we really want task_tick_dl() here, or update_curr_dl()?
> 
> I think this was to cover the case of a syscall being called while the
> task is running and we are midway between two ticks...

Sure, I know what it's there for, just saying that update_curr_dl()
would've updated the accounting as well. Calling tick stuff from !tick
context is a wee bit dodgy.

> > Also, who says the task still is dl ? :-)
> 
> Good point, but what should be the rule in general for these cases?
> 
> We already have:
> 
>    SYSCALL_DEFINE4(sched_getattr())
>        ....
>        if (task_has_dl_policy(p))
>             __getparam_dl(p, &attr);
> 
> which is also potentially racy, isn't it?

Yes, but only insofar as the whole syscall is racy by definition.
Even if we'd lock the rq and get the absolutely accurate values,
everything can change the moment we release the locks and return to
userspace again.

> Or just make the syscall return the most updated metrics for all the
> scheduling classes since we cannot grant the user anything about what
> the task will be once we return to userspace?

This.

> > > +			task_rq_unlock(rq, p, &rf);
> > > +		}
> > > +
> > > +		/*
> > > +		 * If the task is throttled, this value could be negative,
> > > +		 * but sched_runtime is unsigned.
> > > +		 */
> > > +		attr->sched_runtime = dl_se->runtime <= 0 ? 0 : dl_se->runtime;
> > > +		attr->sched_deadline = dl_se->deadline;
> > 
> > This is all very racy..
> > 
> > Even if the task wasn't running when you did the task_running() test, it
> > could be running now. And if it was running, it might not be running
> > anymore by the time you've acquired the rq->lock.
> 
> Which means we should use something like:
> 
>    if (flags & SCHED_GETATTR_FLAGS_DL_ABSOLUTE) {
>         /* Lock the task and the RQ before any other check and update */
>         rq = task_rq_lock(p, &rf);
> 
>         /* Check the task is still DL ?*/
> 
>         /* Update task stats */
> 
>         task_rq_unlock(rq, p, &rf);
>    }
> 
> right?

Yeah, something along those lines.

> If that's better, then we should probably even better move the
> task_rq_lock at the beginning of SYSCALL_DEFINE4(sched_getattr()) ?

Hurm.. yes, we should probably have the has_dl_policy test under the
lock too. Which is really annoying, because this basically turns a
lockless syscall into a locked one.

Another method would be to have __getparam_dl() 'fail' and retry if it
finds !has_dl_policy() once we have the lock. That would retain the
lockless nature for all current use-cases and only incur the locking
overhead for this new case.
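
To make the 'fail and retry' idea concrete, here is a rough userspace
sketch, not kernel code. All toy_* names are made up for illustration, a
pthread mutex stands in for rq->lock, and it assumes that whoever changes
the policy does so while holding that lock. The existing (relative) path
stays lockless; only the absolute path takes the lock, re-checks the
policy under it, and asks the caller to retry if the task stopped being
deadline in the meantime.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the scheduling policies. */
enum toy_policy { TOY_NORMAL, TOY_DEADLINE };

struct toy_task {
	pthread_mutex_t lock;	/* stands in for rq->lock */
	_Atomic int policy;	/* readable without the lock */
	uint64_t rel_runtime;	/* static parameters, as set by the user */
	uint64_t rel_deadline;
	int64_t runtime;	/* remaining runtime; negative while throttled */
	uint64_t deadline;	/* absolute deadline */
};

struct toy_attr {
	int policy;
	uint64_t runtime;
	uint64_t deadline;
};

/*
 * Locked path: re-check the policy under the lock. Returns false
 * ("fail") if the task is no longer a deadline task, so that the
 * caller can retry from the lockless policy check.
 */
static bool toy_getparam_dl_abs(struct toy_task *t, struct toy_attr *a)
{
	bool is_dl;

	pthread_mutex_lock(&t->lock);
	is_dl = atomic_load(&t->policy) == TOY_DEADLINE;
	if (is_dl) {
		/* The remaining runtime can be negative; the field is unsigned. */
		a->runtime = t->runtime <= 0 ? 0 : (uint64_t)t->runtime;
		a->deadline = t->deadline;
	}
	pthread_mutex_unlock(&t->lock);
	return is_dl;
}

static void toy_getattr(struct toy_task *t, struct toy_attr *a, bool absolute)
{
	assert(t && a);
retry:
	/* Lockless policy check, as in the current syscall. */
	a->policy = atomic_load(&t->policy);
	a->runtime = 0;
	a->deadline = 0;
	if (a->policy != TOY_DEADLINE)
		return;

	if (!absolute) {
		/* Existing use-case: static parameters, no lock taken. */
		a->runtime = t->rel_runtime;
		a->deadline = t->rel_deadline;
		return;
	}

	/* New use-case only: pay for the lock, retry on policy change. */
	if (!toy_getparam_dl_abs(t, a))
		goto retry;
}
```

The values returned can of course still be stale by the time the caller
looks at them; the retry only guarantees that the policy and the
absolute values were consistent with each other at the time they were
read.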
