lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALCETrX6e+mqfy-rNV3sA8xGVDNHviQ9vHBBhAPULeLecno7XQ@mail.gmail.com>
Date:	Tue, 20 Oct 2015 14:26:34 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	Chris Metcalf <cmetcalf@...hip.com>
Cc:	Gilad Ben Yossef <giladb@...hip.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Ingo Molnar <mingo@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Rik van Riel <riel@...hat.com>, Tejun Heo <tj@...nel.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Christoph Lameter <cl@...ux.com>,
	Viresh Kumar <viresh.kumar@...aro.org>,
	Catalin Marinas <catalin.marinas@....com>,
	Will Deacon <will.deacon@....com>,
	"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
	Linux API <linux-api@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v8 04/14] task_isolation: add initial support

On Tue, Oct 20, 2015 at 2:20 PM, Chris Metcalf <cmetcalf@...hip.com> wrote:
> On 10/20/2015 04:56 PM, Andy Lutomirski wrote:
>>
>> On Tue, Oct 20, 2015 at 1:36 PM, Chris Metcalf <cmetcalf@...hip.com>
>> wrote:
>>>
>>> +/*
>>> + * In task isolation mode we try to return to userspace only after
>>> + * attempting to make sure we won't be interrupted again.  To handle
>>> + * the periodic scheduler tick, we test to make sure that the tick is
>>> + * stopped, and if it isn't yet, we request a reschedule so that if
>>> + * another task needs to run to completion first, it can do so.
>>> + * Similarly, if any other subsystems require quiescing, we will need
>>> + * to do that before we return to userspace.
>>> + */
>>> +bool _task_isolation_ready(void)
>>> +{
>>> +       WARN_ON_ONCE(!irqs_disabled());
>>> +
>>> +       /* If we need to drain the LRU cache, we're not ready. */
>>> +       if (lru_add_drain_needed(smp_processor_id()))
>>> +               return false;
>>> +
>>> +       /* If vmstats need updating, we're not ready. */
>>> +       if (!vmstat_idle())
>>> +               return false;
>>> +
>>> +       /* If the tick is running, request rescheduling; we're not ready.
>>> */
>>> +       if (!tick_nohz_tick_stopped()) {
>>> +               set_tsk_need_resched(current);
>>> +               return false;
>>> +       }
>>> +
>>> +       return true;
>>> +}
>>
>> I still don't get why this is a loop.
>
>
> You mean, why is this code called from prepare_exit_to_userspace()
> in the loop, instead of after the loop?  It's because the actual functions
> that clean up the LRU, vmstat worker, etc., may need interrupts enabled,
> may reschedule internally, etc.  (refresh_cpu_vm_stats() calls
> cond_resched(), for example.)

Yuck.  I guess that's a reasonable argument, although it could also be fixed.

>  Even more importantly, we rely on
> rescheduling to take care of the fact that the scheduler tick may still
> be running, and therefore loop back to the schedule() call that's run
> when TIF_NEED_RESCHED gets set.

This just seems like a mis-design.  We don't know why the scheduler
tick is on, so we're just going to reschedule until the problem goes
away?

>
>> BTW, should isolation just be a scheduler class (SCHED_ISOLATED)?
>
>
> So a scheduler class is an interesting idea certainly, although not
> one I know immediately how to implement.  I'm not sure whether
> it makes sense to require a user be root or have a suitable rtprio
> rlimit, but perhaps so.  The nice thing about the current patch
> series is that you can affinitize yourself to a nohz_full core and
> declare that you want to run task-isolated, and none of that
> requires root nor really is there a reason it should.

Your patches more or less implement "don't run me unless I'm
isolated".  A scheduler class would be more like "isolate me (and
maybe make me super high priority so it actually happens)".

I'm not a scheduler person, so I don't know.  But "don't run me unless
I'm isolated" seems like a design that will, at best, only ever work
by dumb luck.  You have to disable migration, avoid other runnable
tasks, hope that the kernel keeps working the way it did when you
wrote the patch, hope you continue to get lucky enough that you ever
get to user mode in the first place, etc.

--Andy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ