Message-ID: <20201004231404.GA66364@lothringen>
Date: Mon, 5 Oct 2020 01:14:04 +0200
From: Frederic Weisbecker <frederic@...nel.org>
To: Alex Belits <abelits@...vell.com>
Cc: "mingo@...nel.org" <mingo@...nel.org>,
"davem@...emloft.net" <davem@...emloft.net>,
"linux-api@...r.kernel.org" <linux-api@...r.kernel.org>,
"rostedt@...dmis.org" <rostedt@...dmis.org>,
"peterz@...radead.org" <peterz@...radead.org>,
"linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>,
"catalin.marinas@....com" <catalin.marinas@....com>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"will@...nel.org" <will@...nel.org>,
Prasun Kapoor <pkapoor@...vell.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>
Subject: Re: [EXT] Re: [PATCH v4 03/13] task_isolation: userspace hard
isolation from kernel
On Sun, Oct 04, 2020 at 02:44:39PM +0000, Alex Belits wrote:
> On Thu, 2020-10-01 at 15:56 +0200, Frederic Weisbecker wrote:
> > On Wed, Jul 22, 2020 at 02:49:49PM +0000, Alex Belits wrote:
> > > +/*
> > > + * Description of the last two tasks that ran isolated on a given CPU.
> > > + * This is intended only for messages about isolation breaking. We
> > > + * don't want any references to actual task while accessing this from
> > > + * CPU that caused isolation breaking -- we know nothing about timing
> > > + * and don't want to use locking or RCU.
> > > + */
> > > +struct isol_task_desc {
> > > +	atomic_t curr_index;
> > > +	atomic_t curr_index_wr;
> > > +	bool warned[2];
> > > +	pid_t pid[2];
> > > +	pid_t tgid[2];
> > > +	char comm[2][TASK_COMM_LEN];
> > > +};
> > > +static DEFINE_PER_CPU(struct isol_task_desc, isol_task_descs);
> >
> > So that's quite a huge patch that would have needed to be split up.
> > Especially this tracing engine.
> >
> > Speaking of which, I agree with Thomas that it's unnecessary. It's
> > too much code and complexity. We can use the existing trace events
> > and perform the analysis from userspace to find the source of the
> > disturbance.
>
> The idea behind this is that isolation-breaking events are supposed to
> be known to the applications while they run normally, and handling
> them should not require any analysis or human intervention.
Sure, but you can use trace events for that. Just trace interrupts,
workqueues, timers, syscalls, exceptions and scheduler events and you
get all the local disturbances. You might want to tune a few filters
but that's pretty much it.
As for the source of the disturbances, if you really need that information,
you can trace the workqueue and timer queue events and just filter those that
target your isolated CPUs.
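
For illustration, something along these lines would do it (an untested
sketch; it assumes tracefs is mounted at /sys/kernel/tracing and that
CPU 3 stands in for one of your isolated CPUs):

/*
 * Untested sketch: enable a few event groups, then stream everything
 * that fires on one isolated CPU. The per_cpu trace_pipe only shows
 * events that occurred on that CPU, so no extra filtering is needed.
 */
#include <stdio.h>
#include <stdlib.h>

#define TRACEFS "/sys/kernel/tracing"

static void enable_group(const char *group)
{
	char path[256];
	FILE *f;

	snprintf(path, sizeof(path), TRACEFS "/events/%s/enable", group);
	f = fopen(path, "w");
	if (!f) {
		perror(path);
		return;
	}
	fputs("1", f);
	fclose(f);
}

int main(void)
{
	static const char *groups[] = { "irq", "timer", "workqueue", "sched" };
	char line[1024];
	unsigned int i;
	FILE *pipe;

	for (i = 0; i < sizeof(groups) / sizeof(groups[0]); i++)
		enable_group(groups[i]);

	/* CPU 3 is just an example isolated CPU. */
	pipe = fopen(TRACEFS "/per_cpu/cpu3/trace_pipe", "r");
	if (!pipe) {
		perror("trace_pipe");
		return EXIT_FAILURE;
	}
	while (fgets(line, sizeof(line), pipe))
		fputs(line, stdout);
	fclose(pipe);
	return 0;
}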
>
> A process may exit isolation because some leftover delayed work, for
> example a timer or a workqueue, is still present on a CPU, or because
> a page fault or some other exception, normally handled silently, is
> caused by the task. It is also possible to direct an interrupt to a
> CPU that is running an isolated task -- currently it's perfectly valid
> to set an interrupt's smp affinity to a CPU running an isolated task,
> and then the interrupt will break isolation. While it's probably not
> the best way of handling interrupts, I would rather not prohibit this
> explicitly.
Sure, but you can trace all these events with the tracing interface we
already have.
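
And for the affinity case you mention, the usual userspace answer is to
steer IRQs away from the isolated CPUs beforehand. An untested sketch,
with "0-2" standing in for the housekeeping CPUs:

/*
 * Untested sketch: steer all steerable IRQs away from isolated CPUs
 * by rewriting /proc/irq/<n>/smp_affinity_list. Some IRQs (per-CPU,
 * chained) reject the write; those failures are simply ignored.
 */
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
	const char *housekeeping = "0-2";	/* assumption, not a fixed value */
	struct dirent *d;
	DIR *dir = opendir("/proc/irq");

	if (!dir) {
		perror("/proc/irq");
		return EXIT_FAILURE;
	}
	while ((d = readdir(dir)) != NULL) {
		char path[64];
		FILE *f;

		/* Skip ".", ".." and "default_smp_affinity". */
		if (d->d_name[0] < '0' || d->d_name[0] > '9')
			continue;
		snprintf(path, sizeof(path),
			 "/proc/irq/%s/smp_affinity_list", d->d_name);
		f = fopen(path, "w");
		if (!f)
			continue;
		fprintf(f, "%s\n", housekeeping);
		fclose(f);	/* write may fail for per-CPU IRQs; ignored */
	}
	closedir(dir);
	return 0;
}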
>
> There is also a matter of avoiding race conditions on entering
> isolation. Once a CPU has entered isolation, other CPUs should avoid
> disturbing it when they know that it is running a task in isolated
> mode. However, for a short time after entering isolation, other CPUs
> may be unaware of this and will still send IPIs to it. Preventing
> this scenario completely would be very costly in terms of what other
> CPUs would have to do before notifying others, so, similar to how
> EINTR works, we can simply specify that this is allowed, and the task
> is supposed to re-enter isolation afterwards. It's still a bad idea to
> specify that isolation breaking can continue happening while the
> application is running in isolated mode, however allowing some "grace
> period" after entering is acceptable as long as the application is
> aware of it happening.
Right, but that doesn't look related to tracing. Anyway, I guess we
can make the CPU enter some specific mode after calling synchronize_rcu().
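
A rough sketch of what I mean, with made-up identifiers rather than
anything from the patchset:

/*
 * Rough sketch, not from the patchset: publish the isolated state,
 * then let an RCU grace period elapse. Any remote CPU that read the
 * flag under rcu_read_lock(), saw the old value, and might still send
 * an IPI because of it is guaranteed to be done by the time
 * synchronize_rcu() returns; everything afterwards sees the flag and
 * skips the CPU.
 */
#include <linux/percpu.h>
#include <linux/rcupdate.h>

static DEFINE_PER_CPU(bool, cpu_isolated);

static void enter_isolation_mode(void)
{
	this_cpu_write(cpu_isolated, true);
	synchronize_rcu();
	/*
	 * From here on, stray IPIs are exactly the "grace period"
	 * described above: the task re-enters isolation if one lands.
	 */
}

/* Remote side, called under rcu_read_lock(). */
static bool cpu_accepts_disturbance(int cpu)
{
	return !per_cpu(cpu_isolated, cpu);
}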
>
> In libtmc I have moved this handling of isolation breaking into a
> separate thread, intended to become a separate daemon if necessary. In
> part it was done because the initial implementation of isolation made
> it very difficult to avoid repeating delayed work on isolated CPUs, so
> something had to watch for it from a non-isolated CPU. It's possible
> that now, when delayed work no longer appears on isolated CPUs out of
> nowhere, the need for an isolation manager thread will disappear, and
> the task itself will be able to handle all isolation breaking, as the
> original implementation by Chris intended.
>
> However, in either case it's still useful for the task, or the
> isolation manager, to get a description of the isolation-breaking
> event. This is what those things are intended for. Right now they only
> produce log messages, because the log is where all descriptions of
> isolation-breaking events initially went; however, I would prefer to
> make logging optional but always let applications read those event
> descriptions, regardless of any tracing mechanism being used. I was
> more focused on making the reporting mechanism properly detect the
> cause of isolation breaking, because that functionality was not quite
> working in earlier work by Chris and Yuri, so I have kept logging as
> the only output, but made it suitable for producing events that
> applications will be able to receive. The application, or isolation
> manager, will receive clear and unambiguous reporting, so there will
> be no need for any additional analysis or guesswork.
That still looks like a job for userspace, based on trace events.
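A trivial consumer of the trace_pipe output from the earlier snippet
already gets you a structured "what broke isolation" report. Untested;
report() is a placeholder for whatever notification channel the
application or isolation manager uses:

/*
 * Untested sketch: turn default-format trace lines
 * ("comm-pid [cpu] flags timestamp: event: details"), read from
 * stdin, into per-event reports. report() is a placeholder.
 */
#include <stdio.h>
#include <string.h>

static void report(const char *event, const char *details)
{
	/* Placeholder: hand the event to the application. */
	fprintf(stderr, "isolation broken by %s (%s)", event, details);
}

static void parse_line(char *line)
{
	char *p = strstr(line, ": ");	/* the colon after the timestamp */
	char *event, *end;

	if (!p)
		return;
	event = p + 2;
	end = strchr(event, ':');	/* event names contain no colons */
	if (!end)
		return;
	*end = '\0';
	report(event, end + 1);
}

int main(void)
{
	char line[1024];

	while (fgets(line, sizeof(line), stdin))
		parse_line(line);
	return 0;
}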
Thanks.