Message-ID: <20201006103541.GA31325@lothringen>
Date: Tue, 6 Oct 2020 12:35:41 +0200
From: Frederic Weisbecker <frederic@...nel.org>
To: Nitesh Narayan Lal <nitesh@...hat.com>
Cc: Alex Belits <abelits@...vell.com>,
"mingo@...nel.org" <mingo@...nel.org>,
"davem@...emloft.net" <davem@...emloft.net>,
"linux-api@...r.kernel.org" <linux-api@...r.kernel.org>,
"rostedt@...dmis.org" <rostedt@...dmis.org>,
"peterz@...radead.org" <peterz@...radead.org>,
"linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>,
"catalin.marinas@....com" <catalin.marinas@....com>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"will@...nel.org" <will@...nel.org>,
Prasun Kapoor <pkapoor@...vell.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>
Subject: Re: [EXT] Re: [PATCH v4 03/13] task_isolation: userspace hard
isolation from kernel
On Mon, Oct 05, 2020 at 02:52:49PM -0400, Nitesh Narayan Lal wrote:
>
> On 10/4/20 7:14 PM, Frederic Weisbecker wrote:
> > On Sun, Oct 04, 2020 at 02:44:39PM +0000, Alex Belits wrote:
> >> On Thu, 2020-10-01 at 15:56 +0200, Frederic Weisbecker wrote:
> >>> On Wed, Jul 22, 2020 at 02:49:49PM +0000, Alex Belits wrote:
> >>>> +/*
> >>>> + * Description of the last two tasks that ran isolated on a given CPU.
> >>>> + * This is intended only for messages about isolation breaking. We
> >>>> + * don't want any references to actual task while accessing this from
> >>>> + * CPU that caused isolation breaking -- we know nothing about timing
> >>>> + * and don't want to use locking or RCU.
> >>>> + */
> >>>> +struct isol_task_desc {
> >>>> +	atomic_t curr_index;
> >>>> +	atomic_t curr_index_wr;
> >>>> +	bool warned[2];
> >>>> +	pid_t pid[2];
> >>>> +	pid_t tgid[2];
> >>>> +	char comm[2][TASK_COMM_LEN];
> >>>> +};
> >>>> +static DEFINE_PER_CPU(struct isol_task_desc, isol_task_descs);
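(As an aside, for anyone skimming the thread: the two atomic indices
appear to implement a lock-free two-slot snapshot -- the writer fills
the slot it is not currently exposing, then flips the published index,
so the CPU reporting an isolation break can read the last complete
record without locks or RCU. A minimal userspace sketch of that idea,
with made-up names and C11 atomics instead of the kernel's atomic_t:

#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

#define COMM_LEN 16

/* Two-slot snapshot: the writer publishes into the slot it is NOT
 * currently exposing to readers, then flips the index. A reader racing
 * with two quick successive writes may still see a stale or torn
 * record -- acceptable for diagnostic messages only. */
struct isol_desc {
	atomic_int curr;		/* slot readers should use */
	int pid[2];
	char comm[2][COMM_LEN];
};

static void desc_publish(struct isol_desc *d, int pid, const char *comm)
{
	int idx = 1 - atomic_load_explicit(&d->curr, memory_order_relaxed);

	d->pid[idx] = pid;
	strncpy(d->comm[idx], comm, COMM_LEN - 1);
	d->comm[idx][COMM_LEN - 1] = '\0';
	/* release pairs with the acquire in desc_read() */
	atomic_store_explicit(&d->curr, idx, memory_order_release);
}

static int desc_read(struct isol_desc *d, char *comm, size_t len)
{
	int idx = atomic_load_explicit(&d->curr, memory_order_acquire);

	snprintf(comm, len, "%s", d->comm[idx]);
	return d->pid[idx];
}

The extra curr_index_wr and warned[] bookkeeping in the patch sits on
top of this basic scheme.)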
> >>> So that's quite a huge patch that would have needed to be split up.
> >>> Especially this tracing engine.
> >>>
> >>> Speaking of which, I agree with Thomas that it's unnecessary. It's too
> >>> much code and complexity. We can use the existing trace events and
> >>> perform the analysis from userspace to find the source of the
> >>> disturbance.
> >> The idea behind this is that isolation-breaking events are supposed to
> >> be known to the applications while they run normally, and handling them
> >> should not require any analysis or human intervention.
> > Sure, but you can use trace events for that. Just trace interrupts,
> > workqueues, timers, syscalls, exceptions and scheduler events and you get
> > all the local disturbances. You might want to tune a few filters but
> > that's pretty much it.
> >
> > As for the source of the disturbances, if you really need that information,
> > you can trace the workqueue and timer queue events and just filter those that
> > target your isolated CPUs.
> >
>
> I agree that we can do all those things with tracing.
> However, IMHO having a simplified logging mechanism to gather the source of
> a violation may help in reducing the manual effort.
>
> That said, I am not sure how easy it will be to maintain such an interface
> over time.
The thing is: tracing is your simplified logging mechanism here. You can
achieve the same from userspace with _way_ less code and no races, and you
can do it in bash.
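
To illustrate, here is a rough sketch of such a userspace consumer in C --
assuming tracefs is mounted at /sys/kernel/tracing (older systems have it
under /sys/kernel/debug/tracing), root privileges, and CPU 3 as the
isolated CPU; the bash version is the equivalent echos plus a cat of
trace_pipe:

#include <stdio.h>
#include <stdlib.h>

#define TRACEFS "/sys/kernel/tracing/"

static void tracefs_write(const char *file, const char *val)
{
	char path[256];
	FILE *f;

	snprintf(path, sizeof(path), TRACEFS "%s", file);
	f = fopen(path, "w");
	if (!f) {
		perror(path);
		exit(1);
	}
	fputs(val, f);
	fclose(f);
}

int main(void)
{
	char line[4096];
	FILE *pipe;

	tracefs_write("current_tracer", "nop");
	tracefs_write("tracing_cpumask", "8");	/* hex mask: CPU 3 only */
	tracefs_write("events/irq/enable", "1");
	tracefs_write("events/timer/enable", "1");
	tracefs_write("events/workqueue/enable", "1");
	tracefs_write("events/sched/sched_switch/enable", "1");
	tracefs_write("tracing_on", "1");

	pipe = fopen(TRACEFS "trace_pipe", "r");	/* blocking read */
	if (!pipe) {
		perror("trace_pipe");
		return 1;
	}
	while (fgets(line, sizeof(line), pipe))
		fputs(line, stdout);	/* one line per local disturbance */
	return 0;
}

Every line that shows up on trace_pipe once the application has entered
isolated mode is a disturbance, already timestamped and attributed.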
Thanks.