linux-kernel - Re: [PATCH V5 2/2] tracing, perf : add cpu hotplug trace events

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <alpine.LFD.2.00.1103021056460.2701@localhost6.localdomain6>
Date:	Wed, 2 Mar 2011 11:08:45 +0100 (CET)
From:	Thomas Gleixner <tglx@...utronix.de>
To:	Vincent Guittot <vincent.guittot@...aro.org>
cc:	linux-kernel@...r.kernel.org, linux-hotplug@...r.kernel.org,
	Frederic Weisbecker <fweisbec@...il.com>,
	Steven Rostedt <rostedt@...dmis.org>, amit.kucheria@...aro.org,
	Rusty Russell <rusty@...tcorp.com.au>,
	Ingo Molnar <mingo@...e.hu>
Subject: Re: [PATCH V5 2/2] tracing, perf : add cpu hotplug trace events

Vincent,

On Mon, 28 Feb 2011, Vincent Guittot wrote:
> On 24 February 2011 19:40, Thomas Gleixner <tglx@...utronix.de> wrote:
> > On Thu, 24 Feb 2011, Vincent Guittot wrote:

Sorry, this mail got eaten somehow, but I got it from my archive.

> >>  #ifdef CONFIG_SMP
> >>  /* Serializes the updates to cpu_online_mask, cpu_present_mask */
> >>  static DEFINE_MUTEX(cpu_add_remove_lock);
> >> @@ -197,10 +200,13 @@ struct take_cpu_down_param {
> >>  static int __ref take_cpu_down(void *_param)
> >>  {
> >>       struct take_cpu_down_param *param = _param;
> >> +     unsigned int cpu = (unsigned int)(param->hcpu);
> >>       int err;
> >>
> >>       /* Ensure this CPU doesn't handle any more interrupts. */
> >> +     trace_cpu_hotplug_disable_start(cpu);
> >>       err = __cpu_disable();
> >> +     trace_cpu_hotplug_disable_end(cpu);
> >
> > How useful. What about recording the return code of __cpu_disable()?
> >
> 
> The goal is to monitor the cpu hotplug activity and duration. I want
> to detect 2 kind of cpu_down/cpu_up call, ones which succeed to
> unplug/plug a core and ones which don't. But I'm not sure that we need
> to sort the failed calls into to the trace. We trace them because too
> much fails could point out a bug or a wrong use of cpu hotplug.

This does not make sense at all. You want to see the failures, then
recording the error code makes even more sense. Your way of decoding
the error case by checking whether the next trace entry is there or
missing is just sloppy.
 
> >>       if (err < 0)
> >>               return err;
> >
> >> +     trace_cpu_hotplug_down_start(cpu);
> >> +
> >
> > What's the point of this tracepoint _BEFORE_ the cpu_hotplug_disabled
> > check without recording cpu_hotplug_disabled ?
> >
> 
> I want to trace all cpu_down call even those which returns immediately
> which will be part of the failed calls.

Decoding failure from missing entries is simply wrong.
 
> >>       if (cpu_hotplug_disabled) {
> >>               err = -EBUSY;
> >>               goto out;
> >> @@ -284,6 +294,8 @@ int __ref cpu_down(unsigned int cpu)
> >>       err = _cpu_down(cpu, 0);
> >>
> >>  out:
> >> +     trace_cpu_hotplug_down_end(cpu);
> >
> > And this one is misplaced as well. It wants to be only called when we
> > actually called _cpu_down() and it wants to record the return code as
> > well.
> >
> 
> It has been placed here to be called each time
> trace_cpu_hotplug_down_start is called.

That does not make sense at all. You cannot distinguish between
cpu_hotplug_disabled set and the _cpu_down() being called case. Or do
you want to tell me that you decode that from the time stamps? Hell
no. We want traces which are readable w/o crystal balls.
 
Thanks,

	tglx