[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aee872be-c34d-729b-575d-df488a46f2cf@codeaurora.org>
Date: Fri, 16 Feb 2018 17:48:21 -0800
From: Raghavendra Rao Ananta <rananta@...eaurora.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: alexander.shishkin@...ux.intel.com, jolsa@...hat.com,
namhyung@...nel.org, mingo@...hat.com,
linux-kernel@...r.kernel.org, psodagud@...eaurora.org,
tsoni@...eaurora.org, skannan@...eaurora.org
Subject: Re: [PATCH 1/1] perf: Add CPU hotplug support for events
On 02/16/2018 12:39 PM, Peter Zijlstra wrote:
> On Fri, Feb 16, 2018 at 10:06:29AM -0800, Raghavendra Rao Ananta wrote:
>>> No this is absolutely disguisting. You can simply keep the events in the
>>> dead CPU's context. It's really not that hard.
>> Keeping the events in the dead CPU's context was also an idea that we had.
>> However, detaching that event from the PMU when the CPU is offline would be
>> a pain. Consider the scenario in which an event is about to be destroyed
>> when the CPU is offline (yet still attached to the CPU). During it's
>> destruction, a cross-cpu call is made (from perf_remove_from_context()) to
>> the offlined CPU to detach the event from the CPU's PMU. As the CPU is
>> offline, that would not be possible, and again a separate logic has to be
>> written for cleaning up the events whose CPUs are offlined.
>
> That is actually really simple to deal with. The real problems are with
> semantics, is an event enabled when the CPU is dead? Can you
> disable/enable an event on a dead CPU.
>
> The below patch (_completely_ untested) should do most of it, but needs
> help with the details. I suspect we want to allow enable/disable on
> events that are on a dead CPU, and equally I think we want to account
> the time an enabled event spends on a dead CPU to go towards the
> 'enabled' bucket.
I've gone through your diff, and it gave me a hint of similar texture
what we are trying to do (except for maintaining an offline event list).
Nevertheless, I tried to test your patch. I created an hw event, and
tried to offline the CPU in parallel, and I immediately hit a watchdog
soft lockup bug! Tried the same this by first switching off the CPU
(without any event created), and I hit into similar issue. I am sure we
can fix it, but apart from the "why we are doing hotplug?" question, was
was there specifically any issue with our patch?
>
>>> Also, you _still_ don't explain why you care about dead CPUs.
>>>
I wanted to understand, if we no longer care about hotplugging of CPUs,
then why do we still have exported symbols such as cpu_up() and
cpu_down()? Moreover, we also have the hotplug interface exposed to
users-space as well (through sysfs). As long as these interfaces exist,
there's always a potential chance of bringing the CPU up/down. Can you
please clear this thing up for me?
-- Raghavendra
--
Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
Powered by blists - more mailing lists