Message-ID: <57E3C78C.5040400@redhat.com>
Date: Thu, 22 Sep 2016 07:59:08 -0400
From: Prarit Bhargava <prarit@...hat.com>
To: Borislav Petkov <bp@...e.de>
CC: linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Peter Zijlstra <peterz@...radead.org>,
Len Brown <len.brown@...el.com>,
Andi Kleen <ak@...ux.intel.com>, Jiri Olsa <jolsa@...hat.com>,
Juergen Gross <jgross@...e.com>
Subject: Re: [PATCH 0/2 v3] cpu hotplug: Preserve topology directory after
soft remove event
On 09/21/2016 10:01 AM, Borislav Petkov wrote:
> On Wed, Sep 21, 2016 at 09:32:47AM -0400, Prarit Bhargava wrote:
>> This is not the right thing to do [1]. The topology directory should exist as
>> long as the thread is present in the system. The thread (and its core) are
>> still physically there, it's just that the thread is not available to the
>> scheduler. The topology of the thread hasn't changed due to it being soft
>> offlined this way.
>
> So far so good.
>
>> turbostat was modified to deal with the missing topology directory, and in tree
>> utility cpupower prints out significantly less information when a thread is
>> offline.
>
> Why does it do that? Why does an offlined core change that info?
>
> Concrete details please.
>
>> ISTR a powertop bug due to hotplug too. This makes these monitoring
>> utilities a problem for users who want only one thread per core.
>
> one thread per core? What does that mean?
Systems (usually) boot with 2 threads/core. Some performance users want
one thread per core. Since there is no "noht" option anymore, users use /sys to
disable a thread on each core.
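
As a rough sketch of that workflow (not taken from any existing tool): the
helper below takes the contents of each CPU's topology/thread_siblings_list
and returns the CPUs one would offline to leave a single thread per core. The
function names, and the policy of keeping the lowest-numbered sibling, are
assumptions made for illustration.

```python
def parse_cpu_list(text):
    """Parse a sysfs CPU list such as "0,12" or "0-3,8" into sorted ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty list (e.g. for an offline CPU) yields nothing
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def threads_to_offline(siblings_by_cpu):
    """Given {cpu: thread_siblings_list text}, return the CPUs to offline.

    Assumed policy: keep the lowest-numbered thread of every core online.
    """
    offline = set()
    for text in siblings_by_cpu.values():
        siblings = parse_cpu_list(text)
        offline.update(siblings[1:])  # everything except the first sibling
    return sorted(offline)
```

Each CPU N in the result would then be disabled by writing 0 to
/sys/devices/system/cpu/cpuN/online.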
>
>> This now means that
>>
>> echo 0 > /sys/devices/system/cpu/cpu29/online
>>
>> will result in the thread's topology directory staying around until the struct
>> device associated with it is destroyed upon a physical socket hotplug event.
>
> So your 2/2 says that on an offlined CPU, you have
>
> /sys/devices/system/cpu/cpu10/topology/core_id:3
> /sys/devices/system/cpu/cpu10/topology/core_siblings:0000
> /sys/devices/system/cpu/cpu10/topology/core_siblings_list:
> /sys/devices/system/cpu/cpu10/topology/physical_package_id:0
> /sys/devices/system/cpu/cpu10/topology/thread_siblings:0000
> /sys/devices/system/cpu/cpu10/topology/thread_siblings_list:
>
> and this information is bollocks. core_siblings is 0, thread_siblings
> is 0. You can just as well not have them there at all.
core_siblings and thread_siblings list the sibling cores and threads of an
online thread that are available to the scheduler, so they should be 0 when the
thread is offline. That comes directly from reading the code.
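
To make the mask semantics concrete, here is a minimal sketch of how a tool
might decode core_siblings or thread_siblings. It assumes the usual sysfs
bitmap format (hex digits, most significant first, with commas between 32-bit
groups on larger systems); mask_to_cpus is a hypothetical helper name.

```python
def mask_to_cpus(mask_text):
    """Convert a sysfs CPU mask string into a sorted list of CPU numbers.

    Commas only separate 32-bit groups, so they can be dropped before
    parsing the whole string as one hexadecimal number.
    """
    value = int(mask_text.strip().replace(",", ""), 16)
    cpus = []
    bit = 0
    while value:
        if value & 1:
            cpus.append(bit)  # bit N set means CPU N is a sibling
        value >>= 1
        bit += 1
    return cpus
```

With this decoding, the all-zero mask of an offline thread becomes an empty
sibling list rather than an error.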
>
> So is this whole jumping around just so that you can have a
> /sys/devices/system/cpu/cpu10/topology directory and so that tools don't
> get confused by it missing?
Yes.
>
> So again, what exactly are those tools accessing and how does the
> offlined cores puzzle them?
>
> A concrete example please:
>
See commit 20102ac5bee3 ("cpupower: cpupower monitor reports uninitialized
values for offline cpus"). That patch papers over the bug of not being able to
find core_id and physical_package_id for an offline thread.
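
For illustration only, this is roughly the kind of defensive read a
monitoring tool ends up needing while the topology directory can disappear.
It is a sketch, not the cpupower code: read_topology and its sysfs_root
parameter are hypothetical, and returning None for a missing file is an
assumed convention.

```python
from pathlib import Path


def read_topology(cpu, sysfs_root="/sys/devices/system/cpu"):
    """Return core_id and physical_package_id for a CPU, or None for each
    value that cannot be read (for example, a missing topology directory)."""
    topo = Path(sysfs_root) / f"cpu{cpu}" / "topology"
    result = {}
    for name in ("core_id", "physical_package_id"):
        try:
            result[name] = int((topo / name).read_text().strip())
        except (OSError, ValueError):
            # File absent or unparsable: report "unknown" instead of
            # leaving the value uninitialized.
            result[name] = None
    return result
```

If the directory is preserved across a soft offline, the None branch is only
reached for threads that are genuinely absent.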
P.