Message-ID: <0e77c310-178e-6bf8-1583-c583cdeb7eaf@codeaurora.org>
Date: Thu, 13 Sep 2018 09:10:27 -0600
From: Jeffrey Hugo <jhugo@...eaurora.org>
To: Brice Goglin <brice.goglin@...il.com>,
Sudeep Holla <sudeep.holla@....com>,
James Morse <james.morse@....com>
Cc: Jeremy Linton <jeremy.linton@....com>, rjw@...ysocki.net,
linux-acpi@...r.kernel.org, linux-kernel@...r.kernel.org,
vkilari@...eaurora.org
Subject: Re: [PATCH] ACPI/PPTT: Handle architecturally unknown cache types
On 9/13/2018 5:53 AM, Brice Goglin wrote:
> On 13/09/2018 at 11:35, Sudeep Holla wrote:
>> On Thu, Sep 13, 2018 at 10:39:10AM +0100, James Morse wrote:
>>> Hi Brice,
>>>
>>> On 13/09/18 06:51, Brice Goglin wrote:
>>>> On 12/09/2018 at 11:49, Sudeep Holla wrote:
>>>>>> Yes. Without this change, we hit the lscpu error in the commit message,
>>>>>> and get zero output about the system. We don't even get information
>>>>>> about the caches which are architecturally specified or how many cpus
>>>>>> are present. With this change, we get what we expect out of lscpu (and
>>>>>> also lstopo) including the cache(s) which are not architecturally
>>>>>> specified.
>>>>>>
>>>>> lscpu and lstopo are so broken. They just assume everything is on CPU0.
>>>>> If you hotplug it out, you start seeing issues. So they read a file
>>>>> that doesn't exist and then bail out on other essential info even though
>>>>> it is present, hmmm ...
>>>> Can you elaborate?
>>>>
>>>> I am not sure cpu0 is supposed to be offlineable on Linux. There's no
>>>> "online" file in /sys/devices/system/cpu/cpu0. That's why former lstopo
>>>> doesn't like CPU0 being hotplugged out. We are actually making that case
>>>> work for another non-standard corner case. But if offlining "cpu0" is
>>>> considered "normal", somebody must add that missing "online" sysfs
>>>> attribute for "cpu0" (change
>>>> https://elixir.bootlin.com/linux/latest/source/drivers/base/cpu.c#L375).
>>> On x86 you can't normally offline CPU0, it's something to do with certain
>>> interrupts always being routed to CPU0, (oh, and hibernate).
>>> You should be able to enable this behaviour with 'cpu0_hotplug' on the kernel
>>> command line.
>>>
>>> (Kconfig's CONFIG_BOOTPARAM_HOTPLUG_CPU0 and CONFIG_DEBUG_HOTPLUG_CPU0 are also
>>> worth a look)
>>>
>>> On arm64 at least, cpu0 is just like the others, and can be offlined.
>>>
>> Thanks James, for providing all the details.
>>
>> To add to the issues I spotted with lscpu/lstopo around topology: they ignore
>> the updates to topology sibling masks when CPUs are hotplugged in and out.
>>
>> We have the following in lscpu:
>>     add_summary_n(tb, _("Core(s) per socket:"),
>>                   cores_per_socket ?: desc->ncores / desc->nsockets);
>>
>> Now when cores_per_socket = 0 (i.e. when we don't have the procfs entry),
>> if ncores = (ncores_max - few_cpus_hotplugged_out), core(s) per socket
>> will be computed as less than the actual number.
>>
>> IMO lscpu should be used only when all CPUs are online, and it should print
>> a warning when not all cores are online.
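
A minimal sketch of the arithmetic described above, not lscpu's actual
code. The GNU "a ?: b" form yields b only when a is zero, so the standard-C
spelling is used here. The function name is hypothetical, and it assumes,
as the quoted message describes, that ncores counts only the cores that are
currently online.

```c
#include <assert.h>

/* Standard-C spelling of the quoted GNU expression
 *     cores_per_socket ?: ncores / nsockets
 * A non-zero cores_per_socket is used as-is; zero is treated as
 * "not provided" and triggers the division fallback. */
static int shown_cores_per_socket(int cores_per_socket, int ncores, int nsockets)
{
    assert(nsockets > 0);  /* avoid dividing by zero in the fallback */
    return cores_per_socket ? cores_per_socket : ncores / nsockets;
}
```

With one socket and 8 cores, shown_cores_per_socket(0, 8, 1) gives 8. After
two cores are hotplugged out, ncores drops to 6 and the same call gives 6,
although the socket still physically has 8 cores.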
>>
>>>> By the way, did anybody actually see an error with lstopo when there's
>>>> no "type" attribute for L3? I can't reproduce any issue, we just skip
>>>> that specific cache entirely, but everything else appears. If you guys
>>>> want to make that "no_cache" cache appear, I'll make it a Unified cache
>>>> unless you tell me what to show :)
>> IIUC, Jeffrey Hugo did see an error, as per his initial message:
>> "
>> This fixes the following lscpu issue where only the cache type sysfs file
>> is missing which results in no output providing a poor user experience in
>> the above system configuration.
>> lscpu: cannot open /sys/devices/system/cpu/cpu0/cache/index3/type: No such
>> file or directory
>> "
>>
>
> I don't know about lscpu (it's a different project), but lstopo
> shouldn't have any such problem.
>
> If you see an issue with lstopo, I'd be interested in getting the
> tarball generated by hwloc-gather-topology (it dumps useful files from
> procfs and sysfs so that we may debug offline).
No error was reported with lstopo, but we don't see the cache as
expected. Fixing the type results in the expected lstopo output. This
seems consistent with your expectations.
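
To illustrate the difference between the two behaviours discussed in this
thread (aborting on a missing "type" file versus carrying on), here is a
rough sketch of a tolerant read of a one-line sysfs attribute. It is not
taken from lscpu or hwloc; the helper name and the "Unknown" placeholder
are assumptions made for the example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read a one-line sysfs attribute such as
 * /sys/devices/system/cpu/cpu0/cache/index3/type into buf.
 * Returns 1 if the file was read, 0 if the placeholder was used. */
static int read_cache_type(const char *path, char *buf, size_t len)
{
    FILE *f;

    if (len == 0)
        return 0;

    f = fopen(path, "r");
    if (f != NULL) {
        if (fgets(buf, (int)len, f) != NULL) {
            fclose(f);
            buf[strcspn(buf, "\n")] = '\0';  /* drop trailing newline */
            return 1;
        }
        fclose(f);
    }

    /* Missing or empty attribute: report a placeholder (assumed label)
     * instead of aborting, so the remaining output can still be printed. */
    snprintf(buf, len, "%s", "Unknown");
    return 0;
}
```

A caller can keep printing the CPU count and the other cache levels when
this returns 0, rather than exiting on the first missing file.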
--
Jeffrey Hugo
Qualcomm Datacenter Technologies as an affiliate of Qualcomm
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.