Message-ID: <5dc3a40e-f071-3ac8-4bf0-f555b9d94ff1@arm.com>
Date:   Wed, 30 Mar 2022 17:48:34 +0200
From:   Dietmar Eggemann <dietmar.eggemann@....com>
To:     Phil Auld <pauld@...hat.com>
Cc:     linux-kernel@...r.kernel.org,
        Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Peter Zijlstra <peterz@...radead.org>,
        linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH] arch/arm64: Fix topology initialization for core scheduling

On 29/03/2022 21:50, Phil Auld wrote:
> On Tue, Mar 29, 2022 at 08:55:08PM +0200 Dietmar Eggemann wrote:
>> On 29/03/2022 17:20, Phil Auld wrote:
>>> On Tue, Mar 29, 2022 at 04:02:22PM +0200 Dietmar Eggemann wrote:
>>>> On 22/03/2022 17:03, Phil Auld wrote:

[...]

>>> This instance is an HPE Apollo 70 set to smt-4.  I believe it's ThunderX2
>>> chips.
>>>
>>> ARM (CN9980-2200LG4077-Y21-G) 
>> I'm using the same processor just with ACPI/PPTT.
>>
> 
> Maybe I'm misinformed about these systems having no PPTT...  
> 
> I'm reclaiming the system. Is there a way I can tell from userspace?

# cat /sys/firmware/acpi/tables/PPTT > pptt.dat
# iasl -d pptt.dat
# vim pptt.dsl
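
If only the presence of the table matters, a plain existence test on the same
sysfs path is enough. A minimal sketch, assuming the standard
/sys/firmware/acpi/tables layout used in the commands above; the helper name
acpi_table_present() is made up for illustration, and it only tests that the
file exists, it does not read or decode the table:

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: return 1 if the firmware exposes an ACPI table
 * with the given signature (e.g. "PPTT") under sysfs, 0 otherwise.
 */
static int acpi_table_present(const char *sig)
{
	char path[64];
	int n;

	assert(sig != NULL);

	n = snprintf(path, sizeof(path), "/sys/firmware/acpi/tables/%s", sig);
	if (n < 0 || (size_t)n >= sizeof(path))
		return 0;	/* signature too long to be a real table name */

	/* F_OK only checks existence, not read permission */
	return access(path, F_OK) == 0;
}
```

Calling acpi_table_present("PPTT") would then tell whether the iasl step is
worth doing at all.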

[...]

>> so no SMT sched domain. The MPIDR-based topology fallback code in
>> store_cpu_topology() forces `cpuid_topo->thread_id  = -1`.
> 
> Right. So since I'm getting SMT it must not have package_id == -1.
> In which case you should be able to reproduce it because it must
> be that the call to update_siblings_masks() is required.  That
> appears to only be called from store_cpu_topology() which is
> after the scheduler has already set up the core pointers.
> 
> The fix could be the same but I should reword the commit message
> since it should affect all SMT arm systems I'd think.
> 
> Or maybe the ACPI topology code should call update_siblings_masks().
>>
>> IMHO this is why on my machine I don't see this issue while running:
>>
>> root@...-apollo7007:~# stress-ng --prctl 256 -t 60
>> stress-ng: info:  [2388042] dispatching hogs: 256 prctl
>>
>> Is there something I'm missing in my setup to provoke this issue?
>>
> 
> Make sure you have a stress-ng that is new enough and built against
> headers that have the CORE_SCHED prctls defined.

Ah, I was using a pretty old version, 0.11.07. I have now switched to
0.13.12, which includes:

  9038e442b92d - stress-prctl: add Linux 5.14 PR_SCHED_CORE prctl

To get SCHED_CORE activated in stress-prctl.c, as a quick hack, I had to
add the definitions of PR_SCHED_CORE, PR_SCHED_CORE_GET, etc. to this file.
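
For anyone else hitting this with old build headers, the hack amounts to
something like the sketch below. This is not the exact change made to
stress-prctl.c; the constant values are the ones from
include/uapi/linux/prctl.h (Linux 5.14), and the helper
core_sched_get_cookie() is a made-up name used only to show one of the
prctls being called:

```c
#include <assert.h>
#include <errno.h>
#include <sys/prctl.h>

/*
 * Fallback definitions for build environments whose headers predate
 * Linux 5.14. Values taken from include/uapi/linux/prctl.h.
 */
#ifndef PR_SCHED_CORE
#define PR_SCHED_CORE			62
#define PR_SCHED_CORE_GET		0
#define PR_SCHED_CORE_CREATE		1
#define PR_SCHED_CORE_SHARE_TO		2
#define PR_SCHED_CORE_SHARE_FROM	3
#define PR_SCHED_CORE_MAX		4
#endif

/* Catch a mismatch if the system headers do provide the definition */
static_assert(PR_SCHED_CORE == 62, "unexpected PR_SCHED_CORE value");

/*
 * Hypothetical helper: read the core scheduling cookie of the calling
 * thread. Returns 0 on success or -errno on failure (for example when
 * the running kernel lacks core scheduling support).
 */
static int core_sched_get_cookie(unsigned long long *cookie)
{
	/* pid 0 = calling thread, pid type 0 = PIDTYPE_PID */
	if (prctl(PR_SCHED_CORE, PR_SCHED_CORE_GET, 0, 0,
		  (unsigned long)cookie) == -1)
		return -errno;
	return 0;
}
```

The #ifndef guard keeps the fallback from clashing with newer headers that
already define these.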

Now the issue you described triggers on this machine immediately.
