Message-ID: <4b48fd24-6cd5-474c-bed8-3faac096fd58@arm.com>
Date: Tue, 11 Feb 2025 17:27:47 +0100
From: Dietmar Eggemann <dietmar.eggemann@....com>
To: Hagar Hemdan <hagarhem@...zon.com>
Cc: abuehaze@...zon.com, linux-kernel@...r.kernel.org
Subject: Re: BUG Report: Fork benchmark drop by 30% on aarch64

On 10/02/2025 22:31, Hagar Hemdan wrote:
> On Mon, Feb 10, 2025 at 11:38:51AM +0100, Dietmar Eggemann wrote:
>> On 07/02/2025 12:07, Hagar Hemdan wrote:
>>> On Fri, Feb 07, 2025 at 10:14:54AM +0100, Dietmar Eggemann wrote:
>>>> Hi Hagar,
>>>>
>>>> On 05/02/2025 16:10, Hagar Hemdan wrote:

[...]

>> The 'spawn' tasks in sched_move_task() are 'running' and 'queued' so we
>> call dequeue_task(), put_prev_task(), enqueue_task() and
>> set_next_task().
>>
>> I guess what we need here is the cfs_rq->avg.load_avg (cpu_load() in
>> case of root tg) update in:
>>
>>   task_change_group_fair() -> detach_task_cfs_rq() -> ...,
>>   attach_task_cfs_rq() -> ...
>>
>> since this is used for WF_FORK, WF_EXEC handling in wakeup:
>>
>>   select_task_rq_fair() -> sched_balance_find_dst_cpu() ->
>>   sched_balance_find_dst_group_cpu()
>>
>> in form of 'least_loaded_cpu' and 'load = cpu_load(cpu_rq(i))'.
>>
>> You mentioned AutoGroups (AG). I don't see this issue on my Debian 12
>> Juno-r0 Arm64 board. When I run w/ AG, 'group' is '/' and
>> 'tsk->sched_task_group' is '/autogroup-x' so the condition 'if (group ==
>> tsk->sched_task_group)' isn't true in sched_move_task(). If I disable AG
>> then they match "/" == "/".
>>
>> I assume you run Ubuntu on your AWS instances? What kind of
>> 'cgroup/taskgroup' related setup are you using?
> 
> I'm running AL2023 and use Vanilla kernel 6.13.1 on m6g.xlarge AWS instance.
> AL2023 uses cgroupv2 by default.
>>
>> Can you run w/ this debug snippet w/ and w/o AG enabled?
> 
> I have run that and have attached the trace files to this email.

Thanks!

So w/ AG you see that 'group' and 'tsk->sched_task_group' are both
'/user.slice/user-1000.slice/session-1.scope' so we bail for those tasks
w/o doing the 'cfs_rq->avg.load_avg' update I described above.
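
To make the bail-out concrete, here is a minimal userspace sketch of the
check, not the kernel code itself. The toy_* types and functions and the
'load_updates' counter are made up for illustration; the counter only
stands in for the detach_task_cfs_rq()/attach_task_cfs_rq() work that
would refresh 'cfs_rq->avg.load_avg'.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins, NOT the kernel's struct task_group / struct task_struct. */
struct toy_task_group {
	const char *path;          /* e.g. "/" or "/autogroup-x" */
	unsigned int load_updates; /* how often the detach/attach path ran */
};

struct toy_task {
	const char *comm;
	struct toy_task_group *sched_task_group;
};

/*
 * Models only the early return in sched_move_task(): if the target group
 * is the one the task already belongs to, nothing else happens.
 * Returns true when the (modelled) detach/attach path was taken.
 */
static bool toy_sched_move_task(struct toy_task *tsk,
				struct toy_task_group *group)
{
	if (group == tsk->sched_task_group)
		return false; /* bail: no load update for this task */

	/* Hypothetical placeholder for detach_task_cfs_rq() ... */
	tsk->sched_task_group->load_updates++;
	tsk->sched_task_group = group;
	/* ... and attach_task_cfs_rq(). */
	group->load_updates++;
	return true;
}

/* Usage: one real move followed by a move to the same group. */
static void toy_demo(void)
{
	struct toy_task_group root = { "/", 0 };
	struct toy_task_group ag = { "/autogroup-x", 0 };
	struct toy_task spawn = { "spawn", &ag };

	assert(toy_sched_move_task(&spawn, &root));  /* groups differ */
	assert(!toy_sched_move_task(&spawn, &root)); /* same group: bail */
	assert(root.load_updates == 1 && ag.load_updates == 1);
}
```

In this model the second call corresponds to what your trace shows: both
sides name the same group, so the update path is never reached.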

You said that there is no issue w/o AG. Unfortunately your 'w/o AG'
trace does not contain any evidence that you ran UnixBench's './Run -c 4
spawn', since there are no lines for tasks with p->comm='spawn'. Could
you rerun this, please? My hunch is that 'group' and
'tsk->sched_task_group' differ w/o AG.

[...]
