[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <bf41cc53-6f12-483e-92ea-3a57880b0ac1@arm.com>
Date: Thu, 6 Mar 2025 11:10:26 +0100
From: Dietmar Eggemann <dietmar.eggemann@....com>
To: Aboorva Devarajan <aboorvad@...ux.ibm.com>
Cc: Vincent Guittot <vincent.guittot@...aro.org>, mingo@...hat.com,
peterz@...radead.org, juri.lelli@...hat.com, riel@...riel.com,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
vschneid@...hat.com, odin@...d.al, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/1] sched/fair: Fix invalid pointer dereference in
child_cfs_rq_on_list()
On 06/03/2025 05:48, Aboorva Devarajan wrote:
> On Wed, 2025-03-05 at 10:23 +0100, Dietmar Eggemann wrote:
>> On 05/03/2025 09:21, Vincent Guittot wrote:
>>> On Tue, 4 Mar 2025 at 18:00, Aboorva Devarajan <aboorvad@...ux.ibm.com> wrote:
[...]
>>>> A recent attempt to reorder fields in struct rq exposed this issue by
>>>> modifying memory offsets and affecting how pointer computations are
>>>> resolved. While the problem existed before, it was previously masked by
>>>> specific field arrangement. The reordering caused erroneous pointer
>>>> accesses, leading to a NULL dereference and a crash, as seen in the
>>
>> I'm running tip/sched/core on arm64 and I still only see the wrong
>> pointer for 'prev_cfs_rq->tg->parent' in the 'prev ==
>> &rq->leaf_cfs_rq_list' case?
>>
>> ...
>> cpu=5 prev_cfs_rq->tg=ffff00097efb63a0 parent=0000000000000010
>> cfs_rq->tg=ffff000802084000
>> ...
>>
>
> Hi Dietmar,
>
> Yes, you are right, I meant that we will still have invalid pointers and use it
> silently in the vanilla kernel, but it won't always lead to a crash.
>
> The crash in this specific case happens if `prev_cfs_rq->tg` points to a memory
> location that cannot be de-referenced. Otherwise, the function de-references and
> uses memory locations that are not valid but did not cause a visible failure so far.
>
> Here are more details on what I meant by reordering the runqueue:
>
> With the system and kernel configuration, I encountered the crash while trying
> to reorder the runqueue structure, here is the minimal change that caused the
> crash on top of v6.14-rc5 kernel:
Ah, OK. You changed the code locally. Somehow I thought you referred to
a change which is already in mainline (or tip/sched/core) and I was
wondering which one it would be.
[...]
> But looks like a patch similar to this is merged yesterday [1], so this can
> be ignored :)
>
>
> [1] https://lore.kernel.org/all/174119292742.14745.16827644501260146974.tip-bot2@tip-bot2/
Ah, OK, same idea though.
BTW, the:
} else {
prev = rq->tmp_alone_branch;
}
path is taken when dealing with CONFIG_CFS_BANDWIDTH and throttling
scenarios so this is important to cover as well.
Powered by blists - more mailing lists