Message-ID: <41fdc3ec-7082-41f9-99b5-ab28838d9ec1@igalia.com>
Date: Tue, 10 Dec 2024 16:21:00 +0900
From: Changwoo Min <changwoo@...lia.com>
To: Andrea Righi <arighi@...dia.com>
Cc: tj@...nel.org, void@...ifault.com, mingo@...hat.com,
 peterz@...radead.org, kernel-dev@...lia.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 2/6] sched_ext: Implement scx_rq_clock_update/stale()

Hello Andrea,

Thank you for the review.

On 2024-12-09 18:40, Andrea Righi wrote:
>> @@ -766,9 +767,11 @@ struct scx_rq {
>>   	unsigned long		ops_qseq;
>>   	u64			extra_enq_flags;	/* see move_task_to_local_dsq() */
>>   	u32			nr_running;
>> -	u32			flags;
>>   	u32			cpuperf_target;		/* [0, SCHED_CAPACITY_SCALE] */
>>   	bool			cpu_released;
>> +	u32			flags;
>> +	u64			clock;			/* current per-rq clock -- see scx_bpf_now_ns() */
>> +	u64			prev_clock;		/* previous per-rq clock -- see scx_bpf_now_ns() */
> 
> Since we're reordering this struct, we may want to move cpu_released all
> the way to the bottom to get rid of the 3-bytes hole (and still have
> flags, clock and prev_clock in the same cacheline).

We'd better keep the layout as it is: moving cpu_released to the
end of the struct creates a 4-byte hole between flags and clock,
plus 7 bytes of padding at the end after cpu_released.
I double-checked the two layouts using pahole.
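For illustration, here is a simplified pahole-style view of the two
layouts (offsets are hypothetical: they assume nr_running starts
8-byte aligned and ignore the other scx_rq fields):

	/* layout kept in this patch */
	u32	nr_running;	/* offset  0, size 4 */
	u32	cpuperf_target;	/* offset  4, size 4 */
	bool	cpu_released;	/* offset  8, size 1 */
				/* 3-byte hole       */
	u32	flags;		/* offset 12, size 4 */
	u64	clock;		/* offset 16, size 8 */
	u64	prev_clock;	/* offset 24, size 8 */
				/* no tail padding   */

	/* with cpu_released moved to the bottom */
	u32	nr_running;	/* offset  0, size 4 */
	u32	cpuperf_target;	/* offset  4, size 4 */
	u32	flags;		/* offset  8, size 4 */
				/* 4-byte hole       */
	u64	clock;		/* offset 16, size 8 */
	u64	prev_clock;	/* offset 24, size 8 */
	bool	cpu_released;	/* offset 32, size 1 */
				/* 7-byte tail padding */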


> Nit, this is just personal preference (feel free to ignore it):
> 
> 	if (!scx_enabled())
> 		return;
> 	rq->scx.prev_clock = rq->scx.clock;
> 	rq->scx.clock = clock;
> 	rq->scx.flags |= SCX_RQ_CLK_VALID;
> 
That's prettier. I will change it as you suggested.


> I'm wondering if we need to invalidate the clock on all rqs when we call
> scx_ops_enable() to prevent getting stale information from a previous
> scx scheduler.
> 
> Probably it's not an issue, since scx_ops_disable_workfn() should make
> sure that all tasks are going through rq_unpin_lock() before unloading
> the current scheduler, maybe it could be helpful to add comment about
> this scenario in scx_bpf_now_ns() (PATCH 4/6)?

That's a good catch. In theory, there is a possibility that an
scx_rq is not invalidated when unloading the sched_ext scheduler.
Since scx_ops_disable_workfn() iterates over all the sched_ext
tasks, an rq would not be invalidated if there is no scx task on it.
In the next version, I will add code that iterates over all scx_rqs
and invalidates them in scx_ops_disable_workfn().
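For what it's worth, the invalidation could look roughly like the
sketch below (scx_clear_rq_clocks is a hypothetical helper name,
and the locking may differ in the actual patch):

	/*
	 * Minimal sketch: clear SCX_RQ_CLK_VALID on every runqueue so
	 * the next scheduler cannot observe a clock left behind by the
	 * previous one.
	 */
	static void scx_clear_rq_clocks(void)
	{
		int cpu;

		for_each_possible_cpu(cpu) {
			struct rq *rq = cpu_rq(cpu);
			struct rq_flags rf;

			rq_lock_irqsave(rq, &rf);
			rq->scx.flags &= ~SCX_RQ_CLK_VALID;
			rq_unlock_irqrestore(rq, &rf);
		}
	}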

Thank you again!
Changwoo Min

