Message-ID:
 <AM6PR03MB5080C1F0E0F10BCE67101F6F99CD2@AM6PR03MB5080.eurprd03.prod.outlook.com>
Date: Thu, 27 Feb 2025 21:23:20 +0000
From: Juntong Deng <juntong.deng@...look.com>
To: Tejun Heo <tj@...nel.org>
Cc: ast@...nel.org, daniel@...earbox.net, john.fastabend@...il.com,
 andrii@...nel.org, martin.lau@...ux.dev, eddyz87@...il.com, song@...nel.org,
 yonghong.song@...ux.dev, kpsingh@...nel.org, sdf@...ichev.me,
 haoluo@...gle.com, jolsa@...nel.org, memxor@...il.com, void@...ifault.com,
 arighi@...dia.com, changwoo@...lia.com, bpf@...r.kernel.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH sched_ext/for-6.15 v3 3/5] sched_ext: Add
 scx_kfunc_ids_ops_context_sensitive for unified filtering of
 context-sensitive SCX kfuncs

On 2025/2/27 20:25, Tejun Heo wrote:

>> +/* Each flag corresponds to a btf kfunc id set */
>> +enum scx_ops_kf_flags {
>> +	SCX_OPS_KF_ANY			= 0,
>> +	SCX_OPS_KF_UNLOCKED		= 1 << 1,

> nit: Any specific reason to skip bit 0?

Thanks for your reply.

This is a mistake and will be fixed in the next version.

>> +	[SCX_OP_IDX(exit_task)]			= SCX_OPS_KF_ANY,
>> +	[SCX_OP_IDX(enable)]			= SCX_OPS_KF_ANY,
>> +	[SCX_OP_IDX(disable)]			= SCX_OPS_KF_ANY,
>> +	[SCX_OP_IDX(dump)]			= SCX_OPS_KF_DISPATCH,

> Shouldn't this be SCX_OPS_KF_UNLOCKED?

This is another mistake and will be fixed in the next version.

> Hello,
> 
> On Wed, Feb 26, 2025 at 07:28:18PM +0000, Juntong Deng wrote:
>> +BTF_KFUNCS_START(scx_kfunc_ids_ops_context_sensitive)
>> +/* scx_kfunc_ids_select_cpu */
>> +BTF_ID_FLAGS(func, scx_bpf_select_cpu_dfl, KF_RCU)
>> +/* scx_kfunc_ids_enqueue_dispatch */
>> +BTF_ID_FLAGS(func, scx_bpf_dsq_insert, KF_RCU)
>> +BTF_ID_FLAGS(func, scx_bpf_dsq_insert_vtime, KF_RCU)
>> +BTF_ID_FLAGS(func, scx_bpf_dispatch, KF_RCU)
>> +BTF_ID_FLAGS(func, scx_bpf_dispatch_vtime, KF_RCU)
>> +/* scx_kfunc_ids_dispatch */
>> +BTF_ID_FLAGS(func, scx_bpf_dispatch_nr_slots)
>> +BTF_ID_FLAGS(func, scx_bpf_dispatch_cancel)
>> +BTF_ID_FLAGS(func, scx_bpf_dsq_move_to_local)
>> +BTF_ID_FLAGS(func, scx_bpf_consume)
>> +/* scx_kfunc_ids_cpu_release */
>> +BTF_ID_FLAGS(func, scx_bpf_reenqueue_local)
>> +/* scx_kfunc_ids_unlocked */
>> +BTF_ID_FLAGS(func, scx_bpf_create_dsq, KF_SLEEPABLE)
>> +/* Intersection of scx_kfunc_ids_dispatch and scx_kfunc_ids_unlocked */
>> +BTF_ID_FLAGS(func, scx_bpf_dsq_move_set_slice)
>> +BTF_ID_FLAGS(func, scx_bpf_dsq_move_set_vtime)
>> +BTF_ID_FLAGS(func, scx_bpf_dsq_move, KF_RCU)
>> +BTF_ID_FLAGS(func, scx_bpf_dsq_move_vtime, KF_RCU)
>> +BTF_ID_FLAGS(func, scx_bpf_dispatch_from_dsq_set_slice)
>> +BTF_ID_FLAGS(func, scx_bpf_dispatch_from_dsq_set_vtime)
>> +BTF_ID_FLAGS(func, scx_bpf_dispatch_from_dsq, KF_RCU)
>> +BTF_ID_FLAGS(func, scx_bpf_dispatch_vtime_from_dsq, KF_RCU)
>> +BTF_KFUNCS_END(scx_kfunc_ids_ops_context_sensitive)
> 
> I'm not a big fan of repeating the kfuncs. This is going to be error-prone.
> Can't it register and test the existing sets in the filter function instead?
> If that's too cumbersome, maybe we can put these in a macro so that we don't
> have to repeat the functions?
> 

The core idea of the current design is to separate the kfunc id sets used
for filtering purposes from those used for grouping purposes, so that we
only need one filter and do not need to add a separate filter for each
kfunc id set. So although kfuncs appear in both kinds of kfunc id sets,
the two copies serve different purposes.

scx_kfunc_ids_ops_context_sensitive is used only for filtering purposes
and includes all context-sensitive kfuncs. To determine whether a kfunc
is allowed to be called in, for example, the dispatch context, we rely
on a grouping-purpose kfunc id set such as scx_kfunc_ids_dispatch.

A macro is a good idea; I will try it in the next version.

>> +static int scx_kfunc_ids_ops_context_sensitive_filter(const struct bpf_prog *prog, u32 kfunc_id)
>> +{
>> +	u32 moff, flags;
>> +
>> +	if (!btf_id_set8_contains(&scx_kfunc_ids_ops_context_sensitive, kfunc_id))
>> +		return 0;
>> +
>> +	if (prog->type == BPF_PROG_TYPE_SYSCALL &&
>> +	    btf_id_set8_contains(&scx_kfunc_ids_unlocked, kfunc_id))
>> +		return 0;
> 
> Not from this change but these can probably be allowed from TRACING too.
> 

I am not sure whether it is safe to make these kfuncs available to
TRACING programs. Alexei, if you see this email, could you please leave
a comment?

>> +	if (prog->type == BPF_PROG_TYPE_STRUCT_OPS &&
>> +	    prog->aux->st_ops != &bpf_sched_ext_ops)
>> +		return 0;
> 
> Why can't other struct_ops progs call scx_kfunc_ids_unlocked kfuncs?
> 

Returning 0 here means the call is allowed, so kfuncs in
scx_kfunc_ids_unlocked can still be called by other struct_ops programs.

>> +	/* prog->type == BPF_PROG_TYPE_STRUCT_OPS && prog->aux->st_ops == &bpf_sched_ext_ops*/
>> +
>> +	moff = prog->aux->attach_st_ops_member_off;
>> +	flags = scx_ops_context_flags[SCX_MOFF_IDX(moff)];
>> +
>> +	if ((flags & SCX_OPS_KF_UNLOCKED) &&
>> +	    btf_id_set8_contains(&scx_kfunc_ids_unlocked, kfunc_id))
>> +		return 0;
> 
> Wouldn't this disallow e.g. ops.dispatch() from calling scx_dsq_move()?
> 

No, because

>> [SCX_OP_IDX(dispatch)] = SCX_OPS_KF_DISPATCH | SCX_OPS_KF_ENQUEUE,

Therefore, kfuncs in scx_kfunc_ids_dispatch, including the
scx_bpf_dsq_move_* variants, can be called in the dispatch context.

> Have you tested that the before and after behaviors match?
>

I tested the programs in tools/testing/selftests/sched_ext and
tools/sched_ext, and they all worked fine.

If there are other cases that are not covered, we may need to add new
test cases.

> Thanks.
> 

