Message-ID: <Z4mzTDPObuHDIuFO@slm.duckdns.org>
Date: Thu, 16 Jan 2025 15:33:00 -1000
From: Tejun Heo <tj@...nel.org>
To: Changwoo Min <changwoo@...lia.com>
Cc: void@...ifault.com, arighi@...dia.com, kernel-dev@...lia.com,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/7] sched_ext: Implement event counter infrastructure
and add an event
Hello,
On Fri, Jan 17, 2025 at 12:15:37AM +0900, Changwoo Min wrote:
> +/*
> + * Collection of event counters.
> + */
> +struct scx_event_stat {
If we keep this struct, maybe name it scx_event_stats?
> + /*
> + * If ops.select_cpu() returns a CPU which can't be used by the task,
> + * the core scheduler code silently picks a fallback CPU.
> + */
> + u64 INVAL_SELECT_CPU;
Let's do $COMPONENT_$EVENT, so SELECT_CPU_INVALID (or maybe something more
specific than invalid, like disallowed or fallback?).
> +#define SCX_EVENT_IDX(e) (offsetof(struct scx_event_stat, e)/sizeof(u64))
> +#define SCX_EVENT_END_IDX() (sizeof(struct scx_event_stat)/sizeof(u64))
> +#define SCX_EVENT_DEFINE(e) SCX_EVENT_##e = SCX_EVENT_IDX(e)
> +
> +/*
> + * Types of event counters.
> + */
> +enum scx_event_kind {
> + SCX_EVENT_BEGIN = 0,
I think the 0 BEGIN value can be implicit.
> + SCX_EVENT_DEFINE(INVAL_SELECT_CPU),
> + SCX_EVENT_END = SCX_EVENT_END_IDX(),
SCX_NR_EVENTS? Also, we don't really need a macro to count the enums.
> +};
This feels a bit odd to me. Why does it need both the struct fields and
indices? Can we do either one of those?
> +static const char *scx_event_stat_str[] = {
> + [SCX_EVENT_INVAL_SELECT_CPU] = "invalid_select_cpu",
> +};
and hopefully derive this through # macro arg expansion?
> +/*
> + * The event counter is organized by a per-CPU variable to minimize the
> + * accounting overhead without synchronization. A system-wide view on the
> + * event counter is constructed when requested by scx_bpf_get_event_stat().
> + */
> +static DEFINE_PER_CPU(struct scx_event_stat, event_stats);
May be better to include "cpu" in the variable name.
> +/**
> + * scx_add_event - Increase an event counter for 'name' by 'cnt'
> + * @name: an event name defined in struct scx_event_stat
> + * @cnt: the number of the event occured
> + */
> +#define scx_add_event(name, cnt) ({ \
> + struct scx_event_stat *__e; \
> + __e = get_cpu_ptr(&event_stats); \
> + WRITE_ONCE(__e->name, __e->name+ (cnt)); \
> + put_cpu_ptr(&event_stats); \
this_cpu_add()?
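ie. something like the following (kernel-only API, just a sketch) -
this_cpu_add() handles preemption internally and is a single instruction on
most architectures:

```c
#define scx_add_event(name, cnt) do {					\
	this_cpu_add(event_stats.name, (cnt));				\
} while (0)
```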
> @@ -3607,8 +3667,10 @@ static int select_task_rq_scx(struct task_struct *p, int prev_cpu, int wake_flag
> *ddsp_taskp = NULL;
> if (ops_cpu_valid(cpu, "from ops.select_cpu()"))
> return cpu;
> - else
> + else {
> + scx_add_event(INVAL_SELECT_CPU, 1);
> return prev_cpu;
> + }
formatting:
if () {
} else {
}
Also, I'm not sure this is a useful event to count. False ops_cpu_valid()
indicates that the returned CPU is not even possible and the scheduler is
ejected right away. What's more interesting is
kernel/sched/core.c::select_task_rq() tripping on !is_cpu_allowed() and
falling back using select_fallback_rq().
We can either hook into core.c or just compare the ops.select_cpu() picked
CPU against the CPU the task ends up on in enqueue_task_scx().
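For the latter option, the sketch below is roughly what I have in mind;
p->scx.selected_cpu is an assumed field recording ops.select_cpu()'s pick,
not something in the posted patch:

```c
/* hypothetical, in enqueue_task_scx(); count the cases where the task
 * didn't end up on the CPU that ops.select_cpu() picked */
if (unlikely(cpu_of(rq) != p->scx.selected_cpu))
	scx_add_event(SELECT_CPU_FALLBACK, 1);
```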
> @@ -5053,6 +5115,15 @@ static void scx_ops_disable_workfn(struct kthread_work *work)
> scx_rq_clock_invalidate(rq);
> }
>
> + /*
> + * Clear event counters so the next scx scheduler always gets
> + * fresh event counter values.
> + */
> + for_each_possible_cpu(cpu) {
> + struct scx_event_stat *e = per_cpu_ptr(&event_stats, cpu);
> + memset(e, 0, sizeof(*e));
> + }
> +
Wouldn't we want to keep these counters intact on ejection so that the
scheduler's ejection path can capture the counters if necessary? Resetting
on load probably is better.
Thanks.
--
tejun