Message-ID: <Z4qETRVdvJLDSUHg@slm.duckdns.org>
Date: Fri, 17 Jan 2025 06:24:45 -1000
From: Tejun Heo <tj@...nel.org>
To: Changwoo Min <changwoo@...lia.com>
Cc: void@...ifault.com, arighi@...dia.com, kernel-dev@...lia.com,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/7] sched_ext: Implement event counter infrastructure and add an event
Hello,
On Fri, Jan 17, 2025 at 04:08:45PM +0900, Changwoo Min wrote:
> $COMPONENT_$EVENT sounds more systematic. Thanks for the
> suggestion. What about SELECT_CPU_FALLBACK?
Sounds good.
> > This feels a bit odd to me. Why does it need both the struct fields and
> > indices? Can we do either one of those?
>
> Before submitting the patch set, I tested one approach using
> purely enums storing the actual counters in an array and another
> (current) approach using struct fields. I rejected the first
> approach because the BPF verifier does not allow changing the
> array size, causing the BPF binary compatibility problem. The
That makes sense. It'd be useful to note that on top of the struct
definition.
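E.g. something along these lines (completely untested sketch, comment
wording and the field are just for illustration):

/*
 * struct scx_event_stat - scheduler event counters
 *
 * Counters are plain struct fields rather than an enum-indexed array
 * because the BPF verifier doesn't allow the array size to change
 * across kernel versions, which would break BPF binary compatibility.
 * Struct fields are relocated by CO-RE, so new fields can be appended
 * without breaking existing BPF schedulers.
 */
struct scx_event_stat {
	/* ops.select_cpu()'s pick was overridden by the fallback path */
	u64		SELECT_CPU_FALLBACK;
};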
> second approach preserves BPF binary compatibility through CO-RE.
> However, it is a little cumbersome to iterate over all the events,
> so I added the enums. I know the enums are only usable in kernel
> code due to the binary compatibility constraint, but I thought
> adding the enums would be better than bloating the scx_dump_state()
> code at the end. Does it make sense to you?
Let's just use the structs and open code dumping. We can pack the dump
output better that way too and I don't think BPF scheds would have much use
for enum iterations.
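For the dump side, I'm thinking of something like the following (untested
sketch; assumes the dump_line() helper that scx_dump_state() already uses
and the percpu event_stats variable from this patch):

static void scx_dump_events(struct seq_buf *s)
{
	struct scx_event_stat events = {};
	int cpu;

	/* fold the per-CPU counters into one aggregate struct */
	for_each_possible_cpu(cpu) {
		struct scx_event_stat *e = per_cpu_ptr(&event_stats, cpu);

		events.SELECT_CPU_FALLBACK += READ_ONCE(e->SELECT_CPU_FALLBACK);
	}

	/* one open-coded line per field, no enum iteration needed */
	dump_line(s, "SELECT_CPU_FALLBACK=%llu", events.SELECT_CPU_FALLBACK);
}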
...
> > > +/**
> > > + * scx_add_event - Increase an event counter for 'name' by 'cnt'
> > > + * @name: an event name defined in struct scx_event_stat
> > > + * @cnt: the number of times the event occurred
> > > + */
> > > +#define scx_add_event(name, cnt) ({ \
> > > +	struct scx_event_stat *__e;			\
> > > +	__e = get_cpu_ptr(&event_stats);		\
> > > +	WRITE_ONCE(__e->name, __e->name + (cnt));	\
> > > +	put_cpu_ptr(&event_stats);			\
> >
> > this_cpu_add()?
>
> That's handier. I will change it.
Note that there's __this_cpu_add() too, which can be used from sections of
code that already have preemption disabled. On x86, both variants cost the
same, but on arm __this_cpu_add() is cheaper as it can skip the preemption
off/on. Given that most of these counters are modified while holding the rq
lock, it may make sense to provide a __ version too.
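IOW, something like this (untested; event_stats being the percpu struct
from this patch):

/* usable anywhere - handles preemption on/off as needed */
#define scx_add_event(name, cnt) do {				\
	this_cpu_add(event_stats.name, (cnt));			\
} while (0)

/*
 * Cheaper variant for callers that already have preemption disabled,
 * e.g. while holding an rq lock.
 */
#define __scx_add_event(name, cnt) do {				\
	__this_cpu_add(event_stats.name, (cnt));		\
} while (0)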
...
> > Also, I'm not sure this is a useful event to count. False ops_cpu_valid()
> > indicates that the returned CPU is not even possible and the scheduler is
> > ejected right away. What's more interesting is
> > kernel/sched/core.c::select_task_rq() tripping on !is_cpu_allowed() and
> > falling back using select_fallback_rq().
> >
> > We can either hook into core.c or just compare the ops.select_cpu() picked
> > CPU against the CPU the task ends up on in enqueue_task_scx().
>
> Modifying core.c will be more direct and straightforward. Also,
> I think it would be better to split this commit into two: one
> for the infrastructure and another for the SELECT_CPU_FALLBACK
> event, which touches core.c. I will move the necessary
> infrastructure code into kernel/sched/sched.h, so we can use
> scx_add_event() in core.c.
Hmm.... yeah, both approaches have pros and cons, but I kinda like the idea
of restricting the events to ext.c, and detecting the fallback condition
here is pretty trivial. I don't know.
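To illustrate, roughly (untested, and p->scx.selected_cpu is a made-up
field just for illustration):

	/* in select_task_rq_scx(): remember what ops.select_cpu() returned */
	p->scx.selected_cpu = cpu;

	/* in enqueue_task_scx(): count it if the task landed somewhere else */
	if (unlikely(task_cpu(p) != p->scx.selected_cpu))
		__scx_add_event(SELECT_CPU_FALLBACK, 1);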
Thanks.
--
tejun