Message-ID: <ea71a8d7-ba0f-4d43-9304-6544060a1bb6@igalia.com>
Date: Thu, 27 Feb 2025 19:23:23 +0900
From: Changwoo Min <changwoo@...lia.com>
To: Andrea Righi <arighi@...dia.com>, tj@...nel.org
Cc: void@...ifault.com, kernel-dev@...lia.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched_ext: Add trace point to track sched_ext core events
On 25. 2. 27. 17:19, Andrea Righi wrote:
> On Thu, Feb 27, 2025 at 05:05:54PM +0900, Changwoo Min wrote:
>> Hi Andrea,
>>
>> Thank you for the review!
>>
>> On 25. 2. 27. 16:38, Andrea Righi wrote:
>>> Hi Changwoo,
>>>
>>> On Wed, Feb 26, 2025 at 11:33:27PM +0900, Changwoo Min wrote:
>>>> Add tracing support, which may be useful for debugging sched_ext schedulers
>>>> that trigger a certain event.
>>>>
>>>> Signed-off-by: Changwoo Min <changwoo@...lia.com>
>>>> ---
>>>> include/trace/events/sched_ext.h | 21 +++++++++++++++++++++
>>>> kernel/sched/ext.c | 4 ++++
>>>> 2 files changed, 25 insertions(+)
>>>>
>>>> diff --git a/include/trace/events/sched_ext.h b/include/trace/events/sched_ext.h
>>>> index fe19da7315a9..88527b9316de 100644
>>>> --- a/include/trace/events/sched_ext.h
>>>> +++ b/include/trace/events/sched_ext.h
>>>> @@ -26,6 +26,27 @@ TRACE_EVENT(sched_ext_dump,
>>>> )
>>>> );
>>>> +TRACE_EVENT(sched_ext_add_event,
>>>> + TP_PROTO(const char *name, int offset, __u64 added),
>>>> + TP_ARGS(name, offset, added),
>>>> +
>>>> + TP_STRUCT__entry(
>>>> + __string(name, name)
>>>> + __field( int, offset )
>>>> + __field( __u64, added )
>>>> + ),
>>>> +
>>>> + TP_fast_assign(
>>>> + __assign_str(name);
>>>> + __entry->offset = offset;
>>>> + __entry->added = added;
>>>> + ),
>>>> +
>>>> + TP_printk("name %s offset %d added %llu",
>>>> + __get_str(name), __entry->offset, __entry->added
>>>> + )
>>>> +);
>>>
>>> Isn't the name enough to determine which event has been triggered? What are
>>> the benefits of reporting also the offset within struct scx_event_stats?
>>>
>>
>> @name and @offset are duplicated information. However, I thought
>> having two is more convenient from the users' point of view
>> because they have different pros and cons.
>>
>> @offset is quick to compare and can be used easily in the BPF
>> code, but the offset of an event can change across kernel
>> versions when new events are added. @offset would be good to
>> write a quick trace hook for debugging.
>>
>> On the other hand, @name won't change across kernel versions,
>> which is good. However, it requires more code to actually read
>> the string in the BPF code (__data_loc for string is a 32-bit
>> integer encoding string length and location).
>>
>> Does it make sense to you?
> So, IMHO @offset to me would make sense if we guarantee that it won't
> change across kernel versions, and that's probably doable, we just need to
> make sure that we always add new events at the bottom of scx_event_stats.
Keeping the offset across versions is possible if we add new
events to the bottom. However, I am not sure if that is what we
want because we lose the nice logical grouping of the events in
the scx_event_stats struct.
> Otherwise there's the risk to break potential users of this tracepoint that
> may consider the offset like a portable ID.
Hmm... I agree. The @offset would be too low level and could be a
potential source of confusion.
> Maybe we can call it @id or @event_id or similar and guarantee its
> portability? What do you think?
Now I think dropping @offset would be better in the long run
because we can keep scx_event_stats clean and avoid creating
a source of confusion. Regarding the ease of using @name, adding
a code example to the commit message will suffice, something
like this:
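
	#include "vmlinux.h"
	#include <bpf/bpf_helpers.h>

	/* Mirrors the tracepoint record layout once @offset is dropped. */
	struct tp_add_event {
		struct trace_entry ent;
		u32 __data_loc_name;	/* low 16 bits: offset of the string */
		u64 delta;
	};

	SEC("tracepoint/sched_ext/sched_ext_add_event")
	int tp_add_event(struct tp_add_event *ctx)
	{
		char event_name[128];
		/* The string sits at this offset from the start of the record. */
		unsigned short offset = ctx->__data_loc_name & 0xFFFF;

		bpf_probe_read_kernel_str(event_name, sizeof(event_name),
					  (char *)ctx + offset);
		bpf_printk("name %s delta %llu", event_name, ctx->delta);
		return 0;
	}

	/* bpf_printk() needs a GPL-compatible license. */
	char _license[] SEC("license") = "GPL";
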
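For anyone unfamiliar with the __data_loc encoding used above, here is a rough userspace sketch of the decoding, not kernel or BPF code. It assumes the usual layout where the low 16 bits hold the offset from the start of the record and the high 16 bits hold the length including the trailing NUL. The names decode_data_loc() and read_data_loc_str() are hypothetical helpers made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: the two halves of a __data_loc word. */
struct data_loc {
	uint16_t offset;	/* byte offset from the start of the record */
	uint16_t len;		/* string length, assumed to include the NUL */
};

/* Split a 32-bit __data_loc word into offset (low) and length (high). */
static struct data_loc decode_data_loc(uint32_t v)
{
	struct data_loc d = {
		.offset = (uint16_t)(v & 0xFFFF),
		.len = (uint16_t)(v >> 16),
	};
	return d;
}

/*
 * Copy the string referenced by @v out of a raw record buffer.
 * Returns the copied string length (without the NUL), or -1 if the
 * encoded range does not fit inside the record.
 */
static int read_data_loc_str(const void *rec, size_t rec_size, uint32_t v,
			     char *dst, size_t dst_size)
{
	struct data_loc d = decode_data_loc(v);
	size_t n;

	assert(rec && dst);
	if (dst_size == 0 || d.offset > rec_size ||
	    d.len > rec_size - d.offset)
		return -1;

	n = d.len < dst_size ? d.len : dst_size;
	if (n == 0) {
		dst[0] = '\0';
		return 0;
	}
	memcpy(dst, (const char *)rec + d.offset, n);
	dst[n - 1] = '\0';	/* always terminate, even when truncated */
	return (int)strlen(dst);
}
```

The BPF example only needs the offset half because bpf_probe_read_kernel_str() stops at the NUL by itself. The length half mainly matters when parsing raw records in userspace.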
The downside of not having a numerical ID (@offset or @event_id)
is the cost of string comparison to distinguish an event type. If
we assume that probing the event is rare, that cost will be okay.
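To give a feel for that cost, here is a minimal plain-C sketch of mapping an event name to a consumer-side ID with a bounded string comparison. The table contents are only illustrative (use whatever names the tracepoint actually reports), and event_name_to_id() and EVENT_NAME_MAX are hypothetical names, not an existing API. Inside a BPF program the same idea would presumably be written with a bounded helper such as bpf_strncmp().

```c
#include <stddef.h>
#include <string.h>

/* Matches the 128-byte buffer used in the BPF example. */
enum { EVENT_NAME_MAX = 128 };

/* Illustrative names only; the index serves as a local event ID. */
static const char *const known_events[] = {
	"SCX_EV_SELECT_CPU_FALLBACK",
	"SCX_EV_DISPATCH_KEEP_LAST",
};

/* Return the index of @name in known_events, or -1 if it is unknown. */
static int event_name_to_id(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(known_events) / sizeof(known_events[0]); i++) {
		/* One bounded compare per table entry, so O(entries * length). */
		if (strncmp(name, known_events[i], EVENT_NAME_MAX) == 0)
			return (int)i;
	}
	return -1;
}
```

Since the IDs are assigned by the consumer, they stay stable for that consumer no matter how scx_event_stats is laid out in a given kernel.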
@Tejun, @Andrea -- What do you think? Should we provide
a portability-guaranteed @event_id after dropping @offset? Or
would it be more than sufficient to have a string-type event name?
Regards,
Changwoo Min