Message-ID: <afdac388-061e-a403-3b9e-1273cee98509@intel.com>
Date: Tue, 20 Sep 2022 13:23:29 -0700
From: Dave Jiang <dave.jiang@...el.com>
To: Jonathan Cameron <Jonathan.Cameron@...wei.com>,
Ira Weiny <ira.weiny@...el.com>
Cc: Dan Williams <dan.j.williams@...el.com>,
Alison Schofield <alison.schofield@...el.com>,
Vishal Verma <vishal.l.verma@...el.com>,
Ben Widawsky <bwidawsk@...nel.org>,
Steven Rostedt <rostedt@...dmis.org>,
Davidlohr Bueso <dave@...olabs.net>,
linux-kernel@...r.kernel.org, linux-cxl@...r.kernel.org
Subject: Re: [RFC PATCH 1/9] cxl/mem: Implement Get Event Records command

On 9/20/2022 8:49 AM, Jonathan Cameron wrote:
> On Fri, 9 Sep 2022 13:53:55 -0700
> Ira Weiny <ira.weiny@...el.com> wrote:
>
>> On Thu, Sep 08, 2022 at 01:52:40PM +0100, Jonathan Cameron wrote:
>>>
>> [snip]
>>
>>>>>> diff --git a/include/trace/events/cxl-events.h b/include/trace/events/cxl-events.h
>>>>>> new file mode 100644
>>>>>> index 000000000000..f4baeae66cf3
>>>>>> --- /dev/null
>>>>>> +++ b/include/trace/events/cxl-events.h
>>>>>> @@ -0,0 +1,127 @@
>>>>>> +/* SPDX-License-Identifier: GPL-2.0 */
>>>>>> +#undef TRACE_SYSTEM
>>>>>> +#define TRACE_SYSTEM cxl_events
>>>>>> +
>>>>>> +#if !defined(_CXL_TRACE_EVENTS_H) || defined(TRACE_HEADER_MULTI_READ)
>>>>>> +#define _CXL_TRACE_EVENTS_H
>>>>>> +
>>>>>> +#include <linux/tracepoint.h>
>>>>>> +
>>>>>> +#define EVENT_LOGS \
>>>>>> + EM(CXL_EVENT_TYPE_INFO, "Info") \
>>>>>> + EM(CXL_EVENT_TYPE_WARN, "Warning") \
>>>>>> + EM(CXL_EVENT_TYPE_FAIL, "Failure") \
>>>>>> + EM(CXL_EVENT_TYPE_FATAL, "Fatal") \
>>>>>> + EMe(CXL_EVENT_TYPE_MAX, "<undefined>")
>>>>> Hmm. 4 is defined in CXL 3.0, but I'd assume we won't use tracepoints for
>>>>> dynamic capacity events so I guess it doesn't matter.
>>>> I'm not sure why you would say that. I anticipate some user space daemon
>>>> requiring these events to set things up.
>>> Certainly a possible solution. I'd kind of expect a more handshake-based approach
>>> than a tracepoint. Guess we'll see :)
>> Yea I think we should wait and see.
>>
>>>
>>>>>
>>>>>> + { CXL_EVENT_RECORD_FLAG_PERF_DEGRADED, "Performance Degraded" }, \
>>>>>> + { CXL_EVENT_RECORD_FLAG_HW_REPLACE, "Hardware Replacement Needed" } \
>>>>>> +)
>>>>>> +
>>>>>> +TRACE_EVENT(cxl_event,
>>>>>> +
>>>>>> + TP_PROTO(const char *dev_name, enum cxl_event_log_type log,
>>>>>> + struct cxl_event_record_raw *rec),
>>>>>> +
>>>>>> + TP_ARGS(dev_name, log, rec),
>>>>>> +
>>>>>> + TP_STRUCT__entry(
>>>>>> + __string(dev_name, dev_name)
>>>>>> + __field(int, log)
>>>>>> + __array(u8, id, UUID_SIZE)
>>>>>> + __field(u32, flags)
>>>>>> + __field(u16, handle)
>>>>>> + __field(u16, related_handle)
>>>>>> + __field(u64, timestamp)
>>>>>> + __array(u8, data, EVENT_RECORD_DATA_LENGTH)
>>>>>> + __field(u8, length)
>>>>> Do we want the maintenance operation class added in Table 8-42 from CXL 3.0?
>>>>> (only noticed because I happen to have that spec revision open rather than 2.0).
>>>> Yes done.
>>>>
>>>> There is some discussion with Dan regarding not decoding anything and letting
>>>> user space take care of it all. I think this shows a valid reason Dan
>>>> suggested this.
>>> I like being able to print tracepoints without userspace tools.
>>> This also enforces structure and stability of interface which I like.
>> I tend to agree with you.
>>
>>> Maybe a raw tracepoint or variable length trailing buffer to pass
>>> on what we don't understand?
>> I've already realized that we need to print all reserved fields for this
>> reason. If there is something the kernel does not understand user space can
>> just figure it out on its own.
>>
>> Sound reasonable?
> Hmm. Printing reserved fields would be unusual. Not sure what is done for similar
> cases elsewhere, CPER records etc...
>
> We could just print a raw array of the whole event as well as the decoded version, but
> that means logging most of the fields twice...
>
> Not nice either.
>
> I'm a bit inclined to say we should maybe just ignore stuff we don't know about or
> is there a version number we can use to decide between decoded vs decoded as much as
> possible + raw log?

libtraceevent can pull the trace event data structure fields directly,
so the raw data can be read straight from the kernel. What gets printed
to the trace buffer can be decoded data that the kernel code constructs
from those fields. That way you have access to both.
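
A rough userspace-only sketch of the idea, not kernel code and not the
libtraceevent API. The struct, the SKETCH_FLAG_* bit positions, and the
function names are all made up for illustration. The stored entry keeps
the raw field values, the printed form is built from them, and a consumer
that reads the field directly still sees bits the decoder does not know:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical flag bits for illustration; not the kernel's definitions. */
#define SKETCH_FLAG_PERF_DEGRADED	(1u << 0)
#define SKETCH_FLAG_HW_REPLACE		(1u << 1)

/* Stand-in for the raw fields stored in a trace entry. */
struct sketch_event {
	uint32_t flags;
	uint16_t handle;
	uint64_t timestamp;
};

/* "Decoded" view: a printable string constructed from the raw fields. */
static int sketch_decode(const struct sketch_event *ev, char *buf, size_t len)
{
	return snprintf(buf, len, "handle=%u flags='%s%s'",
			(unsigned int)ev->handle,
			(ev->flags & SKETCH_FLAG_PERF_DEGRADED) ?
				"Performance Degraded " : "",
			(ev->flags & SKETCH_FLAG_HW_REPLACE) ?
				"Hardware Replacement Needed " : "");
}

/* "Raw" view: the stored value, including bits the decoder ignores. */
static uint32_t sketch_raw_flags(const struct sketch_event *ev)
{
	return ev->flags;
}
```

With flags set to SKETCH_FLAG_HW_REPLACE plus an unknown bit, the decoded
string only mentions the hardware replacement, while sketch_raw_flags()
returns the full value, so nothing is lost for a tool that wants to decode
newer fields itself.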
>
> Jonathan
>
>> Ira
>>
>>> Jonathan
>>>
>>>