lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAB8ipk8PZoMyNstOssfwscCgdZ2E_dTtThH94QAjwEGEWAR23Q@mail.gmail.com>
Date: Thu, 28 Aug 2025 15:00:12 +0800
From: Xuewen Yan <xuewen.yan94@...il.com>
To: Suren Baghdasaryan <surenb@...gle.com>
Cc: Xuewen Yan <xuewen.yan@...soc.com>, hannes@...xchg.org, peterz@...radead.org, 
	mathieu.desnoyers@...icios.com, mhiramat@...nel.org, rostedt@...dmis.org, 
	mingo@...hat.com, juri.lelli@...hat.com, vincent.guittot@...aro.org, 
	dietmar.eggemann@....com, bsegall@...gle.com, mgorman@...e.de, 
	vschneid@...hat.com, linux-kernel@...r.kernel.org, 
	linux-trace-kernel@...r.kernel.org, ke.wang@...soc.com, yuming.han@...soc.com
Subject: Re: [RFC PATCH V2] sched: psi: Add psi events trace point

On Wed, Aug 27, 2025 at 8:57 AM Suren Baghdasaryan <surenb@...gle.com> wrote:
>
> On Fri, Aug 22, 2025 at 6:51 PM Xuewen Yan <xuewen.yan94@...il.com> wrote:
> >
> > Hi Suren,
> >
> > Thanks for your review:)
> >
> > On Thu, Aug 21, 2025 at 3:51 AM Suren Baghdasaryan <surenb@...gle.com> wrote:
> > >
> > > On Wed, Aug 20, 2025 at 4:28 AM Xuewen Yan <xuewen.yan@...soc.com> wrote:
> > > >
> > > > Add trace point to psi triggers. This is useful to
> > > > observe the psi events in the kernel space.
> > > >
> > > > One use of this is to monitor memory pressure.
> > > > When the pressure is too high, we can kill the process
> > > > in the kernel space to prevent OOM.
> > > >
> > > > Signed-off-by: Xuewen Yan <xuewen.yan@...soc.com>
> > > > ---
> > > > v2:
> > > > -fix compilation error;
> > > > -export the tp;
> > > > -add more commit message;
> > > > ---
> > > >  include/trace/events/sched.h | 5 +++++
> > > >  kernel/sched/psi.c           | 4 ++++
> > > >  2 files changed, 9 insertions(+)
> > > >
> > > > diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
> > > > index 7b2645b50e78..d54db5fcbca2 100644
> > > > --- a/include/trace/events/sched.h
> > > > +++ b/include/trace/events/sched.h
> > > > @@ -896,6 +896,11 @@ DECLARE_TRACE(sched_set_need_resched,
> > > >         TP_PROTO(struct task_struct *tsk, int cpu, int tif),
> > > >         TP_ARGS(tsk, cpu, tif));
> > > >
> > > > +struct psi_trigger;
> > > > +DECLARE_TRACE(psi_event,
> > >
> > > DECLARE_TRACE will create a tracepoint but will not export it in the
> > > tracefs. Why should we not have it in the tracefs?
> >
> > I haven't figured out what content should be displayed in the trace yet.
> > Until this is fully determined, I think it might be a better option to
> > just export the tracepoint and let users add their own hooks to print
> > the content they need.
>
> You can report what you think makes sense today for this tracepoint.
> Tracepoints can be enhanced later if needed since their format is
> exported to the userspace and well-designed userspace parsers should
> be able to parse new fields.

Okay, I would change this to be tracepoint.

>
> >
> > >
> > > > +       TP_PROTO(struct psi_trigger *t),
> > > > +       TP_ARGS(t));
> > > > +
> > > >  #endif /* _TRACE_SCHED_H */
> > > >
> > > >  /* This part must be outside protection */
> > > > diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> > > > index 59fdb7ebbf22..f06eb91a1250 100644
> > > > --- a/kernel/sched/psi.c
> > > > +++ b/kernel/sched/psi.c
> > > > @@ -141,6 +141,8 @@
> > > >  #include <linux/psi.h>
> > > >  #include "sched.h"
> > > >
> > > > +EXPORT_TRACEPOINT_SYMBOL_GPL(psi_event_tp);
> > >
> > > So, are you planning to attach some handler to this trace event in your driver?
> >
> > Yes, our modules would attach a handler to observe the memory pressure.
> >
> > >
> > > > +
> > > >  static int psi_bug __read_mostly;
> > > >
> > > >  DEFINE_STATIC_KEY_FALSE(psi_disabled);
> > > > @@ -509,6 +511,8 @@ static void update_triggers(struct psi_group *group, u64 now,
> > > >                 if (now < t->last_event_time + t->win.size)
> > > >                         continue;
> > > >
> > > > +               trace_psi_event_tp(t);
> > >
> > > This should only be done if the below cmpxchg() check is true, right?
> > > Otherwise it will not match with what userspace is receiving.
> >
> > If we put it below cmpxchg() check, we may lose some event before the
> > user space repose the signal.
> > Because the t->event needs to be set to 0 again.
> > In order to ensure that all events are displayed, it is better to put
> > it before the cmpxchg.
>
> Huh? Are you modifying the t->event inside your handler?! If so, that
> is unacceptable and I will NAK this change. A driver should not
> interfere with core kernel mechanisms.

Sorry for missing this, the next patch, I would export some variables
instead of the pointer of t...

Thanks!

>
> >
> > Thanks!
> >
> > >
> > > > +
> > > >                 /* Generate an event */
> > > >                 if (cmpxchg(&t->event, 0, 1) == 0) {
> > > >                         if (t->of)
> > > > --
> > > > 2.25.1
> > > >

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ