linux-kernel - Re: RE：[PATCH] sched: Add trace for task wake up latency and leave running time


Message-ID: <20200903074232.GW1362448@hirez.programming.kicks-ass.net>
Date:   Thu, 3 Sep 2020 09:42:32 +0200
From:   peterz@...radead.org
To:     gengdongjiu <gengdongjiu@...wei.com>
Cc:     "mingo@...hat.com" <mingo@...hat.com>,
        "juri.lelli@...hat.com" <juri.lelli@...hat.com>,
        "vincent.guittot@...aro.org" <vincent.guittot@...aro.org>,
        "dietmar.eggemann@....com" <dietmar.eggemann@....com>,
        "rostedt@...dmis.org" <rostedt@...dmis.org>,
        "bsegall@...gle.com" <bsegall@...gle.com>,
        "mgorman@...e.de" <mgorman@...e.de>,
        "thara.gopinath@...aro.org" <thara.gopinath@...aro.org>,
        "pauld@...hat.com" <pauld@...hat.com>,
        "vincent.donnefort@....com" <vincent.donnefort@....com>,
        "rdunlap@...radead.org" <rdunlap@...radead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: RE：[PATCH] sched: Add trace
 for task wake up latency and leave running time

On Wed, Sep 02, 2020 at 10:35:34PM +0000, gengdongjiu wrote:

> > NAK, that tracepoint is already broken, we don't want to proliferate the broken.
>   
> Sorry, what do you mean by saying that the tracepoint is already broken?

Just that, the tracepoint is crap. But we can't fix it because ABI. Did
I tell you I utterly hate tracepoints?

> Maybe I need to explain why I added the two tracepoints.
> When using the perf tool or the ftrace sysfs interface to capture the task wake-up latency and the time a task spends off the run queue, the trace data is usually too large and CPU utilization is too high because of the heavy disk writes. Sometimes the disk even fills up before the issue, in which these two times exceed a certain threshold, is reproduced. So I added two tracepoints; with a filter we can record only the abnormal traces, where the wakeup latency or the leave-running time is larger than a threshold.
> Or do you have a better solution?

Learn to use a MUA and wrap your lines at 78 chars like normal people.

Yes, use ftrace synthetic events, or bpf or really anything other than
this.
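
As an illustration of the synthetic-event route, the following is a minimal
sketch modelled on the wakeup_latency example in the kernel's histogram
documentation. It is not from this thread. It assumes tracefs is mounted at
/sys/kernel/tracing and that the kernel has CONFIG_HIST_TRIGGERS enabled;
the 1000 us threshold and the helper names are placeholders. By default it
only prints the commands; set DRY_RUN=0 as root to apply them.

```shell
#!/bin/sh
# Sketch: log wakeup latency only above a threshold via ftrace synthetic
# events, without adding fields to task_struct or new tracepoints.
# Assumptions: tracefs path, threshold and helper names are illustrative.
TRACEFS=${TRACEFS:-/sys/kernel/tracing}
THRESHOLD_US=${THRESHOLD_US:-1000}
DRY_RUN=${DRY_RUN:-1}   # 1 = print the commands instead of writing tracefs

# tracefs_put OP TEXT FILE: write TEXT to $TRACEFS/FILE with '>' or '>>'
tracefs_put() {
    if [ "$DRY_RUN" = 1 ]; then
        printf "echo '%s' %s %s/%s\n" "$2" "$1" "$TRACEFS" "$3"
    elif [ "$1" = '>>' ]; then
        printf '%s\n' "$2" >> "$TRACEFS/$3"
    else
        printf '%s\n' "$2" > "$TRACEFS/$3"
    fi
}

setup_wakeup_latency() {
    # 1. Define a synthetic event carrying the latency, pid and priority.
    tracefs_put '>>' 'wakeup_latency u64 lat; pid_t pid; int prio' \
        synthetic_events
    # 2. At sched_waking, stash a timestamp keyed by the woken pid.
    tracefs_put '>>' 'hist:keys=pid:ts0=common_timestamp.usecs' \
        events/sched/sched_waking/trigger
    # 3. At sched_switch, match next_pid against that key, compute the
    #    delta and generate the synthetic event.
    tracefs_put '>>' 'hist:keys=next_pid:lat=common_timestamp.usecs-$ts0:onmatch(sched.sched_waking).wakeup_latency($lat,next_pid,next_prio)' \
        events/sched/sched_switch/trigger
    # 4. Keep only events whose latency exceeds the threshold.
    tracefs_put '>' "lat > $THRESHOLD_US" \
        events/synthetic/wakeup_latency/filter
    # 5. Enable the synthetic event so matches land in the trace buffer.
    tracefs_put '>' 1 events/synthetic/wakeup_latency/enable
}

setup_wakeup_latency
```

With the filter in place only the over-threshold wakeups should reach the
trace buffer, which addresses the disk-write volume described above without
touching the wakeup fast path.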

> > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > > index 8471a0f7eb32..b5a1928dc948 100644
> > > --- a/kernel/sched/core.c
> > > +++ b/kernel/sched/core.c
> > > @@ -2464,6 +2464,8 @@ static void ttwu_do_wakeup(struct rq *rq, struct task_struct *p, int wake_flags,
> > >  {
> > >  	check_preempt_curr(rq, p, wake_flags);
> > >  	p->state = TASK_RUNNING;
> > > +	p->ts_wakeup = local_clock();
> > > +	p->wakeup_state = true;
> > >  	trace_sched_wakeup(p);
> > >
> > >  #ifdef CONFIG_SMP
> > 
> > NAK, useless overhead.
> 
>  At sched switch we do not know the next task's previous state or
>  its wakeup timestamp, so I record that state when the task is woken
>  from sleep.  The wakeup latency can then be calculated at task
>  switch.

I don't care. You're making things slower.