Message-Id: <20241225024819.51511-1-15645113830zzh@gmail.com>
Date: Wed, 25 Dec 2024 10:48:20 +0800
From: zihan zhou <15645113830zzh@...il.com>
To: vincent.guittot@...aro.org
Cc: 15645113830zzh@...il.com,
	bsegall@...gle.com,
	dietmar.eggemann@....com,
	juri.lelli@...hat.com,
	linux-kernel@...r.kernel.org,
	mgorman@...e.de,
	mingo@...hat.com,
	peterz@...radead.org,
	rostedt@...dmis.org,
	vschneid@...hat.com,
	yaozhenguo@...com,
	zhouzihan30@...com
Subject: Re: [PATCH] sched: Forward deadline for early tick

From: zhouzihan30 <zhouzihan30@...com>

Thank you for your reply!

> Having the rq_clock delta toggle above or below 1ms is normal because
> of the clockevent precision: if the previous delta was longer than 1ms,
> then the next one will be shorter. But the average over several ticks
> remains 1ms, as in your trace above.
> 
> >  than 1ms
> >
> > In order to conduct a comparative experiment, I turned off those CONFIGs
> > and re-checked the changes in the clock. It turns out that the values of
> > rq clock and rq clock task become completely consistent. However,
> > according to the information from perf, there are still tick errors
> > (slice=3ms):
> 
> Did you check that the whole tick was accounted to the task?
> According to your trace of rq clock delta and rq clock task delta
> above, the sum of 3 consecutive ticks is mostly greater than 3ms for
> rq clock delta, so I would assume that the sum of delta_exec would be
> greater than 3ms as well after 3 ticks.
> 
> >
> >       time    cpu  task name     wait time  sch delay   run time
> >                    [tid/pid]        (msec)     (msec)     (msec)
> > ---------- ------  ------------  ---------  ---------  ---------
> > 110.436513 [0001]  perf[1414]        0.000      0.000      0.000
> > 110.440490 [0001]  bash[1341]        0.000      0.000      3.977
> > 110.441490 [0001]  bash[1344]        0.000      0.000      0.999
> > 110.441548 [0001]  perf[1414]        4.976      0.000      0.058
> > 110.445491 [0001]  bash[1344]        0.058      0.000      3.942
> > 110.449490 [0001]  bash[1341]        5.000      0.000      3.999
> > 110.452490 [0001]  bash[1344]        3.999      0.000      2.999
> > 110.456491 [0001]  bash[1341]        2.999      0.000      4.000
> > 110.460489 [0001]  bash[1344]        4.000      0.000      3.998
> > 110.463490 [0001]  bash[1341]        3.998      0.000      3.001
> > 110.467493 [0001]  bash[1344]        3.001      0.000      4.002
> > 110.471490 [0001]  bash[1341]        4.002      0.000      3.996
> > 110.474489 [0001]  bash[1344]        3.996      0.000      2.999
> > 110.477490 [0001]  bash[1341]        2.999      0.000      3.000
> >

I used perf to record the impact of tick errors on runtime. With slice=3ms,
two busy tasks compete for one CPU. If one task runs for 4ms, it means
that three ticks added up to less than 3ms, so the task could only switch
to the other task on the next tick, after running 4ms, which is 1ms too
long. From my observation, this happens about 50% of the time (without
CONFIG_IRQ_TIME_ACCOUNTING; with it, tasks run for 4ms even more often
despite slice=3ms).
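
To make the arithmetic concrete, here is a minimal user-space sketch (not
kernel code) of how a small per-tick shortfall delays preemption: when each
tick is accounted as slightly less than 1ms, a 3ms slice is only exceeded
on the 4th tick. The 0.999ms tick value and the loop structure are
illustrative assumptions, not the actual update_curr() accounting.

    #include <stdio.h>

    int main(void)
    {
            const long long slice_ns = 3000000;  /* slice = 3 ms              */
            const long long tick_ns  = 999000;   /* each tick ~1 us too short */
            long long runtime_ns = 0;
            int ticks = 0;

            /* Preemption only triggers once the accumulated runtime reaches
             * the slice, so a per-tick shortfall pushes it out by a whole
             * extra tick. */
            while (runtime_ns < slice_ns) {
                    runtime_ns += tick_ns;
                    ticks++;
                    printf("tick %d: runtime = %lld ns\n", ticks, runtime_ns);
            }
            /* The slice is exceeded only on the 4th tick, i.e. the task ran
             * ~4 ms instead of 3 ms. */
            return 0;
    }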


> > We once considered subtracting a little from a slice when setting it,
> > for example, if someone sets 3ms, we can subtract 0.1ms from it and
> >  make it 2.9ms. But this is not a good solution. If someone sets it to
> >  3.1ms, should we use 2.9ms or 3ms? There doesn't seem to be a
> >  particularly good option, and it may lead to even greater system errors.
> 
> And we end up giving less than its slice to a task which could have set
> it to this value for a good reason.

Thank you, I think we have reached an agreement that the time given to a
task at once should be less than or equal to the slice. EEVDF never
guarantees that a task must run a whole slice at once, but the kernel tries
to ensure this. However, due to tick errors, there have been issues such as
a task exceeding its allotted time.
I will propose a v2 patch to try to solve this problem.
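
For illustration only, the general idea suggested by the subject line
("forward the deadline on an early tick") could look like the user-space
toy below: if the remaining slice after a tick is smaller than a small
tolerance, treat the slice as exhausted instead of waiting a whole extra
tick. This is not the actual v2 patch; the function name slice_exhausted()
and the tolerance of 1% of a tick are assumptions made up for this example.

    #include <stdbool.h>
    #include <stdio.h>

    static bool slice_exhausted(long long runtime_ns, long long slice_ns,
                                long long tick_ns)
    {
            long long tolerance_ns = tick_ns / 100;  /* assumed fudge factor */

            /* Treat "almost a full slice" as a full slice so an early tick
             * does not cost the task (or its competitor) a whole extra tick. */
            return runtime_ns + tolerance_ns >= slice_ns;
    }

    int main(void)
    {
            const long long slice_ns = 3000000;  /* 3 ms slice             */
            const long long tick_ns  = 999000;   /* early tick, ~0.999 ms  */
            long long runtime_ns = 0;
            int ticks = 0;

            while (!slice_exhausted(runtime_ns, slice_ns, tick_ns)) {
                    runtime_ns += tick_ns;
                    ticks++;
            }
            /* With the tolerance, rescheduling happens after 3 ticks
             * (~2.997 ms) rather than after a 4th tick (~4 ms). */
            printf("rescheduled after %d ticks, runtime = %lld ns\n",
                   ticks, runtime_ns);
            return 0;
    }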
