lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 16 May 2016 18:00:04 +0200
From:	Tommaso Cucinotta <tommaso.cucinotta@...up.it>
To:	linux-kernel <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Juri Lelli <juri.lelli@...il.com>,
	Luca Abeni <luca.abeni@...tn.it>, mingo@...hat.com
Subject: SCHED_DEADLINE cpudeadline.{h,c} fixup

Hi,

looking at the SCHED_DEADLINE code, I spotted an opportunity to
make cpudeadline.c faster, in that we can skip real swaps during
re-heapify()ication of items after addition/removal. As such ops
are done under a domain spinlock, it sounded like an interesting
try.

Indeed, I've got a speed-up of up to ~6% for the cpudl_set() calls
on a randomly generated workload of 1K,10K,100K random insertions
and deletions (75% cpudl_set() calls with is_valid=1 and 25% with
is_valid=0), and randomly generated cpu IDs with 2, 4, ..., 256 CPUs.
Details in the attached plot.

The attached patch does this, along with a minimum of rework of
cpudeadline.c internals, and a final clean-up of the cpudeadline.h
interface (second patch).

The measurements have been made on an Intel Core2 Duo with the CPU
frequency fixed at max, by letting cpudeadline.c be initialized with
various numbers of CPUs, then  making many calls sequentially, taking
the rdtsc among calls, then dumping all numbers through printk(),
and I'm plotting the average of clock ticks between consecutive calls.
[ I can share the benchmarking code as well if needed ]

Also, this fixes what seems to me a bug I noticed comparing the whole
heap contents as handledbut the modified code vs the original one,
insertion by insertion. The problem is in this code:

		cp->elements[cp->size - 1].dl = 0;
		cp->elements[cp->size - 1].cpu = cpu;
		cp->elements[cpu].idx = cp->size - 1;
		mycpudl_change_key(cp, cp->size - 1, dl);

when fed by an absolute deadline that is so large to have a negative
value as a s64. In such a case, as from dl_time_before(), the kernel
should handle correctly the abs deadline wrap-around, however the
current code in cpudeadline.c goes mad, and doesn't re-heapify
correctly the just inserted element... that said, if these are ns,
such a bug should be hit after a ~292 years of uptime :-D...

I'd be happy to hear comments from others. I can provide additional
info / make additional experiments as needed.

Please, reply-all to this e-mail, I'm not subscribed to linux-kernel@.

Thanks,

	Tommaso
-- 
Tommaso Cucinotta, Computer Engineering PhD
Associate Professor at the Real-Time Systems Laboratory (ReTiS)
Scuola Superiore Sant'Anna, Pisa, Italy
http://retis.sssup.it/people/tommaso

View attachment "0001-Make-deadline-max-heap-faster-and-fix-deadline-wrap-.patch" of type "text/x-patch" (5309 bytes)

View attachment "0002-Split-cpudl_set-into-cpudl_set-and-cpudl_clear.patch" of type "text/x-patch" (5500 bytes)

Download attachment "cpudl.pdf" of type "application/pdf" (10719 bytes)

Powered by blists - more mailing lists