linux-kernel - SCHED_DEADLINE cpudeadline.{h,c} fixup

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-ID: <5739EE84.9070801@sssup.it>
Date:	Mon, 16 May 2016 18:00:04 +0200
From:	Tommaso Cucinotta <tommaso.cucinotta@...up.it>
To:	linux-kernel <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Juri Lelli <juri.lelli@...il.com>,
	Luca Abeni <luca.abeni@...tn.it>, mingo@...hat.com
Subject: SCHED_DEADLINE cpudeadline.{h,c} fixup

Hi,

looking at the SCHED_DEADLINE code, I spotted an opportunity to
make cpudeadline.c faster, in that we can skip real swaps during
re-heapify()ication of items after addition/removal. As such ops
are done under a domain spinlock, it sounded like an interesting
try.

Indeed, I've got a speed-up of up to ~6% for the cpudl_set() calls
on a randomly generated workload of 1K,10K,100K random insertions
and deletions (75% cpudl_set() calls with is_valid=1 and 25% with
is_valid=0), and randomly generated cpu IDs with 2, 4, ..., 256 CPUs.
Details in the attached plot.

The attached patch does this, along with a minimum of rework of
cpudeadline.c internals, and a final clean-up of the cpudeadline.h
interface (second patch).

The measurements have been made on an Intel Core2 Duo with the CPU
frequency fixed at max, by letting cpudeadline.c be initialized with
various numbers of CPUs, then  making many calls sequentially, taking
the rdtsc among calls, then dumping all numbers through printk(),
and I'm plotting the average of clock ticks between consecutive calls.
[ I can share the benchmarking code as well if needed ]

Also, this fixes what seems to me a bug I noticed comparing the whole
heap contents as handledbut the modified code vs the original one,
insertion by insertion. The problem is in this code:

		cp->elements[cp->size - 1].dl = 0;
		cp->elements[cp->size - 1].cpu = cpu;
		cp->elements[cpu].idx = cp->size - 1;
		mycpudl_change_key(cp, cp->size - 1, dl);

when fed by an absolute deadline that is so large to have a negative
value as a s64. In such a case, as from dl_time_before(), the kernel
should handle correctly the abs deadline wrap-around, however the
current code in cpudeadline.c goes mad, and doesn't re-heapify
correctly the just inserted element... that said, if these are ns,
such a bug should be hit after a ~292 years of uptime :-D...

I'd be happy to hear comments from others. I can provide additional
info / make additional experiments as needed.

Please, reply-all to this e-mail, I'm not subscribed to linux-kernel@.

Thanks,

	Tommaso
-- 
Tommaso Cucinotta, Computer Engineering PhD
Associate Professor at the Real-Time Systems Laboratory (ReTiS)
Scuola Superiore Sant'Anna, Pisa, Italy
http://retis.sssup.it/people/tommaso

View attachment "0001-Make-deadline-max-heap-faster-and-fix-deadline-wrap-.patch" of type "text/x-patch" (5309 bytes)

View attachment "0002-Split-cpudl_set-into-cpudl_set-and-cpudl_clear.patch" of type "text/x-patch" (5500 bytes)

Download attachment "cpudl.pdf" of type "application/pdf" (10719 bytes)