Message-ID: <5739EE84.9070801@sssup.it>
Date: Mon, 16 May 2016 18:00:04 +0200
From: Tommaso Cucinotta <tommaso.cucinotta@...up.it>
To: linux-kernel <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Juri Lelli <juri.lelli@...il.com>,
	Luca Abeni <luca.abeni@...tn.it>, mingo@...hat.com
Subject: SCHED_DEADLINE cpudeadline.{h,c} fixup

Hi,

looking at the SCHED_DEADLINE code, I spotted an opportunity to make
cpudeadline.c faster: we can skip real swaps during re-heapification of
items after addition/removal. As such ops are done under a domain
spinlock, it sounded like an interesting thing to try.

Indeed, I got a speed-up of up to ~6% for the cpudl_set() calls on
randomly generated workloads of 1K, 10K and 100K random insertions and
deletions (75% cpudl_set() calls with is_valid=1 and 25% with
is_valid=0), with randomly generated CPU IDs, for 2, 4, ..., 256 CPUs.
Details are in the attached plot.

The attached patch does this, along with a minimum of rework of the
cpudeadline.c internals, and a final clean-up of the cpudeadline.h
interface (second patch).

The measurements were made on an Intel Core2 Duo with the CPU frequency
fixed at max: cpudeadline.c is initialized with various numbers of
CPUs, then many calls are made sequentially, reading the TSC (rdtsc)
between calls; all numbers are dumped through printk(), and I plot the
average number of clock ticks between consecutive calls.
[ I can share the benchmarking code as well if needed ]

Also, this fixes what seems to me to be a bug, which I noticed by
comparing the whole heap contents as handled by the modified code vs
the original one, insertion by insertion. The problem is in this code:

  cp->elements[cp->size - 1].dl = 0;
  cp->elements[cp->size - 1].cpu = cpu;
  cp->elements[cpu].idx = cp->size - 1;
  mycpudl_change_key(cp, cp->size - 1, dl);

when fed an absolute deadline so large that it has a negative value as
an s64. In such a case, as per dl_time_before(), the kernel should
correctly handle the absolute-deadline wrap-around; however, the
current code in cpudeadline.c goes mad and doesn't correctly re-heapify
the just-inserted element... That said, if these are ns, such a bug
would only be hit after ~292 years of uptime :-D...

I'd be happy to hear comments from others. I can provide additional
info / make additional experiments as needed.

Please reply-all to this e-mail, I'm not subscribed to linux-kernel@.

Thanks,

	Tommaso

--
Tommaso Cucinotta, Computer Engineering PhD
Associate Professor at the Real-Time Systems Laboratory (ReTiS)
Scuola Superiore Sant'Anna, Pisa, Italy
http://retis.sssup.it/people/tommaso

View attachment "0001-Make-deadline-max-heap-faster-and-fix-deadline-wrap-.patch" of type "text/x-patch" (5309 bytes)
View attachment "0002-Split-cpudl_set-into-cpudl_set-and-cpudl_clear.patch" of type "text/x-patch" (5500 bytes)
Download attachment "cpudl.pdf" of type "application/pdf" (10719 bytes)
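
To illustrate the swap-free re-heapification described in the message:
instead of exchanging parent and child at every level of the sift-down,
the element being pushed down is held aside, the larger child is
shifted up one level at a time, and the saved element is stored once at
its final slot. The sketch below is a minimal userspace reconstruction
of that idea, not the posted patch itself; it assumes the cpudeadline.h
layout in which elements[i].dl/.cpu are heap slots and
elements[cpu].idx is the cpu-to-position reverse map, and it reuses the
kernel's wrap-safe dl_time_before() comparison.

#include <stdint.h>

struct cpudl_item {
	uint64_t dl;	/* absolute deadline: the max-heap key */
	int cpu;	/* CPU occupying this heap slot */
	int idx;	/* reverse map: heap position of this CPU */
};

struct cpudl {
	int size;
	struct cpudl_item *elements;
};

/* deadline comparison with wrap-around handling, as in the kernel */
static inline int dl_time_before(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) < 0;
}

static void cpudl_heapify_down(struct cpudl *cp, int idx)
{
	/* save the moving element once, instead of swapping per level */
	uint64_t orig_dl = cp->elements[idx].dl;
	int orig_cpu = cp->elements[idx].cpu;

	for (;;) {
		int l = 2 * idx + 1;
		int r = 2 * idx + 2;
		int largest = idx;
		uint64_t largest_dl = orig_dl;

		if (l < cp->size &&
		    dl_time_before(largest_dl, cp->elements[l].dl)) {
			largest = l;
			largest_dl = cp->elements[l].dl;
		}
		if (r < cp->size &&
		    dl_time_before(largest_dl, cp->elements[r].dl))
			largest = r;
		if (largest == idx)
			break;

		/* pull the larger child up one level; no full swap */
		cp->elements[idx].dl = cp->elements[largest].dl;
		cp->elements[idx].cpu = cp->elements[largest].cpu;
		cp->elements[cp->elements[idx].cpu].idx = idx;
		idx = largest;
	}
	/* single final store of the element we were sifting down */
	cp->elements[idx].dl = orig_dl;
	cp->elements[idx].cpu = orig_cpu;
	cp->elements[orig_cpu].idx = idx;
}

The comparison count is unchanged versus a classical swap-based loop,
so the gain is purely in memory traffic: each level copies one element
down instead of exchanging two, which matters when the whole operation
runs under the root-domain spinlock.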
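
The benchmarking harness itself was not attached to the message, so the
following is only a hypothetical re-creation of the described method:
read the TSC immediately around each call, accumulate the deltas, and
report the average number of ticks per call. op_under_test() and CALLS
are made-up placeholders for the real cpudl_set() workload; as noted in
the message, this kind of measurement is only meaningful with the CPU
frequency pinned.

#include <stdio.h>
#include <stdint.h>
#include <x86intrin.h>	/* __rdtsc() */

#define CALLS 100000

/* placeholder for the cpudl_set() calls of the real benchmark */
static void op_under_test(int i)
{
	(void)i;
}

int main(void)
{
	uint64_t total = 0;

	for (int i = 0; i < CALLS; i++) {
		uint64_t t0 = __rdtsc();	/* timestamp before the call */
		op_under_test(i);
		total += __rdtsc() - t0;	/* accumulate elapsed ticks */
	}
	printf("avg ticks/call: %.1f\n", (double)total / CALLS);
	return 0;
}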
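
On the wrap-around point: dl_time_before(), as defined in the kernel's
scheduler headers, compares two u64 deadlines through a signed
difference, so the ordering stays correct even after a deadline crosses
the 2^63 boundary, whereas a plain u64 "<" does not. The standalone
demo below shows the two comparisons disagreeing. The closing comment
is my reading of why the quoted insertion snippet misbehaves (the
message does not spell this out): seeding the new slot with dl = 0
makes dl_time_before(new_dl, 0) true for any "negative"-as-s64
deadline, so the subsequent re-heapify runs in the wrong direction.

#include <stdio.h>
#include <stdint.h>

/* the kernel's modular deadline comparison */
static inline int dl_time_before(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) < 0;
}

int main(void)
{
	uint64_t a = UINT64_MAX - 4;	/* deadline just before the wrap */
	uint64_t b = 5;			/* deadline just after the wrap */

	printf("plain a < b    : %d\n", a < b);			/* 0: wrong */
	printf("dl_time_before : %d\n", dl_time_before(a, b));	/* 1: correct */

	/*
	 * With the quoted insertion code, the fresh slot starts at dl = 0;
	 * for a new_dl with the top bit set, dl_time_before(new_dl, 0) is
	 * true, i.e. the huge deadline is treated as "earlier" than 0,
	 * which sends the subsequent heapify the wrong way.
	 */
	return 0;
}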