Message-ID: <alpine.DEB.2.21.1906141610060.1722@nanos.tec.linutronix.de>
Date: Fri, 14 Jun 2019 16:13:31 +0200 (CEST)
From: Thomas Gleixner <tglx@...utronix.de>
To: Dmitry Safonov <dima@...sta.com>
cc: linux-kernel@...r.kernel.org, Andrei Vagin <avagin@...il.com>,
Adrian Reber <adrian@...as.de>,
Andrei Vagin <avagin@...nvz.org>,
Andy Lutomirski <luto@...nel.org>,
Arnd Bergmann <arnd@...db.de>,
Christian Brauner <christian.brauner@...ntu.com>,
Cyrill Gorcunov <gorcunov@...nvz.org>,
Dmitry Safonov <0x7f454c46@...il.com>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
Jann Horn <jannh@...gle.com>, Jeff Dike <jdike@...toit.com>,
Oleg Nesterov <oleg@...hat.com>,
Pavel Emelyanov <xemul@...tuozzo.com>,
Shuah Khan <shuah@...nel.org>,
Vincenzo Frascino <vincenzo.frascino@....com>,
containers@...ts.linux-foundation.org, criu@...nvz.org,
linux-api@...r.kernel.org, x86@...nel.org
Subject: Re: [PATCHv4 26/28] x86/vdso: Align VDSO functions by CPU L1 cache line
On Wed, 12 Jun 2019, Dmitry Safonov wrote:
> From: Andrei Vagin <avagin@...il.com>
>
> After performance testing of the VDSO patches, a noticeable 20% regression
> was found in the gettime_perf selftest with a cold cache.
> As it turns out, before the introduction of time namespaces the VDSO
> functions happened to be aligned to cache lines, but the new code that
> adjusts the timens offset inside a namespace created a small shift, so
> the VDSO functions are no longer aligned to cache lines.
>
> Align the VDSO functions with a gcc option to fix the performance drop.
>
> Copying the resulting numbers from the cover letter:
>
> Hot CPU cache (more gettime_perf.c cycles - the better):
>         | before     | CONFIG_TIME_NS=n | host        | inside timens
> --------|------------|------------------|-------------|--------------
> cycles  | 139887013  | 139453003        | 139899785   | 128792458
> diff (%)| 100        | 99.7             | 100         | 92

Why is CONFIG_TIME_NS=n behaving worse than current mainline and
worse than 'host' mode?

> Cold cache (lesser tsc per gettime_perf_cold.c cycle - the better):
>         | before     | CONFIG_TIME_NS=n | host        | inside timens
> --------|------------|------------------|-------------|--------------
> tsc     | 6748       | 6718             | 6862        | 12682
> diff (%)| 100        | 99.6             | 101.7       | 188

Weird, now CONFIG_TIME_NS=n is better than current mainline and 'host' mode
drops.

Either I'm misreading the numbers or missing something or I'm just confused
as usual :)
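
For reference, the "diff (%)" rows can be recomputed from the raw numbers
quoted above. This is only a sketch: it assumes each figure is expressed
relative to the "before" column, and the names (hot_cycles, cold_tsc,
relative_percent) are made up for illustration and do not come from the
selftests.

```python
# Raw figures copied from the two quoted tables.
# Assumption: "diff (%)" means value / "before" * 100.
hot_cycles = {
    "before": 139887013,
    "CONFIG_TIME_NS=n": 139453003,
    "host": 139899785,
    "inside timens": 128792458,
}
cold_tsc = {
    "before": 6748,
    "CONFIG_TIME_NS=n": 6718,
    "host": 6862,
    "inside timens": 12682,
}


def relative_percent(results, baseline="before"):
    """Return each result as a percentage of the baseline column."""
    base = results[baseline]
    return {name: round(100.0 * value / base, 1) for name, value in results.items()}


hot_pct = relative_percent(hot_cycles)    # hot cache: higher is better
cold_pct = relative_percent(cold_tsc)     # cold cache: lower is better

for label, pct in (("hot", hot_pct), ("cold", cold_pct)):
    for name, value in pct.items():
        print(f"{label:4} {name:18} {value:6.1f}")
```

Under that assumption the quoted percentages match to within rounding.
Because more cycles is better in the hot case and less tsc is better in
the cold case, CONFIG_TIME_NS=n comes out about 0.3% worse hot and about
0.4% better cold. The arithmetic reproduces the sign flip; it does not
explain it.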
Thanks,
tglx