Message-ID: <666.1519738993@warthog.procyon.org.uk>
Date: Tue, 27 Feb 2018 13:43:13 +0000
From: David Howells <dhowells@...hat.com>
To: Jeff Layton <jlayton@...hat.com>
Cc: dhowells@...hat.com, kemi <kemi.wang@...el.com>,
Ye Xiaolong <xiaolong.ye@...el.com>, lkp@...org,
Linus Torvalds <torvalds@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [LKP] [lkp-robot] [iversion] c0cef30e4f: aim7.jobs-per-min -18.0% regression
Jeff Layton <jlayton@...hat.com> wrote:
> 0xffffffff813ae828 <+136>: je 0xffffffff813ae83a <ima_file_free+154>
> 0xffffffff813ae82a <+138>: mov 0x150(%rbp),%rcx
> 0xffffffff813ae831 <+145>: shr %rcx
> 0xffffffff813ae834 <+148>: cmp %rcx,0x20(%rax)
> 0xffffffff813ae838 <+152>: je 0xffffffff813ae862 <ima_file_free+194>
Is it possible there's a stall between the load of RCX and the subsequent
instructions because they all have to wait for RCX to become available?
The interleaving between operating on RSI and RCX in the older code might
alleviate that.
In addition, the load of the 0x20(%rax) value is now done in the CMP instruction
rather than earlier, so it might not get speculatively loaded in time, whereas
the earlier code explicitly loads it up front.
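To make the two shapes concrete, here is a rough C sketch of the comparison in the quoted disassembly. It is only an illustration: the struct and function names are hypothetical, and I am assuming that 0x150(%rbp) is the inode's version counter and 0x20(%rax) is the recorded version. The compiler is free to schedule the loads differently from the source order, so this shows the dependency chain rather than guaranteeing any particular code generation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two objects read in the disassembly. */
struct fake_inode { uint64_t i_version; }; /* assumed: 0x150(%rbp) */
struct fake_iint  { uint64_t version; };   /* assumed: 0x20(%rax)  */

/*
 * Shape of the newer code: load the counter, shift it, then compare
 * against a memory operand. Each step waits on the previous one.
 */
static bool version_matches_chained(const struct fake_inode *inode,
                                    const struct fake_iint *iint)
{
    uint64_t cur = inode->i_version;  /* mov 0x150(%rbp),%rcx */
    cur >>= 1;                        /* shr %rcx             */
    return cur == iint->version;      /* cmp %rcx,0x20(%rax)  */
}

/*
 * Alternative shape: issue both loads up front so that the recorded
 * version is already in a register by the time the shifted counter
 * is ready to be compared.
 */
static bool version_matches_preloaded(const struct fake_inode *inode,
                                      const struct fake_iint *iint)
{
    uint64_t recorded = iint->version; /* explicit early load */
    uint64_t cur = inode->i_version;
    return (cur >> 1) == recorded;
}

/* Both shapes must agree; the low bit is discarded by the shift. */
static void self_check(void)
{
    struct fake_inode inode = { .i_version = (42u << 1) | 1u };
    struct fake_iint iint = { .version = 42 };

    assert(version_matches_chained(&inode, &iint));
    assert(version_matches_preloaded(&inode, &iint));
}
```

Both functions compute the same result; whether the second ordering actually helps would need to be measured on the affected machine.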
David