[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.21.1807050708320.1579@nanos.tec.linutronix.de>
Date: Thu, 5 Jul 2018 07:30:11 +0200 (CEST)
From: Thomas Gleixner <tglx@...utronix.de>
To: "Yang, Bin" <bin.yang@...el.com>
cc: "mingo@...nel.org" <mingo@...nel.org>,
"hpa@...or.com" <hpa@...or.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"peterz@...radead.org" <peterz@...radead.org>,
"x86@...nel.org" <x86@...nel.org>
Subject: Re: [PATCH] x86/mm: fix cpu stuck issue in
__change_page_attr_set_clr
On Thu, 5 Jul 2018, Yang, Bin wrote:
Please do not top post.
> This is what my new patch tries to improve.
> On 04/07/2018, 10:02 PM, "Thomas Gleixner" <tglx@...utronix.de> wrote:
>
> The check loop itself is stupid as well. Instead of looping in 4K steps
> the thing can be rewritten to check for overlapping ranges and then check
> explicitely for those. If there is no overlap in the first place the
> whole loop thing can be avoided, but that's a pure optimization and needs
> more thought than the straight forward and obvious solution to the
> problem at hand.
Sorry, I don't see in which way your patch would improve the check loop.
It tries to avoid the checks for a consecutive invocation of CPA on the
same address range, but it does it in a convoluted way instead of doing the
obvious.
And in no way it does improve the situation when the check needs to be
done. Here is the relevant snippet:
+ if (!do_split_cache &&
+ address_cache >= addr && address_cache < nextpage_addr &&
+ pgprot_val(new_prot) == pgprot_val(old_prot)) {
+ do_split = do_split_cache;
+ goto out_unlock;
+ }
That only ever avoids the check loop when:
1) the previous call cleared do_split_cache
2) the address is within the range of the previous call
3) the pgprot_vals match
while moving the pgprot_val check and the alignment check _before_ the loop
avoids even more pointless invocations of the loop and is obvious and
correct without voodoo logic.
The check loop is with both your and my variant absolutely the same and it
does not get any smarter or more performant when it's entered.
That is a totally different change which needs to do the following:
if (overlaps_check_regions(address, numpages))
check_overlapping_regions(address, numpages);
and that can be done smart without massive looping in 4K steps, but we
really want to analyze _first_ whether the checks are just silently fixing
up sloppy callers and whether they are needed at all.
Thanks,
tglx
Powered by blists - more mailing lists