Message-ID: <784a6843-c5fb-46eb-a472-5d96101478a9@intel.com>
Date: Tue, 23 Jan 2024 09:00:45 -0800
From: Dave Hansen <dave.hansen@...el.com>
To: David Binderman <dcb314@...mail.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Cc: Andy Lutomirski <luto@...nel.org>, Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
Borislav Petkov <bp@...en8.de>, "x86@...nel.org" <x86@...nel.org>
Subject: Re: [PATCH] x86/mm: Simplify redundant overlap calculation
On 1/23/24 08:54, David Binderman wrote:
>> Remove the second condition. It is exactly the same as the first.
> I don't think the first condition is sufficient. I suspect something like
>
> return (r2_start <= r1_start && r1_start <= r2_end) ||
> (r2_start <= r1_end && r1_end <= r2_end);
>
> Given the range [r2_start .. r2_end], then if r1_start or r1_end
> are in that range, you have overlap.
>
> Unless you know different.
First of all, I've gotten these bounds checks wrong in code more times
than I can count. I have zero trust that I'll get them right. :)
But the compiler seems to know different at least:

int overlaps1(unsigned long r1_start, unsigned long r1_end,
	      unsigned long r2_start, unsigned long r2_end)
{
	return (r1_start <= r2_end && r1_end >= r2_start) ||
	       (r2_start <= r1_end && r2_end >= r1_start);
}

int overlaps2(unsigned long r1_start, unsigned long r1_end,
	      unsigned long r2_start, unsigned long r2_end)
{
	return (r1_start <= r2_end && r1_end >= r2_start);
}

Results in:

0000000000001180 <overlaps1>:
    1180:  f3 0f 1e fa              endbr64
    1184:  48 39 cf                 cmp    %rcx,%rdi
    1187:  49 89 d0                 mov    %rdx,%r8
    118a:  0f 96 c2                 setbe  %dl
    118d:  31 c0                    xor    %eax,%eax
    118f:  4c 39 c6                 cmp    %r8,%rsi
    1192:  0f 93 c0                 setae  %al
    1195:  21 d0                    and    %edx,%eax
    1197:  c3                       ret
    1198:  0f 1f 84 00 00 00 00     nopl   0x0(%rax,%rax,1)
    119f:  00

00000000000011a0 <overlaps2>:
    11a0:  f3 0f 1e fa              endbr64
    11a4:  48 39 cf                 cmp    %rcx,%rdi
    11a7:  49 89 d0                 mov    %rdx,%r8
    11aa:  0f 96 c2                 setbe  %dl
    11ad:  31 c0                    xor    %eax,%eax
    11af:  4c 39 c6                 cmp    %r8,%rsi
    11b2:  0f 93 c0                 setae  %al
    11b5:  21 d0                    and    %edx,%eax
    11b7:  c3                       ret
    11b8:  0f 1f 84 00 00 00 00     nopl   0x0(%rax,%rax,1)
    11bf:  00

I also wrote a quick program to throw random numbers into both versions
and see if they differ. They never did, which they obviously can't if
they're the exact same instructions.