[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200329211316.GG4675@SDF.ORG>
Date: Sun, 29 Mar 2020 21:18:14 +0000
From: George Spelvin <lkml@....ORG>
To: James Bottomley <James.Bottomley@...senPartnership.com>
Cc: Takashi Iwai <tiwai@...e.de>, linux-kernel@...r.kernel.org,
Hannes Reinecke <hare@...e.de>, linux-scsi@...r.kernel.org,
Marek Lindner <mareklindner@...mailbox.ch>,
Simon Wunderlich <sw@...onwunderlich.de>,
Antonio Quartulli <a@...table.cc>,
Sven Eckelmann <sven@...fation.org>,
Johannes Berg <johannes@...solutions.net>,
linux-wireless@...r.kernel.org, Jaroslav Kysela <perex@...ex.cz>,
Takashi Iwai <tiwai@...e.com>, alsa-devel@...a-project.org,
lkml@....org
Subject: Re: [RFC PATCH v1 13/50] Avoid some useless msecs/jiffies conversions
On Sun, Mar 29, 2020 at 11:16:47AM -0700, James Bottomley wrote:
> On Sun, 2020-03-29 at 17:50 +0000, George Spelvin wrote:
>> On Sun, Mar 29, 2020 at 07:13:33PM +0200, Takashi Iwai wrote:
>>> Fair enough. But it's still a question whether an open code X * HZ
>>> is good at all...
>>
>> I'm sorry, I don't understand what you mean by "good at all" here.
>> The value computed is exactly the same.
>
> I think he means what the compiler does with it.
>
> We all assume that msecs_to_jiffies is properly optimized so there
> should be no need to open code it like you're proposing. So firstly
> can you produce the assembly that shows the worse output from
> msecs_to_jiffies? If there is a problem, then we should be fixing it
> in msecs_to_jiffies, not adding open coding.
Fair enough. For the two alternative functions:
#include <linux/jiffies.h>
unsigned long timeout1(unsigned seconds)
{
return jiffies + msecs_to_jiffies(seconds * 1000);
}
unsigned long timeout2(unsigned seconds)
{
return jiffies + seconds * HZ;
}
compiled with Kbuild, the object code produced is:
Disassembly of section .text:
0000000000000000 <timeout1>:
0: 69 ff e8 03 00 00 imul $0x3e8,%edi,%edi
6: e8 00 00 00 00 callq b <timeout1+0xb>
7: R_X86_64_PLT32 __msecs_to_jiffies-0x4
b: 49 89 c0 mov %rax,%r8
e: 48 8b 05 00 00 00 00 mov 0x0(%rip),%rax # 15
<timeout1+0x15>
11: R_X86_64_PC32 jiffies-0x4
15: 4c 01 c0 add %r8,%rax
18: c3 retq
19: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
0000000000000020 <timeout2>:
20: 48 8b 15 00 00 00 00 mov 0x0(%rip),%rdx # 27
<timeout2+0x7>
23: R_X86_64_PC32 jiffies-0x4
27: 69 c7 2c 01 00 00 imul $0x12c,%edi,%eax
2d: 48 01 d0 add %rdx,%rax
30: c3 retq
This is the type of code I replaced: code where the number of seconds
is not known at compile time. Notice how the first multiplies by 1000
and then calls __msecs_to_jiffies. The second multiplies by 300 and
makes no function call.
__msecs_to_jiffies (from kernel/time/time.o) is:
0000000000000100 <__msecs_to_jiffies>:
100: 48 b8 fe ff ff ff ff movabs $0x3ffffffffffffffe,%rax
107: ff ff 3f
10a: 85 ff test %edi,%edi
10c: 78 1c js 12a
<__msecs_to_jiffies+0x2a>
10e: b8 9a 99 99 99 mov $0x9999999a,%eax
113: 89 ff mov %edi,%edi
115: 48 0f af f8 imul %rax,%rdi
119: 48 b8 cc cc cc cc 01 movabs $0x1cccccccc,%rax
120: 00 00 00
123: 48 01 f8 add %rdi,%rax
126: 48 c1 e8 21 shr $0x21,%rax
12a: c3 retq
12b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
I didn't try to replace code that uses compile-time constant arguments
such as include/linux/ceph/libceph.h.
Powered by blists - more mailing lists