Message-ID: <08a4b591e9ee41ca8cec888772a0fc43@baidu.com>
Date: Thu, 10 Jul 2025 09:39:51 +0000
From: "Li,Rongqing" <lirongqing@...du.com>
To: David Laight <david.laight.linux@...il.com>, Andrew Morton
<akpm@...ux-foundation.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>
CC: "u.kleine-koenig@...libre.com" <u.kleine-koenig@...libre.com>, "Nicolas
Pitre" <npitre@...libre.com>, Oleg Nesterov <oleg@...hat.com>, Peter Zijlstra
<peterz@...radead.org>, Biju Das <biju.das.jz@...renesas.com>,
"rostedt@...dmis.org" <rostedt@...dmis.org>
Subject: Re: [PATCH v3 next 09/10] lib: mul_u64_u64_div_u64() Optimise the divide code
> $ cc -O2 -o div_perf div_perf.c -DMULDIV_OPT=0xc3 && sudo ./div_perf
> 0: ok 162 134 78 78 78 78 78 80 80 80
> mul_u64_u64_div_u64_new b*7/3 = 19
> 1: ok 91 91 91 91 91 91 91 91 91 91
> mul_u64_u64_div_u64_new ffff0000*ffff0000/f = 1110eeef00000000
> 2: ok 75 77 75 77 77 77 77 77 77 77
> mul_u64_u64_div_u64_new ffffffff*ffffffff/1 = fffffffe00000001
> 3: ok 89 91 91 91 91 91 89 90 91 91
> mul_u64_u64_div_u64_new ffffffff*ffffffff/2 = 7fffffff00000000
> 4: ok 147 147 128 128 128 128 128 128 128 128
> mul_u64_u64_div_u64_new 1ffffffff*ffffffff/2 = fffffffe80000000
> 5: ok 128 128 128 128 128 128 128 128 128 128
> mul_u64_u64_div_u64_new 1ffffffff*ffffffff/3 = aaaaaaa9aaaaaaab
> 6: ok 121 121 121 121 121 121 121 121 121 121
> mul_u64_u64_div_u64_new 1ffffffff*1ffffffff/4 = ffffffff00000000
> 7: ok 274 234 146 138 138 138 138 138 138 138
> mul_u64_u64_div_u64_new ffff000000000000*ffff000000000000/ffff000000000001 = fffeffffffffffff
> 8: ok 177 148 148 149 149 149 149 149 149 149
> mul_u64_u64_div_u64_new 3333333333333333*3333333333333333/5555555555555555 = 1eb851eb851eb851
> 9: ok 138 90 118 91 91 91 91 92 92 92
> mul_u64_u64_div_u64_new 7fffffffffffffff*2/3 = 5555555555555554
> 10: ok 113 114 86 86 84 86 86 84 87 87
> mul_u64_u64_div_u64_new ffffffffffffffff*2/8000000000000000 = 3
> 11: ok 87 88 88 86 88 88 88 88 90 90
> mul_u64_u64_div_u64_new ffffffffffffffff*2/c000000000000000 = 2
> 12: ok 82 86 84 86 83 86 83 86 83 87
> mul_u64_u64_div_u64_new ffffffffffffffff*4000000000000004/8000000000000000 = 8000000000000007
> 13: ok 82 86 84 86 83 86 83 86 83 86
> mul_u64_u64_div_u64_new ffffffffffffffff*4000000000000001/8000000000000000 = 8000000000000001
> 14: ok 189 187 138 132 132 132 131 131 131 131
> mul_u64_u64_div_u64_new ffffffffffffffff*8000000000000001/ffffffffffffffff = 8000000000000001
> 15: ok 221 175 159 131 131 131 131 131 131 131
> mul_u64_u64_div_u64_new fffffffffffffffe*8000000000000001/ffffffffffffffff = 8000000000000000
> 16: ok 134 132 134 134 134 135 134 134 134 134
> mul_u64_u64_div_u64_new ffffffffffffffff*8000000000000001/fffffffffffffffe = 8000000000000001
> 17: ok 172 134 137 134 134 134 134 134 134 134
> mul_u64_u64_div_u64_new ffffffffffffffff*8000000000000001/fffffffffffffffd = 8000000000000002
> 18: ok 182 182 129 129 129 129 129 129 129 129
> mul_u64_u64_div_u64_new 7fffffffffffffff*ffffffffffffffff/c000000000000000 = aaaaaaaaaaaaaaa8
> 19: ok 130 129 130 129 129 129 129 129 129 129
> mul_u64_u64_div_u64_new ffffffffffffffff*7fffffffffffffff/a000000000000000 = ccccccccccccccca
> 20: ok 130 129 129 129 129 129 129 129 129 129
> mul_u64_u64_div_u64_new ffffffffffffffff*7fffffffffffffff/9000000000000000 = e38e38e38e38e38b
> 21: ok 130 129 129 129 129 129 129 129 129 129
> mul_u64_u64_div_u64_new 7fffffffffffffff*7fffffffffffffff/5000000000000000 = ccccccccccccccc9
> 22: ok 206 140 138 138 138 138 138 138 138 138
> mul_u64_u64_div_u64_new ffffffffffffffff*fffffffffffffffe/ffffffffffffffff = fffffffffffffffe
> 23: ok 174 140 138 138 138 138 138 138 138 138
> mul_u64_u64_div_u64_new e6102d256d7ea3ae*70a77d0be4c31201/d63ec35ab3220357 = 78f8bf8cc86c6e18
> 24: ok 135 137 137 137 137 137 137 137 137 137
> mul_u64_u64_div_u64_new f53bae05cb86c6e1*3847b32d2f8d32e0/cfd4f55a647f403c = 42687f79d8998d35
> 25: ok 134 136 136 136 136 136 136 136 136 136
> mul_u64_u64_div_u64_new 9951c5498f941092*1f8c8bfdf287a251/a3c8dc5f81ea3fe2 = 1d887cb25900091f
> 26: ok 136 134 134 134 134 134 134 134 134 134
> mul_u64_u64_div_u64_new 374fee9daa1bb2bb*d0bfbff7b8ae3ef/c169337bd42d5179 = 3bb2dbaffcbb961
> 27: ok 139 138 138 138 138 138 138 138 138 138
> mul_u64_u64_div_u64_new eac0d03ac10eeaf0*89be05dfa162ed9b/92bb1679a41f0e4b = dc5f5cc9e270d216
> 28: ok 130 143 95 95 96 96 96 96 96 96
> mul_u64_u64_div_u64_new 2d256d7ea3ae*7d0be4c31201/d63ec35ab3220357 = 1a599d6e
> 29: ok 169 158 158 138 138 138 138 138 138 138
> mul_u64_u64_div_u64_new eac0d03ac10eeaf0*89be05dfa162ed9b/92bb1679a41f0e4b = dc5f5cc9e270d216
> 30: ok 178 164 144 147 147 147 147 147 147 147
> mul_u64_u64_div_u64_new 2d256d7ea3ae*7d0be4c31201/63ec35ab3220357 = 387f55cef
> 31: ok 163 128 128 128 128 128 128 128 128 128
> mul_u64_u64_div_u64_new eac0d03ac10eeaf0*89be05dfa162ed9b/92bb000000000000 = dc5f7e8b334db07d
> 32: ok 163 184 137 136 136 138 138 138 138 138
> mul_u64_u64_div_u64_new eac0d03ac10eeaf0*89be05dfa162ed9b/92bb1679a41f0e4b = dc5f5cc9e270d216
>
Nice work!
Would it be worth adding a test case for the exceptional path, for example one where the quotient overflows 64 bits?
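
As a rough illustration of what such a case could check, here is a minimal sketch. It assumes a compiler that provides unsigned __int128 (a GCC/Clang extension) as the reference, and the helper name muldiv_fits_u64() is hypothetical; it is not part of div_perf.c or of the patch. What the function under test should return on overflow is not stated in this thread, so the sketch only identifies inputs whose quotient does not fit.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical reference helper (an assumption, not from the patch):
 * computes a * b / c with 128-bit arithmetic and reports whether the
 * quotient fits in 64 bits. On success the quotient is stored in *quot.
 */
static bool muldiv_fits_u64(uint64_t a, uint64_t b, uint64_t c, uint64_t *quot)
{
	unsigned __int128 q;

	if (c == 0)
		return false;	/* division by zero is a separate error case */

	q = (unsigned __int128)a * b / c;
	if (q > UINT64_MAX)
		return false;	/* quotient needs more than 64 bits */

	*quot = (uint64_t)q;
	return true;
}
```

For example, ffffffffffffffff*2/1 and ffffffffffffffff*ffffffffffffffff/fffffffffffffffe both need more than 64 bits, while ffffffffffffffff*2/8000000000000000 = 3 from the table above does not.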
Thanks
-Li