Message-Id: <51BC7C74-68BF-4A8E-8CFB-DB4EBBC89706@goldelico.com>
Date: Wed, 21 Apr 2021 18:00:34 +0200
From: "H. Nikolaus Schaller" <hns@...delico.com>
To: "Maciej W. Rozycki" <macro@...am.me.uk>
Cc: Arnd Bergmann <arnd@...db.de>,
Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
Huacai Chen <chenhuacai@...nel.org>,
Huacai Chen <chenhuacai@...ngson.cn>,
Jiaxun Yang <jiaxun.yang@...goat.com>,
linux-arch@...r.kernel.org, linux-mips@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/4] Reinstate and improve MIPS `do_div' implementation
Hi,
> On 20.04.2021, at 04:50, Maciej W. Rozycki <macro@...am.me.uk> wrote:
>
> Hi,
>
> As Huacai has recently discovered the MIPS backend for `do_div' has been
> broken and inadvertently disabled with commit c21004cd5b4c ("MIPS: Rewrite
> <asm/div64.h> to work with gcc 4.4.0."). As this is code I originally
> wrote myself, and Huacai had issues bringing it back to life, leading
> even to a request to discard it, I have decided to step in.
>
> In the end I have fixed the code and measured its performance to be ~100%
> better on average than our generic code.
That would be good.
> I have decided it would be worth
> having the test module I have prepared for correctness evaluation as well
> as benchmarking, so I have included it with the series, also so that I can
> refer to the results easily.
>
> In the end I have included four patches on this occasion: 1/4 is the test
> module, 2/4 is an inline documentation fix/clarification for the `do_div'
> wrapper, 3/4 enables the MIPS `__div64_32' backend and 4/4 adds a small
> performance improvement to it.
How can I apply them to the kernel? Something is wrong that makes
`git am' fail.
>
> I have investigated a fifth change as a potential improvement where I
> replaced the call to `do_div64_32' with a DIVU instruction for cases where
> the high part of the intermediate dividend is zero, but it has turned out
> to regress performance a little, so I have discarded it.
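As an illustration of the idea described above (not the code from the series), here is a minimal portable C sketch of a `do_div'-style 64-by-32-bit division with the "high part of the intermediate dividend is zero" shortcut. The names `do_div_sketch' and `div64_32_sketch' are hypothetical. The bitwise loop only stands in for whatever the real `__div64_32' backend does, and the plain 32-bit `/' and `%' stand in for a single DIVU.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for a 64-by-32 backend: divide the 64-bit value
 * (hi:lo) by base, assuming hi < base so the quotient fits in 32 bits.
 * Plain restoring (shift-and-subtract) division, one quotient bit per step.
 */
static uint32_t div64_32_sketch(uint32_t hi, uint32_t lo, uint32_t base,
                                uint32_t *rem)
{
    uint64_t r = hi;            /* running remainder, always < base here */
    uint32_t q = 0;

    for (int i = 31; i >= 0; i--) {
        r = (r << 1) | ((lo >> i) & 1);   /* bring down the next bit */
        q <<= 1;
        if (r >= base) {
            r -= base;
            q |= 1;
        }
    }
    *rem = (uint32_t)r;
    return q;
}

/*
 * do_div-like semantics: *n is replaced by the quotient *n / base and
 * the remainder is returned. base must be non-zero.
 */
static uint32_t do_div_sketch(uint64_t *n, uint32_t base)
{
    uint32_t high = (uint32_t)(*n >> 32);
    uint32_t low = (uint32_t)*n;
    uint32_t q_high = high / base;   /* first 32-bit division */
    uint32_t r_high = high % base;   /* high part of the intermediate dividend */
    uint32_t q_low, rem;

    if (r_high == 0) {
        /* Shortcut: the intermediate dividend fits in 32 bits. */
        q_low = low / base;
        rem = low % base;
    } else {
        q_low = div64_32_sketch(r_high, low, base, &rem);
    }

    *n = ((uint64_t)q_high << 32) | q_low;
    return rem;
}
```

The extra branch is the cost of the shortcut. The sketch shows the structure of the shortcut only and says nothing about how it performs.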
>
> Also a follow-up change might be worth having to reduce the code size and
> place `__div64_32' out of line for CC_OPTIMIZE_FOR_SIZE configurations,
> but I have not fully prepared such a change at this time. I did use the
> WIP form I have for performance evaluation however; see the figures quoted
> with 4/4.
>
> These changes have been verified with a DECstation system with an R3400
> MIPS I processor @40MHz and an MTI Malta system with a 5Kc MIPS64 processor
> @160MHz.
I'd like to test on a JZ4730 running at ~320 MHz.
>
> See individual change descriptions and any additional discussions for
> further details.
>
> Questions, comments or concerns? Otherwise please apply.
>
> Maciej
BR and thanks,
Nikolaus Schaller