Message-ID: <20200601084426.GB1667318@kroah.com>
Date: Mon, 1 Jun 2020 10:44:26 +0200
From: Greg KH <gregkh@...uxfoundation.org>
To: Tao pilgrim <pilgrimtao@...il.com>
Cc: Jens Axboe <axboe@...nel.dk>, hch@....de, sth@...ux.ibm.com,
viro@...iv.linux.org.uk, clm@...com, jaegeuk@...nel.org,
hch@...radead.org, Mark Fasheh <mark@...heh.com>,
dhowells@...hat.com, balbi@...nel.org, damien.lemoal@....com,
bvanassche@....org, ming.lei@...hat.com,
martin.petersen@...cle.com, satyat@...gle.com,
chaitanya.kulkarni@....com, houtao1@...wei.com,
asml.silence@...il.com, ajay.joshi@....com,
linux-kernel@...r.kernel.org,
Muchun Song <songmuchun@...edance.com>, hoeppner@...ux.ibm.com,
heiko.carstens@...ibm.com, gor@...ux.ibm.com,
borntraeger@...ibm.com, linux-s390@...r.kernel.org,
sagi@...mberg.me, linux-nvme@...ts.infradead.org,
linux-usb@...r.kernel.org, Josef Bacik <josef@...icpanda.com>,
dsterba@...e.com, linux-btrfs@...r.kernel.org, chao@...nel.org,
linux-f2fs-devel@...ts.sourceforge.net, darrick.wong@...cle.com,
linux-xfs@...r.kernel.org, linux-fsdevel@...r.kernel.org,
jlbec@...lplan.org, joseph.qi@...ux.alibaba.com,
ocfs2-devel@....oracle.com, deepa.kernel@...il.com
Subject: Re: [PATCH v2] blkdev: Replace blksize_bits() with ilog2()
On Mon, Jun 01, 2020 at 03:22:01PM +0800, Tao pilgrim wrote:
> On Fri, May 29, 2020 at 10:13 PM Jens Axboe <axboe@...nel.dk> wrote:
> >
> > On 5/29/20 8:11 AM, Kaitao Cheng wrote:
> > > The existing ilog2() function can replace blksize_bits().
> > > The generated code will be shorter and more efficient on some
> > > architectures, such as arm64, and ilog2() can be optimized
> > > per architecture.
> >
> > When you posted this last time, I said:
> >
> > "I like the simplification, but do you have any results to back up
> > that claim? Is the generated code shorter? Runs faster?"
> >
>
> Hi Jens Axboe:
>
> I did a test on ARM64.
> unsigned int ckt_blksize(int size)
> {
>         return blksize_bits(size);
> }
> unsigned int ckt_ilog2(int size)
> {
>         return ilog2(size);
> }
>
> When I compiled it into assembly code, I got the following result,
>
> 0000000000000088 <ckt_blksize>:
> 88: 2a0003e8 mov w8, w0
> 8c: 321d03e0 orr w0, wzr, #0x8
> 90: 11000400 add w0, w0, #0x1
> 94: 7108051f cmp w8, #0x201
> 98: 53017d08 lsr w8, w8, #1
> 9c: 54ffffa8 b.hi 90 <ckt_blksize+0x8>
> a0: d65f03c0 ret
> a4: d503201f nop
>
> 00000000000000a8 <ckt_ilog2>:
> a8: 320013e8 orr w8, wzr, #0x1f
> ac: 5ac01009 clz w9, w0
> b0: 4b090108 sub w8, w8, w9
> b4: 7100001f cmp w0, #0x0
> b8: 5a9f1100 csinv w0, w8, wzr, ne
> bc: d65f03c0 ret
>
> The generated code for ilog2() is shorter and runs faster.
But does this code path actually show up anywhere that measurably
matters?
If so, please show those benchmark results.
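
For the narrower question of how the two helpers compare in isolation,
here is a minimal userspace sketch, not kernel code. It assumes
GCC or Clang, a 32-bit unsigned int, and nonzero sizes.
blksize_bits_loop() is a userspace copy modeled on the loop in
blksize_bits(), ilog2_clz() is a stand-in for ilog2() built on
__builtin_clz(), and count_mismatches() and time_helper() are
hypothetical helper names made up for this example. The timing goes
through a function pointer, so it only gives a rough per-call cost and
says nothing about whether the path matters in a real workload.

```c
#include <time.h>

/* Loop-based helper, modeled on blksize_bits(); userspace copy for illustration. */
static unsigned int blksize_bits_loop(unsigned int size)
{
	unsigned int bits = 8;

	do {
		bits++;
		size >>= 1;
	} while (size > 256);
	return bits;
}

/*
 * Stand-in for ilog2(). Assumes GCC/Clang and a 32-bit unsigned int.
 * __builtin_clz(0) is undefined, so callers must pass size != 0.
 */
static unsigned int ilog2_clz(unsigned int size)
{
	return 31u - (unsigned int)__builtin_clz(size);
}

/* Count power-of-two sizes in [512, 65536] where the two helpers disagree. */
static int count_mismatches(void)
{
	unsigned int size;
	int bad = 0;

	for (size = 512; size <= 65536; size <<= 1)
		if (blksize_bits_loop(size) != ilog2_clz(size))
			bad++;
	return bad;
}

/* Hypothetical timing helper: CPU seconds for 'iters' calls over sizes 512..65536. */
static double time_helper(unsigned int (*fn)(unsigned int), unsigned long iters)
{
	volatile unsigned int sink = 0;	/* keeps the calls from being optimized away */
	unsigned long i;
	clock_t start = clock();

	for (i = 0; i < iters; i++)
		sink += fn(512u << (i & 7));
	return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling count_mismatches() from a small main() should confirm the two
helpers agree for power-of-two sizes from 512 up. They do not agree
below 512: the loop version returns 9 for a size of 256, while the
clz version returns 8, so any caller passing smaller sizes would need
checking before a straight replacement.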
thanks,
greg k-h