Message-Id: <20171129093555.11395-1-courbet@google.com>
Date: Wed, 29 Nov 2017 10:35:55 +0100
From: Clement Courbet <courbet@...gle.com>
To: Yury Norov <ynorov@...iumnetworks.com>
Cc: linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>
Subject: [PATCH v5] lib: optimize cpumask_next_and()
> > Note that on Arm (), the new c implementation still outperforms the
> > old one that uses c+ the asm implementation of `find_next_bit` [3].
> What is 'c+'? Is it typo?
I meant "a mix of C and asm" (C + asm). Rephrased.
> If you find generic find_bit() on arm faster that asm one, we'd
> definitely drop that piece of asm. I have this check it in my
> long list.
What's faster for sure is the mix (the improvement in this commit minus the
possible hit from not using the asm implementation). I can't tell whether the
latter is negligible or not (I only have one ARM board to try it out), but
that's definitely something to try.
> This is old version of test based on get_cycles. New one is based on
> ktime_get and has other minor changes. I think you'd rerun tests to
> not confuse readers. New version is already in linux-next.
So I'm not sure whether I should be submitting this against 'linux' or
'linux-next'? This patch is against 'linux', so I think it should
be consistent with the surrounding code.
> > #ifndef find_first_bit
> > #define find_first_bit(addr, size) find_next_bit((addr), (size), 0)
> > #endif
> > #ifndef find_first_zero_bit
> > #define find_first_zero_bit(addr, size) find_next_zero_bit((addr), (size), 0)
> > #endif
> How this change related to the find_next_and_bit?
The arm header defines these symbols. Now that we're including
the generic implementation in the arm headers, we need to guard this to
avoid the duplicate definition.
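
As a rough illustration of the guard pattern described above, here is a
minimal userspace sketch. The names arch_find_first_bit, arch_calls,
generic_find_next_bit and generic_find_next_zero_bit are hypothetical
stand-ins, not the actual arm or generic kernel symbols; the sketch only
shows that a symbol already claimed by an "arch" definition is left alone,
while one nobody defined falls back to the generic version.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Counts calls into the hypothetical arch helper so the choice is observable. */
static int arch_calls;

/* Hypothetical stand-in for an architecture-specific implementation. */
static inline unsigned long arch_find_first_bit(const unsigned long *addr,
                                                unsigned long size)
{
    unsigned long i;

    assert(addr != NULL);
    arch_calls++;
    for (i = 0; i < size; i++)
        if ((addr[i / SKETCH_BITS_PER_LONG] >> (i % SKETCH_BITS_PER_LONG)) & 1UL)
            return i;
    return size;
}

/* The "arch header" claims the symbol first. */
#define find_first_bit(addr, size) arch_find_first_bit((addr), (size))

/* Hypothetical generic helpers: scan bit by bit from 'start'. */
static inline unsigned long generic_find_next_bit(const unsigned long *addr,
                                                  unsigned long size,
                                                  unsigned long start)
{
    for (; start < size; start++)
        if ((addr[start / SKETCH_BITS_PER_LONG] >> (start % SKETCH_BITS_PER_LONG)) & 1UL)
            return start;
    return size;
}

static inline unsigned long generic_find_next_zero_bit(const unsigned long *addr,
                                                       unsigned long size,
                                                       unsigned long start)
{
    for (; start < size; start++)
        if (!((addr[start / SKETCH_BITS_PER_LONG] >> (start % SKETCH_BITS_PER_LONG)) & 1UL))
            return start;
    return size;
}

/* The "generic header" only defines what nobody has defined yet. */
#ifndef find_first_bit
#define find_first_bit(addr, size) generic_find_next_bit((addr), (size), 0)
#endif
#ifndef find_first_zero_bit
#define find_first_zero_bit(addr, size) generic_find_next_zero_bit((addr), (size), 0)
#endif
```

Without the #ifndef guards, the second #define of find_first_bit would
redefine the macro with a different body, which is the duplicate
definition the guards avoid.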
> > test_find_next_and_bit_ref
> I don't understand the purpose of this. It's obviously clear that
> test_find_next_and_bit cannot be slower than test_find_next_and_bit_ref
Fair enough :) That was to back my claim that this commit is worth it.
I've removed the "_ref" version.
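
For readers who want to see why a word-wise AND scan cannot lose to a
bit-by-bit reference, here is a minimal userspace sketch. It is not the
kernel code and not the removed "_ref" test, whose body is not shown in
this message; the sketch_* names are hypothetical and the reference
version is an assumption about what such a baseline might look like.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Word-wise scan: AND one word of each bitmap at a time and skip the
 * whole word when the result is zero. Returns 'size' if no bit >= start
 * is set in both bitmaps.
 */
static inline unsigned long sketch_find_next_and_bit(const unsigned long *a,
                                                     const unsigned long *b,
                                                     unsigned long size,
                                                     unsigned long start)
{
    assert(a != NULL && b != NULL);
    while (start < size) {
        unsigned long idx = start / SKETCH_BITS_PER_LONG;
        unsigned long off = start % SKETCH_BITS_PER_LONG;
        /* Keep only bits at or above 'start' within this word. */
        unsigned long word = (a[idx] & b[idx]) & (~0UL << off);

        if (word) {
            unsigned long bit = idx * SKETCH_BITS_PER_LONG;

            /* Portable count of trailing zeros. */
            while (!(word & 1UL)) {
                word >>= 1;
                bit++;
            }
            return bit < size ? bit : size;
        }
        start = (idx + 1) * SKETCH_BITS_PER_LONG;
    }
    return size;
}

/* Assumed baseline: test every bit of both bitmaps individually. */
static inline unsigned long sketch_find_next_and_bit_ref(const unsigned long *a,
                                                         const unsigned long *b,
                                                         unsigned long size,
                                                         unsigned long start)
{
    assert(a != NULL && b != NULL);
    for (; start < size; start++) {
        unsigned long mask = 1UL << (start % SKETCH_BITS_PER_LONG);
        unsigned long idx = start / SKETCH_BITS_PER_LONG;

        if ((a[idx] & mask) && (b[idx] & mask))
            return start;
    }
    return size;
}
```

Both functions return the same index for any input; the word-wise one
simply skips a whole word per iteration whenever the AND of the two
words is zero.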
> For sparse bitmaps it will be like traversing zero-bitmaps. I doubt
> this numbers will be representative. Do we need this test at all?
It's just two lines, and it gives an interesting data point. Why not
keep it?