Message-ID: <20101210042752.GA3144@amd>
Date: Fri, 10 Dec 2010 15:27:52 +1100
From: Nick Piggin <npiggin@...nel.dk>
To: Nick Piggin <npiggin@...nel.dk>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
linux-arch@...r.kernel.org, x86@...nel.org,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: Big git diff speedup by avoiding x86 "fast string" memcmp
On Thu, Dec 09, 2010 at 06:09:38PM +1100, Nick Piggin wrote:
> So replace it with an open-coded byte comparison. This increases code
> size by 24 bytes in the critical __d_lookup_rcu function, but the
Actually, if the loop assumes len is non-zero (which is the case for
dentry compare), then the bloat is only 8 bytes, so not a problem.
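For illustration only, here is a minimal sketch of what such an open-coded
comparison could look like when len is assumed to be non-zero. The name
bytes_differ and its signature are hypothetical and are not taken from the
patch; the point is just that a do/while loop can skip the up-front length
test that a general-purpose memcmp has to make.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not the actual kernel code: compare two byte
 * strings of the same length, one byte at a time.
 *
 * Assumption: len > 0. That lets the loop test the first byte before
 * ever checking the length, which saves the initial zero-length branch.
 *
 * Returns 0 if all len bytes match, 1 at the first mismatch.
 */
static inline int bytes_differ(const unsigned char *a,
                               const unsigned char *b, size_t len)
{
    do {
        if (*a != *b)
            return 1;   /* mismatch found, stop early */
        a++;
        b++;
        len--;
    } while (len);      /* length is only checked after each byte */

    return 0;           /* every byte matched */
}
```

Calling this with len == 0 would read one byte past what the caller asked
for, so it is only safe where the caller guarantees a non-empty name.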
Also got numbers versus vanilla kernel, out of interest.
> speedup is huge, averaging 10 runs of each:
>
> git diff st   user   sys    elapsed  CPU
  vanilla       1.19   3.21   4.47     98.0
> before        1.15   2.57   3.82     97.1
> after         1.14   2.35   3.61     96.8
>
> git diff mt   user   sys    elapsed  CPU
  vanilla       1.57   45.75  3.60     1312
> before        1.27   3.85   1.46     349
> after         1.26   3.54   1.43     333
>
Single-thread elapsed time improvement, vanilla vs vfs: 19.23%. Not quite
as big as the AMD fam10h speedup, probably because Westmere does
atomics so damn quickly.
Multi thread numbers are no surprise.