Message-ID: <20131012172124.GA18241@gmail.com>
Date:	Sat, 12 Oct 2013 19:21:24 +0200
From:	Ingo Molnar <mingo@...nel.org>
To:	Neil Horman <nhorman@...driver.com>
Cc:	linux-kernel@...r.kernel.org, sebastien.dugue@...l.net,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org
Subject: Re: [PATCH] x86: Run checksumming in parallel accross multiple alu's
* Neil Horman <nhorman@...driver.com> wrote:
> Sébastien Dugué reported to me that devices implementing ipoib (which
> don't have checksum offload hardware) were spending a significant amount
> of time computing checksums.  We found that by splitting the checksum
> computation into two separate streams, each skipping successive elements
> of the buffer being summed, we could parallelize the checksum operation
> across multiple ALUs.  Since neither chain is dependent on the result of
> the other, we get a speedup in execution (on hardware that has multiple
> ALUs available, which is almost ubiquitous on x86), and only a
> negligible decrease on hardware that has only a single ALU (an extra
> addition is introduced).  Since addition is commutative, the result is
> the same, only faster.
This patch should really come with measurement numbers: what performance
increase (and drop) did you get, and on which CPUs?
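
For readers following along, the two-stream idea described in the quoted
text can be sketched in portable user-space C roughly as below. This is
only an illustration and not the code from the patch: the function names
(csum_one_chain, csum_two_chains) and helpers are made up for the example,
it sums 16-bit words in native byte order, and whether the second
accumulator actually helps depends on the compiler and CPU, which is
exactly why numbers are needed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Load a 16-bit word in native byte order without alignment assumptions. */
static uint16_t load16(const unsigned char *p)
{
	uint16_t w;

	memcpy(&w, p, sizeof(w));
	return w;
}

/* A trailing odd byte is treated as a 16-bit word padded with a zero byte. */
static uint16_t load_last_byte(const unsigned char *p)
{
	uint16_t w = 0;

	memcpy(&w, p, 1);
	return w;
}

/* Fold a wide accumulator into 16 bits with end-around carry. */
static uint16_t fold(uint64_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Reference version: a single dependency chain through 'sum'. */
uint16_t csum_one_chain(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint64_t sum = 0;

	for (; len >= 2; p += 2, len -= 2)
		sum += load16(p);
	if (len)
		sum += load_last_byte(p);
	return fold(sum);
}

/*
 * Two-stream version: even-numbered words go to sum_a, odd-numbered words
 * to sum_b. Neither addition depends on the other, so a CPU with more than
 * one ALU may execute them in the same cycle.
 */
uint16_t csum_two_chains(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint64_t sum_a = 0, sum_b = 0;

	for (; len >= 4; p += 4, len -= 4) {
		sum_a += load16(p);
		sum_b += load16(p + 2);
	}
	if (len >= 2) {
		sum_a += load16(p);
		p += 2;
		len -= 2;
	}
	if (len)
		sum_b += load_last_byte(p);

	/* The one extra addition mentioned in the patch description. */
	return fold(sum_a + sum_b);
}
```

Both functions return the same value for any buffer, because the order in
which the words are added does not change the folded sum. A timing loop
around each one over a large buffer would give the kind of per-CPU numbers
asked for above.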
Thanks,
	Ingo