Message-ID: <20110316194542.22530.qmail@science.horizon.com>
Date: 16 Mar 2011 15:45:42 -0400
From: "George Spelvin" <linux@...izon.com>
To: hughd@...gle.com, linux@...izon.com
Cc: herbert@...dor.hengli.com.au, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, mpm@...enic.com, penberg@...helsinki.fi
Subject: Re: [PATCH 1/8] drivers/random: Cache align ip_random better
>> 1) A smart compiler could note the alignment and issue wider copy
>> instructions. (Especially on alignment-required architectures.)
> Right, that part of it would benefit from stronger alignment,
> but does not generally need cacheline alignment.
Agreed. The only reason the structure is cacheline aligned is to keep
it all in a single cache line, and swapping the order of the elements
made the buffer more aligned without hurting the counter.
>> 2) The cacheline fetch would get more data faster. The data would
>> be transferred in the first 6 beats of the load from RAM (assuming a
>> 64-bit data bus) rather than waiting for 7, so you'd finish the copy
>> 1 ns sooner or so. Similar 1-cycle win on a 128-bit Ln->L(n-1) cache
>> transfer.
> That argument worries me. I don't know enough to say whether you are
> correct or not. But if you are correct, then it worries me that your
> patch will be the first of a trickle growing to a stream to an avalanche
> of patches where people align and reorder structures so that the most
> commonly accessed fields are at the beginning of the cacheline, so that
> those can then be accessed minutely faster.
>
> Aargh, and now I am setting off the avalanche with that remark.
> Please, someone, save us by discrediting George's argument.
It was mostly #1 and #3. The *important* thing is to minimize the number
of cache lines touched by common operations, which has already been the
subject of a lot of kernel patches.
Remember, most hardware does have critical-word-first loads. So alignment
to the width of the data bus is enough. "Keep it naturally aligned" is
all that's necessary, and most kernel data structures already obey that.
I was just extending it, because I wanted to make it *possible* to use
wider loads.
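As a rough sketch of the beat arithmetic behind #2 and the critical-word-first remark: the helper below (a hypothetical name, not a kernel function) counts how many bus beats are needed to deliver a byte range, assuming the transfer starts at the beat containing the first requested byte. It is a simplified model that ignores wrap-around bursts and everything else real memory controllers do.

```c
#include <stddef.h>

/*
 * Beats needed to deliver bytes [offset, offset + len) over a bus that
 * moves bus_bytes per beat, counting from the beat that holds the first
 * requested byte (a simplified critical-word-first model).
 * Assumes len > 0 and bus_bytes > 0.
 */
size_t beats_needed(size_t offset, size_t len, size_t bus_bytes)
{
    size_t first_beat = offset / bus_bytes;
    size_t last_beat  = (offset + len - 1) / bus_bytes;

    return last_beat - first_beat + 1;
}
```

Under this model the 48-byte buffer needs 6 beats on a 64-bit (8-byte) bus when it starts at offset 0 or 8, but 7 when it starts at offset 4; on a 128-bit (16-byte) bus the figures are 3 and 4. That is the one-beat difference mentioned above, and it shows why alignment to the bus width, rather than to the start of the cache line, is what matters.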
>> As I said, "infinitesimal". The main reason that I bothered to
>> generate a patch was that it appealed to my sense of neatness to
>> keep the 3x16-byte buffer 16-byte aligned.
> Ah, now you come clean! Yes, it does feel neater to me too;
> but I doubt that would be sufficient justification by itself.
It took both factors to make it worth it to me. The real reason was:
1) Neater
2) Definitely not slower
3) Maybe a tiny bit faster
Conclusion: do it.
Sorry to alarm you.