Date: Tue, 31 Dec 2013 10:31:07 -0500
From: Bill Cox <waywardgeek@...il.com>
To: discussions@...sword-hashing.net
Subject: Re: [PHC] optimizing memory access speed
This is fantastic! Thanks for the very helpful links. Maybe a bit off
topic, but it kills me how we still use a 1960s memory layout that thrashes
the cache. When I write CPU-intensive code, like IC place and route, I
still refuse to use C structs or C++ classes, due to the style of memory
layout, which ensures that for the 1 or 2 fields of an object I need in an
inner loop, I load a full cache line into cache, replacing data that may
have been useful. Instead, I use a code generator for data structures
called DataDraw, which organizes fields in arrays, unless "cache together"
clauses are used, which cause only specific fields to be included together
in a struct. This memory layout makes walking huge grid-based graphs 6X
faster in my old benchmarks. For typical EDA applications, the improvement
is about a 1.6X speedup on average (I can generate either struct-based or
array-based data structures for comparison). Most people today don't
realize that for memory-intensive tasks, speed is all about optimizing
cache performance.
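
To make the layout difference concrete, here is a minimal sketch in C. It is
not DataDraw output; every name in it (node_aos, nodes_soa, NUM_NODES, the
field names) is hypothetical, and it assumes a 64-byte cache line and 4-byte
alignment for uint32_t, as on typical desktop CPUs. The inner loop only needs
the cost field. With the struct layout each node read pulls in a whole line
of mostly cold data, while with the array layout one line holds 16
consecutive cost values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_NODES 1024  /* hypothetical graph size */

/* Conventional layout: all fields of one node sit next to each other. */
struct node_aos {
    uint32_t cost;      /* the only field the inner loop reads */
    uint32_t x, y;      /* cold in this loop */
    uint32_t flags;     /* cold in this loop */
    uint32_t next;      /* cold in this loop */
    char     name[44];  /* cold data; brings the node to 64 bytes here */
};

/* Array-based layout: one array per field, indexed by node number. */
struct nodes_soa {
    uint32_t cost[NUM_NODES];
    uint32_t x[NUM_NODES];
    uint32_t y[NUM_NODES];
    uint32_t flags[NUM_NODES];
    uint32_t next[NUM_NODES];
    char     name[NUM_NODES][44];
};

/* Struct layout: consecutive cost fields are a full node apart. */
uint64_t sum_cost_aos(const struct node_aos *nodes, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += nodes[i].cost;
    return total;
}

/* Array layout: cost values are contiguous, so every loaded byte is used. */
uint64_t sum_cost_soa(const struct nodes_soa *nodes, size_t n)
{
    assert(n <= NUM_NODES);
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += nodes->cost[i];
    return total;
}
```

Both functions return the same sum; only the memory traffic differs. Any
actual speedup depends on the machine, the working-set size, and the access
pattern, so this should be timed rather than assumed.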