Date: Wed, 05 May 2010 14:52:40 -0700 (PDT)
From: David Miller <davem@...emloft.net>
To: eric.dumazet@...il.com
Cc: netdev@...r.kernel.org, hadi@...erus.ca, therbert@...gle.com
Subject: Re: [PATCH net-next-2.6] net: __alloc_skb() speedup
From: Eric Dumazet <eric.dumazet@...il.com>
Date: Wed, 05 May 2010 14:00:09 +0200
> Sorry, I was thinking about the shinfo part :
>
> memset(shinfo, 0, offsetof(struct skb_shared_info, dataref));
>
> offsetof(struct skb_shared_info, dataref) is small enough and we dont
> dirty a full cache line, so maybe I can keep prefetchw(data + size) ?
You do dirty a full line on sparc64: the prefetch invalidate works one L1
cache line at a time, so 32 bytes. And this memset() is 40 bytes.
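
To put numbers on the 32-byte versus 40-byte point, here is a minimal
sketch. The helper name lines_touched() is hypothetical, not something in
the kernel; it just counts how many cache lines a write of a given length
covers, assuming a plain byte address and a fixed line size.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel function): number of cache lines
 * touched by a write of `len` bytes starting at byte address `addr`,
 * with a cache line size of `line` bytes.
 */
size_t lines_touched(size_t addr, size_t len, size_t line)
{
	assert(line != 0);
	if (len == 0)
		return 0;
	/* index of the last line written, minus index of the first, plus one */
	return (addr + len - 1) / line - addr / line + 1;
}
```

With 32-byte lines, lines_touched(0, 40, 32) is 2, so a 40-byte memset()
can never stay inside a single L1 line, whatever its alignment.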
The call to the memset symbol is still generated by gcc for this case.
I think the cutoff for doing it inline is something like 16 bytes on
sparc64, four 64-bit loads and stores.
Unlike x86, these RISC chips don't have string-op instructions, and for
sparc64 and powerpc the instructions are fixed in size (4 bytes), so
the inline cost is "(memset_size / word_size) * 4" bytes. On x86, by
contrast, the inlining cost is more or less fixed.
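
As a rough illustration of that formula, the sketch below evaluates it for
a given memset size. The function name inline_store_bytes() is
hypothetical, and the model is only the one stated above (one fixed-size
4-byte store instruction per word); it ignores any setup instructions the
compiler might also emit.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: estimated bytes of instruction text needed to
 * inline a memset on a fixed-width (4-byte instruction) RISC target,
 * using the formula (memset_size / word_size) * 4.
 */
size_t inline_store_bytes(size_t memset_size, size_t word_size)
{
	assert(word_size != 0);
	return (memset_size / word_size) * 4;
}
```

For the 40-byte memset discussed here with 8-byte words,
inline_store_bytes(40, 8) gives 20 bytes of store instructions, and the
cost keeps growing linearly with the size being cleared.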
> If not, in which cases can we use prefetchw() in kernel, if some arches
> dont handle it well ?
It has to be looked at on a case-by-case basis. There is no simple
answer here.