lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20071221055822.8d977800.billfink@mindspring.com>
Date:	Fri, 21 Dec 2007 05:58:22 -0500
From:	Bill Fink <billfink@...dspring.com>
To:	David Miller <davem@...emloft.net>
Cc:	herbert@...dor.apana.org.au, ilpo.jarvinen@...sinki.fi,
	netdev@...r.kernel.org, jheffner@....edu
Subject: Re: TSO trimming question

On Fri, 21 Dec 2007, David Miller wrote:

> From: Herbert Xu <herbert@...dor.apana.org.au>
> Date: Fri, 21 Dec 2007 17:29:27 +0800
> 
> > On Fri, Dec 21, 2007 at 01:27:20AM -0800, David Miller wrote:
> > >
> > > It's two shifts, and this gets scheduled along with the other
> > > instructions on many cpus so it's effectively free.
> > > 
> > > I don't see why this is even worth mentioning and discussing.
> > 
> > I totally agree.  Two shifts are way better than a branch.
> 
> We take probably a thousand+ 100+ cycle cache misses in the TCP stack
> on big window openning ACKs.
> 
> Instead of discussing ways to solve that huge performance killer we're
> wanking about two friggin' integer shifts.
> 
> It's hilarious isn't it? :-)

I don't think obfuscated code is hilarious.  Instead of the convoluted
and dense code:

	/* Defer for less than two clock ticks. */
	if (tp->tso_deferred &&
	    ((jiffies << 1) >> 1) - (tp->tso_deferred >> 1) > 1)

You can have the much simpler and more easily understandable:

	/* Defer for less than two clock ticks. */
	if (tp->tso_deferred && (jiffies - tp->tso_deferred) > 1)

And instead of:

	/* Ok, it looks like it is advisable to defer.  */
        tp->tso_deferred = 1 | (jiffies<<1);

        return 1;

You could do as Ilpo suggested:

	/* Ok, it looks like it is advisable to defer.  */
	tp->tso_deferred = max_t(u32, jiffies, 1);

        return 1;

Or perhaps more efficiently:

	/* Ok, it looks like it is advisable to defer.  */
	tp->tso_deferred = jiffies;
	if (unlikely(jiffies == 0))
		tp->tso_deferred = 1;

        return 1;

Or perhaps even:

	/* Ok, it looks like it is advisable to defer.  */
	tp->tso_deferred = jiffies;

	/* need to return a non-zero value to defer, which means won't
	 * defer if jiffies == 0 but it's only a 1 in 4 billion event
	 * (and avoids a compare/branch by not checking jiffies)
	 /
        return jiffies;

Since it really only needs a non-zero return value to defer.

See, no branches needed and much clearer code.  That seems worthwhile
to me from a code maintenance standpoint, even if it isn't any speed
improvement.

And what about the 64-bit jiffies versus 32-bit tp->tso_deferred issue?
Should tso_deferred be made unsigned long to match jiffies?

						-Bill
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ