lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <02874ECE860811409154E81DA85FBB580DB022@ORSMSX105.amr.corp.intel.com>
Date:	Thu, 22 Mar 2012 21:59:01 +0000
From:	"Keller, Jacob E" <jacob.e.keller@...el.com>
To:	Richard Cochran <richardcochran@...il.com>
CC:	chetan loke <loke.chetan@...il.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"e1000-devel@...ts.sourceforge.net" 
	<e1000-devel@...ts.sourceforge.net>,
	"Kirsher, Jeffrey T" <jeffrey.t.kirsher@...el.com>,
	"Ronciak, John" <john.ronciak@...el.com>,
	"john.stultz@...aro.org" <john.stultz@...aro.org>,
	"tglx@...utronix.de" <tglx@...utronix.de>
Subject: RE: [PATCH net V4 2/2] igb: offer a PTP Hardware Clock instead of
 the timecompare method

> -----Original Message-----
> From: Richard Cochran [mailto:richardcochran@...il.com]
> Sent: Thursday, March 22, 2012 12:09 AM
> To: Keller, Jacob E
> Cc: chetan loke; netdev@...r.kernel.org; e1000-devel@...ts.sourceforge.net;
> Kirsher, Jeffrey T; Ronciak, John; john.stultz@...aro.org; tglx@...utronix.de
> Subject: Re: [PATCH net V4 2/2] igb: offer a PTP Hardware Clock instead of the
> timecompare method
> 
> On Wed, Mar 21, 2012 at 05:06:09PM +0000, Keller, Jacob E wrote:
> >
> > I agree with Chetan. I think it would be best to make sure the correct
> > form of locking is done, as we are providing an interface to the user.
> > Using a seqlock would allow for preventing the ioctls from blocking
> > the hardware timestamp code.
> 
> Okay, you can improve the time stamping path in the driver to avoid contending
> with callers to clock_gettime. But that will not help the hundreds of
> clock_gettime callers from contending with each other.
>

Seqlocks don't block when reading. The way they work is with a sequence number. When a reader wants to perform a read, it atomically records the sequence number. Then it performs the operation it wants (which can't have *any* side effects. It needs to be read only on the data). After finishing, it checks to see if the sequence number has increased. If so, it repeats the process until the sequence number stays the same.

Writers work like normal spin locks but increase a sequence number whenever they start work.

This means that readers don't block at all, and as long as the readers don't conflict with each other, there is no contention or writer starvation. It is possible for readers to 'live' lock, due to a large number of write operations. However, the lock is designed for few-writers, many readers. Which is what we have.

> > It's a fairly simple change for the gettime function (the most likely
> > culprit to be hammered) by changing it to use timecounter_cyc2time
> > function instead of timecounter_read. (as long as timecounter_read is
> > called at least every 1/2 the system time overflow, which it should be
> > due to the work task.)
> >
> > With that change, then the section use a seqlock (along with the
> > section for checking hardware timestamps). Other places would do the
> > full write lock.
> 
> So, you think that clock_gettime should not read the card's time registers?
> 
You can have it read the register value directly and use timecounter_cyc2time to convert that value into ns since the epoch. This works because you are already using timecounter_read during the worktask which should be running at least as often as half the counter wraparound time. Timecounter_read has a side effect of updating the last-read value in the timecounter structure. Timecounter_cyc2time does not. The MAC timestamps will be correct as long as the timecounter_read is updated at most half a wrap-around time before the systime is calculated.

This means that gettime, and tx/rx hwtstamp functions would be read operations, while settime, adjtime, cyclecounter_update and (if igb needs it?) the function which adjusts for link speed would be writes. Since these operations only occur occasionally (relative to tx/rx hwtstamp and gettime) we have a many readers/few writers scenario.

I can show you the changes on the ixgbe driver.

> Richard
> 

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ