Date:	Thu, 13 May 2010 16:42:30 -0700
From:	Dan Williams <dan.j.williams@...el.com>
To:	David Howells <dhowells@...hat.com>
Cc:	linux-kernel@...r.kernel.org, linux-raid@...r.kernel.org,
	netdev@...r.kernel.org,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Maciej Sosnowski <maciej.sosnowski@...el.com>
Subject: Re: [PATCH 2/2] ioat2,3: convert to producer/consumer locking

On Wed, May 12, 2010 at 1:36 AM, David Howells <dhowells@...hat.com> wrote:
>
> Out of interest, does it make the code smaller if you mark
> ioat2_get_ring_ent() and ioat2_ring_mask() with __attribute_const__?
>
> I'm not sure whether it'll affect how long gcc is willing to cache these, but
> once computed, I would guess they won't change within the calling function.

Unfortunately, it does not make a difference, but I'll keep this in
mind if ioat2_get_ring_ent() ever gets more complicated (which it
might in the future).
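
For illustration only, a minimal sketch of what the suggested annotation
looks like on a mask helper. The name ring_mask_const() and its by-value
'order' parameter are hypothetical, not the driver's real signature. In
the kernel, __attribute_const__ expands to GCC's __attribute__((const)).
GCC documents that a const function must not examine memory through
pointer arguments if that memory can change between calls, which is why
this sketch takes the ring order by value rather than a channel pointer.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: the result depends only on the argument value,
 * so the compiler may reuse one computed result within a caller.
 */
__attribute__((const))
static inline uint16_t ring_mask_const(unsigned order)
{
	assert(order < 16);	/* keeps the mask within 16 bits */
	return (uint16_t)((1u << order) - 1);
}
```

With order 4 (a 16-entry ring) the helper returns 15.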

> Also, is the device you're driving watching the ring and its indices?  If so,
> does it modify the indices?  If that is the case, you might need to use
> read_barrier_depends() rather than smp_read_barrier_depends().

The device does not observe the indices directly.  Instead, we
increment a free-running 'count' register by the distance between
ioat->pending and ioat->head.
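
A minimal sketch of that bookkeeping, under stated assumptions: struct
ring_state and the ring_*() helpers are hypothetical stand-ins rather
than the driver's real definitions, the indices are assumed to be
free-running 16-bit counters, and 'dmacount' only models the device
register (no MMIO write is shown).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the channel state; not the real struct. */
struct ring_state {
	uint16_t head;		/* next slot the producer will fill */
	uint16_t pending;	/* first slot not yet announced to the device */
	uint16_t dmacount;	/* models the free-running 'count' register */
	unsigned order;		/* ring holds 1 << order descriptors */
};

static inline uint16_t ring_mask(const struct ring_state *r)
{
	assert(r->order < 16);
	return (uint16_t)((1u << r->order) - 1);
}

/* Distance from pending to head; unsigned wraparound keeps it correct. */
static inline uint16_t ring_pending(const struct ring_state *r)
{
	return (uint16_t)(r->head - r->pending);
}

/* Advance the count by the distance, then mark everything as announced. */
static inline uint16_t ring_issue_pending(struct ring_state *r)
{
	uint16_t n = ring_pending(r);

	r->dmacount = (uint16_t)(r->dmacount + n);
	r->pending = r->head;
	return n;
}
```

Because the device only sees the count advance, it never has to read
or write the ring indices themselves.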

>
>> +             prefetch(ioat2_get_ring_ent(ioat, idx + i + 1));
>> +             desc = ioat2_get_ring_ent(ioat, idx + i);
>>               dump_desc_dbg(ioat, desc);
>>               tx = &desc->txd;
>>               if (tx->cookie) {
>
> Is this right, I wonder?  You're prefetching [i+1] before reading [i]?  Doesn't
> this mean that you might have to wait for [i+1] to be retrieved from RAM before
> [i] can be read?  Should you instead read tx->cookie before issuing the
> prefetch?  Admittedly, this is only likely to affect the reading of the head of
> the queue - subsequent reads in the same loop will, of course, have been
> prefetched.

Yes, it should be the other way around.
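
One possible reading of the reordering, as a hedged sketch rather than
the actual follow-up patch: load the cookie of entry [i] first, then
issue the prefetch for [i+1]. struct desc, get_ring_ent() and
clean_ring() are simplified, hypothetical stand-ins, and
__builtin_prefetch() (GCC/Clang) is used in place of the kernel's
prefetch().

```c
#include <stdint.h>

struct desc {
	uint32_t cookie;	/* nonzero while a completion is outstanding */
};

/* Map a free-running index onto a ring of 1 << order entries. */
static inline struct desc *get_ring_ent(struct desc *ring, unsigned order,
					uint16_t idx)
{
	return &ring[idx & ((1u << order) - 1)];
}

/* Walk 'active' entries starting at idx; return how many had a cookie. */
static unsigned clean_ring(struct desc *ring, unsigned order, uint16_t idx,
			   unsigned active)
{
	unsigned cleaned = 0;

	for (unsigned i = 0; i < active; i++) {
		struct desc *d = get_ring_ent(ring, order, (uint16_t)(idx + i));
		/* Demand load of [i] comes first in program order ... */
		uint32_t cookie = d->cookie;

		/*
		 * ... then the hint for [i+1]. This expresses intent only;
		 * the compiler and CPU remain free to reorder the two.
		 */
		__builtin_prefetch(get_ring_ent(ring, order,
						(uint16_t)(idx + i + 1)));
		if (cookie) {
			d->cookie = 0;
			cleaned++;
		}
	}
	return cleaned;
}
```

As noted in the quoted text, the ordering mostly matters for the first
entry of a pass; later entries have already been prefetched by the
previous iteration.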

Thanks!

--
Dan