netdev - Re: [PATCH] fix e100 rx path on ARM (was [PATCH] e100 rx: or s and el bits)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <46608BFE.7000905@pobox.com>
Date:	Fri, 01 Jun 2007 17:13:34 -0400
From:	Jeff Garzik <jgarzik@...ox.com>
To:	David Acker <dacker@...net.com>
CC:	Milton Miller <miltonm@....com>,
	"Kok, Auke" <auke-jan.h.kok@...el.com>,
	e1000-devel@...ts.sourceforge.net,
	Jeff Kirsher <jeffrey.t.kirsher@...el.com>,
	John Ronciak <john.ronciak@...el.com>,
	Jesse Brandeburg <jesse.brandeburg@...el.com>,
	Scott Feldman <sfeldma@...ox.com>, netdev@...r.kernel.org
Subject: Re: [PATCH] fix e100 rx path on ARM (was [PATCH] e100 rx: or s and
 el bits)

David Acker wrote:
> Milton Miller wrote:
>> Your logic works, this adds some conditional branching but saves a 
>> pointer, not sure which is better.  Mine can be used for initializing 
>> without special casing since it will just calculate rx_to_unmark as 
>> rx[n-2] and rx_to_mark as rx[n-2] != rx[n-2]->prev; unmarking a never 
>> marked still works, where as for yours the "was marked" must be 
>> explicitly set to something (hmm, rxs = rx[0] might work for that 
>> initial value until we mark -2??)
>>
>> again, the compare of rx->el instead of rx->rfd->el is to save cache 
>> traffic to the rfd under io.  The rx->was_0 is required, so the el 
>> flag is free.
>>
> 
> Ok, I took a stab at coding and testing these ideas.  Below is a patch 
> against 2.6.22-rc3.
> Let me know what you think.  Testing shows that we can hit any of the 
> following scenarios:
> 
> Find a buffer not complete with rx->el and rx->s0 set. I read back the 
> status and see if
> the receiver is still running.
> Find a buffer not complete with rx->el not set and rx->s0 set.  I check 
> the next buffer
> and if it is complete, I free the skb and return 0.  If the next buffer 
> is not
> complete, I read the receiver's status to see if he is still running.
> Find a buffer that is complete with rx->el not set and rx->s0 set.  It 
> appears that hardware
> can read the rfd's el-bit, then software can clear the rfd el-bit and 
> set the rfd size, and
> then hardware can come in and read the size.  I am reading the status 
> back, although
> I don't think that I have to in this instance.
> 
> I am testing a version of this code patched against 2.6.18.4 on my PXA 
> 255 based system.  I will let you all know how it goes.
> 
> On the ARM, their is a race condition between software allocating a new 
> receive
> buffer and hardware writing into a buffer.  The two race on touching the 
> last
> Receive Frame Descriptor (RFD).  It has its el-bit set and its next link 
> equal
> to 0.  When hardware encounters this buffer it attempts to write data to it
> and then update Status Word bits and Actual Count in the RFD.  At the 
> same time
> software may try to clear the el-bit and set the link address to a new 
> buffer.
> Since the entire RFD is once cache-line, the two write operations can
> collide.  This can lead to the receive unit stalling or freed receive 
> buffers
> getting written to.
> 
> The fix is to set the el-bit on and the size to 0 on the next to last 
> buffer
> in the chain.  When the hardware encounters this buffer it stops and does
> not write to it at all.  The hardware issues an RNR interrupt with the
> receive unit in the No Resources state.  When software allocates buffers,
> it can update the tail of the list when either it knows the hardware
> has stopped or the previous to the new one to mark marked.
> Once it has a new next to last buffer prepared, it can clear the el-bit
> and set the size on the previous one.  The race on this buffer is safe
> since the link already points to a valid next buffer.  We keep flags
> in our software descriptor to note if the el bit is set and if the size
> was 0.  When we clear the RFD's el bit and set its size, we also clear
> the el flag but we leave the size was 0 bit set.  This was we can find
> this buffer again later.
> 
> If the hardware sees the el-bit cleared without the size set, it will
> move on to the next buffer and skip this one.  If it sees
> the size set but the el-bit still set, it will complete that buffer
> and then RNR interrupt and wait.
> 
> 
> Signed-off-by: David Acker <dacker@...net.com>

That seems to vaguely match my memory of what eepro100 was doing (or 
trying to do).

I _really_ appreciate you working on this problem.  Getting e100 driver 
stable for the long term, and ditching eepro100, is a big hurdle to 
cross.  Getting this right is really one of the last steps.

The patch looks OK at quick glance.

Thanks,

	Jeff



-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html