Message-id: <4B4FADEF.6020408@majjas.com>
Date: Thu, 14 Jan 2010 18:51:11 -0500
From: Michael Breuer <mbreuer@...jas.com>
To: Stephen Hemminger <shemminger@...tta.com>
Cc: Jarek Poplawski <jarkao2@...il.com>,
David Miller <davem@...emloft.net>, mikem@...g3k.org,
flyboy@...il.com, rjw@...k.pl, netdev@...r.kernel.org
Subject: Re: [PATCH] sky2: safer transmit ring cleaning (v4)
On 1/14/2010 12:52 PM, Stephen Hemminger wrote:
> On Thu, 14 Jan 2010 10:14:45 +0000
> Jarek Poplawski<jarkao2@...il.com> wrote:
>
>
>> This makes it safe, but it still resembles the "short term fix"
>> according to David's opinion.
>>
>> This change seems to affect dev->stats too. Since they are not
>> updated in sky2_tx_clean(). Btw, I hope "&" is some optimization
>> because it's less readable than "&&".
>>
> Stats don't matter for packets flushed during device reset.
>
> The & is because in the most common case device is up,
> and we don't want the additional conditional branch.
>
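
For anyone following the & versus && point, here is a minimal user-space sketch of the difference. The struct and function names are hypothetical stand-ins, not the sky2 driver's actual fields or code. With && the second test is skipped when the first is false; with & both operands are always evaluated and combined without short-circuiting. The two forms agree only when both operands are strictly 0 or 1 and have no side effects.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for device state; not a sky2 structure. */
struct fake_dev {
    bool running;          /* is the device up? */
    unsigned int pending;  /* descriptors waiting to be cleaned */
};

/* Short-circuit form: the second test runs only if the first is true. */
bool should_clean_logical(const struct fake_dev *d)
{
    return d->running && d->pending != 0;
}

/* Bitwise form: both tests always run and their 0/1 results are ANDed. */
bool should_clean_bitwise(const struct fake_dev *d)
{
    return d->running & (d->pending != 0);
}
```

Whether the bitwise form really saves a conditional branch depends on the compiler and the architecture; a compiler is free to emit the same code for both.
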
I've been looking at what might explain the dhcp problems, as well as the
packets that are dropped only when there's an extra hop. I came across one
path that looks suspect, although I'm really not familiar with the network
stack code: neigh_compat_output (along with eth_rebuild_header and
arp_find). If I'm following things correctly (or perhaps mostly correctly),
the only time anything goes this route (pun intended) is when the packet
was routed to this box, and I'm guessing that bridging makes this more
likely. So my dhcp traffic would all be going through here, as would the
smb traffic that seemed flaky.

The race I may be seeing is that while the arp table is being rebuilt,
arp_find can free the skb. There is also other locking going on in this
path that may race with sky2.c in places on both the rx and tx paths. From
reading the code I *think* it's right, but test results suggest otherwise.
Aside from the potential race, I think there's also a corner case where
neigh_compat_output can return either with or without having freed the skb,
depending on the return from dev_hard_header. This may also be part of the
race.
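
To illustrate the kind of corner case I mean, here is a small user-space model. It is not the kernel code: struct pkt, pkt_alloc, pkt_free, output_ambiguous and output_consumes are all made-up names, and the header_ok flag only stands in for "the header build succeeded". The first function returns the same value whether or not it released the packet, so the caller cannot know who owns it afterwards. The second always consumes the packet, which is the unambiguous contract.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for an skb; not a kernel type. */
struct pkt {
    int *frees;  /* counter bumped on release so callers can observe it */
};

struct pkt *pkt_alloc(int *frees)
{
    struct pkt *p = malloc(sizeof(*p));
    assert(p != NULL);
    p->frees = frees;
    return p;
}

void pkt_free(struct pkt *p)
{
    (*p->frees)++;
    free(p);
}

/*
 * Ambiguous contract: returns 0 on both paths, but only the failure
 * path releases the packet. The caller cannot tell which one ran.
 */
int output_ambiguous(struct pkt *p, bool header_ok)
{
    if (!header_ok) {
        pkt_free(p);  /* failure path consumes the packet */
        return 0;
    }
    return 0;         /* success path leaves it with the caller */
}

/*
 * Explicit contract: the packet is always consumed, so the caller
 * must never touch it again, whatever the return value.
 */
int output_consumes(struct pkt *p, bool header_ok)
{
    int ret = header_ok ? 0 : -1;

    pkt_free(p);
    return ret;
}
```

With the ambiguous version, a caller that frees the packet after a 0 return double-frees on one path, and a caller that does not free it leaks on the other.
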
Maybe I've missed something, but as far as I can see this is just about
the only difference in code path between the traffic that works and the
traffic that occasionally doesn't.