Message-ID: <985009134.71250769263099.JavaMail.root@mail.holmansrus.com>
Date: Thu, 20 Aug 2009 06:54:23 -0500 (CDT)
From: Walt Holman <walt@...mansrus.com>
To: Krzysztof Halasa <khc@...waw.pl>
Cc: David Miller <davem@...emloft.net>, linux-kernel@...r.kernel.org,
netdev@...r.kernel.org
Subject: Re: Strange network timeouts w/ 2.6.30.5
----- "Krzysztof Halasa" <khc@...waw.pl> wrote:
> > Since patching to 2.6.30.5 I'm experiencing periodic timeouts on my
> > e100 which is used as my WAN interface on a server/router box. Nothing
> > is reported in any logs and eventually the traffic resumes. It seems
> > to happen at fairly regular intervals, although I've not timed them.
> > The timeouts last for approx. 60-120 seconds and then traffic resumes
> > normally with no hint of what happened.
>
> x86-64, intel P965...
>
> Can you provide "dmesg" output, please?
>
> I wonder what additional side effect did the patch cause. Streaming
> allocs on such x86 should already be coherent, no?
>
> Perhaps you have more than 2 GB RAM (or so) and swiotlb has to provide
> buffering? I think of something like:
>
> - the driver does "sync for CPU" and examines status
> - the descriptor is tested to be still empty
> - meanwhile e100 chip changes the status in the descriptor
> - the driver does "sync for device" (it's what the patch added)
> - at this point swiotlb doesn't know the descriptor is clean and writes
>   it out, thus dropping the change done by the e100 chip.
>
> Does the above seem plausible? I admit I'm not swiotlb expert, it's
> a pure guess that it simply and blindly moves data in and out.
>
> If that's the case, I don't really know how could it work without the
> patch in question. Perhaps the timings were just right?
>
> What can we do with it? Rewriting to use consistent allocs, of course.
> Temporarily adding #ifdef CONFIG_ARM around the
> pci_dma_sync_single_for_device()? Not sure if other archs were affected.
>
> The root problem is that the driver shouldn't use streaming allocations
> for its descriptors (they are written from both sides simultaneously).
> Only skb->data can be streaming.
> --
> Krzysztof Halasa
Hi Krzysztof,
dmesg is attached. This box does have more than 2 GB of RAM (6 GB total). The dmesg shows e100 initialized three times: the first is the stock modprobe, the second was forced with use_io, and the third was after reverting the patch.
-Walt
View attachment "dmesg.txt" of type "text/plain" (60718 bytes)
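
For reference, the swiotlb scenario guessed at in the quoted list can be modelled in user space. This is only a toy sketch, not e100 or swiotlb code: struct rfd, RFD_COMPLETE, driver_buf, bounce_buf and all function names are hypothetical, and it assumes a bidirectional mapping where each sync simply copies the whole descriptor between the driver's buffer and the bounce buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RFD_COMPLETE 0x8000u /* hypothetical "receive complete" status bit */

/* Hypothetical, stripped-down receive descriptor. */
struct rfd {
    uint16_t status;
    uint16_t command;
};

static struct rfd driver_buf; /* what the CPU reads (assumed above the DMA limit) */
static struct rfd bounce_buf; /* what the device reads and writes */

/* Assumed "sync for CPU": copy the bounce buffer back to the driver's buffer. */
static void sync_for_cpu(void)
{
    memcpy(&driver_buf, &bounce_buf, sizeof(driver_buf));
}

/* Assumed "sync for device": copy the driver's buffer out to the bounce buffer. */
static void sync_for_device(void)
{
    memcpy(&bounce_buf, &driver_buf, sizeof(bounce_buf));
}

/* The chip finishes a receive and updates the descriptor it can see. */
static void device_completes_rx(void)
{
    bounce_buf.status |= RFD_COMPLETE;
}

/*
 * Replay the sequence from the list. Returns 1 if the driver eventually
 * sees the completion, 0 if the device's update was lost.
 */
int driver_sees_completion(int resync_for_device)
{
    memset(&driver_buf, 0, sizeof(driver_buf));
    memset(&bounce_buf, 0, sizeof(bounce_buf));

    sync_for_cpu();                                    /* driver syncs for CPU   */
    int done = (driver_buf.status & RFD_COMPLETE) != 0; /* descriptor still empty */

    device_completes_rx();                             /* chip updates status    */
    assert(bounce_buf.status & RFD_COMPLETE);

    if (!done && resync_for_device)
        sync_for_device();                             /* stale copy written out */

    sync_for_cpu();                                    /* next poll              */
    return (driver_buf.status & RFD_COMPLETE) != 0;
}
```

In this model, driver_sees_completion(1) returns 0 because the stale copy overwrites the chip's status update, while driver_sees_completion(0) returns 1. Whether the real swiotlb path behaves this way on this box is exactly the open question in the thread.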