Date:	Fri, 20 Jun 2008 16:06:12 -0500
From:	Travis Stratman <tstratman@...cinc.com>
To:	Evgeniy Polyakov <johnpol@....mipt.ru>
Cc:	netdev@...r.kernel.org
Subject: Re: data received but not detected

On Fri, 2008-06-20 at 22:23 +0400, Evgeniy Polyakov wrote:
> On Fri, Jun 20, 2008 at 01:17:06PM -0500, Travis Stratman (tstratman@...cinc.com) wrote:
> > Let me clarify this again... I see the packet being sent at the expected
> > time from the sender on the tcpdump. The packet does not show up in
> > tcpdump or in the application on the receive side. When some other data
> > is received by the receiver (i.e. ARP), the missing packet shows up in
> > the tcpdump and in the application at the same time. So the delay shows
> > up in the tcpdump as well. It seems to me that everything is pointing to
> > the packet being in the DMA buffer but the controller driver not knowing
> > anything about it.
> 
> Argh. Ok, then please check that napi polling is called and rx interrupt
> happen for the driver.

This is what I have been focusing on. I'm still trying to find a good
way to see whether the interrupt is triggered for a specific packet. I
have no way of determining which packet it will freeze on, and if I put
any prints in the interrupt handler or poll function, it slows things
down enough that the problem disappears.
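
One way to check for the rx interrupt without adding prints to the
handler is to compare the interrupt counters in /proc/interrupts from
userspace before and after a stall. The following is only a minimal
sketch of that idea: the helper names sum_irq_counts() and
irq_count_for() are hypothetical, the name used to find the line
(e.g. "eth0") is an assumption that depends on how the driver registers
its IRQ, and it has not been tried against this particular problem.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Sum the per-CPU counters on one /proc/interrupts line, e.g.
 * " 29:   1234   5678   GIC  eth0". Returns -1 if the line has no
 * "label:" prefix. (Hypothetical helper, for illustration only.) */
long long sum_irq_counts(const char *line)
{
    const char *p = strchr(line, ':');
    long long total = 0;

    if (p == NULL)
        return -1;
    p++;                              /* skip the ':' after the IRQ label */

    for (;;) {
        char *end;

        while (*p == ' ' || *p == '\t')
            p++;
        if (!isdigit((unsigned char)*p))
            break;                    /* reached the controller/device name */
        total += strtoll(p, &end, 10);
        p = end;
    }
    return total;
}

/* Return the summed count for the first line of `path` that contains
 * `name`, or -1 if the file cannot be read or no line matches. */
long long irq_count_for(const char *path, const char *name)
{
    char line[1024];
    long long total = -1;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (strstr(line, name) != NULL) {
            total = sum_irq_counts(line);
            break;
        }
    }
    fclose(f);
    return total;
}
```

Calling irq_count_for("/proc/interrupts", "eth0") once when the
receiver stalls and again after the delayed packet shows up would
indicate whether the counter advanced in between. It cannot tie an
interrupt to one specific packet, only show whether any were taken.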

In the meantime I was testing why the FIONREAD ioctl made such a big
difference, and I found that if I insert a usleep(1) between the two
receive calls, the problem does not occur. In my earlier testing I had
put a usleep() between the send calls, which fixed the issue for me and
led me to assume that an IRQ was being missed when packets arrive too
close to each other.
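
For reference, the receive-side pattern described above looks roughly
like the sketch below. It is not the actual test program: the helper
names pending_bytes() and recv_pair() are made up for illustration, the
socket type is left to the caller, and the buffer handling is an
assumption.

```c
#define _DEFAULT_SOURCE
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Ask the kernel how many bytes are queued for reading on fd.
 * Returns -1 if the ioctl fails. (Hypothetical helper.) */
int pending_bytes(int fd)
{
    int n = 0;

    if (ioctl(fd, FIONREAD, &n) < 0)
        return -1;
    return n;
}

/* Do two receive calls on fd with an optional pause of gap_us
 * microseconds between them. gap_us == 0 gives the back-to-back
 * case; gap_us == 1 corresponds to the usleep(1) workaround.
 * Returns 0 if both calls succeed, -1 otherwise. */
int recv_pair(int fd, void *a, size_t alen, void *b, size_t blen,
              useconds_t gap_us, ssize_t *got_a, ssize_t *got_b)
{
    *got_a = recv(fd, a, alen, 0);
    if (*got_a < 0)
        return -1;

    if (gap_us > 0)
        usleep(gap_us);           /* the delay that hides the problem */

    *got_b = recv(fd, b, blen, 0);
    if (*got_b < 0)
        return -1;
    return 0;
}
```

Running the same receive loop with gap_us set to 0 and then to 1, or
with a pending_bytes() call in place of the sleep, is the comparison
being described.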

The fact that inserting a sleep between the two receive calls fixes the
issue makes this seem less like a driver issue. The only hypothesis that
I have been able to come up with so far is that calling recv() somehow
masks interrupts momentarily, so that if the packet comes in at exactly
the same time as recv() or poll() is called, the system does not know
anything about it, to the point that it does not even show up in the
packet trace. I have no idea how this could happen at this point.

Thanks,

Travis
