Date:	Wed, 4 Dec 2013 21:23:23 +0000
From:	Zoltan Kiss <zoltan.kiss@...rix.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
CC:	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"xen-devel@...ts.xenproject.org" <xen-devel@...ts.xenproject.org>,
	Malcolm Crossley <malcolm.crossley@...rix.com>,
	Jonathan Davies <Jonathan.Davies@...citrix.com>,
	Paul Durrant <Paul.Durrant@...rix.com>,
	Wei Liu <wei.liu2@...rix.com>,
	Ian Campbell <Ian.Campbell@...rix.com>
Subject: Re: NAPI rescheduling and the delay caused by it

On 04/12/13 20:41, Eric Dumazet wrote:
> On Wed, 2013-12-04 at 18:55 +0000, Zoltan Kiss wrote:
>
>> So, my questions are:
>> - why is NAPI rescheduled on another CPU?
>> - why does it cause a 3-4 millisecond delay?
>
> NAPI can not be scheduled on another cpu.
>
> But at the time of the napi_schedule() call, the napi_struct can already
> be scheduled by another cpu.
>
> ( NAPI_STATE_SCHED bit already set)
> So I would say something kept the 'other' cpu from responding fast enough
> to softirq events that were ready for service.
>
> (Another wakeup happened 3-4 millisec later)
Oh, thanks! I forgot to mention that I have my grant mapping patches 
applied. The callback that runs when the previous packet is sent to the 
other vif schedules the NAPI instance on that other CPU. But it's still 
not clear why it takes so long to serve that softirq!


> Really, I suspect your usage of netif_wake_queue() is simply wrong.
>
> Check why we have netif_rx() and netif_rx_ni() variants.
>
> And ask yourself if xenvif_notify_tx_completion() is correct, being
> called from process context.
So, at the moment we use netif_wake_queue to notify the stack that it 
can call xenvif_start_xmit, i.e. that the thread is ready to accept new 
packets for transmission. It is called when we get an interrupt from the 
frontend (which signals that it has made room in the ring), and from 
xenvif_notify_tx_completion at the end of the thread. The latter checks 
whether queueing was stopped in the meantime and whether the guest made 
space after our recent transmission.
I see that netif_rx_ni makes sure the softirq is executed, but I'm not 
sure I understand how it relates to netif_wake_queue. Can you explain a 
bit more?
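
This is roughly how I read netif_rx_ni(), by the way (simplified from 
what I remember of net/core/dev.c, not a verbatim copy, and suffixed 
_sketch to make that clear):

int netif_rx_ni_sketch(struct sk_buff *skb)
{
	int err;

	preempt_disable();
	err = netif_rx(skb);		/* enqueue skb, raise NET_RX_SOFTIRQ */
	if (local_softirq_pending())
		do_softirq();		/* process context: run the softirq
					 * now instead of waiting for the
					 * next irq exit or ksoftirqd */
	preempt_enable();

	return err;
}

So netif_rx() alone only raises the softirq; from hard irq context that 
is fine because softirqs run on irq exit, but from process context the 
_ni variant exists to run it promptly.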

Thanks,

Zoli

