netdev - RE: [PATCH net] hyperv: Fix the error processing in netvsc


Message-ID: <BN1PR0301MB0770914291EB27A38F9C69EFCA3A0@BN1PR0301MB0770.namprd03.prod.outlook.com>
Date:	Wed, 4 Feb 2015 22:26:51 +0000
From:	Haiyang Zhang <haiyangz@...rosoft.com>
To:	Jason Wang <jasowang@...hat.com>
CC:	"davem@...emloft.net" <davem@...emloft.net>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	KY Srinivasan <kys@...rosoft.com>,
	"olaf@...fle.de" <olaf@...fle.de>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"driverdev-devel@...uxdriverproject.org" 
	<driverdev-devel@...uxdriverproject.org>
Subject: RE: [PATCH net] hyperv: Fix the error processing in netvsc_send()



> -----Original Message-----
> From: Jason Wang [mailto:jasowang@...hat.com]
> Sent: Wednesday, February 4, 2015 2:29 AM
> > The EAGAIN error doesn't normally happen, because we set the hi water
> > mark
> > to stop send queue.
> 
> This is not true since only txq was stopped which means only network
> stack stop sending packets but not for control path e.g
> rndis_filter_send_request() or other callers who call
> vmbus_sendpacket() directly (e.g recv completion).
> 
> For control path, user may meet several errors when they want to change
> mac address under heavy load.
> 
> What's more serious is netvsc_send_recv_completion(), it can not even
> recover from more than 3 times of EAGAIN.
> 
> I must say mixing data packets with control packets with the same
> channel sounds really scary. Since control packets could be blocked or
> even dropped because of data packets already queued during heavy load,
> and you need to synchronize two paths carefully (e.g I didn't see any
> tx lock being held if rndis_filter_send_request() calls netvsc_send(),
> which may stop or start a queue).

RING_AVAIL_PERCENT_HIWATER is defined as 20, so data traffic can only
occupy 20% of the ring buffer before the txq is stopped. This mechanism
ensures that control messages are not blocked by data traffic.
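
A rough sketch of the reasoning above, written as standalone C rather than
taken from the driver. It follows the wording of the paragraph literally
(data stops once it has used 20% of the ring); the exact comparison in the
driver may differ. The struct, the function names, and the byte-count model
of the ring are assumptions made for illustration; only the value 20 comes
from RING_AVAIL_PERCENT_HIWATER.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value taken from the message above; everything else here is hypothetical. */
#define SKETCH_DATA_LIMIT_PERCENT 20u

/* Hypothetical, simplified ring: only tracks total and used bytes. */
struct sketch_ring {
	uint32_t size;
	uint32_t used;
};

/* Percentage of the ring currently in use (0 for a zero-sized ring). */
static uint32_t sketch_used_percent(const struct sketch_ring *r)
{
	if (r->size == 0)
		return 0;
	return (uint32_t)(((uint64_t)r->used * 100u) / r->size);
}

/* Data path: keep sending only while usage is below the limit. */
static bool sketch_data_may_send(const struct sketch_ring *r)
{
	return sketch_used_percent(r) < SKETCH_DATA_LIMIT_PERCENT;
}

/* Control path: not subject to the limit, the message only has to fit. */
static bool sketch_control_may_send(const struct sketch_ring *r, uint32_t len)
{
	return r->used <= r->size && r->size - r->used >= len;
}
```

With a 1000-byte ring and 200 bytes in use, sketch_data_may_send() already
returns false while sketch_control_may_send() still accepts a 64-byte message.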

> >  If in really rare case, the ring buffer is full and there
> > is no outstanding sends, we can't stop queue here because there will
> > be no
> > send-completion msg to wake it up.
> 
> Confused, I believe only txq is stopped but we may still get completion
> interrupt in this case.

If there are no outstanding sends in this queue (queue_sends[q_idx] is
zero), we won't receive any more send-completion messages.

> 
> > And, the ring buffer is likely to be
> > occupied by other special msg, e.g. receive-completion msg (not a
> > normal case),
> > so we can't assume there are available slots.
> 
> Then why not checking hv_ringbuf_avail_percent() instead? And there's
> no need to check queue_sends since it does not count recv completion.

ret == -EAGAIN already means the ring is full, so there is no need to check
hv_ringbuf_avail_percent() in that case.

> > We don't request retry from
> > the upper layer in this case to avoid possible busy retry.
> 
> Can't we just do this by stopping txq and depending on tx interrupt to
> wake it?

There is no tx interrupt. Do you mean rx interrupt for the send-completion?

In the usual case, when we hit the high water mark, the stopped queue depends
on a send-completion msg to wake it up. That does not work in some special
cases: as said above, we won't receive any more send-completion messages when
there are no outstanding sends in this queue.
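
To make the special case concrete, here is a minimal sketch of the decision
as standalone C. It is not the patch itself: the enum, the function name, and
the outstanding_sends parameter (standing in for queue_sends[q_idx]) are made
up for illustration.

```c
#include <errno.h>

/* Hypothetical actions for the transmit path; not driver identifiers. */
enum sketch_action {
	SKETCH_CONTINUE,       /* send succeeded, keep the queue running */
	SKETCH_STOP_AND_WAIT,  /* stop the queue; a send-completion will wake it */
	SKETCH_FAIL_NO_RETRY,  /* nothing would wake the queue, so do not stop it */
	SKETCH_FAIL_OTHER      /* any other error, handled elsewhere */
};

/*
 * Decide what to do after a send attempt.
 * ret: result of the send (0 or a negative errno value).
 * outstanding_sends: sends on this queue still awaiting completion.
 */
static enum sketch_action sketch_after_send(int ret, int outstanding_sends)
{
	if (ret == 0)
		return SKETCH_CONTINUE;
	if (ret != -EAGAIN)
		return SKETCH_FAIL_OTHER;
	/* Ring is full: stopping is only safe if a completion is still due. */
	if (outstanding_sends > 0)
		return SKETCH_STOP_AND_WAIT;
	return SKETCH_FAIL_NO_RETRY;
}
```

The last branch is the rare case discussed here: the ring is full but there
is no pending send-completion, so a stopped queue would never be woken.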

Thanks,
- Haiyang