linux-kernel - Re: [PATCH] xen/netback: calculate correctly the SKB slots.

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1337678512.10118.40.camel@zakaz.uk.xensource.com>
Date:	Tue, 22 May 2012 10:21:52 +0100
From:	Ian Campbell <Ian.Campbell@...rix.com>
To:	Ben Hutchings <bhutchings@...arflare.com>
CC:	Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
	"xen-devel@...ts.xensource.com" <xen-devel@...ts.xensource.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"davem@...emloft.net" <davem@...emloft.net>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Adnan Misherfi <adnan.misherfi@...cle.com>
Subject: Re: [PATCH] xen/netback: calculate correctly the SKB slots.

On Mon, 2012-05-21 at 20:14 +0100, Ben Hutchings wrote:
> On Mon, 2012-05-21 at 13:36 -0400, Konrad Rzeszutek Wilk wrote:
> > From: Adnan Misherfi <adnan.misherfi@...cle.com>
> > 
> > A programming error cause the calculation of receive SKB slots to be
> > wrong, which caused the RX ring to be erroneously declared full,
> > and the receive queue to be stopped. The problem shows up when two
> > guest running on the same server tries to communicates using large
> > MTUs. Each guest is connected to a bridge with VLAN over bond
> > interface, so traffic from one guest leaves the server on one bridge
> > and comes back to the second guest on the second bridge. This can be
> > reproduces using ping, and one guest as follow:
> > 
> > - Create active-back bond (bond0)
> > - Set up VLAN 5 on bond0 (bond0.5)
> > - Create a bridge (br1)
> > - Add bond0.5 to a bridge (br1)
> > - Start a guest and connect it to br1
> > - Set MTU of 9000 across the link
> > 
> > Ping the guest from an external host using packet sizes of 3991, and
> > 4054; ping -s 3991 -c 128 "Guest-IP-Address"
> > 
> > At the beginning ping works fine, but after a while ping packets do
> > not reach the guest because the RX ring becomes full, and the queue
> > get stopped. Once the problem accrued, the only way to get out of it
> > is to reboot the guest, or use xm network-detach/network-attach.
> > 
> > ping works for packets sizes 3990,3992, and many other sizes including
> > 4000,5000,9000, and 1500 ..etc. MTU size of 3991,4054 are the sizes
> > that quickly reproduce this problem.
> > 
> > Signed-off-by: Adnan Misherfi <adnan.misherfi@...cle.com>
> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
> > ---
> >  drivers/net/xen-netback/netback.c |    2 +-
> >  1 files changed, 1 insertions(+), 1 deletions(-)
> > 
> > diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> > index 957cf9d..e382e5b 100644
> > --- a/drivers/net/xen-netback/netback.c
> > +++ b/drivers/net/xen-netback/netback.c
> > @@ -212,7 +212,7 @@ unsigned int xenvif_count_skb_slots(struct xenvif *vif, struct sk_buff *skb)
> 
> The function name is xen_netbk_count_skb_slots() in net-next.  This
> appears to depend on the series in
> <http://lists.xen.org/archives/html/xen-devel/2012-01/msg00982.html>.

Yes, I don't think that patchset was intended for prime time just yet.
Can this issue be reproduced without it?

> >  	int i, copy_off;
> >  
> >  	count = DIV_ROUND_UP(
> > -			offset_in_page(skb->data)+skb_headlen(skb), PAGE_SIZE);
> > +			offset_in_page(skb->data + skb_headlen(skb)), PAGE_SIZE);
> 
> The new version would be equivalent to:
> 	count = offset_in_page(skb->data + skb_headlen(skb)) != 0;
> which is not right, as netbk_gop_skb() will use one slot per page.

Just outside the context of this patch we separately count the frag
pages.

However I think you are right if skb->data covers > 1 page, since the
new version can only ever return 0 or 1. I expect this patch papers over
the underlying issue by not stopping often enough, rather than actually
fixing the underlying issue.

> The real problem is likely that you're not using the same condition to
> stop and wake the queue.

Agreed, it would be useful to see the argument for this patch presented
in that light. In particular the relationship between
xenvif_rx_schedulable() (used to wake queue) and
xen_netbk_must_stop_queue() (used to stop queue).

As it stands the description describes a setup which can repro the
problem but doesn't really analyse what actually happens, nor justify
the correctness of the fix.

>   Though it appears you're also missing an
> smp_mb() at the top of xenvif_notify_tx_completion().

I think the necessary barrier is in RING_PUSH_RESPONSES_AND_CHECK_NOTIFY
which is just prior to the single callsite of this function.

Ian.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/