Message-ID: <151954917.20121011120004@eikelenboom.it>
Date: Thu, 11 Oct 2012 12:00:04 +0200
From: Sander Eikelenboom <linux@...elenboom.it>
To: Ian Campbell <Ian.Campbell@...rix.com>
CC: xen-devel <xen-devel@...ts.xen.org>,
Konrad Rzeszutek Wilk <konrad@...nel.org>,
Eric Dumazet <eric.dumazet@...il.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Eric Dumazet <edumazet@...gle.com>
Subject: Re: [Xen-devel] compound skb frag pages appearing in start_xmit
Thursday, October 11, 2012, 10:02:26 AM, you wrote:
> On Wed, 2012-10-10 at 15:49 +0100, Sander Eikelenboom wrote:
>> Wednesday, October 10, 2012, 3:09:58 PM, you wrote:
>>
>> > On Wed, 2012-10-10 at 11:13 +0100, Ian Campbell wrote:
>> >> I haven't tackled netfront yet.
>>
>> > I seem to be totally unable to reproduce the equivalent issue on the
>> > netfront xmit side, even though it seems like the loop in
>> > xennet_make_frags ought to be obviously susceptible to it.
>>
>> > Konrad, Sander, are either of you able to repro, e.g. with:
>>
>>
>> Hmrrrmm i don't see any traces, only strange behaviour ..
>>
>> - I can connect to guests by ssh, but it's sluggish and sometimes stops working
> I saw something like this (ssh sluggish) even with dom0 itself. I'm
> trying to see if I can characterise it enough to reliably bisect it.
> I already switched out xen-unstable for 4.2-testing but that didn't make
> any difference.
>> - The guests seem to keep trying to connect to netback:
>>
>> [ 658.276719] xen_bridge: port 2(vif40.0) entered forwarding state
>> [ 658.282258] xen_bridge: port 2(vif40.0) entered forwarding state
>> [ 663.945964] xen_bridge: port 7(vif39.0) entered forwarding state
>> [ 669.674277] xen_bridge: port 2(vif40.0) entered disabled state
>> [ 669.680290] device vif40.0 left promiscuous mode
>> [ 669.685464] xen_bridge: port 2(vif40.0) entered disabled state
>> [ 672.857222] device vif41.0 entered promiscuous mode
>> [ 673.166254] xen-blkback:ring-ref 8, event-channel 9, protocol 1 (x86_64-abi)
>> [ 673.176368] xen_bridge: port 2(vif41.0) entered forwarding state
>> [ 673.182042] xen_bridge: port 2(vif41.0) entered forwarding state
>> [ 674.439725] xen_bridge: port 7(vif39.0) entered disabled state
>> [ 674.445708] device vif39.0 left promiscuous mode
>> [ 674.450955] xen_bridge: port 7(vif39.0) entered disabled state
>> [ 677.726040] device vif42.0 entered promiscuous mode
>> [ 678.053381] xen-blkback:ring-ref 8, event-channel 9, protocol 1 (x86_64-abi)
>> [ 678.062804] xen_bridge: port 7(vif42.0) entered forwarding state
>> [ 678.068433] xen_bridge: port 7(vif42.0) entered forwarding state
>> [ 688.224736] xen_bridge: port 2(vif41.0) entered forwarding state
>> [ 693.080557] xen_bridge: port 7(vif42.0) entered forwarding state
>> [ 700.786276] xen_bridge: port 7(vif42.0) entered disabled state
>> [ 700.792484] device vif42.0 left promiscuous mode
>> [ 700.802409] xen_bridge: port 7(vif42.0) entered disabled state
>> [ 704.133606] device vif43.0 entered promiscuous mode
>> [ 704.460160] xen-blkback:ring-ref 8, event-channel 9, protocol 1 (x86_64-abi)
>> [ 704.469800] xen_bridge: port 7(vif43.0) entered forwarding state
>> [ 704.475303] xen_bridge: port 7(vif43.0) entered forwarding state
>> [ 719.493788] xen_bridge: port 7(vif43.0) entered forwarding state
>> [ 726.302456] xen_bridge: port 7(vif43.0) entered disabled state
>> [ 726.308898] device vif43.0 left promiscuous mode
>> [ 726.314029] xen_bridge: port 7(vif43.0) entered disabled state
>>
>> All the guests are already up, but this keeps on going and going and going ....
> The domain number seems to be climbing, are you sure something isn't
> (crashing and) restarting?
Probably due to the BUG_ON from the patch below; I changed it into a WARN_ON.
I do seem to hit it, but at the moment only in one of the guests, and it triggers quite irregularly:
[ 34.298549] ------------[ cut here ]------------
[ 34.298567] WARNING: at drivers/net/xen-netfront.c:465 xennet_start_xmit+0x7fe/0x860()
[ 34.298574] Modules linked in:
[ 34.298597] Pid: 1580, comm: sshd Not tainted 3.6.0pre-rc1-20121011 #1
[ 34.298603] Call Trace:
[ 34.298611] [<ffffffff810664ea>] warn_slowpath_common+0x7a/0xb0
[ 34.298617] [<ffffffff81066535>] warn_slowpath_null+0x15/0x20
[ 34.298623] [<ffffffff8146d89e>] xennet_start_xmit+0x7fe/0x860
[ 34.298631] [<ffffffff8161f349>] dev_hard_start_xmit+0x209/0x460
[ 34.298637] [<ffffffff8163b036>] sch_direct_xmit+0xf6/0x290
[ 34.298643] [<ffffffff8161f746>] dev_queue_xmit+0x1a6/0x5a0
[ 34.298649] [<ffffffff8161f5a0>] ? dev_hard_start_xmit+0x460/0x460
[ 34.298656] [<ffffffff810aa8e5>] ? trace_softirqs_off+0x85/0x1b0
[ 34.298663] [<ffffffff816b9536>] ip_finish_output+0x226/0x530
[ 34.298668] [<ffffffff816b93dd>] ? ip_finish_output+0xcd/0x530
[ 34.298674] [<ffffffff816b9899>] ip_output+0x59/0xe0
[ 34.298680] [<ffffffff816b83b8>] ip_local_out+0x28/0x90
[ 34.298687] [<ffffffff816b896f>] ip_queue_xmit+0x17f/0x4a0
[ 34.298692] [<ffffffff816b87f0>] ? ip_send_unicast_reply+0x340/0x340
[ 34.298699] [<ffffffff810a0ba7>] ? getnstimeofday+0x47/0xe0
[ 34.298705] [<ffffffff8160f4c9>] ? __skb_clone+0x29/0x120
[ 34.298711] [<ffffffff816cea20>] tcp_transmit_skb+0x400/0x8d0
[ 34.298717] [<ffffffff816d19fa>] tcp_write_xmit+0x21a/0xa50
[ 34.298723] [<ffffffff816d225b>] tcp_push_one+0x2b/0x40
[ 34.298728] [<ffffffff816c2dec>] tcp_sendmsg+0x8dc/0xe20
[ 34.298735] [<ffffffff816e8f19>] inet_sendmsg+0xa9/0x100
[ 34.298740] [<ffffffff816e8e70>] ? inet_autobind+0x70/0x70
[ 34.298746] [<ffffffff810b0f88>] ? lock_acquire+0xd8/0x100
[ 34.298753] [<ffffffff8160630d>] sock_aio_write+0x12d/0x140
[ 34.298762] [<ffffffff811435b2>] do_sync_write+0xa2/0xe0
[ 34.298768] [<ffffffff810ad22d>] ? trace_hardirqs_on+0xd/0x10
[ 34.298774] [<ffffffff811441d4>] vfs_write+0x174/0x190
[ 34.298779] [<ffffffff811442fa>] sys_write+0x5a/0xa0
[ 34.298786] [<ffffffff812b33de>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 34.298792] [<ffffffff817491cc>] cstar_dispatch+0x7/0x26
[ 34.298797] ---[ end trace 2e28eec93b7a8b74 ]---
Complete dmesg from guest attached.
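For anyone following along, here is a rough userspace sketch of why a compound page in a frag could matter on the xmit side. It is not the netfront code. It assumes a 4096-byte page and that each grant covers exactly one page, so a frag whose offset + size runs past the first page would need more than one grant. The struct and function names (sketch_frag, sketch_grants_needed, sketch_frag_crosses_page) are made up for illustration and do not exist in the kernel.

```c
#include <stddef.h>

/* Assumed page size for this sketch; the kernel's PAGE_SIZE is arch-dependent. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical stand-in for an skb frag: a byte range measured from the
 * start of the (possibly compound) head page. Not the kernel's skb_frag_t.
 */
struct sketch_frag {
	unsigned int offset;	/* byte offset from the start of the head page */
	unsigned int size;	/* length of the frag in bytes */
};

/* Number of single-page grants needed to cover the whole frag. */
static unsigned int sketch_grants_needed(const struct sketch_frag *f)
{
	unsigned int first, last;

	if (f->size == 0)
		return 0;

	first = f->offset / SKETCH_PAGE_SIZE;
	last = (f->offset + f->size - 1) / SKETCH_PAGE_SIZE;
	return last - first + 1;
}

/*
 * Non-zero if granting only the head page would not cover the frag,
 * i.e. some of its bytes lie beyond the first SKETCH_PAGE_SIZE bytes.
 */
static int sketch_frag_crosses_page(const struct sketch_frag *f)
{
	return f->size != 0 && f->offset + f->size > SKETCH_PAGE_SIZE;
}
```

With these assumptions, a frag of offset 100 and size 4096 needs two grants. If I understand the issue correctly, that is the case a loop issuing one grant per frag would miss, and a PageCompound() check is one way to flag that such a frag may be present.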
>> > diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
>> > index b06ef81..8a3f770 100644
>> > --- a/drivers/net/xen-netfront.c
>> > +++ b/drivers/net/xen-netfront.c
>> > @@ -462,6 +462,8 @@ static void xennet_make_frags(struct sk_buff *skb, struct net_device *dev,
>> > ref = gnttab_claim_grant_reference(&np->gref_tx_head);
>> > BUG_ON((signed short)ref < 0);
>> >
>> > + BUG_ON(PageCompound(skb_frag_page(frag)));
>> > +
>> > mfn = pfn_to_mfn(page_to_pfn(skb_frag_page(frag)));
>> > gnttab_grant_foreign_access_ref(ref, np->xbdev->otherend_id,
>> > mfn, GNTMAP_readonly);
>>
>> > My repro for netback was just to netcat a wodge of data from dom0->domU
>> > but going the other way doesn't seem to trigger.
>>
>>
>>
View attachment "dmesg-netfront.txt" of type "text/plain" (177798 bytes)