Date:	Mon, 18 Feb 2013 21:17:13 -0800
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Marc MERLIN <marc@...lins.org>
Cc:	David Miller <davem@...emloft.net>, Larry.Finger@...inger.net,
	bhutchings@...arflare.com, linux-wireless@...r.kernel.org,
	netdev@...r.kernel.org
Subject: Re: 3.7.8/amd64 full interrupt hangs due to iwlwifi under big nfs
 copies out

On Mon, 2013-02-18 at 20:05 -0800, Marc MERLIN wrote:
> On Mon, Jul 16, 2012 at 06:21:57PM +0200, Eric Dumazet wrote:
> > > No, it's actually when I'm 'uploading' from my laptop to my server.
> > > One interesting thing is that my server is running lvm2 with snapshots,
> > > which makes writes slower than my laptop can push data over the network, so
> > > it's definitely causing buffers to fill up.
> > > I just did a download test and got 4.5MB/s sustained without problems.
> > 
> > Hmm, nfs apparently is able to push a lot of data, try to reduce
> > rsize/wsize to sane values, like 32K instead of 512K ?
> > 
> > gargamel:/mnt/dshelf2/ /net/gargamel/mnt/dshelf2 nfs4
> > rw,nosuid,nodev,relatime,vers=4.0,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=192.168.205.7,local_lock=none,addr=192.168.205.3 0 0
> > 
> > You could trace svc_sock_setbufsize() and check how large sk_sndbuf
> > is set
> 
> My apologies, I totally dropped the ball on this.
> 
> So, the problem was still there in more recent kernels.
> 
> TL;DR:
> - reducing nfs buffers removes the full hang
> - iwlwifi has a problem where lack of pages causes the whole machine to hang
> - NFS copies out, even with buffers down to 32K, are very wonky and cp does
>   not return until over 2 minutes after the copy is actually finished.
>   (I have a trace of what's hung in cp/nfs when this happens)
> 
> 
> Details:
> 
> It's still pretty severe because whatever blocks doesn't just end up
> blocking disk IO, but actually blocking interrupts altogether since my mouse
> can't move for a minute or more until some buffer flushes.
> 
> The last trace I got during this (I can't do sysrq because I have a broken 
> Lenovo T530 without a sysrq key, and typing doesn't really work when
> interrupts aren't firing).
> 
> Not sure if it's useful. First chrome had an issue, and then iwlwifi
> 
> chrome: page allocation failure: order:1, mode:0x4020
> Pid: 8730, comm: chrome Tainted: G           O 3.7.8-amd64-preempt-20121226-fixwd #1
> Call Trace:
>  <IRQ>  [<ffffffff810d5f38>] warn_alloc_failed+0x117/0x12c
>  [<ffffffff810d8cfd>] __alloc_pages_nodemask+0x66a/0x702
>  [<ffffffff8108a948>] ? arch_local_irq_save+0x15/0x1b
>  [<ffffffff811064af>] alloc_pages_current+0xcd/0xee
>  [<ffffffffa039b579>] iwl_rx_allocate+0x8c/0x271 [iwlwifi]
>  [<ffffffffa039c24e>] iwl_irq_tasklet+0x7e5/0x91c [iwlwifi]
>  [<ffffffff8104805e>] tasklet_action+0x80/0xd2
>  [<ffffffff81047c99>] __do_softirq+0xdf/0x1c5
>  [<ffffffff814c1ed6>] ? _raw_spin_lock+0x1b/0x1f
>  [<ffffffff810a7f37>] ? handle_irq_event+0x4d/0x62
>  [<ffffffff814c7f5c>] call_softirq+0x1c/0x30
>  [<ffffffff8101104e>] do_softirq+0x41/0x7f
>  [<ffffffff81047e52>] irq_exit+0x3f/0xa7
>  [<ffffffff81010d40>] do_IRQ+0x88/0x9f
>  [<ffffffff814c246d>] common_interrupt+0x6d/0x6d
>  <EOI> Mem-Info:

You could try loading iwlwifi with amsdu_size_8K set to 0 (disabled).

It should hopefully use order-0 pages

Some drivers can't fall back to low-order page allocations.

mlx4 is another example (it uses order-2 pages).
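
For reference, here is a rough sketch of what the "order" in the trace
means in bytes. An order-n allocation asks for 2^n physically contiguous
pages. The 4 KiB page size is an assumption (the usual base page size on
amd64), and the helper name alloc_bytes is made up for illustration; it
is not a kernel function.

```python
# Assumption: 4 KiB base pages, the usual default on amd64.
PAGE_SIZE = 4096


def alloc_bytes(order, page_size=PAGE_SIZE):
    """Bytes requested by an order-`order` allocation.

    An order-n allocation is 2**n physically contiguous pages.
    Hypothetical helper for illustration only.
    """
    if order < 0:
        raise ValueError("order must be non-negative")
    return (1 << order) * page_size


if __name__ == "__main__":
    # order 0: single page, order 1: the failing request in the trace,
    # order 2: the mlx4 case mentioned above.
    for order in (0, 1, 2):
        size = alloc_bytes(order)
        print("order-%d: %d pages, %d bytes" % (order, 1 << order, size))
```

Under that assumption the failing order:1 request is two contiguous
pages (8 KiB), while an order-0 request is a single page and never
depends on finding adjacent free pages, which is why it is more likely
to succeed when memory is fragmented.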


