Date:   Thu, 20 Apr 2017 15:25:37 +0200
From:   Frederic Weisbecker <fweisbec@...il.com>
To:     Jesper Dangaard Brouer <brouer@...hat.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Tariq Toukan <tariqt@...lanox.com>,
        LKML <linux-kernel@...r.kernel.org>,
        linux-mm <linux-mm@...ck.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        peterz@...radead.org
Subject: Re: Heads-up: two regressions in v4.11-rc series

On Thu, Apr 20, 2017 at 11:00:42AM +0200, Jesper Dangaard Brouer wrote:
> Hi Linus,
> 
> Just wanted to give a heads-up on two regressions in 4.11-rc series.
> 
> (1) page allocator optimization revert
> 
> Mel Gorman and I have been playing with optimizing the page allocator,
> but Tariq spotted that we caused a regression for (NIC) drivers that
> refill DMA RX rings in softirq context.
> 
> The end result was a revert, and this is waiting in AKPM's quilt queue:
>  http://ozlabs.org/~akpm/mmots/broken-out/revert-mm-page_alloc-only-use-per-cpu-allocator-for-irq-safe-requests.patch
> 
> 
> (2) Busy softirq can cause userspace not to be scheduled
> 
> I bisected the problem to a499a5a14dbd ("sched/cputime: Increment
> kcpustat directly on irqtime account"). See email thread with
>  Subject: Bisected softirq accounting issue in v4.11-rc1~170^2~28
>  http://lkml.kernel.org/r/20170328101403.34a82fbf@redhat.com
> 
> I don't know the scheduler code well enough to fix this, and will have
> to rely on others to figure out this scheduler regression.
> 
> To make it clear: I'm only seeing this scheduler regression when a
> remote host is sending many, many network packets toward the kernel,
> which keeps NAPI/softirq busy all the time.  A possible hint: the "top"
> tool only shows this in the "si" column, while on v4.10 "top" also
> blames "ksoftirqd/N"; in addition, the cputime reported by "ps" (0:00)
> seems wrong for ksoftirqd.

(I'm currently working on reproducing that one.)
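
For anyone who wants to watch the "si" side of the symptom without "top",
the following is a minimal sketch, not part of the original report. It
samples the aggregate "cpu" line of /proc/stat twice and reports the
fraction of CPU time spent in softirq between the two samples. The helper
names (parse_cpu_line, softirq_share, read_cpu_line) are made up for
illustration. It only shows system-wide softirq time; it does not look at
the per-task cputime of ksoftirqd, and it does not diagnose the regression.

```python
import time

# Field order of the "cpu" lines in /proc/stat (values are in clock ticks).
FIELDS = ("user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal", "guest", "guest_nice")


def parse_cpu_line(line):
    """Parse one 'cpu ...' line from /proc/stat into a dict of counters."""
    parts = line.split()
    if not parts or not parts[0].startswith("cpu"):
        raise ValueError("not a cpu line: %r" % line)
    return dict(zip(FIELDS, (int(v) for v in parts[1:])))


def softirq_share(before, after):
    """Fraction of elapsed CPU time accounted to softirq between samples."""
    # guest and guest_nice are already included in user and nice,
    # so only the first eight fields are summed for the total.
    keys = FIELDS[:8]
    deltas = {k: after.get(k, 0) - before.get(k, 0) for k in keys}
    total = sum(deltas.values())
    return deltas["softirq"] / total if total > 0 else 0.0


def read_cpu_line(path="/proc/stat"):
    """Return the first (aggregate) cpu line of /proc/stat."""
    with open(path) as f:
        return f.readline()


if __name__ == "__main__":
    first = parse_cpu_line(read_cpu_line())
    time.sleep(1)
    second = parse_cpu_line(read_cpu_line())
    print("softirq share over 1s: %.1f%%" % (100 * softirq_share(first, second)))
```

Running this on the receiving host while the remote sender is active should
show a large softirq share if the "si" column in "top" is high; comparing
that with the cputime "ps" reports for ksoftirqd/N is left to the reader.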
