netdev - Re: Page allocator order-0 optimizations merged

Message-ID: <20170323144347.1e6f29de@redhat.com>
Date:   Thu, 23 Mar 2017 14:43:47 +0100
From:   Jesper Dangaard Brouer <brouer@...hat.com>
To:     Mel Gorman <mgorman@...hsingularity.net>
Cc:     Tariq Toukan <tariqt@...lanox.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        akpm@...ux-foundation.org, linux-mm <linux-mm@...ck.org>,
        Saeed Mahameed <saeedm@...lanox.com>, brouer@...hat.com
Subject: Re: Page allocator order-0 optimizations merged

On Wed, 22 Mar 2017 23:40:04 +0000
Mel Gorman <mgorman@...hsingularity.net> wrote:

> On Wed, Mar 22, 2017 at 07:39:17PM +0200, Tariq Toukan wrote:
> > > > > > This modification may slow allocations from IRQ context slightly
> > > > > > but the main gain from the per-cpu allocator is that it scales
> > > > > > better for allocations from multiple contexts.  There is an
> > > > > > implicit assumption that intensive allocations from IRQ contexts
> > > > > > on multiple CPUs from a single NUMA node are rare
> > Hi Mel, Jesper, and all.
> > 
> > This assumption contradicts regular multi-stream traffic that is naturally
> > handled over close numa cores.  I compared iperf TCP multistream (8 streams)
> > over CX4 (mlx5 driver) with kernels v4.10 (before this series) vs
> > kernel v4.11-rc1 (with this series).
> > I disabled the page-cache (recycle) mechanism to stress the page allocator,
> > and see a drastic degradation in BW, from 47.5 G in v4.10 to 31.4 G in
> > v4.11-rc1 (34% drop).
> > I noticed queued_spin_lock_slowpath occupies 62.87% of CPU time.  
> 
> Can you get the stack trace for the spin lock slowpath to confirm it's
> from IRQ context?

AFAIK the allocations happen in softirq context.  Argh, during review I
missed that in_interrupt() also covers softirq.  Mel, can we use an
in_irq() check instead?
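
To make the difference concrete, here is a small userspace sketch (not
kernel code) of how the two checks read the preempt_count bit fields.
The field widths are assumptions modelled on include/linux/preempt.h,
and the model_* helpers and may_use_pcp_fast_path() are hypothetical
names used only for illustration.  It also assumes the per-cpu fast
path is gated on the context check, as the quoted text suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Simplified userspace model of the preempt_count layout.
 * The widths below are assumptions mirroring include/linux/preempt.h.
 */
#define PREEMPT_BITS   8
#define SOFTIRQ_BITS   8
#define HARDIRQ_BITS   4
#define NMI_BITS       1

#define SOFTIRQ_SHIFT  (PREEMPT_BITS)
#define HARDIRQ_SHIFT  (SOFTIRQ_SHIFT + SOFTIRQ_BITS)
#define NMI_SHIFT      (HARDIRQ_SHIFT + HARDIRQ_BITS)

#define FIELD_MASK(bits, shift) (((1u << (bits)) - 1u) << (shift))
#define SOFTIRQ_MASK   FIELD_MASK(SOFTIRQ_BITS, SOFTIRQ_SHIFT)
#define HARDIRQ_MASK   FIELD_MASK(HARDIRQ_BITS, HARDIRQ_SHIFT)
#define NMI_MASK       FIELD_MASK(NMI_BITS, NMI_SHIFT)

#define SOFTIRQ_OFFSET (1u << SOFTIRQ_SHIFT)
#define HARDIRQ_OFFSET (1u << HARDIRQ_SHIFT)

/* Models in_irq(): true only while the hardirq field is non-zero. */
static inline bool model_in_irq(unsigned int pc)
{
	return (pc & HARDIRQ_MASK) != 0;
}

/* Models in_interrupt(): hardirq, softirq and NMI fields all count. */
static inline bool model_in_interrupt(unsigned int pc)
{
	return (pc & (HARDIRQ_MASK | SOFTIRQ_MASK | NMI_MASK)) != 0;
}

/*
 * Hypothetical gate for the per-cpu fast path: refuse it when the
 * chosen context check fires.
 */
static inline bool may_use_pcp_fast_path(unsigned int pc, bool hardirq_only)
{
	return hardirq_only ? !model_in_irq(pc) : !model_in_interrupt(pc);
}

/* Prints how each check classifies a given preempt_count value. */
static inline void describe_context(const char *name, unsigned int pc)
{
	printf("%-8s in_interrupt=%d in_irq=%d\n", name,
	       model_in_interrupt(pc), model_in_irq(pc));
}
```

Calling describe_context("softirq", SOFTIRQ_OFFSET) shows the point: a
softirq-context caller is rejected by the in_interrupt()-style gate but
would be allowed by an in_irq()-style one.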

(p.s. just landed and got home)
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer