Message-ID: <20171103134020.3hwquerifnc6k6qw@techsingularity.net>
Date: Fri, 3 Nov 2017 13:40:20 +0000
From: Mel Gorman <mgorman@...hsingularity.net>
To: Tariq Toukan <tariqt@...lanox.com>
Cc: Linux Kernel Network Developers <netdev@...r.kernel.org>,
linux-mm <linux-mm@...ck.org>,
David Miller <davem@...emloft.net>,
Jesper Dangaard Brouer <brouer@...hat.com>,
Eric Dumazet <eric.dumazet@...il.com>,
Alexei Starovoitov <ast@...com>,
Saeed Mahameed <saeedm@...lanox.com>,
Eran Ben Elisha <eranbe@...lanox.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Michal Hocko <mhocko@...e.com>
Subject: Re: Page allocator bottleneck
On Thu, Nov 02, 2017 at 07:21:09PM +0200, Tariq Toukan wrote:
>
>
> On 18/09/2017 12:16 PM, Tariq Toukan wrote:
> >
> >
> > On 15/09/2017 1:23 PM, Mel Gorman wrote:
> > > On Thu, Sep 14, 2017 at 07:49:31PM +0300, Tariq Toukan wrote:
> > > > Insights: Major degradation between #1 and #2, not getting anywhere
> > > > close to line rate! The degradation is fixed between #2 and #3. This
> > > > is because the page allocator cannot keep up with the higher
> > > > allocation rate. In #2, we also see that adding rings (cores) reduces
> > > > BW (!!), as a result of increasing congestion over shared resources.
> > > >
> > >
> > > Unfortunately, no surprises there.
> > >
> > > > Congestion in this case is very clear. When monitored in perf top:
> > > >   85.58% [kernel] [k] queued_spin_lock_slowpath
> > > >
> > >
> > > While it's not proven, the most likely candidate is the zone lock
> > > and that should be confirmed using a call-graph profile. If so, then
> > > the suggestion to tune to the size of the per-cpu allocator would
> > > mitigate the problem.
> > >
> > Indeed, I tuned the per-cpu allocator and the bottleneck is relieved.
> >
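
To spell out what "tuning the per-cpu allocator" means here for anyone
following along: enlarging the per-cpu page lists (e.g. via the
vm.percpu_pagelist_fraction sysctl that existed in kernels of this era) so
that more allocations are satisfied without taking the zone lock. The sketch
below is only a simplified illustration of the idea, not the actual
mm/page_alloc.c code, and refill_from_buddy() is a made-up helper name:

	/*
	 * Illustrative only: the per-cpu list acts as a per-CPU cache in
	 * front of the buddy allocator.  The zone lock is only taken when
	 * the list runs dry and is refilled in a batch, so a larger
	 * ->high/->batch means fewer zone lock acquisitions per page.
	 */
	struct pcp_sketch {
		int count;		/* pages currently on the list */
		int high;		/* drain watermark */
		int batch;		/* pages moved per zone->lock hold */
		struct list_head list;
	};

	static struct page *pcp_alloc_sketch(struct zone *zone,
					     struct pcp_sketch *pcp)
	{
		struct page *page;

		if (list_empty(&pcp->list)) {
			/* Slow path: one zone->lock round trip buys a batch */
			spin_lock(&zone->lock);
			refill_from_buddy(zone, &pcp->list, pcp->batch);
			spin_unlock(&zone->lock);
			pcp->count += pcp->batch;
		}

		page = list_first_entry(&pcp->list, struct page, lru);
		list_del(&page->lru);
		pcp->count--;
		return page;
	}
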
>
> Hi all,
>
> After setting this task aside for a while to work on other things, I got
> back to it now and see that the good behavior I observed earlier was not
> stable.
>
> Recall: I work with a modified driver that allocates a page (4K) per packet
> (MTU=1500), in order to simulate the stress on the page allocator in 200Gbps
> NICs.
>
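For context, the page-per-packet pattern being simulated looks roughly like
the refill loop below. This is only a minimal sketch, not the actual mlx5
code; post_rx_descriptor() is a hypothetical stand-in for posting the buffer
to the RX ring:

	/*
	 * Sketch of an RX ring refill that allocates one order-0 page per
	 * packet (MTU 1500 fits in 4K).  Every descriptor costs one trip
	 * into the page allocator, which is what stresses it at 200Gbps.
	 */
	static int rx_refill_page_per_packet(struct device *dma_dev, int budget)
	{
		int filled = 0;

		while (filled < budget) {
			struct page *page = dev_alloc_pages(0); /* GFP_ATOMIC, order-0 */
			dma_addr_t dma;

			if (!page)
				break;

			dma = dma_map_page(dma_dev, page, 0, PAGE_SIZE,
					   DMA_FROM_DEVICE);
			if (dma_mapping_error(dma_dev, dma)) {
				__free_pages(page, 0);
				break;
			}

			/* hypothetical helper: post page/dma to the ring */
			post_rx_descriptor(dma, page);
			filled++;
		}

		return filled;
	}
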
There is almost nothing new in the data that hasn't been discussed before.
The suggestion to free on a remote per-cpu list would be expensive as it
would require per-cpu lists to have a lock for safe remote access. However,
I'd be curious if you could test the mm-pagealloc-irqpvec-v1r4 branch at
https://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git . It's an
unfinished prototype I worked on a few weeks ago. I was going to revisit
it in about a month's time when 4.15-rc1 was out. I'd be interested in
seeing if it has a positive gain in normal page allocations without
destroying the performance of interrupt and softirq allocation contexts. The
interrupt/softirq context testing is crucial as that is something that
hurt us before when trying to improve page allocator performance.
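
To make the remote-free cost concrete: the per-cpu lists are currently only
touched by their owning CPU with IRQs disabled, so they need no lock at all.
Freeing into another CPU's list would require something like the sketch
below on every operation, which is exactly the overhead the per-cpu lists
exist to avoid (illustrative only, not code from the branch above):

	/*
	 * Illustrative only: if a page could be freed onto another CPU's
	 * per-cpu list, every fast-path operation would need a lock,
	 * turning the lockless per-CPU cache into a (smaller) contended one.
	 */
	struct remote_pcp_sketch {
		spinlock_t lock;	/* needed for cross-CPU access */
		int count;
		struct list_head list;
	};

	static void free_to_remote_pcp(struct remote_pcp_sketch *pcp,
				       struct page *page)
	{
		spin_lock(&pcp->lock);		/* paid on every free... */
		list_add(&page->lru, &pcp->list);
		pcp->count++;
		spin_unlock(&pcp->lock);

		/* ...and the owning CPU would pay it on every allocation too. */
	}
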
--
Mel Gorman
SUSE Labs