Message-ID: <CA+55aFzGFvVGD_8Y=jTkYwgmYgZnW0p0Fjf7OHFPRcL6Mz4HOw@mail.gmail.com>
Date: Mon, 2 Mar 2015 11:47:52 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Dave Chinner <david@...morbit.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Ingo Molnar <mingo@...nel.org>, Matt B <jackdachef@...il.com>
Cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-mm <linux-mm@...ck.org>, xfs@....sgi.com
Subject: Re: [regression v4.0-rc1] mm: IPIs from TLB flushes causing
significant performance degradation.
On Sun, Mar 1, 2015 at 5:04 PM, Dave Chinner <david@...morbit.com> wrote:
>
> Across the board the 4.0-rc1 numbers are much slower, and the
> degradation is far worse when using the large memory footprint
> configs. Perf points straight at the cause - this is from 4.0-rc1
> on the "-o bhash=101073" config:
>
> - 56.07% 56.07% [kernel] [k] default_send_IPI_mask_sequence_phys
> - 99.99% physflat_send_IPI_mask
> - 99.37% native_send_call_func_ipi
..
>
> And the same profile output from 3.19 shows:
>
> - 9.61% 9.61% [kernel] [k] default_send_IPI_mask_sequence_phys
> - 99.98% physflat_send_IPI_mask
> - 96.26% native_send_call_func_ipi
...
>
> So either there's been a massive increase in the number of IPIs
> being sent, or the cost per IPI has greatly increased. Either way,
> the result is a pretty significant performance degradation.
And on Mon, Mar 2, 2015 at 11:17 AM, Matt <jackdachef@...il.com> wrote:
>
> Linus already posted a fix to the problem, however I can't seem to
> find the matching commit in his tree (searching for "TLC regression"
> or "TLB cache").
That was commit f045bbb9fa1b, which was then refined by commit
721c21c17ab9, because it turned out that ARM64 had a very subtle
relationship with tlb->end and fullmm.
But both of those hit 3.19, so none of this should affect 4.0-rc1.
There's something else going on.
I assume it's the mm queue from Andrew, so adding him to the cc. There
are changes to page migration etc., which could explain it.
There are also a fair number of APIC changes in 4.0-rc1, so I guess it
really could be just that the IPI sending itself has gotten much
slower. Adding Ingo for that, although I don't think
default_send_IPI_mask_sequence_phys() itself has actually changed,
only other things around the apic. So I'd be inclined to blame the mm
changes.
Obviously bisection would find it.
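Such a bisection between the known-good and known-bad releases would look roughly like this; the reproducer.sh name is a placeholder for a script that builds and boots the kernel and grades the workload's perf profile:

```shell
# From a kernel tree: v3.19 is known good, v4.0-rc1 is known bad.
git bisect start
git bisect bad v4.0-rc1
git bisect good v3.19

# 'reproducer.sh' is hypothetical: build/boot the kernel, run the
# "-o bhash=101073" workload, exit non-zero if the IPI cost regresses.
git bisect run ./reproducer.sh

# Inspect the culprit, then restore the original checkout.
git bisect log
git bisect reset
```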
Linus