Message-ID: <3f44cc18-4dd8-e313-26b9-1502b0b40507@redhat.com>
Date: Wed, 9 Jan 2019 13:54:36 -0500
From: Waiman Long <longman@...hat.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Alexey Dobriyan <adobriyan@...il.com>,
Kees Cook <keescook@...omium.org>,
Thomas Gleixner <tglx@...utronix.de>,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
Davidlohr Bueso <dave@...olabs.net>,
Miklos Szeredi <miklos@...redi.hu>,
Daniel Colascione <dancol@...gle.com>,
Dave Chinner <david@...morbit.com>,
Randy Dunlap <rdunlap@...radead.org>
Subject: Re: [PATCH v2 0/4] /proc/stat: Reduce irqs counting performance
overhead
On 01/09/2019 01:37 PM, Waiman Long wrote:
> On 01/09/2019 01:24 PM, Matthew Wilcox wrote:
>> On Wed, Jan 09, 2019 at 01:03:33PM -0500, Waiman Long wrote:
>>> The paragraph above may be a bit misleading. This v2 patch actually
>>> touches very little of the percpu accounting aspect of the IRQ counts. See
>>> patches 2 and 3 for the relevant changes, which are just a few lines of new
>>> code. Please review the individual patches before Nak'ing.
>>>
>>> I could theoretically generalize them into a new set of percpu counting
>>> helpers, but the idea behind them is quite different from the use cases of
>>> percpu counter. So it may not be a good idea to add them there.
>> Did you even try just using the general purpose infrastructure that's
>> in place? If that shows a performance problem _then_ it's time to make
>> this special snowflake just a little more special. Not before.
> I have looked into the percpu counter code. There are two aspects that I
> would not like to introduce into the interrupt handler's code path for
> updating the counts.
>
> 1) There is a raw spinlock in the percpu_counter structure that may need
> to be acquired in the update path. This can be a performance drag
> especially if lockdep is enabled.
>
> 2) The percpu_counter structure is 40 bytes in size on 64-bit systems
> compared with just 8 bytes for the percpu count pointer and an
> additional 4 bytes that I introduced in patch 2. With thousands of irq
> descriptors, it can consume quite a lot more memory. Memory consumption
> was a point that you brought up in one of your previous mails.
If you read patch 4, you can see that quite a few CPU cycles were
spent looking up the radix tree to locate the IRQ descriptor for each of
the interrupts. That overhead will still be there even if I use percpu
counters. So using percpu counters alone won't be as performant as this
patch or my previous v1 patch.
Cheers,
Longman