Message-ID: <20181104201821.GA22049@krava>
Date: Sun, 4 Nov 2018 21:18:21 +0100
From: Jiri Olsa <jolsa@...hat.com>
To: David Miller <davem@...emloft.net>
Cc: acme@...nel.org, linux-kernel@...r.kernel.org, namhyung@...nel.org,
jolsa@...nel.org
Subject: Re: [PATCH RFC] hist lookups
On Fri, Nov 02, 2018 at 11:30:03PM -0700, David Miller wrote:
> From: David Miller <davem@...emloft.net>
> Date: Wed, 31 Oct 2018 09:08:16 -0700 (PDT)
>
> > From: Jiri Olsa <jolsa@...hat.com>
> > Date: Wed, 31 Oct 2018 16:39:07 +0100
> >
> >> it'd be great to make hist processing faster, but is your main target here
> >> to get the load out of the reader thread, so we don't lose events during the
> >> hist processing?
> >>
> >> we could queue events directly from reader thread into another thread and
> >> keep it (the reader thread) free of processing, focusing only on event
> >> reading/passing
> >
> > Indeed, we could create threads that take samples from the thread processing
> > the ring buffers, and insert them into the histogram.
>
> So I played around with some ideas like this and ran into some dead ends.
>
> I ran each mmap ring's processing in a separate thread.
>
> This doesn't help at all, the problem is that all the threads serialize
> at the pthread lock for the histogram part of the work.
>
> And the histogram part dominates the cost of processing each sample.
yep, it sucks.. I was thinking of keeping separate hist objects for
each thread and merging them at the end
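to illustrate the separate-hist-objects idea, here is a rough sketch. it is not perf code: struct local_hist, struct worker, hist_merge() and hist_build() are made-up names, and the "histogram" is just an array of counters keyed by sample value. each worker fills only its own object, so nothing is locked on the hot path, and the merge runs once per worker after it has been joined.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NR_BUCKETS 8
#define NR_WORKERS 4

/* Hypothetical stand-in for a hist object: one counter per bucket. */
struct local_hist {
	uint64_t count[NR_BUCKETS];
};

/* Per-thread state; the hist member is private to the worker. */
struct worker {
	pthread_t tid;
	const unsigned *samples;
	size_t nr_samples;
	struct local_hist hist;
};

/* Each worker updates only its own histogram, so no lock is taken here. */
static void *worker_fn(void *arg)
{
	struct worker *w = arg;

	for (size_t i = 0; i < w->nr_samples; i++)
		w->hist.count[w->samples[i] % NR_BUCKETS]++;
	return NULL;
}

/* Fold one per-thread histogram into the final one. */
static void hist_merge(struct local_hist *dst, const struct local_hist *src)
{
	for (int i = 0; i < NR_BUCKETS; i++)
		dst->count[i] += src->count[i];
}

/*
 * Split the samples across NR_WORKERS threads and merge the per-thread
 * results after each thread has been joined. Returns 0 on success, -1 if
 * a thread could not be created (started threads are still joined).
 */
static int hist_build(const unsigned *samples, size_t n, struct local_hist *out)
{
	struct worker w[NR_WORKERS];
	size_t chunk = n / NR_WORKERS;
	int started = 0, err = 0;

	memset(out, 0, sizeof(*out));
	for (int i = 0; i < NR_WORKERS; i++) {
		memset(&w[i].hist, 0, sizeof(w[i].hist));
		w[i].samples = samples + (size_t)i * chunk;
		/* The last worker also takes the remainder. */
		w[i].nr_samples = (i == NR_WORKERS - 1) ? n - (size_t)i * chunk : chunk;
		if (pthread_create(&w[i].tid, NULL, worker_fn, &w[i])) {
			err = -1;
			break;
		}
		started++;
	}
	for (int i = 0; i < started; i++) {
		pthread_join(w[i].tid, NULL);
		hist_merge(out, &w[i].hist);
	}
	return err;
}
```

the real hist entries are of course keyed and sorted, so the merge would be more work than adding counters, but the point is that the pthread lock drops out of the per-sample path.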
>
> Nevertheless I started work on formally threading all of the code that
> the mmap threads operate on, such as symbol processing etc. and while
> doing so I came to the conclusion that pushing the histogram processing
> only to a separate thread poses its own set of big challenges.
>
> To make this work we would have to make a piece of transient on-stack
> state (the processed event) into allocated persistent state.
>
> These persistent event structures get queued up to the histogram
> thread(s).
>
> Therefore, if the histogram thread(s) can't keep up (and as per my
> experiment above, it is easy to enter this state because the histogram
> code itself is going to run linearly with the histogram lock held),
> this persistent event memory will just get larger and larger.
>
> We would have to find some way to parallelize the histogram code to
> make any kind of threading worthwhile.
do you have some code I could check on?
I'm going to make that separate thread to get the processing out
of the reading thread.. I think we need that in any case, so the
ring buffer is drained as fast as possible
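as a rough sketch of that hand-off (again not perf code: struct event, struct event_queue and the queue_*() helpers are made up for illustration), the reader copies each event by value into a fixed-size queue and the processing thread pops from it. the queue is bounded on purpose, so the queued events cannot grow without limit when the consumer falls behind. in this sketch a full queue makes queue_push() fail right away instead of blocking the reader, and the caller has to account for that event as lost.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_SIZE 4

/* Hypothetical, much reduced event; copied by value into the queue. */
struct event {
	uint64_t ip;
	uint64_t period;
};

struct event_queue {
	pthread_mutex_t lock;
	pthread_cond_t not_empty;
	struct event slot[QUEUE_SIZE];
	size_t head, tail, count;
	bool done;
};

static void queue_init(struct event_queue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	pthread_cond_init(&q->not_empty, NULL);
	q->head = q->tail = q->count = 0;
	q->done = false;
}

/*
 * Reader side: never waits for the consumer. Returns false when the
 * queue is full, in which case the event was not queued.
 */
static bool queue_push(struct event_queue *q, const struct event *ev)
{
	bool ok = false;

	pthread_mutex_lock(&q->lock);
	if (q->count < QUEUE_SIZE) {
		q->slot[q->tail] = *ev;
		q->tail = (q->tail + 1) % QUEUE_SIZE;
		q->count++;
		ok = true;
		pthread_cond_signal(&q->not_empty);
	}
	pthread_mutex_unlock(&q->lock);
	return ok;
}

/* Reader side: no more events will be pushed. */
static void queue_finish(struct event_queue *q)
{
	pthread_mutex_lock(&q->lock);
	q->done = true;
	pthread_cond_broadcast(&q->not_empty);
	pthread_mutex_unlock(&q->lock);
}

/*
 * Processing side: waits for an event. Returns false only once the
 * queue is empty and queue_finish() has been called.
 */
static bool queue_pop(struct event_queue *q, struct event *ev)
{
	pthread_mutex_lock(&q->lock);
	while (q->count == 0 && !q->done)
		pthread_cond_wait(&q->not_empty, &q->lock);
	if (q->count == 0) {
		pthread_mutex_unlock(&q->lock);
		return false;
	}
	*ev = q->slot[q->head];
	q->head = (q->head + 1) % QUEUE_SIZE;
	q->count--;
	pthread_mutex_unlock(&q->lock);
	return true;
}
```

the reader still takes a mutex for a short time per event here; whether to drop, block, or grow the queue when it fills up is a separate decision.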
thanks,
jirka