Message-ID: <20081126222714.GA10981@Krystal>
Date: Wed, 26 Nov 2008 17:27:15 -0500
From: Mathieu Desnoyers <compudj@...stal.dyndns.org>
To: Andrew McDermott <andrew.mcdermott@...driver.com>
Cc: Davide Libenzi <davidel@...ilserver.org>,
Ingo Molnar <mingo@...e.hu>, ltt-dev@...ts.casi.polymtl.ca,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
William Lee Irwin III <wli@...omorphy.com>
Subject: Re: [ltt-dev] [PATCH] Poll : introduce poll_wait_exclusive() new
function
* Andrew McDermott (andrew.mcdermott@...driver.com) wrote:
>
> Mathieu Desnoyers <compudj@...stal.dyndns.org> writes:
>
> [...]
>
> >> > Mathieu Desnoyers explained that it causes the following problem for LTTng.
> >> >
> >> > In LTTng, all lttd readers are polling all the available debugfs files
> >> > for data. This is principally because the number of reader threads is
> >> > user-defined and there are typical workloads where a single CPU is
> >> > producing most of the tracing data and all other CPUs are idle,
> >> > available to consume data. It therefore makes sense not to tie those
> >> > threads to specific buffers. However, when the number of threads grows,
> >> > we face a "thundering herd" problem where many threads can be woken up
> >> > and put back to sleep, leaving only a single thread doing useful work.
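[ For context: poll_wait_exclusive() is meant to let a buffer's poll
method register each waiter as exclusive, so a data-ready wakeup wakes
a single sleeping lttd thread instead of all of them. Below is a
minimal sketch of how a buffer file's poll method could use it,
assuming the new helper keeps poll_wait()'s signature; struct ltt_buf
and data_available() are made up for illustration only: ]

struct ltt_buf {
	wait_queue_head_t read_wait;	/* readers sleep here waiting for data */
	/* ring buffer state ... */
};

static unsigned int ltt_buf_poll(struct file *filp, poll_table *wait)
{
	struct ltt_buf *buf = filp->private_data;
	unsigned int mask = 0;

	/* Register exclusively: only one sleeping reader is woken per event. */
	poll_wait_exclusive(filp, &buf->read_wait, wait);

	if (data_available(buf))
		mask |= POLLIN | POLLRDNORM;

	return mask;
}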
> >>
> >> Why do you need so many threads banging on a single device/file?
> >> Have one (or some other very small number of) puller thread(s) that
> >> activates the other processing threads with chunks of pulled data.
> >> That way there's no need for a new wakeup abstraction.
> >>
> >>
> >>
> >> - Davide
> >
> > One of the key design rules of LTTng is not to depend on such
> > system-wide data structures or entities (e.g. a single manager thread).
> > Everything is per-cpu, and it scales very well.
> >
> > I wonder how badly the approach you propose can scale on large NUMA
> > systems, where having to synchronize everything through a single thread
> > might become an important point of contention, just due to the cacheline
> > bouncing and extra scheduler activity involved.
>
> But at the end of the day these threads end up writing to a (possibly)
> single spindle. Isn't that the biggest bottleneck here?
>
Not if those threads are either:
- analysing the data on the fly without exporting it to disk,
- sending the data through more than one network card, or
- writing the data to multiple disks.

There are therefore ways to improve scalability by adding more data
output paths, and I don't want the inner design to limit that: if
someone has the resources to send the information out scalably at
great speed, they can.
Mathieu
--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68