Date:	Thu, 13 Sep 2007 00:18:36 -0700
From:	"David Schwartz" <davids@...master.com>
To:	"Linux Kernel Development" <linux-kernel@...r.kernel.org>
Subject: RE: some bad numbers with Java/database threading


> I was working on some unit tests and thought I'd give CFS a whirl to see
> if it had any impact on my workloads (to see what the fuss was about),
> and I came up with some pretty disturbing numbers:
> http://devloop.org.uk/documentation/database-performance/Linux-Kernels/Kernels-ManyThreads-CombinedTests-noload2.png
> As above but also showing the load average:
> http://devloop.org.uk/documentation/database-performance/Linux-Kernels/Kernels-ManyThreads-CombinedTests2.png
> Looks like a regression to me...

I've tried reasonably diligently to figure out what the hell you're doing and gone through quite a bit of your documentation, and I just can't figure it out. This could entirely be the result of your test's sensitivity to execution order.

For example, if you run ten threads that all insert, query, and delete from the *same* table, then the exact interleaving pattern will determine the size of the results. A slight change in the scheduling quantum could multiply the size of the result data by a huge factor. There is a big difference between:

1) Thread A inserts data.
2) Thread A queries data.
3) Thread A deletes data.
4) Thread B inserts data.
...


and

1) Thread A inserts data.
2) Thread B inserts data.
...
101) Thread A queries data.
102) Thread B queries data.
...
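
To put rough numbers on the two orderings above, here is a minimal single-threaded sketch that replays each one against an in-memory list standing in for the shared table. Everything in it is hypothetical: the class and method names are made up, and it assumes each query is an unfiltered select over the shared table, which may not match the real test.

```java
import java.util.ArrayList;
import java.util.List;

public class InterleavingSketch {
    // Hypothetical stand-in for the shared table: each row stores the id
    // of the thread that inserted it.
    static final List<Integer> table = new ArrayList<>();

    static void insert(int threadId, int rows) {
        for (int i = 0; i < rows; i++) {
            table.add(threadId);
        }
    }

    // Assumption: the query is unfiltered, so it returns every row
    // currently in the table, whoever inserted it.
    static int query() {
        return table.size();
    }

    static void delete(int threadId) {
        table.removeIf(owner -> owner == threadId);
    }

    // First ordering: each thread inserts, queries, and deletes before the
    // next thread starts. Returns the total number of rows returned.
    public static long serialOrder(int threads, int rows) {
        table.clear();
        long total = 0;
        for (int t = 0; t < threads; t++) {
            insert(t, rows);
            total += query();
            delete(t);
        }
        return total;
    }

    // Second ordering: every thread inserts, then every thread queries,
    // then every thread deletes.
    public static long interleavedOrder(int threads, int rows) {
        table.clear();
        long total = 0;
        for (int t = 0; t < threads; t++) insert(t, rows);
        for (int t = 0; t < threads; t++) total += query();
        for (int t = 0; t < threads; t++) delete(t);
        return total;
    }

    public static void main(String[] args) {
        System.out.println("serial:      " + serialOrder(10, 100));
        System.out.println("interleaved: " + interleavedOrder(10, 100));
    }
}
```

With 10 threads and 100 rows each, the first ordering returns 1000 rows in total and the second returns 10000, so under these assumptions the result data grows by a factor equal to the thread count purely because of execution order.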

Now, even if they're using separate tables, your test is still very sensitive to execution order. If thread A runs to completion and then thread B does, the database data will fit better into cache. If thread A runs partially, then thread B runs partially, when thread A runs again, its database stuff will not be hot.

>* java threads are created first and the data is prepared, then all the
>threads are started in a tight loop. Each thread runs multiple queries
>with a 10ms pause (to allow the other threads to get scheduled)

There are a number of ways you might be measuring nothing but how the scheduler chooses to interleave your threads. Benchmarking threads that yield suggests just this type of thing -- if a thread has useful work to do and another thread is not going to help it, *why* *yield*?
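
One way to check how much of the measurement comes from the pause is to run the same harness with and without it. The sketch below is only a guess at the general shape of such a test: the class name, the fakeQuery stand-in, and the start-gate approach are assumptions, not the original benchmark code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

public class PauseSketch {
    // Hypothetical stand-in for one database query: burns a little CPU.
    static long fakeQuery(int seed) {
        long acc = seed;
        for (int i = 0; i < 10_000; i++) {
            acc = acc * 31 + i;
        }
        return acc;
    }

    // Starts all threads behind a common gate, runs queriesPerThread
    // queries in each, and optionally sleeps pauseMillis after each query.
    // Returns {completed queries, elapsed nanoseconds}.
    public static long[] run(int threads, int queriesPerThread, long pauseMillis) {
        CountDownLatch start = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(threads);
        AtomicLong completed = new AtomicLong();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            new Thread(() -> {
                try {
                    start.await(); // wait until every thread has been created
                    for (int q = 0; q < queriesPerThread; q++) {
                        fakeQuery(id + q);
                        completed.incrementAndGet();
                        if (pauseMillis > 0) {
                            Thread.sleep(pauseMillis); // the pause under discussion
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            }).start();
        }
        long begin = System.nanoTime();
        start.countDown(); // release all threads at once
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return new long[] { completed.get(), System.nanoTime() - begin };
    }

    public static void main(String[] args) {
        long[] paused = run(4, 20, 10);
        long[] tight = run(4, 20, 0);
        System.out.printf("with 10ms pause: %d queries in %d us%n", paused[0], paused[1] / 1_000);
        System.out.printf("without pause:   %d queries in %d us%n", tight[0], tight[1] / 1_000);
    }
}
```

If the two runs differ mostly by the total sleep time, the pause is contributing wall-clock time and extra context switches rather than useful work, and differences between kernels may largely reflect how each scheduler handles those wakeups.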

Are you worried the scheduler isn't going to schedule other threads?! Or is there some sane reason to force suboptimal scheduling when you're trying to benchmark a scheduler? Are you trying to see how it deals with pathological patterns? ;)

The only documentation I can see about what you're actually *doing* says things like "The schema and statements are almost identical to the non-threaded tests." Do you see why that's not helpful?

DS

