Date:	Fri, 25 Sep 2009 21:02:44 -0400
From:	Loren Rogers <loren.rogers@...il.com>
To:	linux-kernel <linux-kernel@...r.kernel.org>
Subject: Kernel getting hosed?

Hello,
I am developing a multi-threaded media-based application written for
an iMX27-based processor running kernel 2.6.24.  I'm seeing a weird
"phenomenon" where certain processes/threads are not being serviced
and my clock (according to gettimeofday()) gets set back as well.
Here are the symptoms I have observed:

1. It's usually the same application threads that keep being serviced,
and the same ones that stop being serviced
2. The problem usually lasts for about 5 and a half minutes and then
appears to correct itself
3. I'll see the cpu load for my application-process quickly jump up to
99% right before the phenomenon (according to top)
4. My IP-telnet and serial terminal sessions are both unusable.
5. I have a logging utility with a timestamp feature (gettimeofday()).
Once the problem corrects itself, the clock has been set back to the
exact time the problem started.  For example, say the problem starts
at 12:00:00.  While it lasts I'll be logging msgs stamped 12:01:00,
12:04:22, etc.  Then after the problem "stops", the timestamp on my
logger is once again 12:00:00, and when I run the "date" command the
clock also says 12:00:00!
6. I think all of my IP-based network threads are being serviced.
7. A colleague wrote a utility that runs on one of the "alive" threads
and starts collecting /proc data once we know we are in this state; he
told me that the /proc counters have pretty much halted.


My colleagues and I have been chasing this for three weeks now.  I
have no clue how to determine the culprit(s).  At first I thought
it was some bad code in the user-based application, but can someone
tell me with 100% certainty that this is either a user-space problem
or a kernel problem?  If it is a kernel problem, how can a user-space
application hose a kernel to this extent?
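
One possible starting point for the user-space versus kernel question
is to see how the CPU time splits up when the load jumps to 99%.  The
sketch below parses the aggregate "cpu" line of /proc/stat into user,
system, idle and interrupt ticks.  It assumes the "proc data" in
symptom 7 refers to /proc; the names cpu_times, parse_cpu_line and
read_cpu_times are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Tick counters from the aggregate "cpu" line of /proc/stat
 * (units of USER_HZ). */
struct cpu_times {
    unsigned long long user, nice, system, idle, iowait, irq, softirq;
};

/* Parse one line of /proc/stat.  Returns 0 if it is the aggregate
 * "cpu" line with at least the first four fields, -1 otherwise. */
int parse_cpu_line(const char *line, struct cpu_times *t)
{
    memset(t, 0, sizeof *t);
    if (strncmp(line, "cpu ", 4) != 0)
        return -1;  /* skips per-CPU lines such as "cpu0" and other keys */
    int n = sscanf(line, "cpu %llu %llu %llu %llu %llu %llu %llu",
                   &t->user, &t->nice, &t->system, &t->idle,
                   &t->iowait, &t->irq, &t->softirq);
    return n >= 4 ? 0 : -1;
}

/* Read the first line of /proc/stat, which is the aggregate line. */
int read_cpu_times(struct cpu_times *t)
{
    char line[256];
    FILE *f = fopen("/proc/stat", "r");

    if (f == NULL)
        return -1;
    if (fgets(line, sizeof line, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return parse_cpu_line(line, t);
}

/* Print how many ticks were spent in each state between two samples. */
void print_cpu_delta(const struct cpu_times *a, const struct cpu_times *b)
{
    printf("user %llu system %llu idle %llu irq %llu softirq %llu\n",
           b->user - a->user, b->system - a->system, b->idle - a->idle,
           b->irq - a->irq, b->softirq - a->softirq);
}
```

Taking one sample with read_cpu_times() before the problem and one
during it, then passing both to print_cpu_delta(), would show whether
the busy time is mostly user ticks or mostly system/irq ticks, or
whether the counters have stopped moving altogether.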

If anybody can point me to tools that would help diagnose the cause of
the problem, or even tell me where to start looking, I would REALLY
appreciate it.  Thank you.
/Loren