Message-ID: <20090821000108.GC6078@nowhere>
Date:	Fri, 21 Aug 2009 02:01:12 +0200
From:	Frederic Weisbecker <fweisbec@...il.com>
To:	Masami Hiramatsu <mhiramat@...hat.com>
Cc:	Ingo Molnar <mingo@...e.hu>, Steven Rostedt <rostedt@...dmis.org>,
	lkml <linux-kernel@...r.kernel.org>,
	Ananth N Mavinakayanahalli <ananth@...ibm.com>,
	Avi Kivity <avi@...hat.com>, Andi Kleen <ak@...ux.intel.com>,
	Christoph Hellwig <hch@...radead.org>,
	"Frank Ch. Eigler" <fche@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>, Jason Baron <jbaron@...hat.com>,
	Jim Keniston <jkenisto@...ibm.com>,
	"K.Prasad" <prasad@...ux.vnet.ibm.com>,
	Lai Jiangshan <laijs@...fujitsu.com>,
	Li Zefan <lizf@...fujitsu.com>,
	Przemysław Pawełczyk <przemyslaw@...elczyk.it>,
	Roland McGrath <roland@...hat.com>,
	Sam Ravnborg <sam@...nborg.org>,
	Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
	Tom Zanussi <tzanussi@...il.com>,
	Vegard Nossum <vegard.nossum@...il.com>,
	systemtap <systemtap@...rces.redhat.com>,
	kvm <kvm@...r.kernel.org>,
	DLE <dle-develop@...ts.sourceforge.net>
Subject: Re: [TOOL] kprobestest : Kprobe stress test tool

On Thu, Aug 20, 2009 at 03:45:18PM -0400, Masami Hiramatsu wrote:
> Frederic Weisbecker wrote:
>> On Thu, Aug 13, 2009 at 04:57:20PM -0400, Masami Hiramatsu wrote:
>>> This script tests kprobes to probe on all symbols in the kernel and finds
>>> symbols which must be blacklisted.
>>>
>>>
>>> Usage
>>> -----
>>>    kprobestest [-s SYMLIST] [-b BLACKLIST] [-w WHITELIST]
>>>       Run stress test. If SYMLIST file is specified, use it as
>>>       an initial symbol list (This is useful for verifying white list
>>>       after diagnosing all symbols).
>>>
>>>    kprobestest cleanup
>>>       Cleanup all lists
>>>
>>>
>>> How to Work
>>> -----------
>>> This tool lists all symbols in the kernel via /proc/kallsyms, and sorts
>>> them into groups (each of them including 64 symbols by default). Then
>>> it tests each group by using kprobe-tracer. If a kernel crash occurs,
>>> that group is moved into the 'failed' dir. If the group passes the test,
>>> this script moves it into the 'passed' dir and saves kprobe_profile into
>>> 'passed/profiles/'.
>>> After testing all groups, all 'failed' groups are merged and split into
>>> smaller groups (divided by 4, by default), and those are tested again.
>>> This loop is repeated until every group has just 1 symbol.
>>>
>>> Finally, the script sorts all 'passed' symbols into 'tested', 'untested',
>>> and 'missed' based on profiles.
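
The narrowing loop described above can be sketched roughly as follows. This is a minimal Python model under stated assumptions, not the actual kprobestest script: `narrow_down`, `split` and the `crashes` callback are hypothetical names, and `crashes(group)` merely stands in for "probe every symbol of the group and see whether the kernel dies".

```python
def split(symbols, size):
    """Cut a symbol list into consecutive groups of at most `size` entries."""
    return [symbols[i:i + size] for i in range(0, len(symbols), size)]


def narrow_down(symbols, crashes, group_size=64, divisor=4):
    """Return (passed, failed) symbol lists.

    `crashes(group)` is a hypothetical callback that reports whether
    probing the whole group at once brings the system down.
    """
    passed, pending = [], list(symbols)
    while pending:
        failed = []
        for group in split(pending, group_size):
            # A crashing group is kept for the next, finer-grained round.
            (failed if crashes(group) else passed).extend(group)
        if group_size == 1:
            # Groups cannot shrink any further: these are the suspects.
            return passed, failed
        pending = failed
        group_size = max(1, group_size // divisor)
    return passed, []
```

With the defaults the group size goes 64, 16, 4, 1. If a crash only shows up for a combination of probe points, a group can fail in one round while each of its symbols passes alone, so the final 'failed' list may come out empty.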
>>>
>>>
>>> Note
>>> ----
>>>   - This script just gives us some clues to the blacklisted functions.
>>>     In some cases, a combination of probe points will cause a problem, but
>>>     each of them doesn't cause the problem alone.
>>>
>>> Thank you,
>>>
>>
>>
>> This script easily makes my x86-64 dual core lock up hard
>> on the 1st batch of symbols to test.
>> I have one sym list in the failed and unset directories:
>>
>> int_very_careful
>> int_signal
>> int_restore_rest
>> stub_clone
>> stub_fork
>> stub_vfork
>> stub_sigaltstack
>> stub_iopl
>> ptregscall_common
>> stub_execve
>> stub_rt_sigreturn
>> irq_entries_start
>> common_interrupt
>> ret_from_intr
>> exit_intr
>> retint_with_reschedule
>> retint_check
>> retint_swapgs
>> retint_restore_args
>> restore_args
>> irq_return
>> retint_careful
>> retint_signal
>> retint_kernel
>> irq_move_cleanup_interrupt
>> reboot_interrupt
>> apic_timer_interrupt
>> generic_interrupt
>> invalidate_interrupt0
>> invalidate_interrupt1
>> invalidate_interrupt2
>> invalidate_interrupt3
>> invalidate_interrupt4
>> invalidate_interrupt5
>> invalidate_interrupt6
>> invalidate_interrupt7
>> threshold_interrupt
>> thermal_interrupt
>> mce_self_interrupt
>> call_function_single_interrupt
>> call_function_interrupt
>> reschedule_interrupt
>> error_interrupt
>> spurious_interrupt
>> perf_pending_interrupt
>> divide_error
>> overflow
>> bounds
>> invalid_op
>> device_not_available
>> double_fault
>> coprocessor_segment_overrun
>> invalid_TSS
>> segment_not_present
>> spurious_interrupt_bug
>> coprocessor_error
>> alignment_check
>> simd_coprocessor_error
>> native_load_gs_index
>> gs_change
>> kernel_thread
>> child_rip
>> kernel_execve
>> call_softirq
>>
>>
>> I don't have a crash log because I was running with X.
>> But it also happened with other batches of symbols.
>
> Thank you for reporting, here, I also have a result
> tested on KVM@...-64.
>
> native_read_tscp
> native_read_msr_safe
> native_read_msr_amd_safe
> native_write_msr_safe
> vmalloc_fault
> spurious_fault
> search_exception_tables
> notify_die
> trace_hardirqs_off_caller
> ident_complete
> lock_acquire
> lock_release
> bad_address
> secondary_startup_64
> stack_start
> bad_address
> restore_args
> irq_return
> restore
> trace_hardirqs_off_thunk
> init_level4_pgt
> level3_ident_pgt
> level3_kernel_pgt
> level2_fixmap_pgt
> _text
> startup_64
> level1_fixmap_pgt
> level2_ident_pgt
> level2_kernel_pgt
> level2_spare_pgt
> native_get_debugreg
> native_set_debugreg
> native_set_iopl_mask
> native_load_sp0
> debug_show_all_locks
> debug_check_no_locks_held
> valid_state
> mark_lock
> mark_held_locks
> lockdep_trace_alloc
> trace_softirqs_on
> trace_hardirqs_on_caller
> __down_write
> __down_read
> trace_hardirqs_on_thunk
> lockdep_sys_exit_thunk
>
> Most of them can be fixed just by adding __kprobes.
> For those which are already in another section, kprobes
> should check whether the symbols are in that section.



You mean the blacklist?

I also fear that putting badly behaving probed functions into the kprobes
section or into the blacklist may hide some internal kprobes bugs.

Doing so is indeed mandatory for functions that trigger tracing
recursion or things like that, but what if kprobes has an internal
bug that only triggers while probing a certain class of functions?

I.e., it would be nice to identify the reason for the crash for
each culprit in these lists.

That may even help to find the others in advance.

Also, kprobes seems to be a very fragile feature (that's what
this selftest unearths, at least for me).
It really needs recursion detection that stops all kprobing
once a given recursion threshold is reached. Something
that would dump the stack and the failing kprobe structure.

That would avoid such hard lockups and also help to identify
the dangerous symbols to probe.
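
To make the idea concrete, here is a rough userspace model of such a recursion guard. It is only a sketch in Python, not kernel code and not an existing kprobes interface: the class name `ProbeRecursionGuard`, the `max_depth` parameter and the `report` field are all made up for illustration, and a real implementation would dump the stack and the kprobe structure instead of storing a tuple.

```python
class ProbeRecursionGuard:
    """Userspace model of a depth limit for probe handlers (hypothetical)."""

    def __init__(self, max_depth=4):
        self.max_depth = max_depth
        self.depth = 0        # current nesting level of probe hits
        self.enabled = True   # global "probing allowed" switch
        self.report = None    # (symbol, depth) of the hit that tripped the limit

    def hit(self, symbol, handler):
        """Run `handler` for a probe hit on `symbol`, unless probing is off."""
        if not self.enabled:
            return
        self.depth += 1
        try:
            if self.depth > self.max_depth:
                # Threshold exceeded: stop all probing and remember the culprit.
                self.enabled = False
                self.report = (symbol, self.depth)
                return
            handler()
        finally:
            self.depth -= 1
```

A handler that ends up hitting its own probe again would then be cut off after `max_depth` nested hits instead of recursing until the box locks up, and `report` names the symbol that was being probed at that point.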



>> The problem is that I don't have any serial line in this
>> box, so I can't catch any crash log.
>> My K7 testbox also died in my arms this afternoon.
>>
>> But I still have two other testboxes (one P2 and one P3),
>> hopefully I could reproduce the problem in these boxes
>> in which I can connect a serial line.
>
> Thank you for helping me to find it!
>
>> I've pushed your patches in the following git tree:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/fgrederic/random-tracing.git \
>> 	tracing/kprobes
>>
>> So you can send patches on top of this one.
>
> Great! I've found some other trivial bugs, so I'll fix those on it.

Cool :)

Btw, here is the result of your stress test on a PIII (attaching the log
and the config).

Thanks.

View attachment "ttyS0.log" of type "text/plain" (97086 bytes)

View attachment ".config" of type "text/plain" (43486 bytes)
