Date: Tue, 22 Jan 2008 09:59:19 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To: LKML <linux-kernel@...r.kernel.org>
Cc: Ingo Molnar <mingo@...e.hu>, Linus Torvalds <torvalds@...ux-foundation.org>,
    Andrew Morton <akpm@...ux-foundation.org>, Christoph Hellwig <hch@...radead.org>,
    "Frank Ch. Eigler" <fche@...hat.com>, Jan Kiszka <jan.kiszka@...mens.com>,
    John Stultz <johnstul@...ibm.com>, Steven Rostedt <srostedt@...hat.com>,
    ltt-dev@...fik.org, Peter Zijlstra <a.p.zijlstra@...llo.nl>,
    Gregory Haskins <ghaskins@...ell.com>, Arnaldo Carvalho de Melo <acme@...stprotocols.net>,
    Thomas Gleixner <tglx@...utronix.de>, Sam Ravnborg <sam@...nborg.org>,
    Robert Wisniewski <bobww@...ibm.com>
Subject: [RFC] LTTng Userspace Tracing, vDSO

Hi,

I am poking at the userspace tracing problem again. I had two different
implementations in LTTng, but I left them out so I could come back with
a more satisfying solution. Here are the ideas I would like to use in my
forthcoming implementation. I submit them for comments, especially about
limitations of the vDSO I might not be aware of (maximum code size?
maximum data size?).

The idea is to fast-path tracing so that no system call is required. I
still have to determine whether the tracing code belongs in a normal
shared object or in a vDSO.

Basic ideas:

* Buffers

Per-CPU buffers mapped by both userspace and the kernel, with menuconfig
options for the buffers:

- per process (safe: a process cannot overwrite other processes' data)
  - 4k default size, menuconfig configurable
  - Total of 12.5MB for 200 traced processes * 4 CPUs.. not bad.
- system-wide (unsafe, low memory consumption)
  - 1MB default size, menuconfig configurable

Depending on the system type, one could wish for efficiency (single-user
server) or protection (multi-user system).

Since threads can be stopped by the OS when the buffers are full, we can
afford smaller buffers than the kernel buffers (4k vs 1MB).

Atomic space reservation to protect against traced signal handlers.
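The reservation fast path could look roughly like the following C11
sketch (the struct layout and names are illustrative only, not LTTng's
actual data structures). A plain load/add/store sequence would be unsafe
here: a signal handler running the tracing code between the load and the
store would reserve the same slot, so the update has to be a CAS retry
loop on the write offset.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical header of a per-CPU tracing buffer shared with the
 * kernel; names are illustrative. */
struct ust_buf {
	_Atomic size_t write_off;	/* current reservation offset */
	size_t size;			/* e.g. 4k per-process buffer */
};

/* Atomically reserve `len` bytes.  On success, returns the offset of
 * the reserved slot; returns -1 when the buffer is full, in which case
 * the caller must fall back to a syscall (or block until the daemon
 * drains the buffer). */
static ptrdiff_t buf_reserve(struct ust_buf *b, size_t len)
{
	size_t old = atomic_load(&b->write_off);

	for (;;) {
		if (old + len > b->size)
			return -1;	/* buffer full */
		/* On failure, `old` is reloaded with the current value
		 * and the loop retries. */
		if (atomic_compare_exchange_weak(&b->write_off,
						 &old, old + len))
			return (ptrdiff_t)old;
	}
}
```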
Note that a full cmpxchg (synchronized with respect to other CPUs) is
required, because vgetcpu only returns a statistically correct CPU ID
(the thread might have been migrated since the call).

* Synchronization

- Use a seqlock to synchronize the vsyscall with the kernel if required.

* Time source

- vsyscall to get the raw cycle count
- fallback on a kernel syscall

One could ask: why not stay in the kernel when you need a syscall to get
the timestamps anyway? Well, because passing a format string and varargs
from userspace to the kernel would not be "polite". ;) If userspace
passes a wrong format string to the tracing code, I would rather have
the segfault happen in userspace and not involve the kernel at all.

* Marker registration

System call to register markers in the library init. The markers live in
a special section, which requires modifying the linker script. Keep a
per-process data structure (in the kernel) to track the registered
markers. When the last process using a marker set exits, remove those
markers (refcount).

* Buffer switch

The buffer switch is performed by a system call. Use inotify to tell
lttd (the trace reading daemon) that new debugfs files have been created
upon the first buffer switch of a userspace process.

* Filtering

Export filter data and code to userspace. (eventually)

* Multiple traces

LTTng supports writing to multiple traces at the same time. Each of them
can have a different filtering expression or be in a different mode
(extract the data to disk while the trace is running, or flight recorder
mode where the data is only kept in circular buffers and is always
overwritten). This can be useful on multi-user systems too. Therefore,
we would have to reserve enough virtual address space in the vDSO region
for roughly 16 per-CPU buffers (?). Then, upon buffer allocation for a
new trace, we would iterate over each process in the system; if it is
being traced, we would allocate extra buffers within its tracing vDSO
data region.

Any thoughts?

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D.
Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68