lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Sun, 28 Feb 2016 01:10:19 +0900
From:	Taeung Song <treeze.taeung@...il.com>
To:	Steven Rostedt <rostedt@...dmis.org>
Cc:	Arnaldo Carvalho de Melo <acme@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...nel.org>, linux-kernel@...r.kernel.org,
	Jiri Olsa <jolsa@...nel.org>,
	Namhyung Kim <namhyung@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Lai Jiangshan <jiangshanlai@...il.com>
Subject: Re: [RFC] Re: [PATCH v3 1/2] tracing/syscalls: Rename variable 'nr'
 to '__syscall_nr'

Hi, Steven

On 02/27/2016 03:23 AM, Steven Rostedt wrote:
> On Fri, 26 Feb 2016 10:57:13 -0300
> Arnaldo Carvalho de Melo <acme@...nel.org> wrote:
>
>> Em Fri, Feb 26, 2016 at 10:14:06PM +0900, Taeung Song escreveu:
>>> There is a problem about duplicated variable name i.e.
>>>
>>>      # cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_io_getevents/format
>>>      name: sys_enter_io_getevents
>>>      ID: 739
>>>      format:
>>
>> Steven, what do you think?
>>
>> Should we break this ABI while disambiguating the 'nr' field, using
>> '__syscall_nr' in an attempt to use a name that is unlikely to be used
>> by a real syscall argument name?
>>
>> If we stand by published ABIs, we should keep it written in stone and
>> state that the first 'nr' means '__syscall_nr' while keeping it as-is,
>> the change for 'perf trace' in that case is to do nothing, it work
>> as-is, we have just to fix the python binding to do that rename.
>
> ABIs only matter if they break something, and people complain. Linus
> has been somewhat accepting of us fixing those tools that break and we
> push out the fixes. If an ABI breaks in the forest and nobody is around
> to complain about it, did it really break?
>
> I would say, lets make the change and fix perf. If people complain, we
> send them the fixes for their tools. If they need the distros to have
> the fixes, then let the change be reverted, and we wait till the
> distros have the update (this may take a few years), then re-submit.
>
> This worked for me to get rid of padding that was in every trace event.
> The change was reverted, I fixed the tools that broke, waited till all
> the major distros had the updates. And resubmitted the change. Linus
> took it.
>
>
>>
>> Perhaps we can live with that, to avoid having three different cases:
>> !nr, nr and __syscall_nr.
>
> We could, do this as well. Want me to add something to event-parse?
>
>>
>> Ingo, Peter, have you guys followed this case?
>>
>> Summary: Some tracepoint have multiple fields with the same name, 'nr',
>>           the first one is a unique syscall ID, the other is a syscall
>>           argument:
>>
>> [root@...et ~]# cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_io_getevents/format
>> name: sys_enter_io_getevents
>> ID: 747
>> format:
>> 	field:unsigned short common_type;	offset:0;	size:2;	signed:0;
>> 	field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
>> 	field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
>> 	field:int common_pid;	offset:4;	size:4;	signed:1;
>>
>> 	field:int nr;	offset:8;	size:4;	signed:1;
>> 	field:aio_context_t ctx_id;	offset:16;	size:8;	signed:0;
>> 	field:long min_nr;	offset:24;	size:8;	signed:0;
>> 	field:long nr;	offset:32;	size:8;	signed:0;
>> 	field:struct io_event * events;	offset:40;	size:8;	signed:0;
>> 	field:struct timespec * timeout;	offset:48;	size:8;	signed:0;
>>
>> print fmt: "ctx_id: 0x%08lx, min_nr: 0x%08lx, nr: 0x%08lx, events: 0x%08lx, timeout: 0x%08lx", ((unsigned long)(REC->ctx_id)), ((unsigned long)(REC->min_nr)), ((unsigned long)(REC->nr)), ((unsigned long)(REC->events)), ((unsigned long)(REC->timeout))
>> [root@...et ~]#
>>
>
> BTW, here's a less intrusive change, because honestly, I hate the
> kernel structure having underscores in the name.
>
> This could be signed off by Taeung Song and myself.
>
> -- Steve
>
> diff --git a/kernel/trace/trace_syscalls.c b/kernel/trace/trace_syscalls.c
> index 0655afbea83f..d1663083d903 100644
> --- a/kernel/trace/trace_syscalls.c
> +++ b/kernel/trace/trace_syscalls.c
> @@ -186,11 +186,11 @@ print_syscall_exit(struct trace_iterator *iter, int flags,
>
>   extern char *__bad_type_size(void);
>
> -#define SYSCALL_FIELD(type, name)					\
> -	sizeof(type) != sizeof(trace.name) ?				\
> +#define SYSCALL_FIELD(type, field, name)				\
> +	sizeof(type) != sizeof(trace.field) ?				\
>   		__bad_type_size() :					\
> -		#type, #name, offsetof(typeof(trace), name),		\
> -		sizeof(trace.name), is_signed_type(type)
> +		#type, #name, offsetof(typeof(trace), field),		\
> +		sizeof(trace.field), is_signed_type(type)
>
>   static int __init
>   __set_enter_print_fmt(struct syscall_metadata *entry, char *buf, int len)
> @@ -261,7 +261,8 @@ static int __init syscall_enter_define_fields(struct trace_event_call *call)
>   	int i;
>   	int offset = offsetof(typeof(trace), args);
>
> -	ret = trace_define_field(call, SYSCALL_FIELD(int, nr), FILTER_OTHER);
> +	ret = trace_define_field(call, SYSCALL_FIELD(int, nr, __syscall_nr),
> +				 FILTER_OTHER);
>   	if (ret)
>   		return ret;
>
> @@ -281,11 +282,12 @@ static int __init syscall_exit_define_fields(struct trace_event_call *call)
>   	struct syscall_trace_exit trace;
>   	int ret;
>
> -	ret = trace_define_field(call, SYSCALL_FIELD(int, nr), FILTER_OTHER);
> +	ret = trace_define_field(call, SYSCALL_FIELD(int, nr, __syscall_nr),
> +				 FILTER_OTHER);
>   	if (ret)
>   		return ret;
>
> -	ret = trace_define_field(call, SYSCALL_FIELD(long, ret),
> +	ret = trace_define_field(call, SYSCALL_FIELD(long, ret, ret),
>   				 FILTER_OTHER);
>
>   	return ret;
>

Would you mean to avoid struct syscall_trace_enter or _exit
has '__syscall_nr' variable ?

So not including a portion of this patch([PATCH v3 1/2] 
tracing/syscalls: ...)

diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 8414fa4..98b3c66 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -88,13 +88,13 @@ enum trace_type {
   */
  struct syscall_trace_enter {
          struct trace_entry      ent;
-        int                     nr;
+        int                     __syscall_nr;
          unsigned long           args[];
  };

  struct syscall_trace_exit {
          struct trace_entry      ent;
-        int                     nr;
+        int                     __syscall_nr;
          long                    ret;
  };


I got it :-)
In conclusion, output of format 
(/sys/kernel/debug/tracing/events/syscalls/*/format)
has __syscall_nr for syscall number but the kernel structure
'syscall_trace_enter' and 'syscall_trace_exit' have not __syscall_nr 
variable.
Is it right ?

I'll resend modified patch after testing new patch you suggest soon.

Thanks,
Taeung

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ