Message-ID: <7c86c4470907290537q42195dc6s61d0f6d4a3a70154@mail.gmail.com>
Date:	Wed, 29 Jul 2009 14:37:10 +0200
From:	stephane eranian <eranian@...glemail.com>
To:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc:	Ingo Molnar <mingo@...e.hu>, LKML <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Robert Richter <robert.richter@....com>,
	Paul Mackerras <paulus@...ba.org>,
	Andi Kleen <andi@...stfloor.org>,
	Maynard Johnson <mpjohn@...ibm.com>,
	Carl Love <cel@...ibm.com>,
	Corey J Ashford <cjashfor@...ibm.com>,
	Philip Mucci <mucci@...s.utk.edu>,
	Dan Terpstra <terpstra@...s.utk.edu>,
	perfmon2-devel <perfmon2-devel@...ts.sourceforge.net>,
	Michael Kerrisk <mtk.manpages@...glemail.com>,
	oleg <oleg@...hat.com>
Subject: Re: perf_counters issue with self-sampling threads

Peter,

On Wed, Jul 29, 2009 at 2:19 PM, Peter Zijlstra<a.p.zijlstra@...llo.nl> wrote:
> On Mon, 2009-07-27 at 18:51 +0200, stephane eranian wrote:
>> I believe there is a problem with the current perf_counters (PCL)
>> code for self-sampling threads. The problem is related to sample
>> notifications via signal.
>>
>> PCL (just like perfmon) is using SIGIO, an asynchronous signal,
>> to notify user applications of the availability of data in the event
>> buffer.
>>
>> POSIX does not mandate that asynchronous signals be delivered
>> to the thread in which they originated. Any thread in the process
>> may process the signal, assuming it does not have the signal
>> blocked.
>
> This signal stuff makes my head spin a little, however:
>
> fcntl(2) for F_SETOWN says:
>
> If a non-zero value is given to F_SETSIG in a multi-threaded process
> running with a threading library that supports thread groups (e.g.,
> NPTL), then a positive value given to F_SETOWN has a different
> meaning: instead of being a process ID identifying a whole process,
> it is a thread ID identifying a specific thread within a process.
> Consequently, it may be necessary to pass F_SETOWN the result of
> gettid(2) instead of getpid(2) to get sensible results when F_SETSIG
> is used. (In current Linux threading implementations, a main
> thread’s thread ID is the same as its process ID. This means that a
> single-threaded program can equally use gettid(2) or getpid(2) in
> this scenario.) Note, however, that the statements in this paragraph
> do not apply to the SIGURG signal generated for out-of-band data on
> a socket: this signal is always sent to either a process or a
> process group, depending on the value given to F_SETOWN. Note also
> that Linux imposes a limit on the number of real-time signals that
> may be queued to a process (see getrlimit(2) and signal(7)) and if
> this limit is reached, then the kernel reverts to delivering SIGIO,
> and this signal is delivered to the entire process rather than to a
> specific thread.
>
>
> Which seems to imply that when we feed fcntl(F_SETOWN) a TID instead of
> a PID it should deliver SIGIO to the thread instead of the whole process
> -- which, to me, seems a sane semantic.
>
Yes, I remember that manpage. I got the same impression and in fact that is
what I document in some of my test programs. So you read this right.
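
For illustration, here is a minimal sketch of the usage the manpage text
describes, assuming Linux and glibc. The helper names my_gettid() and
request_thread_directed_signal() are hypothetical and are not taken from
the test programs mentioned above. The sketch only shows how the request
is made; as discussed further down in this message, making the request
does not guarantee that the signal lands on that thread.

```c
#define _GNU_SOURCE             /* F_SETSIG is a Linux/glibc extension */
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: kernel thread ID of the caller, via the raw syscall. */
static pid_t my_gettid(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/*
 * Hypothetical helper: request signal `sig` on I/O readiness of `fd`,
 * naming the calling thread (not the process) as owner, as the manpage
 * suggests. Returns 0 on success, -1 on error with errno set by fcntl().
 */
static int request_thread_directed_signal(int fd, int sig)
{
    assert(sig > 0);

    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    if (fcntl(fd, F_SETFL, flags | O_ASYNC) == -1)   /* enable async notification */
        return -1;
    if (fcntl(fd, F_SETSIG, sig) == -1)              /* non-zero F_SETSIG value */
        return -1;
    if (fcntl(fd, F_SETOWN, my_gettid()) == -1)      /* TID rather than PID */
        return -1;
    return 0;
}
```

A self-sampling thread would call request_thread_directed_signal() on its
own counter file descriptor from within that thread, so that the TID
recorded by F_SETOWN is its own.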

> However,
>
>  kill_fasync(SIGIO)
>    __kill_fasync()
>      send_sigio()
>        /* if pid_type is a PIDTYPE_PID and pid a TID this should
>           only iterate the one thread, I think */
>        do_each_pid_task() {
>          send_sigio_to_task();
>        } while_each_pid_task();
>
> where:
>
>  send_sigio_to_task()
>    group_send_sig_info()
>      __group_send_sig_info()
>        send_signal(.group = 1) /* uh-ow trouble */
>          __send_signal()
>            if (group)
>               pending = &t->signal->shared_pending
>
> which will result in the signal being sent to the whole process anyway.
>
Exactly! That is the code path, and this is why it does not work as
expected. Nowhere along that path is there any special casing for
F_SETOWN being given a tid rather than a pid. kill_fasync() implies group.
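
To make the effect of the group flag concrete, here is a toy userspace
model of the queue selection at the bottom of the call chain quoted above.
This is not kernel code: the toy_* types and functions are hypothetical
stand-ins, and pending signals are reduced to simple counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the per-process state shared by all threads. */
struct toy_process {
    int shared_pending;   /* any thread with the signal unblocked may take these */
};

/* Toy stand-in for one thread of that process. */
struct toy_thread {
    struct toy_process *proc;
    int private_pending;  /* only this thread can take these */
};

/* Mirrors the quoted "if (group) pending = &t->signal->shared_pending". */
static int *toy_select_pending(struct toy_thread *t, bool group)
{
    assert(t != NULL && t->proc != NULL);
    return group ? &t->proc->shared_pending : &t->private_pending;
}

/* Queue one signal for thread t, process-wide if group is true. */
static void toy_send_signal(struct toy_thread *t, bool group)
{
    (*toy_select_pending(t, group))++;
}
```

With group set, which is what the SIGIO path always passes, the target
thread's private queue is never touched, no matter which thread was
looked up from the owner id.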


>
> Now I was considering teaching send_sigio_to_task() to use
> specific_send_sig_info() when fown->pid != fown->group_leader->pid or
> something, but I'm not sure that won't break anything.
>
Yes, that's the problem with touching this. I don't know if this will break
things. That's why I suggested creating a parallel code path which
does what we want without modifying the existing path, unless you know
some signal expert at Red Hat or elsewhere who could confirm it is safe.
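
As a rough sketch of the check floated in the quoted paragraph, the
decision could be modelled by the toy predicate below. The name
toy_wants_thread_delivery() and its parameters are hypothetical, and this
is a userspace illustration only, not a proposed patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>

/*
 * Hypothetical predicate: treat the F_SETOWN value as naming a single
 * thread only when it differs from the thread group leader's id.
 */
static bool toy_wants_thread_delivery(pid_t owner_id, pid_t leader_pid)
{
    assert(leader_pid > 0);
    return owner_id != leader_pid;
}
```

One limitation follows directly from the manpage text: a main thread's
TID equals its PID, so under this test a main thread could not be told
apart from a process-wide request.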

> Alternatively, I've missed a detail and I either read the manpage wrong,
> or the code, or both of them.
>
The code does not correspond to the manpage. It is not clear which one
is correct, though. This F_SETOWN trick looks very Linux-specific.