Message-Id: <FDBACF11-D9F6-4DE5-A0D4-800903A243B7@gmail.com>
Date:	Tue, 27 May 2014 22:09:54 -0700
From:	Tony Luck <tony.luck@...il.com>
To:	Naoya Horiguchi <n-horiguchi@...jp.nec.com>
Cc:	"iskra@....anl.gov" <iskra@....anl.gov>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	Andi Kleen <andi@...stfloor.org>, Borislav Petkov <bp@...e.de>,
	"gong.chen@...ux.jf.intel.com" <gong.chen@...ux.jf.intel.com>
Subject: Re: [PATCH 1/2] memory-failure: Send right signal code to correct thread

I'm exploring options to see what writers of threaded applications might want or need. I doubt that they would really want "broadcast to all threads". What if there are hundreds or thousands of threads? We send the signals from the context of the thread that hit the error, and that might take a while. Meanwhile, any of those threads that were already scheduled on other CPUs are back running again, so there are big races even if we broadcast.

Sent from my iPhone

> On May 27, 2014, at 17:15, Naoya Horiguchi <n-horiguchi@...jp.nec.com> wrote:
> 
> On Tue, May 27, 2014 at 03:53:55PM -0700, Tony Luck wrote:
>>> - make sure that every thread in a recovery-aware application has
>>>   a SIGBUS handler, inside which
>>>   * code for SIGBUS(BUS_MCEERR_AR) is enabled for every thread
>>>   * code for SIGBUS(BUS_MCEERR_AO) is enabled only for a dedicated thread
>> 
>> But how does the kernel know which is the special thread that
>> should see the "AO" signal?  Broadcasting the signal to all
>> threads seems to be just as likely to cause problems to
>> an application as the h/w broadcasting MCE to all processors.
> 
> I thought that the kernel doesn't have to know which thread is the
> special one if the AO signal is broadcast to all threads, because
> in that case the special thread always gets the AO signal.
> 
> The reported problem happens only when the application sets the PF_MCE_EARLY
> flag, and such an application is surely recovery aware, so we can assume that
> its coders must implement a SIGBUS handler for all threads. Then all threads
> other than the special one can intentionally ignore the AO signal. This avoids
> the default behavior for SIGBUS ("kill all threads", as Kamil said in the
> previous email).
> 
> And I hope that the downside of signal broadcasting is smaller than that of
> MCE broadcasting, because the range of broadcasting is limited to a process
> group, not to the whole system.
> 
> # I don't intend to rule out other possibilities like adding another prctl
> # flag, so if you have a patch, that would be great.
> 
> Thanks,
> Naoya Horiguchi
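
For readers following the thread, the handler scheme in the quoted proposal might look roughly like the C sketch below. It assumes the broadcast behavior under discussion (every thread may see the AO signal), which is a proposal here and not settled kernel behavior. The names `classify_sigbus`, `is_ao_thread`, `install_mce_handling` and the `mce_action` enum are hypothetical and exist only for this illustration; `prctl(PR_MCE_KILL, PR_MCE_KILL_SET, PR_MCE_KILL_EARLY, 0, 0)` is the documented userspace call that requests early kill (the PF_MCE_EARLY flag) for the calling thread.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>
#include <sys/prctl.h>

/* Fallbacks in case the libc headers do not expose these si_code values. */
#ifndef BUS_MCEERR_AR
#define BUS_MCEERR_AR 4
#endif
#ifndef BUS_MCEERR_AO
#define BUS_MCEERR_AO 5
#endif

/* Hypothetical action names, used only in this sketch. */
enum mce_action { MCE_RECOVER_NOW, MCE_RECOVER_LATER, MCE_IGNORE, MCE_NOT_MCE };

/* Assumption: the application sets this to 1 in its one dedicated AO thread. */
static __thread int is_ao_thread;

/* Pure decision logic: AR is handled by whichever thread hit the error,
 * AO is handled only by the dedicated thread and ignored elsewhere. */
enum mce_action classify_sigbus(int si_code, int dedicated)
{
    switch (si_code) {
    case BUS_MCEERR_AR:
        return MCE_RECOVER_NOW;
    case BUS_MCEERR_AO:
        return dedicated ? MCE_RECOVER_LATER : MCE_IGNORE;
    default:
        return MCE_NOT_MCE;
    }
}

static void sigbus_handler(int sig, siginfo_t *si, void *ctx)
{
    (void)sig;
    (void)ctx;
    switch (classify_sigbus(si->si_code, is_ao_thread)) {
    case MCE_RECOVER_NOW:
        /* Application-specific: the data at si->si_addr is lost and must be
         * replaced or abandoned before returning, or the access faults again. */
        break;
    case MCE_RECOVER_LATER:
        /* Application-specific: record si->si_addr and stop using that page. */
        break;
    case MCE_IGNORE:
        /* Not the dedicated thread: deliberately do nothing. */
        break;
    case MCE_NOT_MCE:
        /* Ordinary SIGBUS: restore the default action and re-raise. */
        signal(SIGBUS, SIG_DFL);
        raise(SIGBUS);
        break;
    }
}

/* Installs the process-wide SIGBUS handler and requests early kill for the
 * calling thread. Returns 0 on success, -1 on failure. */
int install_mce_handling(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = sigbus_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, NULL) != 0)
        return -1;
    return prctl(PR_MCE_KILL, PR_MCE_KILL_SET, PR_MCE_KILL_EARLY, 0, 0);
}
```

Keeping the decision in a small pure function separates the policy (who reacts to AR versus AO) from the signal plumbing, so the policy can be checked without injecting real memory errors. It does not address the race raised above: other threads can keep running while the signals are being sent.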