Message-ID: <ZeiZWjRUmXszp0CN@x1>
Date: Wed, 6 Mar 2024 13:27:06 -0300
From: Arnaldo Carvalho de Melo <acme@...nel.org>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: Mike Galbraith <efault@....de>, Peter Zijlstra <peterz@...radead.org>,
	Marco Elver <elver@...gle.com>, linux-rt-users@...r.kernel.org,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Juri Lelli <juri.lelli@...hat.com>,
	Clark Williams <williams@...hat.com>,
	Alessandro Carminati <acarmina@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Artem Savkov <asavkov@...hat.com>
Subject: Re: 'perf test sigtrap' failing on PREEMPT_RT_FULL

On Wed, Mar 06, 2024 at 10:06:30AM -0300, Arnaldo Carvalho de Melo wrote:
> > On Thu, 4 Jan 2024 19:35:57 -0300, Arnaldo Carvalho de Melo wrote:
> > > +++ b/kernel/events/core.c
> > > @@ -6801,7 +6801,7 @@ static void perf_pending_task(struct callback_head *head)
> > >         * If we 'fail' here, that's OK, it means recursion is already disabled
> > >         * and we won't recurse 'further'.
> > >         */
> > >-       preempt_disable_notrace();
> > >+       migrate_disable();
> > >        rctx = perf_swevent_get_recursion_context();
>  
> > Pardon my ignorance, is it safe to call preempt_count() with preemption
> > enabled on PREEMPT_RT, or at least in the context being discussed here?
>  
> > Because:
>  
> > 	 perf_swevent_get_recursion_context()
> > 	     get_recursion_context()
> >                  interrupt_context_level()
> >                      preempt_count()	
>  
> > And:
>  
> > int perf_swevent_get_recursion_context(void)
> > {
> >         struct swevent_htable *swhash = this_cpu_ptr(&swevent_htable);
> > 
> >         return get_recursion_context(swhash->recursion);
> > }
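
To complete the chain quoted above, the other two helpers look roughly
like this in current trees (reproduced from memory of
kernel/events/internal.h and include/linux/preempt.h, so details may
differ in linux-6.8.y-rt). Note that only the NMI/hardirq/softirq bits
of preempt_count() are consulted, not the preemption-disable count
itself:

static __always_inline unsigned char interrupt_context_level(void)
{
        unsigned long pc = preempt_count();
        unsigned char level = 0;

        /* How deep into NMI/hardirq/softirq context are we? */
        level += !!(pc & NMI_MASK);
        level += !!(pc & (NMI_MASK | HARDIRQ_MASK));
        level += !!(pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET));

        return level;
}

static inline int get_recursion_context(int *recursion)
{
        unsigned char rctx = interrupt_context_level();

        /* Already inside a software event at this context level: bail. */
        if (recursion[rctx])
                return -1;

        recursion[rctx]++;
        barrier();

        return rctx;
}
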
> 
> Seems to be enough because perf_pending_task is an irq_work callback and

s/irq_work/task_work/ but that also doesn't reenter, I think

> that is guaranteed not to reenter?
> 
> Artem's tests with a RHEL kernel seem to indicate that, ditto for mine; I
> will test with upstream linux-6.8.y-rt.
> 
> But there is a lot more happening in perf_sigtrap() and I'm not sure that,
> if the irq_work callback gets preempted, we would not race with something
> else.
> 
> Marco, Mike, ideas?
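
Just to spell out what "task_work" means here: perf_pending_task() is a
callback_head run by task_work_run() in the target task's own context,
normally on its way back to user space, i.e. with preemption enabled,
which is exactly why the preemption/race question above matters. A
minimal sketch of the mechanism (illustrative only, the example_* names
are mine, this is not the perf code):

#include <linux/sched.h>
#include <linux/task_work.h>

static struct callback_head example_head;

static void example_twork(struct callback_head *head)
{
        /*
         * Runs from task_work_run() in the queuing task's context,
         * typically on exit to user space; fully preemptible, so on
         * PREEMPT_RT it can be preempted unless it disables preemption
         * (or at least migration) itself.
         */
}

static void queue_example(void)
{
        init_task_work(&example_head, example_twork);
        /* TWA_RESUME: run it before the task returns to user space. */
        if (task_work_add(current, &example_head, TWA_RESUME))
                ; /* task is already exiting, work was not queued */
}
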

Looking at:

commit ca6c21327c6af02b7eec31ce4b9a740a18c6c13f
Author: Peter Zijlstra <peterz@...radead.org>
Date:   Thu Oct 6 15:00:39 2022 +0200

    perf: Fix missing SIGTRAPs
    
    Marco reported:
    
    Due to the implementation of how SIGTRAP are delivered if
    perf_event_attr::sigtrap is set, we've noticed 3 issues:
    
      1. Missing SIGTRAP due to a race with event_sched_out() (more
         details below).
    
      2. Hardware PMU events being disabled due to returning 1 from
         perf_event_overflow(). The only way to re-enable the event is
         for user space to first "properly" disable the event and then
         re-enable it.
    
      3. The inability to automatically disable an event after a
         specified number of overflows via PERF_EVENT_IOC_REFRESH.
    
    The worst of the 3 issues is problem (1), which occurs when a
    pending_disable is "consumed" by a racing event_sched_out(), observed
    as follows:

-------------------------------------------------------------

That is what introduces perf_pending_task(). I'm now unsure we can just
disable migration, as event_sched_out() seems to require being called
under a raw_spin_lock, and that disables preemption...
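
To make that concern concrete: on PREEMPT_RT a raw_spinlock_t still
disables preemption (spinlock_t turns into a sleeping rtmutex there),
so anything reached under that raw lock effectively runs with
preemption off no matter whether the caller only did migrate_disable().
Illustrative only, not the perf code:

#include <linux/spinlock.h>

static DEFINE_RAW_SPINLOCK(example_lock);

static void example_critical_section(void)
{
        /*
         * raw_spin_lock() disables preemption even on PREEMPT_RT, so
         * this critical section must not sleep and behaves the same as
         * on !RT regardless of how the caller got here.
         */
        raw_spin_lock(&example_lock);
        /* ... */
        raw_spin_unlock(&example_lock);
}
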

- Arnaldo
