lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 22 Jan 2020 18:26:47 -0800
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Masami Hiramatsu <mhiramat@...nel.org>
Cc:     Steven Rostedt <rostedt@...dmis.org>,
        Brendan Gregg <brendan.d.gregg@...il.com>,
        Alexei Starovoitov <ast@...nel.org>,
        Ingo Molnar <mingo@...nel.org>, bpf@...r.kernel.org,
        linux-kernel@...r.kernel.org,
        Daniel Borkmann <daniel@...earbox.net>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        "David S . Miller" <davem@...emloft.net>, joel@...lfernandes.org,
        "Naveen N . Rao" <naveen.n.rao@...ux.ibm.com>,
        Anil S Keshavamurthy <anil.s.keshavamurthy@...el.com>
Subject: Re: [RFT PATCH 04/13] kprobes: Make optimizer delay to 1 second

On Thu, Jan 23, 2020 at 10:33:34AM +0900, Masami Hiramatsu wrote:
> On Wed, 22 Jan 2020 08:54:32 -0800
> "Paul E. McKenney" <paulmck@...nel.org> wrote:
> 
> > On Wed, Jan 22, 2020 at 10:12:40PM +0900, Masami Hiramatsu wrote:
> > > On Wed, 22 Jan 2020 07:11:15 -0500
> > > Steven Rostedt <rostedt@...dmis.org> wrote:
> > > 
> > > > On Wed, 22 Jan 2020 16:23:17 +0900
> > > > Masami Hiramatsu <mhiramat@...nel.org> wrote:
> > > > 
> > > > > On Tue, 21 Jan 2020 19:29:05 -0500
> > > > > Steven Rostedt <rostedt@...dmis.org> wrote:
> > > > > 
> > > > > > On Thu, 16 Jan 2020 23:44:52 +0900
> > > > > > Masami Hiramatsu <mhiramat@...nel.org> wrote:
> > > > > >   
> > > > > > > Since the 5 jiffies delay for the optimizer is too
> > > > > > > short to wait for other probes, make it longer,
> > > > > > > like 1 second.  
> > > > > > 
> > > > > > Hi Masami,
> > > > > > 
> > > > > > Can you explain more *why* 5 jiffies is too short.  
> > > > > 
> > > > > Yes, I had introduced this 5 jiffies delay for multiple probe registration
> > > > > and unregistration like systemtap, which will use array-based interface to
> > > > > register/unregister. In that case, 5 jiffies will be enough for the delay
> > > > > to wait for other kprobe registration/unregsitration.
> > > > > 
> > > > > However, since perf and ftrace register/unregister probes one-by-one with
> > > > > RCU synchronization interval, the optimizer will be started before
> > > > > finishing to register/unregister all probes.
> > > > > And the optimizer locks kprobe_mutex a while -- RCU-tasks synchronization.
> > > > > Since the kprobe_mutex is also involved in disabling kprobes, this also
> > > > > stops probe-event disabling.
> > > > > 
> > > > > Maybe 5 jiffies is enough for adding/removing a few probe events, but
> > > > > not enough for dozens of probe events.
> > > > > 
> > > > 
> > > > Perhaps we should have a mechanism that can detect new probes being
> > > > added, and just continue to delay the optimization, instead of having
> > > > some arbitrary delay.
> > > 
> > > Yes, that is what [03/13] does :) 
> > > Anyway, it seems that the RCU-synchronization takes more than 5 jiffies.
> > > And in that case, [03/13] still doesn't work. That's why I added this patch
> > > after that.
> > 
> > If the RCU synchronization is synchronize_rcu_tasks(), then yes, it
> > will often take way more than 5 jiffies.  If it is synchronize_rcu(),
> > 5 jiffies would not be unusual, especially on larger systems.
> > But in the case of synchronize_rcu(), one option is to instead use
> > synchronize_rcu_expedited().  It is not clear that this last is really
> > justified in this case, but figured it might be worth mentioning.
> 
> It is synchronize_rcu(), but it seems sometimes it is called several
> times on one probe disabling.

OK, good to know.

> Anyway, without this update, I added a printk to count optimizer
> queue-length and found that all optimizer call with a single kprobe
> on the quenes. I think this depends on the machine, but as far as
> I tested on 8-threads qemu x86, shows this result...
> 
> Probes: 256 kprobe_events
> Enable events
> real	0m 0.02s
> user	0m 0.00s
> sys	0m 0.02s
> [   17.730548] Queue-update: 180, skipped, Total Queued: 180
> [   17.739445] Queue-update: 1, go, Total Queued: 180
> Disable events
> [   19.744634] Queue-update: 1, go, Total Queued: 1
> [   19.770743] Queue-update: 1, go, Total Queued: 1
> [   19.886625] Queue-update: 1, go, Total Queued: 1
> [   20.006631] Queue-update: 1, go, Total Queued: 1
> [   20.126628] Queue-update: 1, go, Total Queued: 1
> [   20.246620] Queue-update: 1, go, Total Queued: 1
> [   20.374665] Queue-update: 1, go, Total Queued: 1
> [   20.486617] Queue-update: 1, go, Total Queued: 1
> [   20.606608] Queue-update: 1, go, Total Queued: 1
> [   20.726596] Queue-update: 1, go, Total Queued: 1
> [   20.846608] Queue-update: 1, go, Total Queued: 1
> [   20.966723] Queue-update: 1, go, Total Queued: 1
> [   21.086611] Queue-update: 1, go, Total Queued: 1
> [   21.206605] Queue-update: 1, go, Total Queued: 1
> [   21.326603] Queue-update: 1, go, Total Queued: 1
> [   21.462609] Queue-update: 1, go, Total Queued: 1
> [   21.566755] Queue-update: 1, go, Total Queued: 1
> [   21.686606] Queue-update: 1, go, Total Queued: 1
> [   21.806611] Queue-update: 1, go, Total Queued: 1
> [   21.926609] Queue-update: 1, go, Total Queued: 1
> [   22.046621] Queue-update: 1, go, Total Queued: 1
> [   22.166729] Queue-update: 1, go, Total Queued: 1
> [   22.302609] Queue-update: 1, go, Total Queued: 1
> [   22.510627] Queue-update: 1, go, Total Queued: 1
> [   22.536638] Queue-update: 1, go, Total Queued: 1
> [   22.654618] Queue-update: 1, go, Total Queued: 1
> [   22.774643] Queue-update: 1, go, Total Queued: 1
> [   22.902609] Queue-update: 1, go, Total Queued: 1
> [   23.022608] Queue-update: 1, go, Total Queued: 1
> [   23.158606] Queue-update: 1, go, Total Queued: 1
> [   23.254618] Queue-update: 1, go, Total Queued: 1
> [   23.374647] Queue-update: 1, go, Total Queued: 1
> [   23.494605] Queue-update: 1, go, Total Queued: 1
> [   23.614610] Queue-update: 1, go, Total Queued: 1
> [   23.734606] Queue-update: 1, go, Total Queued: 1
> [   23.854609] Queue-update: 1, go, Total Queued: 1
> [   23.974615] Queue-update: 1, go, Total Queued: 1
> [   24.094609] Queue-update: 1, go, Total Queued: 1
> [   24.230607] Queue-update: 1, go, Total Queued: 1
> [   24.342625] Queue-update: 1, go, Total Queued: 1
> [   24.475478] Queue-update: 1, go, Total Queued: 1
> [   24.574554] Queue-update: 1, go, Total Queued: 1
> [   24.694605] Queue-update: 1, go, Total Queued: 1
> [   24.814585] Queue-update: 1, go, Total Queued: 1
> [   24.934591] Queue-update: 1, go, Total Queued: 1
> [   25.054614] Queue-update: 1, go, Total Queued: 1
> [   25.174628] Queue-update: 1, go, Total Queued: 1
> [   25.294637] Queue-update: 1, go, Total Queued: 1
> [   25.414620] Queue-update: 1, go, Total Queued: 1
> [   25.534553] Queue-update: 1, go, Total Queued: 1
> [   25.654610] Queue-update: 1, go, Total Queued: 1
> [   25.774627] Queue-update: 1, go, Total Queued: 1
> [   25.894609] Queue-update: 1, go, Total Queued: 1
> [   26.030548] Queue-update: 1, go, Total Queued: 1
> [   26.134626] Queue-update: 1, go, Total Queued: 1
> [   26.254620] Queue-update: 1, go, Total Queued: 1
> [   26.373603] Queue-update: 1, go, Total Queued: 1
> [   26.502606] Queue-update: 1, go, Total Queued: 1
> [   26.614607] Queue-update: 1, go, Total Queued: 1
> [   26.734610] Queue-update: 1, go, Total Queued: 1
> [   26.854620] Queue-update: 1, go, Total Queued: 1
> [   26.974660] Queue-update: 1, go, Total Queued: 1
> [   27.094620] Queue-update: 1, go, Total Queued: 1
> [   27.214596] Queue-update: 1, go, Total Queued: 1
> [   27.334640] Queue-update: 1, go, Total Queued: 1
> [   27.494606] Queue-update: 1, go, Total Queued: 1
> [   27.574703] Queue-update: 1, go, Total Queued: 1
> [   27.694609] Queue-update: 1, go, Total Queued: 1
> [   27.814607] Queue-update: 1, go, Total Queued: 1
> [   27.934631] Queue-update: 1, go, Total Queued: 1
> [   28.054606] Queue-update: 1, go, Total Queued: 1
> [   28.182685] Queue-update: 1, go, Total Queued: 1
> [   28.318568] Queue-update: 1, go, Total Queued: 1
> [   28.422610] Queue-update: 1, go, Total Queued: 1
> [   28.534621] Queue-update: 1, go, Total Queued: 1
> [   28.654618] Queue-update: 1, go, Total Queued: 1
> [   28.774622] Queue-update: 1, go, Total Queued: 1
> [   28.894609] Queue-update: 1, go, Total Queued: 1
> [   29.022609] Queue-update: 1, go, Total Queued: 1
> [   29.150615] Queue-update: 1, go, Total Queued: 1
> [   29.262610] Queue-update: 1, go, Total Queued: 1
> [   29.374621] Queue-update: 1, go, Total Queued: 1
> [   29.494606] Queue-update: 1, go, Total Queued: 1
> [   29.614616] Queue-update: 1, go, Total Queued: 1
> [   29.734607] Queue-update: 1, go, Total Queued: 1
> [   29.854601] Queue-update: 1, go, Total Queued: 1
> [   29.974610] Queue-update: 1, go, Total Queued: 1
> [   30.094625] Queue-update: 1, go, Total Queued: 1
> [   30.214606] Queue-update: 1, go, Total Queued: 1
> [   30.334602] Queue-update: 1, go, Total Queued: 1
> [   30.454634] Queue-update: 1, go, Total Queued: 1
> [   30.574606] Queue-update: 1, go, Total Queued: 1
> [   30.694589] Queue-update: 1, go, Total Queued: 1
> [   30.814613] Queue-update: 1, go, Total Queued: 1
> [   30.934602] Queue-update: 1, go, Total Queued: 1
> [   31.054605] Queue-update: 1, go, Total Queued: 1
> [   31.182596] Queue-update: 1, go, Total Queued: 1
> [   31.318621] Queue-update: 1, go, Total Queued: 1
> [   31.414615] Queue-update: 1, go, Total Queued: 1
> [   31.534610] Queue-update: 1, go, Total Queued: 1
> [   31.670608] Queue-update: 1, go, Total Queued: 1
> [   31.774626] Queue-update: 1, go, Total Queued: 1
> [   31.894607] Queue-update: 1, go, Total Queued: 1
> [   32.014609] Queue-update: 1, go, Total Queued: 1
> [   32.134607] Queue-update: 1, go, Total Queued: 1
> [   32.254611] Queue-update: 1, go, Total Queued: 1
> [   32.374608] Queue-update: 1, go, Total Queued: 1
> [   32.494619] Queue-update: 1, go, Total Queued: 1
> [   32.614607] Queue-update: 1, go, Total Queued: 1
> [   32.734612] Queue-update: 1, go, Total Queued: 1
> [   32.862616] Queue-update: 1, go, Total Queued: 1
> [   32.974620] Queue-update: 1, go, Total Queued: 1
> [   33.110609] Queue-update: 1, go, Total Queued: 1
> [   33.214609] Queue-update: 1, go, Total Queued: 1
> [   33.342611] Queue-update: 1, go, Total Queued: 1
> [   33.454607] Queue-update: 1, go, Total Queued: 1
> [   33.574611] Queue-update: 1, go, Total Queued: 1
> [   33.694619] Queue-update: 1, go, Total Queued: 1
> [   33.814607] Queue-update: 1, go, Total Queued: 1
> [   33.950614] Queue-update: 1, go, Total Queued: 1
> [   34.062609] Queue-update: 1, go, Total Queued: 1
> [   34.174609] Queue-update: 1, go, Total Queued: 1
> [   34.294619] Queue-update: 1, go, Total Queued: 1
> [   34.430533] Queue-update: 1, go, Total Queued: 1
> [   34.534594] Queue-update: 1, go, Total Queued: 1
> [   34.654605] Queue-update: 1, go, Total Queued: 1
> [   34.790596] Queue-update: 1, go, Total Queued: 1
> [   34.902611] Queue-update: 1, go, Total Queued: 1
> [   35.014630] Queue-update: 1, go, Total Queued: 1
> [   35.134634] Queue-update: 1, go, Total Queued: 1
> [   35.262608] Queue-update: 1, go, Total Queued: 1
> [   35.374634] Queue-update: 1, go, Total Queued: 1
> [   35.494617] Queue-update: 1, go, Total Queued: 1
> [   35.622608] Queue-update: 1, go, Total Queued: 1
> [   35.742610] Queue-update: 1, go, Total Queued: 1
> [   35.854572] Queue-update: 1, go, Total Queued: 1
> [   35.982596] Queue-update: 1, go, Total Queued: 1
> [   36.094603] Queue-update: 1, go, Total Queued: 1
> [   36.222612] Queue-update: 1, go, Total Queued: 1
> [   36.334610] Queue-update: 1, go, Total Queued: 1
> [   36.454619] Queue-update: 1, go, Total Queued: 1
> [   36.574619] Queue-update: 1, go, Total Queued: 1
> [   36.694643] Queue-update: 1, go, Total Queued: 1
> [   36.814614] Queue-update: 1, go, Total Queued: 1
> [   36.934610] Queue-update: 1, go, Total Queued: 1
> [   37.062609] Queue-update: 1, go, Total Queued: 1
> [   37.198611] Queue-update: 1, go, Total Queued: 1
> [   37.294618] Queue-update: 1, go, Total Queued: 1
> [   37.414618] Queue-update: 1, go, Total Queued: 1
> [   37.534595] Queue-update: 1, go, Total Queued: 1
> [   37.662594] Queue-update: 1, go, Total Queued: 1
> [   37.774610] Queue-update: 1, go, Total Queued: 1
> [   37.894618] Queue-update: 1, go, Total Queued: 1
> [   38.014619] Queue-update: 1, go, Total Queued: 1
> [   38.142612] Queue-update: 1, go, Total Queued: 1
> [   38.254609] Queue-update: 1, go, Total Queued: 1
> [   38.374619] Queue-update: 1, go, Total Queued: 1
> [   38.502481] Queue-update: 1, go, Total Queued: 1
> [   38.614596] Queue-update: 1, go, Total Queued: 1
> [   38.734597] Queue-update: 1, go, Total Queued: 1
> [   38.854606] Queue-update: 1, go, Total Queued: 1
> [   38.974620] Queue-update: 1, go, Total Queued: 1
> [   39.094617] Queue-update: 1, go, Total Queued: 1
> [   39.222597] Queue-update: 1, go, Total Queued: 1
> [   39.334610] Queue-update: 1, go, Total Queued: 1
> [   39.454609] Queue-update: 1, go, Total Queued: 1
> [   39.574633] Queue-update: 1, go, Total Queued: 1
> [   39.694611] Queue-update: 1, go, Total Queued: 1
> [   39.814608] Queue-update: 1, go, Total Queued: 1
> [   39.934610] Queue-update: 1, go, Total Queued: 1
> [   40.054621] Queue-update: 1, go, Total Queued: 1
> [   40.174611] Queue-update: 1, go, Total Queued: 1
> [   40.297471] Queue-update: 1, go, Total Queued: 1
> [   40.414504] Queue-update: 1, go, Total Queued: 1
> [   40.534601] Queue-update: 1, go, Total Queued: 1
> [   40.654611] Queue-update: 1, go, Total Queued: 1
> [   40.774609] Queue-update: 1, go, Total Queued: 1
> [   40.894628] Queue-update: 1, go, Total Queued: 1
> [   41.014608] Queue-update: 1, go, Total Queued: 1
> [   41.135594] Queue-update: 1, go, Total Queued: 1
> real	0m 21.40s
> user	0m 0.00s
> sys	0m 0.04s

So 21.4s/256 = 84 milliseconds per event disable, correct?

It might be worth trying synchronize_rcu_expedited() just as an experiment
to see if it speeds things up significantly.

							Thanx, Paul

> Remove events
> real	0m 2.14s
> user	0m 0.00s
> sys	0m 0.04s
> 
> 
> Thank you,
> 
> 
> -- 
> Masami Hiramatsu <mhiramat@...nel.org>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ