lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 06 Nov 2018 22:13:49 -0800 (PST)
From:   David Miller <davem@...emloft.net>
To:     jolsa@...hat.com
Cc:     acme@...nel.org, linux-kernel@...r.kernel.org, namhyung@...nel.org,
        jolsa@...nel.org
Subject: Re: [PATCH RFC] hist lookups

From: Jiri Olsa <jolsa@...hat.com>
Date: Tue, 6 Nov 2018 21:42:55 +0100

> I pushed that fix in perf/fixes branch, but I'm still occasionaly
> hitting the namespace crash.. working on it ;-)

Jiri, how can this new scheme work without setting copy_on_queue
for the queued_events we use here?

I don't see copy_on_queue being set and that means the queued event
structures reference the event memory directly in the mmaps, after the
mmap thread has released them back to the queue.

That means new events can come in to the mmap ring and overwrite what
was there previously, maybe even while deliver_event() is in the
middle of parsing the event.

Setting copy_on_queue for data[0] and data[1] makes all of the crashes
go away for me.

I get a lot of "[unknown]" shared objects shortly after perf top
starts up during a full workload.  I've been wondering about one
side effect of how the mmap queues are processed, consider the
following:

	cpu 0			cpu 1

				exec
				create new mmap2 events
				scheduled to cpu 0 for whatever reason
	sample 1
	sample 2

And let's say that perf top is backlogged processing the mmap ring of
events generated for cpu 0, and sees sample 1 and sample 2 before
getting to any of cpu 1's events.

This means the thread and map and symbol objects won't exist and
we'll get those '[Unknown]' histogram entries, and they won't go
away.

When it finally stops looping over the mmap ring for cpu 0's events
it gets to cpu 1's mmap ring and sees the exec and mmap2 events
but at that point it's far too late.

I surmise from what I see with perf top right now that this happens
a lot.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ