linux-kernel - Re: [PATCH v2] perf tool: Fix ppid for synthesized fork events

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20150327131005.GA162412@redhat.com>
Date:	Fri, 27 Mar 2015 09:10:05 -0400
From:	Don Zickus <dzickus@...hat.com>
To:	David Ahern <dsahern@...il.com>
Cc:	acme@...nel.org, linux-kernel@...r.kernel.org,
	Joe Mario <jmario@...hat.com>, Jiri Olsa <jolsa@...hat.com>
Subject: Re: [PATCH v2] perf tool: Fix ppid for synthesized fork events

On Thu, Mar 26, 2015 at 03:37:29PM -0600, David Ahern wrote:
> On 3/26/15 3:11 PM, Don Zickus wrote:
> >Sorry for drawing this out.  Originally the performance still seemed off.
> >But as we split the patch up to see where the perf impact was, the problem
> >seemed to have disappeared.  So we are testing the original patch again.
> >
> >The only difference now is we were playing with the -BN option in perf based
> >on your changelog, just because we never used it before. :-)
> 
> I was beyond surprised that you were measuring a 50% hit with the
> first patch. As mentioned in a previous response it only adds the
> processing of 3 additional lines to the already opened and read
> /proc/pid/status file. So, when I wrote this second version I wanted
> to make sure we are only measuring the impact of this change. The
> /proc/pid/status files are read on startup of the record -- before
> any samples are taken.
> 
> The intent of '-e cpu-clock -F 1000 -- usleep 1' is to avoid any
> samples since we don't care about them. Really the -a should be
> dropped as well -- no need to open per-cpu events.
> 
> -B impacts processing done at the end of the run:
> 
> builin-record.c, __cmd_record():
> 
>                 if (!rec->no_buildid)
>                         process_buildids(rec);
> 
> and -N says don't copying anything to ~/.debug. All together it
> tries to focus the measurement to /proc walking.
> 
> >
> >One last test without the -BN option and if that looks fine, then we have no
> >objections.  Again sorry for dragging this out.  I will let you know
> >tomorrow EST.
> 
> no problem; appreciate the heads up.

I talked with Joe on my way out the door yesterday and he confirmed, just
removing -BN from our test showed a performance hit with your patch.  With
the -BN option, there is no performance hit and we are perfectly fine with
your patch.

So, I guess I am confused how the -BN and your patch could change behaviour.

Just to re-iterate what we did, Joe kicked off a specJBB run and he did 20
captures of two runs (one with the unpatched binary and one with a pached
binary).

for i in {1..20}
do
  time perf.unpatched mem record -a -e cpu/mem-loads,ldlat=50/pp -e cpu/mem-stores/pp sleep 10
  time perf.patched   mem record -a -e cpu/mem-loads,ldlat=50/pp -e cpu/mem-stores/pp sleep 10
done

then we repeat the above test but with -BN in both runs.  We compare the
log sizes to make sure they are similar for the random snapshots and compare
the times.  With the -BN option, the times are generally within +/- 0.5
seconds of each.  Without the -BN option the patched perf binary is
generally +20-40 seconds slower.



However, based on your description above about what the -BN option does, I
am scratching my head about our results.  Thoughts?

Cheers,
Don

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/