Message-ID: <m1lj842nph.fsf@fess.ebiederm.org>
Date: Wed, 18 Aug 2010 03:56:58 -0700
From: ebiederm@...ssion.com (Eric W. Biederman)
To: "Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>
Cc: LKML <linux-kernel@...r.kernel.org>, alex.shi@...el.com,
Pavel Emelyanov <xemul@...nvz.org>,
"David S. Miller" <davem@...emloft.net>
Subject: Re: hackbench regression with 2.6.36-rc1
"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com> writes:
> Comparing with 2.6.35's result, hackbench (thread mode) has about
> 80% regression on dual-socket Nehalem machine and about 90% regression
> on 4-socket Tigerton machines.
That seems unfortunate. Do you only show a regression in the pthread
hackbench test? Do you show a regression when you use pipes?
Does the size of the regression vary based on the number of loop
iterations? I ask because it appears that on the last message the
sender will exit, necessitating that the receiver put the sender's
pid, which should be atypical.
> Command to start hackbench:
> #./hackbench 100 thread 2000
>
> process mode has no such regression.
>
> Profiling shows:
> #perf top
> samples pcnt function DSO
> _______ _____ ________________________ ________________________
>
> 74415.00 29.9% put_pid [kernel.kallsyms]
> 38395.00 15.4% unix_stream_recvmsg [kernel.kallsyms]
> 34877.00 14.0% unix_stream_sendmsg [kernel.kallsyms]
> 25204.00 10.1% pid_vnr [kernel.kallsyms]
> 21864.00 8.8% unix_scm_to_skb [kernel.kallsyms]
> 13637.00 5.5% cred_to_ucred [kernel.kallsyms]
> 6520.00 2.6% unix_destruct_scm [kernel.kallsyms]
> 4731.00 1.9% sock_alloc_send_pskb [kernel.kallsyms]
>
>
> With 2.6.35, perf doesn't show put_pid/pid_vnr.
Yes. 2.6.35 is imperfect and can report the wrong pid in some
circumstances. I am surprised that nothing related to the reference
count on struct cred shows up in your profiling traces.
You are performing statistical sampling so I don't believe the
percentage of hits per function is the same as the percentage of
time per function.
Given that we are talking about a scheduler benchmark that is
doing something rather artificial (inter thread communication via
sockets), I don't know that this case is worth worrying about.
> Alex Shi and I did a quick bisect and located below 2 patches.
That is a plausible result. The atomic reference counts may
be causing you to ping-pong cache lines between CPUs.
Eric
> 1) commit 7361c36c5224519b258219fe3d0e8abc865d8134
> Author: Eric W. Biederman <ebiederm@...ssion.com>
> Date: Sun Jun 13 03:34:33 2010 +0000
>
> af_unix: Allow credentials to work across user and pid namespaces.
>
> In unix_skb_parms store pointers to struct pid and struct cred instead
> of raw uid, gid, and pid values, then translate the credentials on
> reception into values that are meaningful in the receiving processes
> namespaces.
>
>
> 2) commit 257b5358b32f17e0603b6ff57b13610b0e02348f
> Author: Eric W. Biederman <ebiederm@...ssion.com>
> Date: Sun Jun 13 03:32:34 2010 +0000
>
> scm: Capture the full credentials of the scm sender.
>
> Start capturing not only the userspace pid, uid and gid values of the
> sending process but also the struct pid and struct cred of the sending
> process as well.
--