Date:	Thu, 02 Nov 2006 11:33:18 -0700
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	tim.c.chen@...ux.intel.com
Cc:	linux-kernel@...r.kernel.org
Subject: Re: 2.6.19-rc1: Slowdown in lmbench's fork

Tim Chen <tim.c.chen@...ux.intel.com> writes:

> After introduction of the following patch:
>
> [PATCH] genirq: x86_64 irq: make vector_irq per cpu
> http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=550f2299ac8ffaba943cf211380d3a8d3fa75301
>
> we see the fork benchmark in lmbench-3.0-a7 slow down by
> 11.5% on a 2-socket Woodcrest machine.  A similar change
> is also seen on other SMP Xeon machines.
>
> When running lmbench, we chose the lmbench option
> to pin the parent and child to different processor cores.
>
> The overhead of calling sched_setaffinity to place the process
> on a processor is included in lmbench's fork time measurement.
> The patch may play a role in increasing this.
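
For reference, here is a minimal sketch of that kind of measurement
(this is not lmbench's actual code; the core numbers and the iteration
count below are arbitrary):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <unistd.h>

/* Pin the calling process to a single CPU. */
static void pin_to_cpu(int cpu)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	if (sched_setaffinity(0, sizeof(set), &set) < 0) {
		perror("sched_setaffinity");
		exit(1);
	}
}

int main(void)
{
	const int iterations = 1000;
	struct timeval start, end;
	int i;

	pin_to_cpu(0);				/* parent on core 0 */
	gettimeofday(&start, NULL);
	for (i = 0; i < iterations; i++) {
		pid_t pid = fork();

		if (pid == 0) {
			/* child: move to another core, then exit;
			   this pinning cost lands in the fork time */
			pin_to_cpu(1);
			_exit(0);
		}
		waitpid(pid, NULL, 0);
	}
	gettimeofday(&end, NULL);

	printf("avg fork+pin: %.1f us\n",
	       ((end.tv_sec - start.tv_sec) * 1e6 +
		(end.tv_usec - start.tv_usec)) / iterations);
	return 0;
}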

The only thing I can think of is that because data structures
moved around we may be seeing some more cache misses.  If we
were talking about normal interrupts taking a little longer, I
could see my change having a direct correlation.  But I don't
believe anything I touched in that patch is on the IPI path.

I did add some things to the per cpu area and expanded it a little,
which may be what you are seeing.

That feels like a significant increase in fork times.  I will think
about it and holler if I can think of a productive direction to try.

My only partial guess is that it might be worth adding the per cpu
variables my patch adds without any of the corresponding code changes,
and seeing whether adding variables to the per cpu area is what is
causing the change.

The two tests I can see along those lines are (sketched below):
- add the per cpu vector_irq variable.
- increase NR_IRQS.
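
A rough sketch of what those two test-only changes could look like
(the names, the array size and the NR_IRQS definition below are
placeholders, not what the patch actually uses):

#include <linux/percpu.h>

/*
 * Test 1: declare a per cpu vector table of roughly the same shape
 * as the patch does, without wiring up any code that uses it, so
 * that only the per cpu area grows.
 */
typedef int test_vector_irq_t[256];
static DEFINE_PER_CPU(test_vector_irq_t, test_vector_irq) = {
	[0 ... 255] = -1
};

/*
 * Test 2: in the arch irq header, raise NR_IRQS on its own so that
 * irq_desc[] and related arrays grow, e.g.
 *
 *	#define NR_IRQS		224
 *
 * and see whether that alone moves the fork numbers.
 */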

My suspicion is that one of those two changes alone will change
things enough to reproduce the lmbench slowdown you see.  If that
is the case, then it is probably worth shuffling the variables
around in the per cpu area to get better cache line affinity.
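
Something along these lines, for example (test_hot_counter is just a
placeholder name, not an existing kernel variable):

#include <linux/cache.h>
#include <linux/percpu.h>

/*
 * Give a frequently accessed per cpu variable its own cache line, so
 * that growing or reordering neighbouring per cpu data cannot move it
 * onto a line it shares with rarely used data.
 */
static DEFINE_PER_CPU(unsigned long, test_hot_counter)
	____cacheline_aligned_in_smp;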

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
