Message-ID: <CAJvTdKmz9105cYDiZThfCd29PWeNh2d2QbZFZ0n_UEeqCah7Fg@mail.gmail.com>
Date: Fri, 29 May 2015 03:47:16 -0400
From: Len Brown <lenb@...nel.org>
To: Ingo Molnar <mingo@...nel.org>
Cc: Jan H. Schönherr <jschoenh@...zon.de>,
Thomas Gleixner <tglx@...utronix.de>, X86 ML <x86@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Anthony Liguori <aliguori@...zon.com>,
Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>, Tim Deegan <tim@....org>,
Gang Wei <gang.wei@...el.com>,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: native_cpu_up speed (Re: [PATCH] x86: skip delays during SMP
initialization similar to Xen)
>> I don't know if anything can be done for the 1700us wait
>> for the remote processor to mark itself initialized.
>> That is the 1st thing it does when it enters cpu_init().
>
> So that 1.7 msecs delay is the firmware in essence?
Yes -- hardware+microcode+firmware initialization.
I measured this on the 4-socket IVT and found that this
delay is not constant. Here is how many udelay(100) calls
executed for each processor while waiting for the "initialized" map:
1:64 (for cpu1, we wait 64 * udelay(100) = 6400 usec)
2:25 (for cpu2, we wait 25 * udelay(100) = 2500 usec)
3:3 (for cpu3, we wait 3 * udelay(100) = 300 usec)
4:3 etc.
5:4
6:4
7:3
8:4
9:3
10:3
11:3
12:3
13:3
14:3
15:20
16:20
17:20
18:20
19:20
20:20
21:20
22:20
23:20
24:20
25:18
26:18
27:18
28:18
29:18
30:20
31:20
32:20
33:20
34:20
35:20
36:20
37:20
38:20
39:20
40:18
41:18
42:18
43:18
44:18
45:20
46:20
47:20
48:20
49:20
50:20
51:20
52:20
53:20
54:20
55:18
56:18
57:18
58:18
59:18
60:0
61:3
62:3
63:3
64:3
65:4
66:4
67:3
68:4
69:3
70:3
71:3
72:3
73:3
74:3
75:20
76:20
77:20
78:20
79:20
80:20
81:20
82:20
83:20
84:20
85:19
86:19
87:19
88:19
89:19
90:20
91:20
92:20
93:20
94:20
95:21
96:20
97:20
98:20
99:20
100:19
101:19
102:19
103:19
104:19
105:20
106:20
107:20
108:20
109:20
110:21
111:20
112:20
113:20
114:20
115:19
116:19
117:18
118:19
119:19
I can't explain this topology, but it gives you an idea of where the time goes.
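
To make the arithmetic in the list above concrete, here is a minimal sketch that turns "cpu:count" lines into wait times. Only the line format and the 100 us step of udelay(100) come from this message; the names parse_counts and wait_usec are hypothetical, and the sample is just the first five entries.

```python
# Hypothetical helpers; only the "cpu:count" format and the 100 us
# step (udelay(100)) are taken from the message.
UDELAY_STEP_USEC = 100


def parse_counts(text):
    """Parse lines such as '1:64' into {cpu: udelay_count}."""
    counts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        cpu, _, rest = line.partition(":")
        # Keep only the leading integer so a trailing note such as
        # "etc." is ignored.
        counts[int(cpu)] = int(rest.split()[0])
    return counts


def wait_usec(count, step_usec=UDELAY_STEP_USEC):
    """Time spent waiting, given the number of udelay() calls."""
    return count * step_usec


sample = """\
1:64
2:25
3:3
4:3 etc.
5:4
"""

waits = {cpu: wait_usec(n) for cpu, n in parse_counts(sample).items()}
for cpu, usec in waits.items():
    print(f"cpu{cpu}: {usec} usec")
```

The granularity is one udelay(100) step, so a count of 0 (as for cpu60) only means the wait was shorter than one step, not that it took no time.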
However, a clear pattern jumped out of the trace for how long
the BSP waits for the AP to set itself in cpu_callin_mask.
This is the time in start_secondary() where cpu_init() is running,
up through the call to smp_callin().
On the 1st package, each remote AP takes 9 delays = 900 us to do this,
whether it is a new core or an HT sibling of a core already up.
But the 1st processor on a remote _package_ (aka node) takes 60,000 us.
No typo -- that is 60 ms!
Subsequent cores in remote nodes take about 1,800 us,
whether they are new cores or HT siblings of cores already up.
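
As a rough illustration of what these per-AP figures add up to, the sketch below totals the BSP's cpu_callin_mask wait. The three constants are the numbers quoted above; everything else is an assumption: the function name is hypothetical, every package is taken to have the same number of logical CPUs, the BSP is taken to sit in the 1st package, and the 4 x 30 layout is only inferred from the list running to cpu 119.

```python
# Per-AP waits reported in the message, in microseconds.
FIRST_PACKAGE_AP_USEC = 900      # 9 delays of 100 us
REMOTE_FIRST_CPU_USEC = 60_000   # 1st processor on a remote package
REMOTE_OTHER_CPU_USEC = 1_800    # subsequent processors on a remote package


def estimate_callin_wait_usec(packages, cpus_per_package):
    """Rough total time the BSP spends waiting on cpu_callin_mask.

    Assumptions (not stated in the message): all packages have the
    same number of logical CPUs, and the BSP is in the 1st package.
    """
    if packages < 1 or cpus_per_package < 1:
        raise ValueError("packages and cpus_per_package must be >= 1")
    # 1st package: every logical CPU except the BSP itself.
    first = (cpus_per_package - 1) * FIRST_PACKAGE_AP_USEC
    # Each remote package: one slow first CPU, then the rest.
    per_remote = (REMOTE_FIRST_CPU_USEC
                  + (cpus_per_package - 1) * REMOTE_OTHER_CPU_USEC)
    return first + (packages - 1) * per_remote


# Assumed layout: 4 packages x 30 logical CPUs = 120 (cpu0..cpu119).
total = estimate_callin_wait_usec(packages=4, cpus_per_package=30)
print(f"{total} usec (~{total / 1000:.1f} ms)")
```

Under those assumptions the three 60 ms first-in-package waits contribute 180,000 us of the 362,700 us total, roughly half.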
cheers,
Len Brown, Intel Open Source Technology Center