Message-ID: <4AEB847C.4020003@sgi.com>
Date:	Fri, 30 Oct 2009 17:27:40 -0700
From:	Mike Travis <travis@....com>
To:	David Rientjes <rientjes@...gle.com>
CC:	Ingo Molnar <mingo@...e.hu>, Andi Kleen <ak@...ux.intel.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Heiko Carstens <heiko.carstens@...ibm.com>,
	Roland Dreier <rdreier@...co.com>,
	Randy Dunlap <rdunlap@...otime.net>, Tejun Heo <tj@...nel.org>,
	Greg Kroah-Hartman <gregkh@...e.de>,
	Yinghai Lu <yinghai@...nel.org>,
	"H. Peter Anvin" <hpa@...or.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Rusty Russell <rusty@...tcorp.com.au>,
	Hidetoshi Seto <seto.hidetoshi@...fujitsu.com>,
	Jack Steiner <steiner@....com>,
	Frederic Weisbecker <fweisbec@...il.com>, x86@...nel.org,
	Linux Kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] x86_64: Limit the number of processor bootup messages



David Rientjes wrote:
> On Fri, 30 Oct 2009, Mike Travis wrote:
> 
>>>> x86_64: Limit the number of processor bootup messages
>>>>
> 
> Is this really only limited to 64 bit?

[That was a quick edit to change it from SGI X86_64 UV and it didn't
occur to me to remove the _64. :-)]

> 
>>>> With a large number of processors in a system, an excessive number of
>>>> messages is sent to the system console.  It's estimated that with 4096
>>>> processors in a system, and the console baud rate set to 56K, the startup
>>>> messages will take about 84 minutes to clear the serial port.
>>>>
>>>> This set of patches limits the number of repetitious messages which
>>>> contain no additional information.  Much of this information is
>>>> obtainable from /proc and sysfs.  Most of the messages are also sent to
>>>> the kernel log buffer as KERN_DEBUG messages, so the log can be used to
>>>> examine more closely any details specific to a processor.
>>>>
>>>> The list of message transformations....
>>>>
>>>> For system_state == SYSTEM_BOOTING:
>>>>
>>>> 	[   25.388280] Booting Processors 1-7,320-327, Node 0
>>>> 	[   26.064742] Booting Processors 8-15,328-335, Node 1
>>>> 	[   26.837006] Booting Processors 16-31,336-351, Nodes 2-3
>>>> 	[   28.440427] Booting Processors 32-63,352-383, Nodes 4-7
>>>> 	[   31.640450] Booting Processors 64-127,384-447, Nodes 8-15
>>>> 	[   38.041430] Booting Processors 128-255,448-575, Nodes 16-31
>>>> 	[   50.917504] Booting Processors 256-319,576-639, Nodes 32-39
>>>> 	[   90.964169] Brought up 640 CPUs
>>>>
>>>> The range of processors increases as a power of 2, so 4096 CPUs should
>>>> only take 12 lines.
>>>>
> 
> On your particular machine, yes, but there's no x86 restriction on the 
> number of cpus per node.

Yes, my comment is wrong.  The limit would be 10 lines for the current kernel
limit of 512 nodes (the node ranges double in size: node 0, node 1, nodes 2-3,
4-7, ..., 256-511, which is ten groups).

> 
>>>> @@ -671,6 +759,50 @@
>>>> 	complete(&c_idle->done);
>>>> }
>>>>
>>>> +/* Summarize the "Booting processor ..." startup messages */
>>>> +static void __init print_summary_bootmsg(int cpu)
>>>> +{
>>>> +	static int next_node, node_shift;
>>>> +	int node = cpu_to_node(cpu);
>>>> +
>>>> +	if (node >= next_node) {
>>>> +		cpumask_var_t cpulist;
>>>> +
>>>> +		node = next_node;
>>>> +		next_node = 1 << node_shift;
>>>> +		node_shift++;
>>>> +
>>>> +		if (alloc_cpumask_var(&cpulist, GFP_KERNEL)) {
>>>> +			int i, tmp, last_node = node;
>>>> +			char buf[32];
>>>> +
>>>> +			cpumask_clear(cpulist);
>>>> +			for_each_present_cpu(i) {
>>>> +				if (i == 0)	/* boot cpu */
>>>> +					continue;
>>>> +
>>>> +				tmp = cpu_to_node(i);
>>>> +				if (node <= tmp && tmp < next_node) {
>>>> +					cpumask_set_cpu(i, cpulist);
>>>> +					if (last_node < tmp)
>>>> +						last_node = tmp;
>>>> +				}
>>>> +			}
>>>> +			if (cpumask_weight(cpulist)) {
>>>> +				cpulist_scnprintf(buf, sizeof(buf), cpulist);
>>>> +				printk(KERN_INFO "Booting Processors %s,", buf);
>>>> +
>>>> +				if (node == last_node)
>>>> +					printk(KERN_CONT " Node %d\n", node);
>>>> +				else
>>>> +					printk(KERN_CONT " Nodes %d-%d\n",
>>>> +						node, last_node);
>>>> +			}
>>>> +			free_cpumask_var(cpulist);
>>>> +		}
>>>> +	}
>>>> +}
>>>> +
>>>> /*
>>>>  * NOTE - on most systems this is a PHYSICAL apic ID, but on multiquad
>>>>  * (ie clustered apic addressing mode), this is a LOGICAL apic ID.
>>> Why isn't cpumask_of_node() available yet?
>> I'll try that.  It gets a bit tricky in specifying the actual last node that
>> is being booted.
>>
> 
> Why do you need to call print_summary_bootmsg() for each cpu?  It seems 
> like you'd be able to move this out to a single call to a new function:
> 
> void __init print_summary_bootmsg(void)
> {
> 	char buf[128];
> 	int nid;
> 
> 	for_each_online_node(nid) {
> 		const struct cpumask *mask = cpumask_of_node(nid);
> 
> 		if (cpumask_empty(mask))
> 			continue;
> 		cpulist_scnprintf(buf, sizeof(buf), cpumask_of_node(nid));
> 		pr_info("Booting Processors %s, Node %d\n", buf, nid);
> 	}
> }

Well, one thing I did find out: cpumask_of_node() (or more specifically,
node_to_cpumask_map[]) is filled in while the CPUs are booting, not
before.

Also, the above could potentially print 512 lines of boot messages before
booting cpu 1. The printk times also would not be accurate for each group
of cpus.  And there's something to be said about actually doing what it
is you say you are doing. ;-)

	Booting Processors 0-15  Node 0
	Booting Processors 16-31  Node 1
	<Here you expect cpus 0-15 to have already been booted.>

Why not just say:

        cpulist_scnprintf(buf, sizeof(buf), cpu_present_mask);
        pr_info("Booting Processors %s\n", buf);

Since the node -> cpu map can be printed much more efficiently some other way?

For example:

	Nodes 0-7:  0-7,512-519   8-15,520-527   ...

would shrink it to 64 lines max.
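
Something along these lines might do it (a rough, untested sketch; the
function name and the eight-nodes-per-line grouping are only illustrative,
and it assumes node_to_cpumask_map[] has already been populated, which as
noted above isn't the case until the cpus have booted):

	static void __init print_node_cpu_map(void)
	{
		char buf[64];
		int nid, col = 0;

		for_each_online_node(nid) {
			const struct cpumask *mask = cpumask_of_node(nid);

			if (cpumask_empty(mask))
				continue;

			/* start a new output line every eight nodes */
			if (col == 0)
				printk(KERN_INFO "Nodes %d-%d: ", nid, nid + 7);

			cpulist_scnprintf(buf, sizeof(buf), mask);
			printk(KERN_CONT " %s ", buf);

			if (++col == 8) {
				printk(KERN_CONT "\n");
				col = 0;
			}
		}
		if (col)
			printk(KERN_CONT "\n");
	}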

(Note: it's important to include "cpu_present_mask" because cpus can be
powered on in a disabled state and booted later on, to decrease the initial
system startup time.)

A point was made (by AK?) that getting a general sense of progress is
a "good thing".  I wanted to avoid something more mundane like dots or
sequential numbers.  The one thing that Andi mentioned that I haven't
figured out is how to "delay print" specific cpu info in the case of a
boot error.  I suppose one way would be to save the current position in
the kernel log buffer at the start of each cpu boot, and print everything
from that position to the console in case of an error?
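
(Purely hand-waving, something like the sketch below; log_buf_cursor() and
log_buf_replay_from() don't exist, they just stand in for whatever mechanism
would record a position in the kernel log buffer and later dump everything
from that position to the console:)

	/* hypothetical helpers -- not real kernel interfaces */
	static unsigned long cpu_boot_log_start;

	static void __init cpu_boot_begin(int cpu)
	{
		/* remember where this cpu's boot messages start */
		cpu_boot_log_start = log_buf_cursor();
	}

	static void __init cpu_boot_failed(int cpu)
	{
		pr_err("CPU %d failed to come up, its boot messages were:\n", cpu);
		/* replay everything logged since cpu_boot_begin() */
		log_buf_replay_from(cpu_boot_log_start);
	}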

Thanks,
Mike

