linux-kernel - Re: Re-implement MCE log ring buffer as per-CPU ring buffer

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <1240552261.6842.857.camel@yhuang-dev.sh.intel.com>
Date:	Fri, 24 Apr 2009 13:51:01 +0800
From:	Huang Ying <ying.huang@...el.com>
To:	Andi Kleen <ak@...ux.intel.com>
Cc:	Ingo Molnar <mingo@...e.hu>, "H. Peter Anvin" <hpa@...or.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: Re-implement MCE log ring buffer as per-CPU ring buffer

On Wed, 2009-04-22 at 19:11 +0800, Andi Kleen wrote:
> Huang Ying wrote:
> 
> Basic concept is good and it fixes some long standing problems.
> 
> But some details need to be improved I think:
> 
> > +/*
> > + * Initialize MCE per-CPU log buffer
> > + */
> > +static void mce_log_init(void)
> > +{
> > +	int cpu;
> > +	struct mce_log_cpu *mcelog_cpu;
> > +
> > +	if (mcelog.log_cpus)
> > +		return;
> > +	mcelog.log_cpus = kmalloc(num_possible_cpus() * sizeof(mcelog_cpu),
> > +				  GFP_KERNEL);
> > +	for_each_possible_cpu(cpu) {
> > +		mcelog_cpu = &per_cpu(mcelog_cpus, cpu);
> > +		mcelog.log_cpus[cpu] = mcelog_cpu;
> > +	}
> 
> I don't understand why the separate allocation. Can't you just put
> the whole log buffer directly into the per cpu data?
> 
> This would also make the initialization cleaner because you would need to only
> initialize state for the currently initialized CPU.

The original intention to use mcelog.log_cpus is to make it easier to
find per-cpu buffers from memory image produced by kdump. But is seems
that something as follow is just sufficient.

mcelog.log_cpus = alloc_percpu(struct mce_log_cpu);

Is this what you suggested?

> 
> > +			while (!mcelog_cpu->entry[i].finished) {
> > +				rdtscll(now);
> > +				if (now - start > WRITER_TIMEOUT_CYC) {
> > +					memset(mcelog_cpu->entry + i, 0,
> >  					       sizeof(struct mce));
> > +					head = mcelog_cpu->head;
> >  					goto timeout;
> 
> This timeout should be reported somehow, perhaps with a printk.
> Also it's problematic to hardcode cycles here, i think it would
> be better to use a similar scheme as the Monarch timeout
> with a ndelay() and a spinunit. That is guaranteed to stay
> the same even as CPU frequencies change.

Yes. I will change this.

> > +static ssize_t mce_read(struct file *filp, char __user *inubuf, size_t usize,
> > +			loff_t *off)
> > +{
> > +	char __user *ubuf = inubuf;
> > +	struct mce_log_cpu *mcelog_cpu;
> > +	int cpu, new_mce, err = 0;
> > +	static DEFINE_MUTEX(mce_read_mutex);
> > +	size_t usize_limit;
> > +
> > +	/* Too large user buffer size may cause system not response */
> 
> Did you understand why that happens? It's worrying.

The loop in mce_read() does not release the CPU in non-preempt kernel.
If the user buffer size is too large, it may cause the CPU to be
soft-lock-upped or not response. One way to solve the issue is to
constrain the size of user buffer, another way may be insert a
schedule() in the loop. 

> > +static ssize_t show_log_flags(struct sys_device *s,
> > +			      struct sysdev_attribute *attr,
> > +			      char *buf)
> > +{
> > +	int cpu = s->id;
> > +	struct mce_log_cpu *mcelog_cpu = &per_cpu(mcelog_cpus, cpu);
> > +	unsigned flags;
> > +	do {
> > +		flags = mcelog_cpu->flags;
> > +	} while (cmpxchg((unsigned *)&mcelog_cpu->flags, flags, 0) != flags);
> > +	return sprintf(buf, "0x%x\n", flags);
> > +}
> > +
> >  static SYSDEV_ATTR(trigger, 0644, show_trigger, set_trigger);
> >  static SYSDEV_INT_ATTR(tolerant, 0644, tolerant);
> >  ACCESSOR(check_interval,check_interval,mce_restart())
> > +static SYSDEV_ATTR(log_flags, 0644, show_log_flags, NULL);
> 
> Do we really need that interface? It seems rather obscure.
> mcelog should get these flags anyways. Better to drop it.

For overflow flag, it seems not very useful. I will drop this.

Best Regards,
Huang Ying


Download attachment "signature.asc" of type "application/pgp-signature" (198 bytes)