lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1295019849.2655.8.camel@work-vm>
Date:	Fri, 14 Jan 2011 07:44:09 -0800
From:	john stultz <johnstul@...ibm.com>
To:	Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	"Kirill A. Shutemov" <kirill@...temov.name>,
	Jeremy Fitzhardinge <jeremy@...p.org>,
	Stefano Stabellini <stefano.stabellini@...citrix.com>,
	Ian Campbell <Ian.Campbell@...citrix.com>, mingo@...hat.com,
	hpa@...or.com, linux-kernel@...r.kernel.org, andi@...stfloor.org,
	williams@...hat.com, schwidefsky@...ibm.com, mingo@...e.hu,
	linux-tip-commits@...r.kernel.org
Subject: Re: [PATCH] acpi/pm: If failed at validating ACPI PM timer,
 inhibit future reads.

On Fri, 2011-01-14 at 09:09 -0500, Konrad Rzeszutek Wilk wrote:
> On Thu, Jan 13, 2011 at 11:15:16PM +0100, Thomas Gleixner wrote:
> > On Thu, 13 Jan 2011, Konrad Rzeszutek Wilk wrote:
> > 
> > > tgl, John,
> > > 
> > > Should I push this to Linus or are you guys going to push
> > > this patch during this merge window?
> > 
> > Wait a moment. This patch is fresh of the press and not that urgent,
> > really.
> 
> It is a regression compared to 2.6.37 kernel. I don't know the
> urgency requirements for regressions but I figured the earlier the
> better.
> 
> >  
> > >     I've traced it down to the fact that when we boot under Xen we do
> > >     not have the HPET enabled nor the ACPI PM timer setup. The
> > 
> > Crap. If Xen would not have setup the pm timer then it would not even
> > reach the consistency check. It would simply bail out via
> 
> Keep in mind that Linux (under Xen) does see the ACPI PM-Timer at bootup
> (it parses the ACPI tables), and when it tries to actually read the 
> values, so past this point:
> 
> > 
> >         if (!pmtmr_ioport)
> >                 return -ENODEV;
> > 
> 
> .. it fails at:
>    if (i == ACPI_PM_READ_CHECKS) {
> 
> and returns -ENODEV. So pmtmr_ioport was still valid at that time.
> 
> > and the whole misery would not have happened at all. Though it's a
> > Good Thing that Xen is so screwed as it points to a real flaw which
> > might happen on real hardware as well. See below
> > 
> > >     hpet_enable() is never called (b/c xen_time_init is called), and
> > >     for calibration of tsc_khz (calibrate_tsc == xen_tsc_khz) we
> > >     get a valid value.
> > >     
> > >     So 'tsc_read_refs' tries to read the ACPI PM timer (acpi_pm_read_early),
> > >     however that is disabled under Xen:
> > >     
> > >     [    1.099272] calling  init_acpi_pm_clocksource+0x0/0xdc @ 1
> > >     [    1.140186] PM-Timer failed consistency check  (0x0xffffff) - aborting.
> > >     
> > >     So the tsc_calibrate_check gets called, it can't do HPET, and reading
> > >     from ACPI PM timer results in getting 0xffffff.. .. and
> > >     (0xffff..-0xffff..)/some other value results in div_zero.
> > 
> > Nonsense. 0/(some other value) does not result in a divide by zero
> > except "some other value" is zero.
> 
> <scratches his head> You are right. 

The (0xffff - 0xffff) bit ends up as the divisor in calc_pmtmr_ref.

> > 
> > >     There is a check in 'tsc_refine_calibration_work' for invalid
> > >     values:
> > >     
> > >        /* hpet or pmtimer available ? */
> > >              if (!hpet && !ref_start && !ref_stop)
> > >                       goto out;
> > >     
> > >     But since ref_start and ref_stop have 0xffffff it does not trigger.
> > >
> > >     This little fix makes the read to be 0 and the check triggers.
> > 
> > First of all the patch disables the pm_timer completely, which happens
> > to results in a 0 read as a side effect. But the main point of this
> 
> I does not look like a side-effect. Specifically:
> 
> static inline u32 acpi_pm_read_early(void)
> {
>         if (!pmtmr_ioport)
>                 return 0;
> 
> 	 return acpi_pm_read_verified() & ACPI_PM_MASK;
> }
> 
> .. ends up taking the !pmtmr_ioport path which is what
> tsc_refine_calibration_work has a check for.
> 
> > fix is to disable pmtimer in case of failure in the init function
> > completely.
> > 
> > Further there are several error conditions in this init function and
> > we really need to disable pmtimer for all of them not just for the
> > case you encountered.
> 
> Good point. What about this patch? John, is it OK if I carry
> your Ack-by on this modified patch?

I'm actually looking at a different fix, as I'm worried by Thomas'
comment about hitting the same issue on real hardware if we catch the
same pmtrm value both times.

thanks
-john




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ