linux-kernel - Re: [2.6.28-rc2] EeePC ACPI errors & exceptions


Date:	Wed, 29 Oct 2008 15:39:34 +0800
From:	Zhao Yakui <yakui.zhao@...el.com>
To:	Alexey Starikovskiy <aystarik@...il.com>
Cc:	Darren Salt <linux@...mustbejoking.demon.co.uk>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>
Subject: Re: [2.6.28-rc2] EeePC ACPI errors & exceptions

On Tue, 2008-10-28 at 13:46 -0700, Alexey Starikovskiy wrote:
> Hi Darren,
> 
> Please check if the patch 
> http://marc.info/?l=linux-acpi&m=122516784917952&w=4
> helps.
In the attached patch the msleep is replaced by udelay again.
   The following commit had replaced the udelay with msleep:
   >commit 1b7fc5aae8867046f8d3d45808309d5b7f2e036a
   >Author: Alexey Starikovskiy <astarikovskiy@...e.de>
   >Date:   Fri Jun 6 11:49:33 2008 -0400
     >ACPI: EC: Use msleep instead of udelay while waiting for event
   
   When the problem happened again, the udelay was restored before the
root cause was found.
   Maybe we should find the root cause of the problem and rework the
work flow of the EC driver. It is inappropriate to make a change and
then revert it again whenever the problem comes back.
   
   At the same time, once msleep is replaced by udelay, the CPU will do
nothing but loop while doing an EC transaction on some laptops (in the
function ec_poll). If 100 EC transactions are done, the CPU will do
nothing but loop for at least 100*2*100 microseconds. In such a case the
performance may be affected.
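
A quick sketch of that arithmetic, for reference. It reads the three
factors as transactions, waits per transaction, and microseconds per
wait; the function and parameter names are hypothetical and do not come
from the EC driver.

```c
/*
 * Lower bound on the time the CPU spends spinning in udelay().
 * All names here are illustrative assumptions, not driver code.
 */
static unsigned long ec_busy_wait_us(unsigned long transactions,
                                     unsigned long waits_per_transaction,
                                     unsigned long udelay_us)
{
        return transactions * waits_per_transaction * udelay_us;
}
```

With the numbers above, ec_busy_wait_us(100, 2, 100) gives 20000
microseconds, i.e. at least 20 ms of pure busy-waiting.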

  After the following commit was merged, the EC transaction is executed
in EC GPE interrupt context on most laptops. Maybe that is easier.
But on some laptops it can't be done in EC GPE interrupt context, so the
driver falls back to EC polling mode (implemented by the function
ec_poll).
    >commit 7c6db4e050601f359081fde418ca6dc4fc2d0011
    >Author: Alexey Starikovskiy <astarikovskiy@...e.de>
    >Date:   Thu Sep 25 21:00:31 2008 +0400
       >ACPI: EC: do transaction from interrupt context
   
   Why is AE_TIME sometimes returned by ec_poll?
static int ec_poll(struct acpi_ec *ec)
{
        unsigned long delay = jiffies + msecs_to_jiffies(ACPI_EC_DELAY);
        msleep(1);
        /*
         * jiffies may already be past the predefined deadline after
         * this msleep(1). In that case -ETIME is returned, and of
         * course the EC transaction can't be finished. IMO this is not
         * reasonable, because the OS had no opportunity to issue the
         * rest of the EC command sequence.
         */
        while (time_before(jiffies, delay)) {
                gpe_transaction(ec, acpi_ec_read_status(ec));
                msleep(1);
                if (ec_transaction_done(ec))
                        return 0;
                /*
                 * Another possible case: the EC transaction is not
                 * finished after msleep(1), but jiffies is already past
                 * the deadline, so -ETIME is returned. IMO this is also
                 * not reasonable.
                 */
        }
        return -ETIME;
}
     At the same time, msleep is implemented with schedule_timeout. On
Linux, even when a process is woken up by some event, it won't be
scheduled immediately. So jiffies may already be past the predefined
timeout after msleep(1).
    Although replacing msleep with udelay makes this issue less likely,
it may still occur if preemption happens at the corresponding place.

    In the above case ETIME is returned by ec_poll. But the reason is
not that the EC controller can't update its status in time. Instead, the
host had no opportunity to issue the next operation of the sequence in
the current work flow, where the whole EC transaction is done in one
big loop.
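
    To make the case concrete, here is a small user-space model of the
loop above. It is only a sketch under assumptions: the struct, the
simulated clock and the helper names are all hypothetical, the EC is
modelled as answering instantly, and jiffies wrap-around is ignored. The
only thing that varies is how many jiffies one msleep(1) really costs.

```c
#include <errno.h>

/* Hypothetical model state; none of this exists in the EC driver. */
struct sim {
        unsigned long jiffies;     /* simulated clock */
        unsigned long sleep_cost;  /* jiffies really consumed by msleep(1) */
        unsigned int steps_needed; /* host-side steps to finish a transaction */
        unsigned int steps_done;   /* steps the host managed to issue */
};

/* Stand-in for msleep(1): the caller may be rescheduled late. */
static void sim_msleep(struct sim *s)
{
        s->jiffies += s->sleep_cost;
}

/* Same control flow as the quoted ec_poll(), on the simulated clock. */
static int sim_ec_poll(struct sim *s, unsigned long timeout)
{
        unsigned long deadline = s->jiffies + timeout;

        sim_msleep(s);
        while (s->jiffies < deadline) {
                s->steps_done++;   /* stands in for gpe_transaction() */
                sim_msleep(s);
                if (s->steps_done >= s->steps_needed)
                        return 0;
        }
        return -ETIME;
}
```

    With a timeout of 5 jiffies and 3 steps needed, a sleep_cost of 1
completes the transaction. A sleep_cost of 6 returns -ETIME with
steps_done still 0, and a sleep_cost of 2 returns -ETIME after only 2
steps, although the modelled EC was never slow.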
    
    Maybe a better solution is to explicitly divide the EC transaction
into several distinct phases.
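
    One possible shape of such a split, for a read transaction, is
sketched below. This is not a patch and not the driver's code: the phase
names and the step function are hypothetical. Only the two status bits
(OBF = 0x01, IBF = 0x02) follow the EC status register layout from the
ACPI specification. The idea is that each phase advances only on the
status bit it is waiting for, so a timeout could be accounted per phase
instead of over the whole loop.

```c
#include <stdint.h>

#define EC_FLAG_OBF 0x01 /* output buffer full: data is ready for the host */
#define EC_FLAG_IBF 0x02 /* input buffer full: EC has not consumed the last byte */

/* Hypothetical phases of a read transaction. */
enum ec_phase {
        EC_PHASE_CMD,   /* waiting to write the command byte */
        EC_PHASE_ADDR,  /* waiting to write the address byte */
        EC_PHASE_DATA,  /* waiting to read the data byte */
        EC_PHASE_DONE
};

/*
 * Return the next phase for the given status byte. The phase is left
 * unchanged while the EC is not ready for the pending operation.
 */
static enum ec_phase ec_phase_step(enum ec_phase phase, uint8_t status)
{
        switch (phase) {
        case EC_PHASE_CMD:
                return (status & EC_FLAG_IBF) ? phase : EC_PHASE_ADDR;
        case EC_PHASE_ADDR:
                return (status & EC_FLAG_IBF) ? phase : EC_PHASE_DATA;
        case EC_PHASE_DATA:
                return (status & EC_FLAG_OBF) ? EC_PHASE_DONE : phase;
        default:
                return EC_PHASE_DONE;
        }
}
```

    A caller would read the status, call ec_phase_step(), do the
register write or read that belongs to the phase it just left, and
restart its timeout only when the phase changes.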

    Maybe my analysis is not correct. If so, please correct me.
Comments are welcome.

    thanks.
    
    
     
> Thanks,
> Alex.
> 
> 
