Message-ID: <20210528154339.GA9116@suse.com>
Date:   Fri, 28 May 2021 17:43:39 +0200
From:   Vojtech Pavlik <vojtech@...e.com>
To:     Egor Ignatov <egori@...linux.org>
Cc:     linux-kernel@...r.kernel.org
Subject: Re: Problem with i8042 and PS/2 keyboard on HP laptop

On Fri, May 28, 2021 at 05:02:53PM +0300, Egor Ignatov wrote:

Hello Egor,

> I have a problem with the PS/2 keyboard on an HP laptop
> (15s-fq2020ur).  The problem is that after booting the
> system, the keyboard does not work.  But it starts working
> about 10 seconds after pressing any key.
> 
> I looked at the i8042 log and it seems to me that the
> problem is that the driver does not wait for a response to
> the GETID. It receives ACK and immediately sends the
> 0xed command without waiting for ID.

Actually, that's not the case if you look at the logs:

Here we send the GETID command

> [    0.460964] i8042: [1] f2 -> i8042 (kbd-data)

And here we get the ACK for the command back, 10ms later.

> [    0.471708] i8042: [12] fa <- i8042 (interrupt, 0, 1)

Here we wait for half a second, as you can see from the timestamps, and
nothing arrives. No ID data from the keyboard at all.

So here we see that the GETID command timed out, so we try a backup
plan. Some very old keyboards don't support GETID, so we try SETLEDS,
which every keyboard should support.

> [    0.977581] i8042: [518] ed -> i8042 (kbd-data)

[ .... crickets .... ]

There is no answer at all. We should at least get an 'fa' response here,
so that we can send the parameter of the command.

We wait for roughly another 200ms (going by the timestamps) and nothing
at all arrives.

And so the atkbd_probe() function gives up and returns failure.
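
The timing argument above can be checked mechanically from the quoted
log lines. The following is only a small sketch: the regular expression
and the helper names (parse, gaps_ms) are made up for this illustration
and are not part of any kernel tooling.

```python
import re

# Matches the i8042 debug lines quoted in this thread (format assumed
# from the quotes above, not taken from the driver source).
LINE_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\] i8042: \[\d+\] (?P<byte>[0-9a-f]{2}) "
    r"(?P<dir>->|<-) i8042 \((?P<info>[^)]*)\)"
)

LOG = """\
[    0.460964] i8042: [1] f2 -> i8042 (kbd-data)
[    0.471708] i8042: [12] fa <- i8042 (interrupt, 0, 1)
[    0.977581] i8042: [518] ed -> i8042 (kbd-data)
"""


def parse(log):
    """Return a list of (timestamp_seconds, byte, direction) tuples."""
    events = []
    for line in log.splitlines():
        m = LINE_RE.search(line)
        if m:
            events.append((float(m["ts"]), m["byte"], m["dir"]))
    return events


def gaps_ms(events):
    """Return (previous_byte, current_byte, gap_in_ms) for adjacent events."""
    return [
        (prev[1], cur[1], round((cur[0] - prev[0]) * 1000, 1))
        for prev, cur in zip(events, events[1:])
    ]


events = parse(LOG)
gaps = gaps_ms(events)
for a, b, ms in gaps:
    print(f"{a} -> {b}: {ms} ms")
```

For these three lines it reports about 10.7 ms between the GETID and
its ACK, and about 505.9 ms of silence before the SETLEDS fallback.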

Then it's back to i8042.c's i8042_port_close();

And it issues the WCTR command with 0x64 as a parameter, to disable the
keyboard IRQ (clearing the KBDINT bit, i.e. KBDINT=0).

> [    1.185586] i8042: [726] 60 -> i8042 (command)
> [    1.185686] i8042: [726] 64 -> i8042 (parameter)

And then i8042.c enables the interrupt again, to look for hotplug
(setting KBDINT=1):

> [    1.185842] i8042: [726] 60 -> i8042 (command)
> [    1.185935] i8042: [726] 65 -> i8042 (parameter)
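
To make the difference between the two parameter bytes concrete, here is
a small sketch that decodes them. The bit names are meant to mirror the
I8042_CTR_* constants in the kernel's i8042.h as I remember them; treat
the table as an assumption and check it against the header. The helper
decode_ctr is hypothetical, and bit 0x04 is deliberately left unnamed.

```python
# Control-register bits, assumed to match the I8042_CTR_* constants
# in drivers/input/serio/i8042.h. Bit 0x04 is not named here.
CTR_BITS = {
    0x01: "KBDINT",
    0x02: "AUXINT",
    0x08: "IGNKEYLOCK",
    0x10: "KBDDIS",
    0x20: "AUXDIS",
    0x40: "XLATE",
}


def decode_ctr(value):
    """Return the set of known bit names that are set in 'value'."""
    return {name for bit, name in CTR_BITS.items() if value & bit}


before = decode_ctr(0x64)
after = decode_ctr(0x65)
print(sorted(before))          # known bits set in 0x64
print(sorted(after - before))  # bits that 0x65 adds on top of 0x64
```

The only named bit that differs between 0x64 and 0x65 is KBDINT, which
is the interrupt enable being toggled here.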

And oh wow, once we kicked the controller by toggling the interrupt
disable/enable, see what's coming in!

The GETID response!

> [    1.185975] i8042: [726] ab <- i8042 (interrupt, 0, 0)

But something is suspicious here, the "0, 0". The last number is the
interrupt number and the KBD port always uses IRQ1.

So this comes from manually checking the port for waiting data by
calling i8042_interrupt(0, NULL); at the end of i8042_port_close().
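
If you want to pick such polled reads out of a longer log, something
like the following sketch works. It assumes, as described above, that
the last number in "(interrupt, port, irq)" is 0 when the byte was
fetched by the manual check rather than by a real IRQ. The function
name classify_reads is made up for this example.

```python
import re

# Matches incoming bytes only; the last field is the IRQ number.
READ_RE = re.compile(
    r"(?P<byte>[0-9a-f]{2}) <- i8042 "
    r"\(interrupt, (?P<port>\d+), (?P<irq>\d+)\)"
)

LOG = """\
[    1.185975] i8042: [726] ab <- i8042 (interrupt, 0, 0)
[    1.189909] i8042: [730] 83 <- i8042 (interrupt, 0, 1)
"""


def classify_reads(log):
    """Label each incoming byte as 'polled' (irq 0) or 'irq'."""
    out = []
    for line in log.splitlines():
        m = READ_RE.search(line)
        if m:
            how = "polled" if int(m["irq"]) == 0 else "irq"
            out.append((m["byte"], how))
    return out


reads = classify_reads(LOG)
for byte, how in reads:
    print(byte, how)
```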

And the controller that got stuck after the GETID command is unstuck
again and properly generates an interrupt for the 2nd byte of the GETID
response:

> [    1.189909] i8042: [730] 83 <- i8042 (interrupt, 0, 1)

Yay, we got that.

Now an incoming byte on the KBD port triggers a hotplug event: we think
there may be a new keyboard plugged in.

So we repeat the detection sequence of atkbd again, sending the GETID
command:

> [    1.189952] i8042: [730] f2 -> i8042 (kbd-data)

And we get a proper ACK response:

> [    1.200096] i8042: [740] fa <- i8042 (interrupt, 0, 1)

But what the hell, there is one more ACK coming that shouldn't have:

> [    1.204012] i8042: [744] fa <- i8042 (interrupt, 0, 1)

So we bail out. An ID of 0xfa is not a keyboard!

Back to i8042.c, we toggle the interrupt enable bit:

> [    1.204031] i8042: [744] 60 -> i8042 (command)
> [    1.204124] i8042: [744] 64 -> i8042 (parameter)
> [    1.204272] i8042: [744] 60 -> i8042 (command)
> [    1.204364] i8042: [744] 65 -> i8042 (parameter)

But there's nothing waiting for us, so nothing else is happening.

> At this point it doesn't do anything until you press a key.
> Then the driver starts sending GETID repeatedly until at
> some point it gets the correct answer, after which the
> keyboard starts working. As I said it takes about 10 secs.
> 
> Here is a part of the log after pressing a key:
> 

> [   11.103249] i8042: [10643] 1d <- i8042 (interrupt, 0, 1)

Indeed, a keypress means new bytes coming in, so this is a new hotplug
event - and we try to detect if there is a keyboard:

> [   11.103287] i8042: [10643] f2 -> i8042 (kbd-data)
> [   11.113673] i8042: [10654] fa <- i8042 (interrupt, 0, 1)
> [   11.113719] i8042: [10654] ab <- i8042 (interrupt, 0, 1)

And something goes awry again. We're supposed to get 'fa ab 83', not
just 'fa ab'.
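
The different ways the GETID exchange goes wrong in this log can be
summed up in a toy classifier. This is only a model of the expectation
stated above (ACK 0xfa followed by the ID bytes ab 83), not the actual
logic of atkbd_probe(); the function name and the result strings are
invented for the illustration.

```python
ACK = 0xFA
EXPECTED_ID = (0xAB, 0x83)  # the ID this keyboard reports in the log


def classify_getid(reply):
    """Classify the bytes received after sending GETID (0xf2)."""
    if not reply or reply[0] != ACK:
        return "no-ack"
    ident = tuple(reply[1:3])
    if len(ident) < 2:
        # A wrong first ID byte is already a failure, otherwise
        # the reply is merely cut short (the timeout case).
        if ident and ident[0] != EXPECTED_ID[0]:
            return "bad-id"
        return "incomplete"
    return "ok" if ident == EXPECTED_ID else "bad-id"


print(classify_getid([0xFA, 0xAB, 0x83]))  # full reply
print(classify_getid([0xFA, 0xAB]))        # second ID byte lost
print(classify_getid([0xFA, 0xFA]))        # stray ACK read as ID
```

The three calls correspond to the successful probe, the 'fa ab' case
that times out, and the earlier case where a second ACK shows up in
place of the ID.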

So we wait and time out 0.5 seconds later. We fall back to trying the
SETLEDS command again.

> [   11.617485] i8042: [11158] ed -> i8042 (kbd-data)

And we don't even get an ACK. The keyboard controller is stuck again.
Ouch.

> [   11.825485] i8042: [11366] 60 -> i8042 (command)
> [   11.825778] i8042: [11366] 64 -> i8042 (parameter)
> [   11.825924] i8042: [11366] 60 -> i8042 (command)
> [   11.826016] i8042: [11366] 65 -> i8042 (parameter)

So we're back in closing the port in i8042.c. We toggled the line, and
we check for any data in the data port:

> [   11.826049] i8042: [11366] 83 <- i8042 (interrupt, 0, 0)

Yes, like before, the 0x83 was waiting there for us and was blocking the
data port for any further communication.

> [   11.830084] i8042: [11370] fa <- i8042 (interrupt, 0, 1)

And another ACK was waiting there, too, probably from the SETLEDS
command. This time, however, we're lucky and manage to read the ACK
before we start reinitializing the keyboard.

So we send a GETID:

> [   11.830107] i8042: [11370] f2 -> i8042 (kbd-data)

Get an ACK:

> [   11.840241] i8042: [11380] fa <- i8042 (interrupt, 0, 1)

And I have no idea where this next byte is coming from. Possibly still
the keypress ... ?

> [   11.844063] i8042: [11384] 38 <- i8042 (interrupt, 0, 1)

Nevertheless, it's not a valid ID, so we bail out again.

We toggle the interrupt pin.

> [   11.844083] i8042: [11384] 60 -> i8042 (command)
> [   11.844174] i8042: [11384] 64 -> i8042 (parameter)
> [   11.844320] i8042: [11384] 60 -> i8042 (command)
> [   11.844413] i8042: [11384] 65 -> i8042 (parameter)

And this time there is no data stuck there. But some comes later via the
normal interrupt path (still no idea what the keyboard is trying to tell
us, maybe more keypresses):

> [   11.849039] i8042: [11389] 3c <- i8042 (interrupt, 0, 1)

And we try to identify the keyboard ....

> [   11.849059] i8042: [11389] f2 -> i8042 (kbd-data)
> [   11.859198] i8042: [11399] fa <- i8042 (interrupt, 0, 1)
> [   12.361490] i8042: [11902] ed -> i8042 (kbd-data)
> ...
> [   27.516138] i8042: [27455] f2 -> i8042 (kbd-data)
> [   27.526395] i8042: [27466] fa <- i8042 (interrupt, 0, 1)
> [   27.531044] i8042: [27471] fa <- i8042 (interrupt, 0, 1)
> [   27.531080] i8042: [27471] 60 -> i8042 (command)
> [   27.531183] i8042: [27471] 64 -> i8042 (parameter)
> [   27.531336] i8042: [27471] 60 -> i8042 (command)
> [   27.531713] i8042: [27471] 65 -> i8042 (parameter)
> [   27.536215] i8042: [27476] 1d <- i8042 (interrupt, 0, 1)
> **HERE IT FINALLY RECEIVES THE CORRECT RESPONSE**

And indeed, later the sequence finally succeeds:

> [   27.536290] i8042: [27476] f2 -> i8042 (kbd-data)
> [   27.546882] i8042: [27487] fa <- i8042 (interrupt, 0, 1)
> [   27.546940] i8042: [27487] ab <- i8042 (interrupt, 0, 1)
> [   27.546997] i8042: [27487] 83 <- i8042 (interrupt, 0, 1)

We get the correct ID and we proceed to RESET_DIS to prevent any
keypresses messing up our further communication with the keyboard:

> [   27.547018] i8042: [27487] f5 -> i8042 (kbd-data)
> [   27.557566] i8042: [27497] fa <- i8042 (interrupt, 0, 1)

We then turn the LEDs off:

> [   27.557615] i8042: [27497] ed -> i8042 (kbd-data)
> [   27.568242] i8042: [27508] fa <- i8042 (interrupt, 0, 1)
> [   27.568294] i8042: [27508] 00 -> i8042 (kbd-data)
> [   27.578730] i8042: [27518] fa <- i8042 (interrupt, 0, 1)

Set the repeat rate:

> [   27.578785] i8042: [27518] f3 -> i8042 (kbd-data)
> [   27.589151] i8042: [27529] fa <- i8042 (interrupt, 0, 1)
> [   27.589206] i8042: [27529] 00 -> i8042 (kbd-data)
> [   27.599602] i8042: [27539] fa <- i8042 (interrupt, 0, 1)

And finally enable the keyboard for use.

> [   27.599676] i8042: [27539] f4 -> i8042 (kbd-data)
> [   27.609986] i8042: [27550] fa <- i8042 (interrupt, 0, 1)
> 
> Any idea what to do about this?

So the problem is not that the driver fails to wait for the GETID
answer. It actually waits for a long, long time.

It's the virtual i8042 keyboard controller implemented in the BIOS that
has an issue: it does not properly deliver interrupts when the keyboard
sends three bytes (fa ab 83) in too quick succession.

You can try experimenting with the 'noaux', 'nomux', 'dumbkbd' and
'kbdreset' options of i8042, and also the 'reset' option of 'atkbd'.

This will change the init sequence and there is a chance it'll stop
tickling the virtual i8042 controller in the laptop the wrong way.
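
To go through the options systematically, you could generate the
combinations to try, one boot per line. This is just a convenience
sketch: it assumes both drivers are built in, so that the options are
passed on the kernel command line in the usual 'module.option' form,
and candidate_cmdlines is a made-up helper name.

```python
from itertools import combinations

# Options named above, in 'module.option' kernel command-line form.
I8042_OPTS = ["i8042.noaux", "i8042.nomux", "i8042.dumbkbd", "i8042.kbdreset"]
ATKBD_OPTS = ["atkbd.reset"]


def candidate_cmdlines(max_opts=2):
    """Return command-line fragments using 1..max_opts options each."""
    opts = I8042_OPTS + ATKBD_OPTS
    out = []
    for n in range(1, max_opts + 1):
        for combo in combinations(opts, n):
            out.append(" ".join(combo))
    return out


cmdlines = candidate_cmdlines()
for fragment in cmdlines:
    print(fragment)
```

Trying single options first and pairs afterwards keeps the number of
reboots manageable and makes it easier to tell which option actually
made the difference.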

If that helps, there is a quirk table in i8042 to enable these options
based on the DMI data of the laptop automatically.

If it doesn't help, then we'd need to find a workaround to recover
from the lost IRQ situation without giving up on keyboard detection.

Possibly by signalling the detection timeout from atkbd.c back to
i8042.c to check for a stuck byte in the queue.

Vojtech

-- 
Vojtech Pavlik
VP Linux Systems Group, SUSE 
