linux-kernel - Re: [3.16.1 BISECTED REGRESSION]: Simtec Entropy Key (cdc-acm) broken in 3.16

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87vbmtbrx2.fsf@spindle.srvr.nix>
Date:	Wed, 05 Nov 2014 15:14:49 +0000
From:	Nix <nix@...eri.org.uk>
To:	Johan Hovold <johan@...nel.org>
Cc:	Paul Martin <pm@...ian.org>,
	Daniel Silverstone <dsilvers@...ian.org>,
	Oliver Neukum <oliver@...kum.org>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	linux-kernel@...r.kernel.org, linux-usb@...r.kernel.org
Subject: Re: [3.16.1 BISECTED REGRESSION]: Simtec Entropy Key (cdc-acm) broken in 3.16

On 5 Nov 2014, Johan Hovold stated:

> On Fri, Oct 31, 2014 at 04:44:46PM +0000, Nix wrote:
>> Sorry for the delay: illness and work-related release time flurries.
>> 
>> On 24 Oct 2014, Johan Hovold told this:
>> 
>> > [ +CC: linux-usb ]
>> >
>> > On Wed, Oct 22, 2014 at 04:36:59PM +0100, Nix wrote:
>> >> On 22 Oct 2014, Johan Hovold outgrape:
>> >> 
>> >> > On Wed, Oct 22, 2014 at 10:31:17AM +0100, Nix wrote:
>> >> >> On 14 Oct 2014, Johan Hovold verbalised:
>> >> >> 
>> >> >> > On Sun, Oct 12, 2014 at 10:36:30PM +0100, Nix wrote:
>> >> >> >> I have checked: this code is being executed against a symlink that
>> >> >> >> points to /dev/ttyACM0, and the tcsetattr() succeeds. (At least, it's
>> >> >> >> succeeding on the kernel I'm running now, but of course that's 3.16.5
>> >> >> >> with this commit reverted...)
>> >> >> >
>> >> >> > You could verify that by enabling debugging in the cdc-acm driver and
>> >> >> > making sure that the corresponding control messages are indeed sent on
>> >> >> > close.
>> >> >> 
>> >> >> I have a debugging dump at
>> >> >> <http://www.esperi.org.uk/~nix/temporary/cdc-acm.log>; it's fairly
>> >> >
>> >> > What kernel were you using here? The log seems to suggest that it was
>> >> > generated with the commit in question reverted.
>> >> 
>> >> Look now :) (the previous log is still there in cdc-acm-reverted.log.)
>> >
>> > Unfortunately, it seems the logs are incomplete. There are lots of
>> > entries missing (e.g. "acm_tty_install" when opening, but also some
>> > "acm_submit_read_urb"), some of which were there in your first log.
>> 
>> OK. What we have now in
>> <http://www.esperi.org.uk/~nix/temporary/cdc-acm.log> is a log from the
>> pristine upstream 3.16.6 kernel with cdc-acm debugging turned on and the
>> acm_tty_write - count stuff in acm_tty_write() disabled: I've increased
>> the dmesg buffer size so the top isn't being cut off any more. It took a
>> lot of boots for it to fail this time: about a dozen. The log contains
>> the boot that failed and the one before.
>> 
>> (To my uneducated eye, the initial traffic to/from the key looks *very*
>> different in the second boot. Something is clearly wrong by this point,
>> but that's not much of a surprise!)
>
> The log appears incomplete again, this time it seems the last part is
> completely missing (the device is never closed for example). The device
> still seems to be responding.

Nope -- by the time I clipped the log, the device was definitely
nonresponsive.

I've appended the remaining log, just in case. This is the same as the
snapshot I added to the email itself last time: a close-and-open as I
tried restarting the daemon, and another close as part of system
shutdown.

>> >                                                      What if you
>> > physically disconnect and reconnect the device instead, or simply
>> 
>> That fixes it, in fact it's the only way to fix it once it's broken by
>> this bug.
>
> I didn't mean whether it would get the device working again, but rather
> whether you could kill it this way.

Never seen it fail after a physical disconnection.

>> ... no, that doesn't help. Additional log from that shows a lot of what
>> looks like error returns:
>> 
>> Oct 31 16:38:03 fold kern debug: : [  168.135213] cdc_acm 2-1:1.0: acm_tty_close
>> Oct 31 16:38:08 fold kern debug: : [  173.130531] cdc_acm 2-1:1.0: acm_ctrl_msg - rq 0x22, val 0x0, len 0x0, result -110
>
> Yeah, it seems your device firmware has crashed. It stops responding to
> control requests.

Not surprising: I was fairly sure we were provoking a key firmware crash
or something like that. This is a device with no support for flow
control at all, after all, so I'm not terribly surprised that trying to
do flow control confuses it.

> The above is all normal, but
>
>> Oct 31 16:38:08 fold kern debug: : [  173.161489] cdc_acm 2-1:1.0: acm_ctrl_msg - rq 0x22, val 0x3, len 0x0, result -62
>
> here's another timeout. It's dead.

Again, not surprising.

> Did you get anywhere with trying to look at the device firmware?

Look at it? Only Daniel Silverstone (Cc:ed) can do that. The only copy
of the firmware I have is baked into the sealed key. :)

-- 
NULL && (void)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/