linux-kernel - Re: [PATCH 0/7] HID: picoLCD updates

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20120818144039.0356cc7c@neptune.home>
Date:	Sat, 18 Aug 2012 14:40:39 +0200
From:	Bruno Prémont <bonbons@...ux-vserver.org>
To:	Jiri Kosina <jkosina@...e.cz>
Cc:	linux-input@...r.kernel.org, linux-kernel@...r.kernel.org,
	linux-fbdev@...r.kernel.org, Alan Stern <stern@...land.harvard.edu>
Subject: Re: [PATCH 0/7] HID: picoLCD updates

Hi Jiri,

[CCing Alan Stern]

On Thu, 16 August 2012 Jiri Kosina <jkosina@...e.cz> wrote:
> On Thu, 16 Aug 2012, Bruno Prémont wrote:
> 
> > > I don't really understand this explanation. Once usb_kill_urb() returns, 
> > > the URB should be available for future use (and therefore all queues 
> > > completely drained).
> > 
> > I won't have time today to check, though my guess is that on each
> > echo $usb-id > bind; echo $usb-id > unbind
> > under /sys/bus/hid/drivers/hid-picolcd/ the USB urb queue fills a bit does
> > not get cleared.
> > 
> > Is usb_kill_urb() called when unbinding just the specific hid driver? 
> 
> Yup, through hid_hw_stop() -> usbhid_stop().
> 
> > If so my short timing between bind/unbind must be triggering something 
> > else...
> > 
> > Otherwise I'm missing something as at first time I got no "output queue full"
> > messages, but as I repeated the bind/unbind sequences the prints per bind/unbind
> > iteration increased in number.
> > 
> > Anyhow, on Friday evening/week-end I will continue digging and report back with my
> > findings.

Huh, after changing some of the hid-picolcd data in order to have less racy
coupling between hid and framebuffer I'm now dying way too often in _mmx_memcpy
and most of the time I don't get a (complete) trace...

The only full trace I got was:
[ 3857.426136] BUG: unable to handle kernel paging request at dbdf9000
[ 3857.432555] IP: [<c11e2be7>] _mmx_memcpy+0x27/0x16c
[ 3857.435906] *pde = 1bebb063 *pte = 1bdf9161 
[ 3857.435906] Oops: 0003 [#1] 
[ 3857.435906] Modules linked in: hid_picolcd fb_sys_fops sysimgblt sysfillrect syscopyarea drm_kms_helper nfs3 nfs_acl nfs lockd sunrpc
[ 3857.435906] Pid: 1935, comm: bash Not tainted 3.6.0-rc1-jupiter-00363-g0e8ccbc #1 NVIDIA Corporation. nFORCE-MCP/MS-6373
[ 3857.435906] EIP: 0060:[<c11e2be7>] EFLAGS: 00010013 CPU: 0
[ 3857.435906] EIP is at _mmx_memcpy+0x27/0x16c
[ 3857.435906] EAX: dd40a000 EBX: dd6ea010 ECX: 035af77b EDX: dbdf4230
[ 3857.435906] ESI: dbe0d1b0 EDI: dbdf9000 EBP: dd40bea8 ESP: dd40be8c
[ 3857.435906]  DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
[ 3857.435906] CR0: 8005003b CR2: dbdf9000 CR3: 1d7ee000 CR4: 000007d0
[ 3857.435906] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 3857.435906] DR6: ffff0ff0 DR7: 00000400
[ 3857.435906] Process bash (pid: 1935, ti=dd40a000 task=dcdb6120 task.ti=dbce2000)
[ 3857.435906] Stack:
[ 3857.435906]  0d6d6d6f dd6eb49c dd40beb8 dbde0080 dd6ea010 dbde0080 dbd31010 dd40bed0
[ 3857.435906]  c133b364 00000000 00000000 ddfaa0e0 dbd31010 dbdf4230 dd6ea010 00000046
[ 3857.435906]  dbd31010 dd40bef0 c133bea0 00000000 dbd0c3f0 00000000 dbd0c3f0 00000000
[ 3857.435906] Call Trace:
[ 3857.435906]  [<c133b364>] hid_submit_out+0xa4/0x130
[ 3857.435906]  [<c133bea0>] hid_irq_out+0xa0/0x100
[ 3857.435906]  [<c12e425e>] usb_hcd_giveback_urb+0x4e/0x90
[ 3857.435906]  [<c12f9bd2>] finish_urb+0xb2/0xf0
[ 3857.435906]  [<c12fa67d>] takeback_td+0x3d/0x100
[ 3857.435906]  [<c12faccf>] dl_done_list+0x14f/0x180
[ 3857.435906]  [<c12fc351>] ohci_irq+0x191/0x300
[ 3857.435906]  [<c1077790>] ? unmask_irq+0x20/0x20
[ 3857.435906]  [<c12e3ace>] usb_hcd_irq+0x1e/0x40
[ 3857.435906]  [<c107549d>] handle_irq_event_percpu+0x6d/0x1c0
[ 3857.435906]  [<c1077790>] ? unmask_irq+0x20/0x20
[ 3857.435906]  [<c1075618>] handle_irq_event+0x28/0x40
[ 3857.435906]  [<c107781a>] handle_fasteoi_irq+0x8a/0xe0
[ 3857.435906]  <IRQ> 
[ 3857.435906]  [<c1003e0a>] ? do_IRQ+0x3a/0xb0
[ 3857.435906]  [<c140c870>] ? common_interrupt+0x30/0x38
[ 3857.435906]  [<c1035143>] ? __do_softirq+0x53/0x1c0
[ 3857.435906]  [<c10350f0>] ? local_bh_enable_ip+0x80/0x80
[ 3857.435906]  <IRQ> 
[ 3857.435906]  [<c103544e>] ? irq_exit+0x3e/0xa0
[ 3857.435906]  [<c101dfde>] ? smp_apic_timer_interrupt+0x6e/0x80
[ 3857.435906]  [<c140bee1>] ? apic_timer_interrupt+0x31/0x38
[ 3857.435906]  [<c102fdba>] ? vprintk_emit+0x34a/0x390
[ 3857.435906]  [<c1409f58>] ? printk+0x38/0x3a
[ 3857.435906]  [<de94b021>] ? picolcd_init_framebuffer+0x261/0x350 [hid_picolcd]
[ 3857.435906]  [<de949bd1>] ? picolcd_probe+0x3d1/0x5a0 [hid_picolcd]
[ 3857.435906]  [<c13321f7>] ? hid_device_probe+0x67/0xf0
[ 3857.435906]  [<c128ac17>] ? driver_sysfs_add+0x57/0x80
[ 3857.435906]  [<c128af1d>] ? driver_probe_device+0xbd/0x1d0
[ 3857.435906]  [<c1331aab>] ? hid_match_device+0x7b/0x90
[ 3857.435906]  [<c1289e65>] ? driver_bind+0x75/0xd0
[ 3857.435906]  [<c1289df0>] ? driver_unbind+0x90/0x90
[ 3857.435906]  [<c1289537>] ? drv_attr_store+0x27/0x30
[ 3857.435906]  [<c111a59c>] ? sysfs_write_file+0xac/0xf0
[ 3857.435906]  [<c10cca0c>] ? vfs_write+0x9c/0x130
[ 3857.435906]  [<c10da2ef>] ? sys_dup3+0x11f/0x160
[ 3857.435906]  [<c111a4f0>] ? sysfs_poll+0x90/0x90
[ 3857.435906]  [<c10ccc7d>] ? sys_write+0x3d/0x70
[ 3857.435906]  [<c140c357>] ? sysenter_do_call+0x12/0x26
[ 3857.435906] Code: 90 90 90 90 55 89 e5 57 56 89 d6 53 83 ec 10 89 45 f0 89 4d e4 89 e0 25 00 e0 ff ff f7 40 14 00 ff ff 07 74 17 c1 e9 02 8b 7d f0 <f3> a5 8b 4d e4 83 e1 03 74 02 f3 a4 e9 27 01 00 00 8b 45 1
[ 3857.435906] EIP: [<c11e2be7>] _mmx_memcpy+0x27/0x16c SS:ESP 0068:dd40be8c
[ 3857.435906] CR2: 00000000dbdf9000
[ 3857.435906] ---[ end trace 10ae520e65e3b763 ]---
[ 3857.435906] Kernel panic - not syncing: Fatal exception in interrupt

It happens around the time of calling
  device_create_file(dev, &dev_attr_fb_update_rate);
that is, after picolcd_fb_reset(data, 1) has returned but before
calling fb_deferred_io_init(info); according to printk() I put
around the various calls in picolcd_init_framebuffer().

The other case I got part of a trace it was again in _mmx_memcpy()
but the trace output stopped in the middle of register dumping.


The full trace happened after the following sequence:
- system boot
- hotplug of picolcd (triggering modprobe hid-picolcd)
- cd /sys/bus/hid/drivers/hid-picolcd/
- echo 0003:04D8:C002.0003 > unbind
- wait some time
- echo 0003:04D8:C002.0003 > bind

Any idea what it could be? It might be during processing of first
reports send in order to clear the picolcd display.




As (un)binding in hid-picolcd crashes I tried doing so a level above,
at usbhid driver on usb bus which resulted in the following:
- system boot
- hotplug of picolcd (triggering modprobe hid-picolcd)
- cd /sys/bus/usb/drivers/usbhid/
- echo 2-1:1.0 > unbind
- wait some time
- echo 2-1:1.0 > bind
- wait some time
- echo 2-1:1.0 > unbind
produced the following trace:
[ 1418.387466] BUG: unable to handle kernel NULL pointer dereference at 00000030
[ 1418.390013] IP: [<c1339c40>] hid_submit_ctrl+0x80/0x2a0
[ 1418.390013] *pde = 00000000 
[ 1418.390013] Oops: 0002 [#1] 
[ 1418.390013] Modules linked in: hid_picolcd fb_sys_fops sysimgblt sysfillrect syscopyarea drm_kms_helper nfs3 nfs_acl nfs lockd sunrpc
[ 1418.390013] Pid: 1994, comm: kworker/0:0 Not tainted 3.6.0-rc1-jupiter-00363-g0e8ccbc #1 NVIDIA Corporation. nFORCE-MCP/MS-6373
[ 1418.390013] EIP: 0060:[<c1339c40>] EFLAGS: 00010086 CPU: 0
[ 1418.390013] EIP is at hid_submit_ctrl+0x80/0x2a0
[ 1418.390013] EAX: 80000200 EBX: dc2d0000 ECX: dcfe2020 EDX: 00000000
[ 1418.390013] ESI: 00000040 EDI: dc1f8f00 EBP: dd743e2c ESP: dd743df4
[ 1418.390013]  DS: 007b ES: 007b FS: 0000 GS: 00e0 SS: 0068
[ 1418.390013] CR0: 8005003b CR2: 00000030 CR3: 1cd42000 CR4: 000007d0
[ 1418.390013] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 1418.390013] DR6: ffff0ff0 DR7: 00000400
[ 1418.390013] Process kworker/0:0 (pid: 1994, ti=dd742000 task=dcdd0d40 task.ti=dd742000)
[ 1418.390013] Stack:
[ 1418.390013]  0000003e dc2f9020 dcf64c7c 00000005 00000002 dd743e1c 00000002 dc1f8f00
[ 1418.390013]  00000040 dcfe2020 dc2cb4d0 dc2d0000 00000000 dcfe2020 dd743e44 c1339ef6
[ 1418.390013]  00000008 dc2d0000 dc2cb4d0 dcfe2020 dd743e74 c133bbeb 00000023 000000ff
[ 1418.390013] Call Trace:
[ 1418.390013]  [<c1339ef6>] usbhid_restart_ctrl_queue+0x96/0xe0
[ 1418.390013]  [<c133bbeb>] __usbhid_submit_report+0x29b/0x320
[ 1418.390013]  [<c1330200>] ? hid_parser_local+0x260/0x2d0
[ 1418.390013]  [<c133bd2b>] usbhid_submit_report+0x1b/0x30
[ 1418.390013]  [<de996cb9>] picolcd_fb_deferred_io+0x3b9/0x4c0 [hid_picolcd]
[ 1418.390013]  [<c120e118>] fb_deferred_io_work+0x78/0xc0
[ 1418.390013]  [<c1045e3f>] process_one_work+0x1ff/0x380
[ 1418.390013]  [<c120e0a0>] ? fb_deferred_io_fault+0xa0/0xa0
[ 1418.390013]  [<c10467f7>] worker_thread+0x237/0x360
[ 1418.390013]  [<c10465c0>] ? manage_workers.clone.25+0x250/0x250
[ 1418.390013]  [<c10465c0>] ? manage_workers.clone.25+0x250/0x250
[ 1418.390013]  [<c104a30d>] kthread+0x6d/0x80
[ 1418.390013]  [<c104a2a0>] ? kthread_freezable_should_stop+0x70/0x70
[ 1418.390013]  [<c140c87e>] kernel_thread_helper+0x6/0xd
[ 1418.390013] Code: 01 89 45 e8 0f 85 91 00 00 00 8b 4d ec 8b 53 1c 8b 7d e4 8b 75 e8 8b 81 68 0c 00 00 8b 00 8b 40 9c c1 e0 08 0d 00 00 00 80 85 ff <89> 42 30 8b 43 1c 89 70 54 0f 84 19 01 00 00 81 fe ff 01 0
[ 1418.390013] EIP: [<c1339c40>] hid_submit_ctrl+0x80/0x2a0 SS:ESP 0068:dd743df4
[ 1418.390013] CR2: 0000000000000030
[ 1418.390013] ---[ end trace 089e20ed16261a0a ]---
pretty early during unbind process, e.g. way before picolcd even started to
dispose framebuffer.

Either I'm doing something really wrong or I'm the first one to produce
lots of outbound reports thus exploring corner-cases...

PS: attached is a diff between the hid-picolcd from this patch series to what
I'm testing right now.

Thanks,
Bruno

View attachment "pico.diff" of type "text/x-patch" (16454 bytes)