lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5537C839.2060702@intel.com>
Date:	Wed, 22 Apr 2015 17:11:37 +0100
From:	Michel Thierry <michel.thierry@...el.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Daniel Vetter <daniel.vetter@...el.com>,
	Jani Nikula <jani.nikula@...ux.intel.com>,
	David Airlie <airlied@...ux.ie>,
	Ben Widawsky <ben@...dawsk.net>,
	Mika Kuoppala <mika.kuoppala@...el.com>
CC:	intel-gfx <intel-gfx@...ts.freedesktop.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: NULL ptr dereference in current i915 driver

On 4/22/2015 12:36 AM, Linus Torvalds wrote:
> So I just go the appended NULL pointer de-reference when trying to
> look at a video from my GoPro.
>
> The code disassembles to
>
>     0: 81 fb 00 04 00 00     cmp    $0x400,%ebx
>     6: 41 89 07             mov    %eax,(%r15)
>     9: 74 78                 je     0x83
>     b: 48 8d 7c 24 18       lea    0x18(%rsp),%rdi
>    10: e8 6e b3 1b c1       callq  0xffffffffc11bb383
>    15: 84 c0                 test   %al,%al
>    17: 74 4a                 je     0x63
>    19: 48 85 ed             test   %rbp,%rbp
>    1c: 75 b5                 jne    0xffffffffffffffd3
>    1e: 48 8b 04 24           mov    (%rsp),%rax
>    22: 49 8b 84 c4 98 01 00 mov    0x198(%r12,%rax,8),%rax
>    29: 00
>    2a:* 48 8b 28             mov    (%rax),%rbp <-- trapping instruction
>    2d: 65 ff 05 1f e8 ef 3f incl   %gs:0x3fefe81f(%rip)        # 0x3fefe853
>    34: 48 b8 00 00 00 00 00 movabs $0x160000000000,%rax
>    3b: 16 00 00
>
> which matches up with the asm code
>
>          cmpl    $1024, %ebx     #, act_pte
>          movl    %eax, (%r15)    # D.49217, *_26
>          je      .L118   #,
>    .L110:
>          leaq    24(%rsp), %rdi  #, tmp156
>          call    __sg_page_iter_next     #
>          testb   %al, %al        # D.49219
>          je      .L119   #,
>          testq   %rbp, %rbp      # pt_vaddr
>          jne     .L109   #,
>          movq    (%rsp), %rax    # %sfp, act_pt
>          movq    408(%r12,%rax,8), %rax  # MEM[(struct i915_hw_ppgtt
> *)vm_8(D)].D.36998.pd.page
>          movq    (%rax), %rbp    # _21->page, D.49221
>    #APP
>    # 72 "./arch/x86/include/asm/preempt.h" 1
>          incl %gs:__preempt_count(%rip)  # __preempt_count
>    # 0 "" 2
>    #NO_APP
>          movabsq $24189255811072, %rax   #, tmp150
>
> which in turn seems to come from the C code
>
>                          pt_vaddr =
> kmap_atomic(ppgtt->pd.page_table[act_pt]->page);
>
> (that "testq %rbp,%rbp; jne" just before the oopsing instruction group
> is that "if (pt_vaddr == NULL)" test.
>
> IOW, it looks like
>
>       ppgtt->pd.page_table[act_pt]
>
> is NULL, and then trying to dereference ->page off of it is what
> oopses (the preempt-count increment that comes after is the
> "pagefault_disable()" in kmap_atomic, and the big constant we're
> loading into %rax is part of "page_address(page)").
>
> I have no idea why "ppgtt->pd.page_table[act_pt]" would be NULL, but
> clearly it can be. Can somebody who knows this code look into it. I've
> added a few people who have worked in this area recently, in addition
> to the usual maintainer list..
>
> Thanks,
>
>                    Linus
>
> ---
> BUG: unable to handle kernel NULL pointer dereference at           (null)
> IP: [<ffffffffc010c137>] gen6_ppgtt_insert_entries+0xa7/0x120 [i915]
> PGD 0
> Oops: 0000 [#1] SMP
> Modules linked in: rfcomm fuse cmac ip6t_rpfilter ip6t_REJECT
> nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4
> nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_nat ebtable_broute
> bridge stp llc ebtable_filter ebtables ip6table_mangle
> ip6table_security ip6table_raw ip6table_filter ip6_tables
> iptable_mangle iptable_security iptable_raw bnep arc4 vfat fat
> x86_pkg_temp_thermal pn544_mei mei_phy coretemp pn544 hci nfc
> kvm_intel iTCO_wdt iTCO_vendor_support snd_hda_codec_realtek
> snd_hda_codec_hdmi kvm snd_hda_codec_generic uvcvideo
> videobuf2_vmalloc videobuf2_memops microcode videobuf2_core
> snd_hda_intel v4l2_common hid_multitouch snd_hda_controller videodev
> btusb snd_hda_codec iwlmvm media snd_hwdep mac80211 btbcm snd_seq
> btintel bluetooth snd_seq_device joydev snd_pcm serio_raw
>   i2c_i801 iwlwifi cfg80211 snd_hda_core sony_laptop snd_timer snd
> rfkill mei_me soundcore lpc_ich shpchp mei mfd_core dm_crypt
> crct10dif_pclmul i915 crc32_pclmul crc32c_intel i2c_algo_bit
> drm_kms_helper ghash_clmulni_intel drm i2c_core video
> CPU: 1 PID: 2697 Comm: chrome Not tainted 4.0.0-09362-g1fc149933fd4 #8
> Hardware name: Sony Corporation SVP11213CXB/VAIO, BIOS R0270V7 05/17/2013
> task: ffff88010dc51b30 ti: ffff88003f328000 task.ti: ffff88003f328000
> RIP: 0010:[<ffffffffc010c137>]  [<ffffffffc010c137>]
> gen6_ppgtt_insert_entries+0xa7/0x120 [i915]
> RSP: 0018:ffff88003f32b9a8  EFLAGS: 00010246
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000075b1b
> RDX: ffff88007d848990 RSI: 0000000000000001 RDI: ffff88003f32b9c0
> RBP: 0000000000000000 R08: 0000000000000000 R09: ffff88003f6f7e58
> R10: 000000000d836000 R11: 0000000000000000 R12: ffff8800d4164000
> R13: 0000000000000000 R14: 0000000000000001 R15: ffff88003f7bbffc
> FS:  00007f7f0ee94a00(0000) GS:ffff88011fa80000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000000 CR3: 000000005f607000 CR4: 00000000001407e0
> Stack:
>   0000000000000201 000002010dc51b30 0000000000000000 ffff88007d848990
>   0000004000075b1b ffff880100000001 0000000000000fe0 ffff880011215900
>   0000000000000000 ffff88006cc4c380 ffff88003f6f0000 0000000000000001
> Call Trace:
>   ggtt_bind_vma+0x97/0x110 [i915]
>   i915_vma_bind+0x40/0x410 [i915]
>   swiotlb_map_sg_attrs+0x74/0x140
>   i915_gem_object_do_pin+0x864/0x9f0 [i915]
>   mutex_lock+0x9/0x30
>   i915_gem_execbuffer_reserve_vma.isra.20+0x66/0x130 [i915]
>   i915_gem_execbuffer_reserve+0x2ec/0x320 [i915]
>   i915_gem_do_execbuffer.isra.27+0x5ee/0xf80 [i915]
>   mutex_optimistic_spin+0x16e/0x1f0
>   __mutex_lock_interruptible_slowpath+0x21/0x130
>   shmem_fault+0x57/0x1c0
>   drm_gem_object_lookup+0x14/0xa0 [drm]
>   i915_gem_execbuffer2+0xb2/0x2a0 [i915]
>   drm_ioctl+0x15a/0x580 [drm]
>   current_fs_time+0x9/0x50
>   do_vfs_ioctl+0x2e8/0x4f0
>   file_has_perm+0x77/0x80
>   syscall_trace_enter_phase1+0x116/0x140
>   SyS_ioctl+0x79/0x90
>   system_call_fastpath+0x12/0x6a
> Code: 00 81 fb 00 04 00 00 41 89 07 74 78 48 8d 7c 24 18 e8 6e b3 1b
> c1 84 c0 74 4a 48 85 ed 75 b5 48 8b 04 24 49 8b 84 c4 98 01 00 00 <48>
> 8b 28 65 ff 05 1f e8 ef 3f 48 b8 00 00 00 00 00 16 00 00 48
> RIP  gen6_ppgtt_insert_entries+0xa7/0x120 [i915]
>   RSP <ffff88003f32b9a8>
> CR2: 0000000000000000
>

Hi,

I see a possible va re-allocation that could be the culprit, but the 
change was commited just 2 days ago 
(http://cgit.freedesktop.org/drm-intel/commit/?id=5c5f645773b6d147bf68c350674dc3ef4f8de83d).

-Michel
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ