lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <3d2bd09735dbdaf003585ca376b7c1e5b69a19bd.camel@intel.com>
Date:   Wed, 08 Jan 2020 17:38:25 +0200
From:   Antti Laakso <antti.laakso@...el.com>
To:     vdronov@...hat.com
Cc:     netdev@...r.kernel.org, sjohnsto@...hat.com, vlovejoy@...hat.com,
        richardcochran@...il.com, davem@...emloft.net,
        artem.bityutskiy@...el.com
Subject: NULL pointer dereference bug when unbinding 'igb' -driver from
 Intel i210 ethernet card

Hello,

Unbinding Intel 'igb' driver from Intel I210 PCIe ethernet card causes crash on kernel 5.5-rc5.
Running git bisect pointed to following patch triggering the issue:

commit a33121e5487b424339636b25c35d3a180eaa5f5e
Author: Vladis Dronov <vdronov@...hat.com>
Date:   Fri Dec 27 03:26:27 2019 +0100

    ptp: fix the race between the release of ptp_clock and cdev

Steps to reproduce:
$ lspci -D | grep I210
0000:18:00.0 Ethernet controller: Intel Corporation I210 Gigabit Network Connection (rev 03)
$ ls -l /sys/class/net/i210/device/driver/
total 0
lrwxrwxrwx 1 root root    0 Jan  8 15:59 0000:18:00.0 ->
../../../../devices/pci0000:17/0000:17:00.0/0000:18:00.0
--w------- 1 root root 4096 Jan  8 15:59 bind
lrwxrwxrwx 1 root root    0 Jan  8 15:59 module -> ../../../../module/igb
--w------- 1 root root 4096 Jan  8 15:59 new_id
--w------- 1 root root 4096 Jan  8 15:59 remove_id
--w------- 1 root root 4096 Jan  8 15:56 uevent
--w------- 1 root root 4096 Jan  8 15:59 unbind
$ echo "0000:18:00.0" > /sys/bus/pci/drivers/igb/unbind

[  878.869098] BUG: kernel NULL pointer dereference, address: 0000000000000000
[  878.876060] #PF: supervisor read access in kernel mode
[  878.881197] #PF: error_code(0x0000) - not-present page
[  878.886335] PGD 0 P4D 0 
[  878.888876] Oops: 0000 [#1] SMP NOPTI
[  878.892541] CPU: 22 PID: 1844 Comm: bash Not tainted 5.5.0-rc5 #14
[  878.898718] Hardware name: Intel Corporation S2600WFS/S2600WFS, BIOS
SE5C620.86B.0D.01.0395.022720191340 02/27/2019
[  878.909151] RIP: 0010:strlen+0x0/0x20
[  878.912818] Code: f6 82 00 fd ee a4 20 74 14 48 c7 c1 00 fd ee a4 48 83 c0 01 0f b6 10 f6 04 11
20 75 f3 c3 66 66 2e 0f 1f 84 00 00 00 00 00 90 <80> 3f 00 74 10 48 89 f8 48 83 c0 01 80 38 00 75 f7
48 29 f8 c3 31
[  878.931565] RSP: 0018:ffff9f898f9b3c90 EFLAGS: 00010246
[  878.936789] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[  878.943921] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[  878.951052] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff91e8f6758898
[  878.958183] R10: 0000000000000000 R11: ffff91e8f6758380 R12: 0000000000000000
[  878.965309] R13: 0000000000000000 R14: ffff9f898f9b3f08 R15: ffff91e9175127a0
[  878.972439] FS:  00007f00a0020740(0000) GS:ffff91e920b80000(0000) knlGS:0000000000000000
[  878.980525] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  878.986272] CR2: 0000000000000000 CR3: 00000017b4c10003 CR4: 00000000007606e0
[  878.993404] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  879.000536] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  879.007666] PKRU: 55555554
[  879.010369] Call Trace:
[  879.012819]  kernfs_name_hash+0x12/0x80
[  879.016654]  kernfs_find_ns+0x35/0xc0
[  879.020321]  kernfs_remove_by_name_ns+0x31/0x90
[  879.024855]  remove_files.isra.1+0x30/0x70
[  879.028953]  sysfs_remove_group+0x3d/0x80
[  879.032964]  sysfs_remove_groups+0x29/0x40
[  879.037064]  device_remove_attrs+0x39/0x70
[  879.041163]  device_del+0x169/0x3f0
[  879.044657]  cdev_device_del+0x15/0x30
[  879.048412]  posix_clock_unregister+0x21/0x50
[  879.052769]  ptp_clock_unregister+0x6e/0x80 [ptp]
[  879.057479]  igb_ptp_stop+0x21/0x50 [igb]
[  879.061493]  igb_remove+0x46/0x120 [igb]
[  879.065420]  pci_device_remove+0x3b/0xc0
[  879.069344]  device_release_driver_internal+0xe5/0x1c0
[  879.074483]  unbind_store+0x94/0x120
[  879.078063]  kernfs_fop_write+0xc1/0x1a0
[  879.081990]  vfs_write+0xa5/0x1a0
[  879.085305]  ksys_write+0x59/0xd0
[  879.088627]  do_syscall_64+0x55/0x1d0
[  879.092292]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  879.097343] RIP: 0033:0x7f00a01154b7
[  879.100923] Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04
25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24
18 48 89 74 24
[  879.119668] RSP: 002b:00007ffcd6953b18 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  879.127233] RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007f00a01154b7
[  879.134363] RDX: 000000000000000d RSI: 000055a7075c9f60 RDI: 0000000000000001
[  879.141494] RBP: 000055a7075c9f60 R08: 000000000000000a R09: 000000000000000c
[  879.148619] R10: 000055a707503c10 R11: 0000000000000246 R12: 000000000000000d
[  879.155752] R13: 00007f00a01e6500 R14: 000000000000000d R15: 00007f00a01e6700
[  879.162883] Modules linked in: intel_rapl_msr intel_rapl_common skx_edac iTCO_wdt
x86_pkg_temp_thermal iTCO_vendor_support watchdog ipmi_ssif kvm lpc_ich ipmi_si irqbypass pcspkr
ioatdma mfd_core i2c_i801 ipmi_devintf wmi ipmi_msghandler acpi_pad vfat fat dm_multipath 
ixgbe igb i40e dca mdio nvme megaraid_sas crct10dif_pclmul ptp crc32_pclmul nvme_core crc32c_intel
pps_core i2c_algo_bit sunrpc
[  879.197670] CR2: 0000000000000000
[  879.200990] ---[ end trace c61f33a12ae52fc5 ]---
[  879.298587] RIP: 0010:strlen+0x0/0x20
[  879.302249] Code: f6 82 00 fd ee a4 20 74 14 48 c7 c1 00 fd ee a4 48 83 c0 01 0f b6 10 f6 04 11
20 75 f3 c3 66 66 2e 0f 1f 84 00 00 00 00 00 90 <80> 3f 00 74 10 48 89 f8 48 83 c0 01 80 38 00 75 f7
48 29 f8 c3 31
[  879.320993] RSP: 0018:ffff9f898f9b3c90 EFLAGS: 00010246
[  879.326221] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[  879.333351] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[  879.340483] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff91e8f6758898
[  879.347618] R10: 0000000000000000 R11: ffff91e8f6758380 R12: 0000000000000000
[  879.354746] R13: 0000000000000000 R14: ffff9f898f9b3f08 R15: ffff91e9175127a0
[  879.361872] FS:  00007f00a0020740(0000) GS:ffff91e920b80000(0000) knlGS:0000000000000000
[  879.369957] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  879.375703] CR2: 0000000000000000 CR3: 00000017b4c10003 CR4: 00000000007606e0
[  879.382833] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  879.389958] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  879.397089] PKRU: 55555554

Usually problem is hit when doing unbind first time.
But sometimes binding device back to 'igb'
-driver, and unbinding it again, is needed:
$ echo "0000:18:00.0" > /sys/bus/pci/drivers/igb/bind
$
echo "0000:18:00.0" > /sys/bus/pci/drivers/igb/unbind

At most 50 unbind/bind iterations is needed to reproduce the issue.

Reproduced on Intel Xeon 8260L, with Intel I210 1GbE PCIe Ethernet Card.


--
Antti Laakso
antti.laakso@...el.com

---------------------------------------------------------------------
Intel Finland Oy
Registered Address: PL 281, 00181 Helsinki 
Business Identity Code: 0357606 - 4 
Domiciled in Helsinki 

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ