lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8fb634f8-7330-4cf4-a8ce-485af9c0a61a@intel.com>
Date: Tue, 21 May 2024 08:34:29 +0300
From: Sasha Neftin <sasha.neftin@...el.com>
To: Vinicius Costa Gomes <vinicius.gomes@...el.com>, Mikhail Gavrilov
	<mikhail.v.gavrilov@...il.com>, Linux List Kernel Mailing
	<linux-kernel@...r.kernel.org>, <vinschen@...hat.com>, <hkelam@...vell.com>,
	<naamax.meir@...ux.intel.com>, <anthony.l.nguyen@...el.com>, "Ruinskiy, Dima"
	<dima.ruinskiy@...el.com>, "Lifshits, Vitaly" <vitaly.lifshits@...el.com>
CC: <intel-wired-lan@...ts.osuosl.org>
Subject: Re: [Intel-wired-lan] 6.10; regression; bisected - commit
 86167183a17e cause info msg "trying to register non-static key"

On 18/05/2024 0:05, Vinicius Costa Gomes wrote:
> Hi,
> 
> + intel-wired-lan
> 
> Mikhail Gavrilov <mikhail.v.gavrilov@...il.com> writes:
Hello Corinna,

Any thought? Revert?

Thanks,
Sasha
> 
>> Hi,
>> I am here to test unstable kernels.
>> Yesterday at Fedora Rawhide arrived the new kernel
>> 20240515git1b294a1f3561 and I spotter in kernel log new "red" message
>> with follow stacktrace:
>> [    8.471610] nvme nvme1: pci function 0000:0e:00.0
>> [    8.471616] nvme nvme0: pci function 0000:04:00.0
>> [    8.492638] nvme nvme1: 32/0/0 default/read/poll queues
>> [    8.496190] nvme nvme0: 31/0/0 default/read/poll queues
>> [    8.507051]  nvme0n1: p1 p2 p3
>> [    8.522270] INFO: trying to register non-static key.
>> [    8.522273] The code is fine but needs lockdep annotation, or maybe
>> [    8.522274] you didn't initialize this object before use?
>> [    8.522275] turning off the locking correctness validator.
>> [    8.522276] CPU: 31 PID: 683 Comm: (udev-worker) Not tainted
>> 6.10.0-0.rc0.20240515git1b294a1f3561.4.fc41.x86_64+debug #1
>> [    8.522278] Hardware name: ASUS System Product Name/ROG STRIX
>> B650E-I GAMING WIFI, BIOS 2611 04/07/2024
>> [    8.522280] Call Trace:
>> [    8.522281]  <TASK>
>> [    8.522282]  dump_stack_lvl+0x84/0xd0
>> [    8.522287]  register_lock_class+0xd84/0x1000
>> [    8.522291]  ? __pfx_register_lock_class+0x10/0x10
>> [    8.522293]  __lock_acquire+0x3d2/0x5c70
>> [    8.522295]  ? debug_object_free+0x298/0x550
>> [    8.522297]  ? __pfx_debug_object_free+0x10/0x10
>> [    8.522299]  ? __pfx_lock_release+0x10/0x10
>> [    8.522301]  ? __pfx___lock_acquire+0x10/0x10
>> [    8.522302]  ? hrtimer_try_to_cancel+0x22/0x460
>> [    8.522305]  lock_acquire+0x1ae/0x540
>> [    8.522307]  ? igc_ptp_clear_tx_tstamp+0x38/0x1b0 [igc]
>> [    8.522321]  ? __pfx_lock_acquire+0x10/0x10
>> [    8.522323]  ? seqcount_lockdep_reader_access.constprop.0+0xa5/0xb0
>> [    8.522325]  ? lockdep_hardirqs_on+0x7c/0x100
>> [    8.522326]  ? seqcount_lockdep_reader_access.constprop.0+0xa5/0xb0
>> [    8.522329]  _raw_spin_lock_irqsave+0x51/0xa0
>> [    8.522330]  ? igc_ptp_clear_tx_tstamp+0x38/0x1b0 [igc]
>> [    8.522337]  igc_ptp_clear_tx_tstamp+0x38/0x1b0 [igc]
>> [    8.522344]  igc_ptp_set_timestamp_mode+0x3cc/0x700 [igc]
>> [    8.522350]  ? igc_power_down_phy_copper+0xf1/0x140 [igc]
>> [    8.522358]  igc_ptp_reset+0x3b/0x5e0 [igc]
>> [    8.530693]  ? igc_set_eee_i225+0xfd/0x1e0 [igc]
>> [    8.530706]  igc_reset+0x2d9/0x3d0 [igc]
>> [    8.531707]  igc_probe+0x14ca/0x1e20 [igc]
>> [    8.531715]  ? _raw_spin_unlock_irqrestore+0x4f/0x80
>> [    8.531717]  ? __pfx_igc_probe+0x10/0x10 [igc]
>> [    8.531723]  local_pci_probe+0xdc/0x180
>> [    8.531727]  pci_device_probe+0x23c/0x810
>> [    8.531729]  ? kernfs_add_one+0x3ab/0x4a0
>> [    8.534068]  ? kernfs_new_node+0x13d/0x240
>> [    8.534070]  ? __pfx_pci_device_probe+0x10/0x10
>> [    8.534072]  ? kernfs_create_link+0x16e/0x240
>> [    8.534074]  ? kernfs_put+0x1c/0x40
>> [    8.534076]  ? sysfs_do_create_link_sd+0x8e/0x100
>> [    8.534078]  really_probe+0x1e0/0x8a0
>> [    8.536433]  __driver_probe_device+0x18c/0x370
>> [    8.536436]  driver_probe_device+0x4a/0x120
>> [    8.536438]  __driver_attach+0x194/0x4a0
>> [    8.536439]  ? __pfx___driver_attach+0x10/0x10
>> [    8.536441]  bus_for_each_dev+0x106/0x190
>> [    8.536443]  ? __pfx_bus_for_each_dev+0x10/0x10
>> [    8.536445]  bus_add_driver+0x2ff/0x530
>> [    8.536448]  driver_register+0x1a5/0x360
>> [    8.536449]  ? __pfx_igc_init_module+0x10/0x10 [igc]
>> [    8.536456]  do_one_initcall+0xd6/0x460
>> [    8.536459]  ? __pfx_do_one_initcall+0x10/0x10
>> [    8.536461]  ? kasan_unpoison+0x44/0x70
>> [    8.536464]  do_init_module+0x296/0x7c0
>> [    8.536466]  load_module+0x567b/0x74f0
>> [    8.536470]  ? __pfx_load_module+0x10/0x10
>> [    8.536473]  ? __might_fault+0x9d/0x120
>> [    8.536475]  ? local_clock_noinstr+0xd/0x100
>> [    8.536478]  ? __pfx___might_resched+0x10/0x10
>> [    8.536481]  ? __do_sys_init_module+0x1ef/0x220
>> [    8.536482]  __do_sys_init_module+0x1ef/0x220
>> [    8.536483]  ? __pfx___do_sys_init_module+0x10/0x10
>> [    8.536487]  do_syscall_64+0x97/0x190
>> [    8.536490]  ? lockdep_hardirqs_on_prepare+0x171/0x400
>> [    8.536492]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
>> [    8.536494] RIP: 0033:0x7f29997265ae
>> [    8.536499] Code: 48 8b 0d 85 a8 0c 00 f7 d8 64 89 01 48 83 c8 ff
>> c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00
>> 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 52 a8 0c 00 f7 d8 64 89
>> 01 48
> 
> It seems I have missed this following flow (in reverse):
> 
> + igc_ptp_clear_tx_tstamp
>    + igc_ptp_disable_tx_timestamp
>      + igc_ptp_set_timestamp_mode
>        + igc_ptp_reset
>          + igc_reset
>            + igc_probe
> 
> And that in igc_ptp_clear_tx_tstamp(), it uses ->ptp_tx_lock. So
> igc_ptp_init() needs to be called before igc_reset().
> 
> I think that is what this splat is telling us.
> 
> Don't know what's the best way forward, reverting the commit in question
> or trying to fix the initial problem some other way?
> 
>> [    8.536500] RSP: 002b:00007ffee505bff8 EFLAGS: 00000246 ORIG_RAX:
>> 00000000000000af
>> [    8.536502] RAX: ffffffffffffffda RBX: 000055eafe4a2ac0 RCX: 00007f29997265ae
>> [    8.536503] RDX: 00007f2998e6007d RSI: 00000000001067d6 RDI: 00007f299828c010
>> [    8.536504] RBP: 00007ffee505c0b0 R08: 000055eafe43a010 R09: 0000000000000007
>> [    8.536505] R10: 0000000000000007 R11: 0000000000000246 R12: 00007f2998e6007d
>> [    8.536506] R13: 0000000000020000 R14: 000055eafe476280 R15: 000055eafe4a25a0
>> [    8.536508]  </TASK>
>> [    8.542741] pps pps0: new PPS source ptp0
>> [    8.550932] igc 0000:0a:00.0 eth0: PHC added
>> [    8.550992] igc 0000:0a:00.0: 4.000 Gb/s available PCIe bandwidth
>> (5.0 GT/s PCIe x1 link)
>> [    8.550996] igc 0000:0a:00.0 eth0: MAC: e8:9c:25:6c:40:75
>> [    8.558160] igc 0000:0a:00.0 eno1: renamed from eth0
>>
>> Of course, I immediately wanted to find the first bad commit.
>> And now it has already been found:
>> 86167183a17e03ec77198897975e9fdfbd53cb0b is the first bad commit
>> commit 86167183a17e03ec77198897975e9fdfbd53cb0b (HEAD)
>> Author: Corinna Vinschen <vinschen@...hat.com>
>> Date:   Tue Apr 23 12:24:54 2024 +0200
>>
>>      igc: fix a log entry using uninitialized netdev
>>
>>      During successful probe, igc logs this:
>>
>>      [    5.133667] igc 0000:01:00.0 (unnamed net_device)
>> (uninitialized): PHC added
>>                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>      The reason is that igc_ptp_init() is called very early, even before
>>      register_netdev() has been called. So the netdev_info() call works
>>      on a partially uninitialized netdev.
>>
>>      Fix this by calling igc_ptp_init() after register_netdev(), right
>>      after the media autosense check, just as in igb.  Add a comment,
>>      just as in igb.
>>
>>      Now the log message is fine:
>>
>>      [    5.200987] igc 0000:01:00.0 eth0: PHC added
>>
>>      Signed-off-by: Corinna Vinschen <vinschen@...hat.com>
>>      Reviewed-by: Hariprasad Kelam <hkelam@...vell.com>
>>      Acked-by: Vinicius Costa Gomes <vinicius.gomes@...el.com>
>>      Tested-by: Naama Meir <naamax.meir@...ux.intel.com>
>>      Signed-off-by: Tony Nguyen <anthony.l.nguyen@...el.com>
>>
>>   drivers/net/ethernet/intel/igc/igc_main.c | 5 +++--
>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> The next step I was convinced that reverting the commit 86167183a17e
>> removed this info message.
> 
> Thank you for the effort.
> 
>>
>> I also attach here a full kernel log and build config.
>>
>> My hardware specs: https://linux-hardware.org/?probe=98ecbf3636
>>
>> I hope it helps.
>>
>> -- 
>> Best Regards,
>> Mike Gavrilov.
> 
> 
> Cheers,


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ