Date:   Thu, 26 Dec 2019 11:03:12 +0100
From:   Paul Menzel <pmenzel+amd-gfx@...gen.mpg.de>
To:     Chang Zhu <Changfeng.Zhu@....com>,
        Christian König <christian.koenig@....com>,
        Alex Deucher <alexander.deucher@....com>
Cc:     amd-gfx@...ts.freedesktop.org, LKML <linux-kernel@...r.kernel.org>,
        David Airlie <airlied@...ux.ie>
Subject: Warning: check cp_fw_version and update it to realize GRBM requires
 1-cycle delay in cp firmware

Dear Chang, Christian, and Alex,


With Linux 5.5-rc3 I am seeing the warning and null pointer dereference 
below. Are those related?

> [   13.406253] [drm] amdgpu kernel modesetting enabled.
> [   13.406294] checking generic (7fe0000000 300000) vs hw (7fe0000000 10000000)
> [   13.406294] fb0: switching to amdgpudrmfb from EFI VGA
> [   13.406380] Console: switching to colour dummy device 80x25
> [   13.406423] amdgpu 0000:26:00.0: vgaarb: deactivate vga console
> [   13.406805] amdgpu 0000:26:00.0: enabling device (0006 -> 0007)
> [   13.408153] [drm] initializing kernel modesetting (RAVEN 0x1002:0x15DD 0x1002:0x15DD 0xC8).
> [   13.408175] [drm] register mmio base: 0xFCC00000
> [   13.408175] [drm] register mmio size: 524288
> [   13.408201] [drm] add ip block number 0 <soc15_common>
> [   13.408202] [drm] add ip block number 1 <gmc_v9_0>
> [   13.408202] [drm] add ip block number 2 <vega10_ih>
> [   13.408203] [drm] add ip block number 3 <psp>
> [   13.408203] [drm] add ip block number 4 <gfx_v9_0>
> [   13.408204] [drm] add ip block number 5 <sdma_v4_0>
> [   13.408205] [drm] add ip block number 6 <powerplay>
> [   13.408205] [drm] add ip block number 7 <dm>
> [   13.408206] [drm] add ip block number 8 <vcn_v1_0>
> [   13.408687] input: HD-Audio Generic HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:08.1/0000:26:00.1/sound/card0/input5
> [   13.408863] input: HD-Audio Generic HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:08.1/0000:26:00.1/sound/card0/input6
> [   13.409048] input: HD-Audio Generic HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:08.1/0000:26:00.1/sound/card0/input7
> [   13.409069] snd_hda_codec_realtek hdaudioC1D0: autoconfig for ALC892: line_outs=3 (0x14/0x15/0x16/0x0/0x0) type:line
> [   13.409070] snd_hda_codec_realtek hdaudioC1D0:    speaker_outs=0 (0x0/0x0/0x0/0x0/0x0)
> [   13.409072] snd_hda_codec_realtek hdaudioC1D0:    hp_outs=1 (0x1b/0x0/0x0/0x0/0x0)
> [   13.409073] snd_hda_codec_realtek hdaudioC1D0:    mono: mono_out=0x0
> [   13.409073] snd_hda_codec_realtek hdaudioC1D0:    dig-out=0x1e/0x0
> [   13.409074] snd_hda_codec_realtek hdaudioC1D0:    inputs:
> [   13.409075] snd_hda_codec_realtek hdaudioC1D0:      Front Mic=0x19
> [   13.409076] snd_hda_codec_realtek hdaudioC1D0:      Rear Mic=0x18
> [   13.409077] snd_hda_codec_realtek hdaudioC1D0:      Line=0x1a
> [   13.415816] ATOM BIOS: 113-RAVEN-114
> [   13.416590] [drm] VCN decode is enabled in VM mode
> [   13.416591] [drm] VCN encode is enabled in VM mode
> [   13.416591] [drm] VCN jpeg decode is enabled in VM mode
> [   13.416831] [drm] vm size is 262144 GB, 4 levels, block size is 9-bit, fragment size is 9-bit
> [   13.416939] amdgpu 0000:26:00.0: VRAM: 2048M 0x000000F400000000 - 0x000000F47FFFFFFF (2048M used)
> [   13.416940] amdgpu 0000:26:00.0: GART: 1024M 0x0000000000000000 - 0x000000003FFFFFFF
> [   13.416941] amdgpu 0000:26:00.0: AGP: 267419648M 0x000000F800000000 - 0x0000FFFFFFFFFFFF
> [   13.416955] [drm] Detected VRAM RAM=2048M, BAR=2048M
> [   13.416956] [drm] RAM width 128bits DDR4
> [   13.419381] [TTM] Zone  kernel: Available graphics memory: 7168268 KiB
> [   13.419382] [TTM] Zone   dma32: Available graphics memory: 2097152 KiB
> [   13.419382] [TTM] Initializing pool allocator
> [   13.419399] [TTM] Initializing DMA pool allocator
> [   13.419570] [drm] amdgpu: 2048M of VRAM memory ready
> [   13.419583] [drm] amdgpu: 3072M of GTT memory ready.
> [   13.419664] [drm] GART: num cpu pages 262144, num gpu pages 262144
> [   13.419903] [drm] PCIE GART of 1024M enabled (table at 0x000000F400900000).
> [   13.421454] amdgpu 0000:26:00.0: Direct firmware load for amdgpu/raven_ta.bin failed with error -2
> [   13.421456] amdgpu 0000:26:00.0: psp v10.0: Failed to load firmware "amdgpu/raven_ta.bin"
> [   13.435852] input: HD-Audio Generic Front Mic as /devices/pci0000:00/0000:00:08.1/0000:26:00.6/sound/card1/input8
> [   13.436026] input: HD-Audio Generic Rear Mic as /devices/pci0000:00/0000:00:08.1/0000:26:00.6/sound/card1/input9
> [   13.436220] input: HD-Audio Generic Line as /devices/pci0000:00/0000:00:08.1/0000:26:00.6/sound/card1/input10
> [   13.436790] input: HD-Audio Generic Line Out Front as /devices/pci0000:00/0000:00:08.1/0000:26:00.6/sound/card1/input11
> [   13.436974] input: HD-Audio Generic Line Out Surround as /devices/pci0000:00/0000:00:08.1/0000:26:00.6/sound/card1/input12
> [   13.437240] input: HD-Audio Generic Line Out CLFE as /devices/pci0000:00/0000:00:08.1/0000:26:00.6/sound/card1/input13
> [   13.437403] input: HD-Audio Generic Front Headphone as /devices/pci0000:00/0000:00:08.1/0000:26:00.6/sound/card1/input14
> [   13.446975] [drm] Warning: check cp_fw_version and update it to realize 			      GRBM requires 1-cycle delay in cp firmware
> [   13.448211] BUG: kernel NULL pointer dereference, address: 0000000000000026
> [   13.448216] #PF: supervisor read access in kernel mode
> [   13.448217] #PF: error_code(0x0000) - not-present page
> [   13.448219] PGD 0 P4D 0 
> [   13.448221] Oops: 0000 [#1] SMP
> [   13.448223] CPU: 2 PID: 354 Comm: comp_1.0.0 Not tainted 5.5.0-rc3-00045-g7618e88ac987 #25
> [   13.448225] Hardware name: Micro-Star International Co., Ltd. MS-7A37/B350M MORTAR (MS-7A37), BIOS 1.MR 12/02/2019
> [   13.448231] RIP: 0010:__kthread_should_park+0x5/0x30
> [   13.448233] Code: 7d 01 00 f6 40 26 20 74 11 48 8b 80 88 05 00 00 48 8b 00 48 d1 e8 83 e0 01 c3 0f 0b eb eb 0f 1f 80 00 00 00 00 0f 1f 44 00 00 <f6> 47 26 20 74 12 48 8b 87 88 05 00 00 48 8b 00 48 c1 e8 02 83 e0
> [   13.448235] RSP: 0018:ffffbe1b804bfe50 EFLAGS: 00010246
> [   13.448237] RAX: 7fffffffffffffff RBX: ffff988638f72e78 RCX: 0000000000000294
> [   13.448238] RDX: 0000000000000000 RSI: 0000000000000246 RDI: 0000000000000000
> [   13.448240] RBP: ffff988636827f50 R08: ffff98863da34058 R09: ffff98863e80e8d8
> [   13.448241] R10: ffffbe1b804bfeac R11: ffffffffc466dad2 R12: ffff988636827f50
> [   13.448243] R13: ffffbe1b808c77e0 R14: ffff988636827f50 R15: ffff98863da33c80
> [   13.448245] FS:  0000000000000000(0000) GS:ffff988640680000(0000) knlGS:0000000000000000
> [   13.448247] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   13.448248] CR2: 0000000000000026 CR3: 0000000378783000 CR4: 00000000003406e0
> [   13.448250] Call Trace:
> [   13.448254]  drm_sched_get_cleanup_job+0x42/0x100 [gpu_sched]
> [   13.448257]  drm_sched_main+0x5c/0x390 [gpu_sched]
> [   13.448261]  ? __schedule+0x298/0x6c0
> [   13.448263]  ? __wake_up_common+0x80/0x180
> [   13.448265]  kthread+0xfb/0x130
> [   13.448267]  ? drm_sched_get_cleanup_job+0x100/0x100 [gpu_sched]
> [   13.448269]  ? kthread_park+0x90/0x90
> [   13.448272]  ret_from_fork+0x22/0x40
> [   13.448273] Modules linked in: snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi amdgpu(+) snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core k10temp i2c_piix4 snd_hwdep snd_pcm snd_timer gpu_sched snd soundcore r8169 realtek wmi video acpi_cpufreq crc32c_intel xhci_pci xhci_hcd
> [   13.448286] CR2: 0000000000000026
> [   13.448288] ---[ end trace 22194bd02a932bab ]---
> [   13.448290] RIP: 0010:__kthread_should_park+0x5/0x30
> [   13.448292] Code: 7d 01 00 f6 40 26 20 74 11 48 8b 80 88 05 00 00 48 8b 00 48 d1 e8 83 e0 01 c3 0f 0b eb eb 0f 1f 80 00 00 00 00 0f 1f 44 00 00 <f6> 47 26 20 74 12 48 8b 87 88 05 00 00 48 8b 00 48 c1 e8 02 83 e0
> [   13.448295] RSP: 0018:ffffbe1b804bfe50 EFLAGS: 00010246
> [   13.448296] RAX: 7fffffffffffffff RBX: ffff988638f72e78 RCX: 0000000000000294
> [   13.448298] RDX: 0000000000000000 RSI: 0000000000000246 RDI: 0000000000000000
> [   13.448299] RBP: ffff988636827f50 R08: ffff98863da34058 R09: ffff98863e80e8d8
> [   13.448300] R10: ffffbe1b804bfeac R11: ffffffffc466dad2 R12: ffff988636827f50
> [   13.448302] R13: ffffbe1b808c77e0 R14: ffff988636827f50 R15: ffff98863da33c80
> [   13.448303] FS:  0000000000000000(0000) GS:ffff988640680000(0000) knlGS:0000000000000000
> [   13.448305] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   13.448307] CR2: 0000000000000026 CR3: 0000000378783000 CR4: 00000000003406e0
> [   13.474706] [drm] use_doorbell being set to: [true]
> [   13.474886] amdgpu: [powerplay] hwmgr_sw_init smu backed is smu10_smu
> [   13.497484] [drm] Found VCN firmware Version ENC: 1.9 DEC: 1 VEP: 0 Revision: 28
> [   13.497508] [drm] PSP loading VCN firmware
> [   13.518866] [drm] reserve 0x400000 from 0xf47f800000 for PSP TMR
> [   13.533919] EXT4-fs (sda2): mounted filesystem with ordered data mode. Opts: errors=remount-ro
> [   13.533962] ext4 filesystem being mounted at /boot supports timestamps until 2038 (0x7fffffff)
> [   13.563013] r8169 0000:22:00.0 enp34s0: renamed from eth0
> [   13.605292] [drm] DM_PPLIB: values for F clock
> [   13.605297] [drm] DM_PPLIB:	 0 in kHz, 3649 in mV
> [   13.605299] [drm] DM_PPLIB:	 400000 in kHz, 3649 in mV
> [   13.605300] [drm] DM_PPLIB:	 933000 in kHz, 4074 in mV
> [   13.605301] [drm] DM_PPLIB:	 1067000 in kHz, 4250 in mV
> [   13.605303] [drm] DM_PPLIB: values for DCF clock
> [   13.605304] [drm] DM_PPLIB:	 300000 in kHz, 3649 in mV
> [   13.605305] [drm] DM_PPLIB:	 600000 in kHz, 4074 in mV
> [   13.605307] [drm] DM_PPLIB:	 626000 in kHz, 4250 in mV
> [   13.605308] [drm] DM_PPLIB:	 654000 in kHz, 4399 in mV
> [   13.606355] [drm] Display Core initialized with v3.2.56!
> [   13.638961] snd_hda_intel 0000:26:00.1: bound 0000:26:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
> [   13.653730] [drm:dm_helpers_parse_edid_caps [amdgpu]] *ERROR* Couldn't read SADs: -2
> [   13.656251] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
> [   13.656255] [drm] Driver supports precise vblank timestamp query.
> [   13.658879] [drm] VCN decode and encode initialized successfully(under DPG Mode).
> [   13.662388] [drm] fb mappable at 0x38FBC1000
> [   13.662393] [drm] vram apper at 0x38F000000
> [   13.662395] [drm] size 5242880
> [   13.662396] [drm] fb depth is 24
> [   13.662397] [drm]    pitch is 5120
> [   13.662706] fbcon: amdgpudrmfb (fb0) is primary device
> [   13.673551] Console: switching to colour frame buffer device 160x64
> [   13.694375] amdgpu 0000:26:00.0: fb0: amdgpudrmfb frame buffer device
> [   13.700769] amdgpu 0000:26:00.0: ring gfx uses VM inv eng 0 on hub 0
> [   13.700814] amdgpu 0000:26:00.0: ring comp_1.0.0 uses VM inv eng 1 on hub 0
> [   13.700856] amdgpu 0000:26:00.0: ring comp_1.1.0 uses VM inv eng 4 on hub 0
> [   13.700898] amdgpu 0000:26:00.0: ring comp_1.2.0 uses VM inv eng 5 on hub 0
> [   13.700940] amdgpu 0000:26:00.0: ring comp_1.3.0 uses VM inv eng 6 on hub 0
> [   13.700982] amdgpu 0000:26:00.0: ring comp_1.0.1 uses VM inv eng 7 on hub 0
> [   13.701035] amdgpu 0000:26:00.0: ring comp_1.1.1 uses VM inv eng 8 on hub 0
> [   13.701077] amdgpu 0000:26:00.0: ring comp_1.2.1 uses VM inv eng 9 on hub 0
> [   13.701119] amdgpu 0000:26:00.0: ring comp_1.3.1 uses VM inv eng 10 on hub 0
> [   13.701162] amdgpu 0000:26:00.0: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
> [   13.701204] amdgpu 0000:26:00.0: ring sdma0 uses VM inv eng 0 on hub 1
> [   13.701243] amdgpu 0000:26:00.0: ring vcn_dec uses VM inv eng 1 on hub 1
> [   13.701284] amdgpu 0000:26:00.0: ring vcn_enc0 uses VM inv eng 4 on hub 1
> [   13.701325] amdgpu 0000:26:00.0: ring vcn_enc1 uses VM inv eng 5 on hub 1
> [   13.701366] amdgpu 0000:26:00.0: ring vcn_jpeg uses VM inv eng 6 on hub 1
> [   13.754721] [drm] Initialized amdgpu 3.36.0 20150101 for 0000:26:00.0 on minor 0
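
As a reading aid for the oops above, here is a minimal Python sketch that pulls the fault address, the faulting symbol, and the register values out of a few of the quoted lines. The helper name `parse_oops` and the constant `DISP` are made up for illustration. The displacement 0x26 is my reading of the marked instruction bytes `<f6> 47 26 20` as a byte test at offset 0x26 from RDI; treat that as an assumption rather than a verified disassembly.

```python
import re

# Three lines copied from the quoted log above.
OOPS = """\
[   13.448211] BUG: kernel NULL pointer dereference, address: 0000000000000026
[   13.448231] RIP: 0010:__kthread_should_park+0x5/0x30
[   13.448238] RDX: 0000000000000000 RSI: 0000000000000246 RDI: 0000000000000000
"""

# Assumed displacement of the faulting byte test relative to RDI.
DISP = 0x26


def parse_oops(text):
    """Return (fault address, faulting symbol, register dict) from oops text."""
    fault = int(re.search(r"address: ([0-9a-f]+)", text).group(1), 16)
    symbol = re.search(r"RIP: \w+:(\w+)\+", text).group(1)
    regs = {
        name: int(value, 16)
        for name, value in re.findall(r"\b(R[A-Z0-9]{2}): ([0-9a-f]{16})", text)
    }
    return fault, symbol, regs


fault, symbol, regs = parse_oops(OOPS)
# If the fault address equals RDI + DISP, the pointer in RDI itself was NULL.
print(symbol, hex(fault - regs["RDI"]))  # prints "__kthread_should_park 0x26"
```

If that reading is right, the first argument passed to `__kthread_should_park` was a NULL pointer. The sketch says nothing about whether the warning and the oops share a cause.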

Chang, it looks like you added that warning in commit 11c6108934.

> drm/amdgpu: add warning for GRBM 1-cycle delay issue in gfx9
>
> It needs to add warning to update firmware in gfx9
> in case that firmware is too old to have function to
> realize dummy read in cp firmware.

Unfortunately, it looks like you did not even check how the warning is 
formatted (there is a needless run of tabs and spaces in the middle of 
the message), so I guess this was totally untested. Also, what is that 
warning about, and what is the user supposed to do? I am unable to find 
`cp_fw_version` anywhere in the source code.
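
My guess at where the stray whitespace comes from (not checked against the driver source) is a string literal continued with a backslash at the end of the line, so that the indentation of the next source line ends up inside the message. The effect is easy to reproduce. The sketch below uses Python only for illustration, and the message fragments are shortened copies of the log text.

```python
# Backslash-newline inside the literal: the indentation of the next
# source line becomes part of the string.
continued = "update it to realize \
                  GRBM requires 1-cycle delay"

# Adjacent literals are joined without picking up any indentation.
adjacent = ("update it to realize "
            "GRBM requires 1-cycle delay")

print(repr(continued))
print(repr(adjacent))
```

Splitting the message into adjacent literals, as in the second form, keeps the source line short without changing the printed text.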

Where can I get updated firmware? What version do I need? I think this 
should be reconsidered, and the driver should deal gracefully with old 
firmware.

Please tell me if you want me to create a bug report or do something 
else to get this fixed.

The package *firmware-amd-graphics* 20190717-2 from Debian Sid/unstable 
is installed. It lacks `raven_ta.bin`, but that file is not even in the 
upstream linux-firmware repository [1]. :( It’s off-topic for this 
thread, but how can upstream Linux contain code that depends on 
unpublished blobs?
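
For reference, the `-2` in the firmware message is a negated errno value. A minimal Python sketch for listing such failures from a log follows; the helper name `firmware_failures` is made up for illustration, and the sample line is copied from the log above.

```python
import errno
import re

LOG = """\
[   13.421454] amdgpu 0000:26:00.0: Direct firmware load for amdgpu/raven_ta.bin failed with error -2
"""


def firmware_failures(text):
    """Return (firmware path, errno name) for each failed direct firmware load."""
    pattern = r"Direct firmware load for (\S+) failed with error -(\d+)"
    return [
        (path, errno.errorcode.get(int(code), "unknown"))
        for path, code in re.findall(pattern, text)
    ]


print(firmware_failures(LOG))  # prints [('amdgpu/raven_ta.bin', 'ENOENT')]
```

ENOENT means the file was simply not found, which matches the blob being absent from the installed package.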


Kind regards,

Paul


[1]: 
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/amdgpu?id=7319341e6e40f8bae1f2623eb5e4ddc0e2b50076

View attachment "20191224–msi-MS-7A37–linux-messages.txt" of type "text/plain" (75361 bytes)
