[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4fefdf5d-28d8-3abf-1495-320e6f275c8e@amd.com>
Date: Mon, 27 Jan 2020 08:31:18 -0500
From: Mikita Lipski <mlipski@....com>
To: Lyude Paul <lyude@...hat.com>, amd-gfx@...ts.freedesktop.org
Cc: Harry Wentland <harry.wentland@....com>, stable@...r.kernel.org,
Leo Li <sunpeng.li@....com>,
Alex Deucher <alexander.deucher@....com>,
Christian König <christian.koenig@....com>,
"David (ChunMing) Zhou" <David1.Zhou@....com>,
David Airlie <airlied@...ux.ie>,
Daniel Vetter <daniel@...ll.ch>,
Bhawanpreet Lakha <Bhawanpreet.Lakha@....com>,
Mikita Lipski <mikita.lipski@....com>,
Martin Tsai <martin.tsai@....com>,
David Francis <David.Francis@....com>,
Sam Ravnborg <sam@...nborg.org>,
Alvin Lee <alvin.lee2@....com>,
Jean Delvare <jdelvare@...e.de>,
dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] drm/amd/dm/mst: Ignore payload update failures
On 1/24/20 5:01 PM, Lyude Paul wrote:
> On Fri, 2020-01-24 at 16:46 -0500, Lyude Paul wrote:
>> On Fri, 2020-01-24 at 14:20 -0500, Mikita Lipski wrote:
>>> On 1/24/20 2:10 PM, Lyude Paul wrote:
>>>> Disabling a display on MST can potentially happen after the entire MST
>>>> topology has been removed, which means that we can't communicate with
>>>> the topology at all in this scenario. Likewise, this also means that we
>>>> can't properly update payloads on the topology and as such, it's a good
>>>> idea to ignore payload update failures when disabling displays.
>>>> Currently, amdgpu makes the mistake of halting the payload update
>>>> process when any payload update failures occur, resulting in leaving
>>>> DC's local copies of the payload tables out of date.
>>>>
>>>> This ends up causing problems with hotplugging MST topologies, and
>>>> causes modesets on the second hotplug to fail like so:
>>>>
>>>> [drm] Failed to updateMST allocation table forpipe idx:1
>>>> ------------[ cut here ]------------
>>>> WARNING: CPU: 5 PID: 1511 at
>>>> drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:2677
>>>> update_mst_stream_alloc_table+0x11e/0x130 [amdgpu]
>>>> Modules linked in: cdc_ether usbnet fuse xt_conntrack nf_conntrack
>>>> nf_defrag_ipv6 libcrc32c nf_defrag_ipv4 ipt_REJECT nf_reject_ipv4
>>>> nft_counter nft_compat nf_tables nfnetlink tun bridge stp llc sunrpc
>>>> vfat fat wmi_bmof uvcvideo snd_hda_codec_realtek snd_hda_codec_generic
>>>> snd_hda_codec_hdmi videobuf2_vmalloc snd_hda_intel videobuf2_memops
>>>> videobuf2_v4l2 snd_intel_dspcfg videobuf2_common crct10dif_pclmul
>>>> snd_hda_codec videodev crc32_pclmul snd_hwdep snd_hda_core
>>>> ghash_clmulni_intel snd_seq mc joydev pcspkr snd_seq_device snd_pcm
>>>> sp5100_tco k10temp i2c_piix4 snd_timer thinkpad_acpi ledtrig_audio snd
>>>> wmi soundcore video i2c_scmi acpi_cpufreq ip_tables amdgpu(O)
>>>> rtsx_pci_sdmmc amd_iommu_v2 gpu_sched mmc_core i2c_algo_bit ttm
>>>> drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm
>>>> crc32c_intel serio_raw hid_multitouch r8152 mii nvme r8169 nvme_core
>>>> rtsx_pci pinctrl_amd
>>>> CPU: 5 PID: 1511 Comm: gnome-shell Tainted: G O 5.5.0-
>>>> rc7Lyude-Test+ #4
>>>> Hardware name: LENOVO FA495SIT26/FA495SIT26, BIOS R12ET22W(0.22 )
>>>> 01/31/2019
>>>> RIP: 0010:update_mst_stream_alloc_table+0x11e/0x130 [amdgpu]
>>>> Code: 28 00 00 00 75 2b 48 8d 65 e0 5b 41 5c 41 5d 41 5e 5d c3 0f b6 06
>>>> 49 89 1c 24 41 88 44 24 08 0f b6 46 01 41 88 44 24 09 eb 93 <0f> 0b e9
>>>> 2f ff ff ff e8 a6 82 a3 c2 66 0f 1f 44 00 00 0f 1f 44 00
>>>> RSP: 0018:ffffac428127f5b0 EFLAGS: 00010202
>>>> RAX: 0000000000000002 RBX: ffff8d1e166eee80 RCX: 0000000000000000
>>>> RDX: ffffac428127f668 RSI: ffff8d1e166eee80 RDI: ffffac428127f610
>>>> RBP: ffffac428127f640 R08: ffffffffc03d94a8 R09: 0000000000000000
>>>> R10: ffff8d1e24b02000 R11: ffffac428127f5b0 R12: ffff8d1e1b83d000
>>>> R13: ffff8d1e1bea0b08 R14: 0000000000000002 R15: 0000000000000002
>>>> FS: 00007fab23ffcd80(0000) GS:ffff8d1e28b40000(0000)
>>>> knlGS:0000000000000000
>>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> CR2: 00007f151f1711e8 CR3: 00000005997c0000 CR4: 00000000003406e0
>>>> Call Trace:
>>>> ? mutex_lock+0xe/0x30
>>>> dc_link_allocate_mst_payload+0x9a/0x210 [amdgpu]
>>>> ? dm_read_reg_func+0x39/0xb0 [amdgpu]
>>>> ? core_link_enable_stream+0x656/0x730 [amdgpu]
>>>> core_link_enable_stream+0x656/0x730 [amdgpu]
>>>> dce110_apply_ctx_to_hw+0x58e/0x5d0 [amdgpu]
>>>> ? dcn10_verify_allow_pstate_change_high+0x1d/0x280 [amdgpu]
>>>> ? dcn10_wait_for_mpcc_disconnect+0x3c/0x130 [amdgpu]
>>>> dc_commit_state+0x292/0x770 [amdgpu]
>>>> ? add_timer+0x101/0x1f0
>>>> ? ttm_bo_put+0x1a1/0x2f0 [ttm]
>>>> amdgpu_dm_atomic_commit_tail+0xb59/0x1ff0 [amdgpu]
>>>> ? amdgpu_move_blit.constprop.0+0xb8/0x1f0 [amdgpu]
>>>> ? amdgpu_bo_move+0x16d/0x2b0 [amdgpu]
>>>> ? ttm_bo_handle_move_mem+0x118/0x570 [ttm]
>>>> ? ttm_bo_validate+0x134/0x150 [ttm]
>>>> ? dm_plane_helper_prepare_fb+0x1b9/0x2a0 [amdgpu]
>>>> ? _cond_resched+0x15/0x30
>>>> ? wait_for_completion_timeout+0x38/0x160
>>>> ? _cond_resched+0x15/0x30
>>>> ? wait_for_completion_interruptible+0x33/0x190
>>>> commit_tail+0x94/0x130 [drm_kms_helper]
>>>> drm_atomic_helper_commit+0x113/0x140 [drm_kms_helper]
>>>> drm_atomic_helper_set_config+0x70/0xb0 [drm_kms_helper]
>>>> drm_mode_setcrtc+0x194/0x6a0 [drm]
>>>> ? _cond_resched+0x15/0x30
>>>> ? mutex_lock+0xe/0x30
>>>> ? drm_mode_getcrtc+0x180/0x180 [drm]
>>>> drm_ioctl_kernel+0xaa/0xf0 [drm]
>>>> drm_ioctl+0x208/0x390 [drm]
>>>> ? drm_mode_getcrtc+0x180/0x180 [drm]
>>>> amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
>>>> do_vfs_ioctl+0x458/0x6d0
>>>> ksys_ioctl+0x5e/0x90
>>>> __x64_sys_ioctl+0x16/0x20
>>>> do_syscall_64+0x55/0x1b0
>>>> entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>>> RIP: 0033:0x7fab2121f87b
>>>> Code: 0f 1e fa 48 8b 05 0d 96 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff
>>>> ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01
>>>> f0 ff ff 73 01 c3 48 8b 0d dd 95 2c 00 f7 d8 64 89 01 48
>>>> RSP: 002b:00007ffd045f9068 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
>>>> RAX: ffffffffffffffda RBX: 00007ffd045f90a0 RCX: 00007fab2121f87b
>>>> RDX: 00007ffd045f90a0 RSI: 00000000c06864a2 RDI: 000000000000000b
>>>> RBP: 00007ffd045f90a0 R08: 0000000000000000 R09: 000055dbd2985d10
>>>> R10: 000055dbd2196280 R11: 0000000000000246 R12: 00000000c06864a2
>>>> R13: 000000000000000b R14: 0000000000000000 R15: 000055dbd2196280
>>>> ---[ end trace 6ea888c24d2059cd ]---
>>>>
>>>> Note as well, I have only been able to reproduce this on setups with 2
>>>> MST displays.
>>>>
>>>> Changes since v1:
>>>> * Don't return false when part 1 or part 2 of updating the payloads
>>>> fails, we don't want to abort at any step of the process even if
>>>> things fail
>>>>
>>>> Signed-off-by: Lyude Paul <lyude@...hat.com>
>>>> Acked-by: Harry Wentland <harry.wentland@....com>
>>>> Cc: stable@...r.kernel.org
>>>> ---
>>>> .../drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c | 13 ++++---------
>>>> 1 file changed, 4 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
>>>> b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
>>>> index 069b7a6f5597..318b474ff20e 100644
>>>> --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
>>>> +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
>>>> @@ -216,7 +216,8 @@ bool
>>>> dm_helpers_dp_mst_write_payload_allocation_table(
>>>> drm_dp_mst_reset_vcpi_slots(mst_mgr, mst_port);
>>>> }
>>>>
>>>> - ret = drm_dp_update_payload_part1(mst_mgr);
>>>> + /* It's OK for this to fail */
>>>> + drm_dp_update_payload_part1(mst_mgr);
>>>>
>>>> /* mst_mgr->->payloads are VC payload notify MST branch using
>>>> DPCD or
>>>> * AUX message. The sequence is slot 1-63 allocated sequence
>>>> for each
>>>> @@ -225,9 +226,6 @@ bool
>>>> dm_helpers_dp_mst_write_payload_allocation_table(
>>>>
>>>> get_payload_table(aconnector, proposed_table);
>>>>
>>>> - if (ret)
>>>> - return false;
>>>> -
>>>
>>> Sorry for being picky, but I think this might cause compilation error on
>>> some systems for having unused variable (int ret). Its better just to
>>> strip out both ret integer declarations.
>>
>> No problem! It wouldn't be fair if I was the only one who got to be picky
>> anyway ;)
>
> Actually, I think you might have made a mistake here - ret is still used in
> this function, mind double checking?
>
Sorry, yes you are correct, I only meant
dm_helpers_dp_mst_send_payload_allocation function.
The ret variable is still used in
dm_helpers_dp_mst_write_payload_allocation_table.
>>
>>> Otherwise the patch is good. Thanks again!
>>>
>>> Reviewed-by: Mikita Lipski <Mikita.Lipski@....com>
>>>
>>> Mikita
>>>
>>>> return true;
>>>> }
>>>>
>>>> @@ -285,7 +283,6 @@ bool dm_helpers_dp_mst_send_payload_allocation(
>>>> struct amdgpu_dm_connector *aconnector;
>>>> struct drm_dp_mst_topology_mgr *mst_mgr;
>>>> struct drm_dp_mst_port *mst_port;
>>>> - int ret;
>>>>
>>>> aconnector = (struct amdgpu_dm_connector *)stream-
>>>>> dm_stream_context;
>>>>
>>>> @@ -299,10 +296,8 @@ bool dm_helpers_dp_mst_send_payload_allocation(
>>>> if (!mst_mgr->mst_state)
>>>> return false;
>>>>
>>>> - ret = drm_dp_update_payload_part2(mst_mgr);
>>>> -
>>>> - if (ret)
>>>> - return false;
>>>> + /* It's OK for this to fail */
>>>> + drm_dp_update_payload_part2(mst_mgr);
>>>>
>>>> if (!enable)
>>>> drm_dp_mst_deallocate_vcpi(mst_mgr, mst_port);
>>>>
--
Thanks,
Mikita Lipski
Software Engineer, AMD
mikita.lipski@....com
Powered by blists - more mailing lists