[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2da16c01-260b-365d-4651-91d9172e5ed8@linux.dev>
Date: Thu, 7 Sep 2023 09:40:15 +0800
From: Sui Jingfeng <sui.jingfeng@...ux.dev>
To: Alex Deucher <alexdeucher@...il.com>,
suijingfeng <suijingfeng@...ngson.cn>
Cc: Christian König <christian.koenig@....com>,
Bjorn Helgaas <bhelgaas@...gle.com>,
nouveau@...ts.freedesktop.org, intel-gfx@...ts.freedesktop.org,
linux-kernel@...r.kernel.org, dri-devel@...ts.freedesktop.org,
amd-gfx@...ts.freedesktop.org, linux-pci@...r.kernel.org,
Alex Deucher <alexander.deucher@....com>,
Alex Williamson <alex.williamson@...hat.com>
Subject: Re: [RFC,drm-misc-next v4 3/9] drm/radeon: Implement .be_primary()
callback
On 2023/9/7 00:00, Alex Deucher wrote:
> On Tue, Sep 5, 2023 at 1:25 PM suijingfeng <suijingfeng@...ngson.cn> wrote:
>> Hi,
>>
>>
>> On 2023/9/5 13:50, Christian König wrote:
>>> Am 04.09.23 um 21:57 schrieb Sui Jingfeng:
>>>> From: Sui Jingfeng <suijingfeng@...ngson.cn>
>>>>
>>>> On a machine with multiple GPUs, a Linux user has no control over
>>>> which one
>>>> is primary at boot time.
>>> Question is why is that useful? Should we give users the ability to
>>> control that?
>>>
>>> I don't see an use case for this.
>>>
>> On a specific machine with multiple GPUs mounted, only the
>> primary graphics get POST-ed (initialized) by the firmware.
>> Therefore the DRM drivers for the rest video cards have to
>> work without the prerequisite setups done by firmware, This
>> is called as POST.
> I think that should be regarded as a bug in the driver that should be
> fixed and this would not help with that case. If a driver can't
> initialize a device without aid from the pre-OS environment, that
> should be fixed in the driver. This solution also doesn't fix which
> device is selected as the primary by the pre-OS environment. That can
> only be fixed in the pre-OS environment code.
>
>> One of the use cases is to test if a specific DRM driver
>> would works properly, under the circumstance of not being
>> POST-ed, The ast drm driver is the first one which refused
>> to work if not being POST-ed by the firmware.
>>
>> Before apply this series, I was unable make drm/ast as the
>> primary video card easily. The problem is that on a multiple
>> video card configuration, the monitor connected with my
>> AST2400 card not light up. While confusing, a naive programmer
>> may suspect the PRIME is not working.
>>
>> After applied this series and passing ast.modeset=10 on the
>> kernel cmd line, I found that the monitor connected with my
>> ast2400 video card still black, It doesn't display and It
>> doesn't show image to me.
> The problem with adding modeset=10 is that it only helps when you have
> one GPU driven by that driver in the system. If you have multiple
> GPUs driven by that driver, which one would that apply to? E.g., what
> if you have 2 AMD GPUs in the system.
>
>> While in the process of study drm/ast, I know that drm/ast
>> driver has the POST code shipped, See the ast_post_gpu() function.
>> Then, I was wondering why this function doesn't works.
>>
>> After a short-time (hasty) debugging, I found that the ast_post_gpu()
>> function didn't get run. Because it have something to do with the
>> ast->config_mode. Without thinking too much, I hardcoded the
>> ast->config_mode as ast_use_p2a, the key point is to force the
>> ast_post_gpu() function to run.
>>
>>
>> ```
>>
>> --- a/drivers/gpu/drm/ast/ast_main.c
>> +++ b/drivers/gpu/drm/ast/ast_main.c
>> @@ -132,6 +132,8 @@ static int ast_device_config_init(struct ast_device
>> *ast)
>> }
>> }
>>
>> + ast->config_mode = ast_use_p2a;
>> +
>> switch (ast->config_mode) {
>> case ast_use_defaults:
>> drm_info(dev, "Using default configuration\n");
>>
>> ```
>>
>> Then, the monitor light up, it display the Ubuntu greeter to me. Therefore
>> my patch is useful, at least for the Linux drm driver tester and developer.
>> It allow programmers to test the specific part of a specific driver without
>> changing a line of the source code and without the need of sudo authority.
>>
>> It improves the efficiency of the testing and patch verification. I know
>> the PrimaryGPU option of Xorg conf, but this approach will remember the
>> setup have been made, you need modify it with root authority each time
>> you want to switch the primary. But on the process of rapid developing
>> and/or testing for multiple video drivers, with only one computer hardware
>> resource available. What we really want is a one-shot command, as provided
>> by this series. So, this is the first use case.
>>
>>
>> The second use case is that sometime the firmware is not reliable.
>> While there are thousands of ARM64, PowerPC and Mips servers machine,
>> Most of them don't have a good UEFI firmware support. I haven't test the
>> drm/amdgpu and drm/radeon at my ARM64 server yet. Because this ARM64
>> server always use the platform(BMC) integrated display controller as primary.
>> The UEFI firmware of it does not provide options menu to tune.
>> So, for the first time, the discrete card because useless, despite more powerful.
>> I will take time to carry on the testing, so I will be able to tell more
>> in the future.
>>
>>
>> Even on X86, when select the PEG as primary on the UEFI BIOS menu.
>> There is no way to tell the bios which one of my three
>> discrete video be the primary. Not to mention some old UEFI
>> firmware, which doesn't provide a setting at all.
>> While the benefit of my approach is the flexibility.
>> Yes the i915, amdgpu and radeon are good quality,
>> but there may have programmers want to try nouveau.
>>
>>
>> The third use case is that VGAARB is also not reliable, It will
>> select a wrong device as primary. Especially on Arm64, Loongarch
>> and mips arch etc. And the X server will use this wrong device
>> as primary and completely crash there. Either because of lacking
>> a driver or the driver has a bug which can not bear the graphic
>> environment up. VGAARB is firmware dependent.
>> My patch provide a temporary method to rescue.
>>
> It sounds like we need a general purpose "primary" selector. I think
> it's sort of orthogonal to VGA. VGAARB is just for managing VGA
> routing on PCI. It's not really directly related to which GPU you
> want to be the primary when the OS loads. Maybe some new kernel
> parameter, E.g., primary_display=<string> where the string would be
> the specific device you wanted to be the primary, E.g., you could use
> the PCI BDF on PCI devices, primary_display=0000:0a:00.0 or some other
> device string for non-PCI devices.
Indeed, thanks for you give the right direction.
Bjorn and Willianson also told me something similar,
but I didn't realized until today.
historically, VGAARB is only for managing VGA compatible device.
For legacy BIOS environment, the routing related to this need to handled.
I didn't realized those technique point by the time this series is sending.
Beside this, It not uncommon that a specific machine ship multiple identical
GPUs models Or GPUs from the same company (likely a integrated one and a discrete one)
My naive approach in v4 cannot solve complex problems like this.
I will try to solve this problems and concerns at the next version.
Thanks.
> Alex
Powered by blists - more mailing lists