Message-ID: <CADnq5_OYPha5cGF+tSj4fvSmf-6tObzNSe2__nG-SbjX6v_2vw@mail.gmail.com>
Date:   Wed, 6 Sep 2023 12:00:53 -0400
From:   Alex Deucher <alexdeucher@...il.com>
To:     suijingfeng <suijingfeng@...ngson.cn>
Cc:     Christian König <christian.koenig@....com>,
        Sui Jingfeng <sui.jingfeng@...ux.dev>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        nouveau@...ts.freedesktop.org, intel-gfx@...ts.freedesktop.org,
        linux-kernel@...r.kernel.org, dri-devel@...ts.freedesktop.org,
        amd-gfx@...ts.freedesktop.org, linux-pci@...r.kernel.org,
        Alex Deucher <alexander.deucher@....com>
Subject: Re: [RFC,drm-misc-next v4 3/9] drm/radeon: Implement .be_primary() callback

On Tue, Sep 5, 2023 at 1:25 PM suijingfeng <suijingfeng@...ngson.cn> wrote:
>
> Hi,
>
>
> On 2023/9/5 13:50, Christian König wrote:
> > Am 04.09.23 um 21:57 schrieb Sui Jingfeng:
> >> From: Sui Jingfeng <suijingfeng@...ngson.cn>
> >>
> >> On a machine with multiple GPUs, a Linux user has no control over
> >> which one is primary at boot time.
> >
> > Question is why is that useful? Should we give users the ability to
> > control that?
> >
> > I don't see a use case for this.
> >
>
> On a machine with multiple GPUs installed, only the
> primary graphics device gets POST-ed (initialized) by the firmware.
> Therefore the DRM drivers for the remaining video cards have to
> work without the prerequisite setup done by the firmware. This
> setup is called POST.

I think that should be regarded as a bug in the driver, and this
series would not help with that case.  If a driver can't
initialize a device without aid from the pre-OS environment, that
should be fixed in the driver.  This solution also doesn't change which
device the pre-OS environment selects as the primary.  That can
only be fixed in the pre-OS environment code.

>
> One of the use cases is to test whether a specific DRM driver
> works properly when the device has not been
> POST-ed. The ast drm driver is the first one which refused
> to work if not POST-ed by the firmware.
>
> Before applying this series, I was unable to make drm/ast the
> primary video card easily. The problem is that in a multiple
> video card configuration, the monitor connected to my
> AST2400 card does not light up. Confusingly, a naive programmer
> may suspect that PRIME is not working.
>
> After applying this series and passing ast.modeset=10 on the
> kernel cmd line, I found that the monitor connected to my
> AST2400 video card was still black. It did not display
> any image.

The problem with adding modeset=10 is that it only helps when you have
one GPU driven by that driver in the system.  If you have multiple
GPUs driven by that driver, which one would that apply to?  E.g., what
if you have 2 AMD GPUs in the system?

>
> While studying drm/ast, I learned that the drm/ast
> driver ships the POST code; see the ast_post_gpu() function.
> Then I was wondering why this function doesn't work.
>
> After a short (hasty) debugging session, I found that the ast_post_gpu()
> function didn't get run, because it depends on
> ast->config_mode. Without thinking too much, I hardcoded
> ast->config_mode as ast_use_p2a; the key point is to force the
> ast_post_gpu() function to run.
>
>
> ```
>
> --- a/drivers/gpu/drm/ast/ast_main.c
> +++ b/drivers/gpu/drm/ast/ast_main.c
> @@ -132,6 +132,8 @@ static int ast_device_config_init(struct ast_device *ast)
>                  }
>          }
>
> +       ast->config_mode = ast_use_p2a;
> +
>          switch (ast->config_mode) {
>          case ast_use_defaults:
>                  drm_info(dev, "Using default configuration\n");
>
> ```
>
> Then the monitor lit up and displayed the Ubuntu greeter. Therefore
> my patch is useful, at least for Linux drm driver testers and developers.
> It allows programmers to test a specific part of a specific driver without
> changing a line of the source code and without needing sudo authority.
>
> It improves the efficiency of testing and patch verification. I know about
> the PrimaryGPU option of the Xorg conf, but that approach persists the
> setup once made; you need to modify it with root authority each time
> you want to switch the primary. But when rapidly developing
> and/or testing multiple video drivers with only one computer
> available, what we really want is a one-shot command, as provided
> by this series.  So, this is the first use case.
>
>
> The second use case is that sometimes the firmware is not reliable.
> While there are thousands of ARM64, PowerPC and MIPS server machines,
> most of them don't have good UEFI firmware support. I haven't tested
> drm/amdgpu and drm/radeon on my ARM64 server yet, because this ARM64
> server always uses the platform (BMC) integrated display controller as primary.
> Its UEFI firmware does not provide an options menu to tune this.
> So, at first, the discrete card becomes useless, despite being more powerful.
> I will take time to carry on the testing, so I will be able to tell more
> in the future.
>
>
> Even on X86, when selecting PEG as primary in the UEFI BIOS menu,
> there is no way to tell the BIOS which one of my three
> discrete video cards should be the primary. Not to mention some old UEFI
> firmware, which doesn't provide a setting at all.
> The benefit of my approach is the flexibility.
> Yes, i915, amdgpu and radeon are good quality,
> but there may be programmers who want to try nouveau.
>
>
> The third use case is that VGAARB is also not reliable. It will
> select the wrong device as primary, especially on Arm64, LoongArch
> and MIPS. The X server will then use this wrong device
> as primary and crash completely, either because a driver
> is lacking or because the driver has a bug and cannot bring the graphics
> environment up. VGAARB is firmware dependent.
> My patch provides a temporary rescue method.
>

It sounds like we need a general purpose "primary" selector.  I think
it's sort of orthogonal to VGA.  VGAARB is just for managing VGA
routing on PCI.  It's not really directly related to which GPU you
want to be the primary when the OS loads.  Maybe some new kernel
parameter, e.g., primary_display=<string>, where the string would be
the specific device you wanted to be the primary.  E.g., you could use
the PCI BDF on PCI devices, primary_display=0000:0a:00.0, or some other
device string for non-PCI devices.
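
To make the BDF idea a bit more concrete, here is a rough userspace
sketch of how a string like 0000:0a:00.0 could be parsed and compared
against a device.  This is only an illustration, not kernel code:
struct pci_bdf, parse_pci_bdf() and pci_bdf_equal() are hypothetical
names, and it assumes the usual domain:bus:device.function hex layout.

```c
#include <stdio.h>

/* Hypothetical holder for a parsed PCI address (illustration only). */
struct pci_bdf {
	unsigned int domain;
	unsigned int bus;
	unsigned int dev;
	unsigned int fn;
};

/*
 * Parse "DDDD:BB:DD.F" (all fields hexadecimal) into *out.
 * Returns 0 on success, -1 if the string is malformed, has trailing
 * characters, or the device/function number is out of range.
 */
int parse_pci_bdf(const char *s, struct pci_bdf *out)
{
	unsigned int domain, bus, dev, fn;
	int consumed = 0;

	if (sscanf(s, "%4x:%2x:%2x.%1x%n",
		   &domain, &bus, &dev, &fn, &consumed) != 4)
		return -1;
	if (s[consumed] != '\0')	/* reject trailing garbage */
		return -1;
	if (dev > 0x1f || fn > 7)	/* 5-bit device, 3-bit function */
		return -1;

	out->domain = domain;
	out->bus = bus;
	out->dev = dev;
	out->fn = fn;
	return 0;
}

/* Return 1 if both addresses name the same PCI function, else 0. */
int pci_bdf_equal(const struct pci_bdf *a, const struct pci_bdf *b)
{
	return a->domain == b->domain && a->bus == b->bus &&
	       a->dev == b->dev && a->fn == b->fn;
}
```

With this, parse_pci_bdf("0000:0a:00.0", &b) fills in domain 0, bus
0x0a, device 0, function 0, and a driver-side check would reduce to
comparing that against the device's own address.  An exact BDF match
also sidesteps the "which of the two GPUs" ambiguity that a per-driver
option has.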

Alex
