lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f1b5499f-1c7c-4a40-b986-74a014897148@ursulin.net>
Date: Thu, 14 Aug 2025 13:10:02 +0100
From: Tvrtko Ursulin <tursulin@...ulin.net>
To: Liao Yuanhong <liaoyuanhong@...o.com>
Cc: Christian König <christian.koenig@....com>,
 Alex Deucher <alexander.deucher@....com>, David Airlie <airlied@...il.com>,
 Simona Vetter <simona@...ll.ch>,
 "open list:RADEON and AMDGPU DRM DRIVERS" <amd-gfx@...ts.freedesktop.org>,
 "open list:DRM DRIVERS" <dri-devel@...ts.freedesktop.org>,
 open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] drm/radeon: Move the memset() function call after the
 return statement



On 14/08/2025 12:56, Liao Yuanhong wrote:
> On 8/14/2025 7:07 PM, Tvrtko Ursulin wrote:
> 
>>
>> On 14/08/2025 11:11, Christian König wrote:
>>> On 14.08.25 11:39, Liao Yuanhong wrote:
>>>> Adjust the position of the memset() call to avoid unnecessary 
>>>> invocations.
>>>
>>> Hui? That doesn't seem to make much sense to me.
>>>
>>> We memset the local variable because we need to make sure that 
>>> everything (including padding bytes) is zeroed out.
>>>
>>> Even if that isn't sometimes necessary because of error handling we 
>>> clearly shouldn't try to optimize this.
>>
>> To interject with a curiousity, it is even worse to move the memset 
>> away from the declaration block when the often enabled 
>> CONFIG_INIT_STACK_ALL_ZERO is on. At least when they are close 
>> compiler can figure out it only needs to memset it once. Move them 
>> further apart and sometimes memset happens twice, yay.
>>
>> Main point when CONFIG_INIT_STACK_ALL_ZERO is on, and it often is, 
>> there is no way to avoid the sometimes pointless init. I have some 
>> local branches which try to address the worst offenders but it is so 
>> much in the noise that I don't think I ever sent them out.
>>
>> Regards,
>>
>> Tvrtko
>>
> If we converts memset() to = { },can it alleviate the problem?

Yes, for example I have this patch:

commit c780f63e20cb0fd66d31f3c1a37b37983a30f318 (240813-amd-csmemst)
Author: Tvrtko Ursulin <tvrtko.ursulin@...lia.com>
Date:   Wed Jul 17 08:28:56 2024 +0100

     drm/amdgpu: Remove hidden double memset from amdgpu_vm_pt_clear()

     When CONFIG_INIT_STACK_ALL_ZERO is set and so 
-ftrivial-auto-var-init=zero
     compiler option active, compiler fails to notice that later in
     amdgpu_vm_pt_clear() there  is a second memset to clear the same on 
stack
     struct amdgpu_vm_update_params.

     If we replace this memset with an explicit automatic variable 
initializer,
     compiler can then see it and avoid clearing this struct twice.

     Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@...lia.com>

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
index e39d6e7643bf..ecdc8fffe941 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
@@ -361,7 +361,7 @@ int amdgpu_vm_pt_clear(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
  {
         unsigned int level = adev->vm_manager.root_level;
         struct ttm_operation_ctx ctx = { true, false };
-       struct amdgpu_vm_update_params params;
+       struct amdgpu_vm_update_params params = { };
         struct amdgpu_bo *ancestor = &vmbo->bo;
         unsigned int entries;
         struct amdgpu_bo *bo = &vmbo->bo;
@@ -398,7 +398,6 @@ int amdgpu_vm_pt_clear(struct amdgpu_device *adev, 
struct amdgpu_vm *vm,
         if (r)
                 goto exit;

-       memset(&params, 0, sizeof(params));
         params.adev = adev;
         params.vm = vm;
         params.immediate = immediate;

Or a bit uglier, different approach, but more on fast path:

commit 31c9f97d69dcc455fe45eb297bcb06c2b87d8204
Author: Tvrtko Ursulin <tvrtko.ursulin@...lia.com>
Date:   Wed Jul 17 08:28:45 2024 +0100

     drm/amdgpu: Remove hidden double memset from amdgpu_cs_ioctl()

     When CONFIG_INIT_STACK_ALL_ZERO is set and so 
-ftrivial-auto-var-init=zero
     compiler option active, compiler fails to notice that inside
     amdgpu_cs_parser_init() there is a second memset to clear the same on
     stack struct amdgpu_cs_parser.

     If we pull this memset one level out, into the amdgpu_cs_ioctl(), 
compiler
     can then see it and avoid clearing this struct twice.

     Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@...lia.com>

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 9aa952f258cf..554289eb1913 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -51,7 +51,6 @@ static int amdgpu_cs_parser_init(struct 
amdgpu_cs_parser *p,
         if (cs->in.num_chunks == 0)
                 return -EINVAL;

-       memset(p, 0, sizeof(*p));
         p->adev = adev;
         p->filp = filp;

@@ -1411,6 +1410,7 @@ int amdgpu_cs_ioctl(struct drm_device *dev, void 
*data, struct drm_file *filp)
         if (!adev->accel_working)
                 return -EBUSY;

+       memset(&parser, 0, sizeof(parser));
         r = amdgpu_cs_parser_init(&parser, adev, filp, data);
         if (r) {
                 DRM_ERROR_RATELIMITED("Failed to initialize parser 
%d!\n", r);

Regards,

Tvrtko

> 
> 
> Thanks,
> 
> Liao
> 
>>>> Signed-off-by: Liao Yuanhong <liaoyuanhong@...o.com>
>>>> ---
>>>>   drivers/gpu/drm/radeon/atombios_crtc.c     |  8 ++++----
>>>>   drivers/gpu/drm/radeon/atombios_encoders.c | 20 ++++++++++----------
>>>>   2 files changed, 14 insertions(+), 14 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/radeon/atombios_crtc.c b/drivers/gpu/ 
>>>> drm/radeon/atombios_crtc.c
>>>> index 9b3a3a9d60e2..0aae84f5e27c 100644
>>>> --- a/drivers/gpu/drm/radeon/atombios_crtc.c
>>>> +++ b/drivers/gpu/drm/radeon/atombios_crtc.c
>>>> @@ -770,13 +770,13 @@ static void 
>>>> atombios_crtc_set_disp_eng_pll(struct radeon_device *rdev,
>>>>       int index;
>>>>       union set_pixel_clock args;
>>>>   -    memset(&args, 0, sizeof(args));
>>>> -
>>>>       index = GetIndexIntoMasterTable(COMMAND, SetPixelClock);
>>>>       if (!atom_parse_cmd_header(rdev->mode_info.atom_context, 
>>>> index, &frev,
>>>>                      &crev))
>>>>           return;
>>>>   +    memset(&args, 0, sizeof(args));
>>>> +
>>>>       switch (frev) {
>>>>       case 1:
>>>>           switch (crev) {
>>>> @@ -832,12 +832,12 @@ static void atombios_crtc_program_pll(struct 
>>>> drm_crtc *crtc,
>>>>       int index = GetIndexIntoMasterTable(COMMAND, SetPixelClock);
>>>>       union set_pixel_clock args;
>>>>   -    memset(&args, 0, sizeof(args));
>>>> -
>>>>       if (!atom_parse_cmd_header(rdev->mode_info.atom_context, 
>>>> index, &frev,
>>>>                      &crev))
>>>>           return;
>>>>   +    memset(&args, 0, sizeof(args));
>>>> +
>>>>       switch (frev) {
>>>>       case 1:
>>>>           switch (crev) {
>>>> diff --git a/drivers/gpu/drm/radeon/atombios_encoders.c b/drivers/ 
>>>> gpu/drm/radeon/atombios_encoders.c
>>>> index d1c5e471bdca..f706e21a3509 100644
>>>> --- a/drivers/gpu/drm/radeon/atombios_encoders.c
>>>> +++ b/drivers/gpu/drm/radeon/atombios_encoders.c
>>>> @@ -492,11 +492,11 @@ atombios_dvo_setup(struct drm_encoder 
>>>> *encoder, int action)
>>>>       int index = GetIndexIntoMasterTable(COMMAND, DVOEncoderControl);
>>>>       uint8_t frev, crev;
>>>>   -    memset(&args, 0, sizeof(args));
>>>> -
>>>>       if (!atom_parse_cmd_header(rdev->mode_info.atom_context, 
>>>> index, &frev, &crev))
>>>>           return;
>>>>   +    memset(&args, 0, sizeof(args));
>>>> +
>>>>       /* some R4xx chips have the wrong frev */
>>>>       if (rdev->family <= CHIP_RV410)
>>>>           frev = 1;
>>>> @@ -856,8 +856,6 @@ atombios_dig_encoder_setup2(struct drm_encoder 
>>>> *encoder, int action, int panel_m
>>>>       if (dig->dig_encoder == -1)
>>>>           return;
>>>>   -    memset(&args, 0, sizeof(args));
>>>> -
>>>>       if (ASIC_IS_DCE4(rdev))
>>>>           index = GetIndexIntoMasterTable(COMMAND, DIGxEncoderControl);
>>>>       else {
>>>> @@ -870,6 +868,8 @@ atombios_dig_encoder_setup2(struct drm_encoder 
>>>> *encoder, int action, int panel_m
>>>>       if (!atom_parse_cmd_header(rdev->mode_info.atom_context, 
>>>> index, &frev, &crev))
>>>>           return;
>>>>   +    memset(&args, 0, sizeof(args));
>>>> +
>>>>       switch (frev) {
>>>>       case 1:
>>>>           switch (crev) {
>>>> @@ -1453,11 +1453,11 @@ atombios_external_encoder_setup(struct 
>>>> drm_encoder *encoder,
>>>>               (radeon_connector->connector_object_id & 
>>>> OBJECT_ID_MASK) >> OBJECT_ID_SHIFT;
>>>>       }
>>>>   -    memset(&args, 0, sizeof(args));
>>>> -
>>>>       if (!atom_parse_cmd_header(rdev->mode_info.atom_context, 
>>>> index, &frev, &crev))
>>>>           return;
>>>>   +    memset(&args, 0, sizeof(args));
>>>> +
>>>>       switch (frev) {
>>>>       case 1:
>>>>           /* no params on frev 1 */
>>>> @@ -1853,11 +1853,11 @@ atombios_set_encoder_crtc_source(struct 
>>>> drm_encoder *encoder)
>>>>       uint8_t frev, crev;
>>>>       struct radeon_encoder_atom_dig *dig;
>>>>   -    memset(&args, 0, sizeof(args));
>>>> -
>>>>       if (!atom_parse_cmd_header(rdev->mode_info.atom_context, 
>>>> index, &frev, &crev))
>>>>           return;
>>>>   +    memset(&args, 0, sizeof(args));
>>>> +
>>>>       switch (frev) {
>>>>       case 1:
>>>>           switch (crev) {
>>>> @@ -2284,11 +2284,11 @@ atombios_dac_load_detect(struct drm_encoder 
>>>> *encoder, struct drm_connector *conn
>>>>           int index = GetIndexIntoMasterTable(COMMAND, 
>>>> DAC_LoadDetection);
>>>>           uint8_t frev, crev;
>>>>   -        memset(&args, 0, sizeof(args));
>>>> -
>>>>           if (!atom_parse_cmd_header(rdev->mode_info.atom_context, 
>>>> index, &frev, &crev))
>>>>               return false;
>>>>   +        memset(&args, 0, sizeof(args));
>>>> +
>>>>           args.sDacload.ucMisc = 0;
>>>>             if ((radeon_encoder->encoder_id == 
>>>> ENCODER_OBJECT_ID_INTERNAL_DAC1) ||
>>>
>>


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ