lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 11 Feb 2009 17:32:09 +0100
From:	Borislav Petkov <petkovbb@...glemail.com>
To:	Sergei Shtylyov <sshtylyov@...mvista.com>
Cc:	Bartlomiej Zolnierkiewicz <bzolnier@...il.com>,
	linux-ide@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 5/6] ide: remove ide_execute_pkt_cmd()

[..]

>>>>
>>>> How about we finally add those check macros in block layer fashion like
>>>> blk_pc_request et al and do
>>>>
>>>> #define drv_can_drq_interrupt(drive)    ((drive)->atapi_flags &
>>>> IDE_AFLAG_DRQ_INTERRUPT)
>>>>
>>>>
>>>
>>>  I suppose it's for the devices that interrupt on packet DRQ? Then it's
>>> hardly a good name because it's not like this is some optional
>>> capability.
>>>
>>
>> No, I was alluding to the command packet DRQ type used by the device as it
>> is
>> put in SFF8020i, 7.1.7.1 General Configuration Word.
>>
>
>  I was talking about exactly the same feature. :-)

Ok, I agree, the names could be a bit more explanatory w.r.t of the DRQ type:

ide_drv_microprocessor_drq
ide_drv_interrupt_drq
ide_drv_accelerated_drq

although we use only one type currently.

>>>> or similar instead of wasting stack space?
>>>>
>>>
>>>  It doesn't necessarily waste stack space. Haven't you heard about
>>> compiler
>>> putting local vairables into registers?
>>>
>>
>> Yes, have you heard of unnecessary register spilling?
>>
>
>  No -- only about stack spilling on CPUs "caching" the top of stack in their
> register file (like SPARC).
>  Linux runs not only on x86 and many RISCs can store several local variables
> in the dedicated registers -- it's the part of say MIPS ABIs...

>>>> It'll also read better in the if() check:
>>>>
>>>> if (drv_can_irq_interrupt(drive)) { ...
>>>>
>>>>
>>>
>>>  It's faster to checj a local variable than to dereference drv several
>>> times
>>> -- unless gcc optimizes that away (by creating an implicit local variable
>>> :-).


Let's look at an example:

In ide-cd.c:cdrom_newpc_intr() you have the following code snippet:

<ide-cd.c>
 799         thislen = blk_fs_request(rq) ? len : rq->data_len;
 800         if (thislen > len)
 801                 thislen = len;
 802
 803         ide_debug_log(IDE_DBG_PC, "%s: DRQ: stat: 0x%x, thislen: %d\n",
 804                       __func__, stat, thislen);
 805
 806         /* If DRQ is clear, the command has completed. */
 807         if ((stat & ATA_DRQ) == 0) {
 808                 if (blk_fs_request(rq)) {
</ide-cd.c>

Now watch the blk_fs_request() thing.

Here's what my gcc¹ spits out:

<ide-cd.s>
.LVL174:
        .loc 1 799 0
        movl    76(%r12), %ecx  # <variable>.cmd_type, prephitmp.1128
        cmpl    $1, %ecx        #, prephitmp.1128
        movl    %ecx, %r8d      # prephitmp.1128, prephitmp.1047
        je      .L225   #,
.LVL175:
        .loc 1 800 0
        movzwl  -44(%rbp), %r15d        # len, thislen
.LVL176:
        .loc 1 799 0
        movl    280(%r12), %edx # <variable>.data_len, thislen.1129
.LVL177:
        .loc 1 800 0
        cmpl    %r15d, %edx     # thislen, thislen.1129
        movl    %r15d, %ebx     # thislen, thislen.1163
        jg      .L145   #,
.LVL178:
        movl    %edx, %r15d     # thislen.1129, thislen
.LVL179:
.L145:
        .loc 1 807 0
        testb   $8, -64(%rbp)   #, stat
        jne     .L147   #,
.LVL180:
        .loc 1 808 0
        cmpl    $1, %ecx        #, prephitmp.1128
        je      .L226   #,
        .loc 1 825 0
        cmpl    $2, %ecx        #, prephitmp.1128
        .p2align 4,,3
        .p2align 3
        je      .L152   #,
.LBB408:
</ide-cd.s>

and at label .LVL174 you see the blk_fs_request() check from line
799 above. Later, at label .LVL180 you see the next blk_fs_request() check from
line 808 and this is cached in %ecx so gcc is smart enough to do that. So,
actually you get the same thing/or even better with variables in registers
instead of on stack and the code is more readable. A win-win
situation, I'd say :).

>>
>> I hope gcc is smart enough to do that.
>>
>
>  Then where's the win?

Readability, of course. Also you have smaller, cleaner code. This is very
important, IMO.


---
¹gcc (Debian 4.3.3-3) 4.3.3
Copyright (C) 2008 Free Software Foundation, Inc.


-- 
Regards/Gruss,
Boris
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ