linux-kernel - Re: [PATCH 3/5] mm/vmalloc.c: correct lazy_max

Message-ID: <a98b4ca9-11d1-6510-63c9-f63897129be3@zoho.com>
Date:   Fri, 23 Sep 2016 13:00:35 +0800
From:   zijun_hu <zijun_hu@...o.com>
To:     Nicholas Piggin <npiggin@...il.com>
Cc:     zijun_hu@....com, Michal Hocko <mhocko@...nel.org>,
        npiggin@...e.de, David Rientjes <rientjes@...gle.com>,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>, tj@...nel.org,
        mingo@...nel.org, iamjoonsoo.kim@....com,
        mgorman@...hsingularity.net
Subject: Re: [PATCH 3/5] mm/vmalloc.c: correct lazy_max_pages() return value

On 2016/9/23 11:30, Nicholas Piggin wrote:
> On Fri, 23 Sep 2016 00:30:20 +0800
> zijun_hu <zijun_hu@...o.com> wrote:
> 
>> On 2016/9/22 20:37, Michal Hocko wrote:
>>> On Thu 22-09-16 09:13:50, zijun_hu wrote:  
>>>> On 09/22/2016 08:35 AM, David Rientjes wrote:  
>>> [...]  
>>>>> The intent is as it is implemented; with your change, lazy_max_pages() is 
>>>>> potentially increased depending on the number of online cpus.  This is 
>>>>> only a heuristic, changing it would need justification on why the new
>>>>> value is better.  It is opposite to what the comment says: "to be 
>>>>> conservative and not introduce a big latency on huge systems, so go with
>>>>> a less aggressive log scale."  NACK to the patch.
>>>>>  
>>>> my change potentially decreases lazy_max_pages() rather than increasing it, so it
>>>> seems to conform with the comment
>>>>
>>>> if the number of online CPUs is not a power of 2, the two make no difference;
>>>> otherwise, my change keeps the power-of-2 value, while the original code rounds up
>>>> to the next power of 2, for instance
>>>>
>>>> my change : (32, 64] -> 64
>>>> 	     32 -> 32, 64 -> 64
>>>> the original code: [32, 64) -> 64
>>>>                    32 -> 64, 64 -> 128  
>>>
>>> You still completely failed to explain _why_ this is an improvement/fix
>>> or why it matters. This all should be in the changelog.
>>>   
>>
>> Hi npiggin,
>> could you comment on this patch, since lazy_max_pages() was introduced
>> by you?
>>
>> my patch is based mainly on the difference between fls() and get_count_order(),
>> which is shown below;
>> more MM experts may be able to help decide which is more suitable
>>
>> for a parameter > 1, the two return different values only when the parameter is
>> a power of two, for example
>>
>> fls(32) = 6 VS get_count_order(32) = 5
>> fls(33) = 6 VS get_count_order(33) = 6
>> fls(63) = 6 VS get_count_order(63) = 6
>> fls(64) = 7 VS get_count_order(64) = 6
>>
>> @@ -594,7 +594,9 @@ static unsigned long lazy_max_pages(void) 
>> { 
>>     unsigned int log; 
>>
>> -    log = fls(num_online_cpus()); 
>> +    log = num_online_cpus(); 
>> +    if (log > 1) 
>> +        log = (unsigned int)get_count_order(log); 
>>
>>     return log * (32UL * 1024 * 1024 / PAGE_SIZE); 
>> } 
>>
> 
> To be honest, I don't think I chose it with a lot of analysis.
> It will depend on the kernel usage patterns, the arch code,
> and the CPU microarchitecture, all of which would have changed
> significantly.
> 
> I wouldn't bother changing it unless you do some benchmarking
> on different system sizes to see where the best performance is.
> (If performance is equal, fewer lazy pages would be better.)
> 
> Good to see you taking a look at this vmalloc stuff. Don't be
> discouraged if you run into some dead ends.
> 
> Thanks,
> Nick
> 
thanks for your reply
please disregard this patch, since i am not in a position to run the
tests and comparisons it would need

i just feel my change may be more consistent with the operation of
rounding up to a power of 2