linux-kernel - Re: do_div() silently truncates "base" to 32bit

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <52212128.3000100@infradead.org>
Date:	Fri, 30 Aug 2013 15:48:08 -0700
From:	Randy Dunlap <rdunlap@...radead.org>
To:	Anatol Pomozov <anatol.pomozov@...il.com>
CC:	LKML <linux-kernel@...r.kernel.org>, Tejun Heo <tj@...nel.org>,
	bernie@...eler.com
Subject: Re: do_div() silently truncates "base" to 32bit

On 08/30/13 15:14, Anatol Pomozov wrote:
> Hi
> 
> On Fri, Aug 30, 2013 at 2:23 PM, Randy Dunlap <rdunlap@...radead.org> wrote:
>> On 08/30/13 10:21, Anatol Pomozov wrote:
>>> Hi,
>>>
>>>
>>> I was debugging weird "zero divide" problem in CFQ code below
>>>
>>>
>>> static u64 cfqg_prfill_avg_queue_size(struct seq_file *sf,
>>>       struct blkg_policy_data *pd, int off)
>>> {
>>>     struct cfq_group *cfqg = pd_to_cfqg(pd);
>>>     u64 samples = blkg_stat_read(&cfqg->stats.avg_queue_size_samples);
>>>     u64 v = 0;
>>>
>>>     if (samples) {
>>>         v = blkg_stat_read(&cfqg->stats.avg_queue_size_sum);
>>>         do_div(v, samples);
>>>     }
>>>     __blkg_prfill_u64(sf, pd, v);
>>>     return 0;
>>> }
>>>
>>>
>>> do_div() crashes says "zero divide". It is weird because just a few
>>> lines above we check divider for zero.
>>>
>>>
>>> The problem comes from include/asm-generic/div64.h file that
>>> implements do_div() as macros:
>>>
>>> # define do_div(n,base) ({ \
>>>     uint32_t __base = (base); \
>>>     uint32_t __rem; \
>>>     __rem = ((uint64_t)(n)) % __base; \
>>>     (n) = ((uint64_t)(n)) / __base; \
>>>     __rem; \
>>>  })
>>>
>>>
>>> Do you see the problem?
>>>
>>> The problem here is that "base" argument is truncated to 32bit, but in
>>> the function above "sample" is 64bit variable. If sample's 32 low bits
>>> are zero - we have a crash. in fact we have incorrect behavior any
>>> time when high 32bits are non-zero.
>>>
>>>
>>> My question is why the base is 32bit? Why not to use 64bit arguments?
>>
>> Maybe performance related?
>>
>> If you want 64-bit values, don't use do_div() from asm-generic/div64.h.
>>
>> Instead look at linux/math64.h and use div_u64_rem() et al
>> or the recently posted div64_u64_rem().
>> [posted by Mike Snitzer on Aug. 21 2013]
>>
>> I.e., use exactly the function(s) that you need to use.
>>
>> Does that fix the problem?
> 
> It definitely fixes the crash. I've already sent a patch to CFQ
> maillist http://news.gmane.org/gmane.linux.kernel.cgroups
> 
> But another question still remains: why compiler does not warn that
> size truncation happens? How to prevent bugs like CFQ one in the
> future? Should we add a compile-time assert to do_div() to prevent
> passing 64 numbers in "base" macro parameter?


That sounds like a fine idea to me.

-- 
~Randy
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/