phc-discussions - Re: [PHC] Re: The best of the best, IMO


Message-ID: <CALCETrVQ9MinbayWvLrC4X47AhbhnUb5kXzxH4jxJxsDAKH-2g@mail.gmail.com>
Date: Wed, 16 Apr 2014 16:05:41 -0700
From: Andy Lutomirski <luto@...capital.net>
To: discussions <discussions@...sword-hashing.net>
Subject: Re: [PHC] Re: The best of the best, IMO

On Wed, Apr 16, 2014 at 3:59 PM, Bill Cox <waywardgeek@...il.com> wrote:
> On Wed, Apr 16, 2014 at 6:32 PM, Andy Lutomirski <luto@...capital.net>
> wrote:
>>
>> On Mon, Apr 14, 2014 at 6:06 AM, Bill Cox <waywardgeek@...il.com> wrote:
>> >
>> > I have a similar problem with my t_cost parameter in TwoCats.  I use
>> > it to balance external memory bandwidth and internal cache bandwidth,
>> > ideally maxing out both at the same time to provide two levels of
>> > defense.  However, if used to run a long time hashing only a small
>> > amount of memory, it will hash the same two blocks together many times
>> > before writing the result block, lowering external memory bandwidth to
>> > nearly 0.  I could have made t_cost repeat the entire memory hash
>> > operation like several entries do, but then I could not use it to
>> > balance cache and external memory bandwidth at the same time.  Also,
>> > some users may prefer to have TwoCats avoid maxing out external memory
>> > bandwidth, and this gives them a knob to do that.  Rather than confuse
>> > users with two separate time cost parameters, I chose to keep only the
>> > one I find of higher value.  As a work-around, a user could just call
>> > TwoCats repeatedly, providing his own outer loop.  The same can be
>> > done with Centrifuge, with t_cost set to 0, substantially increasing
>> > its memory bandwidth.  However, in CFB mode, and writing only to
>> > on-chip cache, Centrifuge cannot approach maxing out cache memory
>> > bandwidth, regardless of settings.
>> >
>>
>> I understand that this is needed to fit within the PHC framework, but
>> this sounds like it'll cause screwups if you win and it stays like
>> this (or if anyone else with the same issue wins).  For example, what
>> if I want a really strong FDE hash, but I only have 4 GB of RAM?
>>
>> Would it make sense to adjust the real API (e.g.
>> TwoCats_HashPasswordFull) to accept parameters for the total time to
>> use (in arbitrary units), total memory to use (in bytes or some real
>> unit) and something to control the cache/RAM bandwidth ratio?  It
>> would be okay if not all combinations or arguments are valid (e.g. if
>> you ask for very large memory and very short time).
>>
>> This way I could say "I'm willing to hash for 2 seconds and I can use
>> 3GB of RAM, and tune for average systems in 2014" and get reasonable
>> behavior?
>>
>> --Andy
>
>
> Yes, that would be fine.  I wrote a simple wrapper to find these parameters
> automatically:
>
> // Find parameter settings on this machine for a given desired runtime and
> // maximum memory usage.  maxMem is in KiB.  Runtime will typically be
> // within +/- 50% and memory will be <= maxMem.
> void TwoCats_FindCostParameters(TwoCats_HashType hashType,
>     uint32_t milliSeconds, uint32_t maxMem,
>     uint8_t *memCost, uint8_t *timeCost, uint8_t *multiplies,
>     uint8_t *lanes);
>
> This function first finds a memory cost setting to take up the allotted
> time, up to maxMem, and if successful, it increases multiplies and timeCost
> to the point that they start slowing hashing down.  This gives close to the
> max memory achievable in the given time, while also providing compute-time
> hardening through multiplication chains, internal cache bandwidth, and
> external DRAM bandwidth.
>
> I've got a problem when maxMem is too low for a given runtime.  In that
> case, it's probably best to iterate the whole algorithm a number of times to
> fill up the desired runtime at maxMem.  I could add this as an external loop
> around the algorithm.
>

I would suggest doing that as part of the provided API, because people
will screw it up if they're supposed to do it on their own.
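
To make the idea concrete, here is a minimal sketch of what a built-in
outer loop could look like.  It is only an illustration under stated
assumptions: toy_memory_pass is a hypothetical stand-in for one full run
of the memory-hard algorithm (it is not TwoCats and not secure), and the
names hash_with_outer_loop, passes_for_runtime and HASH_LEN are made up
for this example.  Each pass feeds the previous output plus a pass
counter back in as input, so the passes cannot be run in parallel.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HASH_LEN 32

/* Hypothetical stand-in for one full memory-hard pass.
 * NOT TwoCats and NOT cryptographically secure; illustration only. */
static int toy_memory_pass(uint8_t hash[HASH_LEN], const uint8_t *in,
                           size_t in_len, uint32_t mem_kib)
{
    if (mem_kib == 0)
        return -1;
    size_t words = (size_t)mem_kib * 1024 / sizeof(uint64_t);
    uint64_t *mem = malloc(words * sizeof *mem);
    if (mem == NULL)
        return -1;

    /* Absorb the input into a 64-bit state. */
    uint64_t state = 0x9e3779b97f4a7c15ULL;
    for (size_t i = 0; i < in_len; i++)
        state = (state ^ in[i]) * 0x100000001b3ULL;

    /* Fill the buffer, then do data-dependent reads over it. */
    for (size_t i = 0; i < words; i++) {
        state = state * 6364136223846793005ULL + 1442695040888963407ULL;
        mem[i] = state;
    }
    for (size_t i = 0; i < words; i++) {
        state ^= mem[state % words];
        state = state * 6364136223846793005ULL + 1442695040888963407ULL;
    }

    /* Squeeze out the result. */
    for (size_t i = 0; i < HASH_LEN; i++) {
        state = state * 6364136223846793005ULL + 1442695040888963407ULL;
        hash[i] = (uint8_t)(state >> 56);
    }
    free(mem);
    return 0;
}

/* How many full passes fit into target_ms, given that one measured pass
 * took pass_ms.  Rounds down, but always runs at least one pass. */
uint32_t passes_for_runtime(uint32_t target_ms, uint32_t pass_ms)
{
    if (pass_ms == 0)
        pass_ms = 1;
    uint32_t n = target_ms / pass_ms;
    return n == 0 ? 1 : n;
}

/* Run the whole algorithm `iterations` times at a fixed memory size,
 * chaining each output (plus a pass counter) into the next pass.
 * `input` is assumed to already combine password and salt. */
int hash_with_outer_loop(uint8_t hash[HASH_LEN], const uint8_t *input,
                         size_t input_len, uint32_t mem_kib,
                         uint32_t iterations)
{
    uint8_t block[HASH_LEN + 4];

    if (iterations == 0)
        return -1;
    if (toy_memory_pass(hash, input, input_len, mem_kib) != 0)
        return -1;
    for (uint32_t i = 1; i < iterations; i++) {
        memcpy(block, hash, HASH_LEN);
        for (int k = 0; k < 4; k++)
            block[HASH_LEN + k] = (uint8_t)(i >> (8 * k));
        if (toy_memory_pass(hash, block, sizeof block, mem_kib) != 0)
            return -1;
    }
    return 0;
}
```

With something like this, a caller who is capped at maxMem but wants a
longer runtime would time a single pass, compute
passes_for_runtime(2000, measured_ms), and hand that count to
hash_with_outer_loop, instead of writing the chaining by hand.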

--Andy