lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 14 Nov 2019 10:32:31 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Ian Rogers <irogers@...gle.com>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...hat.com>,
        Namhyung Kim <namhyung@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Masahiro Yamada <yamada.masahiro@...ionext.com>,
        Kees Cook <keescook@...omium.org>,
        Catalin Marinas <catalin.marinas@....com>,
        Petr Mladek <pmladek@...e.com>,
        Mauro Carvalho Chehab <mchehab+samsung@...nel.org>,
        Qian Cai <cai@....pw>, Joe Lawrence <joe.lawrence@...hat.com>,
        Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
        Sri Krishna chowdary <schowdary@...dia.com>,
        "Uladzislau Rezki (Sony)" <urezki@...il.com>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Changbin Du <changbin.du@...el.com>,
        Ard Biesheuvel <ardb@...nel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Kent Overstreet <kent.overstreet@...il.com>,
        Gary Hook <Gary.Hook@....com>, Arnd Bergmann <arnd@...db.de>,
        Kan Liang <kan.liang@...ux.intel.com>,
        linux-kernel@...r.kernel.org,
        Stephane Eranian <eranian@...gle.com>,
        Andi Kleen <ak@...ux.intel.com>
Subject: Re: [PATCH v3 02/10] lib: introduce generic min max heap

On Wed, Nov 13, 2019 at 04:30:34PM -0800, Ian Rogers wrote:
> Based-on-work-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> Signed-off-by: Ian Rogers <irogers@...gle.com>
> ---
>  include/linux/min_max_heap.h | 134 ++++++++++++++++++++++++
>  lib/Kconfig.debug            |  10 ++
>  lib/Makefile                 |   1 +
>  lib/test_min_max_heap.c      | 194 +++++++++++++++++++++++++++++++++++
>  4 files changed, 339 insertions(+)
>  create mode 100644 include/linux/min_max_heap.h
>  create mode 100644 lib/test_min_max_heap.c
> 
> diff --git a/include/linux/min_max_heap.h b/include/linux/min_max_heap.h
> new file mode 100644
> index 000000000000..ea7764a8252a
> --- /dev/null
> +++ b/include/linux/min_max_heap.h
> @@ -0,0 +1,134 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _LINUX_MIN_MAX_HEAP_H
> +#define _LINUX_MIN_MAX_HEAP_H
> +
> +#include <linux/bug.h>
> +#include <linux/string.h>
> +

Make this a kerneldoc comment and loose the comments in the structure.

> +/*
> + * Data structure used to hold a min or max heap, the number of elements varies
> + * but the maximum size is fixed.
> + */
> +struct min_max_heap {
> +	/* Start of array holding the heap elements. */
> +	void *data;
> +	/* Number of elements currently in min-heap. */
> +	int size;
> +	/* Maximum number of elements that can be held in current storage. */
> +	int cap;

You've got the naming all wrong; size is the size of the allocation, num
is the number of elements in use.

> +};
> +

Maybe do a kerneldoc comment for this structure too, that keeps the
definition less cluttered.

/**
 * struct min_max_heap_callbacks - const data/functions to customise the minmax heap
 * @elem_size:		the size of each element in bytes
 * @cmp:		partial order function for this heap
 *			'less'/'<' for min-heap, 'greater'/'>' for max-heap
 * @swp:		swap function.
 */
> +struct min_max_heap_callbacks {
> +	/* Size of elements in the heap. */
> +	int elem_size;
> +	/*
> +	 * A function which returns *lhs < *rhs or *lhs > *rhs depending on
> +	 * whether this is a min or a max heap. Note, another compare function
> +	 * style in the kernel will return -ve, 0 and +ve and won't handle
> +	 * minimum integer correctly if implemented as a subtract.
> +	 */
> +	bool (*cmp)(const void *lhs, const void *rhs);
> +	/* Swap the element values at lhs with those at rhs. */
> +	void (*swp)(void *lhs, void *rhs);
> +};

Personally I'd just call the whole thing a minheap and call the compare
function less and leave it at that. Sure if you flip the order you'll
get a maxheap but that's fairly obvious.

> +
> +/* Sift the element at pos down the heap. */
> +static inline void heapify(struct min_max_heap *heap, int pos,

Given this lives in the global namespace (this is C), maybe pick a
slightly more specific name, like min_max_heapify().

> +			const struct min_max_heap_callbacks *func) {

This is against coding style. Functions get their opening brace on a new
line. The rest of your patch has this right, why not this one?

> +	void *left_child, *right_child, *parent, *large_or_smallest;

I'm not a fan of excessively long variable names, they make it so much
harder to read code.

	void *left, *right, *parent, *pivot;

> +	char *data = (char *)heap->data;

What's the deal with that char nonsense? GCC does void* arithmetic just
right, also C will silently cast void* to any other pointer type.

> +
> +	for (;;) {
> +		if (pos * 2 + 1 >= heap->size)
> +			break;
> +
> +		left_child = data + ((pos * 2 + 1) * func->elem_size);
> +		parent = data + (pos * func->elem_size);

You can reduce the number of multiplications here. You have 3, and IIRC
you only need 1.

Set parent before the loop, compute right as left + size, and hand
either down as parent for the next iteration.

> +		large_or_smallest = parent;
> +		if (func->cmp(left_child, large_or_smallest))
> +			large_or_smallest = left_child;
> +
> +		if (pos * 2 + 2 < heap->size) {
> +			right_child = data + ((pos * 2 + 2) * func->elem_size);
> +			if (func->cmp(right_child, large_or_smallest))
> +				large_or_smallest = right_child;
> +		}
> +		if (large_or_smallest == parent)
> +			break;
> +		func->swp(large_or_smallest, parent);
> +		if (large_or_smallest == left_child)
> +			pos = (pos * 2) + 1;
> +		else
> +			pos = (pos * 2) + 2;
> +	}

I'm a little confused, normally (2*pos) is left and (2*pos+1) is right,
you seem to have used (2*pos + 1) and (2*pos + 2).

Also, I'm thinking the above can be helped with a little helper:

static inline int min_max_child(int pos, bool right)
{
	return 2 * pos + 1 + right;
}

> +}
> +
> +/* Floyd's approach to heapification that is O(size). */
> +static inline void
> +heapify_all(struct min_max_heap *heap,

min_max_heapify_all()

> +	const struct min_max_heap_callbacks *func)
> +{
> +	int i;
> +
> +	for (i = heap->size / 2; i >= 0; i--)

Where does that >= come from?

> +		heapify(heap, i, func);
> +}
> +
> +/* Remove minimum element from the heap, O(log2(size)). */
> +static inline void
> +heap_pop(struct min_max_heap *heap, const struct min_max_heap_callbacks *func)
> +{
> +	char *data = (char *)heap->data;

more silly char stuff

> +
> +	if (WARN_ONCE(heap->size <= 0, "Popping an empty heap"))
> +		return;
> +
> +	/* Place last element at the root (position 0) and then sift down. */
> +	heap->size--;
> +	memcpy(data, data + (heap->size * func->elem_size), func->elem_size);
> +	heapify(heap, 0, func);
> +}
> +
> +/*
> + * Remove the minimum element and then push the given element. The
> + * implementation performs 1 sift (O(log2(size))) and is therefore more
> + * efficient than a pop followed by a push that does 2.
> + */
> +static void heap_pop_push(struct min_max_heap *heap,
> +			const void *element,
> +			const struct min_max_heap_callbacks *func)
> +{
> +	char *data = (char *)heap->data;

delete it already

> +
> +	memcpy(data, element, func->elem_size);
> +	heapify(heap, 0, func);
> +}
> +
> +/* Push an element on to the heap, O(log2(size)). */
> +static inline void
> +heap_push(struct min_max_heap *heap, const void *element,
> +	const struct min_max_heap_callbacks *func)
> +{
> +	void *child, *parent;
> +	int pos;
> +	char *data = (char *)heap->data;

there are no strings here...

> +
> +	if (WARN_ONCE(heap->size >= heap->cap, "Pushing on a full heap"))
> +		return;
> +
> +	/* Place at the end of data. */
> +	pos = heap->size;
> +	memcpy(data + (pos * func->elem_size), element, func->elem_size);
> +	heap->size++;
> +
> +	/* Sift up. */
> +	for (; pos > 0; pos = (pos - 1) / 2) {

And here you have '>' in direct conflict with heapify_all()

> +		child = data + (pos * func->elem_size);
> +		parent = data + ((pos - 1) / 2) * func->elem_size;
> +		if (func->cmp(parent, child))
> +			break;
> +		func->swp(parent, child);

		child = parent;

and loose one multiplcation.

> +	}
> +}

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ