Date:   Mon, 17 Jul 2017 14:57:06 -0400
From:   Tejun Heo <tj@...nel.org>
To:     Dennis Zhou <dennisz@...com>
Cc:     Christoph Lameter <cl@...ux.com>, kernel-team@...com,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        Dennis Zhou <dennisszhou@...il.com>
Subject: Re: [PATCH 06/10] percpu: modify base_addr to be region specific

Hello,

On Sat, Jul 15, 2017 at 10:23:11PM -0400, Dennis Zhou wrote:
> From: "Dennis Zhou (Facebook)" <dennisszhou@...il.com>
> 
> Originally, the first chunk is served by up to three chunks, each given
> a region they are responsible for. Despite this, the arithmetic was based
> off of base_addr, requiring offsets or being overly inclusive.
> This patch changes the percpu first-chunk check to consider only the
> dynamic region and the reserved check to consider only the reserved
> region. There is no impact here besides making these checks a little
> more accurate.
> 
> This patch also adds the groundwork for increasing the minimum allocation
> size to 4 bytes. The new field nr_pages in pcpu_chunk will be used to
> keep track of the number of pages the bitmap serves. The arithmetic for
> identifying the first chunk and reserved chunk reflects this change.

However small the patch might end up being, I'd much prefer the
minimum alloc size change to be a separate patch with its own rationale.

> diff --git a/include/linux/percpu.h b/include/linux/percpu.h
> index 98a371c..a5cedcd 100644
> --- a/include/linux/percpu.h
> +++ b/include/linux/percpu.h
> @@ -21,6 +21,10 @@
>  /* minimum unit size, also is the maximum supported allocation size */
>  #define PCPU_MIN_UNIT_SIZE		PFN_ALIGN(32 << 10)
>  
> +/* minimum allocation size and shift in bytes */
> +#define PCPU_MIN_ALLOC_SIZE		(1 << PCPU_MIN_ALLOC_SHIFT)
> +#define PCPU_MIN_ALLOC_SHIFT		2

nitpick: Put SHIFT def above SIZE def?
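
ie. just swapping the two lines so the SHIFT is defined before it's
used (same defines as in your hunk above, only reordered):

	/* minimum allocation shift and size in bytes */
	#define PCPU_MIN_ALLOC_SHIFT		2
	#define PCPU_MIN_ALLOC_SIZE		(1 << PCPU_MIN_ALLOC_SHIFT)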

> +/*
> + * Static addresses should never be passed into the allocator.  They
> + * are accessed using the group_offsets and therefore do not rely on
> + * chunk->base_addr.
> + */
>  static bool pcpu_addr_in_first_chunk(void *addr)
>  {
>  	void *first_start = pcpu_first_chunk->base_addr;
>  
> -	return addr >= first_start && addr < first_start + pcpu_unit_size;
> +	return addr >= first_start &&
> +	       addr < first_start +
> +	       pcpu_first_chunk->nr_pages * PAGE_SIZE;

Does the above line need a line break?  If so, it'd probably be easier
to read if the continuation line were indented (preferably aligned with
the start of the sub-expression).  e.g.

	return addr < first_start +
		      pcpu_first_chunk->nr_pages * PAGE_SIZE;

>  static bool pcpu_addr_in_reserved_chunk(void *addr)
>  {
> -	void *first_start = pcpu_first_chunk->base_addr;
> +	void *first_start;
>  
> -	return addr >= first_start &&
> -		addr < first_start + pcpu_reserved_chunk_limit;
> +	if (!pcpu_reserved_chunk)
> +		return false;
> +
> +	first_start = pcpu_reserved_chunk->base_addr;
> +
> +	return addr >= first_start + pcpu_reserved_offset &&
> +	       addr < first_start +
> +	       pcpu_reserved_chunk->nr_pages * PAGE_SIZE;

Ditto on indentation.
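
ie. the same check, with the continuation aligned under first_start:

	return addr >= first_start + pcpu_reserved_offset &&
	       addr < first_start +
		      pcpu_reserved_chunk->nr_pages * PAGE_SIZE;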

> @@ -1366,10 +1388,17 @@ phys_addr_t per_cpu_ptr_to_phys(void *addr)
>  	 * The following test on unit_low/high isn't strictly
>  	 * necessary but will speed up lookups of addresses which
>  	 * aren't in the first chunk.
> +	 *
> +	 * The address check is of high granularity checking against full
> +	 * chunk sizes.  pcpu_base_addr points to the beginning of the first
> +	 * chunk including the static region.  This allows us to examine all
> +	 * regions of the first chunk. Assumes good intent as the first
> +	 * chunk may not be full (ie. < pcpu_unit_pages in size).
>  	 */
> -	first_low = pcpu_chunk_addr(pcpu_first_chunk, pcpu_low_unit_cpu, 0);
> -	first_high = pcpu_chunk_addr(pcpu_first_chunk, pcpu_high_unit_cpu,
> -				     pcpu_unit_pages);
> +	first_low = (unsigned long) pcpu_base_addr +
                                   ^
			 no space for type casts
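
ie. s/(unsigned long) pcpu_base_addr/(unsigned long)pcpu_base_addr/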

> @@ -1575,6 +1604,8 @@ int __init pcpu_setup_first_chunk(const struct pcpu_alloc_info *ai,
>  	unsigned int cpu;
>  	int *unit_map;
>  	int group, unit, i;
> +	unsigned long tmp_addr, aligned_addr;
> +	unsigned long map_size_bytes;

How about just map_size or map_bytes?

> @@ -1678,46 +1709,66 @@ int __init pcpu_setup_first_chunk(const struct pcpu_alloc_info *ai,
>  		INIT_LIST_HEAD(&pcpu_slot[i]);
>  
>  	/*
> -	 * Initialize static chunk.  If reserved_size is zero, the
> -	 * static chunk covers static area + dynamic allocation area
> -	 * in the first chunk.  If reserved_size is not zero, it
> -	 * covers static area + reserved area (mostly used for module
> -	 * static percpu allocation).
> +	 * Initialize static chunk.
> +	 * The static region is dropped as those addresses are already
> +	 * allocated and do not rely on chunk->base_addr.
> +	 * reserved_size == 0:
> +	 *      the static chunk covers the dynamic area
> +	 * reserved_size > 0:
> +	 *      the static chunk covers the reserved area
> +	 *
> +	 * If the static area is not page aligned, the region adjacent
> +	 * to the static area must have its base_addr be offset into
> +	 * the static area to have it be page aligned.  The overlap is
> +	 * then allocated preserving the alignment in the metadata for
> +	 * the actual region.

We can address this later but the static chunk not covering the static
area is kinda confusing.  The original complication came from trying to
make the static chunk service either the reserved or the first dynamic
chunk.  We don't need that anymore and might as well use separate
rchunk and dchunk.
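
Hand-wavy sketch of the shape I have in mind, just to illustrate the
split (pcpu_setup_chunk() is made up here; base_addr, static_size,
dyn_size and ai->reserved_size are what pcpu_setup_first_chunk()
already has):

	/* the reserved chunk only exists if a reserved region was requested */
	if (ai->reserved_size)
		pcpu_reserved_chunk = pcpu_setup_chunk(base_addr + static_size,
						       ai->reserved_size);

	/* the first (dynamic) chunk always serves the dynamic region */
	pcpu_first_chunk = pcpu_setup_chunk(base_addr + static_size +
					    ai->reserved_size, dyn_size);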

Thanks.

-- 
tejun
