linux-kernel - Re: [PATCH regression] dma debug: account for cachelines and read-only mappings in overlap tracking

Message-Id: <20140213140523.099fd03418d8fe1467db3bd9@linux-foundation.org>
Date:	Thu, 13 Feb 2014 14:05:23 -0800
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Dan Williams <dan.j.williams@...el.com>
Cc:	Wei Liu <wei.liu2@...rix.com>,
	Eric Dumazet <eric.dumazet@...il.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
	netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
	Sander Eikelenboom <linux@...elenboom.it>,
	Francois Romieu <romieu@...zoreil.com>,
	Dave Jones <davej@...hat.com>
Subject: Re: [PATCH regression] dma debug: account for cachelines and
 read-only mappings in overlap tracking

On Thu, 13 Feb 2014 13:58:00 -0800 Dan Williams <dan.j.williams@...el.com> wrote:

> While debug_dma_assert_idle() checks whether a given *page* is actively
> undergoing dma, the valid granularity of a dma mapping is a *cacheline*.
> Sander's testing shows that the warning message "DMA-API: exceeded 7
> overlapping mappings of pfn..." is falsely triggering.  The test is
> simply mapping multiple cachelines in a given page.
> 
> Ultimately we want overlap tracking to be valid as it is a real api
> violation, so we need to track active mappings by cachelines.  Update
> the active dma tracking to use the page-frame-relative cacheline of the
> mapping as the key, and update debug_dma_assert_idle() to check for all
> possible mapped cachelines for a given page.
> 
> However, the need to track active mappings is only relevant when the
> dma-mapping is writable by the device.  In fact it is fairly standard
> for read-only mappings to have hundreds or thousands of overlapping
> mappings at once.  Limiting the overlap tracking to writable
> (!DMA_TO_DEVICE) eliminates this class of false-positive overlap
> reports.
> 
> Note, the radix gang lookup is sub-optimal.  It would be best if it
> stopped fetching entries once the search passed a page boundary.
> Nevertheless, this implementation does not perturb the original net_dma
> failing case.  That is to say the extra overhead does not show up in
> terms of making the failing case pass due to a timing change.
> 
> References:
> http://marc.info/?l=linux-netdev&m=139232263419315&w=2
> http://marc.info/?l=linux-netdev&m=139217088107122&w=2
> 
> ...
>
> --- a/lib/dma-debug.c
> +++ b/lib/dma-debug.c
> @@ -424,111 +424,132 @@ void debug_dma_dump_mappings(struct device *dev)
>  EXPORT_SYMBOL(debug_dma_dump_mappings);
>  
>  /*
> - * For each page mapped (initial page in the case of
> - * dma_alloc_coherent/dma_map_{single|page}, or each page in a
> - * scatterlist) insert into this tree using the pfn as the key. At
> + * For each mapping (initial cacheline in the case of
> + * dma_alloc_coherent/dma_map_page, initial cacheline in each page of a
> + * scatterlist, or the cacheline specified in dma_map_single) insert
> + * into this tree using the cacheline as the key. At
>   * dma_unmap_{single|sg|page} or dma_free_coherent delete the entry.  If
> - * the pfn already exists at insertion time add a tag as a reference
> + * the entry already exists at insertion time add a tag as a reference
>   * count for the overlapping mappings.  For now, the overlap tracking
> - * just ensures that 'unmaps' balance 'maps' before marking the pfn
> - * idle, but we should also be flagging overlaps as an API violation.
> + * just ensures that 'unmaps' balance 'maps' before marking the
> + * cacheline idle, but we should also be flagging overlaps as an API
> + * violation.
>   *
>   * Memory usage is mostly constrained by the maximum number of available
>   * dma-debug entries in that we need a free dma_debug_entry before
> - * inserting into the tree.  In the case of dma_map_{single|page} and
> - * dma_alloc_coherent there is only one dma_debug_entry and one pfn to
> - * track per event.  dma_map_sg(), on the other hand,
> - * consumes a single dma_debug_entry, but inserts 'nents' entries into
> - * the tree.
> + * inserting into the tree.  In the case of dma_map_page and
> + * dma_alloc_coherent there is only one dma_debug_entry and one
> + * dma_active_cacheline entry to track per event.  dma_map_sg(), on the
> + * other hand, consumes a single dma_debug_entry, but inserts 'nents'
> + * entries into the tree.
>   *
>   * At any time debug_dma_assert_idle() can be called to trigger a
> - * warning if the given page is in the active set.
> + * warning if any cachelines in the given page are in the active set.
>   */
> -static RADIX_TREE(dma_active_pfn, GFP_NOWAIT);
> +static RADIX_TREE(dma_active_cacheline, GFP_NOWAIT);
>  static DEFINE_SPINLOCK(radix_lock);
> -#define ACTIVE_PFN_MAX_OVERLAP ((1 << RADIX_TREE_MAX_TAGS) - 1)
> +#define ACTIVE_CLN_MAX_OVERLAP ((1 << RADIX_TREE_MAX_TAGS) - 1)
> +#define CACHELINE_PER_PAGE_SHIFT (PAGE_SHIFT - L1_CACHE_SHIFT)
> +#define CACHELINES_PER_PAGE (1 << CACHELINE_PER_PAGE_SHIFT)
>  
> -static int active_pfn_read_overlap(unsigned long pfn)
> +unsigned long to_cln(struct dma_debug_entry *entry)
> +{
> +	return (entry->pfn << CACHELINE_PER_PAGE_SHIFT) +
> +		(entry->offset >> L1_CACHE_SHIFT);
> +}

"cln" is ugly and isn't a well-known kernel abbreviation.  We typically
spell these things out, so "cacheline".  But I think you mean
"cacheline number", and that is too long to spell out.

So I guess "cln" just became a well-known kernel abbreviation.
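
For readers following along, here is a minimal userspace sketch of the key
that to_cln() computes. It assumes 4 KiB pages (PAGE_SHIFT 12) and 64-byte
cachelines (L1_CACHE_SHIFT 6); both values vary by architecture. The
struct entry_key type is a hypothetical stand-in for the two fields of
struct dma_debug_entry that the patch reads, not the kernel structure.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only: 4 KiB pages, 64-byte cachelines. */
#define PAGE_SHIFT 12
#define L1_CACHE_SHIFT 6
#define CACHELINE_PER_PAGE_SHIFT (PAGE_SHIFT - L1_CACHE_SHIFT)
#define CACHELINES_PER_PAGE (1UL << CACHELINE_PER_PAGE_SHIFT)

/* Hypothetical stand-in for the dma_debug_entry fields used by to_cln(). */
struct entry_key {
	unsigned long pfn;    /* page frame number */
	unsigned long offset; /* byte offset of the mapping within the page */
};

/*
 * Same arithmetic as the patch: the page's first cacheline number plus
 * the index of the cacheline that contains 'offset'.
 */
static unsigned long to_cln(const struct entry_key *e)
{
	return (e->pfn << CACHELINE_PER_PAGE_SHIFT) +
		(e->offset >> L1_CACHE_SHIFT);
}
```

Under these assumptions a page holds 64 cachelines, so a mapping at pfn 3,
offset 130 gets key 3 * 64 + 2 = 194, while a mapping at offset 0 of the
same page gets key 192. The two no longer collide on the pfn.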

> ....
>
>  void debug_dma_assert_idle(struct page *page)
>  {
> +	unsigned long cln = page_to_pfn(page) << CACHELINE_PER_PAGE_SHIFT;

This worries me.  Are you sure we cannot overflow the ulong here under
any circumstances?  32GB PAE with sparsemem or whatever?
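
A rough way to reason about the shift is sketched below. It assumes a
32-bit unsigned long (modelled here as uint32_t), 4 KiB pages and 64-byte
cachelines. The helper names are made up for this illustration, and other
page or cacheline sizes move the threshold.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only: 4 KiB pages, 64-byte cachelines. */
#define PAGE_SHIFT 12
#define L1_CACHE_SHIFT 6
#define CACHELINE_PER_PAGE_SHIFT (PAGE_SHIFT - L1_CACHE_SHIFT)

/*
 * Hypothetical helper: returns 1 if pfn << CACHELINE_PER_PAGE_SHIFT would
 * not fit in a 32-bit unsigned long. The shift is done in 64 bits so the
 * lost high bits are visible.
 */
static int cln_shift_overflows32(uint64_t pfn)
{
	uint64_t wide = pfn << CACHELINE_PER_PAGE_SHIFT;

	return wide > UINT32_MAX;
}

/* Hypothetical helper: highest pfn below a physical address limit. */
static uint64_t max_pfn_for(uint64_t phys_bytes)
{
	return (phys_bytes >> PAGE_SHIFT) - 1;
}
```

With these constants the shift is 6 bits, so the first pfn that overflows
is 1 << 26, which corresponds to a physical address of 256 GiB. A pfn
taken from the top of a 32 GiB address range stays below that, but a
larger cacheline shift or a sparse layout with pfns above that limit
would need the same check redone.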