Message-ID: <20081121174348.GB4336@elte.hu>
Date:	Fri, 21 Nov 2008 18:43:48 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Joerg Roedel <joerg.roedel@....com>
Cc:	Ingo Molnar <mingo@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	linux-kernel@...r.kernel.org, netdev@...r.kernel.org,
	iommu@...ts.linux-foundation.org
Subject: Re: [PATCH 03/10] x86: add initialization code for DMA-API
	debugging


* Joerg Roedel <joerg.roedel@....com> wrote:

> +static struct list_head dma_entry_hash[HASH_SIZE];
> +
> +/* A slab cache to allocate dma_map_entries fast */
> +static struct kmem_cache *dma_entry_cache;
> +
> +/* lock to protect the data structures */
> +static DEFINE_SPINLOCK(dma_lock);

some more generic comments about the data structure: its main purpose 
is to provide a mapping based on (dev,addr). There's little if any 
cross-entry interaction - only same-address+same-dev DMA is checked.

1)

the hash:

+ 	return (entry->dev_addr >> HASH_FN_SHIFT) & HASH_FN_MASK;

should mix in entry->dev as well - that way we get not just per 
address but per device hash space separation as well.
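
A minimal userspace sketch of what mixing in the device could look like, 
next to the address-only version. The struct layout, HASH_SIZE, the 1MB 
shift and the multiplicative mixing constant are assumptions made for 
illustration, not taken from the patch:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration; the patch defines its own. */
#define HASH_SIZE      256u
#define HASH_FN_SHIFT  20                  /* 1MB chunks */
#define HASH_FN_MASK   (HASH_SIZE - 1)

struct device;                             /* opaque stand-in for the kernel type */

struct dma_map_entry {                     /* hypothetical, reduced to two fields */
	struct device *dev;
	uint64_t dev_addr;
};

/* Address-only hash, same shape as the quoted line. */
static unsigned int hash_addr_only(const struct dma_map_entry *entry)
{
	return (unsigned int)((entry->dev_addr >> HASH_FN_SHIFT) & HASH_FN_MASK);
}

/* Variant that folds the device pointer into the key before hashing. */
static unsigned int hash_dev_addr(const struct dma_map_entry *entry)
{
	uint64_t key = entry->dev_addr >> HASH_FN_SHIFT;

	/* Drop the low pointer bits, which are mostly alignment zeroes. */
	key ^= (uint64_t)(uintptr_t)entry->dev >> 6;

	/* Multiplicative mix (2^64 / golden ratio), then take the top 8 bits. */
	key *= 0x9E3779B97F4A7C15ull;
	return (unsigned int)(key >> 56) & HASH_FN_MASK;
}
```

With this, two entries at the same address but on different devices 
collide under hash_addr_only() and can land in different buckets under 
hash_dev_addr().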

2)

HASH_FN_SHIFT gives 1MB chunks right now - that's probably fine in 
practice albeit perhaps a bit too small. There's seldom any locality 
between the physical addresses of DMA - we rarely have any real 
(performance-relevant) physical co-location of DMA addresses beyond 4K 
granularity. So using 1MB chunking here will discard a good deal of 
random low bits we should be hashing on.
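
To illustrate the effect of the shift, here is a small sketch that counts 
how many buckets a run of consecutive 4K pages occupies for a given chunk 
size. HASH_SIZE and the helper names are hypothetical, chosen only for 
this example:

```c
#include <stdint.h>

#define HASH_SIZE     256u                 /* assumed for illustration */
#define HASH_FN_MASK  (HASH_SIZE - 1)

/* Bucket index when addresses are grouped into chunks of 2^shift bytes. */
static unsigned int bucket_for(uint64_t dev_addr, unsigned int shift)
{
	return (unsigned int)((dev_addr >> shift) & HASH_FN_MASK);
}

/* Count the distinct buckets hit by n consecutive 4K pages from base. */
static unsigned int buckets_used(uint64_t base, unsigned int n,
				 unsigned int shift)
{
	unsigned char seen[HASH_SIZE] = { 0 };
	unsigned int i, used = 0;

	for (i = 0; i < n; i++) {
		unsigned int b = bucket_for(base + (uint64_t)i * 4096, shift);

		if (!seen[b]) {
			seen[b] = 1;
			used++;
		}
	}
	return used;
}
```

For 256 pages starting at a 1MB-aligned address, a shift of 20 puts all 
of them into a single bucket, while a shift of 12 spreads them over 256 
buckets.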

3)

And the most scalable locking would be per hash bucket locking - no 
global lock is needed. The bucket hash heads should probably be 
cacheline sized - so we'd get one lock per bucket.

This way if there's irq+DMA traffic on one CPU from one device into 
one range of memory, and irq+DMA traffic on another CPU to another 
device, they will map to two different hash buckets.
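
A rough userspace sketch of per-bucket locking, assuming a 64-byte 
cacheline and using a C11 atomic as a stand-in for the kernel's 
spinlock_t. The bucket struct, the helper names and the simple XOR hash 
are hypothetical; in the kernel this would use spin_lock_irqsave() and 
list_head instead:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HASH_SIZE       256u               /* assumed */
#define HASH_FN_SHIFT   12                 /* assumed: 4K granularity */
#define HASH_FN_MASK    (HASH_SIZE - 1)
#define CACHELINE_SIZE  64                 /* assumed; arch-dependent */

struct dma_map_entry {
	struct dma_map_entry *next;
	const void *dev;
	uint64_t dev_addr;
};

/* One lock plus one list head per bucket, padded out to a cacheline. */
struct hash_bucket {
	_Alignas(CACHELINE_SIZE) atomic_bool locked;
	struct dma_map_entry *head;
};

static struct hash_bucket dma_entry_hash[HASH_SIZE];

static void bucket_lock(struct hash_bucket *b)
{
	while (atomic_exchange_explicit(&b->locked, true, memory_order_acquire))
		;                          /* spin until the holder unlocks */
}

static void bucket_unlock(struct hash_bucket *b)
{
	atomic_store_explicit(&b->locked, false, memory_order_release);
}

static struct hash_bucket *get_bucket(const void *dev, uint64_t dev_addr)
{
	uint64_t key = (dev_addr >> HASH_FN_SHIFT) ^
		       ((uint64_t)(uintptr_t)dev >> 6);

	return &dma_entry_hash[key & HASH_FN_MASK];
}

/* Only the bucket that the entry hashes to is locked. */
static void add_entry(struct dma_map_entry *e)
{
	struct hash_bucket *b = get_bucket(e->dev, e->dev_addr);

	bucket_lock(b);
	e->next = b->head;
	b->head = e;
	bucket_unlock(b);
}

static struct dma_map_entry *find_entry(const void *dev, uint64_t dev_addr)
{
	struct hash_bucket *b = get_bucket(dev, dev_addr);
	struct dma_map_entry *e;

	bucket_lock(b);
	for (e = b->head; e; e = e->next)
		if (e->dev == dev && e->dev_addr == dev_addr)
			break;
	bucket_unlock(b);
	return e;
}
```

Two CPUs working on entries that hash to different buckets never touch 
the same lock or the same cacheline here.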

4)

Plus it might be an option to make hash lookup lockless as well: 
depending on the DMA flux we can get a lot of lookups, and taking the 
bucket lock can be avoided, if you use RCU-safe list ops and drive the 
refilling of the free entries pool from RCU.
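
A userspace approximation of the lockless lookup idea, using C11 
release/acquire atomics in place of the kernel's rcu_assign_pointer() and 
rcu_dereference(). This is only a sketch of the read side: it assumes 
writers are already serialised (for example by the bucket lock), and it 
does not model removal, where an entry could only go back to the free 
pool after a grace period (call_rcu() in the kernel). All names are 
hypothetical:

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

struct dma_map_entry {
	_Atomic(struct dma_map_entry *) next;
	const void *dev;
	uint64_t dev_addr;
};

struct hash_bucket {
	_Atomic(struct dma_map_entry *) head;
};

/*
 * Writer side: link the fully initialised entry in front of the list.
 * The release store makes dev/dev_addr/next visible before the entry
 * itself becomes reachable. Callers must serialise writers themselves.
 */
static void publish_entry(struct hash_bucket *b, struct dma_map_entry *e)
{
	struct dma_map_entry *old;

	old = atomic_load_explicit(&b->head, memory_order_relaxed);
	atomic_store_explicit(&e->next, old, memory_order_relaxed);
	atomic_store_explicit(&b->head, e, memory_order_release);
}

/* Reader side: walks the list without taking any lock. */
static struct dma_map_entry *lookup_lockless(struct hash_bucket *b,
					     const void *dev,
					     uint64_t dev_addr)
{
	struct dma_map_entry *e;

	e = atomic_load_explicit(&b->head, memory_order_acquire);
	while (e) {
		if (e->dev == dev && e->dev_addr == dev_addr)
			return e;
		e = atomic_load_explicit(&e->next, memory_order_acquire);
	}
	return NULL;
}
```

The hard part that RCU solves, and this sketch leaves out, is knowing 
when no reader can still hold a pointer to a removed entry.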

	Ingo