[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20071213235543.568682000@intel.com>
Date: Thu, 13 Dec 2007 15:55:43 -0800
From: venkatesh.pallipadi@...el.com
To: ak@....de, ebiederm@...ssion.com, rdreier@...co.com,
torvalds@...ux-foundation.org, gregkh@...e.de, airlied@...net.ie,
davej@...hat.com, mingo@...e.hu, tglx@...utronix.de, hpa@...or.com,
akpm@...ux-foundation.org, arjan@...radead.org,
jesse.barnes@...el.com
Cc: linux-kernel@...r.kernel.org
Subject: [RFC PATCH 00/12] PAT 64b: PAT support for X86_64
Yes. It is that wonderful time of the year again.
No, no. We are not talking about holiday season or new year here.
We are talking about one another rehash of "why we do not support PAT in x86"
question and series of patches that implement some PAT support before going
into hibernation again. Only difference is that we hope to take this little
further this time and may be really get this support into
upstream kernel soon.
This series is heavily derived from the PAT patchset by Eric Biederman and
Andi Kleen.
http://www.firstfloor.org/pub/ak/x86_64/pat/
We have forwarded ported these patches to latest kernel, addressed some of the
race conditions, cut a lot more corners to get this into a working patchset
that can be seen as an RFC. Specifically, the chanegs we added include:
* Various bug fixes over original patchset above.
* Change x86_64 identity map to only map non-reserved memory. This helps
to handle UC/WC mapping of reserved region in a much simple manner
(we don't have to do cpa any more, as such not keep track of the actual
reference counts. We still track all the usages to keep the mappings
consistent. We just avoid the headache of splitting mattr regions for
managing ref counts for every individual usage of the reserved area).
* Modify reserve_mattr and free_mattr to handle various mapping of reserved
regions cleanly.
There are many rough edges in the patchset. TBD list below refers to
the open issues that we have thought through during this process.
TBD:
* Do we need to allow RAM pages to be mapped as WC? If not, then
we don't need to follow the TLB flush mechanism (make pte not present,
flush, and set pte with new mapping) mentioned in section 10.12.4 of SDM
Vol3a.
* If the above can be assumed, then for a complete solution, handle RAM
pages with UC and /dev/mem mapping conflicts.
Can we use the existing page struct to keep track of the /dev/mem
mappings (through the page ref count) and not allow
to free the page while the /dev/mem mappings are active. And
allow /dev/mem to map only those pages which are marked reserved (which
the driver does before doing iomap).
* For X and others, do we need the ioctl interface to sysfs or get the type
attribute through a different sysfs file.
* Clean up early table space allocation, avoiding overallocation there.
* Avoid mapping 0 - 1M physical addresses in kernel text mapping.
* Read reserved regions in /dev/mem read() as 0xffff or something, and continue
reading across holes, till we reach the high_memory (end of memory).
* For fork(), for every /dev/mem mapping, we have to keep track of the usage
by doing reserve_mattr().
There are also many edges completely missing. Lot of things we did not look
at all for this first cut. Specifically:
* Only supports x86_64 for now. i386 may not even compile with this patchset.
* We did not look at implication of PAT on Suspend-Resume.
* We did not look at implications of PAT on KEXEC.
* Coding style details.
We expect this can be done easily once we have discussed/resolved the
basic PAT problems with this RFC.
Fireaway all comments, complaints, concerns and things we may break while
we do this.
Tested with 2.6.24-rc4 and X86_64.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@...el.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@...el.com>
---
--
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists