Date: Mon, 18 Aug 2003 22:19:11 -0700
From: Crispin Cowan <crispin@...unix.com>
To: pageexec@...email.hu
Cc: bugtraq@...urityfocus.com
Subject: Re: PointGuard: It's not the Size of the Buffer, it's the Address


Sorry for the length, but it's a long post, and I feel the need to rebut 
most of this.
pageexec@...email.hu wrote:

>Here we go then (all quotes are from your paper).
>
>1. "This key is then never shared with any entity outside the process's
>    address space....
>   "Thus we cannot identify any feasible means by which the attacker can
>    obtain the PointGuard key."
>
>You are wrong (and even self-contradicting) here; in any case, so-called
>information leaking can happen without having to corrupt pointers ([1],
>[2]). Also, section 3.4.3 contradicts the above.
>
It is true that PointGuard raises new issues with regard to information 
leakage: before PointGuard, there was not much significance to leaking 
pointer values from a running process, and so this now becomes a new 
threat that needs study. At the USENIX conference, it was pointed out 
that format string bugs can be used to obtain pointer values, which we 
had not thought of. However, composing PointGuard with FormatGuard 
somewhat mitigates this problem.
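
To make that leak concrete (a contrived sketch, not code from the paper; 
log_msg is a made-up function):

    #include <stdio.h>

    /* BUG: user input used directly as the format string. Input such as
     * "%p %p %p" prints pointer values straight off the stack, handing
     * the attacker PG-encrypted pointers (ciphertext) to XOR against
     * known plaintext values. */
    void log_msg(const char *user_input)
    {
        printf(user_input);   /* should be printf("%s", user_input) */
    }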

>However in the implementation part you talk about only those pointers
>that are visible at the C language level whereas we know all too well
>that there is more than that (ELF GOT/PLT, saved program counter and
>frame pointer, etc). Because of this omission it appears that PG does
>not protect these pointers at all even if they have been the primary
>targets of address space corruption bugs in the past. Is this really
>the case or is the paper missing something?
>
Yes, PointGuard only protects pointer values generated by code compiled 
with PointGuard.
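
For readers who have not seen the paper, the transformation is 
conceptually just an XOR against a per-process secret (a rough sketch of 
the idea, not our actual compiler output; the names here are made up):

    #include <stdint.h>

    /* Hypothetical per-process random key, set up once at exec time. */
    extern uintptr_t __pg_key;

    /* Emitted around every pointer store... */
    static inline void *pg_encode(void *p)
    {
        return (void *)((uintptr_t)p ^ __pg_key);
    }

    /* ...and every pointer load. An attacker who overwrites a stored
     * pointer with a plaintext address gets it XORed with the key on
     * load, yielding a random address and (almost always) a crash. */
    static inline void *pg_decode(void *p)
    {
        return (void *)((uintptr_t)p ^ __pg_key);
    }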

>What really piqued my interest is that the PLT/GOT are not generated
>by the compiler hence the implementation you describe cannot possibly
>handle them without changes to the dynamic linker - something you do
>not mention at all.
>
We are modifying the dynamic linker for Immunix. But that kind of 
hacking isn't worthy of a paper, so we omitted it.

> It would also be interesting to know how you can
>handle the saved program counter and frame pointer just after the AST
>level where as far as i know these entities do not even exist (and
>hence cannot be manipulated/controlled there).
>
As the paper said, we are going to tag the AST expressions so that 
spills are PG-encrypted, but this is not yet implemented.

>3. In section 3.4.1 you say about statically initialized data that:
>
>   "[...] we modify the initialization code emitted by the compiler
>    (stuff that runs before main()) to re-initialize statically
>    initialized pointers with values encrypted with the current
>    process's key."
>
>Can you clarify what initialization code the compiler emits before
>main()? As far as i know, on entry only the dynamic linker, library
>initialization and some statically linked-in object code (various
>crt*.o stuff and whatever they call) gets to run before main() - none
>of this is emitted by the compiler, at least not for each executable
>as you made it sound to be.
>
Yes, the static data initialization is hacked into that code. I don't 
recall whether it is crt0.o or something else, but it doesn't really matter.
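
Conceptually it amounts to something like this, run from the startup 
objects before main() (a sketch; the table and symbol names here are 
invented):

    #include <stdint.h>

    extern uintptr_t __pg_key;           /* per-process key, as above */
    extern void **__pg_static_slots[];   /* addresses of static pointers,
                                            emitted by the compiler */
    extern int __pg_nslots;

    /* Re-encrypt each statically initialized pointer with this
     * process's key; the on-disk image necessarily holds plaintext. */
    static void __pg_init_static(void)
    {
        for (int i = 0; i < __pg_nslots; i++)
            *__pg_static_slots[i] = (void *)
                ((uintptr_t)*__pg_static_slots[i] ^ __pg_key);
    }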

>4. As mentioned above, section 3.4.3 admits that there are still ways to
>   modify non-encrypted pointers in the current implementation (beyond
>   the information leaking attacks i mentioned). To me it also means that
>   not all pointer stores/loads are protected but only those visible at
>   the C language level (refer to the problematic pointers pointed out in
>   2). It also begs the question of what kind of performance impact PG
>   will have once all these omissions are rectified (more on your
>   performance evaluation below).
>
The only pointer load/stores that are not encrypted right now are 
register spills. That is a rare case, so it will not affect performance 
much.

>5. In section 3.4.4 you talk about mixed-mode code (PG vs. non-PG). You
>   seem to be focused on marking function parameters for use by PG or
>   non-PG code but you do not mention what happens with pointers stored
>   in data structures which are used by both kinds of code. Do/can you
>   mark such structure members with __std_ptr_mode_on__?
>
Yes.
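
For example (hypothetical syntax; the paper does not show the annotation 
applied at the member level):

    /* A structure shared between PG-compiled code and a non-PG library:
     * the shared member stays in standard (plaintext) pointer mode, the
     * private one is PG-encrypted as usual. */
    struct shared {
        int   flags;                     /* plain data, unaffected */
        char *name __std_ptr_mode_on__;  /* plaintext, for non-PG code */
        void *internal;                  /* PG-encrypted */
    };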

> Also what happens
>   with functions that take format strings and hence accept arguments of
>   variable types (i.e., pointers and non-pointers), do you parse such
>   format strings and convert the pointer arguments accordingly or do
>   you turn off PG altogether for such code?
>
There is special case handling for varargs.
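
The underlying difficulty, for the curious (a minimal illustration; 
report is a made-up function): a varargs callee cannot tell pointer 
arguments from integers without parsing the format string at run time.

    #include <stdio.h>

    void report(int n, void *buf, const char *filename)
    {
        /* Which arguments need PG-decryption? Only the format string
         * says: n is an int, buf and filename are pointers. */
        printf("%d bytes at %p in %s\n", n, buf, filename);
    }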

> What happens with system
>   calls that take pointers? You mention in the paper that you have not
>   created a PG version of glibc, so are all pointers passed to system
>   calls unprotected?
>
That is correct: unencrypted pointers are passed into the kernel. It has 
to be this way, because the alternatives would be to either have a 
system-wide single key value (which would persist far too long for such 
a small key, and be too easy to obtain) or to have the kernel know the 
key value of all processes and do the mapping for you (which is 
feasible, but more intrusive than just hacking glibc).
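
A PG-aware glibc would decrypt at that boundary, along these lines (a 
sketch, not actual Immunix glibc code):

    #include <stdint.h>
    #include <unistd.h>

    extern uintptr_t __pg_key;   /* per-process key, as in the sketch above */

    /* The kernel knows nothing of per-process keys, so the buffer
     * pointer is decrypted just before crossing the boundary. */
    ssize_t pg_write(int fd, const void *encrypted_buf, size_t len)
    {
        const void *plain =
            (const void *)((uintptr_t)encrypted_buf ^ __pg_key);
        return write(fd, plain, len);
    }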

> What happens to system calls that do not go through
>   glibc (there are applications that do this)?
>
You would have to modify the source to mark those arguments as cleartext.

>   In the same section you all of a sudden introduce the notion of
>   'hashed pointers' without explaining what they are and how PG uses
>   them. Can you elaborate on this?
>
It's just a synonym for PG-encrypted pointers.

>   Finally i am wondering how you plan to implement pointer mode tracking
>   in the compiler, or more precisely, why you have to do it in the compiler
>   only and not at runtime (in the latter case you would have to extend the
>   pointer representation and open a whole can of worms).
>
I have no idea what you are talking about.

>6. In section 5 you admit that you do not indeed have a PG protected
>   glibc and hence heap pointers are not protected at all, this calls
>   into question the seriousness of your security and performance
>   testing (especially since you compare your results to mature
>   solutions which cannot be said of PG yet).
>
All of the code used in our performance testing was statically linked 
and compiled with PointGuard to work around the absence of a PG version 
of glibc, so the performance figures are valid.

>   "2. Usefully corrupting a pointer requires pointing it at a
>    specific location."
>
>This is false, the hijacked pointer may very well point to a set of
>specific values (e.g. any GOT entry that is used later, any member of
>a linked list, etc).
>
Bull: you just specified a specific location that happens to be a range, 
and a very small range relative to the size of an address space. Unlike 
PaX/ASLR (which can only jiggle objects a little within a range), 
PointGuard has complete freedom to randomize all 32 bits of the pointer, 
so the fact that you can craft an exploit that only needs to hit its 
target approximately does not affect PointGuard.

Thanks for bringing up this point, as it highlights something important: 
PointGuard and address space randomization techniques (PaX/ASLR, the 
Sekar paper that immediately followed PointGuard at USENIX, and several 
other re-implementations of PaX/ASLR) are complementary.

    * ASLR techniques defend existing binaries without recompilation
      (good for convenience) and as a consequence defend objects that
      are hard to protect with PointGuard.
    * PointGuard provides better randomization than ASLR, because the
      randomization ranges are much greater.
    * These two techniques compose: you are better off using both PG and
      ASLR. Similar to the way in which you are better off using both
      StackGuard and non-executable stacks (Immunix ships with both
      StackGuarded binaries and a non-executable stack kernel).


>   "3. Under PointGuard protection, a pointer cannot be corrupted
>    to point to a specific location without knowing the secret key."
>
>This is correct provided the implementation is bug-free - something
>that cannot be verified until you actually release PG.
>
I have no idea what you are talking about. If the pointer is hashed, you 
*cannot* usefully corrupt it without knowing the secret key. Speculating 
without foundation that any piece of software has bugs borders on FUD, 
but in this case such a bug would not even be exploitable: an encrypted 
pointer cannot be usefully modified by a plaintext overflow. A bug that 
accidentally left a pointer in plaintext would result in a crash when 
the value is decrypted, and vice versa: the design specifically resists 
this problem.

>   "4. Learning the secret key requires either obtaining the secret
>    key directly, or cryptanalysis against a sample pointer value."
>
>These methods are called information leaking as discussed above. The
>term 'cryptanalysis' is a bogus term here, as it makes it sound like
>an expensive operation whereas all it takes is knowing the valid
>pointer value (something an attacker can observe on a test system)
>and xor'ing it against the leaked one.
>
It is nonetheless cryptanalysis. The paper itself points out that the 
crypto is weak: the security depends on not leaking ciphertext.
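
To spell out the attacker's arithmetic (a sketch; the function names are 
mine):

    #include <stdint.h>

    /* One leaked ciphertext pointer plus its known plaintext value
     * (observable on the attacker's own copy of the binary) yields the
     * key, and with the key any pointer can be forged. */
    uintptr_t recover_key(uintptr_t leaked, uintptr_t known_plaintext)
    {
        return leaked ^ known_plaintext;
    }

    uintptr_t forge(uintptr_t target, uintptr_t key)
    {
        return target ^ key;   /* value to overwrite a PG pointer with */
    }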

>   "6. Obtaining a sample of ciphertext (an encrypted pointer) would
>    require either corrupting a pointer precisely (which begs the
>    question) or a program that leaks pointer values (which is highly
>    unusual)."
>
>The latter claim ("highly unusual") is unsubstantiated, what is the
>basis for it? At least neither your paper nor anything you referenced
>present research data on this. Also there have been papers published
>recently on this very topic ([1] and [2]), so it seems we are just
>beginning to see the real nature of information leaking (this has
>also been pointed out in the PaX ASLR paper [3]).
>
As I said above, it's a new area, and needs study. Thanks for the citations.

>8. In section 6 you present performance evaluation data. The fundamental
>   problem with it is of course that PG has apparently not been finished
>   yet (something you do not make clear there), therefore any claims about
>   its impact are to be taken with a grain of salt.
>
PointGuard is at almost exactly the same stage of maturity as StackGuard 
was when the paper first appeared in January 1998. A whole bunch of 
system engineering has yet to be done. In the case of StackGuard, 
overall performance *improved* vs. the results claimed in the paper. 
Speculate away as to what PointGuard will do when we're done integrating 
it. On second thought, don't: you've done more than enough flaming 
speculation today :)

>   Third, there is related work ([4] and [5], all of which predates PG
>   by years and you failed to reference) that appears to show more real
>   performance impact of function pointer encryption (something PG does
>   not seem to do yet universally).
>
That work is in fact based directly on PointGuard, having resulted from 
this post http://lwn.net/1999/1111/a/stackguard.html

And you're on crack if you think their performance results are more 
realistic: the only "pointer" they encrypt is the activation record's 
return address. *None* of the hard work of weaving pointer encryption 
into the compiler's type system was done. They published first because 
we deliberately chose not to publish an empty idea with no implementation.

>9. In section 7.1 you say that:
>
>   "A developer can port an application to these safer dialects in a few
>    hours or days, whereas PointGuard was designed to allow a developer
>    to compile & protect millions of lines of code in a few hours or days."
>
>whereas you admit before that PG requires programmer intervention (as it
>is not possible to have a pure PG system right now), i doubt a programmer
>can compile (port) millions of lines of code in a day.
>
You are entitled to your opinion on the numbers and magnitudes, but it 
is inescapable that "porting" to PointGuard is far less work than 
porting from C to Cyclone or CCured. So what's your point?

>10. In section 7.2 you claim that:
>
>   "The main limitation is that this defense can be bypassed, because
>    suitable attack payload code (effectively "exec(sh)")) is almost
>    always resident in victim program address spaces, and so pointer
>    corruption is all that is necessary for the determined attacker
>    to succeed."
>
>Where is this "exec(sh)" supposed to be 'almost always'? Can you substantiate
>this claim?
>
It is in glibc, and most programs link to glibc. This is very well 
known, and I didn't think it needed to be justified.

>Next you make certain claims about PaX [6] (please observe the proper
>capitalization) without providing any reference to our project - why?
>
I have been *trying* to properly cite PaX in various papers for at least 
a year, but you don't make it easy. A web URL is not normally considered 
a suitable citation. At least publish a Phrack article or something so I 
can actually cite you. FWIW, I have been repeatedly pointing out PaX to 
various people who are re-inventing ASLR in various forms, because the 
research community is unaware of PaX. I dare say that the PointGuard 
paper will do more to raise PaX visibility in the research community 
than anything before. That was deliberate, because IMHO PaX is 
under-exposed: it's good work, and few have heard of it.

>You also fail to substantiate your claims about the performance of PaX.
>My best guess is that you are probably referring to a very old and long
>outdated paper, not the current implementation. For your information,
>NOEXEC has no performance impact on alpha, i386 (when SEGMEXEC is used,
>which is the default, [12]), parisc, sparc and sparc64 and has a small
>impact on ppc. I am curious to learn why you cited this information
>when you have already been made aware of the current situation ([13]).
>
It was hearsay. Publish something, and I'll cite it. Please.

>11. In section 7.3 you claim that:
>
>   "PaX also incorporates ASLR (Address Space Layout Randomization) which
>    can be viewed as the dual of PointGuard: rather than randomizing
>    pointers, ASLR randomizes the location of key memory objects."
>
>This is a false claim, ASLR does the exact same thing to pointers as PG.
>Think about it, if you randomize all memory regions, then all pointers
>to these regions will necessarily be randomized as well.
>
Go look up the word "dual": it is a mathematical term. What you're 
saying is exactly the same as what I am saying.

>   "Sekar et al [3] have a new implementation of this concept that
>    randomizes more elements of the address space layout, which may
>    make it harder to bypass than PaX/ASLR."
>
>This is misleading because Address Obfuscation is vulnerable to the exact
>same information leaking problem as ASLR or PG; otherwise an attacker has
>to guess addresses (if he needs any, that is), and there is no
>(deterministic) way around that.
>
It is *your job*, not mine, to go write a paper explaining how PaX/ASLR 
is better than Sekar et al. Be sure to point out that PaX/ASLR came 
first, as that is a strong point in your favor. In the absence of such a 
paper, I'm having to guess at the differences, in a very small portion 
of my paper. I vigorously encourage you to go write a real paper and 
submit it to a strong refereed conference such as USENIX Security. 
Really, please, go write a real paper, I would love to read it, and 
would cite it as often as I could. Had you done this two years ago, you 
would not be having this silly flame war over W^X with Theo.

Crispin

-- 
Crispin Cowan, Ph.D.           http://immunix.com/~crispin/
Chief Scientist, Immunix       http://immunix.com
            http://www.immunix.com/shop/



