Part 14 Buffer Overflow

Detecting Buffer Overflows in Reverse Engineering with IDA, Ghidra, and r2

In this section we learn how to tell whether a binary contains a buffer overflow once you have opened it in a disassembler.

That is, without the source code, you find dangerous inputs and sensitive code paths purely by analyzing functions. We only inspect the binary; no real exploit is written, so the exercise is completely safe.

Detecting dangerous functions in a binary

This is one of the fastest methods.

If the binary calls functions such as strcpy, sprintf, or gets, which take no destination size, or memcpy with a length that is never checked against the destination, a buffer overflow is likely.
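As a rough illustration of this first pass, the sketch below searches a binary's raw bytes for the names of the functions listed above. It is only a crude stand-in for the import list that IDA, Ghidra, or r2 show you: the helper name find_risky_names is hypothetical, and it assumes a dynamically linked, unstripped binary whose symbol names are stored as NUL-delimited strings.

```python
# Hypothetical helper: flag risky libc names found in a binary's raw bytes.
RISKY = ("gets", "strcpy", "sprintf", "memcpy")

def find_risky_names(blob: bytes) -> list:
    """Return the risky function names that appear as whole NUL-delimited strings."""
    found = []
    for name in RISKY:
        # Requiring a NUL on both sides means "fgets" does not count as "gets".
        if b"\0" + name.encode() + b"\0" in blob:
            found.append(name)
    return found

# A tiny fake string table, used here instead of a real file.
sample = b"\0fgets\0strcpy\0puts\0gets\0"
print(find_risky_names(sample))  # ['gets', 'strcpy']
```

A hit only tells you the name is present; you still have to open each call site and check the buffer size and the origin of the input.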

We use a simple training binary that calls strcpy and gets, purely to practice the concept of finding the vulnerability.

file.c

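/* Training target: both functions are intentionally unsafe.
   gets() was removed in C11, so recent toolchains may need an
   older standard such as -std=c99 to build this file. */
#include <stdio.h>
#include <string.h>

void read_name() {
    char name[32];
    gets(name);              /* no length limit on stdin input */
    printf("hello %s\n", name);
}

void copy_data(char *s) {
    char buf[16];
    strcpy(buf, s);          /* copies until the NUL, ignoring sizeof(buf) */
    puts("done copy");
}

int main(int argc, char **argv) {
    if (argc > 1)
        copy_data(argv[1]);
    read_name();
    return 0;
}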

Opening the binary in IDA or Ghidra

When you open the file in IDA, look for the names of dangerous functions.

For example, seeing gets or strcpy in the Functions view is already a red flag.

Then go into the function that calls it and check how large the buffer is and where the input comes from.

When you see strcpy(buf, s) where buf has a fixed size but the length of s is controlled by user input, a buffer overflow is very likely.

This pattern is one of the most classic signs.
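The size mismatch behind this pattern is simple arithmetic. The sketch below, with a hypothetical helper name, models strcpy(buf, s) for a fixed-size buf and reports how many bytes land past its end; the only assumption is that strcpy copies the string plus its terminating NUL.

```python
def overflow_bytes(buf_size: int, input_len: int) -> int:
    """Bytes written past the end of buf by strcpy(buf, s) when len(s) == input_len."""
    written = input_len + 1  # strcpy also copies the terminating NUL
    return max(0, written - buf_size)

print(overflow_bytes(16, 15))   # 0: 15 characters plus the NUL exactly fill buf[16]
print(overflow_bytes(16, 16))   # 1: the NUL lands one byte past the buffer
print(overflow_bytes(16, 200))  # 185
```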

Inspecting the function frame and buffer location

In the disassembly, look for:

sub rsp, N
alloca

The sub rsp instruction reserves the function's local space on the stack. alloca is not an instruction; it usually appears as an additional adjustment of rsp by a value computed at run time.

For example, disassembling copy_data gives something like this:

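push rbp
mov rbp, rsp
sub rsp, 0x20
mov rax, rdi          ; rax = s, the first argument
lea rdx, [rbp-0x10]   ; rdx = address of buf
mov rsi, rax          ; second argument (source) = s
mov rdi, rdx          ; first argument (destination) = buf
call strcpy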

Code explanation
Here the buffer starts at rbp-0x10, so only 16 bytes lie between its first byte and the saved RBP at [rbp]. Because strcpy does no length check, a longer input writes past the buffer and corrupts the saved RBP and the return address above it.
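The distances involved can be read straight off the lea operand. The sketch below uses a hypothetical helper and assumes a standard x86-64 frame with a frame pointer and no stack canary, so that the saved RBP sits at [rbp] and the return address at [rbp+8].

```python
def offsets_from_buffer(buf_rbp_offset: int, word: int = 8) -> dict:
    """Distances in bytes from the first byte of a buffer located at [rbp - buf_rbp_offset]."""
    return {
        "saved_rbp": buf_rbp_offset,              # saved RBP sits at [rbp]
        "return_address": buf_rbp_offset + word,  # return address sits at [rbp + 8]
    }

print(offsets_from_buffer(0x10))  # {'saved_rbp': 16, 'return_address': 24}
```

For copy_data this means bytes 16 to 23 of the input overwrite the saved RBP and bytes 24 to 31 overwrite the return address.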

Checking the user input path
This is very important. If the input arrives from argv, fgets, read, or gets and reaches strcpy with no length check in between, the vulnerability is almost certain.

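Example run

./a.out $(python3 -c "print('A'*200)")

If the program crashes (a segmentation fault, or a "stack smashing detected" abort when the compiler's stack protector is enabled), the input overran the buffer, which supports our diagnosis.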
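To record how the program died instead of eyeballing the terminal, a small wrapper can run the same command and report the terminating signal. This is a minimal sketch for POSIX systems; the function name crash_signal is hypothetical and the ./a.out path in the usage comment assumes the binary was built from file.c.

```python
import signal
import subprocess

def crash_signal(cmd: list, timeout: float = 5.0):
    """Run cmd and return the name of the signal that killed it, or None."""
    proc = subprocess.run(
        cmd,
        stdin=subprocess.DEVNULL,   # read_name() would otherwise wait for input
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        timeout=timeout,
    )
    # On POSIX a negative return code -N means "terminated by signal N".
    if proc.returncode < 0:
        return signal.Signals(-proc.returncode).name
    return None

# Usage (assumes ./a.out exists):
# print(crash_signal(["./a.out", "A" * 200]))
```

A result of SIGSEGV points to a corrupted return address or pointer, while SIGABRT usually means a runtime check such as the stack protector fired first.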

@reverseengine

Stack Canary / NX / ASLR

Stack Canary (stack protection)

What is it?

A random value placed between:

the local variables

and the saved RBP and return address

It is checked just before ret

If it has changed:

*** stack smashing detected ***

and the program aborts
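The check can be pictured with a toy model of the frame. The sketch below is a simulation in plain Python, not real memory: the layout buf | canary | saved RBP | return address, the 16-byte buffer, and all function names are assumptions made for illustration. The leading zero byte of the canary mirrors what glibc does on Linux.

```python
import os

BUF, CANARY, WORD = 16, 8, 8

def make_frame(canary: bytes) -> bytearray:
    """Model a frame from low to high addresses: buf | canary | saved RBP | return address."""
    return bytearray(BUF) + bytearray(canary) + bytearray(WORD) + bytearray(WORD)

def canary_intact(frame: bytearray, canary: bytes) -> bool:
    """The check done before ret: compare the stored value with the reference copy."""
    return bytes(frame[BUF:BUF + CANARY]) == canary

def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Mimic strcpy into buf: write data plus a NUL without looking at BUF."""
    payload = (data + b"\0")[:len(frame)]  # clip only so the toy frame keeps its size
    frame[:len(payload)] = payload

canary = b"\0" + os.urandom(7)
frame = make_frame(canary)

unchecked_copy(frame, b"A" * 15)
print(canary_intact(frame, canary))  # True: the input fits in buf

unchecked_copy(frame, b"A" * 40)
print(canary_intact(frame, canary))  # False: the overflow crossed the canary
```

The second copy reaches the return address slot too, but the canary comparison fails first, which is exactly why the program aborts instead of returning.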

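Why is it important?

You can no longer overwrite directly:

buf → RIP

because a linear overflow has to cross the canary first

Ways to bypass it:

Leak the canary: through a format string bug or an out-of-bounds read

Partial overwrite: sometimes only the least significant byte (LSB) is controllable

Logic bug: you do not touch RIP at all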

NX / DEP (Non-Executable Stack)

What does it mean?

The stack is not executable

Shellcode placed on the stack will not run

Direct result:

buffer → shellcode → RIP no longer works

Instead, attacks reuse code that is already executable:

ROP

ret2libc

ret2win
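Whether a given binary asks for a non-executable stack is recorded in its ELF program headers. The sketch below reads the PT_GNU_STACK entry of a little-endian ELF64 image by hand; the function name is hypothetical, and it returns None when the header is absent because the fallback then depends on the platform.

```python
import struct

PT_GNU_STACK = 0x6474E551  # GNU program header type describing stack permissions
PF_X = 0x1                 # "executable" bit in p_flags

def stack_is_executable(elf: bytes):
    """Return True/False from PT_GNU_STACK, or None if the header is missing."""
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")
    (e_phoff,) = struct.unpack_from("<Q", elf, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 0x36)
    for i in range(e_phnum):
        # In ELF64 each program header starts with p_type then p_flags.
        p_type, p_flags = struct.unpack_from("<II", elf, e_phoff + i * e_phentsize)
        if p_type == PT_GNU_STACK:
            return bool(p_flags & PF_X)
    return None

# Usage (assumes ./a.out exists):
# with open("./a.out", "rb") as fh:
#     print(stack_is_executable(fh.read()))
```

A result of False corresponds to the NX case described above, where returning into shellcode on the stack simply faults.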

ASLR (Address Space Layout Randomization)

What is randomized?

Stack

Heap

libc

mmap regions

What stays constant?

The binary itself, if it is not PIE

Why does it matter?

Addresses differ on every run:

system = 0x7f....

(a different value each execution)

How to bypass it:

Leak an address, for example:

puts@got

printf

Compute the libc base:

libc_base = leaked_puts - puts_offset

Then locate:

system
/bin/sh
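The base calculation is plain subtraction, and a page-alignment check catches most mistakes. In the sketch below every number is hypothetical: the offsets are made up for illustration and in practice come from the exact libc build the target uses.

```python
def libc_base(leaked_puts: int, puts_offset: int) -> int:
    """libc_base = leaked_puts - puts_offset, with a page-alignment sanity check."""
    base = leaked_puts - puts_offset
    if base & 0xFFF:
        raise ValueError("base is not page-aligned; wrong offset or wrong libc build")
    return base

# Hypothetical values, not taken from any real libc.
PUTS_OFFSET = 0x80E50
SYSTEM_OFFSET = 0x50D70
leaked_puts = 0x7F3A1C280E50

base = libc_base(leaked_puts, PUTS_OFFSET)
print(hex(base))                  # 0x7f3a1c200000
print(hex(base + SYSTEM_OFFSET))  # 0x7f3a1c250d70
```

Because ASLR shifts the whole library as one block, a single leaked address is enough to recover every other symbol in it for that run.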

In a typical target all of these are enabled:

Canary

NX

ASLR

But there is also:

an info leak

or a controlled overflow

so the chain becomes:

Leak → ROP → ret2libc

What do we do after controlling RIP?

Jumping to our own code

So far:

The program has crashed

We have the exact offset

We have control of RIP / EIP

The important question now is:

Now that we control RIP, where should we point it?

There are two main paths

The classic path: executing our own code (shellcode)

The idea:

You write a small piece of assembly code

You inject it through the program's input

You point RIP at that code

This code is called shellcode

Where is the problem?

On old systems this worked very easily, but modern systems have a big barrier called:

NX (Non-Executable Memory)

That is:

The stack and heap are for data only

The CPU refuses to execute code from those pages

So even if you point RIP at the stack, the program crashes instead of running your code
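On Linux you can see these permissions for a running process in /proc/<pid>/maps. The sketch below parses one line of that file; the helper name is hypothetical and the sample line is made up, but it follows the usual field order (address range, permissions, offset, device, inode, name).

```python
def parse_maps_line(line: str) -> dict:
    """Split one /proc/<pid>/maps line into its address range, permissions, and name."""
    fields = line.split()
    start, end = (int(x, 16) for x in fields[0].split("-"))
    perms = fields[1]
    return {
        "start": start,
        "end": end,
        "perms": perms,
        "executable": "x" in perms,
        "name": fields[5] if len(fields) > 5 else "",
    }

# Made-up sample line in the usual format.
sample = "7ffc1a2b3000-7ffc1a2d4000 rw-p 00000000 00:00 0                          [stack]"
info = parse_maps_line(sample)
print(info["name"], info["perms"], info["executable"])  # [stack] rw-p False
```

The rw-p permissions on the stack region are what NX looks like from the outside: readable and writable, but with no x bit.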

Advanced Stack Frames

The most essential background for exploitation and ROP

Objective:

Understand exactly what ends up on the stack:

Saved RBP

Return address

Local variables

Padding / alignment

Callee-saved (call-preserved) registers

How are local variables placed on the stack?

C code:

int func(int x) {
    int a = 5;
    int b = x + 3;
    return a + b;
}

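Compiled with -O0 (approximately):

push rbp
mov rbp, rsp
sub rsp, 32                   ; reserve space for locals (a leaf function may skip this and use the red zone)
mov DWORD PTR [rbp-20], edi   ; x arrives in edi (System V AMD64) and is spilled to the stack
mov DWORD PTR [rbp-4], 5      ; a = 5
mov eax, DWORD PTR [rbp-20]   ; eax = x
add eax, 3
mov DWORD PTR [rbp-8], eax    ; b = x + 3
mov eax, DWORD PTR [rbp-4]
add eax, DWORD PTR [rbp-8]    ; return a + b
leave
ret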
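To keep the picture of this frame straight, it helps to list the slots by their offset from RBP. The sketch below is a hypothetical helper that assumes a frame-pointer-based x86-64 frame, with the saved RBP at [rbp+0] and the return address at [rbp+8]; the offsets for a and b are the ones used in the listing.

```python
def frame_layout(local_offsets: dict) -> list:
    """Return (offset, label) pairs sorted from high to low address, relative to RBP."""
    slots = {8: "return address", 0: "saved RBP"}
    for name, offset in local_offsets.items():
        if offset >= 0:
            raise ValueError("locals live below RBP, so offsets must be negative")
        slots[offset] = name
    return sorted(slots.items(), reverse=True)

for offset, label in frame_layout({"a": -4, "b": -8}):
    print(f"[rbp{offset:+d}] {label}")
# [rbp+8] return address
# [rbp+0] saved RBP
# [rbp-4] a
# [rbp-8] b
```

Reading the list bottom to top shows the direction an overflow travels: writes that start in a local move upward through the saved RBP and into the return address.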

Important note for exploit:

Locals always start at: