BIOS execution in QEMU: software MMU address translation

In the previous article (BIOS execution in QEMU: where it all starts) we described how QEMU starts executing the BIOS image and what binary translation means. Even though we found the BIOS image instructions in host memory and established a rough correlation to the emulated physical addresses executed by QEMU’s CPU, there was still a gap in between. That gap is what we will explore now.

Modern CPUs view memory as a virtual, contiguous address range. A hardware component called the MMU (Memory Management Unit) translates virtual addresses into physical addresses, which ultimately go onto the system memory bus. To improve performance, MMUs implement a caching scheme called the TLB (Translation Lookaside Buffer). These same concepts of ‘translating between two disjoint address spaces’ and ‘caching translations’ are implemented by QEMU’s software MMU (softmmu).
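The two ideas above (translation plus caching) can be illustrated with a minimal sketch. This is not QEMU's code: the names SoftTlb, tlb_translate and slow_translate, the 4 KiB page size and the 256-entry direct-mapped layout are all assumptions chosen for illustration, and guest memory is simply modeled as one host buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BITS   12                       /* assumed 4 KiB pages */
#define PAGE_SIZE   ((uint64_t)1 << PAGE_BITS)
#define PAGE_MASK   (~(PAGE_SIZE - 1))
#define TLB_ENTRIES 256                      /* assumed direct-mapped cache size */
#define TLB_INVALID UINT64_MAX               /* not page-aligned, so it never matches */

typedef struct {
    uint64_t guest_page;   /* page-aligned guest address cached in this slot */
    uint8_t *host_page;    /* host pointer backing that guest page */
} TlbEntry;

typedef struct {
    TlbEntry entries[TLB_ENTRIES];
    unsigned long hits, misses;
    uint8_t *ram;          /* host buffer that models guest memory */
    uint64_t ram_size;     /* size of that buffer, a multiple of PAGE_SIZE */
} SoftTlb;

static void tlb_init(SoftTlb *tlb, uint8_t *ram, uint64_t ram_size)
{
    assert(ram_size % PAGE_SIZE == 0);
    for (size_t i = 0; i < TLB_ENTRIES; i++)
        tlb->entries[i].guest_page = TLB_INVALID;
    tlb->hits = tlb->misses = 0;
    tlb->ram = ram;
    tlb->ram_size = ram_size;
}

/* Slow path: stands in for the expensive lookup a real MMU would perform. */
static uint8_t *slow_translate(const SoftTlb *tlb, uint64_t guest_page)
{
    if (guest_page >= tlb->ram_size)
        return NULL;                         /* address is not backed by memory */
    return tlb->ram + guest_page;
}

/* Fast path: consult the cache first and fall back to the slow path on a miss. */
static uint8_t *tlb_translate(SoftTlb *tlb, uint64_t guest_addr)
{
    uint64_t page = guest_addr & PAGE_MASK;
    TlbEntry *e = &tlb->entries[(guest_addr >> PAGE_BITS) % TLB_ENTRIES];

    if (e->guest_page != page) {
        uint8_t *host_page = slow_translate(tlb, page);
        if (host_page == NULL)
            return NULL;
        e->guest_page = page;                /* refill the slot for next time */
        e->host_page = host_page;
        tlb->misses++;
    } else {
        tlb->hits++;
    }
    return e->host_page + (guest_addr & ~PAGE_MASK);
}
```

A caller would initialize the structure with tlb_init and then call tlb_translate for every guest access. The first access to a page takes the slow path; later accesses to the same page are served from the cached entry.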

BIOS execution in QEMU: where it all starts

Reading Linux Inside (by 0xAX) and the early steps of the kernel boot process sparked my curiosity about the BIOS code executed immediately after power-on. Somewhat related to that, tinkering with QEMU has been on my backlog for quite some time. That made the recipe for a new (and quick) challenge: debug QEMU while it executes the first instructions of the BIOS firmware.

The first step was to set up an environment to compile and debug QEMU. The ability to compile is actually optional. My RPM-based environment runs on top of Fedora 31 (x86_64). The QEMU version is 4.1.1. Binary translation mode was used for emulation (hardware acceleration may come in a follow-up).

With no other preamble, let’s go straight to the task.

GCC Signal Exceptions – Part 1

Fig. 1: proof-of-concept for handling segmentation fault signals in C as programmable exceptions

Signal Exceptions is a proof-of-concept extension to the C language with the goal of handling POSIX signals as programmable exceptions. This work is based on the GCC compiler and the libgcc runtime for a Linux x86-64 platform. Only segmentation fault (SIGSEGV) signals are currently in scope, but support for other signals can be added in the future using the same foundations. Source code and more documentation are available at the end of this article.

With that said, I’ll briefly summarize some of the challenges and decisions taken while developing this project.

How terminal emulators work on Linux?

A few days back, while debugging the delivery of a SIGINT signal from the kernel to a foreground process, I wondered how terminal emulators work. In this article I’ll be using gnome-terminal, bash and the yes application as an example, but the same concept should apply to any combination of a terminal emulator, a shell and a console application.

Gnome-terminal runs under a graphical environment (such as Xorg), so any key pressed on the keyboard has to turn into a character obtainable from there. Bash runs as a child process of gnome-terminal, and yes as a child process of bash. Console application developers are used to reading input from STDIN and writing to STDOUT or STDERR. But how are characters delivered from one end to the other?

Global Descriptor Table (GDT) in Linux x86-64

Even though x86 memory segmentation is largely unused in 64-bit mode, the Linux kernel still initializes the CPU’s Global Descriptor Table (GDT) and points the gdt register to it. I was curious about its contents and how segment selectors can be used today. This article is a brief summary of my experiments and findings.

To begin with, we can read the gdt register by calling native_store_gdt in kernel space (arch/x86/include/asm/desc.h):

We see there that the register contains a virtual address, so in protected mode segmentation is applied before paging.

Hardware breakpoints in the Linux kernel through perf_events

Hardware breakpoints are quite useful while developing user or kernel space software. An interesting feature, which is why they are also known as memory breakpoints, is that they interrupt execution whenever the processor executes, reads or writes a specific virtual address. This feature is not available in software breakpoints.
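To make the "interrupt on write to an address" idea concrete, here is a minimal user-space sketch built on the perf_event_open system call. It is an illustration only and not the kernel-side approach this article is about: the helper names make_write_watchpoint, open_watchpoint and count_writes are hypothetical, the watched region is assumed to be 4 bytes wide and 4-byte aligned, and opening the event may fail depending on hardware support and the system's perf_event_paranoid setting.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/hw_breakpoint.h>
#include <linux/perf_event.h>

/* Describe a hardware breakpoint that fires on writes to 4 bytes at addr. */
static void make_write_watchpoint(struct perf_event_attr *attr, const void *addr)
{
    assert((uintptr_t)addr % 4 == 0);     /* address must match bp_len alignment */
    memset(attr, 0, sizeof(*attr));
    attr->type = PERF_TYPE_BREAKPOINT;
    attr->size = sizeof(*attr);
    attr->bp_type = HW_BREAKPOINT_W;      /* trigger on writes only */
    attr->bp_addr = (uintptr_t)addr;
    attr->bp_len = HW_BREAKPOINT_LEN_4;
    attr->exclude_kernel = 1;             /* only count user-space accesses */
    attr->exclude_hv = 1;
}

/* Open the event for the calling thread on any CPU; returns an fd or -1. */
static int open_watchpoint(struct perf_event_attr *attr)
{
    return (int)syscall(SYS_perf_event_open, attr, 0, -1, -1, 0UL);
}

/* Write to *target a few times and report how many hits the counter saw.
 * Returns -1 when the watchpoint could not be installed or read. */
static long long count_writes(volatile int32_t *target, int writes)
{
    struct perf_event_attr attr;
    uint64_t count = 0;
    ssize_t n;
    int fd;

    make_write_watchpoint(&attr, (const void *)target);
    fd = open_watchpoint(&attr);
    if (fd < 0)
        return -1;                        /* no permission or no hardware support */

    for (int i = 0; i < writes; i++)
        *target = i;                      /* each store should hit the breakpoint */

    n = read(fd, &count, sizeof(count));  /* default read format: one u64 counter */
    close(fd);
    return n == (ssize_t)sizeof(count) ? (long long)count : -1;
}
```

Calling count_writes on a static 32-bit variable from main is enough to try it. Here the breakpoint only increments a counter, which corresponds to the "trace without stopping" use case; stopping execution would require a debugger or a signal-based handler on top of the same mechanism.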

Once a breakpoint is hit, there might be cases in which we want execution to stop, so that single-stepping or memory analysis is possible, and cases in which we want traces to be generated without a full stop. I won’t make a distinction between these use cases, given that the underlying mechanism is the same.

cipherchat: Ekoparty’s 2019 CTF write-up

Last week I joined a group of colleagues from Core Security to participate in Ekoparty’s 2019 official CTF, organized by null life. I picked up one of the crypto challenges: cipherchat. This write-up describes the process followed to solve it.

The starting point was a PCAP capture and a hint stating that there was encrypted traffic inside. First off, I opened the capture in Wireshark and verified that there was a perfectly well-formed TCP stream between two parties. The data carried by these TCP packets contained no ASCII characters. At that point I was not sure whether every single byte was encrypted or whether there was a mix of a binary protocol and encrypted bytes.

ASMifier: from Java bytecode to “programmable” bytecode

The ASM framework for bytecode instrumentation is a powerful tool. It has an interesting feature named ASMifier that I’ll show in this article. Given a classfile (Java bytecode), ASMifier can generate Java source code with all the invocations to its own API that are necessary to replicate it. In other words, if you compile and invoke the generated Java source code, what you get is exactly the same classfile you had. The advantage is that you can now play with the generated source and easily create variations of the classfile.

Let’s see an example.

pthread_cancel and stack unwinding

Thread cancellation is a bit trickier than one might expect. The reason is resource cleanup and maintaining a consistent state. As an example, a cancelled thread may need to unlock its mutex objects, free memory, close file descriptors and delete named shared memory, and the list goes on.

The idiom to make this work in pthreads is cleanup handlers. These handlers work like a stack, and can be pushed or popped at any time during run time. When a thread is cancelled by means of a call to pthread_cancel, the handlers present in the stack are executed automatically while the unwinding process occurs.

Here we have a piece of code showing how this API works:
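A minimal sketch of the cleanup handler API follows. It is an illustration under stated assumptions and not the article's original listing: the names worker, free_buffer, unlock_mutex, handler_log and run_cancel_demo are hypothetical, and the worker simply sleeps in a loop so that cancellation is acted upon at the sleep() cancellation point.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

enum { HANDLER_FREE = 1, HANDLER_UNLOCK = 2 };

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int handler_log[2];    /* order in which the handlers ran */
static int cleanup_calls;     /* written by the worker, read after join */

static void free_buffer(void *arg)
{
    free(arg);
    handler_log[cleanup_calls++] = HANDLER_FREE;
}

static void unlock_mutex(void *arg)
{
    pthread_mutex_unlock(arg);
    handler_log[cleanup_calls++] = HANDLER_UNLOCK;
}

static void *worker(void *unused)
{
    (void)unused;
    char *buf = malloc(64);

    pthread_cleanup_push(free_buffer, buf);      /* pushed first, runs last */
    pthread_mutex_lock(&lock);
    pthread_cleanup_push(unlock_mutex, &lock);   /* pushed last, runs first */

    for (;;)
        sleep(1);   /* cancellation point: a pending cancel is acted on here */

    /* Never reached, but push and pop must be paired in the same scope. */
    pthread_cleanup_pop(1);
    pthread_cleanup_pop(1);
    return NULL;
}

/* Start the worker, cancel it, and return how many handlers ran (-1 on error). */
static int run_cancel_demo(void)
{
    pthread_t tid;
    void *result = NULL;

    cleanup_calls = 0;
    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;

    pthread_cancel(tid);          /* deferred: takes effect at sleep() */
    pthread_join(tid, &result);
    assert(result == PTHREAD_CANCELED);
    return cleanup_calls;
}
```

Calling run_cancel_demo from main (and building with -pthread) exercises both handlers. Because handlers are popped in reverse order of registration, the mutex is released before the buffer is freed, and after the join the mutex can be locked again by another thread.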
