Signal Exceptions is a proof-of-concept extension to the C language whose goal is to handle POSIX signals as programmable exceptions. This work is based on the GCC compiler and the libgcc runtime for a Linux x86-64 platform. Only segmentation fault (SIGSEGV) signals are currently in scope, but support for other signals can be added in the future using the same foundations. Source code and more documentation are available at the end of this article.
With that said, I’ll briefly summarize some of the challenges and decisions taken while developing this project.
Lexical analysis
Two new reserved words were added to the C language: __try and __catch. Existing RID_TRY and RID_CATCH C-family tokens were mapped to these lexemes respectively.
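Code that uses the new keywords only builds with the patched compiler. As a rough illustration of the control flow a __try/__catch pair expresses, here is a sketch in standard C for Linux that approximates it with sigsetjmp and siglongjmp. This is not how the extension is implemented (it relies on GCC's exception tables and unwinder, as the rest of the article explains). The names guarded_store, make_inaccessible_page and jump_to_catch are made up for this example, and the __try/__catch snippet in the comment is an assumption about the surface syntax.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>

/* Assumed shape of the equivalent code with the extension:
 *     __try { *addr = value; } __catch { result = -1; }
 */

static sigjmp_buf catch_point;

/* SIGSEGV handler: jump back to the point saved by sigsetjmp. */
static void jump_to_catch(int sig)
{
    (void)sig;
    siglongjmp(catch_point, 1);
}

/* Stores value through addr. Returns 0 on success, -1 if the store
 * raised SIGSEGV. */
int guarded_store(volatile int *addr, int value)
{
    struct sigaction sa, old;
    int result;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = jump_to_catch;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGSEGV, &sa, &old);
    assert(rc == 0);

    /* savemask = 1: siglongjmp restores the signal mask, so SIGSEGV is
     * deliverable again after the handler has been left. */
    if (sigsetjmp(catch_point, 1) == 0) {
        *addr = value;          /* "try" body */
        result = 0;
    } else {
        result = -1;            /* "catch" body */
    }

    sigaction(SIGSEGV, &old, NULL);   /* restore the previous handler */
    return result;
}

/* Helper for experiments: a page with no access permissions, or NULL. */
void *make_inaccessible_page(void)
{
    void *p = mmap(NULL, 4096, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

Calling guarded_store on an ordinary variable returns 0 and performs the store. Calling it on the page returned by make_inaccessible_page returns -1 instead of terminating the process.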
Syntactic analysis (parsing)
While parsing statements in the C front-end (c_parser_statement_after_labels), new RID_TRY tokens may come from the input and will be handled by c_parser_try_statement. Within this function, c_parser_compound_statement is invoked twice to parse try and catch sub-trees (list of statements). TRY_CATCH_EXPR and CATCH_EXPR GENERIC Abstract Syntax Tree (AST) nodes are built respectively. It is worth noting that GCC has support for exceptions in the GENERIC AST (used by front-ends such as C++), so there was no need to introduce new types of nodes.
Optimization
It’s essential for the compiler to determine if a statement may throw an exception or not. The reason is optimization: removing all possible dead code and exception-handling annotations saves space. For segmentation fault signals, memory access (read, write and execute) is the key element to make a decision. Whether or not a memory access may cause a trap is an open discussion, though; different criteria can be applied.
On one side of the argument, it’s possible to assume that memory accesses to compiler-allocated segments cannot throw. For example: global variables declared const and placed in a read-only section won’t be written; the stack can be read or written at any time; jumps or calls to code will always have execute permissions; array accesses with an index known to be within bounds are safe; and so on. However, an application may execute the mprotect system call to remove permissions from any page; or write a read-only variable through indirection (so the compiler cannot catch the error statically); or exhaust the space reserved for the stack; or perform any other unexpected action.
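The mprotect case can be observed directly. The following is a minimal sketch, assuming Linux and using an anonymous mapping as a stand-in for any memory the compiler might consider safe. The same store succeeds once and then traps after the page permissions are changed at run time. The fault is provoked in a child process so the caller survives; the function name is hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the number of the signal that killed the child, 0 if the child
 * exited normally, or -1 if fork/waitpid failed. */
int signal_from_write_after_mprotect(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: make sure SIGSEGV has its default (terminating) action. */
        signal(SIGSEGV, SIG_DFL);

        volatile char *buf = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (buf == MAP_FAILED)
            _exit(2);

        buf[0] = 1;                     /* fine: the page is writable */

        if (mprotect((void *)buf, (size_t)page, PROT_READ) != 0)
            _exit(3);

        buf[0] = 2;                     /* same kind of store, now traps */
        _exit(0);                       /* not reached if the store trapped */
    }

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On Linux this is expected to return SIGSEGV. Nothing in the second store distinguishes it statically from the first one.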
The trade-off is between taking a more conservative posture and paying the cost of blocking optimization, or making assumptions that look reasonable for most of the cases and allow savings. I’ve taken the latter approach and focused on indirect memory accesses, array access when the index is not guaranteed to be within boundaries and inline assembly. There are many improvement opportunities in this area which would be necessary for turning this proof-of-concept project into a real feature.
Final notes on compilation
The Signal Exceptions extension sits on top of GCC’s middle and back end exceptions. The generated ELF binary will have the executable code for catch handlers and the annotations that allow a runtime to unwind the stack and decide who handles the exception (if anyone).
Runtime initialization
Once a Signal-Exceptions aware binary is executed or its shared object equivalent is dynamically loaded into a process, a signal handler will be needed. Two questions immediately arise: 1) given an executable binary or a shared object, how do we know whether it is Signal-Exceptions aware? and 2) if it is, who is responsible for registering a signal handler, and when should that be done?
Both the dynamic loader and glibc execute code at initialization time, so one option would be to analyze the executable binary or shared object there and, if it is Signal-Exceptions aware, proceed with the signal handler registration.
A GCC-generated ELF binary with exceptions support can be identified because the following sections are present: .gcc_except_table, .eh_frame and .eh_frame_hdr. Distinguishing between C++, C or other language exceptions is trickier but can probably be done through the personality function symbol. However, deciding whether a C binary has exceptions support because it is Signal-Exceptions aware or because it uses pthread cleanup handlers is even more involved.
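To give an idea of what the section-based check would involve, here is a small sketch that looks up a section by name in an ELF64 file using the definitions from <elf.h>. It only covers the first step (are the three sections present?) and says nothing about the personality function. The function names are hypothetical, and this is not code from the project, which took a different path.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <elf.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if the ELF64 file at path has a section called name,
 * 0 if it does not, and -1 on I/O or format errors. */
int elf_has_section(const char *path, const char *name)
{
    assert(path != NULL && name != NULL);

    Elf64_Ehdr eh;
    Elf64_Shdr *sh = NULL;
    char *names = NULL;
    size_t names_len = 0;
    int result = -1;

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    /* ELF header: check the magic, the 64-bit class and the section table. */
    if (fread(&eh, sizeof eh, 1, f) != 1 ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_shentsize != sizeof(Elf64_Shdr) ||
        eh.e_shnum == 0 || eh.e_shstrndx >= eh.e_shnum)
        goto out;

    /* Section header table. */
    sh = malloc((size_t)eh.e_shnum * sizeof *sh);
    if (sh == NULL || fseek(f, (long)eh.e_shoff, SEEK_SET) != 0 ||
        fread(sh, sizeof *sh, eh.e_shnum, f) != eh.e_shnum)
        goto out;

    /* Section name string table. */
    names_len = (size_t)sh[eh.e_shstrndx].sh_size;
    names = malloc(names_len + 1);
    if (names == NULL ||
        fseek(f, (long)sh[eh.e_shstrndx].sh_offset, SEEK_SET) != 0 ||
        fread(names, 1, names_len, f) != names_len)
        goto out;
    names[names_len] = '\0';

    result = 0;
    for (size_t i = 0; i < eh.e_shnum; i++) {
        if (sh[i].sh_name < names_len &&
            strcmp(names + sh[i].sh_name, name) == 0) {
            result = 1;
            break;
        }
    }

out:
    free(names);
    free(sh);
    fclose(f);
    return result;
}

/* 1 only if all three sections named in the text are present. */
int elf_has_exception_sections(const char *path)
{
    return elf_has_section(path, ".gcc_except_table") == 1 &&
           elf_has_section(path, ".eh_frame") == 1 &&
           elf_has_section(path, ".eh_frame_hdr") == 1;
}
```

For instance, elf_has_exception_sections("/proc/self/exe") inspects the running program. Even a positive answer leaves open which language or feature produced those sections.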
Considering this complexity, I decided to take a different path: the Signal-Exceptions aware binary generated by GCC will contain an artificial constructor that calls the runtime for initialization. Note how the initiative comes from the executable binary instead of the runtime. The runtime will then register a signal handler if not previously done. Fig. 5 shows the C-language equivalent of this constructor. Its symbol is weak, so once objects are linked there is only one function. The body contains a call to a runtime function located in libgcc (GCC builtins provide a convenient way to call and dynamically link external libraries).
The constructor is injected at the AST level, as shown in Fig. 6.
There is one significant drawback to this approach. The .init_array ELF section will contain one pointer to the constructor per Signal-Exceptions aware object. That means that the runtime may receive multiple (and unnecessary) calls per binary, affecting initialization performance. Only when Link Time Optimization (LTO) is enabled in GCC (-flto flag) is the .init_array flattened so that a single entry is present.
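The constructor-driven initialization described above can be sketched in plain C with GCC attributes. This is not the article's figure or the project's code: the function and variable names are hypothetical, the runtime entry point is a local stand-in for the libgcc function, and the handler body is a placeholder. It shows a weak constructor calling into a runtime that tolerates repeated calls and registers the handler only once.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <string.h>

static int init_requests;          /* calls received from constructors  */
static int handler_registrations;  /* times the handler was installed   */

/* Placeholder handler: falls back to the default action, so returning
 * re-executes the faulting instruction and terminates the process. A real
 * runtime would hand control to the unwinder instead. */
static void demo_segv_handler(int sig, siginfo_t *info, void *context)
{
    (void)info;
    (void)context;
    signal(sig, SIG_DFL);
}

/* Hypothetical stand-in for the runtime's initialization function. */
void demo_signal_exceptions_runtime_init(void)
{
    init_requests++;
    if (handler_registrations > 0)
        return;                     /* already registered: nothing to do */

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = demo_segv_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGSEGV, &sa, NULL);
    assert(rc == 0);
    handler_registrations++;
}

/* Weak constructor: every object file may carry a copy, and the linker
 * keeps a single definition. It runs before main(). */
__attribute__((constructor, weak))
void demo_signal_exceptions_ctor(void)
{
    demo_signal_exceptions_runtime_init();
}
```

Because the runtime function returns early once the handler is installed, extra .init_array entries cost one call each but do not re-register anything.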
Signal handling
The signal handler is called when the kernel handles a trap, and the context (register values) that triggered the trap is provided as an argument. This is the equivalent of a throw statement and a call to the runtime in languages such as C++. Before moving execution to the runtime unwinder, we need to return from the signal handler, passing all the way back through the kernel. On x86-64, the context’s RIP can be set so that execution continues in a trampoline that finally calls _Unwind_ForcedUnwind.
In addition to continuing execution, the address of the instruction that triggered the trap has to be communicated to the runtime unwinder. In a C++ application, the runtime obtains this address by subtracting 1 from the return address pushed to the stack (so it points to the last byte of the throw-call instruction). In the Signal Exceptions case, it’s possible to modify the context values and emulate the same behavior. First, the stack has to be grown by the size of a pointer (that is, the context’s stack pointer is decremented by 8 bytes). The address where the trap happened is the context’s RIP value, which can then be written to the top of the context’s stack.
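The context manipulation described above can be sketched as follows, assuming Linux on x86-64 with glibc (REG_RIP and REG_RSP require _GNU_SOURCE). This is an illustration and not the project's handler: the trampoline here records its apparent return address and leaves with siglongjmp instead of calling _Unwind_ForcedUnwind, force_align_arg_pointer is added defensively because the faulting code's stack alignment is unknown, and all names are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <ucontext.h>

static sigjmp_buf recover_point;
static volatile uintptr_t fault_rip;      /* RIP recorded by the handler     */
static volatile uintptr_t trampoline_ra;  /* return address seen afterwards  */

/* Stand-in for the trampoline. It is entered as if the faulting instruction
 * had called it, so its return address is the faulting RIP. */
__attribute__((noinline, force_align_arg_pointer))
static void demo_trampoline(void)
{
    trampoline_ra = (uintptr_t)__builtin_return_address(0);
    siglongjmp(recover_point, 1);   /* a real runtime would unwind here */
}

static void on_segv(int sig, siginfo_t *info, void *raw_context)
{
    (void)sig;
    (void)info;
    ucontext_t *uc = raw_context;
    greg_t *regs = uc->uc_mcontext.gregs;

    fault_rip = (uintptr_t)regs[REG_RIP];

    /* Emulate a call: grow the interrupted stack by one pointer and store
     * the faulting address where a return address would be. */
    regs[REG_RSP] -= (greg_t)sizeof(uintptr_t);
    *(uintptr_t *)(uintptr_t)regs[REG_RSP] = fault_rip;

    /* When the kernel restores this context, execution continues in the
     * trampoline instead of retrying the faulting instruction. */
    regs[REG_RIP] = (greg_t)(uintptr_t)demo_trampoline;
}

/* Returns 1 if the trampoline saw the faulting RIP as its return address. */
int demo_redirect_fault(void)
{
    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGSEGV, &sa, &old);
    assert(rc == 0);

    void *page = mmap(NULL, 4096, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return 0;

    fault_rip = 0;
    trampoline_ra = 0;
    if (sigsetjmp(recover_point, 1) == 0)
        *(volatile char *)page = 1;     /* traps: the page has no access */

    munmap(page, 4096);
    sigaction(SIGSEGV, &old, NULL);
    return fault_rip != 0 && trampoline_ra == fault_rip;
}
```

Note that the store at RSP minus 8 lands in the interrupted function's red zone. That is harmless in this sketch because control never returns to that function.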
Stack unwinding
The libgcc runtime has a stack unwinder which is used for pthread cleanup handlers (see more information here). This project builds on it. If the stop_function detects that the exception is not handled once the call stack is completely traversed, the default signal handler is re-installed and executed.
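The stop-function mechanism can be tried in isolation. Below is a minimal sketch, assuming GCC's <unwind.h> and the unwind tables that GCC emits by default on x86-64 Linux. It starts a forced unwind from ordinary C code, counts the frames visited by a stop function, and leaves with longjmp when the unwinder reports _UA_END_OF_STACK. The project's stop_function re-installs the default handler at that point instead; demo_stop_fn, demo_forced_unwind and the counter are hypothetical names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <setjmp.h>
#include <string.h>
#include <unwind.h>

static jmp_buf unwind_done;
static int frames_visited;
static struct _Unwind_Exception demo_exception;

/* Called by the unwinder once per frame, and one last time with
 * _UA_END_OF_STACK when no frames are left. */
static _Unwind_Reason_Code demo_stop_fn(int version, _Unwind_Action actions,
                                        _Unwind_Exception_Class exc_class,
                                        struct _Unwind_Exception *exc,
                                        struct _Unwind_Context *context,
                                        void *stop_parameter)
{
    (void)version;
    (void)exc_class;
    (void)exc;
    (void)context;
    int *counter = stop_parameter;
    assert(counter != NULL);

    if (actions & _UA_END_OF_STACK)
        longjmp(unwind_done, 1);    /* nobody handled it: stop here */

    (*counter)++;
    return _URC_NO_REASON;          /* keep unwinding */
}

/* Returns the number of frames the stop function visited, or -1 if the
 * unwinder returned with an error. */
int demo_forced_unwind(void)
{
    frames_visited = 0;
    memset(&demo_exception, 0, sizeof demo_exception);

    if (setjmp(unwind_done) == 0) {
        _Unwind_ForcedUnwind(&demo_exception, demo_stop_fn, &frames_visited);
        return -1;                  /* only reached on unwinder failure */
    }
    return frames_visited;
}
```

Since plain C frames carry no personality routine, the unwinder only walks the frames here. In the project, frames compiled with the extension would get a chance to run their catch handlers along the way.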
Other implementation notes
- The Signal Exceptions extension is under a -fsignal_exceptions GCC flag, not enabled by default.
- Most of the code is architecture independent, so porting to architectures other than x86-64 should be straightforward.
- This proof-of-concept has been tested on Fedora Linux only and the source code is based on GCC v.9.
Future work – Part 2
Here are some ideas to continue this project:
- Handle other signals in addition to SIGSEGV (e.g. SIGILL, SIGFPE, SIGBUS).
- Enable filters for catch exception handlers (e.g. handle SIGSEGV only).
- Make signals information available to the catch exception handler as local variables (signal number, signal info structure and context).
Download source code (GPLv2)
Download PDF v1.0 EN
Download PDF v1.0 ES
Update 2020-04-26: Signal Exceptions talk at netlabs (Spanish)
I am trying to create recovery logic for a SIGSEGV signal.
I understand the arguments against doing this; however, this is for a limited scenario.
From the signal handler function I can successfully throw a C++ runtime_error object to the code which triggered the SIGSEGV which then performs recovery logic.
The only problem I am having is that, after the first SIGSEGV, my signal handler is not invoked when the signal is triggered a second time.
My signal handler, BackTrace_SignalHandler(), contains this code:
// Unblock SIGSEGV, which is blocked while its handler is running.
sigset_t signalMask;
sigemptyset(&signalMask);
sigaddset(&signalMask, SIGSEGV);
sigprocmask(SIG_UNBLOCK, &signalMask, nullptr);
// Re-install this handler for the next SIGSEGV.
struct sigaction backTraceAction;
backTraceAction.sa_handler = BackTrace_SignalHandler;
sigemptyset(&backTraceAction.sa_mask);
backTraceAction.sa_flags = 0;
sigaction(SIGSEGV, &backTraceAction, nullptr);
//signal(SIGSEGV, BackTrace_SignalHandler);
// Leave the handler by throwing to the code that triggered the fault.
throw std::runtime_error("segfault");
I have been reading the following documentation:
https://ftp.gnu.org/old-gnu/Manuals/glibc-2.2.3/html_chapter/libc_24.html
https://www.gnu.org/software/libc/manual/html_node/Signal-Handling.html
Perhaps this signal triggering cannot be reenabled?
Do you have any suggestions?
I have code that dumps the back trace in the signal handler. That is what was causing the signal not to be triggered after the first time. I’ve added a counter to only dump the back trace when I want it to exit.