How do terminal emulators work on Linux?

A few days back, while debugging the delivery of a SIGINT signal from the kernel to a foreground process, I wondered how terminal emulators work. In this article I’ll be using gnome-terminal, bash and the yes application as examples, but the same concepts should apply to any combination of a terminal emulator, a shell and a console application.

Gnome-terminal runs under a graphical environment (such as Xorg), so every key pressed on the keyboard has to reach it from there as a character. Bash runs as a child process of gnome-terminal, and yes as a child process of bash. Console application developers are used to reading input from STDIN and writing to STDOUT or STDERR. But how are characters delivered from one end to the other?

Let’s start with the simplest case, where gnome-terminal is open and the bash prompt is showing. We will see what happens between pressing the “y” key and seeing the “y” drawn on the screen.

When launching a new window or tab, gnome-terminal asks the TTY driver* in the kernel for a new pseudo-terminal (PTY). One of the user-space APIs available for this request is posix_openpt, but it’s also possible to use the /dev/ptmx device directly. In response to this request, the TTY driver creates two PTY devices: master and slave. These devices work like a bidirectional pipe: what’s written on one side can be read on the other and vice versa. The TTY driver is responsible for carrying information in both directions.

Gnome-terminal gets a file descriptor for the newly created master device, which the kernel names ptm<n> (where ‘n’ is a sequential number). Bash has a file descriptor for the corresponding slave device, which the kernel names pts<n> and exposes as /dev/pts/<n>. For simplicity, let’s assume that the master device is ptm0 and the slave pts0. Both gnome-terminal and bash can now make sys_read and sys_write system calls on their file descriptors to exchange data with each other.
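
To make the naming concrete, here is a minimal Python sketch of what a terminal emulator does at this step. It uses os.openpty() from the standard library instead of calling posix_openpt directly, and I have not checked which of the two routes gnome-terminal actually takes, so treat it as an illustration only.

```python
import os

# Ask the kernel for a new pseudo-terminal pair.
# master_fd is the end a terminal emulator would keep;
# slave_fd is the end a shell would use as stdin/stdout/stderr.
master_fd, slave_fd = os.openpty()

# The slave has a name in the filesystem; the master is only reachable
# through the file descriptor that was handed back to us.
slave_path = os.ttyname(slave_fd)
print("slave device:", slave_path)  # on Linux, a path under /dev/pts/

os.close(slave_fd)
os.close(master_fd)
```

Running it a few times while other terminals are open should show the sequential number changing.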

Gnome-terminal obtains the “y” character from Xorg and writes it to ptm0. Bash, on the other side, reads the “y” from pts0. Then, bash writes the “y” to pts0 and gnome-terminal reads it from ptm0. Finally, gnome-terminal displays the “y” in the graphical window by calling the corresponding Xorg API. The “y” character travels back and forth through the PTY pipe between gnome-terminal and bash.
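
The same back and forth can be reproduced in a single process that plays both roles. The sketch below is only an approximation: round_trip is a name I made up, and I put the slave into raw mode with tty.setraw so that the kernel neither buffers the input until a newline nor echoes it by itself. My assumption is that this is close to the state an interactive bash leaves the terminal in while it reads a command line.

```python
import os
import tty


def round_trip(key: bytes):
    """Send `key` from the emulator side to the shell side and back."""
    master_fd, slave_fd = os.openpty()
    try:
        # Raw mode: no line buffering, no echo, no output translation.
        tty.setraw(slave_fd)

        os.write(master_fd, key)                # emulator: key press arrives
        typed = os.read(slave_fd, len(key))     # shell: reads its input
        os.write(slave_fd, typed)               # shell: echoes the character
        shown = os.read(master_fd, len(typed))  # emulator: reads what to draw
        return typed, shown
    finally:
        os.close(slave_fd)
        os.close(master_fd)


typed, shown = round_trip(b"y")
print("shell read:", typed, "emulator read:", shown)
```

Each of the four calls corresponds to one of the four steps described above.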

For a quick demo, I wrote a couple of code fragments and placed them at the beginning of tty_write and at the end of tty_read (kernel/drivers/tty/tty_io.c). After filtering for ptm0 and pts0, these fragments send the following information to dmesg: operation name (read or write), device name (ptm0 or pts0), current process name and the characters read or written.

Here is the output:

[  154.111191] tty_write on ptm0
[  154.111196] Current process: gnome-terminal-server
[  154.111198] Char: y
[  154.111249] tty_read on pts0
[  154.111251] Current process: bash
[  154.111253] Char: y
[  154.111343] tty_write on pts0
[  154.111345] Current process: bash
[  154.111346] Char: y
[  154.113133] tty_read on ptm0
[  154.113137] Current process: gnome-terminal-server
[  154.113139] Char: y

Note: Code fragments used to generate this output are trivial; I won’t publish them here because they are a bit ‘hacky’ and add no value.

Only one process opens the master device, but many can open its corresponding slave. In fact, bash may launch many processes (which inherit bash’s file descriptors after forking) and send them to the background. All of them can then write to pts0 at the same time, and we see a soup of characters in the gnome-terminal window! Only one process group can be in the foreground at a time and receive the input, though.
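
A small sketch of that “soup”, under the assumption that a handful of forked children stand in for bash’s background jobs (background_writers and its parameters are hypothetical names of mine). Every child inherits the slave descriptor across fork and writes to it; the parent plays the terminal emulator and reads everything from the master.

```python
import os
import tty


def background_writers(labels=(b"A", b"B", b"C"), repeat=5):
    """Fork one writer per label; return what arrives on the master side."""
    master_fd, slave_fd = os.openpty()
    tty.setraw(slave_fd)  # deliver the bytes to the master unmodified

    children = []
    for label in labels:
        pid = os.fork()
        if pid == 0:
            # Child: slave_fd was inherited from the parent.
            try:
                os.write(slave_fd, label * repeat)
            finally:
                os._exit(0)
        children.append(pid)

    for pid in children:
        os.waitpid(pid, 0)

    # Parent: read until everything the children wrote has arrived.
    expected = len(labels) * repeat
    received = b""
    while len(received) < expected:
        received += os.read(master_fd, expected - len(received))

    os.close(slave_fd)
    os.close(master_fd)
    return received


print(background_writers())
```

The order in which the labels show up depends on how the children get scheduled, which is exactly why the output of concurrent background jobs looks interleaved.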

Now that we’ve seen the “y” case, let’s analyze what happens when pressing Ctrl + C while an application, such as yes, is running in the foreground.

Pressing Ctrl + C reaches the window in focus through Xorg as a key event, which gnome-terminal turns into the non-printable ASCII character ETX (End-of-Text, represented by the byte 0x03) and writes to ptm0. The TTY driver, instead of passing the character through the PTY pipe to be read from pts0, recognizes it as special and treats it differently. In particular, it sends a SIGINT signal to the foreground process group of pts0 and echoes the characters “^C” as if they had been written to pts0. As with any characters written to pts0, they get delivered to ptm0. Gnome-terminal reads them from ptm0 and displays them on the screen. Once the yes process has died (it does not handle SIGINT), bash is notified of its child’s death and writes a “\n” (newline) to pts0.
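
This behaviour can be observed without a graphical terminal. The following sketch is my own, not the kernel instrumentation used earlier: pty.fork() creates a PTY and a child whose controlling terminal is the slave, the child runs yes, and the parent plays gnome-terminal by writing the byte 0x03 to the master. The function name interrupt_yes is hypothetical, and the sketch assumes the yes binary is on the PATH and that the new PTY starts with the default terminal settings (signal characters and echo enabled).

```python
import os
import pty
import signal


def interrupt_yes():
    """Run `yes` on a new PTY, send Ctrl+C through the master, report back."""
    pid, master_fd = pty.fork()
    if pid == 0:
        # Child: stdin/stdout/stderr are the PTY slave.
        try:
            # Make sure SIGINT has its default action before exec.
            signal.signal(signal.SIGINT, signal.SIG_DFL)
            os.execvp("yes", ["yes"])
        finally:
            os._exit(127)  # only reached if exec failed

    # Parent: wait for the first output so we know `yes` is running.
    os.read(master_fd, 1024)

    # The same byte a terminal emulator writes for Ctrl+C.
    os.write(master_fd, b"\x03")

    # Drain the master until the slave side is gone.
    output = b""
    while True:
        try:
            chunk = os.read(master_fd, 1024)
        except OSError:  # Linux reports EIO once the slave is closed
            break
        if not chunk:
            break
        output += chunk

    _, status = os.waitpid(pid, 0)
    os.close(master_fd)
    return output, status


output, status = interrupt_yes()
killed_by_sigint = os.WIFSIGNALED(status) and os.WTERMSIG(status) == signal.SIGINT
print("yes killed by SIGINT:", killed_by_sigint)
print("^C echoed to the master:", b"^C" in output)
```

Note that the parent never calls kill: the signal is generated by the kernel as a side effect of a plain write to the master.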

The TTY driver provides an ioctl to change the foreground process group of a PTY slave device: TIOCSPGRP, which the C library wraps as tcsetpgrp. Bash uses it for job control, for example when doing fg.
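
Here is a rough sketch of that hand-off, using Python’s os.tcgetpgrp and os.tcsetpgrp wrappers. The child created by pty.fork() plays the shell, and a grandchild placed in its own process group plays the job. The function name foreground_handoff is made up, and the exact sequence of calls bash performs is an assumption on my part; the sketch only shows the mechanism.

```python
import os
import pty
import signal


def foreground_handoff():
    """Return (shell_pgid, foreground_before, job_pgid, foreground_after)."""
    pid, master_fd = pty.fork()
    if pid == 0:
        # "Shell": session leader; fd 0 is the slave, its controlling terminal.
        try:
            before = os.tcgetpgrp(0)
            job = os.fork()
            if job == 0:
                # "Job": move into a process group of its own and wait.
                try:
                    os.setpgid(0, 0)
                    signal.pause()
                finally:
                    os._exit(0)
            os.setpgid(job, job)   # set it from the parent too, to avoid a race
            os.tcsetpgrp(0, job)   # hand the terminal's foreground to the job
            after = os.tcgetpgrp(0)
            report = f"{os.getpgrp()} {before} {job} {after}\n"
            os.write(1, report.encode())
            os.kill(job, signal.SIGKILL)
            os.waitpid(job, 0)
        finally:
            os._exit(0)

    # Parent: read the one-line report through the master.
    report = b""
    while b"\n" not in report:
        report += os.read(master_fd, 256)
    os.waitpid(pid, 0)
    os.close(master_fd)
    shell_pgid, before, job_pgid, after = map(int, report.split())
    return shell_pgid, before, job_pgid, after


shell_pgid, before, job_pgid, after = foreground_handoff()
print("foreground before:", before, "(shell group is", shell_pgid, ")")
print("foreground after: ", after, "(job group is", job_pgid, ")")
```

After the tcsetpgrp call, a Ctrl + C written to the master would be delivered to the job’s process group rather than to the shell’s.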

Finally, a couple of pointers if you want to take a deeper look into the TTY driver. The device file (either master or slave) uses the private_data member of struct file to reach a struct tty_struct pointer. The master and slave tty_struct structures are linked to each other through the link member. As a result, any sys_read or sys_write system call on a PTY device has the information required to transfer the data through the pipe.

—
(*) I’ve used the term TTY driver in a generic sense. More specifically, PTY devices are managed by the ptm and pts drivers (each a struct tty_driver).
