Part I – Foundations: What Is a Runtime?
Chapter 1 – Introduction to Runtime
In computer programming, a runtime system (or runtime environment) refers to the infrastructure that supports the execution of a program[1]. The term arises from the division of work between compile time (when code is translated) and runtime (when the program is actually running)[1]. In essence, the runtime is everything the program needs to run after it has been compiled – this includes memory management, the type system, and interfaces to the operating system (OS), among other services. Almost every programming language, from low-level C to high-level Python, relies on some form of runtime system to handle tasks that the compiled program alone cannot. By one definition, any program behavior not explicitly written in the source code can be attributed to the runtime system.
A runtime environment acts like a miniature operating system for the program: it provides a layer between the application code and the underlying OS or hardware. This layer manages resources such as memory, variables, and I/O devices on behalf of the program. For example, when your code reads from a file or prints text to the screen, it typically calls a function in the runtime library, which in turn interacts with the OS to perform the actual operation. The runtime thus offers a consistent interface and set of services so that the program can run on different platforms without modification. A classic example is the Java Runtime Environment (JRE), which allows Java programs to run identically on Windows, Linux, or macOS by abstracting away platform differences.
Key responsibilities of runtimes. At runtime, several important things happen: memory is allocated for variables and data structures; function calls are managed (parameters are passed and return values handled); and interactions with the OS (like requesting files or network communication) are mediated by the runtime system. Many runtimes also provide dynamic features such as type checking, debugging hooks, and just-in-time (JIT) compilation for performance. In modern languages, the runtime often includes a garbage collector that automatically reclaims unused memory, or a thread scheduler that manages concurrent execution of code.
Importantly, even languages traditionally considered “compiled” (like C or C++) have runtime components. For instance, when you run a C program, some setup code (often called the C runtime startup routine) runs before your main() function – setting up the stack, initializing global variables, etc. This code, along with standard libraries (like printf for printing output), constitutes the C runtime environment. In fact, the C language’s runtime inserts instructions to manage the stack (for function calls and local variables) and other low-level details of execution. Thus, every program is running with some runtime support: if not a heavy virtual machine, then at least a lightweight set of routines and conventions that the compiler and program rely on.
Runtime vs. other terms. It’s useful to clarify related terminology. A runtime environment (RTE) usually refers to the overall platform in which programs execute (for example, Node.js or the .NET Framework). An engine is the component that actually runs the code (for example, the V8 JavaScript engine inside Node.js, or the Java Virtual Machine for Java bytecode). An interpreter is a type of engine that executes code line-by-line without ahead-of-time compilation. A virtual machine (VM) often refers to an engine that provides a low-level instruction set (bytecode) and manages its own memory and stack, as the JVM or .NET CLR does. Many modern runtime engines use just-in-time (JIT) compilation, meaning they translate code to machine instructions on the fly for faster execution. For instance, the V8 engine for JavaScript and the PyPy implementation of Python both employ JIT techniques to optimize performance at runtime. Despite these nuances, all these concepts fall under the broad umbrella of “runtime” – the support system that bridges the gap between static code and dynamic execution.
In summary, the runtime is the execution context that breathes life into program code. It handles the behind-the-scenes work such as managing memory and calling conventions, interfacing with the OS, and enforcing language rules at runtime so that the developer can focus on writing code logic. The rest of this textbook will delve deeper into how different kinds of runtime systems are structured, how they evolved, and how they implement advanced features like memory management, concurrency, security, and more.
Chapter 2 – The Anatomy of a Runtime
What does a runtime system actually consist of? In this chapter, we dissect the typical components and mechanisms that make up a runtime. Understanding this “anatomy” provides a foundation for later exploring specific language runtimes and advanced features.
At a high level, a runtime system handles memory management, execution flow (control stack), and integration with system resources. Let’s break these down:
- Memory Layout and Management: Every running program has a well-defined memory layout divided into regions such as the code segment (for executable instructions), the data segment (for global/static variables), the heap (for dynamically allocated memory), and the stack (for function call frames and local variables). When a program starts, the runtime (often with help from the OS loader) sets up these segments.
Figure: Memory layout of a C program’s address space, with separate segments for text (code), data (initialized and uninitialized globals), heap, and stack. Each segment serves a distinct purpose[2]. The text segment contains the program’s compiled machine code and is usually marked read-only. The data segment holds global and static variables – it is often split into an initialized portion for variables with initial values and an uninitialized portion (BSS) for those set to zero by default. The heap is an area from which memory is dynamically allocated (e.g. via malloc in C) and typically grows upward (to higher addresses) as needed. The stack is used for function calls: whenever a function is invoked, a stack frame is pushed onto this region containing the function’s local variables, parameters, and return address[3]. The stack grows downward (toward lower addresses) and is managed in a Last-In-First-Out fashion as functions call and return.
A core job of the runtime is to manage these areas. For example, in a low-level language like C, the runtime provides a heap allocator (such as the malloc and free functions) to request and release heap memory. It also sets up the stack for the program’s entry point. In languages with automatic memory management, the runtime includes a garbage collector to reclaim heap space used by objects no longer in use (garbage collection will be discussed in depth in Chapter 7). Managing memory involves not just allocation but also enforcing memory safety rules of the language (like array bounds checking in Java, or preventing use-after-free errors in managed languages). We will see different approaches to memory management across runtimes.
- The Call Stack and Execution Model: The runtime system implements the language’s execution model, which includes how function calls are made, how control flows, and how the program’s state changes over time. A critical part of this is the call stack. At runtime, each active function has a stack frame with its local state. The runtime defines the calling convention – e.g., how parameters are passed (via registers or pushed on the stack), who is responsible for cleaning up the stack, etc. For example, in the C runtime, instructions are inserted to create space on the stack for locals and to copy function call parameters into the new stack frame. This behavior is invariant for all function calls in the language, and thus is handled by runtime code rather than the specific user program logic. The separation is such that the compiler generates code assuming a certain runtime stack discipline; when a function call happens, the runtime’s conventions ensure that the callee can find its arguments at a known place (say, on the stack or in specific registers) and that returning control will clean up properly.
The runtime may also handle advanced control flow features. For instance, implementing exceptions (in languages that have them) requires unwinding the call stack when an error occurs – a process managed by the runtime support code because it must walk through stack frames, run cleanup code (finally blocks), etc. Similarly, if a language supports coroutines or generators, the runtime needs to manage multiple stack contexts or an equivalent mechanism to suspend and resume execution. All these behaviors – function call setup/teardown, exception handling, coroutine scheduling – are implemented as part of the runtime’s execution model rather than in user-written code.
- Runtime Libraries and System Interface: Most runtimes include a library of common functionalities that abstract the operating system interface. This is sometimes called the runtime library or standard library of the language. It provides services like I/O (reading from files, printing to console), networking, threading primitives, and so on. For example, a call to printf("Hello") in C goes into the C runtime library, which internally will invoke an OS system call (like write on a POSIX system) to output text to the console. By routing through the runtime library, the program can remain portable and higher-level – the runtime takes on the job of handling OS idiosyncrasies. In higher-level languages, the runtime library can be quite rich (for instance, Python’s runtime provides everything from file objects to regular expression engines, implemented in C under the hood).
It’s useful to note that when we compile a program, the output (object code) typically doesn’t contain everything needed to run – external references to runtime library functions are left to be resolved at link time or load time. During the linking phase, the object code is combined with the runtime library code to produce a complete executable that contains both the user-defined functions and the runtime support routines. The resulting executable thus includes additional code (beyond the programmer’s source) that implements the runtime environment’s features. For instance, an object file might not include code to set up a stack frame for main(), but the final linked binary will include an _start routine (from the C runtime) that prepares the environment and then calls main. This highlights how the runtime’s anatomy is partly in libraries and partly in conventions/protocols followed by the compiler and the generated code.
In summary, the anatomy of a runtime system includes the structured memory layout (stack, heap, etc.), the mechanisms to manage function calls and control flow (the stack, calling convention, possibly a bytecode interpreter or JIT compiler in some runtimes), and the libraries or system interface that allow the program to perform I/O and other interactions. Some of these components are explicit code (like library functions), while others are abstract rules that the generated machine code or bytecode follows (like “push arguments on stack before calling”).
By understanding this anatomy, we can appreciate that when a program runs, it operates in a managed environment: memory is organized in certain ways, operations happen according to language-specific rules, and behind every high-level instruction there might be multiple layers of runtime activity. In the next chapters, we will see how different languages implement these runtime components – from the minimal C runtime to the extensive virtual machines of Java and .NET.
Lab Exercises:
- Exercise 2.1: Write a simple C program (just a main that prints a message) and use a tool like objdump or ndisasm to disassemble the compiled binary. Identify the additional instructions at the start of the program (before your main code executes). You should find the runtime startup code (crt0) that sets up the stack and eventually jumps to main. This will give you a concrete view of runtime initialization.
- Exercise 2.2: In a high-level language of your choice (Python, Java, etc.), attempt to draw the memory layout when a program runs. Write a function that allocates some variables (maybe an array or object) and prints addresses or identifiers for stack vs. heap allocations. For example, in C, you can print the address of a local variable (stack) and a malloc’d variable (heap) to see their relative locations. This helps illustrate the segmented memory model shown in the figure above.
- Exercise 2.3: Investigate the calling convention on your platform. Write two or three simple functions in C (e.g., a function that adds two numbers) and compile with debugging symbols. Use a debugger (like gdb) to step into a function call and inspect register/stack usage when the call happens. Observe which registers or stack positions hold the function arguments. This manual exploration reinforces how the runtime enforces a calling convention at the assembly level.
Chapter 3 – Operating System vs Runtime
Runtime systems and operating systems are closely related – both provide services to programs – but they operate at different layers and scopes. Understanding their relationship clarifies what responsibilities belong to the language runtime versus the underlying OS.
An operating system (OS) is the low-level software (like Windows, Linux, macOS) that manages hardware resources and provides common services for all applications on a computer. The OS controls things like CPU scheduling (deciding which process runs when), memory allocation (ensuring processes don’t trample on each other’s memory), device I/O (through drivers), and security enforcement. In essence, the OS is the universal runtime for machine code: it abstracts hardware into a set of generalized resources (processes, memory addresses, files, sockets) that programs can use.
A runtime system (as we’ve been discussing) typically sits above the OS. It often runs within the context of a single process and tailors the general OS services to the needs of a particular programming language or execution model. For example, an OS provides system calls for reading and writing files; a runtime library might provide a function like File.read() or printf() that internally invokes those system calls, possibly doing extra buffering or error handling.
One way to look at it: the OS is general-purpose – it must manage many programs and users – whereas a runtime is language-specific or application-specific – it focuses on the needs of code written in a certain language or environment. The runtime cooperates with the OS to get work done. In many cases, what we consider a runtime feature is implemented by translating runtime requests into OS requests. For instance, when a Java program opens a socket, the JVM runtime calls the underlying OS’s networking API. The runtime acts as an abstraction layer, hiding complexity or differences in OS services. This layering is what allows a Java or Python program to run on multiple OSes without change – the runtime adapts calls to whatever the host OS provides.
Despite this layered relationship, there is sometimes overlap. Some functionalities can be handled either by the OS or by the runtime. A classic example is thread scheduling for concurrency (discussed more in Chapter 8). Many language runtimes (like Java’s JVM) rely on the OS to schedule threads (each Java thread maps to an OS thread by default). But some runtimes implement their own scheduling (green threads) on top of a single OS thread. In that case, the runtime is doing a job the OS could do, but possibly with language-specific optimizations.
It’s also insightful to consider extremes: an OS itself can be viewed as a runtime system for programs written in machine code. In fact, the interface of system calls that an OS provides (like the Linux kernel’s calls for open(), read(), write(), etc.) can be thought of as the “API” through which user programs interact with the OS runtime. On the flip side, some applications bundle an entire OS or kernel with them as a specialized runtime. Historical examples include certain IBM PC applications in the 1980s that were distributed on a bootable disk with a minimal OS (like CP/M) included – when you powered on the machine with that disk, it booted directly into the app’s runtime, effectively dedicating the whole machine to that application. This blurs the line between OS and runtime, showing that in principle an application’s runtime could be the OS if it takes over the machine.
To clarify the typical division of labor:
- The Operating System manages resources globally. It handles isolation between processes, enforces security at the system level, and provides hardware abstraction. It schedules CPU time for processes and threads, allocates physical memory pages, communicates with disks and networks, etc. The OS operates with privileged access to hardware.
- The Runtime System manages resources within a program. It allocates and frees memory within the process (using OS-provided memory pages under the hood), schedules logical threads or tasks (which might map to OS threads or not), performs garbage collection within the process’s heap, and so on. The runtime operates as an unprivileged library or virtual machine on top of the OS.
A concrete illustration: Suppose you write a line print("Hello") in Python. At runtime, the Python interpreter (runtime) receives this instruction, converts it to a C library call (like fwrite to standard output) – that’s the runtime working. That C library call goes to the OS kernel via a system call (e.g., write on a file descriptor for stdout) – that’s the OS working. The OS then uses the hardware (through drivers) to actually output characters to your terminal. So the chain is Python code → Python runtime (C code) → OS system call → Hardware. The runtime and OS each handle their segment of the responsibility.
Interplay & Abstractions: Often, a runtime’s job is to provide uniformity across different OS environments. For example, environment variables are a feature provided by the OS, but a program accesses them through the runtime (like getenv() in C or System.getenv() in Java). Similarly, the runtime might offer high-level constructs (threads, file streams, graphical widgets) which underneath use OS primitives (processes, file descriptors, windowing system calls). The runtime thus hides complexity and bridges differences between OSes. This means a good runtime will translate its internal operations to OS operations in a way that the programmer doesn’t have to worry about whether they’re on Linux or Windows – e.g., Java’s file API works on any OS that the JVM runs on.
Interestingly, it also implies that an OS kernel can be seen as the lowest-level runtime: one for which the “language” is machine code and the “program” is any user process. The CPU hardware itself executes machine instructions and could be considered a runtime for assembly language. In that sense, runtimes exist in layers: hardware is a runtime for machine code, an OS is a runtime for processes (offering them system services), and a language runtime is a more specialized environment for programs in a particular language. Each layer abstracts the one below it.
Security considerations: The OS typically provides isolation between programs (one program’s memory cannot be accessed by another, etc.), but the runtime often provides isolation within a program. For example, in Java, the runtime’s SecurityManager and class loader can prevent untrusted code (like an applet) from performing certain actions, even though it’s running inside your process. This “sandbox” model (discussed in Chapter 9) relies on cooperation between the runtime and OS: the OS may prevent certain system calls, and the runtime further restricts what code can request in the first place. In general, the OS sets broad security and resource limits (like process cannot use more than X memory, or cannot open files it’s not permitted to), while the runtime can enforce finer-grained policies (like a Java applet cannot call System.exit() or cannot access the filesystem at all, even though the OS might allow the process to if it tried).
To conclude, the OS and runtime form a hierarchy of execution environments. The OS is the universal runtime environment for all processes, and the runtime is a domain-specific environment within that process. The OS focuses on inter-process concerns and hardware abstraction; the runtime focuses on intra-process concerns and language abstraction. Both are essential: a program runs on an OS, but if it’s written in, say, Python, it also needs the Python runtime (interpreter) to execute. The efficiency and capabilities of a program, especially in advanced features like thread concurrency or memory management, depend on how these two layers work together.
Lab Exercises:
- Exercise 3.1: Using system monitoring tools, observe the interaction between a program’s runtime and the OS. For example, run a simple program (like printing a file’s contents) and use strace (on Linux) or equivalent to trace system calls. You’ll see a series of OS calls (open, read, write, etc.) that are invoked by the runtime on behalf of your program. Try this with different languages (e.g., a Python script vs a C program) to see how the pattern of OS usage differs.
- Exercise 3.2: Explore OS-level sandboxing tools. For instance, on Linux, try using seccomp to restrict system calls in a simple program, or use containerization (Docker) to run a program in an isolated environment. Then consider what additional restrictions a language runtime (like the Java sandbox) might impose on top of OS isolation.
- Exercise 3.3: Research a historical example where the line between OS and runtime blurred (such as the UCSD Pascal p-System in the late 1970s, which was a portable OS and Pascal runtime in one, or early game consoles where games run with minimal OS). Present how that system managed resources and provided services to programs, comparing it to a contemporary OS + runtime separation.
Part II – Language Runtimes
Chapter 4 – C and Native Runtimes
Not all runtime systems are heavy virtual machines – some are lightweight and closely tied to the operating system. The C language and similar “native” languages (like C++ and Pascal) illustrate a minimalist runtime approach. In these languages, most work is done at compile time to produce machine code, and the runtime support is relatively small. Still, even C has a runtime layer, often referred to as the C runtime library (CRT) or simply the C standard library.
The C Runtime Startup: When you run a compiled C program, it doesn’t jump straight into your main function. First, a special routine (often named _start or similar) executes. This routine is part of the CRT and is linked into your program. Its job is to perform low-level initialization: for example, on many systems, _start will set up the process’s stack frame, initialize static and global variables (copying initial values into the data segment, zeroing out the BSS segment for uninitialized globals), and handle any required setup for dynamic linking if needed. Only after these tasks does it call main(). When main returns, the runtime startup code will retrieve the return value (if any) and then call the OS to terminate the process (often by invoking the exit system call). In summary, the C runtime startup ties your high-level main to the world of system calls and machine execution.
Memory and the Stack in C: C relies on the OS for memory management—there’s no garbage collector. Memory allocation is manual: using malloc (from the C runtime library) requests memory from the heap, and free returns it. Under the hood, malloc might use an OS system call like brk or mmap to get more heap pages from the OS when needed. It also manages a free-list or other data structure to reuse freed blocks within the process. For stack memory, each function call in C automatically uses a new section of the stack (growing the stack “downward” into free space). Setting up the stack frame – adjusting the stack pointer, saving registers, etc. – is handled by compiler-generated code following the OS ABI (Application Binary Interface) conventions. These conventions (e.g., how to align the stack, which registers must be preserved) are part of the platform’s runtime environment that the compiler targets, and they remain invisible to the C programmer but are crucial for correct execution.
Runtime Library Functions: The C standard library provides a wide range of functionality: I/O (e.g., printf, scanf, file streams), string manipulation (strcpy, strlen), memory functions (malloc, memcpy), math routines, and more. These are part of the C runtime in the sense that they execute while the program runs and are often needed for even basic tasks. For example, printf("Hello, %s!", name) will internally format the string and then eventually call a low-level write routine, which makes an OS call to output to stdout. The exact path might involve buffering: the C library often buffers file I/O for efficiency, meaning data is accumulated in memory and written out in larger chunks. This is all handled by the runtime library code. Another example is malloc as mentioned: it keeps track of allocated blocks and free blocks within the process’s heap, fulfilling allocation requests from the program and only occasionally calling the OS when it needs more memory. Thus, while C gives the programmer a lot of control, it still delegates these housekeeping tasks to its runtime routines.
One special part of the C runtime is the crt0 object (the “C runtime zero” or startup code). Historically, crt0.o was the object file that contained the _start routine and related initialization code, which gets linked into every C program. This is what bridges the gap between the OS loader (which loads your program into memory and sets the instruction pointer to _start) and your main function.
Minimalism and Performance: The design of C’s runtime is minimal by intention – C was designed to be close to the hardware, with the runtime primarily serving to make OS calls available and implement a few higher-level utilities. There is no virtual machine interpreting C code at runtime; rather, C code is translated directly to machine code ahead-of-time (with functions like printf being compiled into calls into the C library). This yields high performance and predictability since there’s little overhead beyond what you explicitly code. However, it also means the responsibility for correctness is largely on the programmer: buffer overflows, invalid frees, and other errors are not caught by the runtime (there’s no managed safety net as in Java or C#). The C runtime will happily let you do unsafe operations, and if things go wrong (e.g., writing past the end of an array), the OS might step in only if you violate memory protections (segmentation fault). So the C runtime trusts the programmer (and the compiler) more than managed runtimes do.
Comparison with C++ and others: Languages like C++ build on the C runtime. They add features such as global constructors and destructors (which the runtime must call before main begins and after main returns or throws an exception). For example, if you have a global std::string in C++, the runtime startup will call its constructor to initialize it prior to entering main. Similarly, C++ exception handling relies on runtime support (unwinding the stack, calling destructors of objects in scope). These are implemented partly by generated code (from the compiler) and partly by library support (like the libc++abi for the C++ runtime). But fundamentally, C and C++ share the model of compiling to native code and using a relatively thin runtime layer on top of the OS.
Code Example: To illustrate the C runtime, consider the following extremely simplified pseudocode in assembly for a program’s start-up and termination:
; Pseudocode/assembly for runtime start (highly simplified)
_start:
call __libc_init_array ; call global constructors (C++)
call main ; call the user's main function
mov rdi, rax ; move return value of main into rdi (exit code)
call exit ; call C library exit, which invokes system exit
This pseudocode shows that _start might handle global initialization (__libc_init_array is often used to run C++ global constructors), then it calls main. After main finishes, the return value (in register rax on x86_64) is passed to the C library exit function, which will flush I/O buffers, call any atexit handlers, and finally invoke the OS system call to terminate the process. In the actual implementation, exit is a C library function that does cleanup then calls the kernel (e.g., via the exit_group system call on Linux).
From the perspective of a C programmer, none of this is visible – you write int main() { return 0; } and these runtime steps happen automatically. But it’s instructive to know they exist. The C runtime, while simple, is crucial for coordinating with the OS.
Lab Exercises:
- Exercise 4.1: Create a small C program and produce an assembly listing. (For instance, compile with gcc -S program.c to get a .s file, or use an online compiler explorer.) Examine the assembly around the main function. You might not directly see _start in a .s file produced from a single C file, but you can compile and link and then disassemble the binary as in Chapter 2’s exercise to find _start. Identify calls to main and exit. This will concretely show the runtime’s role in starting and ending the program.
- Exercise 4.2: Write a C program that uses a static/global variable with a non-trivial initializer (e.g., an array of ints with some values). Run the program and use a debugger to break at main. Check the values of that global to confirm they were set before entering main. Try to find where in memory the initializer is stored (it will be in the data segment of the binary). This exercise demonstrates how the runtime (with OS help) sets up static data.
- Exercise 4.3: Explore what happens if you forget to call free for memory allocated with malloc. Write a loop that allocates memory repeatedly without freeing, and observe memory usage via OS tools. Then modify to free properly. While the C runtime won’t free it for you (no GC), the OS should reclaim all process memory when the program exits. This experiment highlights the difference between what the C runtime does (or doesn’t do) and what the OS does at process teardown.
Chapter 5 – Managed Runtimes: JVM and CLR
In the mid-1990s and early 2000s, a new paradigm for runtimes took center stage – managed runtimes. These are runtime environments that manage many aspects of execution automatically: memory allocation and garbage collection, safety checks, and often JIT compilation for performance. The two flagship examples are Java’s Java Virtual Machine (JVM) and Microsoft’s Common Language Runtime (CLR) for .NET. Both introduced the idea of “write once, run anywhere” through virtual machines, and both handle memory management and type safety in a way that native runtimes like C’s do not.
Java and the JVM
Java, released by Sun Microsystems in 1995, was designed around the principle of WORA: Write Once, Run Anywhere. To achieve this, Java programs are not compiled to native machine code directly; instead, they are compiled to an intermediate bytecode format (the .class files containing Java bytecode instructions). These bytecodes are executed by the Java Virtual Machine (JVM), which is a runtime that can run on any platform that has a JVM implementation. In other words, the JVM acts as a uniform processor architecture for Java programs, abstracting away the differences between x86, ARM, Windows, Linux, etc. The Java Runtime Environment (JRE) includes the JVM plus standard libraries, forming the complete runtime platform for Java.
JVM Architecture: The JVM is an example of a stack-based virtual machine. It has several key components:
- A Class Loader Subsystem that loads .class files (bytecode) into the runtime. When a Java program starts (or when it dynamically loads a class), the class loader finds the bytecode (from the filesystem or network), verifies it for correctness (ensuring it’s not malformed or violating Java’s safety rules), and then makes it available for execution in the JVM. Verification is crucial – it ensures that the bytecode adheres to Java’s type safety (no illegal typecasts, stack underflows, etc.), which is a cornerstone of Java security.
- The Runtime Data Areas, which include the heap (for objects), the method area (for class definitions and static fields), and stacks for each thread. When you create a new object in Java, memory is allocated on the JVM’s heap; the JVM’s garbage collector later reclaims objects that are no longer referenced. Each Java thread has its own stack for method calls, much like a native stack, but the JVM manages these stacks in accordance with Java’s semantics (for instance, if a StackOverflowError occurs, it’s the JVM detecting that the stack grew beyond a certain limit).
- The Execution Engine, which is the component that actually executes the bytecode instructions. In early JVMs, this was an interpreter that read each bytecode and executed the corresponding operation. Modern JVMs use Just-In-Time (JIT) compilers to translate hot portions of bytecode into native machine code on the fly. This means if a Java method is called frequently, the JVM will compile it to native code for speed. The execution engine also includes the garbage collector (GC) which runs periodically to free memory. The JVM typically has multiple GC algorithms (e.g., generational collectors, mark-and-sweep, etc.) and can choose or tune them based on the application’s needs.
- Native Interface: The JVM provides a way to call native code (JNI – Java Native Interface) when needed, and it interacts with the OS through a thin layer when it must (for threads or I/O). But from the Java programmer’s perspective, those details are hidden; you work with Java objects and methods, and the JVM runtime does the rest.
Managed Execution Benefits: The JVM enforces memory safety and type safety. An object in Java can’t be treated as, say, an integer arbitrarily – the JVM would throw an error if bytecode tried to do that. Array bounds are checked at runtime (throwing an ArrayIndexOutOfBoundsException if violated), preventing buffer overflows. These checks, plus the garbage collector preventing use-after-free errors, mean Java programs avoid many common bugs by design. The downside is a performance cost for such checks and GC pauses, but the JIT compilation and other optimizations often mitigate that.
Another aspect is security: the original Java sandbox model for applets relied on the JVM’s class loader and SecurityManager to restrict what code could do. For example, an untrusted applet loaded from the internet could be prevented from accessing the local file system or making arbitrary network connections. The class loader would place it in a separate namespace and the SecurityManager would vet sensitive operations (like opening a file) before allowing them, based on policy. This was all implemented within the runtime.
Example – Bytecode Execution: Consider a simple Java snippet:
int a = 5;
int b = 3;
int c = a + b;
When compiled to bytecode (via javac), it might produce instructions like: load constant 5 into a local variable slot, load 3, then an iadd instruction to add them, storing the result. The JVM’s interpreter or JIT would execute that iadd by actually adding the two integers. Unlike C where the addition is a single machine instruction in the compiled binary, in Java it’s a bytecode that the runtime must execute. If this operation happens millions of times in a loop, the JVM’s JIT might compile the whole loop to native code for efficiency.
JVM in Action (code example): A classic demonstration is to compile a Java class and inspect the bytecode using the javap tool. For instance:
public class Example {
public static void main(String[] args) {
int x = 2;
int y = 3;
int z = x * y;
System.out.println(z);
}
}
After compiling, running javap -c Example might show something like (simplified):
0: iconst_2 // push int 2
1: istore_1 // store into local var 1 (x)
2: iconst_3 // push int 3
3: istore_2 // store into local var 2 (y)
4: iload_1 // load x
5: iload_2 // load y
6: imul // multiply
7: istore_3 // store into local var 3 (z)
8: getstatic ... // get System.out
11: iload_3 // load z
12: invokevirtual ... println(I)V // call println
15: return
This sequence is what the JVM executes. The JVM ensures, for example, that iload_1 actually refers to an int (since it knows x is an int from the class metadata), and if we tried to use an object where an int is expected, the verifier or runtime would throw an error.
Garbage Collection in JVM: The Java runtime performs garbage collection on the heap. This means that Java programmers do not explicitly free objects; the runtime detects when objects are no longer reachable (e.g., all references to them have gone out of scope) and then reclaims that memory[4]. Chapter 7 covers GC in detail, but in brief, the JVM typically uses a generational GC: new objects are allocated in a young generation space which is collected frequently, and long-lived objects are moved to an old generation space collected less frequently. The GC runs on one or more background threads (depending on the algorithm) and pauses the application threads briefly (in stop-the-world pauses) to do its job. This is a prime example of a runtime-managed service that greatly simplifies programming (no manual free calls) at the cost of adding complexity inside the runtime system.
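The young/old generation split can be sketched as a toy model. This is illustrative only (real JVM collectors trace references, copy survivors between spaces, and tune promotion thresholds; here the `Heap` class and the "roots" set are simplified stand-ins): objects that survive a minor collection of the young generation get promoted to the old generation.

```python
# Toy model of a generational collector (illustrative, not a real GC):
# objects are allocated young; survivors of a minor collection are
# promoted to the old generation, and the rest are reclaimed.
class Heap:
    def __init__(self):
        self.young = []
        self.old = []

    def allocate(self, obj):
        self.young.append(obj)  # all new objects start young
        return obj

    def minor_gc(self, roots):
        # Anything still referenced from the roots survives and is promoted.
        survivors = [o for o in self.young if o in roots]
        self.old.extend(survivors)
        self.young = []         # everything else was garbage
        return survivors

heap = Heap()
heap.allocate("long-lived")
heap.allocate("temporary")        # becomes garbage immediately
heap.minor_gc(roots={"long-lived"})
print(heap.old)    # ['long-lived'] — promoted
print(heap.young)  # [] — temporaries reclaimed
```

The payoff of this design is that most objects die young, so frequent, cheap collections of a small young space reclaim most garbage without touching the large old generation.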
The CLR (Common Language Runtime) and .NET
Microsoft’s Common Language Runtime, introduced around 2002 with the .NET Framework, was heavily inspired by the JVM design but with some key differences aimed at multi-language support. The CLR is the engine that runs programs written in C#, VB.NET, F#, and dozens of other languages that target .NET. Like the JVM, the CLR uses a bytecode (called CIL – Common Intermediate Language, or MSIL) which all .NET languages compile down to. This bytecode is CPU-agnostic and is JIT-compiled by the CLR into native code when the program runs.
Key features of CLR:
- Multi-language support: The CLR was designed so that many languages can interoperate. It defines a Common Type System (CTS) which ensures that, say, an int in C# is the same 32-bit value as an Integer in VB.NET, and that objects from different languages can interact. The CLR’s Common Language Specification (CLS) sets rules for language interoperability. This means the runtime had to be language-neutral, imposing certain rules (for example, all exceptions are objects that inherit from a base Exception class, all classes ultimately derive from System.Object, etc.).
- Managed execution and memory: The CLR, like the JVM, manages memory with a garbage collector (in fact, early on the .NET GC was generational and quite advanced). It also enforces type safety and security restrictions. For instance, before executing, the CLR can perform a verification of the CIL code to ensure it doesn’t do anything unsafe (there is a concept of verifiable IL). The CLR’s security model (in older .NET called Code Access Security) could restrict what permissions code has, somewhat analogous to Java’s sandbox (though in practice, .NET applications often run fully trusted on desktop).
- JIT and performance: The CLR uses JIT compilation – when a method is first called, the CLR’s JIT compiler translates the CIL into machine code for the current architecture. The CLR can also do profile-guided optimizations, and newer versions include tiered compilation (first a quick, less optimized JIT, then a more optimized one for frequently used code). There are also Ahead-of-Time (AOT) compile options (e.g., the .NET Native or ReadyToRun images), but traditionally JIT is the norm. The CLR JIT, for example, can optimize knowing the exact machine it runs on (using available CPU instructions).
- Services provided: The CLR provides many runtime services: a just-in-time compiler, garbage collection (automatic memory management), exception handling across languages (you can throw an exception in VB.NET and catch in C#, etc.), and reflection (ability to inspect metadata and types at runtime). It also handles things like PInvoke (Platform Invocation) which allows managed code to call unmanaged libraries, marshaling data back and forth.
One can think of the CLR as very similar to the JVM in architecture: bytecode in, JIT to native, managed memory, type safety. One difference is that .NET historically was more tied to the Windows OS (e.g., using the OS threading model, integrating with COM, etc.), whereas Java was built to be cross-platform from the start. But with .NET Core and .NET 5/6+, the CLR is fully cross-platform too.
Example: If you compile a simple C# program:
int x = 2, y = 3;
Console.WriteLine(x * y);
It compiles to CIL that, conceptually, looks like: load 2, load 3, multiply, call Console.WriteLine with the result. The CLR upon execution will JIT compile the multiplication and the call to the actual console output function (which likely ends up calling into a native method to perform the write). The runtime ensures that the Console class is properly initialized, and that the call is safe (if, for instance, you tried to call a method on a null object, the CLR would throw a NullReferenceException; the generated code includes that check).
Memory Management in CLR: The garbage collection in .NET’s CLR is similar in spirit to Java’s. Objects are allocated on a managed heap; the collector runs on its own thread(s) and can compact the heap, move objects, etc., updating references. Programmers can get some control (like forcing a GC collect, or using weak references, or the IDisposable pattern to deterministically clean up non-memory resources), but they do not free memory manually. The runtime’s GC uses a generational approach: Generation 0 for new objects, Gen 1 for surviving short-term, Gen 2 for long-term survivors. .NET also has features like the Large Object Heap for big objects, and various GC modes (workstation vs. server GC for different trade-offs).
Type Safety and Verification: The CLR enforces that all code obeys the .NET type system. For example, you cannot corrupt memory from within C# code – there is no pointer arithmetic (unless you go into an unsafe context explicitly, which the CLR will then restrict unless you have full trust). The CLR’s verification process ensures that CIL code doesn’t do things like branch into the middle of an object or violate stack discipline. If you were to emit invalid CIL, the CLR would refuse to run it (or throw a VerificationException unless it’s running in an unsafe mode).
Interoperability: One of the powerful features of the CLR is that you can mix multiple languages in one program. The runtime unifies them. For instance, you could write a class in VB.NET and subclass it in C#, and it works seamlessly. This is possible because at runtime everything is just CLR types and the method calls are resolved by the CLR’s JIT and runtime type information. The JVM also has multiple languages (Kotlin, Scala, Groovy all run on the JVM), but the CLR from the start was designed for this multi-language support with a unified type system (CTS).
Managed vs Native Performance: Initially, managed runtimes like JVM and CLR were slower for certain tasks than optimized C/C++ because of the overhead of interpretation/JIT and GC. Over time, however, they narrowed the gap significantly. The JIT can make use of runtime information to optimize (for example, inlining methods across classes once it sees there’s no override at runtime, or optimizing based on the actual CPU model). Managed runtimes also benefit from eliminating certain categories of bugs, improving reliability and development speed at the slight cost of raw performance. In scenarios where performance is critical, both Java and .NET allow calling out to native code (JNI in Java, PInvoke or COM interop in .NET), but each such call transitions out of the managed environment and back, which is something the runtime handles (marshaling data, etc.).
Lab Exercises:
- Exercise 5.1: Write a simple Java program and use the javap tool to inspect its bytecode (as shown above). Try to correlate the bytecode with your source code. Then run the program with the -verbose:gc flag (which prints garbage collection events) or use a profiler to see when GC happens. This gives insight into how the Java runtime is working behind the scenes.
- Exercise 5.2: Write equivalent simple programs in Java and C (for example, sum of first N integers) and compare their performance. Then experiment with Java Virtual Machine flags, such as enabling the server JIT (-server) or printing JIT compilation info (-XX:+PrintCompilation), to observe how the JVM optimizes the code. You might notice the Java program speeds up after a few iterations once the JIT kicks in.
- Exercise 5.3: For .NET, write a simple C# program and use the tool ILDasm (Intermediate Language Disassembler) to examine the generated CIL. You can also use dotnet --info and dotnet --list-runtimes to see what runtime version you’re using. Run the program with the environment variable COMPlus_ReadyToRun=0 to force it to JIT everything (if using .NET Core), and use a tool like PerfView or dotnet-trace to observe JIT events. This helps understand the CLR’s behavior.
- Exercise 5.4: Explore cross-language interoperability on a managed runtime. For example, create a library in C# with a class and method, then reference it in an F# or VB.NET program and call it. Everything should work seamlessly. Do a similar experiment on the JVM: write a simple class in Scala or Kotlin, and use it from a Java program. This demonstrates how the runtime (JVM or CLR) enables multiple languages to co-exist by adhering to the runtime’s common format and rules.
Chapter 6 – Scripting Runtimes (Python, Ruby, JavaScript)
Scripting languages like Python, Ruby, and JavaScript present yet another style of runtime system. These languages are typically interpreted or bytecode-compiled and executed by a runtime engine written in a lower-level language (often C). They emphasize flexibility and dynamic behavior, which means their runtimes must handle things like dynamic typing, late binding of variables, and often provide rich introspective features.
Python (CPython)
Python’s primary runtime is the CPython interpreter, implemented in C. When you run a Python program (with the python command), you are invoking this runtime. CPython compiles Python source code into bytecode (.pyc files), which is a lower-level, platform-independent representation of your code. This bytecode is then executed by the Python virtual machine, which is essentially a big loop in C that reads each bytecode instruction and performs the corresponding operation in terms of C code and Python C API calls.
Python’s Execution Model: Python is dynamically typed, meaning you don’t declare variable types; the runtime figures out what operations are valid at runtime and types can change (a variable can refer to an int, then later to a string). The CPython interpreter uses a stack-based VM similar in concept to Java’s or .NET’s, but at a higher level. It has a main loop often called the “bytecode evaluation loop” which repeatedly fetches the next instruction and executes it. For example, if the bytecode says BINARY_ADD, the interpreter will pop the top two Python objects off its internal evaluation stack, add them (by calling the appropriate addition logic depending on types), and push the result back. This loop is implemented in C and heavily optimized. In fact, CPython uses some advanced techniques like threaded code to jump directly to instruction handlers to speed up the dispatch.
A simplified view of the CPython main loop (in pseudo-code) is:
loop:
instruction = *next_bytecode++;
switch(instruction):
case LOAD_CONST: push(constant) on stack; continue;
case LOAD_NAME: push(value of variable from environment); continue;
case BINARY_ADD:
right = pop(); left = pop();
result = PyNumber_Add(left, right); // use Python C API to add
push(result);
continue;
case PRINT_ITEM: ... etc.
case RETURN_VALUE:
return pop(); // end of function
This design means Python executes line by line, resolving names and types as it goes. It makes Python extremely flexible (you can modify things at runtime, functions are first-class objects, etc.), but it has overhead for the dynamic checks each step of the way.
Global Interpreter Lock (GIL): One notable aspect of CPython’s runtime is the GIL – Global Interpreter Lock. This is a mutex that ensures only one thread executes Python bytecode at a time in a single process. The GIL simplifies memory management (since the Python memory allocator isn’t thread-safe, and many Python objects aren’t either, the GIL prevents race conditions by essentially serializing execution at the bytecode level). The downside is that multi-threaded Python programs do not get parallel execution on multiple CPU cores for CPU-bound tasks. The GIL does release for I/O operations and allows C extensions to run in parallel if they manage it, but for pure Python code, only one thread runs at a time. This is a major difference from runtimes like JVM or CLR where each thread can execute on separate cores concurrently. Python’s runtime thus chooses simplicity and ease of integration with C libraries over raw parallel performance. (There are implementations like Jython or IronPython without a GIL, and the new experimental no-GIL CPython branch, but standard CPython as of 2023 has the GIL.)
Memory Management in Python: Python uses a form of garbage collection, primarily reference counting. Every Python object keeps a count of references; when the count drops to zero, the object’s memory is immediately reclaimed. This is handled by the runtime automatically. For example, if you do x = [] (create a list) then x = None, the list’s reference count goes to zero, so CPython will free it right away. However, reference counting alone can’t handle cyclic references (where two objects reference each other). So CPython also has a cycle detector that occasionally looks for groups of objects that reference each other but are not referenced from anywhere else, and cleans those up[5]. This GC runs periodically (and can be controlled via gc module). It’s not as heavy-duty as JVM’s generational GC, but it works for typical use cases. Memory not freed by reference counting (due to cycles) gets handled eventually by this cycle GC.
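Both mechanisms described above can be observed directly from Python. A small sketch (the `Node` class is just an example; the exact numbers printed can vary slightly between CPython versions):

```python
import gc
import sys

# Reference counting: sys.getrefcount reports one extra reference,
# because the object is also referenced by the function's argument.
x = []
print(sys.getrefcount(x))  # typically 2: the name 'x' plus the argument

# A reference cycle: these refcounts can never reach zero on their own...
class Node:
    pass

a, b = Node(), Node()
a.partner, b.partner = b, a
del a, b                   # both objects are now unreachable, but cyclic

# ...so CPython's cycle detector has to reclaim them.
unreachable = gc.collect() # returns the number of unreachable objects found
print(unreachable >= 2)    # True — the two Nodes (plus bookkeeping) collected
```

Deleting a non-cyclic object (the first list) needs no collector run at all; the cycle is the case that forces the runtime to do extra work.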
Dynamic Features: Python’s runtime allows very dynamic behaviors: you can add attributes to objects at runtime, execute code from strings (eval or exec), modify classes dynamically, etc. The runtime maintains structures like dictionaries for each object’s attributes, and performs lookups on demand. For example, when you access obj.field, the Python runtime internally looks up field in obj’s dictionary (or its class’s dictionary). This is why attribute access in Python is slower than in Java/C#: it happens via hash table lookups at runtime instead of fixed offsets computed at compile time. Some Python VMs (like PyPy or other JITs) optimize this with techniques like inline caches (caching the attribute offset once found), similar to how JavaScript engines optimize property access.
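This dictionary-based model is easy to see from Python itself. A short sketch (the `Point` class is a made-up example):

```python
# Attribute access in CPython is, conceptually, a dictionary lookup
# performed at runtime — which is also what makes monkey-patching possible.
class Point:
    def __init__(self):
        self.x = 1

p = Point()
print(p.__dict__)            # {'x': 1} — instance attributes live in a dict
p.y = 2                      # adding an attribute just inserts a dict entry
print(p.__dict__["y"])       # 2

# Methods are found through the class's dictionary the same way:
print("__init__" in Point.__dict__)  # True
```

Every `p.y` access pays for a hash lookup in these dictionaries, which is exactly the cost that inline caches in JITted runtimes try to avoid.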
Alternate Python Runtimes: Besides CPython, there are others: PyPy is a JIT-compiled Python runtime (written in a subset of Python) which can significantly speed up execution by compiling hot code to machine code. There’s also Jython (which runs on the JVM) and IronPython (on CLR), which leverage those managed runtimes. These show how a high-level language can be hosted on different runtime infrastructures. But CPython remains the reference implementation.
Ruby (MRI and others)
Ruby is another dynamic, interpreted language. The main Ruby interpreter is often called MRI (Matz’s Ruby Interpreter) or CRuby, also written in C. MRI works similarly to CPython: it compiles Ruby code to an internal bytecode and interprets it. Ruby is also dynamically typed and has a runtime model that is very flexible (methods can be added or removed at runtime, etc.).
Ruby 1.8 and earlier actually didn’t have bytecode – it directly interpreted the AST (Abstract Syntax Tree). Ruby 1.9 introduced YARV (Yet Another Ruby VM), which uses bytecode. YARV is stack-based and operates in a similar fashion to CPython’s VM. Ruby’s method calls are resolved at runtime (it has dynamic dispatch, meaning it looks up the method in the object’s class at call time). Ruby also has garbage collection, originally a simple mark-and-sweep collector, now improved with generations.
One interesting aspect of Ruby’s runtime is how it implements features like blocks/procs (closures) and metaprogramming (methods like method_missing that catch undefined method calls). The runtime must be ready to route method lookups dynamically. If you call obj.foo, the runtime checks obj’s class for a method foo. If not found, it calls method_missing on the object, which by default raises an error, but can be overridden. This is all done at runtime, meaning the Ruby VM spends a lot of time doing dynamic dispatch.
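Python has a rough analog of Ruby's method_missing that makes the same lookup-then-fallback routing visible: `__getattr__`, which the runtime calls only after normal attribute lookup fails. A sketch (the `Fallback` class and its methods are invented for illustration):

```python
# A Python analog of Ruby's method_missing: __getattr__ is invoked by the
# runtime only when normal lookup (instance dict, then class dicts) misses.
class Fallback:
    def real(self):
        return "found normally"

    def __getattr__(self, name):
        # Reached only for attributes that don't exist anywhere.
        return lambda: f"caught missing method: {name}"

obj = Fallback()
print(obj.real())   # found normally — ordinary dynamic dispatch
print(obj.ghost())  # caught missing method: ghost — routed via the fallback
```

As in Ruby, the default behavior on a miss is to raise an error (AttributeError here, NoMethodError there); overriding the hook is what frameworks use for proxy objects and metaprogramming.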
Ruby also has a Global VM Lock (GVL) similar in spirit to Python’s GIL – in MRI, only one thread executes at a time (with some exceptions for I/O). This makes threading simpler but limits parallelism. There are other Ruby runtimes: JRuby runs on the JVM (so it can use real parallel threads via JVM threads, and uses the JVM GC, etc.), and IronRuby on CLR (not as maintained). There’s also TruffleRuby (which uses the Graal VM for JIT), etc. These alternate runtimes often significantly improve performance or remove the global lock at the cost of some compatibility nuances.
JavaScript (Browser JS Engines and Node.js)
JavaScript has a unique place – originally it was solely in browsers, but with Node.js (and others) it’s now used on servers and even in desktop apps (Electron). The reference JavaScript runtime historically was a simple interpreter (like in early Netscape and IE). However, modern JavaScript engines (such as Google’s V8, Mozilla’s SpiderMonkey, Apple’s JavaScriptCore) are highly optimized JIT compilers. The need for speed (to run complex web applications) drove JavaScript runtimes to be some of the most advanced in terms of optimization techniques.
JavaScript Engine (V8 example): V8, which powers Chrome and Node.js, does multiple-tier JIT compilation. It might start by interpreting or doing a baseline compile of code, gathering type feedback (since JS is dynamic, variables can hold values of any type). Then it uses that feedback to do an optimized compile (via an optimizing compiler formerly called Crankshaft and now TurboFan) that generates fast machine code assuming certain types (e.g., that a particular variable is likely always an integer). If those assumptions are violated (say the variable becomes a string later), the runtime can deoptimize and fall back to a generic (slower) routine. V8 also uses techniques like hidden classes (to give JS objects a shape behind the scenes so that property accesses can be optimized similarly to a fixed offset as in C++ objects) and inline caching (remembering the outcome of property lookups so next time it’s faster). These techniques allow JavaScript to run orders of magnitude faster than naive interpretation.
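The inline-caching idea can be sketched in a few lines of Python. This is a deliberately simplified, monomorphic cache (the `InlineCache` class is invented here, and real engines cache per call site inside generated machine code, keyed on hidden-class shapes rather than Python classes): remember the "shape" last seen, and skip the slow lookup while it stays the same.

```python
# Toy monomorphic inline cache: remember the class ("shape") seen at a
# lookup site and reuse the resolved attribute until the shape changes.
class InlineCache:
    def __init__(self, attr):
        self.attr = attr
        self.cached_class = None
        self.cached_value = None

    def load(self, obj):
        cls = type(obj)
        if cls is not self.cached_class:   # cache miss: do the slow lookup
            self.cached_class = cls
            self.cached_value = getattr(cls, self.attr)
        return self.cached_value           # cache hit: no dict lookup

class A:
    def greet(self):
        return "A"

class B:
    def greet(self):
        return "B"

ic = InlineCache("greet")
a = A()
f = ic.load(a)          # miss: resolves A.greet and caches it
print(f(a))             # A
print(ic.load(a) is f)  # True — hit, the cached method is reused
b = B()
print(ic.load(b)(b))    # B — shape changed, so the cache re-resolved
```

When the shape at a site keeps changing (a "megamorphic" site), the cache stops paying off, which is one reason engines deoptimize such code paths.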
JavaScript’s runtime environment (in a browser) also includes a sandbox model (like Java applets did) – JS running on a webpage cannot access the user’s filesystem arbitrarily or make arbitrary network calls except through the browser’s APIs, providing security. In Node.js, the runtime includes the Node APIs for file system, networking, etc., which are built on top of the V8 engine but not sandboxed (since on a server or local script you are allowed to access files).
Event Loop: One defining characteristic of JavaScript runtimes (especially in the browser and Node) is the event loop concurrency model. Instead of threads, JavaScript uses an event loop with asynchronous I/O. The runtime will execute one piece of JavaScript at a time (per JS engine instance) and rely on callbacks and events to handle concurrency. For example, in Node.js, when you perform a non-blocking I/O operation, the Node runtime (written in C++ on top of libuv) will initiate it, and then schedule a callback to be executed later on the main JS thread when the I/O completes. The Node.js runtime thus coordinates between V8 (for executing JS code) and libuv (for the event loop and system calls). This means the JavaScript runtime can handle many concurrent connections with a single thread by not blocking on I/O – a design well-suited for I/O-heavy workloads, though less so for CPU-heavy tasks because it can’t parallelize them without worker threads (a newer addition) or separate processes. The event loop concept will be further discussed in concurrency (Chapter 8).
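The same single-threaded, non-blocking pattern can be sketched with Python's asyncio event loop (an analogy, not Node itself; the handler names and delays are invented): one thread runs everything, and waits yield control instead of blocking, so completions interleave by readiness rather than by start order.

```python
# Event-loop sketch using asyncio: one thread, non-blocking waits,
# coroutines resumed by the loop when their "I/O" (a timer here) completes.
import asyncio

results = []

async def handle(name, delay):
    await asyncio.sleep(delay)   # yields to the loop instead of blocking
    results.append(name)

async def main():
    # Both "requests" run concurrently on a single thread.
    await asyncio.gather(handle("slow", 0.02), handle("fast", 0.01))

asyncio.run(main())
print(results)  # ['fast', 'slow'] — ordered by completion, not by start
```

As in Node, a long CPU-bound computation inside a handler would stall every other pending callback, which is why both runtimes push such work to worker threads or separate processes.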
Memory in JS: JavaScript is garbage-collected, and its semantics eliminate manual memory management (no pointers or malloc in JS). The GC is typically a generational mark-and-sweep or mark-and-compact collector within the engine. V8, for instance, has both a young generation (scavenging collector) and old generation (mark-compact) and runs either concurrently or stop-the-world depending on configuration.
Dynamic Nature: Like Python and Ruby, JavaScript is extremely dynamic. You can add properties to objects on the fly, even modify object prototypes (which act as classes). The runtime must handle this. Chrome’s V8 introduced hidden classes behind the scenes to optimize what was traditionally slow (property access) by creating an internal class-like structure when it observes consistent property usage. This is invisible to the developer but greatly improves runtime efficiency. This is a great example of a runtime adapting to dynamic language behavior with clever techniques.
Scripting vs. System Runtimes: The “scripting” runtimes generally prioritize flexibility and quick development turnaround (no compile step in the traditional sense – though there might be bytecode, it’s generated at runtime). They often come with REPLs (read-eval-print loops) that allow interactive execution (where the runtime reads your input, executes it, and prints result). This requires the runtime to be able to compile and execute code on the fly (e.g., the Python eval() function or the JavaScript eval() both take a string of code and execute it in the current runtime context).
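The core of such a REPL is small enough to sketch directly (a minimal model, assuming trusted input; a real REPL also handles expressions vs. statements, multi-line input, and error reporting):

```python
# A tiny "REPL core": the runtime compiles and executes source text on the
# fly against a persistent environment, just as the interactive prompt does.
env = {}
for line in ["x = 21", "y = x * 2"]:
    exec(compile(line, "<repl>", "exec"), env)  # read + eval, state persists

print(eval("y", env))  # 42 — 'print' plays the role of the P in REPL
```

The essential runtime capability here is that compilation is an ordinary operation available while the program runs, not a separate build step.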
Cross-language Comparisons
It’s interesting to compare these scripting runtimes to managed runtimes like JVM/CLR and to native ones:
- Performance: Out of the box, something like CPython or Ruby MRI is much slower than Java or C#. This is due to the overhead of interpretation and dynamic type handling. However, projects like PyPy (JIT for Python) or V8 (for JS) have narrowed the gap significantly by employing techniques similar to the JVM’s JIT. Still, Java/C# have static type info and ahead-of-time optimizations that give them an edge in many cases. It is often said that dynamic languages trade some raw speed for programmer productivity and ease of use.
- Memory management: All these runtimes free the programmer from manual memory management. They use GCs, though the details differ (refcount vs tracing, generational vs not, etc.). JavaScript and Java’s GCs are pretty advanced (with decades of tuning). Python’s refcounting means that objects are usually freed immediately when out of scope, which sometimes is more predictable (no long pauses) but in multi-threaded scenarios, the GIL is needed partly because refcount updates on objects are not atomic without a global lock.
- Concurrency model: As we’ll explore in Chapter 8, Python and Ruby use OS threads but lock the interpreter (so concurrency is limited, but you can do I/O in parallel or use multiple processes for parallel CPU work). Java and C# use OS threads freely for true parallelism (and the runtime deals with concurrent GC etc.). Node.js (JS) uses an event loop (single-threaded) with an option to offload work to an internal thread pool for certain tasks. This means each runtime has its own approach: multi-threaded with locks (Java, C#), single-threaded event loop (JS in Node, and conceptually in browsers), and effectively single-threaded due to GIL (CPython, MRI Ruby).
- Interactivity and Introspection: Scripting runtimes usually offer more introspection. For example, Python’s runtime lets you inspect the call stack, modify functions at runtime, etc. Java’s runtime has reflection but with more restrictions (you can’t easily add a method to an existing class at runtime, whereas in Python you can monkey-patch a method onto a class anytime). This dynamism is why frameworks in Python/Ruby can do metaprogramming magic (like ORMs auto-generating methods) so easily. The runtime must support it by allowing modifications of class dictionaries, method lookups deferring to runtime, etc. In Java, you could achieve some of that with bytecode generation or proxies, but it’s more static by design.
In conclusion, scripting language runtimes prioritize flexibility and ease of use, which puts more burden on the runtime engine to make things work efficiently. The evolution of these runtimes (e.g., the adoption of JITs in JS engines and alternative Python implementations) shows a continual effort to improve performance without sacrificing the dynamic features that make the languages attractive.
Lab Exercises:
- Exercise 6.1: Use Python’s built-in disassembler module dis to inspect Python bytecode. For example:
import dis
def sample(a, b):
    return a + b * 2
print(dis.dis(sample))
This will show the Python bytecode instructions for the function (like LOAD_FAST, BINARY_MULTIPLY, etc.). Try modifying the function (adding a conditional or loop) and see how the bytecode changes. This helps you understand what the Python runtime is executing.
- Exercise 6.2: Write a simple program in Ruby and use RubyVM::InstructionSequence (for Ruby 2.5+ YARV) to disassemble it. For example:
code = RubyVM::InstructionSequence.compile("x=2; y=3; z = x*y; p z")
puts code.disasm
Compare it to Python’s disassembly for a similar operation. You’ll see analogous operations. This gives insight into YARV’s bytecode.
- Exercise 6.3: For JavaScript, open the developer console in a Chrome browser and enable verbose logging for V8 (there are flags like --trace-opt --trace-deopt when running Node or Chrome with custom flags). Alternatively, use Node.js and run it with flags:
node --allow-natives-syntax --trace-opt --trace-deopt script.js
Write a script that has a function doing some operation in a loop. The trace output will show when V8 optimizes and deoptimizes that function. You can even use V8 internal calls (with --allow-natives-syntax) like %OptimizeFunctionOnNextCall(function) to force optimization. This is advanced, but it lets you witness the JIT in action in the JS runtime.
- Exercise 6.4: Try the interactive mode or REPL of these languages (Python’s REPL, irb for Ruby, and Node’s REPL or browser console). Define functions or classes on the fly, redefine them, and observe that the runtime immediately adapts. For example, in the Python REPL:
class A:
    def f(self):
        return "original"
a = A(); print(a.f())
def new_f(self):
    return "new"
A.f = new_f
print(a.f())
You’ll see that you changed the method of class A at runtime and the instance a now uses the new method. Consider how the runtime must handle such a change (it essentially looks up f each time in A.__dict__, and now finds new_f). This exercise underlines the dynamic nature of scripting runtimes.
Part III – Advanced Runtime Mechanisms
Chapter 7 – Memory Management & Garbage Collection
Managing memory is one of the most crucial roles of a runtime system. Different languages and runtimes take different approaches, from manual management (as in C/C++) to fully automated garbage collection (as in Java, C#, Python, etc.). In this chapter, we explore how runtime systems handle memory allocation and reclamation, and the algorithms that have evolved to make garbage collection efficient.
Manual vs. Automatic Memory Management
In lower-level languages like C and C++, the runtime provides simple routines to allocate and free memory (e.g., malloc/free or new/delete). The responsibility is on the programmer to call free for every malloc and to ensure no use-after-free or double-free errors occur. This manual memory management is error-prone and can lead to serious issues (memory leaks if you forget to free, and crashes or security vulnerabilities if you free incorrectly). There is essentially no automatic help from the runtime beyond perhaps some debug modes or smart-pointer libraries built on top.
By contrast, many modern runtimes implement automatic memory management, colloquially known as garbage collection (GC). A garbage-collected runtime periodically identifies objects that are no longer needed by the program and frees them on behalf of the programmer. This relieves the programmer from tracking memory usage explicitly and eliminates certain classes of errors (dangling pointers, double frees, etc.)[4]. The trade-off is that the runtime must do extra work (which can pause the program or use CPU time) to manage memory, and it generally uses more memory to allow for efficient GC.
Definition: Garbage collection is a form of automatic memory management where the runtime attempts to reclaim memory occupied by objects that are no longer in use by the program. Such memory is called “garbage” because it can’t be reached by any active part of the program. The concept was introduced by John McCarthy around 1959 in the context of Lisp, making it a very old idea that has been refined for decades.
How Garbage Collection Works
At a high level, a garbage collector needs to answer the question: “which objects are still in use?” Most GC algorithms do this by starting from a set of “root” references (things like local variables on the stack, global/static variables, CPU registers, etc.) and seeing what objects they reference, then recursively what those objects reference, and so on. Any object not reached by this graph traversal is considered unused and can be freed.
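This root-and-trace idea fits in a few lines of Python. The sketch below is purely illustrative – the dictionary-of-references heap and the names are invented for the exercise, not any real collector’s data structures:

```python
# Toy object graph: each object id maps to the ids it references.
heap = {
    "A": ["B"],
    "B": ["C"],
    "C": [],
    "D": ["E"],   # D and E reference each other (a cycle)...
    "E": ["D"],   # ...but nothing reachable from the roots points at them.
}
roots = ["A"]

def mark(roots, heap):
    """Return the set of objects reachable from the roots."""
    marked = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj not in marked:
            marked.add(obj)
            stack.extend(heap[obj])   # visit everything this object references
    return marked

live = mark(roots, heap)
garbage = set(heap) - live            # anything unmarked is collectible
print(sorted(live))     # ['A', 'B', 'C']
print(sorted(garbage))  # ['D', 'E']  -- the cycle is collected, too
```

Note that the D/E cycle is correctly identified as garbage: tracing collectors ask "what is reachable?", not "what is referenced?", which is exactly why they handle cycles that defeat reference counting.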
There are several classic garbage collection algorithms:
- Reference Counting: Each object keeps a count of how many references point to it. When references are destroyed or re-assigned, the count is decremented; if it drops to zero, the object can be immediately freed. Python’s CPython primarily uses reference counting. The advantage is simplicity and immediacy (objects are reclaimed promptly when unreferenced). The big disadvantage is it cannot handle cyclic references (A -> B -> A type cycles will never drop to zero)[5]. Also, reference updates have a performance cost (incrementing/decrementing counters, which typically requires thread safety measures like the GIL). Reference counting collectors usually need a backup cycle detector (as Python has) to collect cycles occasionally.
- Tracing Garbage Collection: Instead of counting references, the runtime traces the object graph. There are various strategies:
- Mark-and-Sweep: The collector stops the program (or runs concurrently in some implementations), then marks all objects that are reachable (by starting at roots and recursively visiting references). Then it sweeps through memory (or through some object list) and frees all objects that were not marked. This was one of the first GC algorithms (McCarthy’s 1959 Lisp collector was essentially mark-and-sweep). It ensures all garbage is collected, including cycles. However, naive mark-and-sweep doesn’t compact memory, so you can get fragmentation (holes in the heap). Also, if done stop-the-world, it can introduce pauses proportional to the number of objects.
- Mark-and-Compact: This is an improvement where after marking live objects, the collector also compacts them by moving them to eliminate gaps. This requires updating all references to the moved objects. The benefit is it defragments memory, making allocation simpler and keeping caches happier. The downside is the cost of copying and updating references, and it typically needs the program to be paused during compaction (or very fancy techniques to do it concurrently).
- Copying Collection: A different approach is to divide the heap into two halves (semispaces). At any time, one half is active for allocations. When GC happens, you copy all live objects from the active half to the other half (compacting them in the process), and then you can reclaim the entire old half in one swoop. This is simple, it compacts memory, and allocation becomes just a pointer bump (very fast). But it requires enough space (typically the heap can be at most half full of live data so everything fits in the other half). If most objects die young, copying collection works well. This is often used for young-generation GCs.
- Generational Garbage Collection: Empirical observation (called the generational hypothesis) is that most objects die young (e.g., temporaries, short-lived data structures). Meanwhile, objects that survive a GC cycle tend to live much longer. To optimize, generational collectors have multiple regions (generations). Newly allocated objects start in the young generation, which is collected very frequently (because we expect many will die, and collecting them quickly reclaims memory fast). Those that survive a few cycles get promoted to the old generation, which is collected less frequently. This concentrates GC effort where it’s most needed (the young gen, which is typically smaller and has a high garbage turnover). Old generation collections are more expensive but happen less often. Both the JVM and CLR use generational GCs by default, typically with 2 or 3 generations. Python’s cyclic GC also treats different “generations” of objects depending on how many collections they’ve survived.
- Concurrent and Parallel GC: Early GCs would stop the program (stop-the-world) completely during collection. Modern demands for responsiveness led to concurrent GCs that run alongside the program (mutator), and parallel GCs that use multiple CPU cores to speed up GC. For example, Java HotSpot offers many collectors:
- Parallel Stop-the-world GC (all application threads pause, but multiple GC threads work in parallel – good for throughput on servers with many cores but can have noticeable pauses).
- Concurrent Mark-Sweep (CMS) where marking is done concurrently with the program, then a brief pause to sweep. CMS reduces pause times at the cost of using CPU concurrently.
- G1 (Garbage First) and newer Shenandoah and ZGC, which break the heap into regions and attempt to collect incrementally and concurrently such that pause times are small (Shenandoah and ZGC aim for pauses in the low milliseconds regardless of heap size, using techniques like read/write barriers and thread-local allocation buffers). These are advanced designs but they all revolve around the fundamental mark-and-(maybe)-compact algorithms, with many engineering tricks.
- Memory Safety and GC: One of the benefits of GC is it eliminates certain memory errors. With GC, you cannot have a dangling pointer because an object is only freed when nothing is referencing it. In C, you can free an object while something still points to it, causing a dangling pointer bug. GC prevents that by definition – if something is still pointing to an object, it’s considered live. GC also prevents double free errors because the runtime is in charge of freeing (the program doesn’t explicitly free, so it can’t free twice). However, GC doesn’t directly prevent memory leaks – if your program keeps references to objects it doesn’t actually need (like storing them in a global list by mistake), the GC thinks they’re still live and won’t free them. So logical memory leaks are still possible, but many accidental leaks (due to forgetting to free) are resolved.
- Cost of GC: Garbage collection does introduce overhead. It can cause program pauses (which can be problematic in real-time systems or low-latency applications), and it uses CPU time for the marking and sweeping. Also, because objects may move (in copying/compacting collectors), it adds some complexity (like updating all pointers, which is typically done via some form of write barrier that tracks pointers between generations, etc.). Memory overhead is also a factor – generational collectors often reserve extra space to make copying efficient (like the semi-space approach doubling memory, though in practice generational GCs tune the ratio of Eden space to survivor spaces, etc.). Nonetheless, modern GC is highly optimized and for many applications the trade-off of a bit of CPU time for dramatically safer memory handling is worth it.
Examples in Practice:
- Java’s GC: In a simple Java program, you create objects with new and never explicitly delete them. If you set references to null or they go out of scope, eventually the GC will reclaim them. Tuning the GC (with command-line flags) is a big part of Java performance engineering in large applications; you can choose collectors that prioritize either throughput or low latency. Java’s heap layout splits into young and old generations (and sometimes perm/metaspace for class metadata). Most objects die in the young gen and are collected by a fast copying collector (a scavenger). If an object survives a few rounds or the young gen is full, it gets tenured to the old gen, where a slower mark-sweep(-compact) happens less often. Java’s newer GCs like G1 break memory into regions and collect the regions with the most garbage first – hence “Garbage First”. This aims to avoid full-heap collections by doing many small partial GCs.
- .NET’s GC: A very similar generational approach (generations 0, 1, 2). Generation 0 is collected frequently (often with a copying collector), Gen1 less often, and Gen2 (large objects and long-lived ones) least often, usually with a compacting mark-sweep. .NET’s GC can run in server mode (parallel but stop-the-world) or workstation mode (which can be concurrent for Gen2 to reduce pause times).
- Python’s GC: As mentioned, CPython does immediate reclamation via refcounts. For instance:
```python
a = [1, 2, 3]
b = a    # now 'a' and 'b' reference the list; refcount is 2
del a    # drop one reference; refcount becomes 1
del b    # drop the last reference; refcount becomes 0
```
At that moment, CPython will reclaim the list’s memory immediately (and recursively any referents, though in this case integers are small objects that might be cached). You typically don’t notice GC in Python unless dealing with big cyclic structures or lots of objects where the cyclic GC kicks in periodically. The gc module in Python allows you to inspect and tweak the thresholds for that. Python’s memory allocator also has its own layers (it has a private heap from the OS, and uses arenas, etc.). Because of the GIL, Python’s GC doesn’t need to worry about simultaneous mutations of object graphs by multiple threads (only one thread runs Python bytecode at a time), which simplifies implementation.
- Ruby’s GC: Ruby (since 2.1) has a generational GC (separating young and old objects) and uses mark-and-sweep, with compaction available as an option in recent versions. Historically Ruby had only mark-and-sweep. Ruby objects are all allocated from the C heap via its own allocator, and Ruby’s GC in MRI is stop-the-world (it stops all Ruby threads during a GC). On the other hand, JRuby delegates to the JVM’s GC, which can be more advanced.
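As noted above for Python, the gc module exposes the generational machinery; a quick look (exact values vary by CPython version):

```python
import gc

# CPython tracks container objects in three generations; an object that
# survives a collection is promoted to the next generation.
print(len(gc.get_threshold()))   # 3 thresholds, commonly (700, 10, 10)
print(len(gc.get_count()))       # 3 counters: allocations since last collect

collected = gc.collect()         # force a full (generation-2) collection
print(collected >= 0)            # number of unreachable objects found: True
```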
Memory Pools and Allocation: Runtimes often implement custom allocators on top of OS malloc to make object allocation faster. For example, many GCs will linearly allocate in a contiguous block (bump pointer allocation) until GC happens. This is much faster than general-purpose malloc which has to find a fit for the requested size. The trade-off is they need to occasionally garbage collect to free the space. But usually, allocation is so frequent that a very fast allocation path is critical. That’s why copying/compacting GCs are loved – allocation is just incrementing a pointer (assuming you have enough contiguous space). In contrast, manual malloc might search a free list which is slower.
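The bump-pointer idea can be sketched with a toy allocator in Python. This is an illustration of the concept only – real runtimes do this in native code over raw memory:

```python
class BumpAllocator:
    """Toy bump-pointer allocator over a fixed-size 'heap' (a bytearray)."""

    def __init__(self, size):
        self.heap = bytearray(size)
        self.top = 0                      # next free offset

    def alloc(self, nbytes):
        if self.top + nbytes > len(self.heap):
            raise MemoryError("heap full: time to garbage-collect")
        addr = self.top
        self.top += nbytes                # allocation is just a pointer bump
        return addr

    def reset(self):
        self.top = 0                      # what a copying GC effectively does
                                          # after evacuating live objects

heap = BumpAllocator(1024)
a = heap.alloc(64)
b = heap.alloc(128)
print(a, b)   # 0 64: objects are laid out contiguously, no free-list search
```

Compare this with a general-purpose malloc, which must search its bookkeeping structures for a suitably sized hole on every call.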
Finalization and Memory: Some languages allow objects to have finalizers or destructors (Java has finalize() (deprecated now in favor of Cleaner or try-with-resources), C# has Finalize (and a Dispose pattern for deterministic cleanup)). The runtime must handle these carefully – e.g., an object with a finalizer might not be immediately collected even when unreachable, because the runtime will schedule the finalizer to run first, and only actually free on a later cycle (to give the finalizer a chance to do something). This can complicate GC and is one reason finalizers are discouraged for most uses (they also add unpredictability – you don’t know when exactly they run, just that it’s after GC finds the object unreachable, which could be much later or maybe never if program exits before finalizer thread runs).
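Python’s counterpart to these mechanisms is weakref.finalize (generally preferred over __del__). A small sketch; the finalizer runs immediately here only because CPython’s refcounting frees the object as soon as the last reference goes, which other implementations don’t guarantee:

```python
import weakref

class Resource:
    pass

def cleanup():
    print("finalizer ran")

r = Resource()
f = weakref.finalize(r, cleanup)   # register cleanup without defining __del__
print(f.alive)   # True: the object is still referenced

del r            # last reference dropped; CPython frees the object at once,
                 # so the registered finalizer runs right away here
print(f.alive)   # False
```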
Memory Management Evolution Timeline (tidbits):
- 1950s-60s: Manual memory in early languages (assembly, Fortran had no dynamic allocation originally). Lisp (1959/1960) introduces GC.
- 1970s: GC appears in more managed languages (Simula, Smalltalk). Dijkstra et al. develop parallel GC techniques (tri-color marking) in late 1970s.
- 1980s: Reference counting in use in various environments; mark-and-sweep widely used. C++ introduces RAII (destructors that free memory) as an approach to safer manual management.
- 1990s: Java (1995) mainstreams GC for general-purpose programming, bringing generational GC to the masses. .NET (2002) follows with a similar approach.
- 2000s: Real-time and low-pause GCs (Boehm GC for C/C++ was around since 1980s but improved). High-performance JS engines needed good GC for web apps. Incremental, concurrent GCs developed (e.g., Baker’s incremental copying GC concept, etc.).
- 2010s: Regional and low-pause collectors (G1, Shenandoah, ZGC in Java), Go language appears (with fully concurrent GC by Go 1.5+). Rust takes a different path: memory management via compile-time ownership (no runtime GC, but still memory safe – a different approach entirely).
- Present: GC is a well-understood field, and new languages either adopt GC (like Swift uses ARC, a form of automated refcounting, and a cycle detector for closures) or innovative manual/automatic hybrids (Rust’s ownership model as mentioned).
In summary, memory management in runtimes spans a spectrum: from explicit manual free (with minimal runtime intervention) to sophisticated automated garbage collection. Modern high-level language runtimes almost all use some form of GC because it greatly reduces programmer burden and improves safety. The runtime’s GC algorithms have grown highly advanced to minimize the performance costs and pause times, making GC viable even in scenarios that used to avoid it. Understanding these mechanisms helps developers reason about how their program uses memory and why, for instance, creating tons of short-lived objects might be okay (they’ll be garbage-collected quickly), but keeping references around inadvertently can cause memory bloat.
Lab Exercises:
- Exercise 7.1: Simulate a simple mark-and-sweep GC in a high-level language. For example, in Python, represent objects as nodes in a graph (with references as edges). Designate some roots, mark reachable nodes, and report which ones would be collected. This conceptual exercise helps you internalize the mark-and-sweep process. You could literally use a dictionary of object->list of references and write a mark function.
- Exercise 7.2: In a language like Java or C#, induce a garbage collection and observe it. For instance, fill a large array with millions of small objects, then drop the references (set array to null) and call
System.gc(). Use logging or profiling to see if a GC occurred (in Java, the -verbose:gc flag prints GC events). Note how memory usage goes down after GC. This demonstrates the runtime reclaiming memory.
- Exercise 7.3: Explore memory leaks in a GC language. Write a program (in Python, Java, or C#) that unintentionally retains objects. For example, keep adding objects to a global list or static collection and never remove them. Watch memory usage grow (you can often see this in a task manager or with a memory profiler). Then fix the leak by removing references or not storing them, and see memory usage stay stable. This reinforces that GC frees only what the program truly doesn’t need – it’s not a cure for logical leaks.
- Exercise 7.4: If using Python, play with the
gc module: create some cyclic references (e.g., two objects referencing each other) and delete the external references to them. Check gc.get_count() and gc.collect(). For example:

```python
import gc

class Node:
    def __init__(self):
        self.ref = None

a = Node(); b = Node()
a.ref = b; b.ref = a   # cycle between a and b
del a; del b
print("Garbage before manual collect:", gc.garbage)
gc.collect()
print("Garbage after collect:", gc.garbage)
```

You should see that before gc.collect(), the cyclic objects aren’t freed (since their refcounts are not zero), but after an explicit collect, they are detected and reclaimed (they might end up in the gc.garbage list if they have a __del__ that prevented collection). This demonstrates Python’s cyclic GC working beyond refcounts.
- Exercise 7.5: In Java or C#, examine the effect of finalizers. Write a class with a finalize() method or a C# destructor and have it print something. Create some instances and drop references. Notice that the finalizer might run later, or maybe not at all if the program ends quickly. This can show how reliance on finalizers is tricky – they run on the GC’s schedule, not predictably. (Remember to enable finalizer printing; e.g., in Java you might need to explicitly call System.runFinalization() or wait for GC to happen.)
Chapter 8 – Concurrency in Runtimes
Modern software often needs to do many things seemingly at once – handling multiple users, doing background tasks, exploiting multi-core processors for speed, etc. Concurrency refers to the ability to execute multiple sequences of operations in overlapping time periods. Runtimes implement concurrency through various abstractions like threads, tasks, coroutines, and event loops. In this chapter, we discuss how different runtime systems handle concurrency, including scheduling, context switching, and synchronization.
Threads and OS Integration
The most straightforward concurrency model provided by many runtimes is the thread model, typically using native threads from the operating system (OS). An OS thread (often called a kernel thread) is managed by the OS scheduler, which can run it on any available CPU core and can preempt it at any time to switch to another thread (preemptive multitasking). High-level languages often expose threads in their runtime:
- Java and CLR (.NET): Both have a concept of threads (e.g.,
java.lang.Thread in Java, System.Threading.Thread in .NET). These correspond to real OS threads (e.g., POSIX threads on Linux, Windows threads on Windows). The runtime creates and manages them, but scheduling is largely handled by the OS kernel. This means if you create 10 threads on a 4-core machine, the OS might run 4 threads at once (one per core) and time-slice the rest.
Using OS threads has the advantage of true parallelism on multi-core processors (different threads can run simultaneously on different cores) and leverages the OS’s maturity in scheduling (the OS tries to allocate CPU fairly or according to priorities). However, thread-based concurrency has challenges:
- Overhead: Each thread has its own stack (which can be large), and creating threads or switching between them involves system calls and context switches that are relatively costly.
- Synchronization complexity: Because threads share memory by default (all threads in a process share the same heap), you need locks or other synchronization mechanisms to avoid race conditions when they access shared data. The runtime often provides primitives like mutexes, monitors (in Java, synchronized blocks), and higher-level locks or atomic operations. If misused, these can lead to deadlocks, contention, etc.
- Scalability: Creating thousands of OS threads can overload the scheduler or exhaust memory (each thread’s stack might be, say, 1 MB by default on some systems, which becomes huge if you have thousands of threads). So heavy multi-threading can hit limits.
Both the JVM and CLR employ thread pooling patterns: for example, .NET’s ThreadPool or Task Parallel Library will reuse a fixed pool of OS threads to run many short-lived tasks, to avoid constantly creating/destroying threads. Java’s ExecutorService frameworks do similarly. This is an application-level solution but integrated with runtime by providing concurrency utilities.
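Python’s standard library offers the analogous pooling facility in concurrent.futures; a minimal sketch of the pattern:

```python
from concurrent.futures import ThreadPoolExecutor

def work(n):
    return n * n   # a short-lived task; the pool reuses a few OS threads

# A fixed pool of 4 worker threads services many small tasks, avoiding
# the cost of creating and destroying one OS thread per task.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(work, range(10)))

print(results)   # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```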
- C/C++ (native code): Runtimes here (like the C runtime library) typically just wrap OS threads (pthreads in Unix, CreateThread on Windows, etc.). So concurrency is available but not managed by a language runtime per se – it’s directly the OS’s concern. However, higher-level frameworks (OpenMP, std::thread in C++11, etc.) provide nicer abstractions but still thin wrappers around OS capabilities.
Green Threads (User-Level Threads)
Some runtimes choose to implement user-level threading, often called green threads. Green threads are scheduled by the runtime (or a VM) instead of the OS. Historically, for example:
- Early Java (JDK 1.1 on Solaris) had green threads (all Java threads ran within one OS thread, and the JVM did its own scheduling). It moved to native threads by JDK 1.2 because native threads benefited from multi-core machines and OS scheduling improvements.
- Many modern languages revisit user-level threads under new names: goroutines in Go, fibers in Ruby (with certain configurations), Erlang processes (the Erlang VM has lightweight processes managed by the VM), user-space task libraries in Rust, and virtual threads in Java (Project Loom), which aim to create millions of lightweight threads scheduled by the JVM on a smaller pool of OS threads.
Pros of green threads:
- They can be extremely lightweight (each green thread might need just a few KB of stack and metadata, versus OS threads needing megabyte-scale stacks).
- The runtime can schedule them with very low overhead (no user/kernel mode switch is needed to context switch between green threads; it’s just a function call in the VM). This can mean millions of context switches per second, far more than OS threads.
- Because the runtime “knows” the intentions of the threads (e.g., that a thread is waiting on a specific I/O event), it can sometimes schedule more intelligently or avoid needless switching.
Cons:
- If the runtime schedules them on one OS thread (as old Java did), you don’t get multi-core parallelism unless the runtime itself uses multiple OS threads as workers for the green threads (which some do; e.g., Go’s runtime uses an M:N model: M green threads multiplexed over N OS threads).
- The runtime must handle blocking syscalls carefully. If a green thread performs a blocking operation (like a file read), it could stall the entire OS thread and thus all other green threads on it. Green-thread runtimes usually avoid this by using non-blocking I/O operations or by handing off blocking operations to a thread pool.
- Complexity: writing a green-thread scheduler inside the runtime is extra complexity and may need tuning to avoid pathological scheduling issues.
Examples:
- Go: The Go language is famous for goroutines. You can spawn tens of thousands of goroutines (via the go keyword), and the Go runtime will multiplex them over a smaller number of OS threads (it uses a work-stealing scheduler on a thread pool roughly equal to the number of CPU cores by default). The runtime’s scheduler automatically moves goroutines between threads to keep cores busy and uses asynchronous I/O under the hood, so a goroutine doing I/O doesn’t block an OS thread – the thread goes on to do something else while that goroutine is waiting.
- Erlang: The BEAM VM for Erlang can spawn millions of “processes” (lightweight threads), which are scheduled by BEAM. Erlang processes have their own small heaps and communicate via message passing (no shared memory by design, which greatly simplifies concurrency reasoning and GC). The BEAM scheduler runs a certain number of reductions (function calls) in one process, then switches to another (a cooperative/preemptive hybrid based on instruction counts).
- Java’s Project Loom: Introduces virtual threads (previewed in Java 19/20 and finalized in Java 21), which are essentially green threads scheduled by the JVM on a pool of OS threads (an M:N scheduler in the JVM). This allows code written as if using normal threads (with blocking I/O, etc.) to transparently benefit from lightweight scheduling. Internally, a blocking call in a virtual thread (like a socket read) parks the virtual thread and lets the OS thread do other work (like serving another virtual thread). This combines the ease of synchronous programming with scalability near that of async or event-loop models.
- Python: Standard CPython doesn’t have green threads (and the GIL means even OS threads can’t run Python code in parallel on multiple cores; they run one at a time). However, Python has coroutines and async via asyncio, which is an event-loop model (discussed below), and libraries like greenlet/gevent that implement green threads by switching the C stack manually (which is tricky but works well for I/O-bound tasks).
Coroutines and Cooperative Concurrency
Another concurrency approach is cooperative multitasking, often via coroutines. In this model, instead of preemptive switching by a scheduler, each task explicitly yields control periodically or when idle. Many modern languages have async/await which is essentially syntactic support for coroutine style.
- Coroutines (as in Python’s
async def functions, JavaScript async functions, C# async/await, or older generator-based coroutines) allow writing code that looks sequential but actually has scheduling points at each await or yield. The runtime’s event loop (or scheduler) resumes coroutines when their awaited operations complete.
- These are often tied to event-loop concurrency (single-threaded, as in JavaScript or Python’s asyncio). For example, in Node.js or asyncio, one OS thread runs an event loop that manages many tasks. When a task awaits I/O, the runtime registers the I/O and moves on to other tasks, coming back when the I/O event fires. This is cooperative because tasks run until they explicitly yield (via an await whose result isn’t ready yet).
- Cooperative scheduling avoids the overhead of thread context switches and locking (since usually only one task runs at a time, no simultaneous memory access from multiple tasks, thus thread safety concerns are minimized). But it requires that each task be well-behaved and yield periodically. A task that runs a CPU-bound loop without awaits can starve the whole system (in Node, a busy loop will block the entire server).
- The async/await pattern in many runtimes is essentially building an abstraction of concurrency on top of either an event loop or thread pool. For example, C# async/await typically uses a thread pool and futures (if one async call awaits an I/O, the thread can pick up other work; when I/O finishes, a thread resumes the continuation). In Python’s asyncio, async/await is layered on an event loop in a single thread, similar in spirit to Node.
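A minimal asyncio sketch of this cooperative model – both coroutines wait concurrently on a single OS thread, and each await is a point where the event loop can run other tasks:

```python
import asyncio

async def fetch(name, delay):
    # 'await' is a scheduling point: the event loop runs other tasks
    # while this coroutine waits for its timer.
    await asyncio.sleep(delay)
    return name

async def main():
    # gather() schedules both coroutines; their waits overlap in time.
    return await asyncio.gather(fetch("a", 0.05), fetch("b", 0.05))

results = asyncio.run(main())
print(results)   # ['a', 'b']
```

If fetch did a CPU-bound loop instead of awaiting, it would monopolize the loop – exactly the starvation hazard described above.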
Concurrency Hazards and Runtime Support
Race conditions and synchronization: When threads (or tasks) run truly in parallel (or conceptually in parallel), two operations on shared data can interleave unpredictably, causing errors. Runtimes offer synchronization primitives:
- Locks/Mutexes: e.g., synchronized in Java uses an intrinsic monitor per object; in C#, lock(obj) { ... } does similarly. These ensure only one thread enters a critical section at a time. The runtime (or OS) implements them with atomic operations and OS wait queues.
- Semaphores, condition variables: for more complex coordination (e.g., waiting for conditions, signaling between threads).
- Atomics and lock-free operations: Many runtimes provide atomic operations (Java’s AtomicInteger, C++’s <atomic>, .NET’s Interlocked class) that allow lock-free concurrency for simple cases (incrementing counters, etc.) using CPU instructions.
- Memory model: Runtimes define how memory operations in different threads relate. For example, the Java Memory Model specifies happens-before relationships and guarantees about when one thread’s writes become visible to another. The runtime must insert memory barriers or use CPU instructions appropriately to satisfy the memory model (especially on weakly ordered architectures). High-level languages abstract these details, but under the hood, the runtime (or compiled code) has to ensure that, e.g., volatile variables or the end of a synchronized block issue the necessary fencing.
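A minimal Python sketch of the mutex idea, with threading.Lock playing the role of Java’s synchronized or C#’s lock. The shared-counter update is a read-modify-write, which is not atomic across threads, so it is guarded:

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        # read-modify-write on shared state: guard it with the mutex
        with lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)   # 40000: with the lock, no updates are lost
```

Without the lock, interleaved updates could silently drop increments (the classic lost-update race).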
Thread-Local Storage (TLS): Runtimes often provide thread-local variables (e.g., ThreadLocal<T> in Java). The runtime manages separate copies per thread, which is useful for things like pseudo-random number generators or transaction contexts. Implementation is usually a hash map keyed by thread or an array index given to each thread.
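Python’s equivalent is threading.local; a quick sketch showing that each thread sees its own copy of the attributes:

```python
import threading

tls = threading.local()   # each thread gets its own set of attributes
tls.name = "main"

def worker():
    # This thread starts with no 'name' attribute; setting it here
    # does not touch the main thread's copy.
    tls.name = "worker"
    print(tls.name)        # worker

t = threading.Thread(target=worker)
t.start(); t.join()
print(tls.name)            # main: unchanged by the worker thread
```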
Fibers/Continuations: Some languages expose even more control. For instance, Ruby has a Fiber class that’s like a coroutine that can yield to its caller. Python’s generators serve as a form of coroutine. These are typically cooperatively scheduled: the programmer explicitly yields out. Some advanced runtimes have continuations (save the entire state of a running function to be resumed later). This can be used to implement coroutines, generators, or even backtracking. Continuations are not common in mainstream languages (besides maybe Scala which has delimited continuations in a library, or older languages like Scheme with call/cc).
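Python generators make the cooperative idea concrete; below is a toy round-robin scheduler over generator "tasks" (illustrative only – this is the concept, not how asyncio or Ruby Fibers are implemented):

```python
def ticker(label):
    """A generator used as a cooperative task: it yields control explicitly."""
    for i in range(3):
        yield f"{label}{i}"

# A trivial round-robin scheduler: resume each task until its next yield.
tasks = [ticker("a"), ticker("b")]
output = []
while tasks:
    task = tasks.pop(0)
    try:
        output.append(next(task))   # run the task up to its next yield
        tasks.append(task)          # still alive: put it back in the queue
    except StopIteration:
        pass                        # task finished; drop it

print(output)   # ['a0', 'b0', 'a1', 'b1', 'a2', 'b2']
```

The interleaved output shows the scheduler switching between tasks only at the points where they voluntarily yield.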
Case Studies:
- Java and C# (OS-thread-based): Generally favor using multiple OS threads for parallel tasks. Synchronization heavily relies on OS primitives (mutexes, etc.), though optimized in runtime (Java uses thin vs fat locks, biasing locks to threads to avoid atomic operations until contention arises).
- Python (GIL and asyncio): Because of GIL, multi-threading in CPython doesn’t give parallel CPU usage. So for concurrency, Python often uses multiprocess (spawning processes) or asyncio for concurrent I/O without threads. The runtime’s GIL ensures single-thread access to Python objects (so typical Python code doesn’t need locks for its objects, but extension modules releasing the GIL need to ensure thread safety at lower level). Asyncio in Python is built into the runtime library as an event loop that uses
select/epoll for I/O and schedules Task objects (futures) cooperatively by driving them forward on events. It requires that I/O be done via the provided async libraries so as not to block.
- Node.js (event loop): Node’s runtime is essentially single-threaded for JavaScript execution. It uses libuv (a C library) to handle async I/O and a thread pool for certain things (like filesystem ops) behind the scenes, but JavaScript code sees a single thread and uses callback/async patterns. This eliminates the need for locks in user JS code entirely – but the downside is that one slow JS function can freeze the event loop.
- Go (M:N scheduling): The Go runtime is a good example of an M:N green-threading model: it multiplexes M goroutines onto N OS threads. If a goroutine blocks on I/O, the runtime (with help from the Go scheduler and its integrated network poller) parks that goroutine and schedules another onto the same OS thread, or even creates a new OS thread if all are blocked. This way, a blocking I/O operation in code doesn't block an OS thread. When a goroutine does something that might block, the Go runtime switches it out much like a context switch, but in user space – a technique called goroutine parking. Goroutine stacks start small and grow on demand (Go originally used segmented stacks and now uses growable contiguous stacks), making goroutines memory-efficient. The scheduler is a complex part of Go's runtime, with tuning constants governing how long a goroutine runs before being switched (a form of time-slicing, cooperative at I/O points; since Go 1.14 the runtime can also preempt long-running goroutines asynchronously). The result is that writing concurrent code in Go is as straightforward as writing sequential code with go statements, while under the hood execution is event-driven to avoid OS thread overhead.
New Directions: – Rust: Although Rust doesn't have a runtime-managed GC, its concurrency model is based on ownership, which prevents data races at compile time. You can use OS threads (Rust's standard library threads map essentially to OS threads) and message passing without locks when ownership is transferred. Rust also has async/await with an ecosystem of executors – interestingly, Rust's async is zero-cost in the sense that the compiler transforms async functions into state machines (as many languages do), which you then plug into an executor (e.g., the tokio or async-std crates). These executors act as an event loop / thread pool; the runtime part is offloaded to libraries rather than built into the language. – Multi-core and multi-process: Some runtimes sidestep threading complexity by using processes. For example, Erlang processes live inside BEAM, and many deployments run multiple BEAM instances connected to each other; Python's multiprocessing module spawns processes that communicate via pipes. This leverages OS process isolation to avoid locks, at the cost of more overhead and the need to serialize data between processes.
Parallel computing is a related but distinct concern: scientific computing often uses multi-threading or GPU offloading outside typical runtime concurrency; these may involve runtime support for thread pools or OpenMP, which a runtime might provide itself or delegate to libraries.
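The cooperative scheduling that asyncio's event loop performs can be seen in a minimal sketch (standard library only; asyncio.sleep stands in for real non-blocking I/O, and the task names are invented):

```python
import asyncio

async def fetch(name, delay):
    # await asyncio.sleep suspends this coroutine; the event loop
    # runs other ready tasks on the same thread in the meantime.
    await asyncio.sleep(delay)
    return f"{name} done"

async def main():
    # Both coroutines run concurrently on one thread; the total
    # wall time is roughly max(delay), not the sum of delays.
    return await asyncio.gather(fetch("a", 0.1), fetch("b", 0.1))

results = asyncio.run(main())
print(results)
```

If either coroutine performed a blocking call instead of awaiting, the other would be starved – exactly the constraint described above.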
Thread vs. Event vs. Hybrid:
- A general rule: if tasks are mostly I/O-bound (waiting on external events like network or disk), an event-loop/async model can be more efficient, because you don't need one thread per connection – thousands of connections can be handled on one thread with non-blocking I/O and callbacks. Node.js and Python's asyncio shine here.
- If tasks are CPU-bound, you want multiple cores, which necessitates OS threads or processes (one core runs one thing at a time). Java, C#, Go, etc. will run work on multiple cores automatically with threads or thread pools. Node cannot utilize multiple cores from one process (except by forking worker processes via the cluster module).
- Many systems combine approaches: a web server might have a pool of threads, each using async I/O to manage thousands of sockets (M threads, each multiplexing many connection tasks) – a hybrid thread + event model. Some languages (like Scala with Akka) encourage an actor model (akin to Erlang processes) on top of thread pools.
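The hybrid approach can be sketched in Python: an asyncio event loop that offloads a CPU-bound function to an executor pool so the loop stays responsive. This is a minimal sketch using only the standard library; for true CPU parallelism in CPython you would substitute a ProcessPoolExecutor for the ThreadPoolExecutor shown here.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

def cpu_bound(n):
    # CPU-bound work that would freeze a pure event loop if run inline.
    return sum(i * i for i in range(n))

async def main():
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor() as pool:
        # Offload the heavy call to the pool; the event-loop thread
        # stays free to service other tasks (asyncio.sleep stands in
        # for concurrent async I/O).
        heavy = loop.run_in_executor(pool, cpu_bound, 100_000)
        light = asyncio.sleep(0, result="io done")
        return await asyncio.gather(heavy, light)

heavy_result, light_result = asyncio.run(main())
print(heavy_result, light_result)
```

This is the same shape production servers use: event-driven I/O at the edges, a pool for the work that actually burns CPU.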
Lab Exercises:
- Exercise 8.1: Write a simple multi-threaded program in a language of your choice (Java, C#, or C++). For example, increment a shared counter from multiple threads many times without synchronization and observe that the final result is often wrong (due to race conditions). Then add proper locking around the increment and observe the correct result. This demonstrates the need for synchronization in threaded programs.
- Exercise 8.2: In Python, experiment with threading vs. asyncio:
- Create a thread that runs a loop with some prints and delays, and another thread doing something else. Notice that because of the GIL, if one thread is purely computing (no I/O), it can starve the other (they still context-switch when the GIL is periodically released – CPython switches threads on a time interval or on blocking I/O).
- Then do a similar task with asyncio (use asyncio.sleep to simulate waiting) and run multiple coroutines. See how they interleave nicely on a single thread. This shows the cooperative nature.
- Exercise 8.3: If you have Go installed, write a small program launching 100k goroutines, each sleeping for a second (use time.Sleep) and then printing a dot. Observe that you can create far more goroutines than OS threads (try the same with OS threads in C or C++ and the OS will likely struggle or run out of resources). This highlights how lightweight goroutines are.
- Exercise 8.4: In Node.js, write a server that intentionally blocks (for instance, synchronously computing an enormous Fibonacci number when a certain URL is hit). Start it and make concurrent requests (some hitting the heavy computation, some trivial). Observe how one heavy request delays the others because of the single-threaded event loop. In contrast, do a similar thing in a multi-threaded server (like Java or C#) and see that other requests are still served in parallel. This underscores the event loop's weakness for CPU-bound tasks.
- Exercise 8.5: Use a debugging or profiling tool to visualize concurrency:
- In Java, you could use VisualVM or a thread dump to see multiple threads and their states (runnable, waiting, etc.). For a simple program with a few threads sleeping or waiting on locks, take a thread dump (with jstack) to see the lock states and thread call stacks.
- In an async environment like JavaScript, use the Chrome DevTools Performance tab to record when tasks run and how asynchronous events are handled in the call stack.
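For Exercise 8.1, the same race can be reproduced in Python – a minimal sketch (thread and iteration counts are arbitrary; with the GIL the lost updates may not appear on every run, but the locked version is always correct):

```python
import threading

N_THREADS, ITERS = 8, 100_000
counter = 0
lock = threading.Lock()

def unsafe():
    global counter
    for _ in range(ITERS):
        counter += 1  # read-modify-write: not atomic, can lose updates

def safe():
    global counter
    for _ in range(ITERS):
        with lock:    # the lock makes the increment atomic
            counter += 1

def run(worker):
    global counter
    counter = 0
    threads = [threading.Thread(target=worker) for _ in range(N_THREADS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

unsafe_total = run(unsafe)
safe_total = run(safe)
print(unsafe_total, safe_total)
```

When updates are lost, unsafe_total falls short of N_THREADS * ITERS, while safe_total always equals it.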
By exploring these, you get a feel for how the runtime orchestrates multiple tasks, either via the OS or internally, and for the importance of writing code that cooperates with the chosen concurrency model.
Chapter 9 – Security and Sandboxing
Runtime systems also play a critical role in enforcing security constraints and isolating code execution. The term sandboxing refers to running code in a controlled environment where its access to resources (like file system, network, memory) is restricted, preventing potential harm or unauthorized operations. Many runtimes implement sandboxing especially for executing untrusted code (think Java applets, JavaScript in the browser, or smart contracts on blockchain VMs). This chapter looks at how runtimes provide security features like sandboxing, type safety enforcement, and other protection mechanisms.
Memory Safety and Type Safety
One foundational security aspect provided by runtimes is memory safety. Languages with managed runtimes (Java, C#, Python, etc.) don’t allow arbitrary pointer arithmetic or buffer overflows – the runtime (or compiled code with runtime checks) will prevent reading/writing outside the bounds of an array, or following a null reference, etc. This not only prevents crashes but also a class of security vulnerabilities (buffer overflow exploits, for instance, which are common in C/C++ programs, are virtually impossible in Java/C# code because the runtime would throw an IndexOutOfBoundsException rather than let you overwrite memory). Memory safety significantly improves security as attackers cannot as easily inject code or manipulate the process memory via out-of-bounds writes.
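A one-line illustration of the contrast: in a memory-safe runtime, an out-of-bounds access raises a catchable error instead of reading or writing adjacent memory. A minimal Python sketch:

```python
def read_past_end(buf, index):
    # In C, buf[index] with an out-of-range index reads adjacent
    # memory (undefined behavior). A managed runtime checks the
    # bounds and raises instead of corrupting memory.
    try:
        return buf[index]
    except IndexError as e:
        return f"caught: {e}"

result = read_past_end([1, 2, 3], 20)
print(result)
```

The failure is an ordinary, recoverable exception – the process's memory is never touched outside the list.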
Type safety is similar: a runtime ensures an object is only used according to its declared type. For example, you can’t treat a string as if it were an integer and jump to its bytes as executable code. Type safety and safe casting rules (with runtime type checks for downcasts) ensure that you don’t interpret memory in an invalid way.
These features are baked into the language’s semantics and enforced by the runtime through checks and by the design of the execution model (e.g., no pointer arithmetic in Java, and the bytecode verifier ensures you don’t forge pointers or call methods with wrong types).
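The same point in Python: even without static types, the runtime checks the actual type of every operand at the moment of use, so memory is never reinterpreted invalidly. A minimal sketch:

```python
def add(a, b):
    # No declared types, but the runtime checks the operands'
    # actual types when '+' executes.
    return a + b

ok = add(2, 3)
try:
    add("2", 3)  # str + int: the runtime raises rather than
                 # reinterpreting the string's bytes as a number
    type_error = None
except TypeError as e:
    type_error = str(e)

print(ok, type_error)
```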
Sandboxing Untrusted Code
When you run code from an untrusted source (like an applet in the 90s, or JavaScript from a random website, or WebAssembly from a third-party, or user-submitted plugins in an application), you want to confine what that code can do. The runtime is in a unique position to mediate all operations because code typically can only do what the runtime’s API allows.
Java’s Sandbox Model: Early Java’s security architecture was built around a sandbox for applets. The idea: – Local code (fully trusted, e.g., running from your disk with permissions) could do anything Java allows (file I/O, network, etc.). – Remote code (applets downloaded from the internet) ran in a sandbox with severe restrictions: – It couldn’t access the local file system (except maybe a very limited, safe portion). – It couldn’t open network connections to any host except the one it originated from. – It couldn’t spawn local processes or load native libraries. – It couldn’t read system properties or do reflection to break out of the sandbox (in theory).
The enforcement of these rules was done by the SecurityManager and the ClassLoader in the JVM: – The ClassLoader that loaded the applet classes marked them as coming from an untrusted source (and isolated them in a separate namespace). – The SecurityManager had hooks in potentially dangerous API calls. For example, a call like new FileReader("C:\\secret.txt") inside an applet would trigger the SecurityManager's checkRead method, which by default (for an applet) would throw a SecurityException. The JDK's library methods are written to consult the SecurityManager for sensitive operations (file access, opening a socket, creating a ClassLoader, etc.). – There was also a bytecode verifier (ensuring that the applet's compiled bytecode doesn't perform illegal low-level operations) and a distinction between privileged and unprivileged code (core system classes could perform privileged actions; applet code could not unless granted permission).
Over time, this sandbox was expanded into the Java Permission model where code could be signed and granted specific permissions. For instance, a signed applet from a trusted vendor might be allowed to write files if the user approved. The SecurityManager would then be configured with a policy granting that signed code certain privileges.
The sandbox model successfully prevented Java applets from, say, deleting your hard drive or snooping your local files (unless there was a SecurityManager bug, which occasionally happened in early days). It basically ensured an applet could only play in its sandbox – much like children in a sandbox can play freely inside it but cannot affect the outside environment.
JavaScript Sandbox in Browsers: The browser is essentially a runtime for JavaScript that sandboxes it: – JS from a webpage cannot access your file system arbitrarily (there’s no direct API to do so in standard JS). – It cannot open network connections to hosts other than where it came from, due to the Same Origin Policy enforced by browsers (this is not in the JS runtime per se, but the environment: browsers restrict that script’s network requests – e.g., AJAX calls – to the origin domain unless CORS is allowed). – It cannot spawn OS processes or read arbitrary system info (it can only use what the Web APIs provide, which are designed not to expose sensitive stuff). – The JS engine and browser enforce that each web page runs in a sandbox relative to other pages. One page’s script can’t access another page’s data in memory directly (unless allowed via safe APIs like postMessage). – The browser’s internal sandboxing (like Chrome uses OS-level sandbox processes for each site) complements the JS runtime restrictions, but focusing on the runtime: the language itself lacks dangerous primitives like pointers or syscalls, and the global objects provided exclude filesystem or raw socket access (except WebSocket, which is restricted to web protocols).
Additionally, the browser's runtime has mitigations like script-injection protections, and the environment is designed such that an attempt like window.open('file:///etc/passwd') is either blocked outright or limited by user prompts.
Managed Code Access Security in .NET: .NET had a similar concept called Code Access Security (CAS) where assemblies could be loaded in a sandbox with certain permissions (e.g., no file IO, no UI access, etc.). This was used in scenarios like running add-ins with limited trust. It was quite complex and largely deprecated in favor of simpler sandboxing (like running in an OS sandbox or using .NET Core which runs code with full trust but within OS-level boundaries like containers or such).
Mobile App Sandboxing: On platforms like Android and iOS, each app is typically running with OS-level sandbox (separate UID, limited access to files of other apps). At the runtime level, Android’s Java VM doesn’t have a special SecurityManager for apps – instead, permissions (like accessing camera or contacts) are enforced partly at OS level and partly by gating APIs (Android APIs check whether the app has been granted a permission and throw otherwise). The runtime ensures type safety and memory safety, while the OS ensures process isolation and file sandbox.
Operating System and Runtime interplay: A lot of security is layered: – The runtime prevents code from doing unintended things within the process (no wild memory writes, no calling native code except through controlled interfaces). – The OS can isolate the process itself from doing restricted actions (like a process might run under a restricted user account or inside a container with no access to certain devices). – Combining them provides defense in depth.
Native Code and Sandboxing in runtimes: A challenge arises when untrusted code can somehow run native instructions (like through a JIT or native extensions). For instance: – Java’s sandbox forbids use of System.loadLibrary by untrusted code (because loading a native library would let that library do anything with native code outside the control of the JVM). This is checked by SecurityManager. – Similarly, in .NET, untrusted code would not be allowed to call unmanaged code (P/Invoke) without permission. – JavaScript in browsers doesn’t have any mechanism to run arbitrary native code (except if there’s a vulnerability). WebAssembly changes this a bit because WebAssembly is low-level, but browsers ensure that WebAssembly is confined by the same rules as JS (it can’t access outside memory except what’s given, can’t call OS functions directly – it can only call JS imports which are subject to same origin and all). – WebAssembly is actually an example of a runtime within a runtime: it’s a deterministic, memory-safe sandbox – the browser verifies the Wasm binary for memory safety (no out-of-bounds memory access due to the Wasm semantics: it checks every access is within bounds of linear memory). So Wasm code can’t break out of its linear memory or corrupt the engine.
Security Policies and Verification: – Bytecode Verification: as mentioned, VMs like the JVM and .NET verify bytecode/MSIL before executing it, ensuring it doesn't overflow the operand stack, call methods with wrong types, forge object references, etc. This prevents malicious or corrupted bytecode from doing anything outside the language rules. – Stack Inspection: In Java's original security model, when code called a sensitive method, the runtime performed a stack inspection to see whether everyone on the call stack had permission. For example, if a trusted library method tried to open a file on behalf of applet code, the SecurityManager would see that untrusted code was on the call stack and deny the operation – unless the trusted code explicitly used AccessController.doPrivileged to assert that it takes responsibility (truncating the stack inspection at that frame). This was a sophisticated mechanism to prevent confused-deputy problems (where trusted code might inadvertently perform a privileged action because it was tricked by untrusted code). .NET CAS had a similar concept of stack walks for demands and asserts.
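A toy version of stack inspection can be written in a few lines of Python using the inspect module. This is purely illustrative – the function names and the "untrusted" naming convention are invented for the sketch, and real VMs use per-class protection domains rather than name matching:

```python
import inspect

def untrusted_plugin():
    # Untrusted code calls a trusted helper, hoping the helper's
    # privileges will let the read through (a confused deputy).
    return trusted_read("secret.txt")

def trusted_read(path):
    return check_permission_then_read(path)

def check_permission_then_read(path):
    # Walk the call stack: if any frame belongs to untrusted code,
    # deny the operation (a toy analogue of Java's stack inspection).
    for frame in inspect.stack():
        if frame.function.startswith("untrusted"):
            raise PermissionError(f"untrusted caller may not read {path}")
    return f"contents of {path}"

direct = trusted_read("secret.txt")      # all-trusted stack: allowed
try:
    via_plugin = untrusted_plugin()      # untrusted frame on stack: denied
except PermissionError as e:
    via_plugin = f"denied: {e}"
print(direct, via_plugin)
```

The doPrivileged mechanism corresponds to stopping this stack walk early at a frame that explicitly vouches for the operation.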
Sandboxing for Social/Web contexts: – Web frameworks and managed code: Some applications allow users to submit code or queries (SQL, plugins, mods). Runtimes often provide or require sandboxing for these user scripts. For example, a game might embed Lua or Python and run user-made mods in a restricted environment that withholds dangerous APIs. – In multi-user or collaborative execution environments (plugin ecosystems, shared scripting platforms), the runtime must additionally enforce permissions per user context and isolate one user's code from another's.
Operating System Sandboxing: Not to be overlooked, many language runtimes now run on OSs with sandboxing features (like iOS’s app sandbox, Windows Sandbox, Linux seccomp, etc.). For instance: – Chrome’s JS engine runs each site’s code in a separate OS process that is sandboxed (limited syscalls via seccomp, limited filesystem access, etc.). Even if the JS engine had a bug, the OS sandbox would limit damage. – Node.js doesn’t sandbox by default (since it’s meant for trusted server scripts), but you can run Node in a container or use a third-party sandbox module (there were modules to run untrusted JS in a separate V8 isolate with restricted builtins).
Security boundaries and performance: – Sandboxing often comes at a performance cost. Additional checks (like array bounds, SecurityManager calls on each file access) slow things down. Over time, some of these got optimized (e.g., JIT can remove redundant bounds checks in loops). The trend also has been to move security to coarser layers (like OS container boundaries) rather than fine-grained checks in every library call, because the latter can be burdensome and sometimes bypassable if not careful. – For example, Java’s SecurityManager is optional and often disabled in server environments for performance reasons (because doing those checks on each file open etc. adds overhead). Instead, people sandbox at container/OS level if needed.
Memory sandboxing for native code: There are also projects to sandbox unsafe code in safe runtimes. For example, Google’s Native Client (NaCl) was an effort to sandbox native x86 code in the browser by validating the binary for certain safety patterns. That is more of a CPU-level sandbox, and WebAssembly sort of replaced that approach by providing a structured sandbox format.
In summary, runtime security is about limiting what running code can do and ensuring code cannot break the runtime’s own safety guarantees. Through a combination of language design (no dangerous primitives), runtime verification and checks, and controlled library APIs, runtimes can create a sandbox that allows code to execute useful tasks without compromising the host system or other programs’ data.
Lab Exercises:
- Exercise 9.1: Try out Java's SecurityManager, which requires a security policy file (note that the SecurityManager has been deprecated for removal since Java 17, so use an older JDK). Write a small Java program that attempts a sensitive operation (like reading a file or opening a socket). Then enable a SecurityManager with a restrictive policy (e.g., no file access) and see it throw a SecurityException. For instance:

```java
import java.io.File;
import java.io.FileInputStream;

public class SandboxTest {
    public static void main(String[] args) {
        System.setSecurityManager(new SecurityManager());
        try {
            File f = new File("test.txt");
            FileInputStream fis = new FileInputStream(f);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
```

Run with a policy file that denies all permissions and watch the exception. This shows how the runtime intercepts the action.
- Exercise 9.2: In a browser environment, open the developer console and try something like:

```javascript
fetch('file:///etc/passwd')
  .then(response => console.log(response))
  .catch(error => console.error('Error:', error));
```

Most likely, the browser will refuse due to CORS or file-URI restrictions (especially if the page is not itself served from a local context). This demonstrates the web sandbox disallowing file access. Also try opening a raw socket (not via the WebSocket API) – you can't, because no such API exists. Older escape hatches like Flash or ActiveX are blocked by default in modern browsers.
- Exercise 9.3: Explore a WebAssembly sandbox. Use an online Wasm playground, or write a simple C program, compile it to WebAssembly, and run it in a browser via an HTML page. For example, a C program with an out-of-bounds access:

```c
#include <stdio.h>

int main() {
    int arr[10];
    arr[20] = 42;   /* out-of-bounds write */
    printf("Hello\n");
    return 0;
}
```

In a normal C environment, this is a memory error that could crash the process or silently corrupt memory. In WebAssembly, any access outside the module's linear memory traps at runtime; a small overflow like this one may still land within the module's own linear memory (sanitizer-style compile flags can catch it), but either way the damage is confined to the module's sandbox – it cannot corrupt the host engine's memory.
- Exercise 9.4: If possible, run a piece of untrusted Python code using exec in a restricted environment. Python doesn't have a built-in sandbox (older versions had a "restricted mode" that was removed because it was not secure), but you can simulate one by removing dangerous builtins and limiting the globals. For example:

```python
safe_globals = {"__builtins__": {"print": print}}
untrusted_code = "print('Hi'); import os; os.remove('foo.txt')"
try:
    exec(untrusted_code, safe_globals)
except Exception as e:
    print("Caught exception:", e)
```

This raises an exception when the untrusted code tries import os, because the provided builtins contain only print (and in particular no __import__). This is a naive sandbox – it can be circumvented in Python by clever means – but it illustrates restricting which builtins are available.
- Exercise 9.5: Check out OS-level sandboxing quickly:
- On Linux, run a program under firejail or inside a simple Docker container and observe that it cannot see files outside its environment.
- On Windows, you can create a low-integrity process (harder to do manually, but tools exist). This is more about the OS, but it reinforces how multiple layers enforce security.
These exercises demonstrate how runtimes and systems work together to confine code execution and prevent malicious or buggy code from causing harm outside designated boundaries.