Efficient Software-Based Fault Isolation
Myoungsoo Jung
Problems, Solutions and Summaries:
The key idea of isolating faults in software is simple, and it makes fault isolation cheap enough to be practical. In this paper, Robert Wahbe et al. begin by identifying the problem with the traditional method. Hardware-based fault isolation places each module in its own address space, so preventing code in one address space from corrupting the contents of another carries a high performance cost: every cross-domain call incurs a context switch plus additional work such as trapping, copying arguments, saving and restoring registers, and flushing the translation look-aside buffer. To overcome this, the authors propose a software approach implemented within a single address space. It loads the code and data of a distrusted module into a separate fault domain and modifies the module's object code so that it cannot write or jump to an address outside that fault domain.
To achieve this goal, the authors propose software encapsulation, which transforms a distrusted module so that it cannot escape its fault domain. A fault domain consists of two segments: one for the distrusted module's code, and another for its static data, heap, and stack. Software encapsulation offers two key mechanisms for isolating a distrusted module. The first, segment matching, prevents the use of illegal addresses and can also pinpoint the location of the fault within the module. It inserts checking code before every unsafe instruction, that is, every instruction that jumps or stores to an address that cannot be statically verified to lie within the correct segment. If the check fails, the inserted code traps to a system error routine outside the distrusted module's fault domain.
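The segment-matching check described above can be sketched as follows. This is only an illustration under assumed parameters, not the authors' code: the 24-bit split between segment identifier and in-segment offset, and the names `SEGMENT_SHIFT`, `SegmentFault`, `segment_match`, and `checked_store`, are hypothetical, and a Python dictionary stands in for memory.

```python
# Hypothetical layout: the bits above SEGMENT_SHIFT hold the segment identifier.
SEGMENT_SHIFT = 24


class SegmentFault(Exception):
    """Stands in for the trap to the system error routine."""


def segment_match(target_address, segment_id):
    """Return the address unchanged if its upper bits match segment_id, else trap."""
    if (target_address >> SEGMENT_SHIFT) != segment_id:
        raise SegmentFault(f"illegal address {target_address:#010x}")
    return target_address


def checked_store(memory, target_address, value, segment_id):
    """Model an unsafe store that has been rewritten to go through the check."""
    dedicated = segment_match(target_address, segment_id)  # models the dedicated register
    memory[dedicated] = value
```

Because the check raises at the offending store rather than redirecting it, the failing address is available to the error routine, which is what lets segment matching report where the fault occurred.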
The other key mechanism of software encapsulation is address sandboxing. Sandboxing inserts code before each unsafe instruction that sets the upper bits of the target address to the correct segment identifier. As with segment matching, the unsafe store or jump instruction is modified to use a dedicated register, which guarantees that the distrusted module's code cannot produce an illegal address. Sandboxing requires two instructions: one clears the segment identifier bits and stores the result in the dedicated register, and the other sets the segment identifier bits to the correct value.
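The two sandboxing instructions can be modeled as a mask followed by an OR. The sketch below assumes a hypothetical address layout in which the low 24 bits are the in-segment offset; `SEGMENT_SHIFT`, `OFFSET_MASK`, and `sandbox` are illustrative names, not taken from the paper.

```python
# Hypothetical layout: the bits above SEGMENT_SHIFT hold the segment identifier.
SEGMENT_SHIFT = 24
OFFSET_MASK = (1 << SEGMENT_SHIFT) - 1


def sandbox(target_address, segment_id):
    """Force target_address into the segment named by segment_id."""
    dedicated = target_address & OFFSET_MASK   # instruction 1: clear the segment bits
    dedicated |= segment_id << SEGMENT_SHIFT   # instruction 2: set the correct segment bits
    return dedicated                           # the rewritten store or jump uses this value
```

Under these assumptions, `sandbox(0x13000010, 0x12)` yields `0x12000010`: an illegal address is silently redirected into the module's own segment instead of being reported, which is the trade-off against segment matching.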
The remaining issues for efficient software encapsulation are optimization, protecting process resources, and sharing data across fault domains. For optimization, the authors reduce the overhead of computing the target address. RISC store and jump instructions typically form the target address from a register plus an offset. Sandboxing is applied to the register alone, and the offset is handled by guard zones, which are unmapped regions at the edges of each segment. For process resources, the authors require distrusted modules to access them through cross-fault-domain RPC. For instance, if a distrusted module's object code performs a direct system call, the call is transformed into the appropriate RPC call. Finally, for data sharing: because software encapsulation does not alter load instructions, a fault domain can read any memory mapped in the application's address space. It cannot, however, write outside its own segment, so domains cannot share writable data directly. The authors therefore provide lazy pointer swizzling, which aliases each shared region into multiple locations in the virtual address space by modifying the hardware page table.
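The guard-zone optimization can be checked with a small model: if only the register is sandboxed, the final address (register plus offset) can leave the segment by at most the largest offset, so it lands in an unmapped guard zone. The segment size, the offset bound `MAX_OFFSET`, and the function `classify` are assumptions made for this sketch and are not taken from the paper.

```python
# Hypothetical parameters for illustration only.
SEGMENT_SHIFT = 24
SEGMENT_SIZE = 1 << SEGMENT_SHIFT
MAX_OFFSET = 1 << 15        # assumed bound on the instruction's immediate offset
GUARD_SIZE = MAX_OFFSET     # guard zones must cover the largest possible offset


def classify(sandboxed_register, offset, segment_id):
    """Say where register+offset lands when only the register was sandboxed."""
    base = segment_id << SEGMENT_SHIFT
    if not base <= sandboxed_register < base + SEGMENT_SIZE:
        raise ValueError("register was not sandboxed into this segment")
    if not -MAX_OFFSET <= offset < MAX_OFFSET:
        raise ValueError("offset exceeds the assumed immediate range")
    address = sandboxed_register + offset
    if base <= address < base + SEGMENT_SIZE:
        return "segment"
    # Any other reachable address is within GUARD_SIZE of the segment, i.e. unmapped.
    assert base - GUARD_SIZE <= address < base + SEGMENT_SIZE + GUARD_SIZE
    return "guard zone"
```

An access classified as "guard zone" would fault in hardware because those pages are unmapped, so no extra instruction is needed to add the offset before sandboxing.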
Critiques:
Although this paper strives to reduce the cost of fault isolation by using software instead of hardware, I am concerned that its mechanism may stall the processor pipeline. Computer architects have traditionally worked hard to reduce pipeline stalls because they cause severe performance degradation. Software encapsulation inserts code for segment matching and sandboxing, and that inserted code may itself cause stalls. The authors should therefore provide clearer evidence that the proposed approach does not hurt pipeline behavior. Secondly, I do not find a realistic solution for determining the "correct" value, that is, the segment identifier used to implement segment matching. If the location at which distrusted modules are loaded is fixed, the segment identifier is easily determined at compile time; if not, it cannot be known at compile time. For this reason, the authors should define the correct value more precisely and explain how the segment identifier can be found at compile time without relying on the operating system's module-loading mechanism.