Publications

2018

Stopping Memory Disclosures via Diversification and Replicated Execution Kangjie Lu, Meng Xu, Chengyu Song, Taesoo Kim, and Wenke Lee. IEEE Transactions on Dependable and Secure Computing (TDSC) preprint, October 2018.

With the wide deployment of security mechanisms such as Address Space Layout Randomization (ASLR), memory disclosures have become a prerequisite for critical memory-corruption attacks (e.g., code-reuse attacks): adversaries are forced to exploit memory disclosures to circumvent ASLR as the first step. As a result, the security threat of memory disclosures is now significantly aggravated, as they break not only data confidentiality but also the effectiveness of security mechanisms.

In this paper, we propose a general detection methodology and develop a system to stop memory disclosures. We observe that memory disclosures are not root causes but rather consequences of a variety of hard-to-detect program errors such as memory corruption and uninitialized reads. We thus propose a replicated execution-based methodology to detect memory disclosures in general, regardless of their causes. We realize this methodology with BUDDY: by seamlessly maintaining two identical running instances of a target program and diversifying only its target data, BUDDY can accurately detect disclosures of that data, since any disclosure causes the two instances to output different values. Extensive evaluation results show that BUDDY is reliable and efficient while stopping real memory disclosures such as the Heartbleed leak.
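
The detection principle can be sketched in a few lines of Python. This is a toy illustration only: the vulnerable service, the diversified secret values, and the `buddy_check` wrapper are invented names, while BUDDY itself operates on real program binaries.

```python
# Hypothetical sketch of replicated execution with diversified target data.
# Two identical instances run the same request; only the secret is
# diversified. Any output that depends on the secret differs between the
# instances, revealing a disclosure.

def vulnerable_service(request: bytes, secret: bytes) -> bytes:
    # Heartbleed-style over-read: echoes more bytes than the request holds.
    length = request[0]                # attacker-controlled length
    buffer = request[1:] + secret      # secret lives in adjacent memory
    return buffer[:length]

def buddy_check(request: bytes) -> bool:
    """Return True if the two diversified replicas diverge (leak detected)."""
    out_a = vulnerable_service(request, secret=b"\xAA" * 16)
    out_b = vulnerable_service(request, secret=b"\xBB" * 16)
    return out_a != out_b

# A benign request stays within bounds: the replicas agree.
assert buddy_check(bytes([4]) + b"ping") is False
# An over-long read pulls in diversified secret bytes: the replicas diverge.
assert buddy_check(bytes([20]) + b"ping") is True
```

Note that the check needs no knowledge of the root cause (here an over-read); divergence alone signals the disclosure.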

@article{lu:buddy,
  title        = {{Stopping Memory Disclosures via Diversification and Replicated Execution}},
  author       = {Kangjie Lu and Meng Xu and Chengyu Song and Taesoo Kim and Wenke Lee},
  journal      = {IEEE Transactions on Dependable and Secure Computing (TDSC)},
  volume       = {preprint},
  month        = oct,
  year         = 2018,
}
SGX-Tor: A Secure and Practical Tor Anonymity Network With SGX Enclaves Seongmin Kim, Juhyeng Han, Jaehyeong Ha, Taesoo Kim, and Dongsu Han. IEEE/ACM Transactions on Networking (ToN) 26(5), October 2018.

With Tor being a popular anonymity network, many attacks have been proposed to break its anonymity or leak information of a private communication on Tor. However, guaranteeing complete privacy in the face of an adversary on Tor is especially difficult, because Tor relays are under complete control of world-wide volunteers. Currently, one can gain private information, such as circuit identifiers and hidden service identifiers, by running Tor relays and can even modify their behaviors with malicious intent. This paper presents a practical approach to effectively enhancing the security and privacy of Tor by utilizing Intel SGX, a commodity trusted execution environment. We present a design and implementation of Tor, called SGX-Tor, that prevents code modification and limits the information exposed to untrusted parties. We demonstrate that our approach is practical and effectively reduces the power of an adversary to a traditional network-level adversary. Finally, SGX-Tor incurs moderate performance overhead; the end-to-end latency and throughput overheads for HTTP connections are 3.9% and 11.9%, respectively.

@article{kim:sgx-tor2,
  title        = {{SGX-Tor: A Secure and Practical Tor Anonymity Network With SGX Enclaves}},
  author       = {Seongmin Kim and Juhyeng Han and Jaehyeong Ha and Taesoo Kim and Dongsu Han},
  journal      = {IEEE/ACM Transactions on Networking (ToN)},
  volume       = 26,
  number       = 5,
  pages        = {2174--2187},
  month        = oct,
  year         = 2018,
}
Enforcing Unique Code Target Property for Control-Flow Integrity Hong Hu, Chenxiong Qian, Carter Yagemann, Simon Pak Ho Chung, William R. Harris, Taesoo Kim, and Wenke Lee. CCS 2018.

The goal of control-flow integrity (CFI) is to stop control-hijacking attacks by ensuring that each indirect control-flow transfer (ICT) jumps to its legitimate target. However, existing implementations of CFI have fallen short of this goal because their approaches are inaccurate and, as a result, the set of allowable targets for an ICT instruction is too large, making illegal jumps possible.

In this paper, we propose the Unique Code Target (UCT) property for CFI: for each invocation of an ICT instruction, there should be one and only one valid target. We develop a prototype called uCFI to enforce this new property. During compilation, uCFI identifies the sensitive instructions that influence ICTs and instruments the program to record the necessary execution context. At runtime, uCFI monitors the program execution in a different process and performs points-to analysis by interpreting sensitive instructions using the recorded execution context in a memory-safe manner. It checks runtime ICT targets against the analysis results to detect CFI violations. We apply uCFI to SPEC benchmarks and 2 servers (nginx and vsftpd) to evaluate its efficacy in enforcing UCT and its overhead. We also test uCFI against control-hijacking attacks, including 5 real-world exploits, 1 proof-of-concept COOP attack, and 2 synthesized attacks that bypass existing defenses. The results show that uCFI strictly enforces the UCT property for protected programs, successfully detects all attacks, and introduces less than 10% performance overhead.
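
A toy rendering of the UCT check (the data structures and names are invented; uCFI derives the target set by replaying recorded execution context through points-to analysis):

```python
# For each dynamic invocation of an indirect call site, the monitor's
# analysis must yield exactly one target, and the observed runtime target
# must equal it. Mere set membership, as in coarser CFI schemes, is not
# enough under UCT.

def uct_violation(points_to: set, observed) -> bool:
    # UCT demands |points_to| == 1 at every invocation.
    return len(points_to) != 1 or observed not in points_to

assert uct_violation({"handler_a"}, "handler_a") is False
assert uct_violation({"handler_a", "handler_b"}, "handler_a") is True  # too coarse
assert uct_violation({"handler_a"}, "evil_gadget") is True             # hijack
```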

@inproceedings{hu:ucfi,
  title        = {{Enforcing Unique Code Target Property for Control-Flow Integrity}},
  author       = {Hong Hu and Chenxiong Qian and Carter Yagemann and Simon Pak Ho Chung and William R. Harris and Taesoo Kim and Wenke Lee},
  booktitle    = {Proceedings of the 25th ACM Conference on Computer and Communications Security (CCS)},
  month        = oct,
  year         = 2018,
  address      = {Toronto, ON, Canada},
}
REPT: Reverse Debugging of Failures in Deployed Software Weidong Cui, Xinyang Ge, Baris Kasikci, Ben Niu, Upamanyu Sharma, Ruoyu Wang, and Insu Yun. OSDI 2018.

Debugging software failures in deployed systems is important because they impact real users and customers. However, debugging such failures is notoriously hard in practice because developers have to rely on limited information such as memory dumps. The execution history is usually unavailable because high-fidelity program tracing is not affordable in deployed systems.

In this paper, we present REPT, a practical system that enables reverse debugging of software failures in deployed systems. REPT reconstructs the execution history with high fidelity by combining online lightweight hardware tracing of a program's control flow with offline binary analysis that recovers its data flow. It is seemingly impossible to recover data values thousands of instructions before the failure due to information loss and concurrent execution. REPT tackles these challenges by constructing a partial execution order based on timestamps logged by hardware and iteratively performing forward and backward execution with error correction.

We design and implement REPT, deploy it on Microsoft Windows, and integrate it into Windows Debugger. We evaluate REPT on 16 real-world bugs and show that it can recover data values accurately (92% on average) and efficiently (less than 20 seconds) for these bugs. We also show that it enables effective reverse debugging for 14 bugs.
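
The iterative forward/backward idea can be illustrated with a heavily simplified toy (the two-instruction "ISA" and recovery rules here are inventions for exposition, not REPT's actual binary analysis): starting from the final register dump, a backward pass inverts reversible instructions and a forward pass re-propagates newly learned values, iterating to a fixpoint.

```python
# Toy data-value recovery over a known control-flow trace.

def recover(trace, final_regs):
    n = len(trace)
    states = [dict() for _ in range(n + 1)]   # register values before each insn
    states[n].update(final_regs)
    changed = True

    def learn(state, reg, val):               # fill unknowns only (monotone)
        nonlocal changed
        if val is not None and reg not in state:
            state[reg] = val
            changed = True

    def operand(state, x):                    # int = immediate, str = register
        return x if isinstance(x, int) else state.get(x)

    while changed:
        changed = False
        for i, (op, dst, src) in enumerate(trace):
            pre, post = states[i], states[i + 1]
            for r in set(pre) | set(post):    # untouched registers flow both ways
                if r != dst:
                    learn(post, r, pre.get(r))
                    learn(pre, r, post.get(r))
            if op == "mov":                   # dst := src
                learn(post, dst, operand(pre, src))
                if isinstance(src, str):
                    learn(pre, src, post.get(dst))
            elif op == "add":                 # dst += src (reversible)
                s = operand(pre, src)
                if s is not None:
                    if pre.get(dst) is not None:
                        learn(post, dst, pre[dst] + s)
                    if post.get(dst) is not None:
                        learn(pre, dst, post[dst] - s)
    return states[0]

trace = [("mov", "b", "a"), ("add", "a", 5), ("add", "b", 1)]
assert recover(trace, {"a": 12, "b": 8}) == {"a": 7}   # initial a recovered
```

The initial value of `b` stays unknown, correctly, because the `mov` destroyed it; REPT similarly reports only the values it can recover with confidence.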

@inproceedings{cui:rept,
  title        = {{REPT: Reverse Debugging of Failures in Deployed Software}},
  author       = {Weidong Cui and Xinyang Ge and Baris Kasikci and Ben Niu and Upamanyu Sharma and Ruoyu Wang and Insu Yun},
  booktitle    = {Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI)},
  month        = oct,
  year         = 2018,
  address      = {Carlsbad, CA},
}
QSYM: A Practical Concolic Execution Engine Tailored for Hybrid Fuzzing Insu Yun, Sangho Lee, Meng Xu, Yeongjin Jang, and Taesoo Kim. USENIX Security 2018.

Recently, hybrid fuzzing has been proposed to address the limitations of fuzzing and concolic execution by combining both approaches. The hybrid approach has shown its effectiveness on various synthetic benchmarks such as DARPA Cyber Grand Challenge (CGC) binaries, but it still fails to scale to finding bugs in complex, real-world software. We observed that the performance bottleneck of the existing concolic executor is the main limiting factor for its adoption beyond a small-scale study.

To overcome this problem, we design a fast concolic execution engine, called QSYM, to support hybrid fuzzing. The key idea is to tightly integrate the symbolic emulation with the native execution using dynamic binary translation, making it possible to implement more fine-grained, and therefore faster, instruction-level symbolic emulation. Additionally, QSYM loosens the strict soundness requirements of conventional concolic executors for better performance, while taking advantage of a faster fuzzer for validation, providing unprecedented opportunities for performance optimizations, e.g., optimistically solving constraints and pruning uninteresting basic blocks.

Our evaluation shows that QSYM does not just outperform state-of-the-art fuzzers (i.e., found 14× more bugs than VUzzer in the LAVA-M dataset, and outperformed Driller in 104 binaries out of 126), but also found 13 previously unknown security bugs in eight real-world programs like Dropbox Lepton, ffmpeg, and OpenJPEG, which have already been intensively tested by the state-of-the-art fuzzers, AFL and OSS-Fuzz.
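
Optimistic solving can be sketched with a toy brute-force "solver" (a stand-in for QSYM's actual SMT-based engine; all names here are invented): when the full path constraint is unsatisfiable, e.g., because earlier concretization over-constrained the path, solve only the last branch's constraint and let the fuzzer validate the candidate input.

```python
# Toy sketch of optimistic constraint solving in a hybrid fuzzing loop.

def brute_solve(constraints):
    """Find a byte satisfying all predicates, or None (toy SMT stand-in)."""
    for x in range(256):
        if all(c(x) for c in constraints):
            return x
    return None

def flip_last_branch(constraints):
    candidate = brute_solve(constraints)           # full, sound query first
    if candidate is None:                          # unsat or over-constrained:
        candidate = brute_solve(constraints[-1:])  # optimistic: last branch only
    return candidate

# Over-constrained path: x < 10 collected earlier, new branch wants x == 42.
path = [lambda x: x < 10, lambda x: x == 42]
assert flip_last_branch(path) == 42   # optimistic answer; the fuzzer validates it
```

The candidate may not actually follow the intended path, which is exactly why pairing with a fast fuzzer, rather than trusting the solver, makes the unsoundness cheap to tolerate.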

@inproceedings{yun:qsym,
  title        = {{QSYM: A Practical Concolic Execution Engine Tailored for Hybrid Fuzzing}},
  author       = {Insu Yun and Sangho Lee and Meng Xu and Yeongjin Jang and Taesoo Kim},
  booktitle    = {Proceedings of the 27th USENIX Security Symposium (Security)},
  month        = aug,
  year         = 2018,
  address      = {Baltimore, MD},
}
Enabling Refinable Cross-host Attack Investigation with Efficient Data Flow Tagging and Tracking Yang Ji, Sangho Lee, Mattia Fazzini, Joey Allen, Evan Downing, Taesoo Kim, Alessandro Orso, and Wenke Lee. USENIX Security 2018.

Investigating attacks across multiple hosts is challenging. The true dependencies between security-sensitive files, network endpoints, or memory objects from different hosts can be easily concealed by dependency explosion or undefined program behavior (e.g., memory corruption). Dynamic information flow tracking (DIFT) is a potential solution to this problem, but existing DIFT techniques only track information flow within a single host and lack an efficient mechanism to maintain and synchronize data flow tags globally across multiple hosts.

In this paper, we propose RTAG, an efficient data flow tagging and tracking mechanism that enables practical cross-host attack investigations. RTAG is based on three novel techniques. First, by using a record-and-replay technique, it decouples the dependencies between different data flow tags from the analysis, enabling lazy synchronization between independent and parallel DIFT instances on different hosts. Second, it takes advantage of system-call-level provenance information to calculate and allocate the optimal tag map in terms of memory consumption. Third, it embeds tag information into network packets to track cross-host data flows with less than 0.05% network bandwidth overhead. Evaluation results show that RTAG is able to recover the true data flows of realistic cross-host attack scenarios. Performance-wise, RTAG reduces the memory consumption of DIFT-based analysis by up to 90% and decreases the overall analysis time by 60-90% compared with previous investigation systems.
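
The third technique, carrying tags inside packets, can be sketched as follows (the wire format here, a 32-bit tag id prepended to the payload, is invented for illustration; RTAG's real encoding differs):

```python
# Hypothetical sketch of embedding a data-flow tag in a network packet so
# the receiving host's DIFT instance can re-associate the bytes with their
# provenance record.

import struct

def tag_send(payload: bytes, tag_id: int) -> bytes:
    return struct.pack("!I", tag_id) + payload     # 4-byte tag header

def tag_recv(packet: bytes):
    (tag_id,) = struct.unpack("!I", packet[:4])
    return tag_id, packet[4:]

wire = tag_send(b"secret-bytes", tag_id=7)
assert tag_recv(wire) == (7, b"secret-bytes")
```

A four-byte header on typical packet sizes is how an overhead well under 0.1% of bandwidth becomes plausible.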

@inproceedings{ji:rtag,
  title        = {{Enabling Refinable Cross-host Attack Investigation with Efficient Data Flow Tagging and Tracking}},
  author       = {Yang Ji and Sangho Lee and Mattia Fazzini and Joey Allen and Evan Downing and Taesoo Kim and Alessandro Orso and Wenke Lee},
  booktitle    = {Proceedings of the 27th USENIX Security Symposium (Security)},
  month        = aug,
  year         = 2018,
  address      = {Baltimore, MD},
}
On the Effectiveness of Kernel Debloating via Compile-time Configuration Mansour Alharthi, Hong Hu, Hyungon Moon, and Taesoo Kim. SALAD 2018.

The Linux kernel contains a large number of features that not all systems need, yet Linux distributors enable as many features as possible to make their distributions generic, leading to a severe bloating problem. Intuitively, we can use the existing configuration system to remove unnecessary features. However, it is unclear whether this system is adequate for kernel debloating. In this study, we analyze how much security benefit a user can obtain by debloating the kernel through compile-time configuration. Our study shows that the existing configuration system provides a convenient and effective vector for reducing the attack surface while producing a functional kernel. Given these results, we plan to spend more effort on building a secure kernel through compile-time debloating.

@inproceedings{alharthi:debloat-study,
  title        = {{On the Effectiveness of Kernel Debloating via Compile-time Configuration}},
  author       = {Mansour Alharthi and Hong Hu and Hyungon Moon and Taesoo Kim},
  booktitle    = {Proceedings of the 1st Workshop on SoftwAre debLoating And Delayering},
  month        = jul,
  year         = 2018,
  address      = {Amsterdam, Netherlands},
}
Scaling Guest OS Critical Sections with eCS Sanidhya Kashyap, Changwoo Min, and Taesoo Kim. ATC 2018.

Multi-core virtual machines (VMs) are now the norm in data center environments. However, one of the well-known problems that VMs suffer from is the vCPU scheduling problem, which causes poor scalability. More specifically, the symptoms of this problem appear as preemption problems in both under- and over-committed scenarios. Although prior research efforts attempted to alleviate these symptoms separately, they fail to address their common root cause: the semantic gap between the guest OS and the hypervisor, which allows a guest OS to be preempted while executing its own critical section, thereby degrading application scalability.

In this work, we strive to address all preemption problems together by bridging the semantic gap between guest OSes and the hypervisor: the hypervisor now knows whether guest OSes are running in critical sections, and a guest OS has the hypervisor's scheduling context. We annotate all critical sections using lightweight para-virtualized APIs, which we call enlightened critical sections (eCS), that provide scheduling hints to both the hypervisor and VMs. The hypervisor uses the hints to reschedule a vCPU, fundamentally overcoming the double-scheduling problem for these annotated critical sections, and VMs use the hypervisor-provided hints to further mitigate the blocked-waiter wake-up problem. Our evaluation results show that eCS guarantees the forward progress of a guest OS by 1) decreasing preemption counts by 85-100% while 2) improving the throughput of applications by up to 2.5x in an over-committed scenario and 1.6x in an under-committed scenario for various real-world workloads on an 80-core machine.
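
The hinting protocol can be sketched like this (the shared fields and the one-extra-slice policy are simplifications invented for illustration): the guest bumps a per-vCPU counter around critical sections, and the hypervisor consults it before preempting.

```python
# Minimal sketch of an eCS-style scheduling hint shared between guest and
# hypervisor.

class VCpu:
    def __init__(self):
        self.ecs_count = 0            # guest-written: nesting depth of eCS
        self.extra_slice_used = False

def ecs_enter(vcpu): vcpu.ecs_count += 1   # guest-side annotation
def ecs_exit(vcpu):  vcpu.ecs_count -= 1

def hypervisor_should_preempt(vcpu) -> bool:
    if vcpu.ecs_count > 0 and not vcpu.extra_slice_used:
        vcpu.extra_slice_used = True  # defer once, to bound the delay
        return False
    return True

v = VCpu()
ecs_enter(v)
assert hypervisor_should_preempt(v) is False   # inside eCS: preemption deferred
assert hypervisor_should_preempt(v) is True    # extra slice already granted
ecs_exit(v)
```

Granting at most one extra slice keeps a buggy or malicious guest from monopolizing the pCPU, which is why a simple counter suffices as the interface.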

@inproceedings{kashyap:ecs,
  title        = {{Scaling Guest OS Critical Sections with eCS}},
  author       = {Sanidhya Kashyap and Changwoo Min and Taesoo Kim},
  booktitle    = {Proceedings of the 2018 USENIX Annual Technical Conference (ATC)},
  month        = jul,
  year         = 2018,
  address      = {Boston, MA},
}
Precise and Scalable Detection of Double-Fetch Bugs in OS Kernels Meng Xu, Chenxiong Qian, Kangjie Lu, Michael Backes, and Taesoo Kim. IEEE SP 2018.

During system call execution, it is common for operating system kernels to read userspace memory multiple times (multi-reads). A critical bug may exist if the fetched userspace memory is subject to change across these reads, i.e., a race condition, which is known as a double-fetch bug. Prior works have attempted to detect these bugs both statically and dynamically. However, due to their improper assumptions and imprecise definitions of double-fetch bugs, their multi-read detection is inherently limited and suffers from significant false positives and false negatives. For example, their approaches are unable to support device emulation, inter-procedural analysis, loop handling, etc. More importantly, they completely leave the task of finding real double-fetch bugs in the haystack of multi-reads to manual verification, which is expensive, if possible at all.

In this paper, we first present a formal and precise definition of double-fetch bugs and then implement a static analysis system, Deadline, to automatically detect double-fetch bugs in OS kernels. Deadline uses static program analysis techniques to systematically find multi-reads throughout the kernel and employs specialized symbolic checking to vet each multi-read for double-fetch bugs. We apply Deadline to the Linux and FreeBSD kernels and find 23 new bugs in Linux and one new bug in FreeBSD. We further propose four generic strategies to patch and prevent double-fetch bugs based on our study and discussions with kernel maintainers.
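
The bug class itself is easy to demonstrate in miniature (Python stands in for kernel C here, and the racing user thread is simulated by an inline write; `copy_from_user` is a stand-in for the real primitive):

```python
# Illustration of the double-fetch pattern Deadline hunts for: the kernel
# reads a user-controlled length twice, and the value may change between
# the two fetches.

user_mem = {"len": 8, "data": b"A" * 64}

def copy_from_user(field):              # toy stand-in for the real primitive
    return user_mem[field]

def syscall_double_fetch():
    n = copy_from_user("len")           # first fetch: validation
    if n > 16:
        return None
    user_mem["len"] = 64                # racing user thread flips the value
    n = copy_from_user("len")           # second fetch: use (the classic bug)
    return copy_from_user("data")[:n]

def syscall_single_fetch():
    n = copy_from_user("len")           # fetch once; validate and use the copy
    if n > 16:
        return None
    user_mem["len"] = 64                # the race now has nothing to corrupt
    return copy_from_user("data")[:n]

user_mem["len"] = 8
assert len(syscall_double_fetch()) == 64   # over-read past the checked bound
user_mem["len"] = 8
assert len(syscall_single_fetch()) == 8    # patched: the bound holds
```

Fetching once and operating on the kernel-local copy is one of the generic patch strategies the paper discusses.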

@inproceedings{xu:deadline,
  title        = {{Precise and Scalable Detection of Double-Fetch Bugs in OS Kernels}},
  author       = {Meng Xu and Chenxiong Qian and Kangjie Lu and Michael Backes and Taesoo Kim},
  booktitle    = {Proceedings of the 39th IEEE Symposium on Security and Privacy (Oakland)},
  month        = may,
  year         = 2018,
  address      = {San Francisco, CA},
}
SOLROS: A Data-Centric Operating System Architecture for Heterogeneous Computing Changwoo Min, Woon-Hak Kang, Mohan Kumar, Sanidhya Kashyap, Steffen Maass, Heeseung Jo, and Taesoo Kim. EuroSys 2018.

We propose Solros, a new operating system architecture for heterogeneous systems that comprise fast host processors, slow but massively parallel co-processors, and fast I/O devices. The general consensus is that fully driving such a hardware system requires tight integration among processors and I/O devices. Thus, in the Solros architecture, a co-processor OS (data-plane OS) delegates its services, specifically I/O stacks, to the host OS (control-plane OS). The observation behind this design is that global coordination with system-wide knowledge (e.g., PCIe topology, the load of each co-processor) and the best use of heterogeneous processors are critical to achieving high performance. Hence, we fully harness these specialized processors by delegating complex I/O stacks to fast host processors, which leads to efficient global coordination at the level of the control-plane OS.

We developed Solros with Xeon Phi co-processors and implemented three core OS services: transport, file system, and network services. Our experimental results show significant performance improvement compared with the stock Xeon Phi running the Linux kernel. For example, Solros improves the throughput of file system and network operations by 19x and 7x, respectively. Moreover, it improves the performance of two realistic applications: 19x for text indexing and 2x for image search.

@inproceedings{min:solros,
  title        = {{SOLROS: A Data-Centric Operating System Architecture for Heterogeneous Computing}},
  author       = {Changwoo Min and Woon-Hak Kang and Mohan Kumar and Sanidhya Kashyap and Steffen Maass and Heeseung Jo and Taesoo Kim},
  booktitle    = {Proceedings of the 13th European Conference on Computer Systems (EuroSys)},
  month        = apr,
  year         = 2018,
  address      = {Porto, Portugal},
}
A Scalable Ordering Primitive For Multicore Machines Sanidhya Kashyap, Changwoo Min, Kangnyeon Kim, and Taesoo Kim. EuroSys 2018.

Timestamping is an essential building block for designing concurrency control mechanisms and concurrent data structures. Various algorithms either employ physical timestamping, assuming that they have access to synchronized clocks, or maintain a logical clock with the help of atomic instructions. Unfortunately, these approaches have two problems. First, hardware vendors do not guarantee that the available hardware clocks are exactly synchronized, which is difficult to achieve in practice. Second, atomic instructions are a deterrent to scalability because of cache-line contention. This paper addresses these problems by proposing and designing a scalable ordering primitive, called Ordo, that relies on invariant hardware clocks. Ordo not only enables the correct use of these clocks by providing the notion of a global hardware clock, but also frees various logical-timestamp-based algorithms from the burden of a software logical clock, while simplifying their design. We use the Ordo primitive to redesign 1) a concurrent data structure library that we apply to the Linux kernel; 2) a synchronization mechanism for concurrent programming; 3) two database concurrency control mechanisms; and 4) a clock-based software transactional memory algorithm. Our evaluation shows that clocks can indeed be unsynchronized on two architectures (Intel and ARM) and that Ordo generally improves the efficiency of several algorithms by 1.2-39.7x on various architectures.
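
The comparison rule at the heart of such a primitive can be sketched as follows (the constant and function names are invented; Ordo measures the real bound empirically per machine): two invariant-clock timestamps from different cores are ordered only when they differ by more than the maximum measured inter-core clock offset.

```python
# Sketch of an uncertainty-window timestamp comparison.

ORDO_BOUNDARY = 100   # hypothetical max inter-core clock offset, in cycles

def cmp_clock(t1: int, t2: int) -> int:
    """Return -1/1 when the order is certain, 0 when it is uncertain."""
    if t2 - t1 > ORDO_BOUNDARY:
        return -1          # t1 definitely earlier
    if t1 - t2 > ORDO_BOUNDARY:
        return 1           # t1 definitely later
    return 0               # within the uncertainty window: caller must retry

assert cmp_clock(1000, 2000) == -1
assert cmp_clock(2000, 1000) == 1
assert cmp_clock(1000, 1050) == 0
```

Admitting an explicit "uncertain" outcome is what lets algorithms use unsynchronized hardware clocks correctly without any atomic-instruction hot spot.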

@inproceedings{kashyap:ordo,
  title        = {{A Scalable Ordering Primitive For Multicore Machines}},
  author       = {Sanidhya Kashyap and Changwoo Min and Kangnyeon Kim and Taesoo Kim},
  booktitle    = {Proceedings of the 13th European Conference on Computer Systems (EuroSys)},
  month        = apr,
  year         = 2018,
  address      = {Porto, Portugal},
}
Kaleidoscope: Graph Analytics on Evolving Graphs Steffen Maass and Taesoo Kim. EuroDW 2018.

Large-scale graphs and their analytics are common in many applications such as social networks, data mining, and machine learning. Many approaches to operate on static graphs of sizes in the trillions of edges have been proposed and adopted. However, real-world graphs are evolving in nature and are constantly changing in many application domains. We observe that current systems to process evolving graphs show three problems: they impose a storage overhead when adding new edges and vertices, they involve a synchronous compaction step to keep their internal data structures optimized and they exhibit poor cache locality.

We propose Kaleidoscope, a graph processing engine for evolving graphs, that tackles these three problems through a localized graph data structure, which helps reduce both the storage overhead and the overhead of the synchronous compaction step. Kaleidoscope also enables the use of a locality-optimizing space-filling curve, the Hilbert order, when traversing the subgraphs built by Kaleidoscope, allowing for better cache locality.
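
The Hilbert-order traversal can be sketched with the standard point-to-index conversion (this is the textbook `xy2d` routine; treating an edge `(src, dst)` as a 2D point and the grid size `n` are illustrative choices, not Kaleidoscope's exact scheme):

```python
# Sorting edges by their Hilbert index keeps edges between nearby vertices
# close together in the traversal order, improving cache locality.

def xy2d(n, x, y):
    """Map point (x, y) on an n-by-n grid (n a power of two) to its Hilbert index."""
    d, s = 0, n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:                     # rotate the quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        s //= 2
    return d

def hilbert_sort(edges, n=1 << 16):
    return sorted(edges, key=lambda e: xy2d(n, e[0], e[1]))

# On a 2x2 grid the curve visits (0,0), (0,1), (1,1), (1,0) in order.
assert [xy2d(2, x, y) for x, y in [(0, 0), (0, 1), (1, 1), (1, 0)]] == [0, 1, 2, 3]
```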

@inproceedings{maass:kaleidoscope,
  title        = {{Kaleidoscope: Graph Analytics on Evolving Graphs}},
  author       = {Steffen Maass and Taesoo Kim},
  booktitle    = {12th EuroSys Doctoral Workshop (EuroDW)},
  month        = apr,
  year         = 2018,
  address      = {Porto, Portugal},
}
LATR: Lazy Translation Coherence Mohan Kumar, Steffen Maass, Sanidhya Kashyap, Ján Veselý, Zi Yan, Taesoo Kim, Abhishek Bhattacharjee, and Tushar Krishna. ASPLOS 2018.

We propose LATR (lazy TLB coherence), a software-based TLB shootdown mechanism that can alleviate the overhead of the synchronous TLB shootdown mechanism in existing operating systems. By handling TLB coherence in a lazy fashion, LATR avoids the expensive IPIs required for delivering a shootdown signal to remote cores, as well as the performance overhead of the associated interrupt handlers. Therefore, virtual memory operations, such as free and page-migration operations, can benefit significantly from LATR's mechanism. For example, LATR improves the latency of munmap by 70.8% on a 2-socket machine, a widely used configuration in modern data centers. Real-world, performance-critical applications such as web servers can also benefit from LATR: without any application-level changes, LATR improves Apache by 59.9% compared to Linux, and by 37.9% compared to ABIS, a highly optimized, state-of-the-art TLB coherence technique.
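
The lazy scheme can be sketched as follows (the data structures are invented for illustration; LATR's real implementation works on per-core kernel state and reclaims pages only after all cores have drained their entries):

```python
# Instead of sending an IPI and waiting, the unmapping core appends the
# stale entry to each remote core's lazy queue; remote cores drain the
# queue at their next scheduler tick / context switch.

class Core:
    def __init__(self):
        self.tlb = set()
        self.lazy_queue = []

def munmap_lazy(page, cores, local):
    local.tlb.discard(page)             # local entry is removed synchronously
    for c in cores:
        if c is not local:
            c.lazy_queue.append(page)   # no IPI, no busy-waiting

def on_context_switch(core):
    for page in core.lazy_queue:        # deferred invalidation
        core.tlb.discard(page)
    core.lazy_queue.clear()

a, b = Core(), Core()
a.tlb.add(0x1000); b.tlb.add(0x1000)
munmap_lazy(0x1000, [a, b], local=a)
assert 0x1000 in b.tlb                  # stale until b's next switch
on_context_switch(b)
assert 0x1000 not in b.tlb
```

The correctness burden shifts to not reusing the freed page until every remote core has drained its queue, which is the part the paper engineers carefully.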

@inproceedings{kumar:latr,
  title        = {{LATR: Lazy Translation Coherence}},
  author       = {Mohan Kumar and Steffen Maass and Sanidhya Kashyap and J\'{a}n Vesel\'{y} and Zi Yan and Taesoo Kim and Abhishek Bhattacharjee and
    Tushar Krishna},
  booktitle    = {Proceedings of the 23rd ACM International Conference on Architectural Support for Programming Languages and Operating Systems
    (ASPLOS)},
  month        = mar,
  year         = 2018,
  address      = {Williamsburg, VA},
}
Prevention of Cross-update Privacy Leaks on Android Beumjin Cho, Sangho Lee, Meng Xu, Sangwoo Ji, Taesoo Kim, and Jong Kim. Computer Science and Information Systems 15(1), January 2018.

Updating applications is an important mechanism to enhance their availability, functionality, and security. However, without careful considerations, application updates can bring other security problems. In this paper, we consider a novel attack that exploits application updates on Android: a cross-update privacy-leak attack called COUPLE. The COUPLE attack allows an application to secretly leak sensitive data through the cross-update interaction between its old and new versions; each version only has permissions and logic for either data collection or transmission to evade detection. We implement a runtime security system, BREAKUP, that prevents cross-update sensitive data transactions by tracking permission-use histories of individual applications. Evaluation results show that BREAKUP’s time overhead is below 5%. We further show the feasibility of the COUPLE attack by analyzing the versions of 2,009 applications (28,682 APKs).
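
A simplified rendering of the defense (the class, policy, and names are invented; BREAKUP's actual rules are richer): track which permission produced each stored data item across updates, and deny a transmission when the current version never held that permission, which is exactly the COUPLE pattern.

```python
# Hypothetical sketch of permission-use history tracking across app updates.

class App:
    def __init__(self, permissions):
        self.permissions = set(permissions)
        self.store = {}                      # data item -> permission that produced it

    def collect(self, data, permission):
        assert permission in self.permissions
        self.store[data] = permission        # history survives updates

    def update(self, new_permissions):
        self.permissions = set(new_permissions)

    def may_transmit(self, data):
        return self.store.get(data) in self.permissions

app = App({"READ_CONTACTS"})
app.collect("contacts.db", "READ_CONTACTS")
app.update({"INTERNET"})                     # v2 drops READ_CONTACTS, gains network
assert app.may_transmit("contacts.db") is False   # cross-update leak blocked
```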

@article{cho:breakup,
  title        = {{Prevention of Cross-update Privacy Leaks on Android}},
  author       = {Beumjin Cho and Sangho Lee and Meng Xu and Sangwoo Ji and Taesoo Kim and Jong Kim},
  journal      = {Computer Science and Information Systems},
  volume       = 15,
  number       = 1,
  pages        = {111--137},
  month        = jan,
  year         = 2018,
}

2017

SGX-Bomb: Locking Down the Processor via Rowhammer Attack Yeongjin Jang, Jaehyuk Lee, Sangho Lee, and Taesoo Kim. SysTEX 2017.

Intel Software Guard Extensions (SGX) provides a strongly isolated memory space, known as an enclave, for a user process, ensuring confidentiality and integrity against software and hardware attacks. Even the operating system and hypervisor cannot access the enclave because of the hardware-level isolation. Further, hardware attacks can neither disclose plaintext data from the enclave, because its memory is always encrypted, nor modify it, because its integrity is always verified using an integrity tree. When the processor detects any integrity violation, it locks itself to prevent further damage; that is, a system reboot is necessary. The processor lock seems to be a reasonable solution against such a powerful hardware attacker; however, if a software attacker has a way to trigger an integrity violation, the lock can result in a severe denial-of-service (DoS) attack.

In this paper, we introduce the SGX-Bomb attack, which launches a Rowhammer attack against enclave memory to trigger the processor lockdown. The SGX-Bomb attack is simple yet alarming. Inside an enclave, the attack first finds conflicting row addresses in the same DRAM bank, and then repeatedly accesses them while bypassing the cache. If arbitrary bit flips occur inside the enclave because of the Rowhammer attack, any read attempt to the enclave memory results in an integrity-check failure, so the processor is locked and the system must be rebooted. The SGX-Bomb attack is a serious threat especially to public cloud providers, who are supposed to run unknown enclave programs received from their clients, which might shut down servers shared with other clients. We evaluate the effectiveness of the SGX-Bomb attack in a real environment with DDR4 DRAM; it takes 283 s to hang the entire system with the default DRAM refresh rate of 64 ms.

@inproceedings{jang:sgx-bomb,
  title        = {{SGX-Bomb: Locking Down the Processor via Rowhammer Attack}},
  author       = {Yeongjin Jang and Jaehyuk Lee and Sangho Lee and Taesoo Kim},
  booktitle    = {Proceedings of the 2nd Workshop on System Software for Trusted Execution (SysTEX)},
  month        = oct,
  year         = 2017,
  address      = {Shanghai, China},
}
Designing New Operating Primitives to Improve Fuzzing Performance Wen Xu, Sanidhya Kashyap, Changwoo Min, and Taesoo Kim. CCS 2017.

In recent years, various organizations and communities have been devoting substantial computing resources to automated fuzzing, which has proven to be a highly effective approach to finding security bugs in complicated software and OS kernels. The performance of a fuzzer thus becomes crucial: a fuzzer that hits more security issues in less running time saves significant cost. Existing research focuses on producing input data that is likely to explore more states of a targeted application while ignoring the performance overhead that originates on the operating system side. In fact, the system components that generic fuzzers rely on cause serious performance bottlenecks. In particular, when fuzzing on multiple cores, the scalability of the state-of-the-art fuzzer (AFL) degrades by 24x because it overuses the file system as a communication channel, intensively invokes the fork() system call, and heavily interacts with the file system.

In this paper, we design and implement three new operating primitives specialized for fuzzing that solve these performance bottlenecks and achieve scalable performance on multi-core machines. Our experiment shows that the proposed primitives speed up AFL and LibFuzzer by 6.1 to 28.9x and 1.1 to 735.7x, respectively, on the overall number of executions per second when targeting Google's fuzzer test suite with 120 cores. In addition, the primitives improve AFL's throughput up to 7.7x with 30 cores, which is a more common setting in data centers. Our fuzzer-agnostic primitives can be easily applied to any fuzzer with fundamental performance improvement and directly benefit large-scale fuzzing and cloud-based fuzzing services.
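
One of the ideas, replacing a fork() per execution with snapshot-and-rollback of the target's state, has a rough in-process analogue in Python (the `Target` class and `fuzz_with_snapshot` are inventions for illustration; the real primitive snapshots process state in the kernel):

```python
# Save the mutable state of the target once, run a test case, and roll the
# state back, instead of forking a fresh process for every execution.

import copy

class Target:
    def __init__(self):
        self.state = {"counter": 0, "seen": set()}
    def run(self, data: bytes) -> bool:
        self.state["counter"] += 1
        self.state["seen"].add(data)
        return data.startswith(b"CRASH")       # toy "crash" condition

def fuzz_with_snapshot(target, inputs):
    snapshot = copy.deepcopy(target.state)     # take the snapshot once
    crashes = []
    for data in inputs:
        if target.run(data):
            crashes.append(data)
        target.state = copy.deepcopy(snapshot) # roll back instead of re-forking
    return crashes

t = Target()
assert fuzz_with_snapshot(t, [b"ok", b"CRASH1"]) == [b"CRASH1"]
assert t.state["counter"] == 0                 # state fully restored
```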

@inproceedings{xu:os-fuzz,
  title        = {{Designing New Operating Primitives to Improve Fuzzing Performance}},
  author       = {Wen Xu and Sanidhya Kashyap and Changwoo Min and Taesoo Kim},
  booktitle    = {Proceedings of the 24th ACM Conference on Computer and Communications Security (CCS)},
  month        = oct,
  year         = 2017,
  address      = {Dallas, TX},
}
RAIN: Refinable Attack Investigation with On-demand Inter-Process Information Flow Tracking Yang Ji, Sangho Lee, Evan Downing, Weiren Wang, Mattia Fazzini, Taesoo Kim, Alessandro Orso, and Wenke Lee. CCS 2017.

As modern attacks become more stealthy and persistent, detecting or preventing them at their early stages becomes virtually impossible. Instead, an attack investigation or provenance system aims to continuously monitor and log interesting system events with minimal overhead. Later, if the system observes any anomalous behavior, it analyzes the log to identify who initiated the attack and which resources were affected, and then assesses and recovers from any damage incurred. However, because of a fundamental tradeoff between log granularity and system performance, existing systems typically either record system-call events without the detailed program-level activities (e.g., memory operations) required for accurately reconstructing attack causality, or demand that every monitored program be instrumented to provide program-level information.

To address this issue, we propose RAIN, a Refinable Attack INvestigation system based on a record-replay technology that records system-call events during runtime and performs instruction-level dynamic information flow tracking (DIFT) during on-demand process replay. Instead of replaying every process with DIFT, RAIN conducts system-call-level reachability analysis to filter out unrelated processes and to minimize the number of processes to be replayed, making inter-process DIFT feasible. Evaluation results show that RAIN effectively prunes out unrelated processes and determines attack causality with negligible false positive rates. In addition, the runtime overhead of RAIN is similar to existing system-call level provenance systems and its analysis overhead is much smaller than full-system DIFT.
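
The pruning step can be sketched as plain graph reachability (the graph encoding below, with `proc:`/`file:`/`sock:` nodes, is an invented simplification of RAIN's provenance graph):

```python
# Starting from the process flagged by the detector, system-call-level
# reachability selects only processes connected through file/socket/process
# edges; only those are replayed under full DIFT.

from collections import deque

def reachable_processes(edges, start):
    """edges: dict node -> set of neighbor nodes (undirected causality)."""
    seen, todo = {start}, deque([start])
    while todo:
        for n in edges.get(todo.popleft(), ()):
            if n not in seen:
                seen.add(n)
                todo.append(n)
    return {n for n in seen if n.startswith("proc:")}

edges = {
    "proc:malware": {"file:/tmp/drop"},
    "file:/tmp/drop": {"proc:malware", "proc:exfil"},
    "proc:exfil": {"file:/tmp/drop", "sock:10.0.0.1"},
    "proc:browser": {"file:/home/x"},        # unrelated: pruned away
}
assert reachable_processes(edges, "proc:malware") == {"proc:malware", "proc:exfil"}
```

Everything outside the reachable set never needs the expensive instruction-level replay, which is what makes inter-process DIFT feasible.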

@inproceedings{ji:rain,
  title        = {{RAIN: Refinable Attack Investigation with On-demand Inter-Process Information Flow Tracking}},
  author       = {Yang Ji and Sangho Lee and Evan Downing and Weiren Wang and Mattia Fazzini and Taesoo Kim and Alessandro Orso and Wenke Lee},
  booktitle    = {Proceedings of the 24th ACM Conference on Computer and Communications Security (CCS)},
  month        = oct,
  year         = 2017,
  address      = {Dallas, TX},
}
Checking Open-Source License Violation and 1-day Security Risk at Large Scale Ruian Duan, Ashish Bijlani, Meng Xu, Taesoo Kim, and Wenke Lee. CCS 2017.

With millions of apps available to users, the mobile app market is rapidly becoming very crowded. Given the intense competition, the time to market is a critical factor for the success and profitability of an app. In order to shorten the development cycle, developers often focus their efforts on the unique features and workflows of their apps and rely on third-party Open Source Software (OSS) for common features. Unfortunately, despite their benefits, careless use of OSS can introduce significant legal and security risks, which if ignored can not only jeopardize the security and privacy of end users but also expose app developers to high financial loss. However, tracking OSS components, their versions, and interdependencies can be very tedious and error-prone, particularly if an OSS component is imported with little to no knowledge of its provenance.

We therefore propose OSSPolice, a scalable and fully-automated tool for mobile app developers to quickly analyze their apps and identify free software license violations as well as usage of known vulnerable versions of OSS. OSSPolice introduces a novel hierarchical indexing scheme to achieve both high scalability and accuracy, and is capable of efficiently comparing similarities of app binaries against a database of hundreds of thousands of OSS sources (billions of lines of code). We populated OSSPolice with 60K C/C++ and 77K Java OSS sources and analyzed 1.6M free Google Play Store apps. Our results show that 1) over 40K apps potentially violate GPL/AGPL licensing terms, and 2) over 100K apps use known vulnerable versions of OSS. Further analysis shows that developers violate GPL/AGPL licensing terms due to lack of alternatives, and use vulnerable versions of OSS despite efforts from companies like Google to improve app security. OSSPolice is available on GitHub.

@inproceedings{duan:osspolice,
  title        = {{Checking Open-Source License Violation and 1-day Security Risk at Large Scale}},
  author       = {Ruian Duan and Ashish Bijlani and Meng Xu and Taesoo Kim and Wenke Lee},
  booktitle    = {Proceedings of the 24th ACM Conference on Computer and Communications Security (CCS)},
  month        = oct,
  year         = 2017,
  address      = {Dallas, TX},
}
FLSCHED: A Lockless and Lightweight Approach to OS Scheduler for Xeon Phi Heeseung Jo, Woonhak Kang, Changwoo Min, and Taesoo Kim. APSys 2017.

The number of cores in a system is crucial for maximizing the parallelism of a program. Processor manufacturers have increased the number of cores in a chip, so that the latest manycore processor has up to 76 physical cores and 304 hardware threads. On the other hand, the evolution of OS schedulers that manage processes in such systems has been slow to keep up with emerging manycore processors.

In this paper, we show how much CFS, the default Linux scheduler, can degrade the performance of parallel applications on manycore processors (e.g., Intel Xeon Phi). Then, we propose a novel scheduler named FLsched, which is designed as a lockless implementation with fewer context switches and more efficient scheduling decisions. In our evaluations on Xeon Phi, FLsched outperforms CFS by up to 1.73x for HPC applications and 3.12x for micro-benchmarks.

@inproceedings{jo:flsched,
  title        = {{FLSCHED: A Lockless and Lightweight Approach to OS Scheduler for Xeon Phi}},
  author       = {Heeseung Jo and Woonhak Kang and Changwoo Min and Taesoo Kim},
  booktitle    = {Proceedings of the 8th Asia-Pacific Workshop on Systems (APSys)},
  month        = sep,
  year         = 2017,
  address      = {Mumbai, India},
}
Building Trust in the User I/O in Computer Systems Yeongjin Jang. Ph.D. thesis, Georgia Institute of Technology, August 2017.

User input plays an essential role in computer security because it can control system behavior and make security decisions in the system. System output to users, or user output, is also important because it often contains security-critical information whose integrity and confidentiality must be protected, such as passwords and users’ private data. Despite the importance of user input and output (I/O), modern computer systems often fail to provide the necessary security guarantees for them, which can result in serious security breaches.

This dissertation aims to build trust in the user I/O in computer systems to keep the systems secure from attacks on the user I/O. To this end, we analyze the user I/O paths on popular platforms including desktop operating systems, mobile operating systems, and trusted execution environments such as Intel SGX, and identify that threats and attacks on the user I/O can be blocked by guaranteeing three key security properties of user I/O: integrity, confidentiality, and authenticity.

First, GYRUS addresses the integrity of user input by matching the user’s original input with the content of outgoing network traffic to authorize user-intended network transactions. Second, M-AEGIS addresses the confidentiality of user I/O by implementing an encryption layer on top of the user interface layer that provides user-to-user encryption. Third, the A11Y ATTACK addresses the importance of verifying user I/O authenticity by demonstrating twelve new attacks, all of which stem from missing proper security checks that verify input sources and output destinations on alternative user I/O paths in operating systems. Finally, to establish trust in the user I/O in a commodity computer system, I built a system called SGX-USB, which combines all three security properties to ensure the assurance of user I/O. SGX-USB establishes a trusted communication channel between the USB controller and an enclave instance of Intel SGX. The implemented system supports common user input devices such as a keyboard and a mouse over the trusted channel, which guarantees the assurance of user input.

Having assurance in user I/O allows the computer system to securely handle commands and data from the user by eliminating attack pathways to a system’s I/O paths.

@phdthesis{jang:thesis,
  title        = {{Building Trust in the User I/O in Computer Systems}},
  author       = {Yeongjin Jang},
  school       = {{Georgia Institute of Technology}},
  year         = 2017,
  month        = aug,
}
Securing Software Systems by Preventing Information Leaks Kangjie Lu. Ph.D. thesis, Georgia Institute of Technology, August 2017.

Foundational software systems such as operating systems and web servers are implemented in unsafe programming languages for efficiency, and system designers often prioritize performance over security. Hence, these systems inherently suffer from a variety of vulnerabilities and insecure designs that have been exploited by adversaries to launch critical system attacks. Two typical goals of these attacks are to leak sensitive data and to control victim systems.

This thesis aims to defeat both data leaks and control attacks. We first identify that, in modern systems, preventing information leaks can be a general defense that not only stops data leaks but also defeats control attacks. We then investigate three approaches to preventing information leaks: eliminating information-leak vulnerabilities, re-designing system mechanisms against information leaks, and protecting certain sensitive data from information leaks. We have developed multiple tools for each approach. While automatically and reliably securing complex systems, all these tools impose negligible performance overhead. Our extensive evaluation results show that preventing information leaks can be a general and practical approach to securing complex software systems.

@phdthesis{lu:thesis,
  title        = {{Securing Software Systems by Preventing Information Leaks}},
  author       = {Kangjie Lu},
  school       = {{Georgia Institute of Technology}},
  year         = 2017,
  month        = aug,
}
Inferring Fine-grained Control Flow Inside SGX Enclaves with Branch Shadowing Sangho Lee, Ming-Wei Shih, Prasun Gera, Taesoo Kim, Hyesoon Kim, and Marcus Peinado. USENIX Security 2017.

Intel has introduced a hardware-based trusted execution environment, Intel Software Guard Extensions (SGX), that provides a secure, isolated execution environment, or enclave, for a user program without trusting any underlying software (e.g., an operating system) or firmware. Researchers have demonstrated that SGX is vulnerable to a page-fault-based attack. However, the attack only reveals page-level memory accesses within an enclave.

In this paper, we explore a new, yet critical, side-channel attack, branch shadowing, that reveals fine-grained control flows (branch granularity) in an enclave. The root cause of this attack is that SGX does not clear branch history when switching from enclave to non-enclave mode, leaving fine-grained traces for the outside world to observe, which gives rise to a branch-prediction side channel. However, exploiting this channel in practice is challenging because 1) measuring branch execution time is too noisy for distinguishing fine-grained control-flow changes and 2) pausing an enclave right after it has executed the code block we target requires sophisticated control. To overcome these challenges, we develop two novel exploitation techniques: 1) a last branch record (LBR)-based history-inferring technique and 2) an advanced programmable interrupt controller (APIC)-based technique to control the execution of an enclave in a fine-grained manner. An evaluation against RSA shows that our attack infers each private key bit with 99.8% accuracy. Finally, we thoroughly study the feasibility of hardware-based solutions (i.e., branch history flushing) and propose a software-based approach that mitigates the attack.

@inproceedings{lee:sgx-branch-shadow,
  title        = {{Inferring Fine-grained Control Flow Inside SGX Enclaves with Branch Shadowing}},
  author       = {Sangho Lee and Ming-Wei Shih and Prasun Gera and Taesoo Kim and Hyesoon Kim and Marcus Peinado},
  booktitle    = {Proceedings of the 26th USENIX Security Symposium (Security)},
  month        = aug,
  year         = 2017,
  address      = {Vancouver, Canada},
}
Hacking in Darkness: Return-oriented Programming against Secure Enclaves Jaehyuk Lee, Jinsoo Jang, Yeongjin Jang, Nohyun Kwak, Yeseul Choi, Changho Choi, Taesoo Kim, Marcus Peinado, and Brent B. Kang. USENIX Security 2017.

Intel Software Guard Extensions (SGX) is a hardware-based Trusted Execution Environment (TEE) that is widely seen as a promising solution to traditional security threats. While SGX promises strong protection to bug-free software, decades of experience show that we have to expect vulnerabilities in any non-trivial application. In a traditional environment, such vulnerabilities often allow attackers to take complete control of vulnerable systems. Efforts to evaluate the security of SGX have focused on side-channels. So far, neither a practical attack against a vulnerability in enclave code nor a proof-of-concept attack scenario has been demonstrated. Thus, a fundamental question remains: What are the consequences and dangers of having a memory corruption vulnerability in enclave code?

To answer this question, we comprehensively analyze exploitation techniques against vulnerabilities inside enclaves. We demonstrate a practical exploitation technique, called Dark-ROP, which can completely disarm the security guarantees of SGX. Dark-ROP exploits a memory corruption vulnerability in the enclave software through return-oriented programming (ROP). However, Dark-ROP differs significantly from traditional ROP attacks because the target code runs under solid hardware protection. We overcome SGX-specific properties and obstacles by formulating a novel ROP attack scheme against SGX under practical assumptions. Specifically, we build several oracles that inform the attacker about the status of enclave execution. This enables the attacker to launch the ROP attack while both code and data are hidden. In addition, we exfiltrate the enclave’s code and data into a shadow application to fully control the execution environment. This shadow application emulates the enclave under the complete control of the attacker, using the enclave (through ROP calls) only to perform SGX operations such as reading the enclave’s SGX crypto keys.

The consequences of Dark-ROP are alarming; the attacker can completely breach the enclave’s memory protections and trick the SGX hardware into disclosing the enclave’s encryption keys and producing measurement reports that defeat remote attestation. This result strongly suggests that SGX research should focus more on traditional security mitigations rather than on making enclave development more convenient by expanding the trusted computing base and the attack surface (e.g., Graphene, Haven).

@inproceedings{lee:darkrop,
  title        = {{Hacking in Darkness: Return-oriented Programming against Secure Enclaves}},
  author       = {Jaehyuk Lee and Jinsoo Jang and Yeongjin Jang and Nohyun Kwak and Yeseul Choi and Changho Choi and Taesoo Kim and Marcus Peinado and
    Brent B. Kang},
  booktitle    = {Proceedings of the 26th USENIX Security Symposium (Security)},
  month        = aug,
  year         = 2017,
  address      = {Vancouver, Canada},
}
Efficient Protection of Path-Sensitive Control Security Ren Ding, Chenxiong Qian, Chengyu Song, Bill Harris, Taesoo Kim, and Wenke Lee. USENIX Security 2017.

Control-Flow Integrity (CFI), as a means to prevent control-flow hijacking attacks, enforces that each instruction transfers control to an address in a set of valid targets. The security guarantee of CFI thus depends on the definition of valid targets, which conventionally are defined as the result of a static analysis. Unfortunately, previous research has demonstrated that such a definition, and thus any implementation that enforces it, still allows practical control-flow attacks.
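
The gap between the two definitions of valid targets can be seen in a toy sketch (hypothetical handler names; Python used for illustration only, not how CFI is actually enforced): a static points-to set admits every target the analysis cannot rule out, while a path-sensitive set admits only the targets feasible on the path actually executed.

```python
def call_allowed_static(target, static_points_to):
    """Conventional CFI: allow any target in the statically computed set."""
    return target in static_points_to

def call_allowed_path_sensitive(target, runtime_points_to):
    """Path-sensitive CFI: allow only the targets the pointer can hold on
    the execution path taken so far (a singleton here for clarity)."""
    return target in runtime_points_to

static_set = {"handler_a", "handler_b"}  # both are address-taken somewhere
path_set = {"handler_a"}                 # only handler_a was assigned on this path

# A hijack that redirects the call to handler_b passes the static check
# but is rejected by the path-sensitive one.
assert call_allowed_static("handler_b", static_set)
assert not call_allowed_path_sensitive("handler_b", path_set)
```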

In this work, we present a path-sensitive variation of CFI that utilizes a runtime path-sensitive points-to analysis to compute the legitimate control transfer targets. We have designed and implemented a runtime environment, PITTYPAT, that enforces path-sensitive CFI efficiently by combining commodity, low-overhead hardware monitoring and a novel runtime points-to analysis. Our formal analysis and empirical evaluation demonstrate that, compared to CFI based on static analysis, PITTYPAT ensures that applications satisfy stronger security guarantees, with acceptable overhead for security-critical contexts.

@inproceedings{ding:pittypat,
  title        = {{Efficient Protection of Path-Sensitive Control Security}},
  author       = {Ren Ding and Chenxiong Qian and Chengyu Song and Bill Harris and Taesoo Kim and Wenke Lee},
  booktitle    = {Proceedings of the 26th USENIX Security Symposium (Security)},
  month        = aug,
  year         = 2017,
  address      = {Vancouver, Canada},
}
PlatPal: Detecting Malicious Documents with Platform Diversity Meng Xu and Taesoo Kim. USENIX Security 2017.

Due to the continued exploitation of Adobe Reader, malicious document (maldoc) detection has become a pressing problem. Although many solutions have been proposed, recent works have highlighted some common drawbacks, such as parser-confusion and classifier-evasion attacks.

In response to this, we propose a new perspective for maldoc detection: platform diversity. In particular, we identify eight factors in OS design and implementation that could cause behavioral divergences under attack, ranging from syscall semantics (more obvious) to heap object metadata structure (more subtle) and further show how they can thwart attackers from finding bugs, exploiting bugs, or performing malicious activities.

We further prototype PlatPal to systematically harvest platform diversity. PlatPal hooks into Adobe Reader to trace internal PDF processing and also uses sandboxed execution to capture a maldoc's impact on the host system. Execution traces on different platforms are compared, and maldoc detection is based on the observation that a benign document behaves the same across platforms, while a maldoc behaves differently during exploitation. Evaluations show that PlatPal raises no false alarms in benign samples, detects a variety of behavioral discrepancies in malicious samples, and is a scalable and practical solution.

@inproceedings{xu:platpal,
  title        = {{PlatPal: Detecting Malicious Documents with Platform Diversity}},
  author       = {Meng Xu and Taesoo Kim},
  booktitle    = {Proceedings of the 26th USENIX Security Symposium (Security)},
  month        = aug,
  year         = 2017,
  address      = {Vancouver, Canada},
}
AVPASS: Leaking and Bypassing Antivirus Detection Model Automatically Jinho Jung, Chanil Jeon, Max Wolotsky, Insu Yun, and Taesoo Kim. Black Hat USA 2017.

AVPASS is a tool for leaking the detection model of Android antivirus (AV) programs and bypassing AV detection by using the leaked information coupled with APK perturbation techniques. AVPASS is able to infer not only the detection features but also the hierarchy of detection rule chains. With help from the leaked model and the built-in APK perturbation functions, AVPASS is able to disguise any Android malware as a benign application. Furthermore, using our novel additive mode, AVPASS supports safe querying and guarantees that one can test whether an application will be detected by the AV without sending the whole application or its core parts. As a result, AVPASS leaked significant detection features of commercial AVs and achieved almost zero detection from VirusTotal when tested with more than 5,000 malware samples.

In this talk, we present the entire pipeline of the APK perturbation process, leaking model process, and auto-bypassing process. In addition, we show findings about commercial AVs, including their detection features and hierarchy, and inform the attendees about the potential weaknesses of modern AVs.

AVPASS will be demonstrated, showing that it precisely modifies real-world malware and allows it to bypass all AVs that follow the leaked model. AVPASS will be released with every tool that we have built, including the original source code and the related test data, to enable researchers to replicate the research on their own.

@inproceedings{jung:avpass,
  title        = {{AVPASS: Leaking and Bypassing Antivirus Detection Model Automatically}},
  author       = {Jinho Jung and Chanil Jeon and Max Wolotsky and Insu Yun and Taesoo Kim},
  booktitle    = {Black Hat USA Briefings (Black Hat USA)},
  month        = jul,
  year         = 2017,
  address      = {Las Vegas, NV},
}
Scalable NUMA-aware Blocking Synchronization Primitives Sanidhya Kashyap, Changwoo Min, and Taesoo Kim. ATC 2017.

Application scalability is a critical aspect of efficiently using NUMA machines with many cores. To achieve that, various techniques ranging from task placement to data sharding are used in practice. However, from the perspective of an operating system, these techniques often do not work as expected because various subsystems in the OS interact and share data structures among themselves, resulting in scalability bottlenecks. Although current OSes attempt to tackle this problem by introducing a wide range of synchronization primitives such as spinlocks and mutexes, the widely used synchronization mechanisms are not designed to handle both under- and over-subscribed scenarios in a scalable fashion. In particular, the current blocking synchronization primitives that are designed to address both scenarios are NUMA-oblivious, meaning that they suffer from cache-line contention in an under-subscribed situation and, even worse, inherently incur long scheduler intervention, which leads to sub-optimal performance in an over-subscribed situation.

In this work, we present several design choices to implement scalable blocking synchronization primitives that can address both under- and over-subscribed scenarios. Such design decisions include memory-efficient NUMA-aware locks (favorable for deployment) and scheduling-aware, scalable parking and wake-up strategies. To validate our design choices, we implement two new blocking synchronization primitives, which are variants of mutex and read-write semaphore in the Linux kernel. Our evaluation shows that these locks can scale real-world applications by 1.2--1.6X and some of the file system operations up to 4.7X in both under- and over-subscribed scenarios. Moreover, they use 1.5--10X less memory than the state-of-the-art NUMA-aware locks on a 120-core machine.

@inproceedings{kashyap:cst,
  title        = {{Scalable NUMA-aware Blocking Synchronization Primitives}},
  author       = {Sanidhya Kashyap and Changwoo Min and Taesoo Kim},
  booktitle    = {Proceedings of the 2017 USENIX Annual Technical Conference (ATC)},
  month        = jul,
  year         = 2017,
  address      = {Santa Clara, CA},
}
CAB-Fuzz: Practical Concolic Testing Techniques for COTS Operating Systems Su Yong Kim, Sangho Lee, Insu Yun, Wen Xu, Byoungyoung Lee, Youngtae Yun, and Taesoo Kim. ATC 2017.

Discovering the security vulnerabilities of commercial off-the-shelf (COTS) operating systems (OSes) is challenging because they not only are huge and complex, but also lack detailed debug information. Concolic testing, which generates all feasible inputs of a program by using symbolic execution and tests the program with the generated inputs, is one of the most promising approaches to solve this problem. Unfortunately, the state-of-the-art concolic testing tools do not scale well for testing COTS OSes because of state explosion. Indeed, they often fail to find a single bug (or crash) in COTS OSes despite their long execution time.

In this paper, we propose CAB-Fuzz (Context-Aware and Boundary-focused), a practical concolic testing tool to quickly explore interesting paths that are highly likely to trigger real bugs without debug information. First, CAB-Fuzz prioritizes the boundary states of arrays and loops, inspired by the fact that many vulnerabilities originate from a lack of proper boundary checks. Second, CAB-Fuzz exploits real programs interacting with COTS OSes to construct proper contexts to explore deep and complex kernel states without debug information. We applied CAB-Fuzz to Windows 7 and Windows Server 2008 and found 21 undisclosed unique crashes, including two local privilege escalation vulnerabilities (CVE-2015-6098 and CVE-2016-0040) and one information disclosure vulnerability in a cryptography driver (CVE-2016-7219). CAB-Fuzz found vulnerabilities that are non-trivial to discover; five vulnerabilities have existed for 14 years, and we could trigger them even in the initial version of Windows XP (August 2001).

@inproceedings{kim:cab-fuzz,
  title        = {{CAB-Fuzz: Practical Concolic Testing Techniques for COTS Operating Systems}},
  author       = {Su Yong Kim and Sangho Lee and Insu Yun and Wen Xu and Byoungyoung Lee and Youngtae Yun and Taesoo Kim},
  booktitle    = {Proceedings of the 2017 USENIX Annual Technical Conference (ATC)},
  month        = jul,
  year         = 2017,
  address      = {Santa Clara, CA},
}
Bunshin: Compositing Security Mechanisms through Diversification Meng Xu, Kangjie Lu, Taesoo Kim, and Wenke Lee. ATC 2017.

A number of security mechanisms have been proposed to harden programs written in unsafe languages, each of which mitigates a specific type of memory error. Intuitively, enforcing multiple security mechanisms on a target program will improve its overall security. However, this is not yet a viable approach in practice because the execution slowdown caused by various security mechanisms is often non-linearly accumulated, making the combined protection prohibitively expensive; further, most security mechanisms are designed for independent or isolated uses and thus are often in conflict with each other, making it impossible to fuse them in a straightforward way.

In this paper, we present Bunshin, an N-version based system that enables different and even conflicting security mechanisms to be combined to secure a program while at the same time reducing the execution slowdown. In particular, we propose an automated mechanism to distribute runtime security checks in multiple program variants in such a way that conflicts between security checks are inherently eliminated and execution slowdown is minimized with parallel execution. We also present an N-version execution engine to seamlessly synchronize these variants so that all distributed security checks work together to guarantee the security of a target program.

@inproceedings{xu:bunshin,
  title        = {{Bunshin: Compositing Security Mechanisms through Diversification}},
  author       = {Meng Xu and Kangjie Lu and Taesoo Kim and Wenke Lee},
  booktitle    = {Proceedings of the 2017 USENIX Annual Technical Conference (ATC)},
  month        = jul,
  year         = 2017,
  address      = {Santa Clara, CA},
}
Mosaic: Processing a Trillion-Edge Graph on a Single Machine Steffen Maass, Changwoo Min, Sanidhya Kashyap, Woonhak Kang, Mohan Kumar, and Taesoo Kim. EuroSys 2017.

Processing a one-trillion-edge graph has recently been demonstrated by distributed graph engines running on clusters of tens to hundreds of nodes. In this paper, we employ a single heterogeneous machine with fast storage media (e.g., NVMe SSD) and massively parallel coprocessors (e.g., Xeon Phi) to reach similar dimensions. By fully exploiting the heterogeneous devices, we design a new graph processing engine, named MOSAIC, for a single machine. We propose a new locality-optimizing, space-efficient graph representation, Hilbert-ordered tiles, and a hybrid execution model that enables vertex-centric operations in fast host processors and edge-centric operations in massively parallel coprocessors. Our evaluation shows that for smaller graphs, MOSAIC consistently outperforms other state-of-the-art out-of-core engines by 3.2–58.6x and shows comparable performance to distributed graph engines. Furthermore, MOSAIC can complete one iteration of the PageRank algorithm on a trillion-edge graph in 21 minutes, outperforming a distributed disk-based engine by 9.2x.
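
MOSAIC's Hilbert-ordered tiles rely on the locality of the Hilbert space-filling curve: cells adjacent along the curve are adjacent in the 2D adjacency matrix, so ordering tiles by Hilbert index keeps related edges close together on storage. Below is the textbook coordinate-to-index conversion (the standard iterative algorithm, sketched in Python for illustration; not MOSAIC's actual tile code):

```python
def xy_to_hilbert(n, x, y):
    """Map cell (x, y) of an n x n grid (n a power of two) to its
    position along the Hilbert curve (standard iterative algorithm)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the curve pattern repeats recursively.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d

# Sorting (source, destination) cells by Hilbert index visits the
# adjacency matrix in a locality-preserving order.
order = sorted(
    ((src, dst) for src in range(4) for dst in range(4)),
    key=lambda e: xy_to_hilbert(4, e[0], e[1]),
)
```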

@inproceedings{maass:mosaic,
  title        = {{Mosaic: Processing a Trillion-Edge Graph on a Single Machine}},
  author       = {Steffen Maass and Changwoo Min and Sanidhya Kashyap and Woonhak Kang and Mohan Kumar and Taesoo Kim},
  booktitle    = {Proceedings of the 12th European Conference on Computer Systems (EuroSys)},
  month        = apr,
  year         = 2017,
  address      = {Belgrade, Serbia},
}
Enhancing Security and Privacy of Tor's Ecosystem by using Trusted Execution Environments Seongmin Kim, Juhyeng Han, Jaehyeong Ha, Taesoo Kim, and Dongsu Han. NSDI 2017.

With Tor being a popular anonymity network, many attacks have been proposed to break its anonymity or leak information of a private communication on Tor. However, guaranteeing complete privacy in the face of an adversary on Tor is especially difficult because Tor relays are under complete control of world-wide volunteers. Currently, one can gain private information, such as circuit identifiers and hidden service identifiers, by running Tor relays and can even modify their behaviors with malicious intent.

This paper presents a practical approach to effectively enhancing the security and privacy of Tor by utilizing Intel SGX, a commodity trusted execution environment. We present a design and implementation of Tor, called SGX-Tor, that prevents code modification and limits the information exposed to untrusted parties. We demonstrate that our approach is practical and effectively reduces the power of an adversary to a traditional network-level adversary. Finally, SGX-Tor incurs moderate performance overhead; the end-to-end latency and throughput overheads for HTTP connections are 3.9% and 11.9%, respectively.

@inproceedings{kim:sgx-tor,
  title        = {{Enhancing Security and Privacy of Tor's Ecosystem by using Trusted Execution Environments}},
  author       = {Seongmin Kim and Juhyeng Han and Jaehyeong Ha and Taesoo Kim and Dongsu Han},
  booktitle    = {Proceedings of the 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI)},
  month        = mar,
  year         = 2017,
  address      = {Boston, MA},
}
SGX-Shield: Enabling Address Space Layout Randomization for SGX Programs Jaebaek Seo, Byoungyoung Lee, Sungmin Kim, Ming-Wei Shih, Insik Shin, Dongsu Han, and Taesoo Kim. NDSS 2017.

Traditional execution environments deploy Address Space Layout Randomization (ASLR) to defend against memory corruption attacks. However, Intel Software Guard Extension (SGX), a new trusted execution environment designed to serve security-critical applications on the cloud, lacks such an effective, well-studied feature. In fact, we find that applying ASLR to SGX programs raises non-trivial issues beyond simple engineering, as it conveys fundamental challenges for a number of reasons: 1) SGX is designed to defeat a stronger adversary than the traditional model, which requires the address space layout to be hidden from the kernel; 2) the limited memory uses in SGX programs present a new challenge in providing a sufficient degree of entropy; 3) remote attestation conflicts with the dynamic relocation required for ASLR; and 4) the SGX specification relies on known and fixed addresses for key data structures that cannot be randomized.

This paper presents SGX-Shield, a new ASLR scheme designed for SGX environments. SGX-Shield is built on a secure in-enclave loader to secretly bootstrap the memory space layout with a finer-grained randomization. To be compatible with SGX hardware (e.g., remote attestation, fixed addresses), SGX-Shield is designed with a software-based data execution protection mechanism through an LLVM-based compiler. We implement SGX-Shield and thoroughly evaluate it on real SGX hardware. It shows a high degree of randomness in memory layouts and stops memory corruption attacks with a high probability. SGX-Shield shows 7.61% performance overhead in running common micro-benchmarks and 2.25% overhead in running a more realistic workload of an HTTPS server.

@inproceedings{seo:sgx-shield,
  title        = {{SGX-Shield: Enabling Address Space Layout Randomization for SGX Programs}},
  author       = {Jaebaek Seo and Byoungyoung Lee and Sungmin Kim and Ming-Wei Shih and Insik Shin and Dongsu Han and Taesoo Kim},
  booktitle    = {Proceedings of the 2017 Annual Network and Distributed System Security Symposium (NDSS)},
  month        = feb,
  year         = 2017,
  address      = {San Diego, CA},
}
T-SGX: Eradicating Controlled-Channel Attacks Against Enclave Programs Ming-Wei Shih, Sangho Lee, Taesoo Kim, and Marcus Peinado. NDSS 2017.

Intel Software Guard Extensions (SGX) is a hardware-based trusted execution environment (TEE) that enables secure execution of a program in an isolated environment, an enclave. SGX hardware protects the running enclave against malicious software, including an operating system (OS), a hypervisor, and even low-level firmware. This strong security property allows the trustworthy execution of programs in a hostile environment, such as a public cloud, without trusting anyone (e.g., a cloud provider) between the enclave and the SGX hardware. However, recent studies have demonstrated that enclave programs are vulnerable to an accurate controlled-channel attack conducted by a malicious OS: since enclaves rely on the underlying OS, a curious or potentially malicious OS can observe a sequence of accessed addresses by intentionally triggering page faults.

In this paper, we propose T-SGX, a complete mitigation solution to the controlled-channel attack in terms of compatibility, performance, and ease of use. T-SGX relies on a commodity component of the Intel processor (since Haswell), Transactional Synchronization Extensions (TSX), which implements a restricted form of hardware transactional memory. As TSX is implemented as an extension (i.e., snooping the cache protocol), any unusual event, such as an exception or interrupt, that should be handled in its core component results in an abort of the ongoing transaction. One interesting property is that the TSX abort suppresses the notification of errors to the underlying OS, which means that the OS cannot know whether a page fault has occurred during the transaction. By exploiting this property, T-SGX can carefully isolate the effects of attempts to tap running enclaves, thereby completely eradicating the known controlled-channel attack.

We have implemented T-SGX as a compiler-level scheme that automatically transforms a normal enclave program into a secured one. We not only evaluate the security properties of T-SGX, but also demonstrate that it applies to all the previously demonstrated attack targets including libjpeg, Hunspell, and FreeType. In addition, we evaluate the performance of T-SGX by porting ten benchmark programs of nbench to the SGX environment. The results are promising; that is, T-SGX incurs on average 50% runtime overhead, which is an order of magnitude faster than state-of-the-art mitigation schemes.

@inproceedings{shih:tsgx,
  title        = {{T-SGX: Eradicating Controlled-Channel Attacks Against Enclave Programs}},
  author       = {Ming-Wei Shih and Sangho Lee and Taesoo Kim and Marcus Peinado},
  booktitle    = {Proceedings of the 2017 Annual Network and Distributed System Security Symposium (NDSS)},
  month        = feb,
  year         = 2017,
  address      = {San Diego, CA},
}

2016

Breaking Kernel Address Space Layout Randomization with Intel TSX Yeongjin Jang, Sangho Lee, and Taesoo Kim. CCS 2016.

Kernel hardening has been an important topic, since many applications and security mechanisms often consider the kernel as part of their Trusted Computing Base (TCB). Among various hardening techniques, Kernel Address Space Layout Randomization (KASLR) is the most effective and widely adopted defense mechanism that can practically mitigate various memory corruption vulnerabilities, such as buffer overflow and use-after-free. In principle, KASLR is secure as long as no memory leak vulnerability exists and high entropy is ensured.

In this paper, we introduce a highly stable timing attack against KASLR, called DrK, that can precisely de-randomize the memory layout of the kernel without violating any such assumptions. DrK exploits a hardware feature called Intel Transactional Synchronization Extension (TSX) that is readily available in most modern commodity CPUs. One surprising behavior of TSX, which is essentially the root cause of this security loophole, is that it aborts a transaction without notifying the underlying kernel even when the transaction fails due to a critical error, such as a page fault or an access violation, which traditionally requires kernel intervention. DrK turned this property into a precise timing channel that can determine the mapping status (i.e., mapped versus unmapped) and execution status (i.e., executable versus non-executable) of the privileged kernel address space. In addition to its surprising accuracy and precision, DrK is universally applicable to all OSes, even in virtualized environments, and generates no visible footprint, making it difficult to detect in practice. We demonstrated that DrK can break the KASLR of all major OSes (i.e., Windows, Linux, and OS X) with near-perfect accuracy in under a second. Finally, we propose potential countermeasures that can effectively prevent or mitigate the DrK attack.

We urge our community to be aware of the potential threat of having Intel TSX, which is present in most recent Intel CPUs -- 100% of workstation CPUs and 60% of high-end Intel CPUs since Skylake -- and is even available on Amazon EC2 (X1).

@inproceedings{jang:drk-ccs,
  title        = {{Breaking Kernel Address Space Layout Randomization with Intel TSX}},
  author       = {Yeongjin Jang and Sangho Lee and Taesoo Kim},
  booktitle    = {Proceedings of the 23rd ACM Conference on Computer and Communications Security (CCS)},
  month        = oct,
  year         = 2016,
  address      = {Vienna, Austria},
}
UniSan: Proactive Kernel Memory Initialization to Eliminate Data Leakages Kangjie Lu, Chengyu Song, Taesoo Kim, and Wenke Lee. CCS 2016.

The operating system kernel is the de facto trusted computing base for most computer systems. To secure the OS kernel, many security mechanisms, e.g., kASLR and StackGuard, have been increasingly deployed to defend against attacks (e.g., code reuse attacks). However, the effectiveness of these protections has been proven to be inadequate---many information leak vulnerabilities in the kernel leak the randomized pointer or canary, thus bypassing kASLR and StackGuard. Other sensitive data in the kernel, such as cryptographic keys and file caches, can also be leaked. According to our study, most kernel information leaks are caused by uninitialized data reads. Unfortunately, existing techniques like memory safety enforcement and dynamic access tracking tools are not adequate or efficient enough to mitigate this threat.

In this paper, we propose UniSan, a novel, compiler-based approach to eliminate all information leaks caused by uninitialized reads in the OS kernel. UniSan achieves this goal using byte-level, flow-sensitive, context-sensitive, and field-sensitive initialization analysis and reachability analysis to check whether an allocation has been fully initialized when it leaves kernel space; if not, it automatically instruments the kernel to initialize this allocation. UniSan's analyses are conservative to avoid false negatives and are robust by preserving the semantics of the OS kernel. We have implemented UniSan as passes in LLVM and applied it to the latest Linux kernel (x86_64) and Android kernel (AArch64). Our evaluation showed that UniSan can successfully prevent 43 known and many new uninitialized data leak vulnerabilities. Further, 19 new vulnerabilities in the latest kernels have been confirmed by Linux and Google. Our extensive performance evaluation with LMBench, ApacheBench, Android benchmarks, and the SPEC benchmarks also showed that UniSan imposes negligible performance overhead.
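As a concrete illustration of the check described above, the following toy model (hypothetical names; a simplification, not UniSan's actual LLVM passes) tracks which bytes of an allocation have been written and zero-fills the rest before the allocation would leave kernel space:

```python
class Allocation:
    """Toy model of a kernel allocation: which bytes have been written?"""
    def __init__(self, size):
        self.size = size
        self.data = ["garbage"] * size       # stale stack/heap contents
        self.initialized = [False] * size

    def store(self, offset, length, value):
        for i in range(offset, offset + length):
            self.data[i] = value
            self.initialized[i] = True

def fully_initialized(alloc):
    """The property the initialization/reachability analyses try to prove."""
    return all(alloc.initialized)

def zero_fill_uninitialized(alloc):
    """The instrumentation inserted when the proof fails."""
    for i in range(alloc.size):
        if not alloc.initialized[i]:
            alloc.data[i] = 0
            alloc.initialized[i] = True

# A 16-byte object whose code writes only the first 8 bytes, e.g., one field
# followed by compiler-inserted padding that is never touched.
obj = Allocation(16)
obj.store(0, 8, 0x41)
assert not fully_initialized(obj)   # copying it out would leak 8 stale bytes
zero_fill_uninitialized(obj)
assert fully_initialized(obj)       # now safe to copy to user space
```

The byte-level bookkeeping mirrors why UniSan's analysis must be field-sensitive: an allocation can be "used" while individual padding bytes remain stale.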

@inproceedings{lu:unisan,
  title        = {{UniSan: Proactive Kernel Memory Initialization to Eliminate Data Leakages}},
  author       = {Kangjie Lu and Chengyu Song and Taesoo Kim and Wenke Lee},
  booktitle    = {Proceedings of the 23rd ACM Conference on Computer and Communications Security (CCS)},
  month        = oct,
  year         = 2016,
  address      = {Vienna, Austria},
}
Fast, Scalable and Secure Onloading of Edge Functions using AirBox Ketan Bhardwaj, Ming-Wei Shih, Pragya Agarwal, Ada Gavrilovska, Taesoo Kim, and Karsten Schwan. SEC 2016.

This paper argues for the utility of back-end driven onloading to the edge as a way to address bandwidth use and latency challenges for future device-cloud interactions. Supporting such edge functions (EFs) requires solutions that can provide (i) fast and scalable EF provisioning and (ii) strong guarantees for the integrity of the EF execution and confidentiality of the state stored at the edge. In response to these goals, we (i) present a detailed design space exploration of the current technologies that can be leveraged in the design of edge function platforms (EFPs); (ii) develop a solution to address security concerns of EFs that leverages emerging hardware support for OS agnostic trusted execution environments such as Intel SGX enclaves; and (iii) propose and evaluate AirBox, a platform for fast, scalable and secure onloading of edge functions.

@inproceedings{bhardwaj:airbox,
  title        = {{Fast, Scalable and Secure Onloading of Edge Functions using AirBox}},
  author       = {Ketan Bhardwaj and Ming-Wei Shih and Pragya Agarwal and Ada Gavrilovska and Taesoo Kim and Karsten Schwan},
  booktitle    = {Proceedings of the 1st IEEE/ACM Symposium on Edge Computing (SEC 2016)},
  month        = oct,
  year         = 2016,
}
Protecting Computer Systems through Eliminating or Analyzing Vulnerabilities Byoungyoung Lee. Ph.D. thesis, Georgia Institute of Technology, August 2016.

There have been tremendous efforts to build fully secure computer systems, but it is not an easy goal. Making a simple mistake introduces a vulnerability, which can critically endanger a whole system's security.

This thesis aims at protecting computer systems from vulnerabilities. We take two complementary approaches in achieving this goal: eliminating or analyzing vulnerabilities. In the vulnerability elimination approach, we eliminate a certain class of memory corruption vulnerabilities to completely close attack vectors from such vulnerabilities. In particular, we develop the tools DangNull and CaVer, which eliminate popular and emerging vulnerabilities, use-after-free and bad-casting, respectively. DangNull relies on the key observation that the root cause of use-after-free is that pointers are not nullified after the target object is freed. Thus, DangNull instruments a program to trace the object's relationships via pointers and automatically nullifies all pointers when the target object is freed. Similarly, CaVer relies on the key observation that the root cause of bad-casting is that casting operations are not properly verified. Thus, CaVer uses a new runtime type tracing mechanism to overcome the limitation of existing approaches, and performs efficient verification on all type casting operations dynamically. We have implemented these protection solutions and successfully applied them to the Chrome and Firefox browsers. Our evaluation showed that DangNull and CaVer impose 29% and 7.6% benchmark overheads in Chrome, respectively. We have also tested seven use-after-free and five bad-casting exploits in Chrome, and DangNull and CaVer safely prevented them all.

In the vulnerability analysis approach, we focus on timing-channel vulnerabilities, which allow an attacker to learn information about a program's sensitive data without causing the program to perform unsafe operations. It is challenging to test and confirm a timing-channel vulnerability, as it typically involves complex algorithmic operations. We implemented SideFinder, an assistant tool for identifying timing-channel vulnerabilities in a hash table. Empowered with symbolic execution techniques, SideFinder semi-automatically synthesizes inputs that attack timing channels, and thus confirms the vulnerability. Using SideFinder, we analyzed and further synthesized two real-world attacks in the Linux kernel, and showed that they can break one important security mechanism, Address Space Layout Randomization (ASLR).

@phdthesis{lee:thesis,
  title        = {{Protecting Computer Systems through Eliminating or Analyzing Vulnerabilities}},
  author       = {Byoungyoung Lee},
  school       = {{Georgia Institute of Technology}},
  year         = 2016,
  month        = aug,
}
Preventing Exploits against Memory Corruption Vulnerabilities Chengyu Song. Ph.D. thesis, Georgia Institute of Technology, August 2016.

The most common cyber-attack vector is the exploitation of software vulnerabilities. Despite much effort toward building secure software, software systems of even modest complexity still routinely have serious vulnerabilities. More alarmingly, even the trusted computing base (e.g., the OS kernel) still contains vulnerabilities that would allow attackers to subvert security mechanisms such as the application sandbox on smartphones. Among all vulnerabilities, memory corruption is one of the most ancient, prevalent, and devastating vulnerabilities. This thesis proposes three projects on mitigating this threat. There are three popular ways to exploit a memory corruption vulnerability---attacking the code (a.k.a. code injection attack), the control data (a.k.a. control-flow hijacking attack), and the non-control data (a.k.a. data-oriented attack). Theoretically, code injection attacks can be prevented with the executable XOR writable policy; but in practice, this policy is undermined by another important technique---dynamic code generation (e.g., JIT engines). In the first project, we first showed that this conflict is actually non-trivial to resolve; then we introduced a new design paradigm to fundamentally solve this problem, by relocating the dynamic code generator to a separate process. In the second project, we focused on preventing data-oriented attacks against the operating system kernel. Using privilege escalation attacks as an example, we (1) demonstrated that data-oriented attacks are realistic threats and hard to prevent; (2) discussed two important challenges for preventing such attacks (i.e., completeness and performance); and (3) presented a system that combines program analysis techniques and system designs to solve these challenges. During these two projects, we found that lacking sufficient hardware support imposes many unnecessary difficulties in building robust and efficient defense mechanisms.
In the third project, we proposed HDFI (hardware-assisted data-flow isolation) to overcome this limitation. HDFI is a new fine-grained isolation mechanism that enforces isolation at the machine word granularity, by virtually extending each memory unit with an additional tag that is defined by data-flow. This capability allows HDFI to enforce a variety of security models such as the Biba Integrity Model and the Bell--LaPadula Model. For demonstration, we developed and ported several security mechanisms to leverage HDFI, including stack protection, standard library enhancement, virtual function table protection, code pointer protection, kernel data protection, and information leak prevention. The evaluation results showed that HDFI is easy to use, imposes low performance overhead, and allows us to create simpler and more secure solutions.

@phdthesis{song:thesis,
  title        = {{Preventing Exploits against Memory Corruption Vulnerabilities}},
  author       = {Chengyu Song},
  school       = {{Georgia Institute of Technology}},
  year         = 2016,
  month        = aug,
}
Provably-Secure Remote Memory Attestation for Heap Overflow Protection Alexandra Boldyreva, Taesoo Kim, Richard J. Lipton, and Bogdan Warinschi. SCN 2016.

Memory corruption attacks may lead to complete takeover of systems. There are numerous works offering protection mechanisms for this important problem. But the security guarantees that are offered by most works are only heuristic and, furthermore, most solutions are designed for protecting the local memory. In this paper we initiate the study of provably secure remote memory attestation; we concentrate on provably detecting heap-based overflow attacks and consider the setting where we aim to protect the memory in a remote system. We present two protocols offering various efficiency and security trade-offs (but all solutions are efficient enough for practical use as our implementation shows) that detect the presence of injected malicious code or data in remotely-stored heap memory. While our solutions offer protection only against a specific class of attacks, our novel formalization of threat models is general enough to cover a wide range of attacks and settings.
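The general challenge-response pattern behind such remote attestation can be sketched as follows. This is an illustrative HMAC-based sketch under assumed names and a pre-shared key; the paper's provably-secure protocols are more involved than this:

```python
import hashlib
import hmac

KEY = b"shared-attestation-key"      # assumed provisioned out of band

def prover_respond(nonce, heap_region):
    """Prover MACs its current heap contents together with the fresh nonce."""
    return hmac.new(KEY, nonce + heap_region, hashlib.sha256).digest()

def verifier_check(nonce, expected_region, response):
    expected = hmac.new(KEY, nonce + expected_region, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)

nonce = b"fresh-16B-nonce!"          # freshness prevents replaying old proofs
heap = b"\x00" * 64                  # the remote heap region being attested
assert verifier_check(nonce, heap, prover_respond(nonce, heap))

overflowed = b"\x41" * 8 + heap[8:]  # bytes injected by a heap overflow
assert not verifier_check(nonce, heap, prover_respond(nonce, overflowed))
```

The fresh nonce binds each response to one run of the protocol, so a prover whose heap was corrupted after an earlier honest response cannot replay that response.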

@inproceedings{boldyreva:notion,
  title        = {{Provably-Secure Remote Memory Attestation for Heap Overflow Protection}},
  author       = {Alexandra Boldyreva and Taesoo Kim and Richard J. Lipton and Bogdan Warinschi},
  booktitle    = {Proceedings of the 10th Conference on Security and Cryptography for Networks (SCN 2016)},
  month        = aug,
  year         = 2016,
}
Toward Engineering a Secure Android Ecosystem: A Survey of Existing Techniques Meng Xu, Chengyu Song, Yang Ji, Ming-Wei Shih, Kangjie Lu, Cong Zheng, Ruian Duan, Yeongjin Jang, Byoungyoung Lee, Chenxiong Qian, Sangho Lee, and Taesoo Kim. ACM Computing Surveys (CSUR) 49(2), August 2016.

The openness and extensibility of Android have made it a popular platform for mobile devices and a strong candidate to drive the Internet-of-Things. Unfortunately, these properties also leave Android vulnerable, attracting attacks for profit or fun. To mitigate these threats, numerous issue-specific solutions have been proposed. With the increasing number and complexity of security problems and solutions, we believe this is the right moment to step back and systematically re-evaluate the Android security architecture and security practices in the ecosystem. We organize the most recent security research on the Android platform into two categories: the software stack and the ecosystem. For each category, we provide a comprehensive narrative of the problem space; highlight the limitations of the proposed solutions; and identify open problems for future research. Based on our collection of knowledge, we envision a blueprint for engineering a secure, next-generation Android ecosystem.

@article{xu:android-survey,
  title        = {{Toward Engineering a Secure Android Ecosystem: A Survey of Existing Techniques}},
  author       = {Meng Xu and Chengyu Song and Yang Ji and Ming-Wei Shih and Kangjie Lu and Cong Zheng and Ruian Duan and Yeongjin Jang and
    Byoungyoung Lee and Chenxiong Qian and Sangho Lee and Taesoo Kim},
  journal      = {ACM Computing Surveys (CSUR)},
  volume       = 49,
  number       = 2,
  month        = aug,
  year         = 2016,
}
APISan: Sanitizing API Usages through Semantic Cross-checking Insu Yun, Changwoo Min, Xujie Si, Yeongjin Jang, Taesoo Kim, and Mayur Naik. USENIX Security 2016.

API misuse is a well-known source of bugs. Some of them (e.g., incorrect use of SSL APIs and integer overflows of memory allocation sizes) can cause serious security vulnerabilities (e.g., man-in-the-middle (MITM) attacks and privilege escalation). Moreover, modern APIs, which are large, complex, and fast-evolving, are error-prone. However, existing techniques to help find such bugs either require manual effort by developers (e.g., providing specifications or models) or are not scalable to large real-world software comprising millions of lines of code.

In this paper, we present APISAN, a tool that automatically infers correct API usages from source code without manual effort. The key idea in APISAN is to extract likely-correct usage patterns in four different aspects (e.g., causal relations and semantic relations on arguments) by considering semantic constraints. APISAN is tailored to check various properties with security implications. We applied APISAN to 92 million lines of code, including the Linux kernel and OpenSSL, found 76 previously unknown bugs, and provided patches for all of them.
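The cross-checking idea can be illustrated in miniature: collect how each call site uses an API in one aspect (here, whether the return value is checked), treat the dominant usage as the likely-correct pattern, and flag the deviating sites. This is a hypothetical toy, not APISAN's actual symbolic analysis:

```python
from collections import Counter

def infer_and_check(call_sites):
    """call_sites: (location, observed usage) pairs for one API, all in a
    single aspect, e.g., whether the caller checks the return value."""
    counts = Counter(usage for _, usage in call_sites)
    likely_correct, _ = counts.most_common(1)[0]   # majority = inferred spec
    return [loc for loc, usage in call_sites if usage != likely_correct]

sites = [
    ("a.c:10", "return-checked"),
    ("b.c:22", "return-checked"),
    ("c.c:35", "return-checked"),
    ("d.c:47", "return-ignored"),   # deviates from the dominant pattern
]
assert infer_and_check(sites) == ["d.c:47"]   # flagged as a potential bug
```

Because the "specification" is inferred statistically rather than written by hand, no manual models are needed, which is what lets the approach scale to millions of lines of code.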

@inproceedings{yun:apisan,
  title        = {{APISan: Sanitizing API Usages through Semantic Cross-checking}},
  author       = {Insu Yun and Changwoo Min and Xujie Si and Yeongjin Jang and Taesoo Kim and Mayur Naik},
  booktitle    = {Proceedings of the 25th USENIX Security Symposium (Security)},
  month        = aug,
  year         = 2016,
  address      = {Austin, TX},
}
Breaking Kernel Address Space Layout Randomization with Intel TSX Yeongjin Jang, Sangho Lee, and Taesoo Kim. Black Hat USA 2016.

Kernel hardening has been an important topic, as many applications and security mechanisms often consider the kernel as part of their Trusted Computing Base (TCB). Among various hardening techniques, Kernel Address Space Layout Randomization (KASLR) is the most effective and widely adopted defense mechanism that can practically mitigate various memory corruption vulnerabilities, such as buffer overflow and use-after-free. In principle, KASLR is secure as long as no memory leak vulnerability exists and high entropy is ensured.

In this paper, we introduce a novel timing attack against KASLR, called DrK, that can precisely de-randomize the memory layout of the kernel without violating any such assumptions. DrK exploits a hardware feature called Intel Transactional Synchronization Extension (TSX) that is readily available in most modern commodity CPUs. One surprising behavior of TSX, which is essentially the root cause of this security loophole, is that it aborts a transaction without notifying the underlying kernel even when the transaction fails due to a critical error, such as a page fault or an access violation, which traditionally requires kernel intervention. DrK turned this property into a precise timing channel that can determine the mapping status (i.e., mapped versus unmapped) and execution status (i.e., executable versus non-executable) of the privileged kernel address space. In addition to its surprising accuracy and precision, DrK is universally applicable to all OSes, even in virtualized environments, and generates no visible footprint, making it difficult to detect in practice.

We demonstrated that DrK could break KASLR of all major OSes (i.e., Windows, Linux, and OS X) with near-perfect accuracy in under a second. Finally, we propose a few potential countermeasures that can effectively prevent or mitigate the DrK attack. We urge our community to be aware of the potential threat of having Intel TSX, which is present in most recent Intel CPUs -- 100% of workstation CPUs and 60% of high-end Intel CPUs since Skylake -- and is even available on Amazon EC2 (X1).

@inproceedings{jang:drk-bh,
  title        = {{Breaking Kernel Address Space Layout Randomization with Intel TSX}},
  author       = {Yeongjin Jang and Sangho Lee and Taesoo Kim},
  booktitle    = {Black Hat USA Briefings (Black Hat USA)},
  month        = aug,
  year         = 2016,
  address      = {Las Vegas, NV},
}
Instant OS Updates via Userspace Checkpoint-and-Restart Sanidhya Kashyap, Changwoo Min, Byoungyoung Lee, Taesoo Kim, and Pavel Emelyanov. ATC 2016.

In recent years, operating systems have become increasingly complex and thus prone to security and performance issues. Accordingly, system updates to address these issues have become more frequently available and increasingly important. To complete such updates, users must reboot their systems, resulting in unavoidable downtime and further loss of the states of running applications.

We present Kup, a practical OS update mechanism that employs userspace checkpoint-and-restart, using an optimized data structure for checkpointing and a memory persistence mechanism across updates, combined with a fast in-place kernel switch. This allows for instant kernel updates spanning major kernel versions without any kernel modifications.

Our evaluation shows that Kup can support any type of real kernel patch (e.g., security fixes, minor updates, or even major releases) with large-scale applications such as memcached and mysql, or even in the middle of a Linux kernel compilation, unlike well-known dynamic hot-patching techniques (e.g., ksplice). Moreover, Kup can update a running Linux kernel with only 3 seconds of overall downtime.

@inproceedings{kashyap:kup,
  title        = {{Instant OS Updates via Userspace Checkpoint-and-Restart}},
  author       = {Sanidhya Kashyap and Changwoo Min and Byoungyoung Lee and Taesoo Kim and Pavel Emelyanov},
  booktitle    = {Proceedings of the 2016 USENIX Annual Technical Conference (ATC)},
  month        = jun,
  year         = 2016,
  address      = {Denver, CO},
}
Understanding Manycore Scalability of File Systems Changwoo Min, Sanidhya Kashyap, Steffen Maass, Woonhak Kang, and Taesoo Kim. ATC 2016.

We analyze the manycore scalability of five widely-deployed file systems, namely, ext4, XFS, btrfs, F2FS, and tmpfs, by using our open source benchmark suite, FxMark. FxMark implements 19 microbenchmarks to stress specific components of each file system and includes three application benchmarks to measure the macroscopic scalability behavior. We observe that file systems are hidden scalability bottlenecks in many I/O-intensive applications even when there is no apparent contention at the application level. We found 25 scalability bottlenecks in file systems, many of which are unexpected or counterintuitive. We draw a set of observations on file system scalability behavior and unveil several core aspects of file system design that systems researchers must address.

@inproceedings{min:fxmark,
  title        = {{Understanding Manycore Scalability of File Systems}},
  author       = {Changwoo Min and Sanidhya Kashyap and Steffen Maass and Woonhak Kang and Taesoo Kim},
  booktitle    = {Proceedings of the 2016 USENIX Annual Technical Conference (ATC)},
  month        = jun,
  year         = 2016,
  address      = {Denver, CO},
}
HDFI: Hardware-Assisted Data-Flow Isolation Chengyu Song, Hyungon Moon, Monjur Alam, Insu Yun, Byoungyoung Lee, Taesoo Kim, Wenke Lee, and Yunheung Paek. IEEE SP 2016.

Memory corruption vulnerabilities are the root cause of many modern attacks. Existing defense mechanisms are inadequate; in general, the software-based approaches are not efficient and the hardware-based approaches are not flexible. In this paper, we present hardware-assisted data-flow isolation, or HDFI, a new fine-grained data isolation mechanism that is broadly applicable and very efficient. HDFI enforces isolation at the machine word granularity by virtually extending each memory unit with an additional tag that is defined by data-flow. This capability allows HDFI to enforce a variety of security models such as the Biba Integrity Model and the Bell--LaPadula Model. We implemented HDFI by extending the RISC-V instruction set architecture (ISA) and instantiating it on the Xilinx Zynq ZC706 evaluation board. We ran several benchmarks including the SPEC CINT 2000 benchmark suite. Evaluation results show that the performance overhead caused by our modification to the hardware is low (<2%). We also developed or ported several security mechanisms to leverage HDFI, including stack protection, standard library enhancement, virtual function table protection, code pointer protection, kernel data protection, and information leak prevention. Our results show that HDFI is easy to use, imposes low performance overhead, and allows us to create more elegant and more secure solutions.
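The word-granularity tagging described above can be modeled in a few lines. This is an illustrative simulation only (the real system adds tagged store and load-check instructions to the ISA; the method names here are invented): a protected store sets a word's tag, an ordinary store clears it, and a checked load traps when the tag does not match:

```python
class TaggedMemory:
    """Each machine word carries a 1-bit tag set by the store that wrote it."""
    def __init__(self, words):
        self.mem = [(0, 0)] * words          # (value, tag) per word

    def store(self, addr, value, tag=0):
        self.mem[addr] = (value, tag)        # tag reflects the last writer

    def load_checked(self, addr, expected_tag):
        value, tag = self.mem[addr]
        if tag != expected_tag:
            raise RuntimeError("data-flow isolation violation")
        return value

mem = TaggedMemory(4)
mem.store(0, 0xdead, tag=1)                  # protected write, e.g., a return
                                             # address or vtable pointer
assert mem.load_checked(0, expected_tag=1) == 0xdead

mem.store(0, 0x4141, tag=0)                  # ordinary, attacker-reachable store
try:
    mem.load_checked(0, expected_tag=1)      # checked load traps instead of
    assert False                             # consuming the corrupted word
except RuntimeError:
    pass
```

Because the tag travels with the data flow rather than with a memory region, one mechanism can express both integrity policies (Biba-style: low writers cannot forge high data) and confidentiality policies (Bell--LaPadula-style: low readers cannot consume high data).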

@inproceedings{song:hdfi,
  title        = {{HDFI: Hardware-Assisted Data-Flow Isolation}},
  author       = {Chengyu Song and Hyungon Moon and Monjur Alam and Insu Yun and Byoungyoung Lee and Taesoo Kim and Wenke Lee and Yunheung Paek},
  booktitle    = {Proceedings of the 37th IEEE Symposium on Security and Privacy (Oakland)},
  month        = may,
  year         = 2016,
  address      = {San Jose, CA},
}
Opportunistic Spinlocks: Achieving Virtual Machine Scalability in the Clouds Sanidhya Kashyap, Changwoo Min, and Taesoo Kim. OSR 2016.

With increasing demand for big-data processing and faster in-memory databases, cloud providers are moving towards large virtualized instances in addition to focusing on horizontal scalability.

However, our experiments reveal that such instances in popular cloud services (e.g., 32 vCPUs with 208 GB supported by Google Compute Engine) do not achieve the desired scalability with increasing core count even with a simple, embarrassingly parallel job (e.g., a Linux kernel compile). More seriously, the internal synchronization scheme (e.g., the paravirtualized ticket spinlock) of a virtualized instance on a machine with a higher core count (e.g., 80 cores) dramatically degrades its overall performance. Our finding is different from the previously well-known scalability problem (i.e., the lock contention problem) and occurs because of the sophisticated optimization techniques implemented in the hypervisor, which we call the sleepy spinlock anomaly. To solve this problem, we design and implement OTICKET, a variant of the paravirtualized ticket spinlock that effectively scales virtualized instances in both undersubscribed and oversubscribed environments.
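For reference, the ticket spinlock at the center of this anomaly can be sketched as follows. This is a conceptual Python model, not OTICKET itself; a real kernel implementation uses an atomic fetch-and-add and busy-waits on the ticket counter instead of yielding:

```python
import itertools
import threading
import time

class TicketLock:
    def __init__(self):
        self.next_ticket = itertools.count()   # stands in for atomic fetch-and-add
        self.now_serving = 0

    def acquire(self):
        my_ticket = next(self.next_ticket)     # take the next ticket in line
        while self.now_serving != my_ticket:
            time.sleep(0)   # yield; a real spinlock busy-waits here, which is
                            # exactly where a preempted vCPU holding the lock
                            # stalls every later ticket holder behind it

    def release(self):
        self.now_serving += 1   # hand the lock to the next ticket, FIFO order

lock = TicketLock()
total = [0]

def worker():
    for _ in range(100):
        lock.acquire()
        total[0] += 1           # critical section
        lock.release()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert total[0] == 400          # FIFO tickets kept the increments race-free
```

The strict FIFO hand-off is what makes ticket locks fair on bare metal but fragile under virtualization: if the vCPU holding `now_serving` is descheduled (or parked by a paravirtual optimization), every waiter behind it spins or sleeps in ticket order, producing the convoy behavior described above.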

@article{kashyap:oppspinlocks,
  title        = {{Opportunistic Spinlocks: Achieving Virtual Machine Scalability in the Clouds}},
  author       = {Sanidhya Kashyap and Changwoo Min and Taesoo Kim},
  journal      = {ACM SIGOPS Operating Systems Review},
  year         = 2016,
  month        = mar,
  volume       = 50,
  number       = 1,
}
S-NFV: Securing NFV states by using SGX Ming-Wei Shih, Mohan Kumar, Taesoo Kim, and Ada Gavrilovska. SDNNFVSEC 2016.

Network Function Virtualization (NFV) applications are stateful. For example, a Content Distribution Network (CDN) caches web contents from remote servers and serves them to clients. Similarly, an Intrusion Detection System (IDS) and an Intrusion Prevention System (IPS) have both per-flow and multi-flow (shared) states to properly react to intrusions. On today's NFV infrastructures, security vulnerabilities may allow attackers to steal and manipulate the internal states of NFV applications that share a physical resource. In this paper, we propose a new protection scheme, S-NFV, that incorporates Intel Software Guard Extensions (Intel SGX) to securely isolate the states of NFV applications.

@inproceedings{shih:snfv,
  title        = {{S-NFV: Securing NFV states by using SGX}},
  author       = {Ming-Wei Shih and Mohan Kumar and Taesoo Kim and Ada Gavrilovska},
  booktitle    = {Proceedings of the 1st ACM International Workshop on Security in SDN and NFV},
  month        = mar,
  year         = 2016,
  address      = {New Orleans, LA},
}
OpenSGX: An Open Platform for SGX Research Prerit Jain, Soham Desai, Seongmin Kim, Ming-Wei Shih, JaeHyuk Lee, Changho Choi, Youjung Shin, Taesoo Kim, Brent B. Kang, and Dongsu Han. NDSS 2016.

Hardware technologies for trusted computing, or trusted execution environments (TEEs), have rapidly matured over the last decade. In fact, TEEs are at the brink of widespread commoditization with the recent introduction of Intel Software Guard Extensions (Intel SGX). Despite such rapid development, software technologies for TEEs significantly lag behind their hardware counterparts, and currently only a select group of researchers has the privilege of accessing this technology. To address this problem, we develop an open source platform, called OpenSGX, that emulates Intel SGX hardware components at the instruction level and provides the new system software components required for full TEE exploration. We expect that the OpenSGX framework can serve as an open platform for SGX research, with the following contributions. First, we develop a fully functional, instruction-compatible emulator of Intel SGX for enabling the exploration of the software/hardware design space and the development of enclave programs. OpenSGX provides a platform for SGX development, meaning that it provides not just emulation but also operating system components, an enclave program loader/packager, an OpenSGX user library, debugging, and performance monitoring. Second, to show OpenSGX's use cases, we applied OpenSGX to protect sensitive information (e.g., directory) of Tor nodes and evaluated the potential performance impacts. Therefore, we believe OpenSGX has great potential for broader communities to spark new research on soon-to-be-commodity Intel SGX.

@inproceedings{jain:opensgx,
  title        = {{OpenSGX: An Open Platform for SGX Research}},
  author       = {Prerit Jain and Soham Desai and Seongmin Kim and Ming-Wei Shih and JaeHyuk Lee and Changho Choi and Youjung Shin and Taesoo Kim and
    Brent B. Kang and Dongsu Han},
  booktitle    = {Proceedings of the 2016 Annual Network and Distributed System Security Symposium (NDSS)},
  month        = feb,
  year         = 2016,
  address      = {San Diego, CA},
}
Enforcing Kernel Security Invariants with Data Flow Integrity Chengyu Song, Byoungyoung Lee, Kangjie Lu, William R. Harris, Taesoo Kim, and Wenke Lee. NDSS 2016.

The operating system kernel is the foundation of the whole system, and is often the de facto trusted computing base for many higher-level security mechanisms. Unfortunately, kernel vulnerabilities are not rare and are continuously being introduced with new kernel features. Once the kernel is compromised, attackers can bypass any access control checks, escalate their privileges, and hide the evidence of attacks. Many protection mechanisms have been proposed and deployed to prevent kernel exploits. However, the majority of these techniques focus only on preventing control-flow hijacking attacks, and techniques that can mitigate non-control-data attacks either apply only to drivers/modules or impose too much overhead. The goal of our research is to develop a principled defense mechanism against memory-corruption-based privilege escalation attacks. Towards this end, we leverage data-flow integrity (DFI) to enforce security invariants of the kernel access control system. To make our protection mechanism practical, we develop two new techniques: one for automatically inferring data that are critical to the access control system without manual annotation, and the other for efficient DFI enforcement over the inference results. We have implemented a prototype of our technology for the ARM64 Linux kernel on an Android device. The evaluation results of our prototype implementation show that our technology can mitigate the majority of privilege escalation attacks, while imposing a moderate amount of performance overhead.

@inproceedings{song:kenali,
  title        = {{Enforcing Kernel Security Invariants with Data Flow Integrity}},
  author       = {Chengyu Song and Byoungyoung Lee and Kangjie Lu and William R. Harris and Taesoo Kim and Wenke Lee},
  booktitle    = {Proceedings of the 2016 Annual Network and Distributed System Security Symposium (NDSS)},
  month        = feb,
  year         = 2016,
  address      = {San Diego, CA},
}
FlexDroid: Enforcing In-App Privilege Separation in Android Jaebaek Seo, Daehyeok Kim, Donghyun Cho, Taesoo Kim, and Insik Shin. NDSS 2016.

Mobile applications are increasingly integrating third-party libraries to provide various features, such as advertising, analytics, social networking, and more. Unfortunately, such integration with third-party libraries comes at the cost of potential privacy violations, because Android always grants third-party libraries the same set of permissions as their host applications. Unintended accesses to users' private data are underestimated threats to users' privacy, as complex and often obfuscated third-party libraries make it hard for application developers to estimate their correct behavior. More critically, the wide adoption of native code (JNI) and dynamic code execution, such as Java reflection or dynamic code reloading, makes it even harder to apply state-of-the-art security analyses. In this work, we propose FLEXDROID, a new Android security model and isolation mechanism that provides dynamic, fine-grained access control for third-party libraries. With FLEXDROID, application developers not only gain full control of third-party libraries (e.g., which permissions to grant or not), but can also specify how they should behave after a privacy violation is detected (e.g., providing mock user information or being killed). To achieve these goals, we define a new notion of principals for third-party libraries, and develop a novel security mechanism, called inter-process stack inspection, that is effective against JNI as well as dynamic code execution. Our usability study shows that developers can easily adapt FLEXDROID's policy to their existing applications. Finally, our evaluation shows that FLEXDROID can effectively restrict the permissions of third-party libraries with negligible overhead.

@inproceedings{seo:flexdroid,
  title        = {{FlexDroid: Enforcing In-App Privilege Separation in Android}},
  author       = {Jaebaek Seo and Daehyeok Kim and Donghyun Cho and Taesoo Kim and Insik Shin},
  booktitle    = {Proceedings of the 2016 Annual Network and Distributed System Security Symposium (NDSS)},
  month        = feb,
  year         = 2016,
  address      = {San Diego, CA},
}
How to Make ASLR Win the Clone Wars: Runtime Re-Randomization Kangjie Lu, Stefan Nürnberger, Michael Backes, and Wenke Lee. NDSS 2016.

Existing techniques for memory randomization such as the widely explored Address Space Layout Randomization (ASLR) perform a single, per-process randomization that is applied before or at the process's load time. The efficacy of such upfront randomization crucially relies on the assumption that an attacker has only one chance to guess the randomized address, and that this attack succeeds only with a very low probability. Recent research results have shown that this assumption is not valid in many scenarios: e.g., daemon servers fork child processes that inherit the state (and, if applicable, the randomization) of their parents, and thereby create clones with the same memory layout. This enables so-called clone-probing attacks, where an adversary repeatedly probes different clones in order to increase its knowledge about their shared memory layout. In this paper, we propose RuntimeASLR, the first approach that prevents clone-probing attacks without altering the intended semantics of child process forking. The paper makes the following three contributions. First, we propose a semantics-preserving and runtime-based approach for preventing clone-probing attacks by re-randomizing the address space of every child after fork() at runtime while keeping the parent's state. We achieve this by devising a novel, automated pointer tracking policy generation process that has to be run just once, followed by a pointer tracking mechanism that is only applied to the parent process. Second, we propose a systematic and holistic pointer tracking mechanism that correctly identifies pointers inside the memory space. This mechanism constitutes the central technical building block of our approach. Third, we provide an open-source implementation of our approach based on Intel's Pin on an x86-64 Linux platform, which supports COTS server binaries directly. We have also evaluated our system on the Nginx web server. The results show that RuntimeASLR identifies all pointers and effectively prevents clone-probing attacks. Although RuntimeASLR takes longer to start the server program (e.g., 35 seconds for Nginx), it imposes no runtime performance overhead on the worker processes that provide the actual services.

@inproceedings{lu:runtimeaslr,
  title        = {{How to Make ASLR Win the Clone Wars: Runtime Re-Randomization}},
  author       = {Kangjie Lu and Stefan N{\"u}rnberger and Michael Backes and Wenke Lee},
  booktitle    = {Proceedings of the 2016 Annual Network and Distributed System Security Symposium (NDSS)},
  month        = feb,
  year         = 2016,
  address      = {San Diego, CA},
}
MetaSync: Coordinating Storage Across Multiple File Synchronization Services Seungyeop Han, Haichen Shen, Taesoo Kim, Arvind Krishnamurthy, Thomas Anderson, and David Wetherall. IEEE Internet Computing, 2016.

Cloud-based file synchronization services such as Dropbox are a worldwide resource for many millions of users. However, individual services often have tight resource limits, suffer from temporary outages or shutdowns, and sometimes silently corrupt or leak user data. As a solution, the authors design, implement, and evaluate MetaSync, a secure and reliable file synchronization service using multiple cloud synchronization services as untrusted storage providers. To provide global consistency among the storage providers, the authors devise a novel variant of Paxos that enables efficient updates on top of the unmodified APIs exported by each service. MetaSync provides better availability and performance, stronger confidentiality and integrity, and larger storage for end users.

@article{han:metasync2,
  title        = {{MetaSync: Coordinating Storage Across Multiple File Synchronization Services}},
  author       = {Seungyeop Han and Haichen Shen and Taesoo Kim and Arvind Krishnamurthy and Thomas Anderson and David Wetherall},
  journal      = {IEEE Internet Computing},
  year         = 2016,
}

2015

A First Step Towards Leveraging Commodity Trusted Execution Environments for Network Applications Seongmin Kim, Youjung Shin, Jaehyung Ha, Taesoo Kim, and Dongsu Han. HotNets 2015.

Network applications and protocols are increasingly adopting security and privacy features, as they are becoming one of the primary requirements. The widespread use of transport layer security (TLS) and the growing popularity of anonymity networks, such as Tor, exemplify this trend. Motivated by the recent movement towards commoditization of trusted execution environments (TEEs), this paper explores alternative design choices that application and protocol designers should consider. In particular, we explore the possibility of using Intel SGX to provide security and privacy in a wide range of network applications. We show that leveraging hardware protection of TEEs opens up new possibilities, often with the benefit of a much-simplified application/protocol design. We demonstrate its practical implications by exploring the design space for SGX-enabled software-defined inter-domain routing, peer-to-peer anonymity networks (Tor), and middleboxes. Finally, we quantify the potential overheads of the SGX-enabled design by implementing it on top of OpenSGX, an open source SGX emulator.

@inproceedings{kim:sgxtor,
  title        = {{A First Step Towards Leveraging Commodity Trusted Execution Environments for Network Applications}},
  author       = {Seongmin Kim and Youjung Shin and Jaehyung Ha and Taesoo Kim and Dongsu Han},
  booktitle    = {Proceedings of the 14th ACM Workshop on Hot Topics in Networks (HotNets)},
  month        = nov,
  year         = 2015,
  address      = {Philadelphia, PA},
}
UCognito: Private Browsing without Tears Meng Xu, Yeongjin Jang, Xinyu Xing, Taesoo Kim, and Wenke Lee. CCS 2015.

While private browsing is a standard feature, its implementation has been inconsistent among the major browsers. More seriously, it often fails to provide adequate or even the intended privacy protection. For example, as shown in prior research, browser extensions and add-ons often undermine the goals of private browsing. In this paper, we first present our systematic study of private browsing. We developed a technical approach to identify browser traces left behind by a private browsing session, and showed that Chrome and Firefox do not correctly clear some of these traces. We analyzed the source code of these browsers and discovered that the current implementation approach is to decide the behaviors of a browser based on the current browsing mode (i.e., private or public), but such decision points are scattered throughout the code base. This implementation approach is very problematic because developers are prone to make mistakes given the complexities of browser components (including extensions and add-ons). Based on this observation, we propose a new and general approach to implementing private browsing. The main idea is to overlay the actual filesystem with a sandbox filesystem when the browser is in private browsing mode, so that no unintended leakage is allowed and no persistent modification is stored. This approach requires no changes to browsers or the OS kernel because the layered sandbox filesystem is implemented by interposing on system calls. We have implemented a prototype system called Ucognito on Linux. Our evaluations show that Ucognito, when applied to Chrome and Firefox, stops all known privacy leaks identified by prior work and our current study. More importantly, Ucognito incurs only negligible performance overhead: e.g., 0%-2.5% in benchmarks for standard JavaScript and webpage loading.

@inproceedings{xu:ucognito,
  title        = {{UCognito: Private Browsing without Tears}},
  author       = {Meng Xu and Yeongjin Jang and Xinyu Xing and Taesoo Kim and Wenke Lee},
  booktitle    = {Proceedings of the 22nd ACM Conference on Computer and Communications Security (CCS)},
  month        = oct,
  year         = 2015,
  address      = {Denver, Colorado},
}
ASLR-Guard: Stopping Address Space Leakage for Code Reuse Attacks Kangjie Lu, Chengyu Song, Byoungyoung Lee, Simon P. Chung, Taesoo Kim, and Wenke Lee. CCS 2015.

A general prerequisite for a code reuse attack is that the attacker needs to locate code gadgets that perform the desired operations and then direct the control flow of a vulnerable application to those gadgets. Address Space Layout Randomization (ASLR) attempts to stop code reuse attacks by making the first part of the prerequisite unsatisfiable. However, research in recent years has shown that this protection is often defeated by commonly existing information leaks, which provide attackers clues about the whereabouts of certain code gadgets. In this paper, we present ASLR-Guard, a novel mechanism that completely prevents leaks of code pointers and renders other information leaks (e.g., leaks of data pointers) useless for deriving code addresses. The main idea behind ASLR-Guard is to separate code and data, provide secure storage for code pointers, and encode code pointers whenever they are treated as data, so that a leaked data pointer cannot be used to derive a code address. ASLR-Guard can either prevent code pointer leaks or render their leaks harmless. That is, ASLR-Guard makes it impossible to overwrite code pointers with values that point to a desired address, or that will hijack the control flow to one when the code pointers are dereferenced. We have implemented a prototype of ASLR-Guard, including a compilation toolchain and a C/C++ runtime. Our evaluation results show that (1) ASLR-Guard supports normal operations correctly; (2) it completely stops code address leaks and can resist recent sophisticated attacks; (3) it imposes almost no runtime overhead (< 1%) for C/C++ programs in the SPEC benchmarks. Therefore, ASLR-Guard is very practical and can be applied to secure many applications.

@inproceedings{lu:aslrguard,
  title        = {{ASLR-Guard: Stopping Address Space Leakage for Code Reuse Attacks}},
  author       = {Kangjie Lu and Chengyu Song and Byoungyoung Lee and Simon P. Chung and Taesoo Kim and Wenke Lee},
  booktitle    = {Proceedings of the 22nd ACM Conference on Computer and Communications Security (CCS)},
  month        = oct,
  year         = 2015,
  address      = {Denver, Colorado},
}
Breaking and Fixing VoLTE: Exploiting Hidden Data Channels and Mis-implementations Hongil Kim, Dongkwan Kim, Minhee Kwon, Hyungseok Han, Yeongjin Jang, Dongsu Han, Taesoo Kim, and Yongdae Kim. CCS 2015.

Long Term Evolution (LTE) is becoming the dominant cellular networking technology, shifting the cellular network away from its circuit-switched legacy towards a packet-switched network that resembles the Internet. To support voice calls over the LTE network, operators have introduced Voice-over-LTE (VoLTE), which dramatically changes how voice calls are handled, both from user equipment and infrastructure perspectives. We find that this dramatic shift opens up a number of new attack surfaces that have not been previously explored. To call attention to this matter, this paper presents a systematic security analysis.

Unlike the traditional call setup, the VoLTE call setup is controlled and performed at the Application Processor (AP), using SIP over IP. A legitimate user who has control over the AP can potentially control and exploit the call setup process to establish a VoLTE channel. This, combined with legacy accounting policies (e.g., unlimited voice and the separation of data and voice), leads to a number of free data channels. In the process of unveiling the free data channels, we identify a number of additional vulnerabilities of early VoLTE implementations, which lead to serious exploits, such as caller spoofing, over-billing, and denial-of-service attacks. We identify the nature of these vulnerabilities and concrete exploits that directly result from the adoption of VoLTE. We also propose immediate countermeasures that can be employed to alleviate the problems. However, we believe that the nature of the problem calls for a more comprehensive solution that eliminates the root causes at mobile devices, mobile platforms, and the core network.

@inproceedings{kim:volte,
  title        = {{Breaking and Fixing VoLTE: Exploiting Hidden Data Channels and Mis-implementations}},
  author       = {Hongil Kim and Dongkwan Kim and Minhee Kwon and Hyungseok Han and Yeongjin Jang and Dongsu Han and Taesoo Kim and Yongdae Kim},
  booktitle    = {Proceedings of the 22nd ACM Conference on Computer and Communications Security (CCS)},
  month        = oct,
  year         = 2015,
  address      = {Denver, Colorado},
}
Cross-checking Semantic Correctness: The Case of Finding File System Bugs Changwoo Min, Sanidhya Kashyap, Byoungyoung Lee, Chengyu Song, and Taesoo Kim. SOSP 2015.

Today, systems software is too complex to be bug-free. To find bugs in systems software, developers often rely on code checkers, like Linux's Sparse. However, the capability of existing tools used in commodity, large-scale systems is limited to finding only shallow bugs that tend to be introduced by simple programmer mistakes and thus do not require a deep understanding of the code to find. Unfortunately, the majority of bugs, as well as those that are difficult to find, are semantic ones, which violate high-level rules or invariants (e.g., missing a permission check). Thus, it is difficult for code checkers lacking an understanding of a programmer's true intention to reason about semantic correctness.

To solve this problem, we present Juxta, a tool that automatically infers high-level semantics directly from source code. The key idea in Juxta is to compare and contrast multiple existing implementations that obey latent yet implicit high-level semantics. For example, the implementation of open() at the file system layer expects to handle an out-of-space error from the disk in all file systems. We applied Juxta to 54 file systems in the stock Linux kernel (680K LoC), found 118 previously unknown semantic bugs (one bug per 5.8K LoC), and provided corresponding patches to 39 different file systems, including mature, popular ones like ext4, btrfs, XFS, and NFS. These semantic bugs are not easy to locate, as all the ones found by Juxta have existed for over 6.2 years on average. Not only do our empirical results look promising, but the design of Juxta is generic enough to be extended easily beyond file systems to any software that has multiple implementations, like Web browsers or protocols at the same layer of a network stack.

@inproceedings{min:juxta,
  title        = {{Cross-checking Semantic Correctness: The Case of Finding File System Bugs}},
  author       = {Changwoo Min and Sanidhya Kashyap and Byoungyoung Lee and Chengyu Song and Taesoo Kim},
  booktitle    = {Proceedings of the 25th ACM Symposium on Operating Systems Principles (SOSP)},
  month        = oct,
  year         = 2015,
  address      = {Monterey, CA},
}
Type Casting Verification: Stopping an emerging attack vector Byoungyoung Lee, Chengyu Song, Taesoo Kim, and Wenke Lee. USENIX Security 2015.

Many applications such as the Chrome and Firefox browsers are largely implemented in C++ for its performance and modularity. Type casting, which converts an object of one type to another, plays an essential role in enabling polymorphism in C++ because it allows a program to utilize certain general or specific implementations in the class hierarchies. However, if not correctly used, it may return unsafe and incorrectly cast values, leading to so-called bad-casting or type-confusion vulnerabilities. Since a bad-cast pointer violates a programmer's intended pointer semantics and enables an attacker to corrupt memory, bad-casting has critical security implications similar to those of other memory corruption vulnerabilities. Despite the increasing number of bad-casting vulnerabilities, the bad-casting detection problem has not been addressed by the security community.

In this paper, we present CaVer, a runtime bad-casting detection tool. It performs program instrumentation at compile time and uses a new runtime type tracing mechanism, the type hierarchy table, to overcome the limitations of existing approaches and efficiently verify type casting dynamically. In particular, CaVer can be easily and automatically applied to target applications, achieves broader detection coverage, and incurs reasonable runtime overhead. We have applied CaVer to large-scale software including the Chrome and Firefox browsers, and CaVer discovered 11 previously unknown security vulnerabilities: nine in GNU libstdc++ and two in Firefox, all of which have been confirmed and subsequently fixed by vendors. Our evaluation showed that CaVer imposes up to 7.6% and 64.6% overhead for performance-intensive benchmarks on the Chromium and Firefox browsers, respectively.

@inproceedings{lee:caver,
  title        = {{Type Casting Verification}: Stopping an Emerging Attack Vector},
  author       = {Byoungyoung Lee and Chengyu Song and Taesoo Kim and Wenke Lee},
  booktitle    = {Proceedings of the 24th USENIX Security Symposium (Security)},
  month        = aug,
  year         = 2015,
  address      = {Washington, DC},
}
Scalability In The Clouds! A Myth Or Reality? Sanidhya Kashyap, Changwoo Min, and Taesoo Kim. APSys 2015.

With the increasing demand for big-data processing and faster in-memory databases, cloud providers are gearing towards large virtualized instances rather than horizontal scalability.

However, our experiments reveal that such instances in popular cloud services (e.g., a 32-vCPU instance with 208 GB of memory on Google Compute Engine) do not achieve the desired scalability with increasing core count, even for a simple, embarrassingly parallel job (e.g., a kernel compile). More seriously, the internal synchronization scheme (e.g., the paravirtualized ticket spinlock) of a virtualized instance on a machine with a higher core count (e.g., 80 cores) dramatically degrades its overall performance. Our finding is different from the previously well-known scalability problem (lock contention), and occurs because of sophisticated optimization techniques implemented in the hypervisor, which we call the sleepy spinlock anomaly. To solve this problem, we design and implement oticket, a variant of the paravirtualized ticket spinlock that effectively scales virtualized instances in both undersubscribed and oversubscribed environments.

@inproceedings{kashyap:oticket,
  title        = {{Scalability In The Clouds! A Myth Or Reality?}},
  author       = {Sanidhya Kashyap and Changwoo Min and Taesoo Kim},
  booktitle    = {Proceedings of the 6th Asia-Pacific Workshop on Systems (APSys)},
  month        = jul,
  year         = 2015,
  address      = {Tokyo, Japan},
}
Lightweight Application-Level Crash Consistency on Transactional Flash Storage Changwoo Min, Woon-Hak Kang, Taesoo Kim, Sang-Won Lee, and Young Ik Eom. ATC 2015.

Applications implement their own update protocols to ensure consistency of data on the file system. However, since current file systems provide only a preliminary ordering guarantee, notably fsync(), these update protocols become complex, slow, and error-prone.

We present a new file system, CFS, that supports a native interface for applications to maintain crash consistency of their data. Using CFS, applications can achieve crash consistency of data by declaring code regions that must operate atomically. By utilizing transactional flash storage (SSD/X-FTL), CFS implements a lightweight mechanism for crash consistency. Without using any heavyweight mechanisms based on redundant writes and ordering, CFS can atomically write multiple data pages and their relevant metadata to storage.

We made three technical contributions to develop a crash consistency interface with SSD/X-FTL in CFS: selective atomic propagation of dirty pages, in-memory metadata logging, and delayed deallocation. Our evaluation of five real-world applications shows that CFS-based applications significantly outperform their ordering-based versions: 2-5x faster by reducing disk writes 1.9-4.1x and disk cache flushes 1.1-17.6x. Importantly, our porting effort is minimal: CFS requires 317 lines of modifications out of 3.5 million lines of ported applications.

@inproceedings{min:cfs,
  title        = {{Lightweight Application-Level Crash Consistency on Transactional Flash Storage}},
  author       = {Changwoo Min and Woon-Hak Kang and Taesoo Kim and Sang-Won Lee and Young Ik Eom},
  booktitle    = {Proceedings of the 2015 USENIX Annual Technical Conference (ATC)},
  month        = jul,
  year         = 2015,
  address      = {Santa Clara, CA},
}
MetaSync: File synchronization across multiple untrusted storage services Seungyeop Han, Haichen Shen, Taesoo Kim, Arvind Krishnamurthy, Thomas Anderson, and David Wetherall. ATC 2015.

Cloud-based file synchronization services, such as Dropbox, are a worldwide resource for many millions of users. However, individual services often have tight resource limits, suffer from temporary outages or even shutdowns, and sometimes silently corrupt or leak user data.

We design, implement, and evaluate MetaSync, a secure and reliable file synchronization service that uses multiple cloud synchronization services as untrusted storage providers. To make MetaSync work correctly, we devise a novel variant of Paxos that provides efficient and consistent updates on top of the unmodified APIs exported by existing services. Our system automatically redistributes files upon reconfiguration of providers.

Our evaluation shows that MetaSync provides low update latency and high update throughput while being more trustworthy and available. MetaSync outperforms its underlying cloud services by 1.2-10x on three realistic workloads.

@inproceedings{han:metasync,
  title        = {{MetaSync}: File Synchronization Across Multiple Untrusted Storage Services},
  author       = {Seungyeop Han and Haichen Shen and Taesoo Kim and Arvind Krishnamurthy and Thomas Anderson and David Wetherall},
  booktitle    = {Proceedings of the 2015 USENIX Annual Technical Conference (ATC)},
  month        = jul,
  year         = 2015,
  address      = {Santa Clara, CA},
}
Preventing use-after-free with dangling pointers nullification Byoungyoung Lee, Chengyu Song, Yeongjin Jang, Tielei Wang, Taesoo Kim, Long Lu, and Wenke Lee. NDSS 2015.

Many system components and network applications are written in languages that are prone to memory corruption vulnerabilities. There have been countless cases where simple mistakes by developers resulted in memory corruption vulnerabilities and consequently security exploits. While there have been tremendous research efforts to mitigate these vulnerabilities, use-after-free still remains one of the most critical and popular attack vectors because existing proposals have not adequately addressed the challenging program analysis and runtime performance issues.

In this paper, we present DangNull, a system that detects temporal memory safety violations, in particular use-after-free and double-free, during runtime. DangNull relies on the key observation that the root cause of these violations is that pointers are not nullified after the target object is freed. Based on this observation, DangNull automatically traces the object's relationships via pointers and automatically nullifies all pointers when the target object is freed. DangNull offers several benefits. First, DangNull addresses the root cause of temporal memory safety violations. It does not rely on the side effects of violations, which can vary and may be masked by attacks. Thus, DangNull is effective against even the most sophisticated exploitation techniques. Second, DangNull checks object relationship information using runtime object range analysis on pointers, and thus is able to keep track of pointer semantics more robustly even in complex and large-scale software. Lastly, DangNull does not require numerous explicit sanity checks on memory accesses because it can detect a violation with implicit exception handling, and thus its detection capability incurs only moderate performance overhead.

@inproceedings{lee:dangnull,
  title        = {Preventing Use-after-free with Dangling Pointers Nullification},
  author       = {Byoungyoung Lee and Chengyu Song and Yeongjin Jang and Tielei Wang and Taesoo Kim and Long Lu and Wenke Lee},
  booktitle    = {Proceedings of the 2015 Annual Network and Distributed System Security Symposium (NDSS)},
  month        = feb,
  year         = 2015,
  address      = {San Diego, CA},
}

2014

Identifying information disclosure in web applications with retroactive auditing Haogang Chen, Taesoo Kim, Xi Wang, Nickolai Zeldovich, and M. Frans Kaashoek. OSDI 2014.

Rail is a framework for building web applications that can precisely identify inappropriately disclosed data after a vulnerability is discovered. To do so, Rail introduces retroactive disclosure auditing: re-running the application with previous inputs once the vulnerability is fixed to determine what data should have been disclosed. A key challenge for Rail is to reconcile state divergence between the original and replay executions, so that the differences between executions precisely correspond to inappropriately disclosed data. Rail provides application developers with APIs to address this challenge, by identifying sensitive data, assigning semantic names to non-deterministic inputs, and tracking dependencies.

Results from a prototype of Rail built on top of the Meteor framework show that Rail can quickly and precisely identify data disclosure from complex attacks, including programming bugs, administrative mistakes, and stolen passwords. Rail incurs up to 22% throughput overhead and 0.5 KB storage overhead per request. Porting three existing web applications required fewer than 25 lines of code changes per application.

@inproceedings{chen:rail,
  title        = {Identifying information disclosure in web applications with retroactive auditing},
  author       = {Haogang Chen and Taesoo Kim and Xi Wang and Nickolai Zeldovich and M. Frans Kaashoek},
  booktitle    = {Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI)},
  month        = oct,
  year         = 2014,
  address      = {Broomfield, Colorado},
}
Abusing performance optimization weaknesses to bypass ASLR Byoungyoung Lee, Yeongjin Jang, Tielei Wang, Chengyu Song, Long Lu, Taesoo Kim, and Wenke Lee. Black Hat USA 2014.

The primary goal of ASLR is to effectively randomize a program's memory layout so that adversaries cannot easily infer such information. As ASLR is a critical defense against exploitation, there have been tremendous efforts to evaluate the mechanism's security. To date, previous attacks that bypass ASLR have focused mostly on exploiting memory leak vulnerabilities, or abusing non-randomized data structures.

In this presentation, we leverage vulnerabilities introduced by performance-oriented software design to reveal new ways in which ASLR can be bypassed. In addition to describing how vulnerabilities originate from such designs, we will present real attacks that exploit them.

First, we analyze general hash table designs for various programming languages (JavaScript, Python, Ruby). In optimizing object tracking for such languages, their interpreters may leak address information. Some hash table implementations directly store the address information in the table, while others permit inference of address information through repeated table scanning. We exhaustively examined several popular languages to see whether each of them has one or both of these problems, and present how they can be leveraged. As a concrete example, we demonstrate how address information can be leaked in the Safari web browser by simply running some JavaScript.

Second, we present an analysis of the Zygote process creation model, an Android operating system design for speeding up application launches. Our examination shows that Zygote weakens ASLR because all applications are created with largely identical memory layouts. To highlight the severity of this issue, we demonstrate two different ASLR bypass attacks using real applications: Google Chrome and VLC Media Player.

@inproceedings{lee:breakaslr,
  title        = {Abusing Performance Optimization Weaknesses to Bypass {ASLR}},
  author       = {Byoungyoung Lee and Yeongjin Jang and Tielei Wang and Chengyu Song and Long Lu and Taesoo Kim and Wenke Lee},
  booktitle    = {Black Hat USA Briefings (Black Hat USA)},
  year         = 2014,
  month        = aug,
  address      = {Las Vegas, NV},
}
From Zygote to Morula: Fortifying weakened ASLR on Android Byoungyoung Lee, Long Lu, Tielei Wang, Taesoo Kim, and Wenke Lee. IEEE SP 2014.

There have been many research efforts to secure Android applications and high-level system mechanisms. The low-level operating system design has been overlooked, partly due to the belief that security issues at this level are similar to those on Linux, which are well studied. However, we identify certain Android modifications that are at odds with security and result in serious vulnerabilities that need to be addressed immediately.

In this paper, we analyze the Zygote process creation model, an Android operating system design for speeding up application launches. Zygote weakens Address Space Layout Randomization (ASLR) because all application processes are created with largely identical memory layouts.

We design both remote and local attacks capable of bypassing the weakened ASLR and executing return-oriented programming on Android. We demonstrate the attacks using real applications, such as the Chrome Browser and VLC Media Player. Further, we design and implement Morula, a secure replacement for Zygote. Morula introduces a small amount of code to the Android operating system and can be easily adopted by device vendors. Our evaluation shows that, compared to Zygote, Morula incurs a 13 MB memory increase for each running application, but gives each Android process an individually randomized memory layout and even a slightly shorter average launch time.
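
As a quick illustration of why a fork-based creation model defeats ASLR (a sketch under the stated assumptions, not code from the paper), the snippet below forks a child, as Zygote does per app, and checks that the child observes a preexisting object at the same address as its parent:

```python
import os

# A "preloaded" object, standing in for code and data mapped into the
# Zygote template process before any application is forked.
template_obj = object()

def forked_child_address():
    """Fork a child (as Zygote does per app) and return the address the
    child observes for the preloaded object."""
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: report id() (the CPython object address) and exit.
        os.write(w, str(id(template_obj)).encode())
        os._exit(0)
    os.close(w)
    data = os.read(r, 64)
    os.close(r)
    os.waitpid(pid, 0)
    return int(data)

if __name__ == "__main__":
    # Every forked "app" inherits the template's layout, so the address is
    # identical across siblings: one leak compromises all of them.
    assert forked_child_address() == id(template_obj)
```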

@inproceedings{lee:morula,
  title        = {{From Zygote to Morula}: Fortifying Weakened {ASLR} on {Android}},
  author       = {Byoungyoung Lee and Long Lu and Tielei Wang and Taesoo Kim and Wenke Lee},
  booktitle    = {Proceedings of the 35th IEEE Symposium on Security and Privacy (Oakland)},
  month        = may,
  year         = 2014,
  address      = {San Jose, CA},
}

2013

Asynchronous intrusion recovery for interconnected web services Ramesh Chandra, Taesoo Kim, and Nickolai Zeldovich. SOSP 2013.

Recovering from attacks in an interconnected system is difficult, because an adversary that gains access to one part of the system may propagate to many others, and tracking down and recovering from such an attack requires significant manual effort. Web services are an important example of an interconnected system, as they are increasingly using protocols such as OAuth and REST APIs to integrate with one another. This paper presents Aire, an intrusion recovery system for such web services. Aire addresses several challenges, such as propagating repair across services when some servers may be unavailable, and providing appropriate consistency guarantees when not all servers have been repaired yet. Experimental results show that Aire can recover from four realistic attacks, including one modeled after a recent Facebook OAuth vulnerability; that porting existing applications to Aire requires little effort; and that Aire imposes a 19-30% CPU overhead and 6-9 KB/request storage cost for Askbot, an existing web application.

@inproceedings{chandra:aire,
  title        = {Asynchronous Intrusion Recovery for Interconnected Web Services},
  author       = {Ramesh Chandra and Taesoo Kim and Nickolai Zeldovich},
  booktitle    = {Proceedings of the 24th ACM Symposium on Operating Systems Principles (SOSP)},
  month        = nov,
  year         = 2013,
  address      = {Farmington, PA},
}
Security bugs in embedded interpreters Haogang Chen, Cody Cutler, Taesoo Kim, Yandong Mao, Xi Wang, Nickolai Zeldovich, and M. Frans Kaashoek. APSys 2013.

Because embedded interpreters offer flexibility and performance, they are becoming more prevalent, and can be found at nearly every level of the software stack. As one example, the Linux kernel defines languages to describe packet filtering rules and uses embedded interpreters to filter packets at run time. As another example, the RAR archive format allows embedding bytecode in compressed files to describe reversible transformations for decompression. This paper presents an analysis of common pitfalls in embedded interpreter implementations, which can lead to security vulnerabilities, and their impact. We hope that these results are useful both in augmenting existing embedded interpreters and in aiding developers in building new, more secure embedded interpreters.
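
The pitfall class discussed above can be sketched with a toy stack-machine interpreter (the opcodes are hypothetical, not drawn from any interpreter in the paper); the bounds checks below are exactly what vulnerable implementations omit:

```python
# Toy stack machine: untrusted bytecode indexes interpreter state, so every
# operand must be validated before use. Note that in Python, a negative
# index would silently "succeed" without the explicit range check.

def run(bytecode, memory_size=16):
    memory = [0] * memory_size
    stack, pc = [], 0
    while pc < len(bytecode):
        op, arg = bytecode[pc]
        if op == "push":
            stack.append(arg)
        elif op == "load":                       # read memory[arg]
            if not 0 <= arg < memory_size:       # the missing-check pitfall:
                raise ValueError("load out of bounds")
            stack.append(memory[arg])            # without it, bytecode reads
        elif op == "store":                      # outside its sandbox
            if not 0 <= arg < memory_size:
                raise ValueError("store out of bounds")
            memory[arg] = stack.pop()
        elif op == "jmp":                        # unchecked jump targets are
            if not 0 <= arg < len(bytecode):     # the same bug shape
                raise ValueError("jump out of bounds")
            pc = arg
            continue
        pc += 1
    return stack
```

For example, `run([("push", 7), ("store", 3), ("load", 3)])` returns `[7]`, while `run([("load", 99)])` raises rather than leaking out-of-bounds state.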

@inproceedings{chen:vmsec,
  title        = {Security Bugs in Embedded Interpreters},
  author       = {Haogang Chen and Cody Cutler and Taesoo Kim and Yandong Mao and Xi Wang and Nickolai Zeldovich and M. Frans Kaashoek},
  booktitle    = {Proceedings of the 4th Asia-Pacific Workshop on Systems (APSys)},
  month        = jul,
  year         = 2013,
  address      = {Singapore},
}
Optimizing unit test execution in large software programs using dependency analysis Taesoo Kim, Ramesh Chandra, and Nickolai Zeldovich. APSys 2013.

Tao is a system that optimizes the execution of unit tests in large software programs and reduces the programmer wait time from minutes to seconds. Tao is based on two key ideas: First, Tao focuses on efficiency, unlike past work that focused on avoiding false negatives. Tao implements simple and fast function-level dependency tracking that identifies tests to run on a code change; any false negatives missed by this dependency tracking are caught by running the entire test suite on a test server once the code change is committed. Second, to make it easy for programmers to adopt Tao, it incorporates the dependency information into the source code repository. This paper describes an early prototype of Tao and demonstrates that Tao can reduce unit test execution time in two large Python software projects by over 96% while incurring few false negatives.

@inproceedings{kim:tao,
  title        = {Optimizing Unit Test Execution in Large Software Programs using Dependency Analysis},
  author       = {Taesoo Kim and Ramesh Chandra and Nickolai Zeldovich},
  booktitle    = {Proceedings of the 4th Asia-Pacific Workshop on Systems (APSys)},
  month        = jul,
  year         = 2013,
  address      = {Singapore},
}
Practical and effective sandboxing for non-root users Taesoo Kim and Nickolai Zeldovich. ATC 2013.

Mbox is a lightweight sandboxing mechanism for non-root users in commodity OSes. Mbox's sandbox usage model executes a program in the sandbox and prevents the program from modifying the host filesystem by layering the sandbox filesystem on top of the host filesystem. At the end of program execution, the user can examine changes in the sandbox filesystem and selectively commit them back to the host filesystem. Mbox implements this by interposing on system calls and provides a variety of useful applications: installing system packages as a non-root user, running unknown binaries safely without network accesses, checkpointing the host filesystem instantly, and setting up a virtual development environment without special tools. Our performance evaluation shows that Mbox imposes CPU overheads of 0.1–45.2% for various workloads. In this paper, we present Mbox's design, efficient techniques for interposing on system calls, our experience avoiding common system call interposition pitfalls, and Mbox's performance evaluation.
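
Mbox's usage model can be sketched with an in-memory stand-in for the layered filesystem (dicts instead of real system-call interposition; all names here are illustrative):

```python
# Host and sandbox are modeled as path -> contents maps. Writes land in
# the sandbox layer; the host layer is untouched until the user commits.

class Sandbox:
    def __init__(self, host_fs):
        self.host = host_fs            # host filesystem (never written directly)
        self.layer = {}                # sandbox layer: all writes land here
        self.deleted = set()           # whiteouts for files removed in sandbox

    def read(self, path):              # reads fall through to the host layer
        if path in self.deleted:
            raise FileNotFoundError(path)
        return self.layer.get(path, self.host.get(path))

    def write(self, path, data):       # writes never touch the host
        self.deleted.discard(path)
        self.layer[path] = data

    def unlink(self, path):
        self.layer.pop(path, None)
        self.deleted.add(path)

    def commit(self, paths):           # user selectively merges changes back
        for p in paths:
            if p in self.deleted:
                self.host.pop(p, None)
            elif p in self.layer:
                self.host[p] = self.layer[p]
```

After the sandboxed program exits, the user inspects `layer` and `deleted` and commits only the changes they trust.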

@inproceedings{kim:mbox,
  title        = {Practical and Effective Sandboxing for Non-root Users},
  author       = {Taesoo Kim and Nickolai Zeldovich},
  booktitle    = {Proceedings of the 2013 USENIX Annual Technical Conference (ATC)},
  month        = jun,
  year         = 2013,
  address      = {San Jose, CA},
}

2012

Efficient patch-based auditing for web application vulnerabilities Taesoo Kim, Ramesh Chandra, and Nickolai Zeldovich. OSDI 2012.

Poirot is a system that, given a patch for a newly discovered security vulnerability in a web application, helps administrators detect past intrusions that exploited the vulnerability. Poirot records all requests to the server during normal operation, and given a patch, re-executes requests using both patched and unpatched software, and reports to the administrator any request that executes differently in the two cases. A key challenge with this approach is the cost of re-executing all requests, and Poirot introduces several techniques to reduce the time required to audit past requests, including filtering requests based on their control flow and memoization of intermediate results across different requests.

A prototype of Poirot for PHP accurately detects attacks on older versions of MediaWiki and HotCRP, given subsequently released patches. Poirot's techniques allow it to audit past requests 12-51 times faster than the time it took to originally execute the same requests, for patches to code executed by every request, under a realistic MediaWiki workload.
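
The audit loop can be sketched as follows (handler and request types are illustrative; Poirot operates on logged requests to real PHP applications):

```python
# Re-execute each logged request under the unpatched and patched handlers
# and flag any request whose output diverges: it likely exercised the
# vulnerability the patch fixes.

def audit(logged_requests, unpatched_handler, patched_handler,
          touches_patched_code=lambda req: True):
    """Return the requests whose behavior the patch changes."""
    suspicious = []
    memo = {}                              # memoize identical re-executions
    for req in logged_requests:
        if not touches_patched_code(req):  # control-flow filtering: skip
            continue                       # requests that miss the patch
        key = repr(req)
        if key not in memo:
            memo[key] = (unpatched_handler(req), patched_handler(req))
        before, after = memo[key]
        if before != after:
            suspicious.append(req)
    return suspicious
```

The two filters mirror the techniques in the abstract: `touches_patched_code` prunes requests whose control flow never reaches the patch, and `memo` avoids redundant re-execution of identical work.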

@inproceedings{kim:poirot,
  title        = {Efficient Patch-based Auditing for Web Application Vulnerabilities},
  author       = {Taesoo Kim and Ramesh Chandra and Nickolai Zeldovich},
  booktitle    = {Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI)},
  month        = oct,
  year         = 2012,
  address      = {Hollywood, CA},
}
StealthMem: System-level protection against cache-based side channel attacks in the cloud Taesoo Kim, Marcus Peinado, and Gloria Mainar-Ruiz. USENIX Security 2012.

Cloud services are rapidly gaining adoption due to the promises of cost efficiency, availability, and on-demand scaling. To achieve these promises, cloud providers share physical resources to support multi-tenancy of cloud platforms. However, the possibility of sharing the same hardware with potential attackers makes users reluctant to off-load sensitive data into the cloud. Worse yet, researchers have demonstrated side-channel attacks via shared memory caches to break full encryption keys of AES, DES, and RSA.

We present StealthMem, a system-level protection mechanism against cache-based side-channel attacks in the cloud. StealthMem manages a set of locked cache lines per core, which are never evicted from the cache, and efficiently multiplexes them so that each VM can load its own sensitive data into the locked cache lines. Thus, any VM can hide memory access patterns on confidential data from other VMs. Unlike existing state-of-the-art mitigation methods, StealthMem works with existing commodity hardware and does not require profound changes to application software. We also present a novel idea and prototype for isolating cache lines while fully utilizing memory by exploiting architectural properties of set-associative caches. StealthMem imposes a 1.05% performance overhead on the SPEC 2006 CPU benchmark and a 3.4-5.8% overhead on secured AES, DES, and Blowfish, requiring only 3-34 lines of code changes to the original implementations.
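
The page-coloring arithmetic behind this kind of cache partitioning can be sketched as follows (the cache parameters are example values, not StealthMem's actual configuration):

```python
# A page's "color" is the slice of cache sets its lines can occupy,
# determined by the physical-page-number bits that overlap the set index.
# Reserving one color exclusively for locked "stealth" pages means no
# other page can ever map to, and thus evict from, those sets.

def page_color(phys_addr, cache_bytes=2**21, ways=16, line=64, page=4096):
    sets = cache_bytes // (ways * line)   # e.g. 2 MB, 16-way, 64 B: 2048 sets
    colors = (sets * line) // page        # one way's size in pages: 32 colors
    return (phys_addr // page) % colors
```

With these example parameters, all addresses within one page share a color, adjacent pages get different colors, and the colors repeat every 32 pages (one cache way), which is what makes color-based partitioning possible.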

@inproceedings{kim:stealthmem,
  title        = {{StealthMem}: System-Level Protection Against Cache-based Side Channel Attacks in the Cloud},
  author       = {Taesoo Kim and Marcus Peinado and Gloria Mainar-Ruiz},
  booktitle    = {Proceedings of the 21st USENIX Security Symposium (Security)},
  month        = aug,
  year         = 2012,
  address      = {Bellevue, WA},
}
Recovering from intrusions in distributed systems with Dare Taesoo Kim, Ramesh Chandra, and Nickolai Zeldovich. APSys 2012.

Dare is a system that recovers system integrity after intrusions that spread between machines in a distributed system. Dare extends the rollback-and-reexecute recovery model of Retro to distributed system recovery by solving the following challenges: tracking dependencies across machines, repairing network connections, minimizing distributed repair, and dealing with long-running daemon processes. This paper describes an early prototype of Dare, presents some preliminary results, and discusses open problems.

@inproceedings{kim:dare,
  title        = {Recovering from Intrusions in Distributed Systems with {Dare}},
  author       = {Taesoo Kim and Ramesh Chandra and Nickolai Zeldovich},
  booktitle    = {Proceedings of the 3rd Asia-Pacific Workshop on Systems (APSys)},
  month        = jul,
  year         = 2012,
  address      = {Seoul, South Korea},
}

2011

Intrusion recovery for database-backed web applications Ramesh Chandra, Taesoo Kim, Meelap Shah, Neha Narula, and Nickolai Zeldovich. SOSP 2011.

Warp is a system that helps users and administrators of web applications recover from intrusions such as SQL injection, cross-site scripting, and clickjacking attacks, while preserving legitimate user changes. Warp repairs from an intrusion by rolling back parts of the database to a version before the attack, and replaying subsequent legitimate actions. Warp allows administrators to retroactively patch security vulnerabilities (i.e., apply new security patches to past executions) to recover from intrusions without requiring the administrator to track down or even detect attacks. Warp's time-travel database allows fine-grained rollback of database rows, and enables repair to proceed concurrently with normal operation of a web application. Finally, Warp captures and replays user input at the level of a browser's DOM, to recover from attacks that involve a user's browser. For a web server running MediaWiki, Warp requires no application source code changes to recover from a range of common web application vulnerabilities with minimal user input at a cost of 24-27% in throughput and 2-3.2 GB/day in storage.
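
The time-travel database's per-row versioning can be sketched as follows (an illustrative in-memory model, not Warp's implementation):

```python
# Every write is versioned per row, so a single row can be rolled back to
# a pre-attack time while unrelated rows keep serving normal traffic.

class TimeTravelTable:
    def __init__(self):
        self.versions = {}                 # key -> list of (timestamp, value)

    def put(self, key, value, ts):
        self.versions.setdefault(key, []).append((ts, value))

    def get(self, key, ts=float("inf")):
        """Value of `key` as of time `ts` (fine-grained, per-row reads)."""
        rows = [v for t, v in self.versions.get(key, []) if t <= ts]
        return rows[-1] if rows else None

    def rollback(self, key, ts):
        """Discard versions of one row written after `ts`; other rows are
        untouched, so repair can proceed concurrently with normal operation."""
        self.versions[key] = [(t, v) for t, v in self.versions.get(key, [])
                              if t <= ts]
```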

@inproceedings{chandra:warp,
  title        = {Intrusion Recovery for Database-backed Web Applications},
  author       = {Ramesh Chandra and Taesoo Kim and Meelap Shah and Neha Narula and Nickolai Zeldovich},
  booktitle    = {Proceedings of the 23rd ACM Symposium on Operating Systems Principles (SOSP)},
  month        = oct,
  year         = 2011,
  address      = {Cascais, Portugal},
}

2010

Intrusion recovery using selective re-execution Taesoo Kim, Xi Wang, Nickolai Zeldovich, and M. Frans Kaashoek. OSDI 2010.

Retro repairs a desktop or server after an adversary compromises it, by undoing the adversary's changes while preserving legitimate user actions, with minimal user involvement. During normal operation, Retro records an action history graph, which is a detailed dependency graph describing the system's execution. Retro uses refinement to describe graph objects and actions at multiple levels of abstraction, which allows for precise dependencies. During repair, Retro uses the action history graph to undo an unwanted action and its indirect effects by first rolling back its direct effects, and then re-executing legitimate actions that were influenced by that change. To minimize user involvement and re-execution, Retro uses predicates to selectively re-execute only actions that were semantically affected by the adversary's changes, and uses compensating actions to handle external effects.

An evaluation of a prototype of Retro for Linux with 2 real-world attacks, 2 synthesized challenge attacks, and 6 attacks from previous work, shows that Retro can often repair the system without user involvement, and avoids false positives and negatives from previous solutions. These benefits come at the cost of 35-127% in execution time overhead and of 4-150 GB of log space per day, depending on the workload. For example, a HotCRP paper submission web site incurs 35% slowdown and generates 4 GB of logs per day under the workload from 30 minutes prior to the SOSP 2007 deadline.
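
The action history graph and rollback-and-reexecute repair can be sketched in miniature. Note the simplification: real Retro works at the OS level and uses checkpoints and predicates to skip unaffected actions, while this toy replays all legitimate actions and uses `affected` only to compute which ones the attack actually influenced.

```python
# Each logged action records which objects it read and wrote, forming a
# dependency graph over the system's execution.

class History:
    def __init__(self):
        self.state, self.log = {}, []      # object -> value, ordered actions

    def record(self, name, reads, writes, fn):
        """Log an action's dependencies, then execute it."""
        self.log.append((name, set(reads), set(writes), fn))
        fn(self.state)

    def affected(self, bad):
        """Transitive closure of actions influenced by the attack's writes."""
        tainted, hit = set(), set()
        for name, reads, writes, _ in self.log:
            if name == bad or reads & tainted:
                tainted |= writes          # this action's outputs are suspect
                hit.add(name)
        return hit

    def repair(self, bad):
        """Undo the attack: roll back, drop it, replay legitimate actions."""
        self.log = [a for a in self.log if a[0] != bad]
        self.state = {}                    # roll back to the initial snapshot
        for _, _, _, fn in self.log:
            fn(self.state)                 # re-execute legitimate actions
```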

@inproceedings{kim:retro,
  title        = {Intrusion Recovery Using Selective Re-execution},
  author       = {Taesoo Kim and Xi Wang and Nickolai Zeldovich and M. Frans Kaashoek},
  booktitle    = {Proceedings of the 9th USENIX Symposium on Operating Systems Design and Implementation (OSDI)},
  month        = oct,
  year         = 2010,
  address      = {Vancouver, Canada},
}
Making Linux protection mechanisms egalitarian with UserFS Taesoo Kim and Nickolai Zeldovich. USENIX Security 2010.

UserFS provides egalitarian OS protection mechanisms in Linux. UserFS allows any user, not just the system administrator, to allocate Unix user IDs, to use chroot, and to set up firewall rules in order to confine untrusted code. One key idea in UserFS is representing user IDs as files in a /proc-like file system, thus allowing applications to manage user IDs like any other files, by setting permissions and passing file descriptors over Unix domain sockets. UserFS addresses several challenges in making user IDs egalitarian, including accountability, resource allocation, persistence, and UID reuse. We have ported several applications to take advantage of UserFS; by changing just tens to hundreds of lines of code, we prevented attackers from exploiting application-level vulnerabilities, such as code injection or missing ACL checks in a PHP-based wiki application. Implementing UserFS requires minimal changes to the Linux kernel (a single 3,000-line kernel module) and incurs no performance overhead for most operations, making it practical to deploy on real systems.
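
The descriptor-passing mechanism UserFS builds on can be demonstrated directly (Unix only, Python 3.9+; here the transferred descriptor names an ordinary file, whereas UserFS passes descriptors for its user-ID files):

```python
import os
import socket
import tempfile

def pass_fd():
    """Hand a file descriptor across a Unix domain socket, the same
    primitive UserFS uses to delegate user-ID files between processes."""
    parent, child = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with tempfile.TemporaryFile() as f:
        f.write(b"secret capability")
        f.flush()
        f.seek(0)
        # Grant access by sending the descriptor itself (SCM_RIGHTS).
        socket.send_fds(parent, [b"fd"], [f.fileno()])
        msg, fds, _, _ = socket.recv_fds(child, 16, 1)
        data = os.read(fds[0], 64)       # receiver can use the fd directly
        os.close(fds[0])
    parent.close()
    child.close()
    return msg, data
```

Because the kernel duplicates the descriptor for the receiver, possession of the fd acts as a transferable capability, which is exactly why UserFS can treat user IDs "like any other files".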

@inproceedings{kim:userfs,
  title        = {Making {Linux} Protection Mechanisms Egalitarian with {UserFS}},
  author       = {Taesoo Kim and Nickolai Zeldovich},
  booktitle    = {Proceedings of the 19th USENIX Security Symposium (Security)},
  month        = aug,
  year         = 2010,
  address      = {Washington, DC},
}