Techniques For Efficient Binary-Level Coverage Analysis

Code coverage analysis plays an important role in the software testing process. More recently, the remarkable effectiveness of coverage feedback has triggered a broad interest in feedback-guided fuzzing. In this work, we discuss static instrumentation techniques for binary-level coverage analysis without compiler support. We show that the proposed techniques are significantly more precise, efficient, and transparent than the state of the art. We implement these techniques in two tools, Spedi and bcov. Both tools are open source and publicly available.

Spedi shows that the disassembly and function identification of stripped binaries can be highly accurate without resorting to any external information. We build on these important results in bcov, where we statically instrument x86-64 ELF binaries to track code coverage. However, improving efficiency and scaling to large real-world software required an orchestrated effort combining several techniques. First, we bring a well-known probe pruning technique, for the first time, to binary-level instrumentation and effectively leverage its notion of superblocks to reduce overhead. Second, we introduce sliced microexecution, a robust technique for jump table analysis which improves CFG precision and enables us to instrument jump table entries. Additionally, smaller instructions in x86-64 pose a challenge for inserting detours. To address this challenge, we aggressively exploit padding bytes. We also introduce a greedy scheme to systematically host detours in neighboring basic blocks.

We evaluate bcov on a corpus of 95 binaries compiled from eight popular and well-tested packages like FFmpeg and LLVM. Two instrumentation policies, with different edge-level precision, are used to patch all functions in this corpus, over 1.6 million functions in total. Our precise policy has average performance and memory overheads of 14% and 22%, respectively. Instrumented binaries do not introduce any test regressions. The reported coverage is highly accurate, with an average F-score of 99.86%. Finally, our jump table analysis is comparable to that of IDA Pro on gcc binaries and outperforms it on clang binaries. Our work demonstrates that static instrumentation can offer unique advantages in comparison to established methods like compiler instrumentation and dynamic binary instrumentation. It also opens the door for many interesting applications of static instrumentation, which can go well beyond coverage analysis.
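
To make the accuracy figure concrete, the following is a minimal sketch of how an F-score for a coverage report could be computed. It assumes the standard definition (harmonic mean of precision and recall) applied to sets of covered basic-block addresses; the abstract does not spell out the thesis's exact evaluation procedure, and the function name and the addresses below are hypothetical.

```python
from __future__ import annotations


def coverage_f_score(reported: set[int], reference: set[int]) -> float:
    """F-score of reported covered-block addresses against a reference set.

    Assumption: standard F1 (harmonic mean of precision and recall).
    """
    true_pos = len(reported & reference)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(reported)   # share of reported blocks that are correct
    recall = true_pos / len(reference)     # share of reference blocks that were reported
    return 2 * precision * recall / (precision + recall)


# Hypothetical basic-block addresses, made up for illustration.
reference = {0x1000, 0x1010, 0x1024, 0x1030}
reported = {0x1000, 0x1010, 0x1024, 0x1040}
print(f"{coverage_f_score(reported, reference):.2f}")  # prints 0.75
```

In this toy case one block is missed and one is falsely reported, so precision and recall are both 0.75, and so is their harmonic mean.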


Author: Mohamed Ammar Ben Khadra
Advisor: Wolfgang Kunz
Document Type: Doctoral Thesis
Language of publication: English
Date of Publication (online): 2021/06/15
Year of first Publication: 2021
Publishing Institution: Technische Universität Kaiserslautern
Granting Institution: Technische Universität Kaiserslautern
Acceptance Date of the Thesis: 2021/05/21
Date of the Publication (Server): 2021/06/16
Tag: binary analysis; code coverage analysis; jump table analysis; machine code analysis; probe pruning; static instrumentation
GND Keyword: software; analysis; reverse; engineering
Page Number: X, 112
Faculties / Organisational entities: Kaiserslautern - Fachbereich Elektrotechnik und Informationstechnik
CCS-Classification (computer science): D. Software / D.2 SOFTWARE ENGINEERING (K.6.3) / D.2.0 General (K.5.1)
DDC-Classification: 0 Allgemeines, Informatik, Informationswissenschaft / 004 Informatik
Licence (German): Creative Commons 4.0 - Namensnennung (CC BY 4.0)