Techniques For Efficient Binary-Level Coverage Analysis
Code coverage analysis plays an important role in the software testing process. More recently, the remarkable effectiveness of coverage feedback has triggered broad interest in feedback-guided fuzzing. In this work, we discuss static instrumentation techniques for binary-level coverage analysis without compiler support. We show that the proposed techniques are precise, efficient, and transparent, significantly beyond the state of the art. We implement these techniques in two tools, namely Spedi and bcov. Both tools are open source and publicly available.

Spedi shows that the disassembly and function identification of stripped binaries can be highly accurate without resorting to any external information. We build on these important results in bcov, where we statically instrument x86-64 ELF binaries to track code coverage. However, improving efficiency and scaling to large real-world software required an orchestrated effort combining several techniques. First, we bring a well-known probe pruning technique, for the first time, to binary-level instrumentation and effectively leverage its notion of superblocks to reduce overhead. Second, we introduce sliced microexecution, a robust technique for jump table analysis that improves CFG precision and enables us to instrument jump table entries. Additionally, smaller instructions in x86-64 pose a challenge for inserting detours. To address this challenge, we aggressively exploit padding bytes. We also introduce a greedy scheme to systematically host detours in neighboring basic blocks.

We evaluate bcov on a corpus of 95 binaries compiled from eight popular and well-tested packages like FFmpeg and LLVM. Two instrumentation policies, with different edge-level precision, are used to patch all functions in this corpus, over 1.6 million functions in total. Our precise policy has average performance and memory overheads of 14% and 22%, respectively. Instrumented binaries do not introduce any test regressions. The reported coverage is highly accurate, with an average F-score of 99.86%. Finally, our jump table analysis is comparable to that of IDA Pro on gcc binaries and outperforms it on clang binaries. Our work demonstrates that static instrumentation can offer unique advantages in comparison to established methods like compiler instrumentation and dynamic binary instrumentation. It also opens the door for many interesting applications of static instrumentation, which can go well beyond coverage analysis.
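
For readers unfamiliar with the accuracy metric quoted above, the following is a minimal sketch of how an F-score over coverage data could be computed. The abstract does not state the unit of comparison or the source of the ground truth, so the choice of basic-block addresses and the name `coverage_f_score` are assumptions made purely for illustration; this is not taken from bcov.

```python
from __future__ import annotations


def coverage_f_score(reported: set[int], ground_truth: set[int]) -> float:
    """Harmonic mean of precision and recall over covered block addresses.

    Both arguments are hypothetical inputs: sets of basic-block start
    addresses marked as covered by the tool under test and by a reference.
    """
    if not reported and not ground_truth:
        return 1.0  # nothing was covered and nothing was reported: full agreement
    true_positives = len(reported & ground_truth)
    if true_positives == 0:
        return 0.0  # no overlap at all
    precision = true_positives / len(reported)      # share of reported blocks that are correct
    recall = true_positives / len(ground_truth)     # share of truly covered blocks that were reported
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    truth = {0x1000, 0x1010, 0x1020, 0x1030}
    reported = {0x1000, 0x1010, 0x1020, 0x1040}  # one miss, one false report
    print(f"{coverage_f_score(reported, truth):.2f}")
```

In this toy input three of four blocks agree in each direction, so precision and recall are both 0.75 and the F-score is 0.75 as well.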
Author: | Mohamed Ammar Ben Khadra |
---|---|
URN: | urn:nbn:de:hbz:386-kluedo-64105 |
DOI: | https://doi.org/10.26204/KLUEDO/6410 |
Advisor: | Wolfgang Kunz |
Document Type: | Doctoral Thesis |
Language of publication: | English |
Date of Publication (online): | 2021/06/15 |
Year of first Publication: | 2021 |
Publishing Institution: | Technische Universität Kaiserslautern |
Granting Institution: | Technische Universität Kaiserslautern |
Acceptance Date of the Thesis: | 2021/05/21 |
Date of the Publication (Server): | 2021/06/16 |
Tag: | binary analysis; code coverage analysis; jump table analysis; machine code analysis; probe pruning; static instrumentation |
GND Keyword: | software; analysis; reverse; engineering |
Page Number: | X, 112 |
Faculties / Organisational entities: | Kaiserslautern - Fachbereich Elektrotechnik und Informationstechnik |
CCS-Classification (computer science): | D. Software / D.2 SOFTWARE ENGINEERING (K.6.3) / D.2.0 General (K.5.1) |
DDC-Classification: | 0 General works, computer science, information science / 004 Computer science |
Licence (German): | Creative Commons 4.0 - Attribution (CC BY 4.0) |