Techniques For Efficient Binary-Level Coverage Analysis

  • Code coverage analysis plays an important role in the software testing process. More recently, the remarkable effectiveness of coverage feedback has triggered a broad interest in feedback-guided fuzzing. In this work, we discuss static instrumentation techniques for binary-level coverage analysis without compiler support. We show that the proposed techniques are precise, efficient, and transparent significantly beyond the state of the art. We implement these techniques into two tools, namely, Spedi and bcov. Both tools are open source and publicly available. Spedi shows that the disassembly and function identification of stripped binaries can be highly accurate without resort to any external information. We build on these important results in bcov where we statically instrument x86-64 ELF binaries to track code coverage. However, improving efficiency and scaling to large real-world software required an orchestrated effort combining several techniques. First, we bring a well-known probe pruning technique, for the first time, to binary-level instrumentation and effectively leverage its notion of superblocks to reduce overhead. Second, we introduce sliced microexecution, a robust technique for jump table analysis which improves CFG precision and enables us to instrument jump table entries. Additionally, smaller instructions in x86-64 pose a challenge for inserting detours. To address this challenge, we aggressively exploit padding bytes. Also, we introduce a greedy scheme to systematically host detours in neighboring basic blocks. We evaluate bcov on a corpus of 95 binaries compiled from eight popular and well-tested packages like FFmpeg and LLVM. Two instrumentation policies, with different edge-level precision, are used to patch all functions in this corpus - over 1.6 million functions. Our precise policy has average performance and memory overheads of 14% and 22%, respectively. Instrumented binaries do not introduce any test regressions. The reported coverage is highly accurate with an average F-score of 99.86%. Finally, our jump table analysis is comparable to that of IDA Pro on gcc binaries and outperforms it on clang binaries. Our work demonstrates that static instrumentation can offer unique advantages in comparison to established methods like compiler instrumentation and dynamic binary instrumentation. It also opens the door for many interesting applications of static instrumentation, which can go well beyond coverage analysis.

Volltext Dateien herunterladen

Metadaten exportieren

Metadaten
Verfasser*innenangaben:Mohamed Ammar Ben Khadra
URN:urn:nbn:de:hbz:386-kluedo-64105
DOI:https://doi.org/10.26204/KLUEDO/6410
Betreuer*in:Wolfgang Kunz
Dokumentart:Dissertation
Sprache der Veröffentlichung:Englisch
Datum der Veröffentlichung (online):15.06.2021
Jahr der Erstveröffentlichung:2021
Veröffentlichende Institution:Technische Universität Kaiserslautern
Titel verleihende Institution:Technische Universität Kaiserslautern
Datum der Annahme der Abschlussarbeit:21.05.2021
Datum der Publikation (Server):16.06.2021
Freies Schlagwort / Tag:binary analysis; code coverage analysis; jump table analysis; machine code analysis; probe pruning; static instrumentation
GND-Schlagwort:software; analysis; reverse; engineering
Seitenzahl:X, 112
Fachbereiche / Organisatorische Einheiten:Kaiserslautern - Fachbereich Elektrotechnik und Informationstechnik
CCS-Klassifikation (Informatik):D. Software / D.2 SOFTWARE ENGINEERING (K.6.3) / D.2.0 General (K.5.1)
DDC-Sachgruppen:0 Allgemeines, Informatik, Informationswissenschaft / 004 Informatik
Lizenz (Deutsch):Creative Commons 4.0 - Namensnennung (CC BY 4.0)