In this tutorial, we will cover:
- PyLingual Internals: PyLingual is the first decompilation framework for Python that leverages NLP components.
- Perfect Decompilation: PyLingual strictly verifies the correctness of the decompiled results with instruction-level comparisons.

The tutorial has three parts:
1. Introduction to PyLingual
We will begin with an overview of PyLingual, including its architecture and key components. PyLingual is accessible through an online web service (https://pylingual.io/) and through an open-source GitHub repository to run locally (https://github.com/syssec-utd/pylingual). We will cover the steps and considerations for setting up a local environment.
2. Perfect Decompilation with PyLingual
Because PyLingual uses NLP techniques, its decompilation results can be incomplete or contain errors. PyLingual therefore checks every result automatically with a strict instruction-level equivalence test to identify perfect decompilations. The PyLingual web service integrates this verification mechanism into a web IDE, so reverse engineers can review and correct imperfect decompilations. We will demonstrate this workflow using real-world malware samples.
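To make the idea of an instruction-level equivalence test concrete, the following is a minimal sketch built only on Python's standard `dis` module. It is not PyLingual's implementation: the helper names `instruction_stream` and `equivalent` are hypothetical, and the sketch assumes that both the original and the decompiled source can be compiled by the same interpreter version.

```python
import dis


def instruction_stream(code):
    """Flatten a code object and its nested code objects into (opname, argrepr) pairs."""
    stream = []
    for ins in dis.get_instructions(code):
        if hasattr(ins.argval, "co_code"):
            # The repr of a nested code object embeds a memory address and a
            # file name, so replace it with a stable placeholder.
            stream.append((ins.opname, "<code>"))
        else:
            stream.append((ins.opname, ins.argrepr))
    # Recurse into functions, classes, and comprehensions stored as constants.
    for const in code.co_consts:
        if hasattr(const, "co_code"):
            stream.extend(instruction_stream(const))
    return stream


def equivalent(original_src, decompiled_src):
    """Hypothetical check: do both sources compile to the same instructions?"""
    original = compile(original_src, "<sample>", "exec")
    candidate = compile(decompiled_src, "<sample>", "exec")
    return instruction_stream(original) == instruction_stream(candidate)


original = "def add(a, b):\n    # sum two values\n    return a + b\n"
perfect = "def add(a, b):\n    return a + b\n"      # comment lost, same bytecode
imperfect = "def add(a, b):\n    return a - b\n"    # wrong operator

print(equivalent(original, perfect))
print(equivalent(original, imperfect))
```

The sketch ignores line numbers and comments, which do not change the emitted instructions, while any difference in an opcode or its argument makes the comparison fail. A candidate that fails is exactly the kind of imperfect decompilation a reverse engineer would then review and correct by hand.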
3. Demystifying Obfuscated Python Binaries
We will use the PyLingual web service to explore common obfuscation techniques observed in the wild. The session will also introduce practical methods for analyzing and deobfuscating obfuscated Python bytecode.
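As an illustration of what peeling an obfuscation layer can look like, here is a small sketch under stated assumptions. The packing scheme (a marshalled code object that is zlib-compressed, base64-encoded, and passed to `exec`) is chosen purely for illustration and is not one the tutorial names; `make_packed` and `unpack_layer` are hypothetical helpers. The unpacker reads the loader with `ast` and never executes it.

```python
import ast
import base64
import dis
import marshal
import zlib


def make_packed(source):
    """Build a toy packed loader so the example is self-contained."""
    code = compile(source, "<payload>", "exec")
    blob = base64.b64encode(zlib.compress(marshal.dumps(code)))
    return (
        "import base64, marshal, zlib\n"
        f"exec(marshal.loads(zlib.decompress(base64.b64decode({blob!r}))))\n"
    )


def unpack_layer(packed_source):
    """Recover the hidden code object statically, without running the loader."""
    tree = ast.parse(packed_source)
    for node in ast.walk(tree):
        # The payload is assumed to be the only bytes literal in the loader.
        if isinstance(node, ast.Constant) and isinstance(node.value, bytes):
            raw = zlib.decompress(base64.b64decode(node.value))
            # marshal data is tied to the interpreter version that wrote it.
            return marshal.loads(raw)
    raise ValueError("no bytes literal found in loader")


packed = make_packed("print('hello')\n")
payload = unpack_layer(packed)
dis.dis(payload)  # inspect the recovered bytecode instead of executing it
```

The marshal format is specific to the Python version that produced it, and the `marshal` module is not hardened against malicious input, so real samples are best unpacked inside an isolated analysis environment with a matching interpreter.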