Recently I was building a Xamarin application as a capture-the-flag exercise. In the past, I’ve noted that the C# DLLs for Xamarin end up in the assemblies/ directory within an APK file. This time, however, I noticed that there were only two files present in that directory: assemblies.blob and assemblies.manifest. A quick Google search indicated that this format is a recent change to how Xamarin packs DLLs, and the tooling (other than the official Xamarin C# utilities) hasn’t caught up. In this short post, I’ll talk about my experiences with these two files and link to a Python utility that can be used to unpack assemblies.blob files. If you’d rather just use the tool and move on, it can be found here.
Understanding the “assemblies.manifest”
The assemblies.manifest file is an ASCII file that lists the names, IDs, and other metadata of the Xamarin DLL files. Its structure is described here and a basic parser can be found here. As we’ll find out in the next section, the only really useful piece of data in this file is the Name field, as (at least in my testing) there is no DLL name data within the assemblies.blob.
I was also curious about the first two fields: Hash 32 and Hash 64. My initial suspicion was that these might be checksums or an integrity check on the respective DLL. After digging through the Xamarin C# code, it was clear that was not the case. These fields are actually far less interesting from a reverse-engineering standpoint, as they’re just the output of the xxHash non-cryptographic hashing algorithm (source here) applied to the DLL’s name. Unless for some reason you’re changing the names of DLLs or adding DLLs to the AssemblyStore, these values can be ignored. There are plenty of xxHash bindings (Python, C#, Ruby) should you need to set or reset these fields.
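To make the manifest layout concrete, here is a minimal parsing sketch. It assumes the file is a whitespace-separated table with a header row followed by one row per assembly in the column order Hash 32, Hash 64, Blob ID, Blob idx, Name; that column layout and the names ManifestEntry and parse_manifest are illustrative assumptions, not taken from the Xamarin sources.

```python
from typing import NamedTuple


class ManifestEntry(NamedTuple):
    hash32: int
    hash64: int
    blob_id: int
    blob_idx: int
    name: str


def parse_manifest(text: str) -> list[ManifestEntry]:
    """Parse assemblies.manifest text (assumed five-column layout)."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly five columns; the header row and blank
        # lines split into a different number of tokens and are skipped.
        if len(fields) != 5:
            continue
        hash32, hash64, blob_id, blob_idx, name = fields
        entries.append(
            ManifestEntry(
                hash32=int(hash32, 16),  # hex string such as 0xa2e0939b
                hash64=int(hash64, 16),
                blob_id=int(blob_id),
                blob_idx=int(blob_idx),
                name=name,
            )
        )
    return entries
```

Sorting the result by blob_idx gives the list of names in the order needed to label the DLLs later on.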
Let’s move on to the more exciting of the two files.
Understanding the “assemblies.blob”
assemblies.blob is where the real excitement is, and it requires a little more analysis, as it’s a binary structure. At the top level, the parser here describes the layout, but it also references additional classes. I’ll refer to this structure as an AssemblyStore for the rest of the post. The header for the AssemblyStore is 20 bytes and includes the magic XABA, the version (currently 1), and the number of included assembly files.
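As a rough illustration of reading that header, the sketch below unpacks the 20 bytes with Python's struct module. The split into a 4-byte magic followed by four little-endian 32-bit integers, and the names of the last two fields, are my assumptions about the layout; only the magic, the version, and the assembly count are described above.

```python
import struct
from typing import NamedTuple

# Assumed layout: 4-byte magic followed by four little-endian uint32 values.
HEADER_FORMAT = "<4sIIII"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 20 bytes


class StoreHeader(NamedTuple):
    magic: bytes
    version: int
    local_entry_count: int   # number of assemblies in this blob
    global_entry_count: int  # assumed field name
    store_id: int            # assumed field name


def parse_header(blob: bytes) -> StoreHeader:
    """Read and validate the AssemblyStore header at the start of the blob."""
    if len(blob) < HEADER_SIZE:
        raise ValueError("blob is too short to contain a header")
    header = StoreHeader(*struct.unpack_from(HEADER_FORMAT, blob, 0))
    if header.magic != b"XABA":
        raise ValueError(f"unexpected magic: {header.magic!r}")
    return header
```

Checking the magic first is a cheap way to bail out early if the APK uses a different packing format.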
Immediately following the header is a series of structures defined by the confusingly named AssemblyStoreAssembly class, described here. For each included assembly file, a 24-byte structure will be present. These don’t have IDs but appear to be serialized in index order (e.g., the first structure is index 0, the next is index 1, etc.). The DataOffset and DataSize fields are both important here, as they tell you where in the assemblies.blob file the DLL lives and how many bytes to extract. In my testing, the remaining fields were all zeroed out, so I’m not going to talk about them.
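A minimal sketch of reading these 24-byte structures follows. It assumes each one is six little-endian 32-bit integers with the data offset and size first; the names given to the four remaining fields are guesses on my part, since they were zeroed out in testing.

```python
import struct
from typing import NamedTuple

HEADER_SIZE = 20
# Assumed layout: six little-endian uint32 values per assembly.
DESCRIPTOR_FORMAT = "<6I"
DESCRIPTOR_SIZE = struct.calcsize(DESCRIPTOR_FORMAT)  # 24 bytes


class AssemblyDescriptor(NamedTuple):
    data_offset: int         # where the DLL starts in the blob
    data_size: int           # how many bytes to extract
    debug_data_offset: int   # assumed name; zero in testing
    debug_data_size: int     # assumed name; zero in testing
    config_data_offset: int  # assumed name; zero in testing
    config_data_size: int    # assumed name; zero in testing


def parse_descriptors(blob: bytes, count: int) -> list[AssemblyDescriptor]:
    """Read `count` descriptors that start right after the header."""
    descriptors = []
    for index in range(count):
        offset = HEADER_SIZE + index * DESCRIPTOR_SIZE
        values = struct.unpack_from(DESCRIPTOR_FORMAT, blob, offset)
        descriptors.append(AssemblyDescriptor(*values))
    return descriptors
```

The position of each descriptor in the returned list is its index, which is what ties it back to a row in the manifest.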
Next comes the hash data for each assembly. The structure is 20 bytes and maps to the Hash 32 and Hash 64 values in the assemblies.manifest above. You’ll see one 20-byte structure per assembly for the Hash 32 values, followed by the same number of 20-byte structures for the Hash 64 values. There’s really no need to dig into these sections unless you intend to add assemblies to the AssemblyStore or change names, but details on the structure can be found here.
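For completeness, here is a hedged sketch of walking the two hash tables. It assumes each 20-byte entry is a 64-bit hash value followed by three 32-bit integers, all little-endian, with 32-bit hashes stored in the low bits of the 64-bit slot; the field names are illustrative and should be checked against the linked structure before relying on them.

```python
import struct
from typing import NamedTuple

HEADER_SIZE = 20
DESCRIPTOR_SIZE = 24
# Assumed layout: uint64 hash followed by three uint32 values.
HASH_ENTRY_FORMAT = "<QIII"
HASH_ENTRY_SIZE = struct.calcsize(HASH_ENTRY_FORMAT)  # 20 bytes


class HashEntry(NamedTuple):
    hash_value: int
    mapping_index: int      # assumed field name
    local_store_index: int  # assumed field name
    store_id: int           # assumed field name


def parse_hash_tables(blob: bytes, count: int):
    """Return (hash32_entries, hash64_entries), each with `count` items."""

    def read_table(start: int) -> list[HashEntry]:
        return [
            HashEntry(*struct.unpack_from(
                HASH_ENTRY_FORMAT, blob, start + i * HASH_ENTRY_SIZE))
            for i in range(count)
        ]

    # The Hash 32 table follows the descriptors; Hash 64 follows Hash 32.
    hash32_start = HEADER_SIZE + DESCRIPTOR_SIZE * count
    hash64_start = hash32_start + HASH_ENTRY_SIZE * count
    return read_table(hash32_start), read_table(hash64_start)
```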
The rest of the data in the assemblies.blob is the actual DLL content, which, combined with the assemblies.manifest data, we can use to extract and name the associated DLLs.
Extracting from the assemblies.blob
Putting it all together, we can first parse the assemblies.manifest to build a list of assemblies by name and index, then iterate through the AssemblyStoreAssembly structures of the assemblies.blob to extract each DLL under the correct name. That’s exactly what unpack-store.py does. Once extracted, you may also need to decompress these DLLs, as it appears Xamarin is using LZ4 compression to pack them. That’s no worry, because there is already an existing tool to accomplish this. From here, you can pop the unpacked DLLs into your favorite C# reversing tool (such as dnSpy or ILSpy) to perform your analysis.
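The extraction loop can be sketched in a few lines. This is not the code from unpack-store.py; it is a simplified illustration that assumes the assembly count is the third 32-bit field of the header, that each 24-byte structure starts with the data offset and size, and that manifest names need a .dll suffix appended. It does not attempt LZ4 decompression.

```python
import struct

HEADER_SIZE = 20
DESCRIPTOR_SIZE = 24


def extract_assemblies(blob: bytes, names: list[str]) -> dict[str, bytes]:
    """Map '<name>.dll' to raw bytes; `names` must be ordered by blob index."""
    # Assumed header prefix: magic, version, assembly count.
    magic, _version, count = struct.unpack_from("<4sII", blob, 0)
    if magic != b"XABA":
        raise ValueError(f"unexpected magic: {magic!r}")

    extracted = {}
    for index in range(count):
        # Only the first two fields of each structure are needed here.
        offset, size = struct.unpack_from(
            "<II", blob, HEADER_SIZE + index * DESCRIPTOR_SIZE)
        if offset + size > len(blob):
            raise ValueError(f"assembly {index} extends past end of blob")
        # Fall back to a placeholder if the manifest has no matching name.
        name = names[index] if index < len(names) else f"unknown_{index}"
        extracted[name + ".dll"] = blob[offset:offset + size]
    return extracted
```

Each value in the returned dictionary can be written straight to disk and then passed to the decompression tool if needed.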
An exercise that I haven’t performed yet, but could see value in, is modifying the content of one of the unpacked DLLs. Of course, you could use a dynamic instrumentation framework like Frida for this, but in cases where that doesn’t work, it seems very possible to reverse the steps described above: modify an unpacked DLL using dnSpy, pack it with LZ4, re-package the AssemblyStore object, serialize it to a new assemblies.blob, and rebuild the APK. I can implement this functionality in a future release if people would find it useful.
And that’s it! I hope you found this write-up useful. If the format has changed since this was written, don’t hesitate to reach out and correct me.