For the past decade, attackers have preferred to package malware in Microsoft Office file formats, particularly Word and Excel. In fact, in Q1 2022 nearly half (45%) of malware stopped by HP Wolf Security used Office formats. The reasons are clear: users are familiar with these file types, the applications used to open them are ubiquitous, and they are suited to social engineering lures.
In this post, we look at a malware campaign isolated by HP Wolf Security earlier this year that had an unusual infection chain. The malware arrived in a PDF document – a format attackers less commonly use to infect PCs – and relied on several tricks to evade detection, such as embedding malicious files, loading remotely-hosted exploits, and encrypting shellcode.
Figure 1 – Alert timeline in HP Wolf Security Controller showing the malware being isolated.
PDF Campaign Delivering Snake Keylogger
A PDF document named “REMMITANCE INVOICE.pdf” was sent as an email attachment to a target. Since the document came from a risky vector – email, in this case – when the user opened it, HP Sure Click ran the file in an isolated micro virtual machine, preventing their system from being infected.
After opening the document, Adobe Reader prompts the user to open a .docx file. The attackers sneakily named the Word document “has been verified. However PDF, Jpeg, xlsx, .docx” to make it look as though the file name was part of the Adobe Reader prompt (Figure 2).
Figure 2 – PDF document prompting the user to open another document.
Analyzing the PDF file reveals that the .docx file is stored as an EmbeddedFile object. Investigators can quickly summarize the most important properties of a PDF document using Didier Stevens’ pdfid script (Figure 3).
Figure 3 – PDFiD analysis of the document.
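To illustrate the kind of triage pdfid performs, here is a minimal Python sketch that counts a handful of PDF name tokens in the raw bytes of a file. It is not pdfid itself: the keyword list is a small subset we picked for illustration, and the sketch does not handle names obfuscated with #xx hex escapes.

```python
import re

# Illustrative subset of names worth a closer look during triage.
# This list is an assumption of the sketch, not pdfid's full list.
SUSPICIOUS_KEYWORDS = [
    b"/EmbeddedFile",
    b"/OpenAction",
    b"/AA",
    b"/JavaScript",
    b"/JS",
    b"/Launch",
]


def count_pdf_keywords(data):
    """Count how often each keyword appears as a complete PDF name."""
    counts = {}
    for keyword in SUSPICIOUS_KEYWORDS:
        # The lookahead stops "/JS" from matching inside "/JSFoo" and
        # "/EmbeddedFile" from matching inside "/EmbeddedFiles".
        pattern = re.escape(keyword) + rb"(?![A-Za-z0-9#])"
        counts[keyword.decode("ascii")] = len(re.findall(pattern, data))
    return counts
```

Calling count_pdf_keywords on the bytes of a PDF returns a dictionary of counts; a non-zero /EmbeddedFile count is the cue to look for an attached file.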
To analyze the EmbeddedFile, we can use another tool from Didier Stevens’ toolbox, pdf-parser. This script allows us to extract the file from the PDF document and save it to disk.
Figure 4 – Using pdf-parser to save embedded file to disk.
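For readers who want to see roughly what that extraction involves, the sketch below carves stream bodies out of a PDF and keeps those that look like ZIP archives, since a .docx file is a ZIP container. It is a simplified stand-in for pdf-parser and assumes the stream is either stored raw or compressed with /FlateDecode; other filters, object streams and encrypted PDFs are not handled.

```python
import re
import zlib

# Stream bodies sit between the "stream" and "endstream" keywords.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)
ZIP_MAGIC = b"PK\x03\x04"  # OOXML files such as .docx are ZIP archives


def carve_zip_streams(pdf_bytes):
    """Return stream bodies that decode to something starting with a ZIP header."""
    found = []
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # Assumption: the stream uses /FlateDecode (zlib).
            body = zlib.decompressobj().decompress(raw)
        except zlib.error:
            body = raw  # fall back to treating the stream as uncompressed
        if body.startswith(ZIP_MAGIC):
            found.append(body)
    return found
```

Each returned item can be written to disk with a .docx extension and examined further in an isolated analysis environment.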
Embedded Word Document
If we return to our PDF document and click on “Open this file” at the prompt, Microsoft Word opens. If Protected View is disabled, Word downloads a Rich Text Format (.rtf) file from a web server, which is then run in the context of the open document.
Figure 5 – Word document contacting web server.
Since Microsoft Word does not say which server it contacted, we can use Wireshark to record the network traffic and identify the HTTP stream that was created (Figure 6).
Figure 6 – HTTP GET request returning RTF file.
Let’s switch back to the Word document to understand how it downloads the .rtf. Since it is an OOXML (Office Open XML) file, we can unzip its contents and look for URLs in the document using the command shown in Figure 7.
Figure 7 – List of URLs in the Word document.
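The same search can be scripted. The following Python sketch opens the document as a ZIP archive, collects every URL in every member file, and then filters out hosts that routinely appear in Office XML as namespace identifiers. The list of known hosts is our own assumption for illustration and is not exhaustive.

```python
import re
import zipfile
from urllib.parse import urlparse

URL_RE = re.compile(rb"https?://[^\s\"'<>]+")

# Hosts that commonly appear in OOXML namespace declarations.
# Assumption of this sketch; extend it for your own corpus.
KNOWN_HOSTS = (
    "schemas.openxmlformats.org",
    "schemas.microsoft.com",
    "www.w3.org",
    "purl.org",
)


def list_docx_urls(docx_path):
    """Map each URL in the archive to the set of member files containing it."""
    urls = {}
    with zipfile.ZipFile(docx_path) as archive:
        for name in archive.namelist():
            for match in URL_RE.findall(archive.read(name)):
                url = match.decode("ascii", "replace")
                urls.setdefault(url, set()).add(name)
    return urls


def unusual_urls(urls):
    """Return URLs whose host is not in the known namespace list."""
    return sorted(u for u in urls if urlparse(u).hostname not in KNOWN_HOSTS)
```

Passing the result of list_docx_urls to unusual_urls leaves only the URLs that deserve a closer look, along with the archive member each one came from.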
The highlighted URL caught our eye because it is not one of the legitimate domains normally found in Office documents. This URL is in the document.xml.rels file, which lists the document’s relationships. The relationship in question shows an external Object Linking and Embedding (OLE) object being loaded from this URL (Figure 8).
Figure 8 – XML document relationships.
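Relationships that point outside the package are marked with TargetMode="External", so they are easy to list programmatically. The sketch below parses word/_rels/document.xml.rels and returns every external relationship. Ordinary hyperlinks are also external, so the relationship type (for example one ending in /oleObject) is what separates a link from a remotely loaded object.

```python
import xml.etree.ElementTree as ET
import zipfile

# Namespace used by Open Packaging Conventions relationship parts.
RELS_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def external_relationships(docx_path, rels_name="word/_rels/document.xml.rels"):
    """Return (Id, Type, Target) for each relationship with an external target."""
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read(rels_name))
    results = []
    for rel in root.iter(RELS_NS + "Relationship"):
        if rel.get("TargetMode") == "External":
            results.append((rel.get("Id"), rel.get("Type"), rel.get("Target")))
    return results
```

Any entry whose type ends in /oleObject and whose target is a remote URL matches the behavior seen in this campaign.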
External OLE Object
Connecting to this URL leads to a redirect and then downloads an RTF document called f_document_shp.doc. To examine this document more closely, we can use rtfobj to check if it contains any OLE objects.
Figure 9 – RTFObj output showing two OLE objects.
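As a rough illustration of what rtfobj looks for, the sketch below finds \objdata groups in an RTF file and hex-decodes their contents. It is far less robust than rtfobj: it assumes the hex data directly follows the control word and is not interrupted by nested groups or other control words, which malicious documents often use as obfuscation. The OLE2 magic check is only a quick heuristic.

```python
import re

# Embedded objects are stored as hex text after the \objdata control word.
OBJDATA_RE = re.compile(rb"\\objdata\s*([0-9A-Fa-f\s]+)")
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # compound file header


def carve_objdata(rtf_bytes):
    """Return the decoded bytes of each \\objdata blob found in the RTF."""
    blobs = []
    for match in OBJDATA_RE.finditer(rtf_bytes):
        hex_text = re.sub(rb"\s+", b"", match.group(1))
        if len(hex_text) % 2:
            hex_text = hex_text[:-1]  # drop a dangling half-byte
        blobs.append(bytes.fromhex(hex_text.decode("ascii")))
    return blobs


def has_ole2_header(blob):
    """Heuristic: does the blob contain an OLE2 compound file signature?"""
    return OLE2_MAGIC in blob
```

If this simple approach returns nothing for a file that rtfobj reports objects in, that is itself a hint that the RTF is deliberately malformed.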
Here there are two OLE objects, which we can save to disk using the same tool. As indicated in the console output, neither object is well-formed, meaning analyzing them with oletools could lead to confusing results. To fix this, we can first reconstruct the malformed objects. Then we can view basic information about the objects using oleid. This tells us the object relates to Microsoft Equation Editor, a feature in Word that is commonly exploited by attackers to run arbitrary code.
Figure 10 – Basic OLE information extracted with oleid.
Encrypted Equation Editor Exploit
Examining the OLE object reveals shellcode that exploits the CVE-2017-11882 remote code execution vulnerability in Equation Editor. There are many analyses of this vulnerability, so we won’t analyze it in detail. Instead we focus below on how the attacker encrypted the shellcode to evade detection.
Figure 11 – Shellcode that exploits CVE-2017-11882.
The shellcode is stored in the OLENativeStream structure at the end of the object. We can run the shellcode in a debugger, looking for a call to GlobalLock. This function returns a pointer to the first byte of the memory block, and calling it is a technique shellcode uses to locate itself in memory. Using this information, the shellcode jumps to a defined offset and runs a decryption routine.
Figure 12 – Multiplication and addition part of decryption routine.
At each iteration, the key is updated by multiplying it by a constant and then performing an addition. The ciphertext is then decrypted with an XOR operation using the updated key. The decrypted data is more shellcode, which is executed afterwards.
Figure 13 – Decrypted shellcode presenting the payload URL.
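The general shape of such a routine can be sketched in a few lines of Python. This is a generic illustration of a multiply-add key schedule combined with XOR, not a reimplementation of the sample: the seed, multiplier and increment are hypothetical parameters, and the 32-bit key width and byte-wise XOR are our assumptions.

```python
def decrypt(ciphertext, seed, multiplier, increment):
    """Decrypt with a multiply-and-add key schedule followed by XOR.

    All three numeric parameters are hypothetical; the real values would
    be read from the disassembled routine.
    """
    key = seed & 0xFFFFFFFF
    plaintext = bytearray()
    for byte in ciphertext:
        # Update the key: multiply by a constant, add, keep 32 bits
        # (assumed register width).
        key = (key * multiplier + increment) & 0xFFFFFFFF
        # XOR the ciphertext byte with the low byte of the key (assumed).
        plaintext.append(byte ^ (key & 0xFF))
    return bytes(plaintext)
```

Because XOR is its own inverse, running the same function over the output with the same parameters restores the input, which makes it easy to check an implementation against bytes dumped from the debugger.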
Without running it further, we see that the malware downloads an executable called fresh.exe and runs it in the public user directory using ShellExecuteExW. The executable is Snake Keylogger, a family of information-stealing malware that we have written about before. We can now extract indicators of compromise (IOCs) from this malware, for example using dynamic analysis. At this point, we have analyzed the complete infection chain and collected IOCs, which can now be used for threat hunts or building new detections.
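Some indicators can also be collected statically. As a complement to dynamic analysis, the sketch below pulls URLs out of a decrypted buffer, covering both ASCII and UTF-16LE strings, and computes a SHA-256 hash for a sample. The regular expressions are deliberately simple and are our own assumption about what counts as a URL.

```python
import hashlib
import re

# Printable ASCII URL, e.g. a payload location inside decrypted shellcode.
ASCII_URL = re.compile(rb"https?://[\x21-\x7e]{4,}")
# The same pattern for UTF-16LE ("wide") strings used by W-suffixed APIs.
WIDE_URL = re.compile(
    rb"h\x00t\x00t\x00p\x00(?:s\x00)?:\x00/\x00/\x00(?:[\x21-\x7e]\x00){4,}"
)


def extract_urls(buffer):
    """Return the sorted, de-duplicated URLs found in a byte buffer."""
    urls = [m.decode("ascii") for m in ASCII_URL.findall(buffer)]
    urls += [m.decode("utf-16-le") for m in WIDE_URL.findall(buffer)]
    return sorted(set(urls))


def sha256_hex(data):
    """Return the SHA-256 digest of a sample as a hex string."""
    return hashlib.sha256(data).hexdigest()
```

The resulting URLs and hashes can be fed directly into a threat hunt or a detection rule.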
While Office formats remain popular, this campaign shows how attackers are also using weaponized PDF documents to infect systems. Embedding files, loading remotely-hosted exploits and encrypting shellcode are just three techniques attackers use to run malware under the radar. The exploited vulnerability in this campaign (CVE-2017-11882) is over four years old, yet continues to be used, suggesting the exploit remains effective for attackers.
has been verified. however pdf, jpeg, xlsx, .docx
fresh.exe (Snake Keylogger)
External OLE reference URL
External OLE reference final URL
Snake Keylogger payload URL
Snake Keylogger exfiltration via SMTP