
CVE-2018-8611 Exploiting Windows KTM Part 2/5 – Patch analysis and basic triggering

04 May 2020

by Aaron Adams


TL;DR

Now that we have a basic understanding of KTM functions and the KTM subsystem, as described in part 1 of our blog series, let’s take a look at the root cause of the vulnerability. We start by using Diaphora to diff the Windows 7 patch and the Windows 10 patch. At that point we hadn’t yet decided which version to analyze in depth, and we wanted to confirm that the vulnerability was similar on the latest version of Windows and on an older one. Then we go over a really useful IDA feature, available since 7.2, called "Shifted Pointers", which we haven’t seen detailed much in other research, and show how it helps us reverse KTM code efficiently. Finally, we analyze how to reach the vulnerable code and explain why we decided to trigger the vulnerability with the help of the WinDbg debugger, both to confirm our understanding of the vulnerability and to work out how we would approach exploitation.

Diving into the patch

Patch diffing

The CVE-2018-8611 vulnerability was patched in December 2018.

It is often useful to diff a patch on multiple Windows versions for the following reasons:

  • Some new features are implemented in some versions but not others

  • Some functions are inlined in some versions but not others

  • A patch is in one module in some versions but in a different one in others

Consequently, a patch is sometimes easier to understand on one Windows version than on another. Once you have understood the patch, you can generally port that understanding to another version easily. In our case the important functional changes on Windows 10 are the same as on Windows 7, so this was not particularly useful here, but the practice is worth keeping in mind when doing this type of diffing work.

On Windows 7 x64 SP1, we noted that KB4471328 was released to replace KB4467107, so these are the two versions we reversed and diffed with Diaphora. We use KB4471328 rather than KB4471318 because it contains fewer files, which generally makes it easier to see which file(s) changed:

Since there is no tm.sys on Windows 7, we only analyse ntoskrnl.exe. The generation of .sqlite for each ntoskrnl.exe took 20 minutes and the actual diff took 3 minutes. Note that after you generate the .sqlite for each version, you need to do the actual diff from the old version IDB in order to get the expected result (green color for new code and red for old code). We get the following relevant partial matches:

We see a number of interesting functions, but the most notable is TmRecoverResourceManager().

One small thing to note: as the name TmRecoverResourceManager() indicates, KTM-related functions are identifiable by a Tm or Tmp function name prefix. Tm most probably stands for "Transaction Manager". We don’t know for sure what the p in Tmp stands for, but it could indicate ‘private’, as these functions are generally only called internally by other KTM functions. The majority of functions that have a TmXXX name are wrapped by a corresponding NtXXX system call, whereas TmpYYY functions are all internal functions called by TmXXX functions and have no directly corresponding system call.

Recall that the Kaspersky blog said the exploit was calling NtRecoverResourceManager() while trying to trigger the bug. We are therefore pretty certain TmRecoverResourceManager() is our candidate, especially because NtRecoverResourceManager() calls directly into TmRecoverResourceManager():

There are about 4 pages worth of assembly-level changes in TmRecoverResourceManager() similar to the image below:

So the fix is not something simple like an improved length check, which would have implied a simple integer overflow or similar (which we knew from Kaspersky’s analysis anyway).

If we use the Hex-Rays decompiler view for the diff instead, we get a much better picture. The output below shows the only changes in the TmRecoverResourceManager() function. Even though we don’t analyse the changes yet, it already shows how valuable Hex-Rays is compared to the previous assembly diff:

Note that the changes above are for a naked IDB i.e. without our reversing and cleaning up of the Hex-Rays output. The next 2 screenshots show the diff where we actually spent significant time cleaning up the IDB outputs and making them more readable:

Note that you would need to re-do the diff using Diaphora after cleaning up both versions of your binary for best results.

On Windows 10, the diff methodology is very similar, except that we use tm.sys instead of ntoskrnl.exe. Since tm.sys is a lot smaller than ntoskrnl.exe, the diff is a lot faster, and the only changed function it reports is TmRecoverResourceManager(), which confirms our understanding from Windows 7.

Tidying Hex-Rays output with "Shifted Pointers"

This is a diversion from the vulnerability, and you can skip this part if you are not interested in internal features of IDA Pro/Hex-Rays.

Shifted pointers?

In order to get the decompilation shown above looking clear, we heavily used a feature called "Shifted Pointers" in Hex-Rays to help clean up the output of the related functions, and thought it was worth describing in case people were not too familiar with it. It was the first time we used this feature. Shifted pointers are briefly documented here. This functionality was added in IDA 7.2 and is a bit more documented here.

We ran into an annoyance while analyzing some KTM functionality: numerous pointers are assigned the address of an internal member of a structure (somewhere in the middle of the structure), and that pointer is then used to access other structure members relative to the position of the member it points to.

As described in part 1 of our blog series, _KRESOURCEMANAGER.EnlistmentHead.Flink points at the NextSameRm member of a _KENLISTMENT. How do we make use of that when reversing?

Initial Hex-Rays output

Initially, we only renamed variables in Hex-Rays, which gave us the following output that is fairly confusing to read. For example, look at v9 and v6 and the accesses made through them below, marked [aX] and [bX] respectively:

  __int64 __fastcall TmRecoverResourceManager(_KRESOURCEMANAGER *pResMgr)
  {
    _KRESOURCEMANAGER *pResMgr_; // r12
    char v2; // r14
    _KTM *Tm; // rax
    _LIST_ENTRY *EnlistmentHead_ptr; // r13
    _LIST_ENTRY *curr_Enlistment_list_entry; // rdi
    _LIST_ENTRY *v6; // rdi
    int v7; // er15
    unsigned int v8; // er14
    __int64 v9; // rsi
    // ...
  
    pResMgr_ = pResMgr;
    v2 = 0;
    Src_4 = 0;
    KeWaitForSingleObject(&pResMgr->Mutex, Executive, 0, 0, 0i64);
    if ( pResMgr_->State == 1 )
      pResMgr_->State = 2;
    Tm = pResMgr_->Tm;
    if ( Tm && Tm->State == 3 )
    {
      EnlistmentHead_ptr = &pResMgr_->EnlistmentHead;
      curr_Enlistment_list_entry = pResMgr_->EnlistmentHead.Flink;
      while ( curr_Enlistment_list_entry != EnlistmentHead_ptr )
      {
[a1]    v9 = &curr_Enlistment_list_entry[-9].Blink;
        curr_Enlistment_list_entry = curr_Enlistment_list_entry->Flink;
[a2]    if ( !(*(v9 + 0xAC) & 4) )
        {
[a3]      KeWaitForSingleObject((v9 + 64), Executive, 0, 0, 0i64);
[a4]      *(v9 + 172) |= 0x80u;
[a5]      KeReleaseMutex((v9 + 64), 0);
        }
      }
[b1]  v6 = EnlistmentHead_ptr->Flink;
      v7 = v19;
[b2]  while ( v6 != EnlistmentHead_ptr )
      {
[b3]    if ( BYTE4(v6[2].Flink) & 4 )
        {
[b4]      v6 = v6->Flink;
        }
        else
        {
[b5]      ObfReferenceObject(&v6[-9].Blink);
[b6]      KeWaitForSingleObject(&v6[-5].Blink, Executive, 0, 0, 0i64);
          v10 = 0;
[b7]      if ( (HIDWORD(v6[2].Flink) & 0x80u) != 0 )
          {
[b8]        v11 = HIDWORD(v6[2].Flink) & 1;
[b9]        if ( v11 && ((v12 = v6[1].Blink[12].Flink, v12 == 3) || v12 == 4) )
            {
              v10 = 1;
              v7 = 2048;
            }
[b10]       else if ( !v11 && LODWORD(v6[1].Blink[12].Flink) == 5 || (v13 = v6[1].Blink[12].Flink, v13 == 4) || v13 == 3 )
            {
              v10 = 1;
              v7 = 256;
            }
[b11]       HIDWORD(v6[2].Flink) &= 0xFFFFFF7F;
          }

Dealing with shifted pointers

In the excerpt above, we see lots of magic and negative offsets for various pointers.

The first one we are interested in is v9 which depends on curr_Enlistment_list_entry.

    curr_Enlistment_list_entry = pResMgr_->EnlistmentHead.Flink;
    while ( curr_Enlistment_list_entry != EnlistmentHead_ptr )
    {
      v9 = &curr_Enlistment_list_entry[-9].Blink;

EnlistmentHead is a _LIST_ENTRY member of the _KRESOURCEMANAGER type. Its Flink and Blink members point to another _LIST_ENTRY which, judging by the name EnlistmentHead, we guess is embedded in a _KENLISTMENT. We confirmed this guess by using the !pool command on one of the list entry pointers to make sure that the address lives inside a _KENLISTMENT structure in the non-paged pool. To find all the types whose name contains ENLIST, we use the WinDbg dt command with wildcards:

1: kd> dt nt!_KRESOURCEMANAGER
nt!_KRESOURCEMANAGER
   +0x000 NotificationAvailable : _KEVENT
   +0x018 cookie           : Uint4B
   +0x01c State            : _KRESOURCEMANAGER_STATE
   +0x020 Flags            : Uint4B
   +0x028 Mutex            : _KMUTANT
   +0x060 NamespaceLink    : _KTMOBJECT_NAMESPACE_LINK
   +0x088 RmId             : _GUID
   +0x098 NotificationQueue : _KQUEUE
   +0x0d8 NotificationMutex : _KMUTANT
   +0x110 EnlistmentHead   : _LIST_ENTRY
   +0x120 EnlistmentCount  : Uint4B
   +0x128 NotificationRoutine : Ptr64     long 
   +0x130 Key              : Ptr64 Void
   +0x138 ProtocolListHead : _LIST_ENTRY
   +0x148 PendingPropReqListHead : _LIST_ENTRY
   +0x158 CRMListEntry     : _LIST_ENTRY
   +0x168 Tm               : Ptr64 _KTM
   +0x170 Description      : _UNICODE_STRING
   +0x180 Enlistments      : _KTMOBJECT_NAMESPACE
   +0x228 CompletionBinding : _KRESOURCEMANAGER_COMPLETION_BINDING
1: kd> dt nt!*ENLIST*
          ntkrnlmp!_KENLISTMENT_STATE
          ntkrnlmp!_KENLISTMENT
          ntkrnlmp!_KENLISTMENT_HISTORY

Another important thing, which is relatively common with linked lists, is that Flink and Blink don’t point to the base of the _KENLISTMENT. As the excerpt below shows, they must point to one of NextSameTx or NextSameRm. We don’t yet know which, so let us examine the types and offsets to find out:

1: kd> dt nt!_KENLISTMENT
   +0x000 cookie           : Uint4B
   +0x008 NamespaceLink    : _KTMOBJECT_NAMESPACE_LINK
   +0x030 EnlistmentId     : _GUID
   +0x040 Mutex            : _KMUTANT
   +0x078 NextSameTx       : _LIST_ENTRY
   +0x088 NextSameRm       : _LIST_ENTRY
   +0x098 ResourceManager  : Ptr64 _KRESOURCEMANAGER
   +0x0a0 Transaction      : Ptr64 _KTRANSACTION
   +0x0a8 State            : _KENLISTMENT_STATE
   +0x0ac Flags            : Uint4B
   +0x0b0 NotificationMask : Uint4B
   +0x0b8 Key              : Ptr64 Void
   +0x0c0 KeyRefCount      : Uint4B
   +0x0c8 RecoveryInformation : Ptr64 Void
   +0x0d0 RecoveryInformationLength : Uint4B
   +0x0d8 DynamicNameInformation : Ptr64 Void
   +0x0e0 DynamicNameInformationLength : Uint4B
   +0x0e8 FinalNotification : Ptr64 _KTMNOTIFICATION_PACKET
   +0x0f0 SupSubEnlistment : Ptr64 _KENLISTMENT
   +0x0f8 SupSubEnlHandle  : Ptr64 Void
   +0x100 SubordinateTxHandle : Ptr64 Void
   +0x108 CrmEnlistmentEnId : _GUID
   +0x118 CrmEnlistmentTmId : _GUID
   +0x128 CrmEnlistmentRmId : _GUID
   +0x138 NextHistory      : Uint4B
   +0x13c History          : [20] _KENLISTMENT_HISTORY

We know curr_Enlistment_list_entry points to an offset of either 0x78 or 0x88 inside _KENLISTMENT.

We then look at the assembly of the v9 assignment to determine the relative offset from 0x78 or 0x88 inside _KENLISTMENT.

PAGE:0000000140321A76 ; 36:       v9 = &curr_Enlistment_list_entry[-9].Blink;    // 2
PAGE:0000000140321A76
PAGE:0000000140321A76 loc_140321A76:                          ; CODE XREF: TmRecoverResourceManager+8F↑j
PAGE:0000000140321A76                 lea     rsi, [rdi-88h]

From this we immediately deduce that curr_Enlistment_list_entry points to offset 0x88 (NextSameRm), since subtracting 0x88 makes v9 point to the beginning of the _KENLISTMENT entry.

So we define v9 as a _KENLISTMENT* and curr_Enlistment_list_entry as a shifted pointer inside _KENLISTMENT: _KENLISTMENT_NextSameRm_ptr curr_Enlistment_NextSameRm_ptr. We add typedef _LIST_ENTRY *__shifted(_KENLISTMENT,0x88) _KENLISTMENT_NextSameRm_ptr; into the "Local Type" window.

The above declaration means that _KENLISTMENT_NextSameRm_ptr is a pointer to _LIST_ENTRY and if we decrement it by 0x88 bytes, we will end up at the beginning of _KENLISTMENT.

We can confirm that the shifted pointer is accurate, since we go from this hard-to-read code:

    if ( Tm && Tm->State == 3 )
    {
      EnlistmentHead_ptr = &pResMgr_->EnlistmentHead;
      curr_Enlistment_list_entry = pResMgr_->EnlistmentHead.Flink;
      while ( curr_Enlistment_list_entry != EnlistmentHead_ptr )
      {
[a1]    v9 = &curr_Enlistment_list_entry[-9].Blink;
        curr_Enlistment_list_entry = curr_Enlistment_list_entry->Flink;
[a2]    if ( !(*(v9 + 0xAC) & 4) )
        {
[a3]      KeWaitForSingleObject((v9 + 64), Executive, 0, 0, 0i64);
[a4]      *(v9 + 172) |= 0x80u;
[a5]      KeReleaseMutex((v9 + 64), 0);
        }
      }

to the improved code representation:

    if ( Tm && Tm->State == 3 )
    {
      EnlistmentHead_addr = &pResMgr_->EnlistmentHead;
      curr_Enlistment_NextSameRm_ptr = pResMgr_->EnlistmentHead.Flink;
      while ( curr_Enlistment_NextSameRm_ptr != EnlistmentHead_addr )
      {
[a1]    curr_Enlistment = ADJ(curr_Enlistment_NextSameRm_ptr);
        curr_Enlistment_NextSameRm_ptr = ADJ(curr_Enlistment_NextSameRm_ptr)->NextSameRm.Flink;
[a2]    if ( !(curr_Enlistment->Flags & 4) )
        {
[a3]      KeWaitForSingleObject(&curr_Enlistment->Mutex, Executive, 0, 0, 0i64);
[a4]      curr_Enlistment->Flags |= 0x80u;
[a5]      KeReleaseMutex(&curr_Enlistment->Mutex, 0);
        }
      }

There is a similar assignment to get v6, so we again define it as a _KENLISTMENT_NextSameRm_ptr and rename it to curr_Enlistment_NextSameRm_ptr_. We go from this hard-to-read code:

[b1]  v6 = EnlistmentHead_addr->Flink;
      v7 = v19;
[b2]  while ( v6 != EnlistmentHead_addr )
      {
[b3]    if ( BYTE4(v6[2].Flink) & 4 )
        {
[b4]      v6 = v6->Flink;
        }
        else
        {
[b5]      ObfReferenceObject(&v6[-9].Blink);
[b6]      KeWaitForSingleObject(&v6[-5].Blink, Executive, 0, 0, 0i64);
          v10 = 0;
[b7]      if ( (HIDWORD(v6[2].Flink) & 0x80u) != 0 )
          {
[b8]        v11 = HIDWORD(v6[2].Flink) & 1;
[b9]        if ( v11 && ((v12 = v6[1].Blink[12].Flink, v12 == 3) || v12 == 4) )
            {
              v10 = 1;
              v7 = 2048;
            }
[b10]       else if ( !v11 && LODWORD(v6[1].Blink[12].Flink) == 5
                   || (v13 = v6[1].Blink[12].Flink, v13 == 4)
                   || v13 == 3 )
            {
              v10 = 1;
              v7 = 256;

to the improved code representation:

[b1]  curr_Enlistment_NextSameRm_ptr_ = EnlistmentHead_addr->Flink;
      v7 = v19;
[b2]  while ( curr_Enlistment_NextSameRm_ptr_ != EnlistmentHead_addr )
      {
[b3]    if ( ADJ(curr_Enlistment_NextSameRm_ptr_)->Flags & 4 )
        {
[b4]      curr_Enlistment_NextSameRm_ptr_ = ADJ(curr_Enlistment_NextSameRm_ptr_)->NextSameRm.Flink;
        }
        else
        {
[b5]      ObfReferenceObject(ADJ(curr_Enlistment_NextSameRm_ptr_));
[b6]      KeWaitForSingleObject(&ADJ(curr_Enlistment_NextSameRm_ptr_)->Mutex, Executive, 0, 0, 0i64);
          v10 = 0;
[b7]      if ( (ADJ(curr_Enlistment_NextSameRm_ptr_)->Flags & 0x80u) != 0 )
          {
[b8]        v11 = ADJ(curr_Enlistment_NextSameRm_ptr_)->Flags & 1;
[b9]        if ( v11 && ((v12 = ADJ(curr_Enlistment_NextSameRm_ptr_)->Transaction->State, v12 == 3) || v12 == 4) )
            {
              v10 = 1;
              v7 = 2048;
            }
[b10]       else if ( !v11 && ADJ(curr_Enlistment_NextSameRm_ptr_)->Transaction->State == 5
                   || (v13 = ADJ(curr_Enlistment_NextSameRm_ptr_)->Transaction->State, v13 == 4)
                   || v13 == 3 )
            {
              v10 = 1;
              v7 = 256;

For those unfamiliar with shifted pointers, note that the ADJ() wrapper around the variables in the excerpt above is added by the Hex-Rays decompiler to indicate that the variable being accessed is based on a shifted pointer.

Resulting Hex-Rays output

Below is the previously shown Hex-Rays output cleaned up using the shifted pointer approach:

    _KENLISTMENT_NextSameRm_ptr curr_Enlistment_NextSameRm_ptr; // rdi
    _KENLISTMENT_NextSameRm_ptr curr_Enlistment_NextSameRm_ptr_; // rdi
    _KENLISTMENT *curr_Enlistment; // rsi
    
    ...
    
    if ( Tm && Tm->State == 3 )
    {
      EnlistmentHead_addr = &pResMgr_->EnlistmentHead;
      curr_Enlistment_NextSameRm_ptr = pResMgr_->EnlistmentHead.Flink;// 1
      while ( curr_Enlistment_NextSameRm_ptr != EnlistmentHead_addr )
      {
[a1]    curr_Enlistment = ADJ(curr_Enlistment_NextSameRm_ptr);// 2
        curr_Enlistment_NextSameRm_ptr = ADJ(curr_Enlistment_NextSameRm_ptr)->NextSameRm.Flink;
[a2]    if ( !(curr_Enlistment->Flags & 4) )
        {
[a3]      KeWaitForSingleObject(&curr_Enlistment->Mutex, Executive, 0, 0, 0i64);
[a4]      curr_Enlistment->Flags |= 0x80u;
[a5]      KeReleaseMutex(&curr_Enlistment->Mutex, 0);
        }
      }
[b1]  curr_Enlistment_NextSameRm_ptr_ = EnlistmentHead_addr->Flink;// 3
      v7 = v19;
[b2]  while ( curr_Enlistment_NextSameRm_ptr_ != EnlistmentHead_addr )
      {
[b3]    if ( ADJ(curr_Enlistment_NextSameRm_ptr_)->Flags & 4 )
        {
[b4]      curr_Enlistment_NextSameRm_ptr_ = ADJ(curr_Enlistment_NextSameRm_ptr_)->NextSameRm.Flink;
        }
        else
        {
[b5]      ObfReferenceObject(ADJ(curr_Enlistment_NextSameRm_ptr_));
[b6]      KeWaitForSingleObject(&ADJ(curr_Enlistment_NextSameRm_ptr_)->Mutex, Executive, 0, 0, 0i64);
          v10 = 0;
[b7]      if ( (ADJ(curr_Enlistment_NextSameRm_ptr_)->Flags & 0x80u) != 0 )
          {
[b8]        v11 = ADJ(curr_Enlistment_NextSameRm_ptr_)->Flags & 1;
[b9]        if ( v11 && ((v12 = ADJ(curr_Enlistment_NextSameRm_ptr_)->Transaction->State, v12 == 3) || v12 == 4) )
            {
              v10 = 1;
              v7 = 2048;
            }
[b10]       else if ( !v11 && ADJ(curr_Enlistment_NextSameRm_ptr_)->Transaction->State == 5
                   || (v13 = ADJ(curr_Enlistment_NextSameRm_ptr_)->Transaction->State, v13 == 4)
                   || v13 == 3 )
            {
              v10 = 1;
              v7 = 256;
            }
[b11]       ADJ(curr_Enlistment_NextSameRm_ptr_)->Flags &= 0xFFFFFF7F;
          }

Hopefully the above shows that pursuing such fixups is worthwhile, and will make reversing a much more enjoyable and speedy experience.

Patch analysis

Once we have a nicely cleaned up IDB and understand what all the structures and states being analyzed actually are, we take a look at the vulnerability Microsoft fixed.

Vulnerable code

The vulnerable code in TmRecoverResourceManager() is as follows:

    pEnlistment_shifted = EnlistmentHead_addr->Flink;
    while ( pEnlistment_shifted != EnlistmentHead_addr ) {
      if ( ADJ(pEnlistment_shifted)->Flags & KENLISTMENT_FINALIZED ) {
        pEnlistment_shifted = ADJ(pEnlistment_shifted)->NextSameRm.Flink;
      }
      else {
        ObfReferenceObject(ADJ(pEnlistment_shifted));
        KeWaitForSingleObject(&ADJ(pEnlistment_shifted)->Mutex, Executive, 0, 0, 0i64);

        [...]

        if ( (ADJ(pEnlistment_shifted)->Flags & KENLISTMENT_IS_NOTIFIABLE) != 0 ) {
          if ([...]) {
[v1]         bSendNotification = 1;
          }
          ADJ(pEnlistment_shifted)->Flags &= ~KENLISTMENT_IS_NOTIFIABLE;
        }

        [...]

        KeReleaseMutex(&ADJ(pEnlistment_shifted)->Mutex, 0);

[v2]    if ( bSendNotification ) {
          KeReleaseMutex(&pResMgr->Mutex, 0);

          ret = TmpSetNotificationResourceManager(
                  pResMgr,
                  ADJ(pEnlistment_shifted),
                  0i64,
                  NotificationMask,
                  0x20u, // sizeof(TRANSACTION_NOTIFICATION_RECOVERY_ARGUMENT)
                  &notification_recovery_arg_struct);

[v3]      if ( ADJ(pEnlistment_shifted)->Flags & KENLISTMENT_FINALIZED )
            bEnlistmentIsFinalized = 1;

[v4]      ObfDereferenceObject(ADJ(pEnlistment_shifted));
[v5]      KeWaitForSingleObject(&pResMgr->Mutex, Executive, 0, 0, 0i64);
          if ( pResMgr->State != KResourceManagerOnline )
            goto b_release_mutex;
        }

        [...]

[v6]    if ( bEnlistmentIsFinalized ) {
          pEnlistment_shifted = EnlistmentHead_addr->Flink;
          bEnlistmentIsFinalized = 0;
        }
        else {
[v7]      pEnlistment_shifted = ADJ(pEnlistment_shifted)->NextSameRm.Flink;
        }
      }
    }

Let’s analyze the unpatched code. If the lines marked [v1] and [v2] are executed, bSendNotification is set to 1 and TmpSetNotificationResourceManager() is then called, which queues a notification related to the enlistment currently being parsed. In order to send this notification, the mutex of the _KRESOURCEMANAGER structure pointed to by pResMgr is released first. This means code in a separate thread that requires the lock can now execute.

After a notification is queued by calling TmpSetNotificationResourceManager(), the line marked [v3] tests whether the enlistment currently being parsed has been finalized, as indicated by the KENLISTMENT_FINALIZED flag. A finalized enlistment is one whose work is done and which is ready to be freed. This means that the subsequent ObfDereferenceObject() call marked [v4] could free the enlistment, so the code sets the bEnlistmentIsFinalized flag to remind itself that the pEnlistment_shifted pointer shouldn’t be touched again.

At [v5] the pResMgr mutex is locked again, and the loop tries to move on to parsing and potentially notifying the next enlistment in the linked list. It decides where to fetch this next enlistment from based on whether the bEnlistmentIsFinalized flag was previously set, as indicated by [v6]. If the flag is set, the pointer is fetched from the head of the linked list maintained in the _KRESOURCEMANAGER structure. Otherwise the enlistment that just had a notification queued is used to find the next link, as indicated by [v7].

At a high level, the following diagram shows the structures being parsed by this code, and approximately what happens if a finalized enlistment is encountered. The diagram will be revisited in more detail later.

In contrast to what the diagram above shows, when the code in the loop detects a finalized enlistment and restarts from EnlistmentHead, the list head most likely points back to some _KENLISTMENT that was already parsed (like the entry pointed to by 1 in the diagram above). In this scenario the _KENLISTMENT being reparsed has been marked as non-notifiable, and so a new notification will not be placed into the queue. Additionally, any other enlistments that may have been finalized and since removed from the linked list may not be reparsed. Since this is hard to illustrate, we simplify the diagram to show step 2 having EnlistmentHead referencing the next notifiable _KENLISTMENT that has yet to be parsed.

Patched code

The following is the patched version of the same code:

    pEnlistment_shifted = EnlistmentHead_addr->Flink;
    while ( pEnlistment_shifted != EnlistmentHead_addr )
    {
      if ( ADJ(pEnlistment_shifted)->Flags & KENLISTMENT_FINALIZED ) {
        pEnlistment_shifted = ADJ(pEnlistment_shifted)->NextSameRm.Flink;
      }
      else {
        ObfReferenceObject(ADJ(pEnlistment_shifted));

        KeWaitForSingleObject(&ADJ(pEnlistment_shifted)->Mutex, Executive, 0, 0, 0i64);
        bSendNotification = 0;
        if ( (ADJ(pEnlistment_shifted)->Flags & KENLISTMENT_IS_NOTIFIABLE) != 0 ) {
          if ([...]) {
[p1]         bSendNotification = 1;
          }
          [...]
          ADJ(pEnlistment_shifted)->Flags &= ~KENLISTMENT_IS_NOTIFIABLE;
        }
        [...]
        KeReleaseMutex(&ADJ(pEnlistment_shifted)->Mutex, 0);
[p2]    if ( bSendNotification ) {
          KeReleaseMutex(&pResMgr->Mutex, 0);

          ret = TmpSetNotificationResourceManager(
                  pResMgr,
                  ADJ(pEnlistment_shifted),
                  0i64,
                  notification_timeout,
                  0x20u,
                  &cur_enlistment_guid);

          ObfDereferenceObject(ADJ(pEnlistment_shifted));
          KeWaitForSingleObject(&pResMgr->Mutex, Executive, 0, 0, 0i64);
          if ( pResMgr->State != KResourceManagerOnline )
            goto b_release_mutex;

          Tm_ = pResMgr->Tm;
[p8]      if ( !Tm_ || Tm_->State != KKtmOnline )
          {
            ret = STATUS_TRANSACTIONMANAGER_NOT_ONLINE;
            goto b_release_mutex;
          }
[p6]      pEnlistment_shifted = EnlistmentHead_addr->Flink;
        }
        else
        {
          ObfDereferenceObject(ADJ(pEnlistment_shifted));
          pEnlistment_shifted = ADJ(pEnlistment_shifted)->NextSameRm.Flink;
        }
      }
    }

In the patched version of the function above, we see that between the locations marked [p1] and [p6] a check for KENLISTMENT_FINALIZED no longer occurs at all. Instead, any time [p1] and [p2] are executed and a notification is queued, the code just assumes that the enlistment was potentially finalized, and always fetches the next _KENLISTMENT from the head of the linked list maintained in the _KRESOURCEMANAGER structure. There is also a new check at the line marked [p8] to see if the transaction manager has gone offline, in which case the loop is exited. This ended up not being useful for exploiting this vulnerability, but as we will explore in part 5 of this blog series, it can actually be used to safely detect if a system is vulnerable or not.

So what does the patch tell us about the vulnerability? We infer that there must be a race condition between the time the KENLISTMENT_FINALIZED flag is checked and the time the _KENLISTMENT pointer is actually used. This in turn implies that the memory the _KENLISTMENT pointer points to can likely be freed and potentially replaced, meaning we’re looking at a use-after-free that is triggered by winning the race. This idea is elaborated a bit further with added comments in the vulnerable code:

          if ( ADJ(pEnlistment_shifted)->Flags & KENLISTMENT_FINALIZED ) {
            bEnlistmentIsFinalized = 1;
          }
          // START: Race starts here, if bEnlistmentIsFinalized was not set

          ObfDereferenceObject(ADJ(pEnlistment_shifted));
          KeWaitForSingleObject(&pResMgr->Mutex, Executive, 0, 0, 0i64);
          if ( pResMgr->State != KResourceManagerOnline )
            goto b_release_mutex;
        }

        //...

        // END: If at any time, after START, another thread finalized and
        // closed pEnlistment_shifted, it might now be freed, but 
        // `bEnlistmentIsFinalized` is not set, so the code doesn't know.
        if ( bEnlistmentIsFinalized ) {
          pEnlistment_shifted = EnlistmentHead_addr->Flink;
          bEnlistmentIsFinalized = 0;
        }
        else {
          // ADJ(pEnlistment_shifted)->NextSameRm.Flink could reference freed
          // memory
          pEnlistment_shifted = ADJ(pEnlistment_shifted)->NextSameRm.Flink;
        }

Now, from our initial analysis, we assume the vulnerability is a race condition that leads to a use-after-free, and that the type of structure we’ll be abusing is a _KENLISTMENT. This in turn tells us a few things we need to figure out next to further validate our understanding and move towards actually triggering the vulnerability:

  • How and when is the KENLISTMENT_FINALIZED flag set?

  • Is this flag being set predicated on pResMgr->Mutex being unlocked?

  • Does ObfDereferenceObject() actually free the _KENLISTMENT?

Congesting the mutex?

One interesting thing to note, now that we understand the vulnerability a bit more, is this quote from the Kaspersky writeup:

One of created threads calls NtQueryInformationResourceManager in a loop,
while second thread tries to execute NtRecoverResourceManager once.

If you reverse NtQueryInformationResourceManager you will see that one of the things it does is lock a _KRESOURCEMANAGER mutex. If we look again at the code where ObfDereferenceObject() is called in TmRecoverResourceManager() at [v4], it is immediately followed by an attempt to lock the _KRESOURCEMANAGER mutex at [v5]. This hints that this location could potentially give us a way to force the race window open and give us time to free the enlistment. It also gives us a good candidate location for manually patching code in WinDbg in order to leave the race window open while testing the bug, which we’ll go into in more detail later. When analyzing these types of bugs it is good to always consider these possibilities as you go.

A simplified excerpt from NtQueryInformationResourceManager() is below, which shows what is done while this mutex is locked:

  KeWaitForSingleObject(&pResMgr->Mutex, Executive, 0, 0, 0i64);
  if ( ResourceManagerInformationLength >= 0x18 )
  {
    DescriptionLength = pResMgr->Description.Length;
    *&ResourceManagerInformation->DescriptionLength = DescriptionLength;
    AdjustedDescriptionLength = pResMgr->Description.Length + 0x14;
    // i.e. the whole description fits in the caller's buffer
    if ( AdjustedDescriptionLength <= ResourceManagerInformationLength ) {
      copyLength = pResMgr->Description.Length;
    }
    else {
      rcval = STATUS_BUFFER_OVERFLOW;
      copyLength = ResourceManagerInformationLength - 0x14;
    }
    memmove(&ResourceManagerInformation->Description, pResMgr->Description.Buffer, copyLength);
  }
  else {
    rcval = STATUS_BUFFER_TOO_SMALL;
  }
  KeReleaseMutex(&pResMgr->Mutex, 0);

So we see that by repeatedly calling the code above, there is some amount of code we can execute that will prevent TmRecoverResourceManager() from relocking the mutex, which gives us more time to win the race.

Initial strategy for exploitation

With our new understanding of the vulnerability, and what the patch does, our theory for how to exploit this vulnerability is the following:

  1. We must free that exact enlistment after TmRecoverResourceManager() checks to see if the enlistment is finalized, but before it is able to lock the _KRESOURCEMANAGER mutex.
  2. This free will be done by the ObfDereferenceObject() call in TmRecoverResourceManager() or in some other yet unidentified code while TmRecoverResourceManager() is waiting to lock the resource manager mutex.
  3. Prior to returning from locking the mutex, the freed memory should be replaced by some other attacker-controlled memory.
  4. At that point, the NextSameRm.Flink value is controlled by us and we can make the kernel point to an arbitrary memory location.

Now let’s compare a normal scenario with a race condition scenario. This diagram is the same as before, and shows how the logic is meant to be working in a normal scenario:

The following diagram shows how by winning a race condition, an invalid _KENLISTMENT will be referenced:

A small final side note about the patch: we originally thought the check in the patched code for seeing if the transaction manager is offline might indicate the best way to finalize an enlistment to trigger the bug. This was not a good approach in the end and appears to just be an additional fix Microsoft added in the process of fixing CVE-2018-8611.

Reaching the vulnerable code

Transaction state prerequisites

As detailed before, if we look at the cross references to the vulnerable TmRecoverResourceManager() it becomes clear we can just call the NtRecoverResourceManager() syscall from userland, which matches what Kaspersky said.

Now we have a few things to do:

  1. We want to know how to ensure the transaction and enlistment states meet the requirements to hit the vulnerable code.
  2. We want to start with code that lets us correctly recover, without even trying to win the race. This lets us know what recovering a resource manager looks like.
  3. We want to figure out how to finalize an enlistment.

To deal with the first thing, we look at a piece of code that was omitted earlier from the patch analysis. The following check in TmRecoverResourceManager() determines if bSendNotification gets set, which lets us send a notification and hit the buggy code path:

        bSendNotification = 0;
        if ( (ADJ(pEnlistment)->Flags & KENLISTMENT_IS_NOTIFIABLE) != 0 ) {
          bEnlistmentSuperior = ADJ(pEnlistment)->Flags & KENLISTMENT_SUPERIOR;
          if ( bEnlistmentSuperior
            && ((state = ADJ(pEnlistment)->Transaction->State, state == KTransactionPrepared)
             || state == KTransactionInDoubt) ) {
            bSendNotification = 1;
            NotificationMask = TRANSACTION_NOTIFY_RECOVER_QUERY;
          }
          else if ( !bEnlistmentSuperior && ADJ(pEnlistment)->Transaction->State == KTransactionCommitted
                 || (state = ADJ(pEnlistment)->Transaction->State, state == KTransactionInDoubt)
                 || state == KTransactionPrepared ) {
            bSendNotification = 1;
            NotificationMask = TRANSACTION_NOTIFY_RECOVER;
          }
          ADJ(pEnlistment)->Flags &= ~KENLISTMENT_IS_NOTIFIABLE;
        }

There are two ways to set bSendNotification:

  • One is when the enlistment is superior and the transaction is in the KTransactionInDoubt or KTransactionPrepared state
  • The other is when the enlistment is not superior and the transaction is in the KTransactionCommitted, KTransactionInDoubt, or KTransactionPrepared state
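The two cases can be restated as a small standalone predicate. This is a sketch of our reading of the decompiled check, not the kernel's code: the function name recovery_notification_mask is hypothetical, the KENLISTMENT_* flag values are placeholders chosen for illustration, and the TRANSACTION_NOTIFY_* values are what we believe ktmtypes.h defines, so treat them as assumptions. The transaction state values are taken from the _KTRANSACTION_STATE enum.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag values, for illustration only. */
enum {
    KENLISTMENT_SUPERIOR      = 0x01,
    KENLISTMENT_IS_NOTIFIABLE = 0x80
};

/* Believed to match ktmtypes.h; treat as assumptions. */
enum {
    TRANSACTION_NOTIFY_RECOVER       = 0x00000100,
    TRANSACTION_NOTIFY_RECOVER_QUERY = 0x00000800
};

/* Subset of _KTRANSACTION_STATE. */
enum toy_tx_state {
    KTransactionActive    = 1,
    KTransactionPrepared  = 3,
    KTransactionInDoubt   = 4,
    KTransactionCommitted = 5
};

/* Returns the notification mask to send, or 0 if no notification is sent
 * (the bSendNotification == 0 case). */
uint32_t recovery_notification_mask(uint32_t flags, enum toy_tx_state state)
{
    if (!(flags & KENLISTMENT_IS_NOTIFIABLE))
        return 0;

    bool superior = (flags & KENLISTMENT_SUPERIOR) != 0;
    bool prepared_or_in_doubt =
        state == KTransactionPrepared || state == KTransactionInDoubt;

    if (superior)
        return prepared_or_in_doubt ? TRANSACTION_NOTIFY_RECOVER_QUERY : 0;

    if (state == KTransactionCommitted || prepared_or_in_doubt)
        return TRANSACTION_NOTIFY_RECOVER;

    return 0;
}
```

Note that a superior enlistment on a committed transaction yields no notification at all, which follows from the operator precedence in the decompiled else-if.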

Trial and error showed us that using superior enlistments here would not work, as it requires the transaction manager and resource manager to be non-volatile. As we noted in part 1 of our series, due to the number of enlistments we’ll be using and the potential number of attempts before winning the race, we need to use volatile managers that do not keep a log. Furthermore, due to the apparent restriction of having only one superior enlistment at a time, superior enlistments can’t be used to spray lots of candidate _KENLISTMENT for triggering the vulnerability.

This means we need to figure out how to ensure that the non-superior enlistments being parsed by the TmRecoverResourceManager() loop at the time of triggering the vulnerability are associated with a transaction in one of the three aforementioned states. We initially thought about ignoring KTransactionInDoubt and focusing on the KTransactionCommitted and KTransactionPrepared states as it appeared to us that transactions being in an InDoubt state could be harder to deal with.

Transitioning state

During our testing and debugging, we concluded that when a KTM client says that it wants to commit to some transaction, it calls CommitTransaction(), which will block until the transaction is actually completed and committed, or some other event occurs (like the transaction is rolled back).

While the client is blocked, the transaction can transition through a number of states that are all indicative of some amount of progress of the constituent enlistments involved in completing the transaction.

These states are tracked by the _KTRANSACTION_STATE enum:

//0x4 bytes (sizeof)
enum _KTRANSACTION_STATE
{
    KTransactionUninitialized = 0,
    KTransactionActive = 1,
    KTransactionPreparing = 2,
    KTransactionPrepared = 3,
    KTransactionInDoubt = 4,
    KTransactionCommitted = 5,
    KTransactionAborted = 6,
    KTransactionDelegated = 7,
    KTransactionPrePreparing = 8,
    KTransactionForgotten = 9,
    KTransactionRecovering = 10,
    KTransactionPrePrepared = 11
};

The most interesting states from our perspective are the transaction and enlistment states that lead to an enlistment being notified of a recovery, which in turn eventually leads us to the vulnerability we will be looking at. It may not be obvious which of these states match these requirements.

From above, we are interested to know when a transaction enters the KTransactionCommitted or KTransactionPrepared state. The only way to really tell when these state transitions occur is by reversing. Fortunately it is fairly easy to find related functions as they are well named and have the Tm prefix or Tmp prefix we mentioned earlier.

After a bit of reversing we ran into a fairly frequently called function called TmpProcessNotificationResponse(), which operates on a set of arrays passed in as arguments:

__int64 TmpProcessNotificationResponse(_KENLISTMENT *pEnlistment,
    PLARGE_INTEGER VirtualClock,
    unsigned int ArrayCount,
    _KENLISTMENT_STATE *EnlistmentStatesArray,
    _KENLISTMENT_STATE *NewEnlistmentStateArray,
    _KTRANSACTION_STATE *TransactionStatesArray,
    unsigned int (**pCommitCallback)(struct _KTRANSACTION *, _QWORD),
    unsigned int *ShouldFinalizeFlag)
{
  [...]
  pTransaction = pEnlistment->Transaction;
  [...]
      pEnlistment->State = NewEnlistmentStateArray[i];
      if ( ShouldFinalizeFlag[i] == 1
        || ShouldFinalizeFlag[i] == 2 && !(pEnlistment->NotificationMask & TRANSACTION_NOTIFY_COMMIT) )// optionally finalize
      {
        TmpFinalizeEnlistment(pEnlistment);
      }
      KeReleaseMutex(&pEnlistment->Mutex, 0);
      bHaveEnlistmentMutex = 0;
      if ( pCommitCallback[i] ) {
        if ( pTransaction->PendingResponses-- == 1 )
          rcval = pCommitCallback[i](pTransaction, 0i64);
      }
    }
  }
  [...]
}

The interesting thing about this function is that it is responsible for setting various transaction states, as well as enlistment states.

By looking at all of the callers, we slowly worked out which function sets which state:

Taking one of the xrefs shown above, we see how TmPrepareComplete() calls TmpProcessNotificationResponse():

__int64 __fastcall TmPrepareComplete(_KENLISTMENT *pEnlistment, LARGE_INTEGER *VirtualClock)
{
  _KENLISTMENT_STATE NewEnlistmentStateArray[1]; // [rsp+40h] [rbp-18h]
  _KENLISTMENT_STATE EnlistmentStatesArray[1]; // [rsp+44h] [rbp-14h]
  unsigned int (__fastcall *pCommitCallback[1])(struct _KTRANSACTION *, _QWORD); // [rsp+48h] [rbp-10h]
  unsigned int ShouldFinalizeFlag[1]; // [rsp+70h] [rbp+18h]
  _KTRANSACTION_STATE TransactionStatesArray[1]; // [rsp+78h] [rbp+20h]

  ShouldFinalizeFlag[0] = 0;
  pCommitCallback[0] = TmpTxActionDoPrepareComplete;
  EnlistmentStatesArray[0] = KEnlistmentPreparing;
  NewEnlistmentStateArray[0] = KEnlistmentPrepared;
  TransactionStatesArray[0] = KTransactionPreparing;
  return TmpProcessNotificationResponse(
           pEnlistment,
           VirtualClock,
           1u,
           EnlistmentStatesArray,
           NewEnlistmentStateArray,
           TransactionStatesArray,
           pCommitCallback,
           ShouldFinalizeFlag);
}

Calling PrepareComplete() from userland will indeed end up calling NtPrepareComplete() which will call TmPrepareComplete(). We now know that calling PrepareComplete() from userland will set this _KENLISTMENT into a new state of KEnlistmentPrepared. If, in addition to modifying this enlistment parameter’s state, all of the enlistments associated with the same transaction are synchronized on the same new state, then the TmpTxActionDoPrepareComplete() function will end up being called, which will set the transaction to a new state. This transaction change is shown in the excerpt below:

__int64 __fastcall TmpTxActionDoPrepareComplete(_KTRANSACTION *pTransaction)
{
  LARGE_INTEGER *v1; // rdx
  struct _LIST_ENTRY *EnlistmentHead; // rcx
  int logerror; // eax MAPDST
  _KENLISTMENT *pSuperiorEnlistment; // rcx
  __int64 result; // rax

  if ( !pTransaction->SuperiorEnlistment || pTransaction->Flags & KTransactionPreparing )
  {
    pTransaction->State = KTransactionPrepared;
    result = TmpTxActionDoCommit(pTransaction, v1);
  }

So in this way we know that we must work out all the prerequisite state transitions prior to KTransactionPrepared to allow us to eventually call PrepareComplete(). When our transaction enters the KTransactionPrepared state, we will be able to trigger the code we want to reach in the vulnerable function.
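The "last enlistment to respond triggers the transaction change" behaviour can be modelled in a few lines. This is a simplified sketch under our own assumptions: the struct and function names are hypothetical, and the real code does much more (locking, logging, superior enlistment handling). It only mirrors the PendingResponses countdown seen in TmpProcessNotificationResponse().

```c
#include <assert.h>

/* Values taken from the _KTRANSACTION_STATE enum. */
enum { TOY_TX_PREPARING = 2, TOY_TX_PREPARED = 3 };

struct toy_transaction {
    int pending_responses;   /* enlistments that have not answered yet */
    int state;
};

/* Stand-in for the commit callback, e.g. TmpTxActionDoPrepareComplete(). */
void toy_do_prepare_complete(struct toy_transaction *tx)
{
    tx->state = TOY_TX_PREPARED;
}

/* Called once per enlistment response.
 * Returns 1 if this response was the last one and the callback ran. */
int toy_enlistment_responded(struct toy_transaction *tx)
{
    assert(tx->pending_responses > 0);
    if (tx->pending_responses-- == 1) {
        toy_do_prepare_complete(tx);
        return 1;
    }
    return 0;
}
```

With two enlistments, the first response leaves the transaction in the preparing state and only the second one moves it to prepared.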

Enlistment state prerequisites

Similar to the above, _KENLISTMENT structures also have a set of states, and in order for a transaction to transition to a new state, all of the associated enlistments must be in the matching state. So if a transaction should transition to a KTransactionPrepared state as in the example above, it is necessary that all _KENLISTMENT structures are already in the KEnlistmentPrepared state.

The full list of _KENLISTMENT_STATE states is below:

enum _KENLISTMENT_STATE
{
    KEnlistmentUninitialized = 0,
    KEnlistmentActive = 256,
    KEnlistmentPreparing = 257,
    KEnlistmentPrepared = 258,
    KEnlistmentInDoubt = 259,
    KEnlistmentCommitted = 260,
    KEnlistmentCommittedNotify = 261,
    KEnlistmentCommitRequested = 262,
    KEnlistmentAborted = 263,
    KEnlistmentDelegated = 264,
    KEnlistmentDelegatedDisconnected = 265,
    KEnlistmentPrePreparing = 266,
    KEnlistmentForgotten = 267,
    KEnlistmentRecovering = 268,
    KEnlistmentAborting = 269,
    KEnlistmentReadOnly = 270,
    KEnlistmentOutcomeUnavailable = 271,
    KEnlistmentOffline = 272,
    KEnlistmentPrePrepared = 273,
    KEnlistmentInitialized = 274
};

Similar to what was noted before, this just means we need to find the prerequisite functions to set enlistments into a specific state. If we look back at our earlier example for TmPrepareComplete(), we see that it will transition the _KENLISTMENT state from KEnlistmentPreparing to KEnlistmentPrepared:

  EnlistmentStatesArray[0] = KEnlistmentPreparing;
  NewEnlistmentStateArray[0] = KEnlistmentPrepared;
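One way to picture the parallel arrays is as rows of a transition table: a row applies only when both the current enlistment state and the current transaction state match. The sketch below is our own simplification with hypothetical names (toy_transition, toy_apply_transition); the single row is the one TmPrepareComplete() passes in the excerpt above, and the numeric values come from the two enums.

```c
#include <stddef.h>

enum { KEnlistmentActive = 256, KEnlistmentPreparing = 257, KEnlistmentPrepared = 258 };
enum { KTransactionPreparing = 2 };

struct toy_transition {
    int expected_enlistment;    /* EnlistmentStatesArray[i] */
    int new_enlistment;         /* NewEnlistmentStateArray[i] */
    int expected_transaction;   /* TransactionStatesArray[i] */
};

/* The row used by TmPrepareComplete(). */
const struct toy_transition prepare_complete_rows[] = {
    { KEnlistmentPreparing, KEnlistmentPrepared, KTransactionPreparing }
};

/* Applies the first matching row. Returns 0 and updates *enlistment_state
 * on success, or -1 if no row matches the current states. */
int toy_apply_transition(const struct toy_transition *rows, size_t count,
                         int *enlistment_state, int transaction_state)
{
    for (size_t i = 0; i < count; i++) {
        if (rows[i].expected_enlistment == *enlistment_state &&
            rows[i].expected_transaction == transaction_state) {
            *enlistment_state = rows[i].new_enlistment;
            return 0;
        }
    }
    return -1;
}
```

This makes the ordering constraint visible: calling the equivalent of PrepareComplete() on an enlistment that is still KEnlistmentActive matches no row, so the prerequisite transitions have to happen first.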

Preparing for a recovery

Given what we have covered so far, what do we need to do to get a transaction into the necessary state? The following code shows how we do this:

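	// Simplified calls: arguments not relevant here are omitted
	hTx1 = CreateTransaction(NULL);
	// Two enlistments on the same transaction; 0x39ffff0f is the notification mask
	hEn1 = CreateEnlistment(hRM, hTx1, 0x39ffff0f, 0);
	hEn2 = CreateEnlistment(hRM, hTx1, 0x39ffff0f, 0);
	// Start the commit without blocking this thread
	CommitTransactionAsync(hTx1);

	// Every enlistment acknowledges the pre-prepare phase...
	PrePrepareComplete(hEn1);
	PrePrepareComplete(hEn2);

	// ...and then the prepare phase
	PrepareComplete(hEn1);
	PrepareComplete(hEn2);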

After executing the above, our transaction is in the KTransactionPrepared state. The next step would be calling CommitComplete() to commit the enlistments, which would transition it into the KTransactionCommitted state. However, we want them to be in the non-committed state to allow the recovery.

Note that the above could simply be done with a single enlistment, however since exploitation involves many enlistments and those enlistments all need to be in the same state to transition the transaction, it is better to more closely simulate the exploitable circumstances.

One thing we noticed during our research is that for a recovery to occur, a single transaction in the KTransactionPrepared state was not enough, so we effectively used two transactions:

  • one transaction in the KTransactionCommitted state, with the transaction having a single enlistment
  • another transaction in the KTransactionPrepared state, with the transaction having lots of enlistments

Building our race transaction

Now that we know how to transition between enlistment and transaction states, and get notifications about what state things are in (from part 1), we recover the resource manager so that TmRecoverResourceManager() parses the multiple enlistments that are part of our second transaction.

The following shows a very simplified view of what this might look like:

// recovery thread
void recover(void)
{
    //...

    // Creating TM and RM
    hTM = xCreateVolatileTransactionManager();
    hRM = xCreateVolatileResourceManager(hTM, NULL, pResManName);
    RecoverResourceManager(hRM);

    // have some completed transaction that allows us to actually recover
    hTx1 = setup_commit_completed_transaction(hRM);

    // set up a bunch of prepared enlistments that we will finalize in racer
    // thread
    uaf_tx_list = single_tx_uaf_enlistments(hRM, hTM, MAX_TX_ENLISTMENT_COUNT);

    // Call the buggy recovery code
    RecoverResourceManager(hRM);

    //...
}

In the future when we talk about the thread that is calling TmRecoverResourceManager(), we will call it the "recovery thread".

By reading notifications as detailed in part 1 while the recovery is running, we confirm that we hit the vulnerable code and see that we are receiving associated notifications for each of the enlistments as they are parsed by the loop. This means we finally reach the vulnerable code.

Triggering the vulnerability with an assisted race condition

Forcing the race window open

When exploiting race condition bugs a common early approach is to artificially open the race window to guarantee a win. This helps better understand the environment in which we are operating and also ensures that our exploit skeleton code is actually functioning. When failing to win a race, it is best to be able to differentiate if it is because our trigger is incorrect due to a misunderstanding or if we are just not winning it yet due to inefficiency in the race.

This approach also helps us investigate the post-race-win state, and helps examine approaches to detecting a race win once we stop using the artificial window. This can be worked out simultaneously by other people when one person is still trying to effectively win the race without "cheating".

In order to force the window open, we patch TmRecoverResourceManager() in WinDbg to change the KeWaitForSingleObject() call below to be a tight loop.

          // Race starts here
          ObfDereferenceObject(ADJ(pEnlistment_shifted));
          KeWaitForSingleObject(&pResMgr->Mutex, Executive, 0, 0, 0i64);

Once this is patched, the recovery thread will be permanently hung, but other threads continue to do things. This gives us the opportunity to free the target enlistment in question, and also replace it.

Note that this only works for us because it is extremely unlikely any other code on the system is using TmRecoverResourceManager(). If you need to try to win a race on more heavily used code it is better to just change the instruction pointer temporarily for the thread you are racing so that it points to an infinite loop. This way other threads that potentially need to legitimately use the buggy code won’t be impacted.
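The patch and unpatch steps follow the usual save, overwrite, restore pattern. The sketch below models that on a plain byte buffer rather than kernel memory. The choice of the two-byte `EB FE` encoding (an x86/x64 short jump to itself) is an assumption about one way to build a tight loop, and the names toy_patch, toy_install_patch and toy_remove_patch are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PATCH_LEN 2
/* x86/x64 "jmp short" with displacement -2: jumps to itself forever. */
static const unsigned char JMP_SELF[PATCH_LEN] = { 0xEB, 0xFE };

struct toy_patch {
    size_t        offset;             /* where the patch was written */
    unsigned char saved[PATCH_LEN];   /* original bytes, for restoring */
    int           installed;
};

/* Save the original bytes at offset and overwrite them with the loop. */
void toy_install_patch(unsigned char *code, size_t code_len, size_t offset,
                       struct toy_patch *p)
{
    assert(offset + PATCH_LEN <= code_len);
    p->offset = offset;
    memcpy(p->saved, code + offset, PATCH_LEN);
    memcpy(code + offset, JMP_SELF, PATCH_LEN);
    p->installed = 1;
}

/* Put the original bytes back so the hung thread can continue. */
void toy_remove_patch(unsigned char *code, struct toy_patch *p)
{
    assert(p->installed);
    memcpy(code + p->offset, p->saved, PATCH_LEN);
    p->installed = 0;
}
```

Only the first two bytes of the call instruction need to change, since the remaining bytes are never reached while the loop is in place.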

How to free a _KENLISTMENT on demand?

In order to trigger the use-after-free during the race window, we need to know how to free a _KENLISTMENT. Reversing shows that once an enlistment has been committed it becomes finalized, and after this point, once all references to the object go away, the _KENLISTMENT will be freed.

From our analysis of TmRecoverResourceManager(), we know that one indication that an enlistment is pending free is the KENLISTMENT_FINALIZED flag being set. By searching through KTM related functions, we find the TmpFinalizeEnlistment() function.

At first glance the xrefs graph to TmpFinalizeEnlistment() looks like spaghetti, but by taking a bit of time to look through, there are some obvious candidates. One function a little ways up the call chain is TmCommitComplete().

The TmCommitComplete() function is called by NtCommitComplete(), which we can call from userland. One of the easiest paths to the function we want to call is:

NtCommitComplete()
-> TmCommitComplete()
    -> TmpProcessNotificationResponse()
        -> TmpFinalizeEnlistment()

We already looked at the TmpProcessNotificationResponse() code earlier. This function parses a few arrays of flags and starts by identifying if the transaction and current enlistment match one set of the corresponding states in the arrays. If so, then it will update the enlistment state flag. A separate array containing flags (which we call ShouldFinalizeFlag[]) indicates if, after the new enlistment state is set, it should also be finalized. In the case of TmCommitComplete() the ShouldFinalizeFlag flags are set, so TmpFinalizeEnlistment() will be called:

      if ( ShouldFinalizeFlag[i] == 1 || ShouldFinalizeFlag[i] == 2 && !(pEnlistment->NotificationMask & TRANSACTION_NOTIFY_COMMIT) )
        TmpFinalizeEnlistment(pEnlistment);

The TmpFinalizeEnlistment() function does a fair bit, but we’ll look at a few things:

If the enlistment is already finalized, it doesn’t re-finalize it:

  EnlistmentFlags = *(&pEnlistment->NotificationMask + 1);
  if ( EnlistmentFlags & KENLISTMENT_FINALIZED )
    return 0i64;
  [...]

Next we see the following code. It removes the association of the enlistment being finalized from the resource manager. As we recall, if the TmRecoverResourceManager() function is somehow permanently blocked on the KeWaitForSingleObject(&pRm->Mutex) call itself in the recovery thread, the TmpFinalizeEnlistment() call on another thread could succeed.

  // Acquire the resource manager's mutex so we can deal with _KRESOURCEMANAGER.EnlistmentHead == _KENLISTMENT.NextSameRm
  KeWaitForSingleObject(&pRM->Mutex, Executive, 0, 0, 0i64);
  TmpRemoveTransactionEnlistmentCounts(pEnlistment);
  [...]
  TmpRemoveEnlistmentResourceManager(pEnlistment);

This answers an earlier question, which is that removing the enlistment is indeed predicated on locking the resource manager’s mutex.

Next, as detailed in the code excerpt below, there are two calls that reduce the refcount of the enlistment. These correspond to references from the linked lists that the enlistment was just removed from. Finally, the third ObfDereferenceObject() call actually matches the original refcount added when creating the structure in the first place.

  ObfDereferenceObject(pEnlistment);
  ObfDereferenceObject(pEnlistment);
  [...]
  KeReleaseMutex(&pRM->Mutex, 0);
  TmpReleaseTransactionMutex(pTransaction);
  KeWaitForSingleObject((&pEnlistment->Mutex + 8), Executive, 0, 0, 0i64);
  ObfDereferenceObject(pEnlistment);
  return 0i64;

At this point, the enlistment refcount should be 1, and that is because the userland exploit still holds the opened handle to the enlistment. Following the finalization call, closing the enlistment handle from userland should be enough to make the refcount reach 0.

To summarize, the most important thing to realize about TmpFinalizeEnlistment() is that it decrements the _KENLISTMENT’s refcount an additional time, to offset the original refcount during creation. As a result, once other functions have also decremented the refcount and the refcount reaches zero, then the _KENLISTMENT will finally be freed. Now let’s recall that TmRecoverResourceManager() also calls ObfDereferenceObject(pEnlistment), which means that the finalization process, in addition to closing the enlistment handle from userland, will actually free the _KENLISTMENT object.

So all of that is a long way to say: If we can identify which enlistment we want to free in order to trigger a use-after-free, we must simply call CommitComplete() on that enlistment and then use CloseHandle to close the associated enlistment handle. That will trigger a free of the _KENLISTMENT object.
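The refcount bookkeeping described in this section can be followed with a small model. This is only a sketch: the names are hypothetical, and the initial count of four (one creation reference, two list references, one userland handle) is our assumed breakdown based on the description above rather than a value read out of the kernel.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_object {
    int  refcount;
    bool freed;
};

void toy_reference(struct toy_object *o)
{
    assert(!o->freed);
    o->refcount++;
}

/* Stand-in for ObfDereferenceObject(): frees when the count hits zero. */
void toy_dereference(struct toy_object *o)
{
    assert(!o->freed && o->refcount > 0);
    if (--o->refcount == 0)
        o->freed = true;
}

/* Assumed breakdown: creation + two linked lists + userland handle. */
void toy_create_enlistment(struct toy_object *o)
{
    o->refcount = 4;
    o->freed = false;
}

/* Stand-in for TmpFinalizeEnlistment(): drops the two list references,
 * then the reference taken at creation. */
void toy_finalize(struct toy_object *o)
{
    toy_dereference(o);
    toy_dereference(o);
    toy_dereference(o);
}

/* Stand-in for CloseHandle() on the enlistment handle. */
void toy_close_handle(struct toy_object *o)
{
    toy_dereference(o);
}
```

In this model, the recovery thread's temporary reference is taken and dropped before the racing thread runs, so finalizing leaves a count of one and closing the handle performs the free.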

Confirming our understanding

In order to confirm our mental model, we enable Verifier to be able to catch a crash as soon as invalid memory is touched.

We set a breakpoint in TmRecoverResourceManager(). When the breakpoint hits, we install the patch for simulating the mutex congestion. Note that we wrote a simple JavaScript script for WinDbg Preview to test this (adding !patch and !unpatch commands):

0: kd> !patch
__installPatch
Setting breakpoint and patching mutex code

@$patch()   

Then we continue execution. Now everything hangs on the VM side and our userland exploit skeleton does not exit due to the recovery thread being in an infinite loop.

So we break in WinDbg, remove the patch and restore execution:

1: kd> !unpatch
__uninstallPatch
Removing breakpoint for patching mutex and restoring old code.


@$unpatch()     
1: kd> g

We hit the following crash:

rax=0000000000000000 rbx=fffff98026e1edd8 rcx=fffff9801a4b2e58
rdx=fffff98026e1edf0 rsi=fffff980277fce20 rdi=0000000000000000
rip=fffff80002d2eac1 rsp=fffff88005d4a9d0 rbp=fffff88005d4ab60
 r8=fffff78000000008  r9=0000000000000000 r10=0000000000000000
r11=fffff80002bf1180 r12=fffff98026e1edb0 r13=fffff98026e1eec0
r14=0000000000000000 r15=0000000000000100
iopl=0         nv up ei pl nz na pe cy
cs=0010  ss=0018  ds=002b  es=002b  fs=0053  gs=002b             efl=00010303
nt!TmRecoverResourceManager+0x129:
fffff800`02d2eac1 f6472404        test    byte ptr [rdi+24h],4 ds:002b:00000000`00000024=??
0: kd> r rdi
Last set context:
rdi=0000000000000000
0: kd> k
  *** Stack trace for last set context - .thread/.cxr resets it
 # Child-SP          RetAddr           Call Site
00 fffff880`05d4a9d0 fffff800`02d4bf49 nt!TmRecoverResourceManager+0x129
01 fffff880`05d4aa90 fffff800`02aae9d3 nt!NtRecoverResourceManager+0x51
02 fffff880`05d4aae0 00000000`778cabea nt!KiSystemServiceCopyEnd+0x13
03 00000000`0020a568 000007fe`fa4b219d ntdll!ZwRecoverResourceManager+0xa
04 00000000`0020a570 00000000`ff396d7b ktmw32!RecoverResourceManager+0x9

This corresponds to the beginning of the loop, indicating the enlistment was freed in the previous iteration of the loop, a stale pointer was copied to pEnlistment_shifted at [1] and the use-after-free happened at [2]:

    pEnlistment_shifted = EnlistmentHead_addr->Flink;
    while ( pEnlistment_shifted != EnlistmentHead_addr )
    {
[2]   if ( ADJ(pEnlistment_shifted)->Flags & KENLISTMENT_FINALIZED )    // crash here
      {
        ...
        if ( bEnlistmentIsFinalized )
        {
          pEnlistment_shifted = EnlistmentHead_addr->Flink;
          bEnlistmentIsFinalized = 0;
        }
        else
        {
[1]       pEnlistment_shifted = ADJ(pEnlistment_shifted)->NextSameRm.Flink;
        }
      }
    }
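The faulting instruction reads a byte at a fixed offset from the list pointer (rdi+24h) and tests bit 4, which is what the ADJ() access to Flags boils down to. The sketch below shows the same idea with a hypothetical structure: the layout, the TOY_ADJ macro and the other names are ours, and only the KENLISTMENT_FINALIZED value of 4 is taken from the crash output. The real _KENLISTMENT layout is different.

```c
#include <stddef.h>
#include <stdint.h>

struct toy_list_entry {
    struct toy_list_entry *flink;
    struct toy_list_entry *blink;
};

/* Hypothetical layout: the list entry is embedded in the middle. */
struct toy_enlistment {
    uint32_t              cookie;
    struct toy_list_entry next_same_rm;
    uint32_t              flags;
};

#define TOY_FINALIZED 0x4u

/* Convert a pointer to the embedded list entry back to its container. */
#define TOY_ADJ(p) \
    ((struct toy_enlistment *)((char *)(p) - \
        offsetof(struct toy_enlistment, next_same_rm)))

/* Whatever address the list pointer holds, the flags are read at a fixed
 * distance from it; nothing checks that the address is still valid. */
int toy_is_finalized(struct toy_list_entry *shifted)
{
    return (TOY_ADJ(shifted)->flags & TOY_FINALIZED) != 0;
}
```

This is why a stale or zeroed Flink faults immediately at the top of the loop: the flags check is the first dereference made through the pointer fetched at [1].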

Conclusion

We’ve now presented a detailed analysis of the CVE-2018-8611 vulnerability, and our understanding of the prerequisites for constructing an enlistment that allows us to reach the vulnerable code. We also now know how to free an enlistment on demand, and how to force our race window open to let us confirm an enlistment can actually be freed and reused at the next iteration of the loop. Our next blog will discuss how to identify which enlistment is the best candidate to free and how to win the race without forcing the race window open.

Read all posts in the Exploiting Windows KTM series