Reversing DOS functions: PCMP

After recently reversing the unpacker for Commander Keen, I moved on to reversing the game itself. One thing that shows up quickly is the set of helper functions inserted by the compiler of the day. Old DOS games such as Commander Keen end up with a bunch of these, and IDA is good at recognizing them. Today we're going to look at the PCMP function (usually labelled N_PCMP@ when it is a near call, or F_PCMP@ when it is a far call).

image.png

This is what we start with. Nothing gets accessed from the stack, so we can assume the parameters are in the registers. Let's look at how this gets called:

image.png

So we're setting DX and AX from some local vars (from something done further up the function), and also setting CX and BX to 0. Let's step through the code and see which of these get used:

push    cx

CX is normally a "throwaway" register, so saving it on the stack is a sure sign we care about its value later on. It also frees up CH and CL as scratch space in the meantime.

mov     ch, al
mov     cl, 4
shr     ax, cl

We're saving AL in CH, and shifting AX right by 4 bits (or dividing by 16, or 0x10). This is something we might do to convert an offset to something we can add onto a segment register...

add     dx, ax

It's looking like DX:AX might actually be a segment:offset pair, and we've just scaled AX so we can add it to DX. In other words, if we started with DX:AX = 1030:5678, we've done the following:

// dx = 0x1030
// ax = 0x5678
ch = al // ch = 0x78
ax = ax >> 4 // ax = 0x0567
dx = dx + ax // dx = 0x1030 + 0x0567 = 0x1597

The segment:offset pair 1030:5678 points to memory location 0x15978 (0x1030 * 0x10 + 0x5678). We now have 0x1597 in DX and 0x78 in CH, and the low nybble of that 0x78 is the missing final digit, so we appear to be building the address up in pieces.
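To sanity-check the arithmetic, here is a minimal Python sketch of real-mode address translation. The function name linear_address is made up for illustration, and the 20-bit wrap assumes a plain 8086-style address bus.

```python
def linear_address(segment: int, offset: int) -> int:
    """Real-mode address: segment * 16 + offset.

    Assumption: wrap to 20 bits, as on an 8086 with no A20 line.
    """
    return ((segment << 4) + offset) & 0xFFFFF


# The example pair from the walkthrough.
print(hex(linear_address(0x1030, 0x5678)))  # 0x15978

# A different segment:offset pair that lands on the same byte.
print(hex(linear_address(0x1597, 0x0008)))  # 0x15978
```

Both calls give the same address, which is exactly why comparing the raw register pairs would not be good enough.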

mov     al, ch
mov     ah, bl

We restore the original bottom byte of AX into AL (from CH, where we saved it), and stash the bottom byte of BX in AH...

shr     bx, cl
pop     cx
add     cx, bx

And this mirrors what we have above, so it does look like we're doing to CX:BX what we did to DX:AX...

mov     bl, ah

This just puts it back: BL now equals the original BL, and remember that AL also equals the original AL.

and     ax, 0Fh
and     bx, 0Fh

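And we mask off everything except the low four bits of each. So we're in this position:

  • AX = bottom nybble (bits 0-3) of the address DX:AX pointed to
  • BX = bottom nybble (bits 0-3) of the address CX:BX pointed to
  • DX = bits 4-19 of the address DX:AX pointed to
  • CX = bits 4-19 of the address CX:BX pointed to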

We do this because multiple segment:offset pairs can resolve to the same pointer. We're now in a position where we can compare AX == BX and DX == CX, and if both are true then DX:AX points to the same memory location as CX:BX:
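The normalisation can be sketched in a few lines of Python. The names normalize and same_location are hypothetical, and this is only a model of what the registers end up holding, not the compiler's own source.

```python
def normalize(segment: int, offset: int) -> tuple[int, int]:
    """Return (paragraph, nibble) as the routine leaves them in DX/AX.

    paragraph = segment + offset // 16, kept to 16 bits like the register
    nibble    = low four bits of the offset
    """
    paragraph = (segment + (offset >> 4)) & 0xFFFF
    nibble = offset & 0x0F
    return paragraph, nibble


def same_location(p1: tuple[int, int], p2: tuple[int, int]) -> bool:
    """Compare two (segment, offset) pairs after normalising both."""
    return normalize(*p1) == normalize(*p2)


print(same_location((0x1030, 0x5678), (0x1597, 0x0008)))  # True
print(same_location((0x1030, 0x5678), (0x1030, 0x5679)))  # False
```

Two pairs compare equal only when both the paragraph and the nybble match, which is the DX == CX and AX == BX test.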

cmp     dx, cx
jnz     short locret_1E59A
cmp     ax, bx
locret_1E59A:
retn

We don't return anything in a register; it looks like we just return ZF. If we fail the first check (DX==CX) then we jump straight to the return with ZF=0 (jumps/returns/stack operations don't change the flags). If we pass this check then we do the second check (AX==BX) and just return with whatever ZF that comparison produced.
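Putting the whole routine together, here is a rough instruction-by-instruction emulation in Python. The function name n_pcmp and the idea of returning ZF as a bool are my own conventions for this sketch. Registers are modelled as plain 16-bit integers, with CH and CL kept as separate variables.

```python
def n_pcmp(dx: int, ax: int, cx: int, bx: int) -> bool:
    """Emulate the routine; the return value stands in for ZF."""
    saved_cx = cx                           # push cx
    ch = ax & 0xFF                          # mov ch, al
    cl = 4                                  # mov cl, 4
    ax >>= cl                               # shr ax, cl
    dx = (dx + ax) & 0xFFFF                 # add dx, ax
    ax = (ax & 0xFF00) | ch                 # mov al, ch
    ax = ((bx & 0xFF) << 8) | (ax & 0xFF)   # mov ah, bl
    bx >>= cl                               # shr bx, cl
    cx = saved_cx                           # pop cx
    cx = (cx + bx) & 0xFFFF                 # add cx, bx
    bx = (bx & 0xFF00) | (ax >> 8)          # mov bl, ah
    ax &= 0x000F                            # and ax, 0Fh
    bx &= 0x000F                            # and bx, 0Fh
    if dx != cx:                            # cmp dx, cx / jnz
        return False                        # ZF = 0
    return ax == bx                         # cmp ax, bx


print(n_pcmp(0x1030, 0x5678, 0x1597, 0x0008))  # True
print(n_pcmp(0x1030, 0x5678, 0x0000, 0x0000))  # False
```

The second call mirrors the call site we saw earlier, where CX and BX are both 0. If that reading is right, the caller is effectively asking whether a far pointer equals 0000:0000.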

If you're doing this in IDA then you can change the signature (command "Y") to void __usercall N_PCMP_(void *pointer1@<dx:ax>, void *pointer2@<cx:bx>); and this will give you the following output:

image.png

Not perfect, but we can add a comment and this will help us later on. The name now makes sense - PCMP probably means Pointer CoMPare.

Hope this was helpful, and happy hacking!