Starcraft on Open Pandora: How the Port Came to Be

When the Alive and Kicking Coding Competition for the Open Pandora scene was about to close, one week before the deadline, Notaz (of Picodrive fame) made a special request: to move the deadline to the end of the weekend instead of the original Saturday evening. That’s when I knew Notaz was on to something, although he would not disclose anything beyond “it’s not Pandora specific”. The deadline came and went, and Notaz could not finish his project in time. A couple of weeks later, he released what he had been working on: a perfect port of Starcraft running on the Pandora. This is an impressive feat, since Starcraft is closed source and only available as an x86 binary, while the Pandora runs on ARM hardware. So, how did he do it?

Before we move on to the technical details, let’s look back at Starcraft a little. Starcraft has been an incredibly popular RTS game since the late 90s, and one that has stayed very much alive in online and LAN-based competitions all around the world, especially in Korea. Starcraft was Blizzard’s follow-up to Warcraft and Warcraft II, and was supposed to be something like Warcraft in space with aliens. I won’t delve into the details here, but there’s a cool developer story on how Starcraft ended up being quite different from Warcraft because of another game it was competing against, called Dominion Storm. That game from Ion Storm was showcased in 1996 and actually looked much, much better than the original Starcraft prototype… but the trick is, what was demoed at E3 was all fake and the actual game did not exist… so the Blizzard folks worked their asses off to make Starcraft a better game, unaware that Dominion Storm was very much vaporware. I definitely recommend you read the whole story.

Notaz had a lot of fun with Starcraft back in the day, and was interested in finding a way to make it work on the Pandora.

Notaz: When I got my first PC around year 2000, I used to play quite a lot of it, so I thought it would be cool to play it on Pandora again. The other encouraging thing was that it has been done before by Winulator, so I knew there are no fundamental things preventing it. I’ve also peeked at the disassembly before starting, seeing that it’s not using anything special (like mmx/sse instructions) encouraged me too.

In order to produce this port, Notaz used what is usually referred to as static recompilation, a term that sometimes comes up when talking about emulators. Emulators usually use interpreters that read processor instructions one by one at run time and perform the action each instruction indicates. Emulators often also use recompilers that translate, on demand, code made for another machine into code that the running hardware can execute. Such emulators are said to use a dynarec (short for dynamic recompiler) or JIT (just-in-time compiler). But there is an equally valid approach: go through the whole binary made for the original hardware and recompile it offline, producing a complete binary that can be executed on the target hardware, translated only once. This is, ideally, a very elegant method, but there are several cases where the correct translation cannot be predicted accurately until the code is actually run. That is why most emulators still stick to interpreters or dynarecs/JITs.
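To make the difference concrete, here is a minimal sketch in C. The three-instruction machine, its opcodes and the sample program are all invented for this illustration and have nothing to do with x86 or with Notaz's tools. The interpreter decodes every instruction each time the program runs, while the "statically recompiled" version is the same program translated once, ahead of time, into plain native code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy instruction set, invented for this illustration only. */
enum { OP_LOAD_IMM, OP_ADD, OP_HALT };

typedef struct {
    uint8_t op;   /* one of the OP_* codes */
    uint8_t dst;  /* destination register index */
    uint8_t src;  /* source register index (used by OP_ADD) */
    int32_t imm;  /* immediate value (used by OP_LOAD_IMM) */
} Insn;

/* Interpreter: fetches and decodes each instruction every time it runs. */
int32_t interpret(const Insn *prog)
{
    int32_t reg[4] = {0};
    for (size_t pc = 0; ; pc++) {
        const Insn *i = &prog[pc];
        switch (i->op) {
        case OP_LOAD_IMM: reg[i->dst & 3] = i->imm;              break;
        case OP_ADD:      reg[i->dst & 3] += reg[i->src & 3];    break;
        case OP_HALT:     return reg[0];
        default:          return -1;  /* unknown opcode: stop */
        }
    }
}

/* The sample program: r0 = 40; r1 = 2; r0 += r1; halt. */
const Insn sample_prog[] = {
    { OP_LOAD_IMM, 0, 0, 40 },
    { OP_LOAD_IMM, 1, 0, 2  },
    { OP_ADD,      0, 1, 0  },
    { OP_HALT,     0, 0, 0  },
};

/* Static recompilation: the same program, translated once and offline.
 * No decoding is left at run time; the C compiler handles the rest. */
int32_t sample_prog_translated(void)
{
    int32_t r0, r1;
    r0 = 40;    /* OP_LOAD_IMM r0, 40 */
    r1 = 2;     /* OP_LOAD_IMM r1, 2  */
    r0 += r1;   /* OP_ADD r0, r1      */
    return r0;  /* OP_HALT            */
}
```

Both `interpret(sample_prog)` and `sample_prog_translated()` compute the same value; only the second one has no decoding loop left at run time.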

This being said, Notaz managed to produce a port of Starcraft, initially a Windows 95 binary made to run with Windows libraries on an x86 architecture. And this is what you get in the end: Starcraft running on the Open Pandora, just as if it had been made for it in the first place.

Through the conversion process, Notaz ended up with an ARM binary, and he used WINE for ARM to redirect Windows library calls to equivalent calls in a generic X11 environment. Note that static recompilation is not the same thing as reverse engineering. Reverse engineering means deconstructing the structure of the original game and its data files in order to recreate it from scratch or even modify it. That is not the case here: the goal was not to understand how Starcraft was built, but simply to translate the executed code from one architecture to another.

The translation of the code went through this kind of flow:

x86 Binary -> x86 Asm Sources -> C Sources -> ARM Binary (i.e. Pandora port)

Obviously, some tools are needed to go from one end to another.

Notaz: I’m […] relying on an external program (IDA) for initial disassembly. The tools are on my github repository already, they are however extremely user-unfriendly… an awful hackjob really. The “source” of the port is huge-ass C file, it takes around a minute to compile. It doesn’t look much different from x86 disassembly, the only difference is that you can compile it for other 32 bit architectures (there is no way it would work on 64 bit ones).
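To give an idea of what C that "doesn't look much different from x86 disassembly" might look like, here is a hand-written sketch. It is not output from Notaz's tools: the function name `sub_401000`, the labels and the assembly in the comments are all made up for illustration. Note also that this sketch keeps a typed pointer so that it compiles anywhere, whereas Notaz says the real generated file only works on 32-bit targets.

```c
#include <stdint.h>

/* Hypothetical translation of a small x86 routine that sums an array.
 * Each C statement mirrors one instruction of the (invented) disassembly,
 * with local variables named after the x86 registers they stand for. */
uint32_t sub_401000(const uint32_t *a1, int32_t a2)
{
    const uint32_t *ecx = a1;   /* mov ecx, [esp+4]              */
    int32_t edx = a2;           /* mov edx, [esp+8]              */
    uint32_t eax = 0;           /* xor eax, eax                  */
loc_loop:
    if (edx <= 0)               /* test edx, edx / jle loc_done  */
        goto loc_done;
    eax += *ecx;                /* add eax, [ecx]                */
    ecx += 1;                   /* add ecx, 4 (one 32-bit word)  */
    edx -= 1;                   /* dec edx                       */
    goto loc_loop;              /* jmp loc_loop                  */
loc_done:
    return eax;                 /* ret (result in eax)           */
}
```

The gotos and register-named variables are ugly, but the C compiler is free to allocate real registers and restructure the loop for whatever architecture it targets.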

Note that Broodwar works just fine as well.

You may wonder why it is necessary to go through C source instead of decoding the x86 binary and converting it directly to an ARM binary…

Notaz: It has several advantages. First one is debugging, it just makes debugging a lot easier by compiling back to x86. This allows to do work in small increments by combining original asm code pieces with recompiled C, which allows isolating recompiler bugs rather quickly. Another is convenience and perhaps performance, you get things like ARM calling conventions taken care of by the C compiler, it also makes use of all ARM registers (with direct conversion you’d have half of them unused as IA32 only has 8), cortex-a8 dual issuing and so on. C compiler warnings are also helpful and often point to recompiler bugs.

Indeed, once you have C sources, you can compile the sources for different architectures:

C Sources -> x86 Binary
C sources -> ARM Binary

…and compare if the issues you have on one binary occur on the other.

There have been many questions about whether the tools Notaz used for Starcraft could be used to port other Windows PC games. The short answer is “no”. The longer answer…

Notaz: It only implements the very minimum of what SC needs, so it would almost certainly fail on anything else. Only part of x86 instructions are handled (not all are fully implemented), there is no support for float/mmx/sse (SC almost doesn’t use float), it only understands compiler-generated functions (code compiled from hand-coded asm will fail). […] It would take some infinite time to write universal tool that could handle larger part of games/programs, it’s easier just to say it’s impossible. And in some cases it’s indeed impossible, like for self-modifying code and dynarecs (even SC has a dynarec which had to be worked around..).

Porting without the original code is one thing, but when the resulting port has bugs or experiences crashes, dealing with these issues can be very tricky, since the generated code is not really “readable” per se. However, Notaz had a process to deal with these issues.

Notaz: Fixing bugs was the most demanding part of this project, probably 80-90% of time was spent on bugs. There are several ways to deal with them, depending on what kind of bug it is. First thing is to check if the bug shows on x86 PC build. If it does, you are lucky, it means recompiled dll files can be switched around with original ones to find the responsible dll file, or if it’s the main .exe. When that’s done, a process called bisection is used to find the C function responsible. As said before, the code is first disassembled to compilable x86 assembly and then translated to C, so half of the functions can be taken from the assembly and another half from C, then compiled to separate object files which can be linked together to x86 .exe or .dll. The result is then tested on PC and depending on the test result, bad half of C code is chosen and split again into half, random half replaced with the assembly and again compiled and linked. This is tested again and the process repeats until faulty function is found (the point here is that the x86 assembly is the original code, so it’s correct). If the function is small, it’s usually not hard to spot a problem by comparing x86 asm and C code, but in some cases the function was huge and more tricks had to be used, like jumping from the various points of C function to special code that sets up things like they were in asm, and from there jumping to original asm. Jump location is then adjusted until faulty code is isolated and the bug can finally be found and fixed.

If the bug only shows up on ARM, then it’s more difficult and more effort is needed. If it’s a crash, then stacktraces can be analyzed, arguments checked, traces added (for comparison with PC) and so on. The worst case is when there is no crash but only some incorrect behavior, but luckily there was not much of that.

About finding out if fix is robust, well, the only way to find that is lots of testing. This also applies to all “normal” software.
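The bisection Notaz describes can be sketched as a simple binary search over the list of functions. This is only an illustration under strong assumptions: exactly one function is mistranslated, and the callback `build_ok` (a hypothetical name) stands for the whole manual cycle of compiling, linking and testing a mixed build on the PC.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical test hook: returns true when a build that takes functions
 * [c_lo, c_hi) from the translated C and everything else from the original
 * (known-good) assembly behaves correctly. */
typedef bool (*build_ok_fn)(size_t c_lo, size_t c_hi, void *ctx);

/* Returns the index of the faulty translated function in [0, n), n >= 1,
 * assuming exactly one function is mistranslated. */
size_t bisect_faulty(size_t n, build_ok_fn build_ok, void *ctx)
{
    size_t lo = 0, hi = n;              /* the culprit is in [lo, hi) */
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        /* Take [lo, mid) from C, the rest from assembly, then test. */
        if (build_ok(lo, mid, ctx))
            lo = mid;                   /* that half is clean */
        else
            hi = mid;                   /* bug reproduced in that half */
    }
    return lo;
}

/* Stand-in for "compile, link and play-test": the build misbehaves only
 * when the faulty function is among those taken from C. */
bool simulated_build_ok(size_t c_lo, size_t c_hi, void *ctx)
{
    size_t faulty = *(const size_t *)ctx;
    return !(faulty >= c_lo && faulty < c_hi);
}
```

With a few thousand functions this needs only a dozen or so builds, which matters when every step involves a manual test run.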

Even Starcraft had some particularities to deal with. It uses its own dynarec, and that had to be worked around as well.

Notaz: It’s used for drawing related operations. There is a simple scripting language from which piece of x86 code is generated repeated many times. It was probably used for loop unrolling, perhaps back in 1995-1998 when the game was developed, the game developers found that their C compiler wasn’t producing very good code and they came up with solution like this. Luckily, the number of “scripts” used by SC is limited, so simple replacement functions could be written in C.
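As a rough idea of what "simple replacement functions" could mean, here is a sketch in C. Everything in it is an assumption made for illustration: the script names, the two drawing operations and the lookup table are invented, and the actual scripts used by Starcraft are not described in this article. The principle is only that, when the set of scripts is small and known, each one can be mapped to a prewritten C routine instead of generating machine code at run time.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical script identifiers, invented for this sketch. */
enum draw_script { SCRIPT_COPY, SCRIPT_COPY_SKIP_ZERO, SCRIPT_COUNT };

typedef void (*row_fn)(uint8_t *dst, const uint8_t *src, size_t n);

/* Replacement for a script that copies every pixel of a row. */
void row_copy(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Replacement for a script that treats pixel value 0 as transparent. */
void row_copy_skip_zero(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (src[i] != 0)
            dst[i] = src[i];
}

/* Instead of emitting x86 code for a script, look up its C replacement.
 * Returns NULL for a script that has no replacement. */
row_fn lookup_replacement(enum draw_script s)
{
    static const row_fn table[SCRIPT_COUNT] = { row_copy, row_copy_skip_zero };
    return ((unsigned)s < SCRIPT_COUNT) ? table[s] : NULL;
}
```

A modern C compiler can unroll these loops on its own, so the original reason for generating code at run time largely goes away.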

Loop unrolling (or unwinding) is actually an interesting aspect of programming. Depending on how good your compiler is, you can get better performance out of loops by writing more code rather than less. Here’s an example taken straight from Wikipedia:

int x;
for (x = 0; x < 100; x++)
{
    delete(x);
}

The same code could be written in more lines, with fewer loop iterations, by doing more work per iteration:

int x;
for (x = 0; x < 100; x += 5)
{
    delete(x);
    delete(x + 1);
    delete(x + 2);
    delete(x + 3);
    delete(x + 4);
}

This can be used to speed things up. It’s obviously a very simple example here, but you get the idea.
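The Wikipedia example works because 100 is a multiple of 5. When the count is not known in advance, an unrolled loop also needs a small "leftover" loop. Here is a minimal sketch of that, using a sum over an array; the function names are mine and the unroll factor of 4 is an arbitrary choice.

```c
#include <stddef.h>
#include <stdint.h>

/* Straightforward version: one element per iteration. */
uint32_t sum_simple(const uint32_t *v, size_t n)
{
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Unrolled by 4: fewer loop-condition checks per element processed. */
uint32_t sum_unrolled(const uint32_t *v, size_t n)
{
    uint32_t total = 0;
    size_t i = 0;

    /* Main loop: handles four elements per iteration. */
    for (; i + 4 <= n; i += 4) {
        total += v[i];
        total += v[i + 1];
        total += v[i + 2];
        total += v[i + 3];
    }
    /* Leftover loop: handles the last 0 to 3 elements. */
    for (; i < n; i++)
        total += v[i];

    return total;
}
```

Both functions return the same result for any length; whether the unrolled one is actually faster depends on the compiler and the CPU, and modern compilers often do this transformation by themselves.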

It took several months of work for Notaz to get to the end result. While the process described may sound easy, everything takes time, and here’s a quick overview of how time was spent on the project:

Notaz: I did all the work during my “hacking evenings” after work and the weekends, although there were other projects during that time, like Pandora’s firmware update. First task was to get a disassembly that can be compiled back to working .exe, as IDA isn’t outputting compilable assembly. That took 3 or 4 evenings to do manually (knowing what had to be done allowed later to write a plugin to automate this in IDA itself). When that succeeded (it was November, I think), I could start the tools, which took a week or so to progress enough to translate the first function to C code that could be compiled. After lot of bugfixing, the main SC exe was fully translated by the end of January, but it was too early for Pandora testing as 2 .dll files were left. Just after I asked for compo deadline extension, I had found that there was third .dll with a .snp extension, evil… The first version that started at all on Pandora was around a week after the Compo deadline, so around the end of February. Finally, on March 4th, I had the first playable release.

The trickiest parts of the conversion were related to indirect calls in the original code. This required a lot of manual work to find out how to fix them.

Notaz: There were some difficulties with indirect calls (using function pointers). For normal direct calls, calling convention can be found by analyzing the target function, but when it’s a pointer loaded from internal data structures, it’s impossible to know what it’s calling and the calling convention (are the arguments passed in registers, on stack, or both?). For that I had to actually run the code and annotate it later (write some ‘help text’ in assembly for the tools to read) with what the convention is, so that the translation tool knows. If an annotation is missing, the tools will try some heuristics or just make a guess. Some Win32 APIs are inconvenient for translation (DirectX is based on function pointers…). There was some clearly hand-coded assembly in the video decoder .dll. That code doesn’t express well in C, as the functions were not following any calling conventions, so more manual annotations were needed. Some functions had to be converted by hand. There were a couple of instances of “insane” code that doesn’t do any useful work but was confusing the tools. Perhaps some bugs in the compiler used for the original program… I had to remove that code with annotations.
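To illustrate why the calling convention matters for an indirect call, here is a small sketch in C. It is a simulation, not how Notaz's tools work: the `x86_state` structure, the two convention names and the run-time `switch` are assumptions made for the example. In the real workflow the annotation is read at translation time, so the tool can emit the right call directly.

```c
#include <stdint.h>

/* Hypothetical annotation values: where a call site's two arguments live. */
enum conv { ARGS_ON_STACK, ARGS_IN_REGS };

typedef uint32_t (*target_fn)(uint32_t a, uint32_t b);

/* Emulated x86 state as it might look just before `call [ptr]`. */
struct x86_state {
    uint32_t ecx, edx;    /* would hold the arguments if passed in registers */
    uint32_t stack[2];    /* stack[0] = [esp], stack[1] = [esp+4] */
};

/* Translated form of an indirect call. Nothing in the pointer itself says
 * where the callee expects its arguments, so the annotation has to decide. */
uint32_t call_indirect(target_fn fn, const struct x86_state *st,
                       enum conv annotated)
{
    switch (annotated) {
    case ARGS_IN_REGS:
        return fn(st->ecx, st->edx);
    case ARGS_ON_STACK:
    default:
        return fn(st->stack[0], st->stack[1]);
    }
}

/* Example callee, used only to show that the two guesses disagree. */
uint32_t sub_args(uint32_t a, uint32_t b)
{
    return a - b;
}
```

With registers holding 10 and 3 and the stack holding 100 and 1, the same call yields 7 under one annotation and 99 under the other, which is exactly the kind of silent mistake a wrong guess produces.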

And there you go. Thanks to his months of work on the conversion, we now have a full-featured, bug-free version of Starcraft running perfectly on the Open Pandora. An amazing game to play on the go.

Many thanks to Notaz for his availability to discuss this topic.

4 Comments on "Starcraft on Open Pandora: How the Port Came to Be"

Steven Craft
Guest

So neat; and such persistence – typically I’d expect something like this to go to some early proof of concept and then fizzle out, but to get to a perfect final product takes determination.

wardred
Guest

Really nice write up. I’m with Steven. Notaz it’s amazing the effort you put into not just getting Starcraft to run, but to run well.

klapse
Guest

There are not many people on the planet who could pull this off.
