Building a MIPS Assembly Interpreter in Python

Building a MIPS interpreter from scratch with Python.

In my CCS 1B class (equivalent to UCSB's CS 64), we are learning about Assembly using the MIPS instruction set. Most of the "labs" are just homework questions about implementing simple programs or doing bit shifting by hand. As CCS students, we get to do something a little more fun: we're building an entire Assembly interpreter from scratch!

Example: minimum.asm

GitHub repo here:

GitHub - ThePickleGawd/MIPS-python: MIPS Assembly Interpreter in Python

Getting Started

Prof. Balkind gave us the option to use any language we wanted. For me, that meant I'd either be writing this in C++ or Python. But obviously I chose Python since it's way easier.

We learned that CPUs follow a "fetch-decode-execute" cycle to process instructions. So I wanted to abstract all the functionality into a nice, clean loop.

main.py
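A minimal sketch of what such a loop might look like. The helper names (`fetch`, `decode`, `execute`, `run`) and the dict-based CPU state are assumptions for illustration, not necessarily what the repo uses, and `decode`/`execute` are placeholders here.

```python
# Sketch of a fetch-decode-execute loop. All names are illustrative.

def fetch(cpu):
    """Return the word at PC, then advance PC to the next instruction."""
    word = cpu["text"][(cpu["pc"] - cpu["text_start"]) // 4]
    cpu["pc"] += 4
    return word

def decode(word):
    """Placeholder: pull out only the opcode (top 6 bits)."""
    return {"opcode": (word >> 26) & 0x3F, "raw": word}

def execute(cpu, instr):
    """Placeholder: just record which opcode was seen."""
    cpu["trace"].append(instr["opcode"])

def run(cpu):
    """Loop until PC leaves the loaded text segment."""
    end = cpu["text_start"] + 4 * len(cpu["text"])
    while cpu["text_start"] <= cpu["pc"] < end:
        word = fetch(cpu)
        instr = decode(word)
        execute(cpu, instr)

cpu = {"pc": 0x400024, "text_start": 0x400024,
       "text": [0x34080005, 0x34090007], "trace": []}
run(cpu)
```

Keeping the three stages as separate functions means each one can be built and tested on its own.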

Now, I just have to build out each component, and I'll be done!

Decode

I started with decode since that was the most straightforward. Basically, the goal was to take an instruction represented in hex (for example: 0x014B4820), and parse it into something easier for me to work with. I decided to build my own Python type based on the MIPS reference card.

Depending on instr_type, some fields may not have a value
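One way such a type could be written is with `typing.TypedDict`. This is a sketch: apart from `instr_type`, the field names are assumptions taken from the standard MIPS field names, and fields that don't apply to a given format are set to `None`.

```python
from typing import Optional, TypedDict

class Instruction(TypedDict):
    instr_type: str            # "R", "I", or "J"
    opcode: int                # bits 31-26, present in every format
    rs: Optional[int]          # R and I types
    rt: Optional[int]          # R and I types
    rd: Optional[int]          # R type only
    shamt: Optional[int]       # R type only
    funct: Optional[int]       # R type only
    imm: Optional[int]         # I type only (16-bit immediate)
    address: Optional[int]     # J type only (26-bit target)

# 0x014B4820 is `add $t1, $t2, $t3`, an R-type instruction.
example: Instruction = {
    "instr_type": "R", "opcode": 0,
    "rs": 10, "rt": 11, "rd": 9, "shamt": 0, "funct": 0x20,
    "imm": None, "address": None,
}
```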

After some bit shifting and masking, I was able to construct the dictionary and return it.
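The shifting and masking can be sketched like this. The function name `decode` and the dictionary keys are assumptions, and the rule for picking the format (opcode 0 is R-type, opcodes 2 and 3 are J-type, everything else is treated as I-type) is a simplification that covers the basic integer instructions.

```python
def decode(word):
    """Split a 32-bit MIPS instruction word into its fields."""
    opcode = (word >> 26) & 0x3F              # top 6 bits
    if opcode == 0:                           # R type
        return {
            "instr_type": "R", "opcode": opcode,
            "rs": (word >> 21) & 0x1F,
            "rt": (word >> 16) & 0x1F,
            "rd": (word >> 11) & 0x1F,
            "shamt": (word >> 6) & 0x1F,
            "funct": word & 0x3F,
        }
    if opcode in (0x02, 0x03):                # J type: j, jal
        return {"instr_type": "J", "opcode": opcode,
                "address": word & 0x3FFFFFF}
    return {                                  # I type (simplified default)
        "instr_type": "I", "opcode": opcode,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "imm": word & 0xFFFF,                 # raw, not yet sign-extended
    }

add_instr = decode(0x014B4820)   # add $t1, $t2, $t3
ori_instr = decode(0x34080005)   # ori $t0, $zero, 5
```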

Execute

Now that we can easily understand each instruction, we need to tell our interpreter what to do when it sees one. I started with a map from the opcode/funct field to the corresponding function definition. I made one dict each for the R, I, and J types.

functions.py
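A sketch of how such dispatch tables might be laid out. The table names (`R_TYPE`, `I_TYPE`, `J_TYPE`), the handler signatures, and the dict-based CPU state are all assumptions. R-type instructions are keyed by funct, while I and J types are keyed by opcode. Only three instructions are filled in, and `add` here wraps to 32 bits instead of trapping on overflow.

```python
MASK32 = 0xFFFFFFFF

def add(cpu, instr):
    regs = cpu["regs"]
    regs[instr["rd"]] = (regs[instr["rs"]] + regs[instr["rt"]]) & MASK32

def ori(cpu, instr):
    regs = cpu["regs"]
    regs[instr["rt"]] = regs[instr["rs"]] | instr["imm"]   # zero-extended imm

def j(cpu, instr):
    # Keep the top 4 bits of PC, replace the rest with address * 4.
    cpu["pc"] = (cpu["pc"] & 0xF0000000) | (instr["address"] << 2)

R_TYPE = {0x20: add}   # keyed by funct
I_TYPE = {0x0D: ori}   # keyed by opcode
J_TYPE = {0x02: j}     # keyed by opcode

def execute(cpu, instr):
    kind = instr["instr_type"]
    if kind == "R":
        handler = R_TYPE[instr["funct"]]
    elif kind == "I":
        handler = I_TYPE[instr["opcode"]]
    else:
        handler = J_TYPE[instr["opcode"]]
    handler(cpu, instr)

cpu = {"pc": 0x400028, "regs": [0] * 32}
execute(cpu, {"instr_type": "I", "opcode": 0x0D, "rs": 0, "rt": 8, "imm": 5})
execute(cpu, {"instr_type": "I", "opcode": 0x0D, "rs": 0, "rt": 9, "imm": 7})
execute(cpu, {"instr_type": "R", "funct": 0x20, "rs": 8, "rt": 9, "rd": 10})
```

Supporting a new instruction then only means writing one function and adding one dictionary entry.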

Then, I grinded out each function. For example, here's a very simple implementation of addi.

functions.py
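A simple addi might look roughly like this. It is a sketch with an assumed signature (a plain list of 32 register values plus the decoded fields); it sign-extends the 16-bit immediate, wraps the result to 32 bits, and skips the overflow trap that real hardware raises.

```python
def sign_extend_16(value):
    """Interpret a 16-bit field as a signed two's-complement integer."""
    return value - 0x10000 if value & 0x8000 else value

def addi(regs, rs, rt, imm):
    """regs[rt] = regs[rs] + sign-extended imm (overflow trap not modeled)."""
    if rt != 0:                       # $zero always stays 0
        regs[rt] = (regs[rs] + sign_extend_16(imm)) & 0xFFFFFFFF

regs = [0] * 32
regs[8] = 5
addi(regs, 8, 9, 0xFFFF)   # addi $t1, $t0, -1
addi(regs, 8, 10, 3)       # addi $t2, $t0, 3
```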

The most annoying part was implementing beq and bne. Every time I fetch an instruction, I automatically increment the Program Counter, but the branch formula on the MIPS reference card (PC = PC + 4 + BranchAddr) assumes the PC hasn't been incremented yet, so following it literally on top of my own increment adds 4 twice. I spent like 3 hours debugging before I finally realized this. Ugh.

functions.py
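A sketch of branches written for a fetch step that has already added 4 to the PC. The function signatures and the dict-based CPU state are assumptions. Since the PC already points at the next instruction, only the sign-extended offset times 4 is added.

```python
def sign_extend_16(value):
    """Interpret a 16-bit field as a signed two's-complement integer."""
    return value - 0x10000 if value & 0x8000 else value

def beq(cpu, rs, rt, imm):
    # PC was already advanced by fetch, so do NOT add another 4 here.
    if cpu["regs"][rs] == cpu["regs"][rt]:
        cpu["pc"] += sign_extend_16(imm) << 2

def bne(cpu, rs, rt, imm):
    if cpu["regs"][rs] != cpu["regs"][rt]:
        cpu["pc"] += sign_extend_16(imm) << 2

# Branch instruction sits at 0x400024; fetch already moved PC to 0x400028.
cpu = {"pc": 0x400028, "regs": [0] * 32}
beq(cpu, 8, 9, 3)            # $t0 == $t1, so skip 3 instructions ahead
pc_after_beq = cpu["pc"]
cpu["regs"][8] = 1
bne(cpu, 8, 9, 0xFFFE)       # offset -2, so jump 8 bytes back
pc_after_bne = cpu["pc"]
```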

Writing syscall was also a little annoying because the data was in little-endian format, so I had to read the hex bytes of each word backwards, but then move the memory address forward after reading each word.
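To make the byte order concrete, here is a sketch of reading a null-terminated string out of memory stored as a list of 32-bit words, as a print-string style syscall would need. The function name `read_string` and its parameters are hypothetical. In a little-endian word the lowest address holds the least significant byte, which is the rightmost pair of hex digits.

```python
def read_string(words, start_addr, base_addr):
    """Read bytes from word-indexed memory until a 0 byte is found."""
    chars = []
    addr = start_addr
    while True:
        offset = addr - base_addr
        word = words[offset // 4]
        # Byte 0 of the word is the low 8 bits, byte 3 is the high 8 bits.
        byte = (word >> (8 * (offset % 4))) & 0xFF
        if byte == 0:
            break
        chars.append(chr(byte))
        addr += 1
    return "".join(chars)

# "Hello" in memory: 'H' 'e' 'l' 'l' | 'o' 0 0 0
data = [0x6C6C6548, 0x0000006F]
text = read_string(data, 0x10000000, 0x10000000)
```

This sketch raises an IndexError if the string runs off the end of memory without a terminator.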

Fetch

This part was pretty fun too. After running spim -assemble [file].asm, it gives you a dump of all the instructions and data in a nice hex format. It looks something like this:

.text # 0x400024 .. 0x400044
.word 0x34080005, 0x34090007, 0x1285020, 0x34020001, 0x82021, 0xc, 0x3402000a, 0xc
.data # 0x10000000 .. 0x10010000
.word 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
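A dump in this shape can be parsed with a few lines of Python. This is a sketch under the assumption that each section is a header line followed by `.word` lines, as in the sample above; the function name `parse_dump` and the returned layout are made up for illustration.

```python
import re

def parse_dump(text):
    """Return {"text": {...}, "data": {...}} with start address and words."""
    sections = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        header = re.match(r"\.(text|data)\s*#\s*(0x[0-9a-fA-F]+)", line)
        if header:
            current = header.group(1)
            sections[current] = {"start": int(header.group(2), 16), "words": []}
        elif line.startswith(".word") and current:
            values = line[len(".word"):].split(",")
            # Skip the empty piece left behind by a trailing comma.
            sections[current]["words"] += [int(v.strip(), 16)
                                           for v in values if v.strip()]
    return sections

DUMP = """\
.text # 0x400024 .. 0x400044
.word 0x34080005, 0x34090007, 0x1285020, 0x34020001, 0x82021, 0xc, 0x3402000a, 0xc
.data # 0x10000000 .. 0x10010000
.word 0x0, 0x0, 0x0, 0x0,
"""
memory = parse_dump(DUMP)
```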

Using Python magic, I parsed it and loaded all the instructions and data into an array. Whenever I want to access the data/instruction from its actual memory address, I just subtract the start offset (the first hex value after #) and divide by 4 to get the index in my array. For example:

cpu.py
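A sketch of the address-to-index conversion. Only the name `pc_idx` comes from this post; the `CPU` class layout, its attributes, and `data_idx` are assumptions.

```python
class CPU:
    def __init__(self, text_start, text, data_start, data):
        self.text_start = text_start   # e.g. 0x400024
        self.text = text               # list of instruction words
        self.data_start = data_start   # e.g. 0x10000000
        self.data = data               # list of data words
        self.pc = text_start

    def pc_idx(self):
        """Index into self.text for the current PC."""
        return (self.pc - self.text_start) // 4

    def data_idx(self, addr):
        """Index into self.data for a word-aligned data address."""
        return (addr - self.data_start) // 4

cpu = CPU(0x400024, [0x34080005, 0x34090007, 0x1285020], 0x10000000, [0] * 4)
cpu.pc = 0x40002C
current = cpu.text[cpu.pc_idx()]
```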

To fetch the next instruction, all we need to do is get the value in the array at pc_idx() and then increment PC by 4.

Conclusion

That's it! It was pretty fun building this interpreter. I think this was actually pretty similar to the Chip8 emulator tutorial that I never completed a few years ago. Anyways, lots of cooler projects coming up, like a VR Duolingo game but it's a zombie outbreak and you have to talk in Chinese to survive.