  1. Dec 25, 2021
      Memory subsystem rewrite + EE IRQs! · ec313120
      Geo Ster authored
      This is a pretty big commit, so the description is going to be a whole
      essay again explaining all the changes. Emulation is extremely
      complicated, so I need to explain all of my reasoning and sources.
      This commit contains 3 major changes that all work together to form the new memory subsystem:
      
      * New handler infrastructure
      * Compiler switch to clang-cl
      * Initial implementation of EE interrupts
      
      Now you, reader, might wonder why I decided to redo the relatively simple
      and straightforward system we had before. Well, that system had some
      drawbacks that I think needed to be addressed early on. Firstly, it is
      highly centralized, which means that for every new component the read/write
      functions of the ComponentManager (now Emulator) need to be updated. This isn't
      as big of an issue as the second one though. The old system relies heavily
      on branches to figure out the destination of a read/write, which is bad for
      performance. Because our address ranges aren't contiguous, the
      compiler has a hard time optimizing the switch statement. This leads to a lot
      of assembly code with many jumps.
      
      The initial idea for this new system was taken from a PCSX2 devblog I read
      recently: https://pcsx2.net/developer-blog/218-so-maybe-it-s-about-time-we-explained-vtlb.html
      It explains a system where the address range is divided into pages and each
      page is served by a handler function. This is perfect for us, because it moves
      most of the work to the initialization phase (when the components register
      their handlers), while reads/writes are very fast, only having to look up
      the handler table and call the appropriate function.
      
      However, it isn't that easy to implement. The main problem
      was how to store member functions of different classes in a single array
      and call them without knowing their type. First I thought of using
      std::function, which is perfect for this due to its type erasure, but it
      was quickly ruled out because of its very high overhead. Next, I considered inheritance
      and virtual functions, which was a step in the right direction. However, that
      also has the overhead of looking up the vtable. Finally, though, I discovered
      a neat little trick with function pointers. You can actually cast a pointer to
      a derived class member function to a pointer to a base class member function, as long as the
      base class isn't ambiguous. So the final solution was to make all the components
      inherit from an empty (for now) Component class and store a common Component member function pointer.
      The compiler will handle the rest, with a dose of magic and inheritance!
      The handler interface is located in the common/component.h file.
      You can check out the IOP DMA controller constructor to see how a component can register
      handlers with this system.
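
      To make the trick concrete, here is a minimal sketch of such a handler table. It is
      not the code from common/component.h: the names Bus, Handler, ReadHandler, IopDma and
      register_read are hypothetical, and only read handlers are shown.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Empty base class that every component inherits from.
struct Component {};

// Common handler signature: a pointer to a Component member function.
using ReadHandler = uint32_t (Component::*)(uint32_t addr);

struct Handler {
    Component* object;  // instance that owns the page (non-owning)
    ReadHandler read;   // member function with the concrete type erased
};

// Hypothetical component, used only for this example.
struct IopDma : Component {
    uint32_t last_addr = 0;
    uint32_t read(uint32_t addr) { last_addr = addr; return 0xDEADBEEF; }
};

class Bus {
public:
    static constexpr uint32_t kPageSize = 0x80;

    explicit Bus(uint32_t pages) : table_(pages) {}

    // The derived-to-base member pointer cast happens once, at registration.
    template <typename T>
    void register_read(uint32_t page, T* object, uint32_t (T::*fn)(uint32_t)) {
        table_.at(page) = Handler{object, static_cast<ReadHandler>(fn)};
    }

    // Hot path: one table lookup followed by one indirect call.
    uint32_t read(uint32_t addr) {
        Handler& h = table_[addr / kPageSize];
        assert(h.object && h.read && "no handler registered for this page");
        return (h.object->*h.read)(addr);
    }

private:
    std::vector<Handler> table_;
};
```

      Calling through the stored pointer is only valid because the object really is an
      instance of the class the member function came from, which registration guarantees here.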
      
      This is very efficient, generating only 10-15 lines of assembly (with clang 12.0), which
      leads me to the second change, that of the compiler. The switch to clang-cl was made primarily
      for performance reasons. clang generates a lot more efficient code than MSVC does, so the switch
      will improve performance down the road. It also catches more warnings and code issues, allowing for
      cleaner code overall.
      
      The next hurdle was figuring out the handler page size. This is more difficult than it seems, because there
      are additional "hidden" addresses the BIOS writes to, which aren't listed in the ps2tek
      memory map. Making the page size too big will lead to these garbage addresses being handled
      by our components, which defeats the purpose of this whole system. Making the page size too
      small, though, will both make the handler table massive and require components to register
      many handlers to cover their address ranges. So after studying the memory map for a while, I
      decided that 0x80 = 128 bytes is the best size. For example, in the DMAC (EE DMA) each channel takes up
      exactly 0x80, while in the IOP DMA each channel group is also exactly 0x80 in size.
      0x80 is, in addition, small enough that garbage addresses don't get caught.
      Even if something like that does happen, I have placed asserts on debug builds to catch it.
      
      Our struggle isn't done though! The initial handler table ended up causing stack
      overflows because the array was too large. To mitigate this, the stack size was increased
      to 10MB and a small optimization was implemented. If you view all the addresses in the memory map of
      the PS2, a pattern emerges. It turns out that one hex digit inside the address is always zero, no matter the address
      (except for 0xfffe addresses which we don't care about). This means we can "squash"
      the address by removing that digit, allowing us to significantly reduce the handler table size:
      
      0x100|0|3070 -> 0x1003070
      0x120|0|0060 -> 0x1200060
      0x1F4|0|2006 -> 0x1F42006
      0x1F8|0|1120 -> 0x1F81120
      0x1F9|0|01AC -> 0x1F901AC
      
      This is implemented in the Emulator::calculate_page function.
      A debug assert is also placed here to ensure nothing out of the ordinary happens.
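
      The squashing shown in the examples above can be sketched as follows. This is an
      illustration, not the actual body of Emulator::calculate_page: the helper name
      squash_address is hypothetical, and dividing by the 0x80 page size afterwards is an
      assumption about how the page index is derived.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kPageSize = 0x80;

// Drops the always-zero hex digit in bits 16-19, e.g. 0x10003070 -> 0x1003070.
inline uint32_t squash_address(uint32_t addr) {
    // Debug-only check that the digit we are about to drop really is zero.
    assert(((addr >> 16) & 0xF) == 0 && "unexpected non-zero digit in address");
    return ((addr >> 20) << 16) | (addr & 0xFFFF);
}

// Index into the handler table for a given address (assumed layout).
inline uint32_t calculate_page(uint32_t addr) {
    return squash_address(addr) / kPageSize;
}
```

      Removing one hex digit shrinks the squashed address space, and with it the handler
      table, by a factor of 16.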
      
      Finally, I also implemented EE interrupts because they are needed at this stage. Timer 5 should normally
      be ticking by now (next commit, I promise) and is waiting to cause an interrupt, so we need to have those implemented.
      The implementation is taken from a new document I found, which is the same as the previous one but more focused on
      the EE and its features, something that should help us a lot in the near future. Right now it's not finished, but
      that will come in the next commit.
  2. Dec 12, 2021
      Add support for IOP timers and interrupts · af6560f5
      Geo Ster authored
      * After a while the IOP starts setting up timer 5, so we need to start
      implementing timer support. Timers are pretty simple actually. Each one
      has three 32-bit registers (we use 64-bit registers to check for overflow): a
      count register that counts the number of cycles, a mode register that
      configures the timer, and a target register that generates an interrupt
      when count == target. For now that's all we need.
      
      * In addition, interrupts are also implemented. These are a bit more
      complicated since they involve COP0, but not too difficult either. The IOP
      has 3 registers: I_MASK, I_STAT and I_CTRL. I_CTRL acts as a global
      enable/disable, so it's pretty simple. I_STAT is a bit mask that states
      which interrupts are pending. I_MASK, on the other hand, can
      enable/disable specific interrupts. So to check if an interrupt will be
      executed we must do !I_CTRL && (I_MASK & I_STAT). All info can be found
      on ps2tek/nocash psx.
      
      * On the EE side, a few new instructions are added to progress further.
      Now the EE starts setting up the GIF, which is quite exciting!
      
      * I think it's a good time to also elaborate on how we read registers. Instead
      of using switch statements I prefer pointers and structs, because these
      generate a lot more compact code, even with compiler optimizations, and eliminate
      the need for branches. This should not be of concern in normal applications, but
      we are special ;). Most registers on the console are 32-bit, so structs
      are cast to uint32_t* to access them. And since the offsets are always in bytes,
      we must divide them by sizeof(uint32_t) = 4 (I prefer >> 2 since it's more efficient).
      
      * Some registers, however, are peculiar in the sense that parts of them
      are located in completely different address ranges. This is bad because,
      for example, timer 0 and timer 3 have the same offset of 0 from their respective
      address ranges. To fix this, we introduce a variable called group that records
      which "group" the write/read is referring to, using some simple bit masks.
      The result is cast to bool, which converts it to 0 or 1. Then
      the expression "offset + group * <number>" is used to access registers.
      In the timer example, accessing timer 0 will give group 0 and offset 0,
      so timer 0 will be accessed. With timer 3, though, group will be 1, so
      0 + 1 * 3 = 3 will be accessed. This is a convenient way to avoid branches.
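
      The interrupt check and the group indexing described in this commit can be sketched
      as below. The sketch assumes the ps2tek layout with timers 0-2 at 0x1F801100 and
      timers 3-5 at 0x1F801480, spaced 0x10 bytes apart; the 0x400 mask and all names are
      hypothetical. It uses a flat array instead of casting a struct to uint32_t*, purely
      to keep the example free of aliasing concerns.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr uint32_t kTimersPerGroup = 3;
constexpr uint32_t kSlotsPerTimer = 4;  // count, mode, target + one unused slot per 0x10 bytes

// Timer number (0-5) selected by an IOP timer register address.
inline uint32_t timer_index(uint32_t addr) {
    uint32_t group = static_cast<bool>(addr & 0x400);  // 0: timers 0-2, 1: timers 3-5
    uint32_t offset = (addr >> 4) & 0x3;               // timer within its group
    assert(offset < kTimersPerGroup && "address outside the timer ranges");
    return offset + group * kTimersPerGroup;           // no branches needed
}

// Flat 32-bit slot of the register; byte offset >> 2 equals / sizeof(uint32_t).
inline uint32_t register_slot(uint32_t addr) {
    return timer_index(addr) * kSlotsPerTimer + ((addr & 0xF) >> 2);
}

struct Timers {
    std::array<uint32_t, 6 * kSlotsPerTimer> regs{};
    uint32_t read(uint32_t addr) const { return regs[register_slot(addr)]; }
    void write(uint32_t addr, uint32_t value) { regs[register_slot(addr)] = value; }
};

struct InterruptRegs {
    uint32_t i_stat = 0;  // pending interrupts
    uint32_t i_mask = 0;  // per-interrupt enable bits
    uint32_t i_ctrl = 0;  // global control
};

// Mirrors the condition quoted above: !I_CTRL && (I_MASK & I_STAT).
inline bool interrupt_pending(const InterruptRegs& r) {
    return !r.i_ctrl && (r.i_mask & r.i_stat) != 0;
}
```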
  3. Dec 02, 2021
  4. Nov 30, 2021
      Introducing the IOP · 1d16fdad
      Geo Ster authored
      * So after a week, it's finally here! The initial implementation of the
      IOP has been added to the emulator. You might wonder why it took so
      long. This was mostly because I wanted to make the implementation as complete
      as possible and also test it to ensure it's bug free. It is actually
      based on the MIPS R3000A interpreter I wrote last year for my PS1 emulator.
      So did I just copy the code and call it a day? Hell no, the code in that
      ancient project is awful, even if it works. Instead, I completely rewrote the
      interpreter using our modern techniques of storing state. Rewriting the old
      code also allowed me to test if it actually worked in that environment
      and could boot PSX games.
      
      * Due to this, the implementation is a bit more complete than the EE
      as it includes interrupt support. In addition, we have to account for
      the fact that the IOP runs at 36.864MHz, in contrast to the EE, which
      clocks at 295MHz. This maps approximately to a 1/8 ratio, which means
      that 1 IOP instruction will run every 8 EE cycles. The current implementation
      of this is hacky and a bit inaccurate because some EE instructions
      can take more than 1 cycle to execute, but it's good enough for now
      (Play! assumes this as well and can boot 40%+ of games).
      
      * Because both CPU emulators share a lot of naming conventions,
      each processor has been separated into its own namespace to avoid confusion,
      so we can always know which CPU we are referring to. Finally, for now
      reads/writes, except for the BIOS and IOP RAM, haven't been implemented
      but will come soon.
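
      The 1/8 ratio above (36.864MHz * 8 = 294.912MHz) can be sketched as a simple cycle
      counter. The Scheduler name and its fields are hypothetical, and like the commit it
      treats every EE step as exactly one cycle.

```cpp
#include <cstdint>

struct Scheduler {
    static constexpr uint64_t kEeCyclesPerIopStep = 8;

    uint64_t ee_cycles = 0;
    uint64_t iop_instructions = 0;

    // Advances the EE by one cycle; returns true when the IOP should run one instruction.
    bool tick() {
        ++ee_cycles;
        if (ee_cycles % kEeCyclesPerIopStep == 0) {
            ++iop_instructions;
            return true;
        }
        return false;
    }
};
```

      In a main loop this would look like: step the EE, call tick(), and step the IOP
      whenever it returns true.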
  5. Nov 16, 2021
      Move ComponentManager to a separate thread · d8a4a251
      Geo Ster authored
      * Until now the interpreter was extremely slow
      because the handling of window events from glfw blocked the execution
      of new instructions. Moving the execution to a separate thread results
      in massive performance improvements. However, the current implementation
      is only temporary and will be modularized in the future.
  6. Nov 15, 2021
      Add more shift (SRA/SLL) instructions · 49cabc82
      Geo Ster authored
      * Now the BIOS enters another infinite loop. However, that seems to be
      normal, as it's waiting for COP0_REG[9], the timer which we haven't
      implemented yet. I think it's also time to add support for interrupts,
      since these go hand in hand with timers.
  7. Nov 04, 2021
      Add scratchpad handler · 49a61fb1
      Geo Ster authored
      * Currently the BIOS tries to write something to the scratchpad and it fails
      because its address is not in our currently supported range. So add a new buffer
      for the 16KB scratchpad used by the CPU and add a function to use it if the
      address is in the correct range.
      
      * Also shorten some function names and change some arrays to C-style because we
      like to work with pointers and STL doesn't like that.
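
      A scratchpad buffer with a range check can be sketched as follows. The base address
      0x70000000 is taken from the ps2tek memory map; the Scratchpad type and its members are
      hypothetical, and accesses are assumed to be aligned to their own size.

```cpp
#include <cstdint>
#include <cstring>

constexpr uint32_t kScratchpadBase = 0x70000000;  // from the ps2tek memory map
constexpr uint32_t kScratchpadSize = 16 * 1024;   // 16KB

struct Scratchpad {
    uint8_t data[kScratchpadSize] = {};  // C-style buffer, as in the commit

    // Unsigned wrap-around makes this a single comparison for both bounds.
    static bool contains(uint32_t addr) {
        return addr - kScratchpadBase < kScratchpadSize;
    }

    template <typename T>
    T read(uint32_t addr) const {
        T value;
        std::memcpy(&value, data + (addr & (kScratchpadSize - 1)), sizeof(T));
        return value;
    }

    template <typename T>
    void write(uint32_t addr, T value) {
        std::memcpy(data + (addr & (kScratchpadSize - 1)), &value, sizeof(T));
    }
};
```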
  8. Oct 31, 2021
      Add initial memory map · a54398cc
      Geo Ster authored
      * The ps2tek documentation states the memory map clearly [1].
      It seems to have a similar architecture to the PSX, where
      the main memory map (KUSEG) is mirrored in multiple regions
      (KSEG0/KSEG1) with different access patterns for each region.
      For now we don't have to emulate all of them, just the main
      memory map.
      
      * Allocate the entire 512MB memory into an array and make a convenient
      struct to abstract memory range operations.
      
      [1] https://psi-rockin.github.io/ps2tek/#memorymap
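
      A range helper of the kind mentioned above could look like the sketch below. The Range
      struct and the constant names are hypothetical; the RAM and BIOS ranges are taken from
      the ps2tek memory map [1].

```cpp
#include <cstdint>

// Half-open address range [start, start + size).
struct Range {
    uint32_t start;
    uint32_t size;

    // Unsigned wrap-around makes this false for addresses below start as well.
    constexpr bool contains(uint32_t addr) const { return addr - start < size; }

    // Byte offset of addr from the beginning of the range.
    constexpr uint32_t offset(uint32_t addr) const { return addr - start; }
};

constexpr Range kRam{0x00000000, 32 * 1024 * 1024};  // 32MB main RAM
constexpr Range kBios{0x1FC00000, 4 * 1024 * 1024};  // 4MB BIOS ROM
```

      A read would then test kBios.contains(addr) and index the backing array with
      kBios.offset(addr).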
      Add initial EE/Bus implementations · 2ad0640f
      Geo Ster authored
      * This commit adds a most basic CPU class that acts as a template
      which we will slowly build.
      
      * The architecture is pretty simple; the ComponentManager will create all
      the separate components (EE, VP, IOP, GS etc.) as unique_ptrs, since
      it owns them and only it has access to them. All the other components
      must pass through the manager to read/write data to memory.
      To achieve this they are given a pointer to the ComponentManager in their constructor.
      
      * For now the CPU directly accesses the bios which shouldn't
      happen but will be fixed eventually when I implement generic
      read/writes. The goal is to start implementing the CPU as fast as
      possible in order to get to the GPU/VPU's and display something!
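
      The ownership scheme described above can be sketched like this. Only the
      ComponentManager name comes from the commit; EmotionEngine, fetch, read and the tiny
      memory array are hypothetical stand-ins.

```cpp
#include <cstdint>
#include <memory>

class ComponentManager;

class EmotionEngine {
public:
    explicit EmotionEngine(ComponentManager* parent) : manager(parent) {}

    // All memory traffic goes through the manager.
    uint32_t fetch(uint32_t addr);

private:
    ComponentManager* manager;  // non-owning back pointer
};

class ComponentManager {
public:
    // The manager owns its components and hands each one a pointer to itself.
    ComponentManager() : ee(std::make_unique<EmotionEngine>(this)) {}

    // The back pointers would dangle if the manager were copied or moved.
    ComponentManager(const ComponentManager&) = delete;
    ComponentManager& operator=(const ComponentManager&) = delete;

    // Placeholder read over a tiny word-addressed memory.
    uint32_t read(uint32_t addr) const { return memory[(addr >> 2) & 3]; }

    std::unique_ptr<EmotionEngine> ee;

private:
    uint32_t memory[4] = {0x11, 0x22, 0x33, 0x44};
};

inline uint32_t EmotionEngine::fetch(uint32_t addr) { return manager->read(addr); }
```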
      Initial commit · 9c39208d
      Geo Ster authored
      * This is the beginning of a surely arduous journey of semi-correctly emulating
      the PS2, the flagship console from Sony in 2001. The console was chosen
      for its impressive performance at the time, its relatively simple MIPS architecture
      compared to its PowerPC (Gamecube) and x86 (Xbox) competitors, and because
      I own one, since I wouldn't want to be caught doing piracy in an open source
      competition...
      
      The PS2 also has a myriad of resources available, including comprehensive CPU documentation
      for its MIPS ISA, which will be used in the development of this emulator. Any sources that I use will be referenced
      in the corresponding commits for the judges to look at. For development, hardware documentation and info from existing emulators will be used
      (I'll try to avoid using code from other projects as much as possible though).
      I've also done a PS1 emulator in the past, so the minor similarities in architecture
      will help speed this process up a little.
      
      For now this is just a window with glfw and a ready opengl context.
      I hope it will be able to boot the PS2 soon enough though...