  1. Jan 01, 2022
    • Geo Ster's avatar
      VU0: Add basic ADD/SUB/MUL instructions · 787df2b4
      Geo Ster authored
      * The BIOS uses them to initialize the VU0 registers. For now I don't
      check for overflow, but I think it's going to become necessary in the
      near future.
      787df2b4
    • Geo Ster's avatar
      Report ready DVD status to CDVDMAN · 71aab146
      Geo Ster authored
      * Early on the IOP continuously reads from 0x1F402005, which
      is the N command status register [1], to know if it's
      ready or not. So to be safe, let's report that the drive is ready
      by returning 0x40 (bit 6 set)
      
      [1] https://psi-rockin.github.io/ps2tek/#cdvdioports
      71aab146
    • Geo Ster's avatar
      Add VU0 support · b591d341
      Geo Ster authored
      * The VUs are custom made SIMD processors used to accelerate
      math operations on vectors and matrices. This doesn't seem that
      bad but in reality they are probably the hardest piece of hardware
      on the PS2 to emulate correctly. That is for two reasons:
      
      1. Not much documentation
      2. Complex and confusing pipeline
      
      The first is pretty self-explanatory. However, the second reason,
      the pipeline, is what makes them so hard. Normally, even in LLE
      emulators we don't care about the internal pipeline of the chips,
      as it doesn't affect the result of the instructions themselves,
      it just makes them run faster. The CPU doesn't expose its pipeline
      to the target program.
      
      Some architectures are different. MIPS for example has branch delay
      slots which in reality are a pipeline quirk. Generally the more
      pipeline quirks you expose to the program the more complex it is
      to emulate correctly. The VUs basically expose their full pipeline...
      
      For now we only support a portion of the macro mode instruction set.
      The pipeline is going to come when the VU starts executing micro programs
      and when I figure out how it works...
      b591d341
  2. Dec 29, 2021
    • Geo Ster's avatar
      Lay the foundation for the DMAC · 690d9651
      Geo Ster authored
      * The BIOS now continues by initializing the DMA Controller.
      This is one of the most important hardware components of the PS2,
      as it assists the EE with transferring data where it needs to be. I've
      even read that at times it can do more work than the EE itself.
      
      * Since the DMAC isn't used at this stage, we only really have to
      implement its registers and reads/writes to them, which is pretty easy.
      However, one register, D_STAT, is a bit quirky in the sense that writes to it
      clear/reverse its bits instead of overwriting them.
      
      * To emulate this, an additional struct is added to the register unions
      and bitwise operators are used to write to the upper and lower parts of
      the register appropriately. You can look into the source code for more details.
      
      * This allows the EE to start initializing the VU1 which is quite exciting!
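      
      The clear/reverse write behaviour described above can be sketched roughly
      like this. This is not the emulator's actual code: the DStat struct is made up
      for illustration, and I'm assuming the layout ps2tek gives for D_STAT (low
      16 bits are status flags cleared by writing 1, high 16 bits are mask bits
      reversed by writing 1).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the quirky DMAC register; names are illustrative.
struct DStat {
    uint32_t value = 0;

    // Low 16 bits (interrupt status): writing 1 clears the bit.
    // High 16 bits (interrupt mask): writing 1 reverses (toggles) the bit.
    void write(uint32_t data) {
        uint32_t status = value & 0xFFFFu;
        uint32_t mask = value >> 16;
        status &= ~(data & 0xFFFFu);
        mask ^= (data >> 16);
        value = (mask << 16) | (status & 0xFFFFu);
    }
};
```

      Splitting the value into two halves keeps the two write rules separate,
      which is the same idea as the extra struct in the register union.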
      690d9651
    • Geo Ster's avatar
      Fix small timer bug · 1f7dedbe
      Geo Ster authored
      1f7dedbe
    • Geo Ster's avatar
      Implement NOR/SRAV EE instructions · c250c5ed
      Geo Ster authored
      * Allows us to progress further into the initialization phase
      c250c5ed
    • Geo Ster's avatar
      Add initial implementation of EE timers · fc244e8a
      Geo Ster authored
      * Yeah, timers again, my favourite topic... To be frank the EE timers
      are a bit simpler than the IOP timers as they have less complexity
      in their configuration. However, since the BIOS starts to use them
      at this point, we can't get away with an extremely partial implementation
      like we did with the IOP.
      
      * The Emotion Engine has four hardware timers, each of them having
      three registers (four on Timer 0 and 1). They are practically the same
      as on the IOP in that regard, having a count, a compare/target and a mode
      register. Timer 0 and 1 have an additional register, Tn_HOLD, which
      keeps track of the count value when a peripheral on the
      SBUS generates an interrupt.
      
      * All the timers increment based on the bus clock which is exactly
      half of the EE clock. The timers can also be configured to count
      based on external sources, namely hblank and vblank. These are less
      accurate but can be used to keep track of when the screen refreshes.
      I had hoped that we could have ignored hblank for now, but the BIOS
      configures Timer 3 (used for BIOS alarms) to use it so implementing it
      is necessary. The timings were taken from the timer header [1]
      of the ps2sdk.
      
      * Another interesting fact is that the interrupts are edge triggered,
      which means that an interrupt is sent to the EE when the raised flags in
      Tn_MODE switch from 0 to 1 [2]. This is easy to implement, so I did it now
      to avoid any headaches in the future.
      
      * Since the EE ticks the timers directly, we can't increment the counters
      each time the function gets called. To properly emulate the timer frequency,
      an internal counter is used. When its value equals the ratio
      between the EE frequency and the timer clock, the real counter is incremented.
      
      * This can be expensive since the timer function gets called every EE cycle,
      so we will probably change it to cycle adding in the future, especially when
      the JIT is implemented.
      
      [1] https://github.com/ps2dev/ps2sdk/blob/master/ee/kernel/include/timer.h#L53
      [2] https://psi-rockin.github.io/ps2tek/#eetimers
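      
      To make the internal counter and the edge-triggered interrupt a bit more
      concrete, here is a minimal sketch. The Timer struct and its field names are
      hypothetical and much simpler than the real registers; it only models the
      bus clock case (ratio of 2) and the compare interrupt.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified timer; field names are illustrative only.
struct Timer {
    uint32_t count = 0;      // the 16-bit Tn_COUNT value
    uint32_t compare = 0;    // Tn_COMP
    uint32_t ratio = 2;      // EE cycles per timer tick (2 for the bus clock)
    uint32_t internal = 0;   // EE cycles accumulated since the last tick
    bool compare_flag = false;
    int interrupts = 0;      // how many interrupts were raised

    // Called once per EE cycle.
    void tick() {
        if (++internal < ratio) return;
        internal = 0;
        count = (count + 1) & 0xFFFFu;

        // Edge triggered: only a 0 -> 1 transition of the flag raises an IRQ.
        if (count == compare) {
            if (!compare_flag) ++interrupts;
            compare_flag = true;
        }
    }
};
```

      Because the flag stays set until software clears it, hitting the compare
      value a second time does not raise another interrupt in this sketch.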
      fc244e8a
  3. Dec 27, 2021
    • Geo Ster's avatar
      Handle the entire register map of the GS · c0e3bc03
      Geo Ster authored
      c0e3bc03
    • Geo Ster's avatar
      Implement GIF PATH3 packed transfer mode · 366d03f9
      Geo Ster authored
      * Firstly, I fixed a small bug in the Handler that caused data loss
      on 128bit operations.
      
      * The GIF is a marvellous and complicated little piece of hardware that
      handles transfers between the EE and the GS. It can be "fed" by three
      paths: PATH1 is from the VPU1 memory, PATH2 is from the VIF1 FIFO and PATH3
      is directly from the main bus. Since we don't have any VUs implemented
      we only care about PATH3 at this stage.
      
      * Each primitive sent has the form of a linked list. The EE first sends a
      128bit GIFTag that acts as the header and tells the GIF how much more
      data to expect and what to do with it. The loop ends when the EE sends a GIFTag
      with the EOP field set to 1. (EE User's Manual [150])
      
      * Each data packet after a GIFTag can be processed in three different
      ways depending on the FLG field of the tag; PACKED, REGLIST or IMAGE mode.
      For now we only care about PACKED.
      
      * When in PACKED mode, the EE will send NREG * NLOOP (specified in the GIFTag) qwords
      after the tag. Each qword can be processed in different ways depending on the descriptor
      in the REGS field of the GIFTag. Page 152 of the EE User's Manual shows the different modes.
      The REGS field is in reality a bit array of 4-bit descriptors. To understand
      this better, here are the processing steps:
      
      1. The first qword after the GIFTag is processed based on the first descriptor,
      the least significant bits of the field (64:67), and is output
      
      2. The second qword is processed based on the next descriptor (68:71) and is output
      
      3. This continues with the following descriptors until NREG qwords have been processed.
      
      4. Steps 1-3 are repeated NLOOP times
      
      There are more variables we have to take into account with PATH3, because it can also be masked
      by other PATHs which have higher priority. But that is for later. Don't worry though if you
      didn't get it completely. The GIF is nowhere near finished, so I will have more
      chances to explain how it works. For more info you can read the GIF chapter of the provided EE User's Manual.
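      
      The PACKED mode steps above can be sketched as follows. This is only an
      illustration, not the emulator's code: the GIFTag struct and
      descriptor_for_qword are names I made up, and the bit positions (NLOOP 0-14,
      EOP 15, FLG 58-59, NREG 60-63, descriptors from bit 64) are my reading of
      the GIFtag layout in the EE User's Manual.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical decoded view of a GIFtag, split into its two 64-bit halves.
struct GIFTag {
    uint64_t lo; // bits 0-63
    uint64_t hi; // bits 64-127, the array of 4-bit descriptors

    uint32_t nloop() const { return static_cast<uint32_t>(lo & 0x7FFF); }      // bits 0-14
    bool eop() const { return ((lo >> 15) & 1) != 0; }                         // bit 15
    uint32_t flg() const { return static_cast<uint32_t>((lo >> 58) & 0x3); }   // bits 58-59
    uint32_t nreg() const {                                                    // bits 60-63
        uint32_t n = static_cast<uint32_t>((lo >> 60) & 0xF);
        return n == 0 ? 16 : n; // a value of 0 means 16 descriptors
    }
    // 4-bit descriptor number `index` (0 = bits 64-67 of the tag).
    uint32_t descriptor(uint32_t index) const {
        return static_cast<uint32_t>((hi >> (index * 4)) & 0xF);
    }
    // Number of qwords following the tag in PACKED mode.
    uint32_t packed_qwords() const { return nreg() * nloop(); }
};

// Descriptor used for the n-th qword after the tag (PACKED mode).
inline uint32_t descriptor_for_qword(const GIFTag& tag, uint32_t n) {
    return tag.descriptor(n % tag.nreg());
}
```

      With NREG = 2 and NLOOP = 3 the descriptors are applied in the order
      first, second, first, second, first, second over the six qwords.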
      366d03f9
  4. Dec 26, 2021
    • Geo Ster's avatar
      Log accesses to INTC · 295daa15
      Geo Ster authored
      295daa15
    • Geo Ster's avatar
      Register handlers with addresses · e29049d4
      Geo Ster authored
      * Since the components will never give pages directly, let them
      use addresses instead and compute the page in the register function
      to save some work on the component side.
      e29049d4
    • Geo Ster's avatar
      Unify 128bit EE reads/writes · 3c4945af
      Geo Ster authored
      * Initially the LQ/SQ instructions were implemented to perform two
      sequential 64bit operations to emulate 128bit reads/writes. However
      this won't work well for us, especially when writing to the GIF FIFO.
      To mitigate this we can use the __int128 gcc extension (yay for switching
      to clang once again!), which provides us with an optimized way of storing
      128bit data.
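      
      As a rough idea of what a single 128bit access looks like with the
      extension, here is a small sketch. read_mem and write_mem are hypothetical
      helpers over a flat byte buffer, not the emulator's real functions, and
      unsigned __int128 only exists on GCC/Clang for 64-bit targets.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// unsigned __int128 is a GCC/Clang extension, not standard C++.
using uint128_t = unsigned __int128;

// Hypothetical flat-memory helpers; memcpy avoids alignment and aliasing issues.
template <typename T>
T read_mem(const uint8_t* mem, uint32_t offset) {
    T value;
    std::memcpy(&value, mem + offset, sizeof(T));
    return value;
}

template <typename T>
void write_mem(uint8_t* mem, uint32_t offset, T value) {
    std::memcpy(mem + offset, &value, sizeof(T));
}
```

      The point is that the whole qword arrives in one call, so a FIFO handler
      sees one 128bit write instead of two unrelated 64bit ones.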
      3c4945af
    • Geo Ster's avatar
      Respect type of handler · 70f31230
      Geo Ster authored
      * Until now the memory system didn't take into account the bit width
      of the data coming in and out of the handlers. Instead I assumed that
      64bits would be large enough for everything. But alas I was wrong.
      Some addresses (notably the GIF/IPU FIFOs) are read/written with 128bit values.
      I don't want to force every function to return __int128 types as that
      would cripple performance so some tweaks were needed.
      
      This isn't as hard as it might sound. The emulator read/write functions
      are templates, so we know which type we want beforehand. So it's as
      simple as abstracting the Handler with a bit more inheritance magic and
      we can cast HandlerBase to the type we want.
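      
      A minimal sketch of the idea, assuming a type-erased HandlerBase and a
      templated Handler<T>. The members and the handler_read function are made up
      for illustration and the real classes may look quite different.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch; the real Handler in the emulator may look different.
struct HandlerBase {
    virtual ~HandlerBase() = default;
};

template <typename T>
struct Handler : HandlerBase {
    T (*reader)(uint32_t addr) = nullptr;
};

// The templated read already knows T, so it can cast the type-erased
// base pointer back to the concrete handler type.
template <typename T>
T handler_read(HandlerBase* base, uint32_t addr) {
    auto* handler = static_cast<Handler<T>*>(base);
    return handler->reader(addr);
}
```

      The static_cast is only valid if the handler was registered with the same
      T that the read uses, so a debug check on the registered width would be a
      sensible addition.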
      70f31230
    • Geo Ster's avatar
      Initial GIF implementation · a81117ea
      Geo Ster authored
      Currently there's nothing really of note, it's just an empty
      class that handles reads/writes to some of the registers. The
      functionality will be explained in subsequent commits. Along with
      this I've added a new document, on which the GIF implementation will
      be based.
      a81117ea
    • Geo Ster's avatar
      Stop leaking memory · 622034ef
      Geo Ster authored
      * The handler table is dynamically allocated but the memory never gets
      deallocated. Plug the leak by freeing the memory in the destructor ;)
      622034ef
    • Geo Ster's avatar
      Prevent write to some addresses · 52b34632
      Geo Ster authored
      They cause too much logspam for now and we don't use them
      52b34632
    • Geo Ster's avatar
      Cleanup IOP timer processing · abc036d3
      Geo Ster authored
      * Remove cycle argument, we don't need it as we tick the timers each
      IOP cycle
      
      * Make the code a bit cleaner
      abc036d3
  5. Dec 25, 2021
    • Geo Ster's avatar
      Refactor and fix IOP interrupts · a6679fdc
      Geo Ster authored
      This commit fixes some issues preventing IOP interrupts from working
      correctly while also separating them into a separate class for convenience.
      
      * Previously the pending flag was written to the first bit of cause.IP, which
      while correct was flawed. To understand why, let's look at how interrupts
      get triggered. COP0 has two 8-bit masks, IP (cause) and Im (status). On both
      of these registers the first 2 bits are ignored because they are used for
      software interrupts which are unsupported on the IOP. However, while Im was
      including these unused bits, IP did not, thus causing mistaken comparisons.
      Below is a diagram that shows the issue. IP was bits 10-15 while Im was bits
      8-15. Comparing different ranges like this doesn't work.
      
      Cause: ... 00|111111| ...
      Status: ... |00111111| ...
      
      The fix was to make IP point to 8-15 range and adjust the writing
      mechanism in the INTR::interrupt_pending function.
      
      * In addition, the usage of >= instead of == in the timers caused
      a bug where the timer would continuously send interrupts after reaching
      the target, which is not the intended behaviour. Fix that as well.
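      
      With both fields covering bits 8-15, the comparison boils down to
      something like the sketch below. interrupt_pending here is a free function
      written for illustration (the real one lives in the INTR class), and I'm
      assuming the usual R3000A layout where bit 0 of Status is the current
      interrupt enable.

```cpp
#include <cassert>
#include <cstdint>

// Both fields occupy bits 8-15 of their COP0 register, so they line up.
inline bool interrupt_pending(uint32_t status, uint32_t cause) {
    uint32_t im = (status >> 8) & 0xFF; // Status.Im
    uint32_t ip = (cause >> 8) & 0xFF;  // Cause.IP
    bool enabled = (status & 1) != 0;   // Status.IEc, the current interrupt enable
    return enabled && (im & ip) != 0;
}
```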
      a6679fdc
    • Geo Ster's avatar
      Memory subsystem rewrite + EE IRQs! · ec313120
      Geo Ster authored
      This is a pretty big commit so the description is probably going
      to be a whole essay again explaining all the changes. Emulation is extremely
      complicated and thus I need to explain all of my reasoning and sources.
      This commit contains 3 major changes that all work together to form the new memory subsystem:
      
      * New handler infrastructure
      * Compiler switch to clang-cl
      * Initial implementation of EE interrupts
      
      Now you, reader, might wonder why I decided to redo the relatively simple
      and straightforward system we had before. Well, that system had some
      drawbacks that I think needed to be addressed early on. Firstly, it is
      highly centralized, which means that for every new component the read/write
      functions of the ComponentManger (now Emulator) need to be updated. This isn't
      as big of an issue as the second one though. The old system relies heavily
      on branches to figure out the destination of a read/write, which is bad for
      performance. Especially because our address ranges aren't contiguous, the
      compiler can't optimize the switch statement in any way. This leads to a lot
      of assembly code and many jumps.
      
      The initial idea for this new system was taken from a PCSX2 devblog I read
      recently: https://pcsx2.net/developer-blog/218-so-maybe-it-s-about-time-we-explained-vtlb.html
      It explains a system, where the address range is divided into pages, where each
      page is handled by a handler function. This is perfect for us, because it moves
      most of the code to the initialization phase (when the components register
      their handlers), while reads/writes are very fast, only having to look up
      the handler table and call the appropriate function.
      
      However, it isn't that easy to implement. The main problem
      was how to store member functions of different classes in a single array
      and call them without knowing their type. At first I thought of using
      std::function, which is perfect for this due to its type erasure, but it
      was quickly ruled out because of the very high overhead. Next, I considered inheritance
      and virtual functions, which was a step in the right direction. However, that
      also has the overhead of looking up the vtable. Finally, though, I discovered
      a neat little trick with function pointers. You can actually cast a pointer to
      a derived class member function to a pointer to a base class member function, as long as the
      function isn't ambiguous. So the final solution was to make all the components
      inherit from an empty (for now) Component class and store a common Component function pointer.
      The compiler will handle the rest, with some dose of magic and inheritance!
      The handler interface is located in the common/component.h file.
      You can check out the IOP DMA controller constructor for how a component can register
      handlers with this system.
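      
      The member function pointer trick looks roughly like this. Everything
      except the empty Component base is hypothetical (HandlerEntry, make_handler,
      call and the Timers example are names invented for this sketch); the actual
      interface is the one in common/component.h.

```cpp
#include <cassert>
#include <cstdint>

struct Component {}; // empty common base

// One pointer type that can hold a read function of any component.
using Reader = uint32_t (Component::*)(uint32_t);

struct HandlerEntry {
    Component* object;
    Reader reader;
};

// Hypothetical component used only for this example.
struct Timers : Component {
    uint32_t base = 100;
    uint32_t read(uint32_t addr) { return base + addr; }
};

template <typename C>
HandlerEntry make_handler(C* object, uint32_t (C::*reader)(uint32_t)) {
    // Converting a derived member pointer to a base one needs an explicit static_cast.
    return HandlerEntry{object, static_cast<Reader>(reader)};
}

inline uint32_t call(const HandlerEntry& entry, uint32_t addr) {
    return (entry.object->*entry.reader)(addr);
}
```

      Calling through the stored pointer is only well defined when the object
      really is of the derived type the function came from, which registration
      guarantees here because both are captured together.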
      
      This is very efficient, generating only 10-15 lines of assembly (with clang 12.0), which
      leads me to the second change, that of the compiler. The switch to clang-cl was made primarily
      for performance reasons. clang generates a lot more efficient code than MSVC does, so the switch
      will improve performance down the road. It also catches more warnings and code issues, allowing for
      cleaner code overall.
      
      The next hurdle was figuring out the handler page size. This is more difficult than it seems, because there
      are additional "hidden" addresses the BIOS writes to, which aren't listed in the ps2tek
      memory map. Making the page size too big will lead to these garbage addresses being handled
      by our components, which defeats the purpose of this whole system. Making the page size too
      small, though, will both make the handler table massive and require components to register
      many handlers to cover their address ranges. So after studying the memory map for a while, I
      decided that 0x80 = 128 is the best size. For example, in the DMAC (EE DMA) each channel takes up
      exactly 0x80, while in the IOP DMA each channel group is also exactly 0x80 in size.
      0x80 is, in addition, small enough that garbage addresses don't get caught.
      Even in the case we have something like that, I have placed asserts on debug builds to capture them.
      
      Our struggle isn't done though! The initial handler table ended up causing stack
      overflows because the array was too large. To mitigate this, the stack size was increased
      to 10MB and a small optimization was implemented. If you view all the addresses in the memory map of
      the PS2, a pattern emerges. It turns out that one hex digit inside the address is always zero, no matter the address
      (except for 0xfffe addresses which we don't care about). This means we can "squash"
      the address by removing that digit, allowing us to significantly reduce the handler table size:
      
      0x100|0|3070 -> 0x1003070
      0x120|0|0060 -> 0x1200060
      0x1F4|0|2006 -> 0x1F42006
      0x1F8|0|1120 -> 0x1F81120
      0x1F9|0|01AC -> 0x1F901AC
      
      This is implemented in the Emulator::calculate_page function.
      A debug assert is also placed here to ensure nothing out of the ordinary happens.
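      
      A minimal sketch of the squashing, matching the examples above. The
      squash helper and the kPageSize constant are names I picked for
      illustration; only calculate_page is a name from the emulator, and its real
      body may differ.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kPageSize = 0x80;

// Drop the hex digit in bits 16-19, which is assumed to always be zero.
inline uint32_t squash(uint32_t addr) {
    assert(((addr >> 16) & 0xF) == 0 && "unexpected address pattern");
    return ((addr >> 20) << 16) | (addr & 0xFFFF);
}

// Index into the handler table for a given (unsquashed) address.
inline uint32_t calculate_page(uint32_t addr) {
    return squash(addr) / kPageSize;
}
```

      For example, 0x10003070 squashes to 0x1003070, which lands on page
      0x20060.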
      
      Finally, I also implemented EE interrupts because they are needed at this stage. Timer 5 should normally
      be ticking now (next commit, I promise) and is waiting to cause an interrupt, thus we need to have those implemented.
      The implementation is taken from a new document I found, which is the same as the previous one but more focused on
      the EE and its features, something that should help us a lot in the near future. Right now it's not finished, but
      that will come in the next commit.
      ec313120
  6. Dec 12, 2021
    • Geo Ster's avatar
      Fix mistake in lq instruction · bcbb54ab
      Geo Ster authored
      Wtf, why did I miss this?
      bcbb54ab
    • Geo Ster's avatar
      Add support for IOP timers and interrupts · af6560f5
      Geo Ster authored
      * After a while the IOP starts setting up timer 5 so we need to start
      implementing timer support. Timers are pretty simple actually. Each one
      has 3 32bit registers (we use 64bit registers to check for overflow), a
      count register that counts the number of cycles, the mode register which
      configures the timer and the target register which generates an interrupt
      when count == target. For now that's all we need.
      
      * In addition, interrupts are also implemented. These are a bit more
      complicated since they involve COP0, but not too difficult either. The IOP
      has 3 registers, I_MASK, I_STAT and I_CTRL. I_CTRL acts as a global
      enable/disable so it's pretty simple. I_STAT is a bit mask that states
      which interrupts are pending. I_MASK on the other hand has the ability
      to enable/disable specific interrupts. So to check if the interrupt will be
      executed we must do !I_CTRL && (I_MASK & I_STAT). All info can be found
      on ps2tek/nocash psx
      
      * On the EE side, a few new instructions are added to progress further.
      Now the EE starts setting up the GIF, which is quite exciting!
      
      * I think it's a good time to also elaborate on how we read registers. Instead
      of using switch statements I prefer pointers and structs because these
      generate a lot more compact code, even with compiler optimizations, and eliminate
      the need for branches. This should not be of concern in normal applications but
      we are special ;). Most registers on the console are 32bit, so structs
      are cast to uint32_t* to access them. And since the offsets are always in bytes
      we must divide them by sizeof(uint32_t) = 4 (I prefer >> 2 since it's more efficient)
      
      * Some registers however are peculiar in the sense that parts of them
      are located in completely different address ranges. This is bad because,
      for example, timer 0 and timer 3 have the same offset of 0 from their relative
      address ranges. To fix this, we introduce a variable called group that records
      which "group" the write/read is referring to, with some simple bit masks.
      The result is cast to bool, which converts it to 0 or 1. Then
      the expression "offset + group * <number>" is used to access registers.
      In the timer example, accessing timer 0 will give group 0 and offset 0
      so timer 0 will be accessed. With timer 3 though, group will be 1 so
      0 + 1 * 3 = 3 will be accessed. This is a convenient way to bypass branches.
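      
      Here is a small sketch of the offset + group trick for the timers. The
      Timers struct and the masks are illustrative, and I'm assuming the ps2tek
      memory map where timers 0-2 start at 0x1F801100 and timers 3-5 at
      0x1F801480, each timer taking 0x10 bytes. It uses a plain array instead of
      casting structs so the example stays self-contained.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: 6 timers, each with 4 32-bit slots (count, mode, target, unused).
struct Timers {
    uint32_t regs[6][4] = {};

    uint32_t& reg(uint32_t addr) {
        bool group = (addr & 0x400) != 0;     // 0 for 0x1F8011xx, 1 for 0x1F8014xx
        uint32_t offset = (addr & 0x30) >> 4; // timer index inside its group
        uint32_t timer = offset + group * 3;  // no branch needed
        uint32_t index = (addr & 0xF) >> 2;   // byte offset -> 32-bit register index
        return regs[timer][index];
    }
};
```

      Accessing 0x1F801480 gives offset 0 and group 1, so timer 3 is selected,
      exactly like the 0 + 1 * 3 = 3 example.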
      af6560f5
  7. Dec 10, 2021
    • Geo Ster's avatar
      Groundwork for IOP DMA implementation · 1cce9b6f
      Geo Ster authored
      * The DMA routine on the IOP works similarly to the PSX version with a
      few additions. There are 7 channels from the PSX and an additional 6 new
      PS2 exclusive ones. On the PSX, each channel has 3 registers used
      to configure and use it, and there are 3 global registers.
      
      * The PS2 contains all the older DMA registers, but it adds 6 more channels
      and duplicates the global registers (DPCR now has a counterpart called DPCR2).
      This is done because each global register can control up to 7 channels.
      An additional register on each channel (tadr) and 2 additional
      global registers have been added as well. For now we don't really
      care to implement them, only read/write to them.
      
      * For reading and writing to the registers structs are used to prevent
      the usage of switch and if statements.
      1cce9b6f
  8. Dec 04, 2021
    • Geo Ster's avatar
      Fix load instruction logging in IOP · 42172fa2
      Geo Ster authored
      * Due to load delay slots the target register doesn't get
      written immediately, so use the value instead to correctly
      display the loaded value in the logs
      42172fa2
  9. Dec 02, 2021
  10. Dec 01, 2021
    • Geo Ster's avatar
      Minor branch optimization · 1a3d77c9
      Geo Ster authored
      * On reads/writes it is important to check the address alignment before
      proceeding with the operation. However, alignment errors almost
      never happen in real world games, so let the compiler know that these
      branches are unlikely to be taken to speed things up a bit.
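      
      One way to give the compiler this hint is sketched below. The UNLIKELY
      macro and check_alignment are names made up for this example, and
      __builtin_expect is a GCC/Clang builtin (C++20 also offers [[unlikely]]);
      I'm not claiming this is the exact form used in the emulator.

```cpp
#include <cassert>
#include <cstdint>

// __builtin_expect is a GCC/Clang builtin; the second argument is the expected value.
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

// Hypothetical helper: returns true when an access of type T at addr is aligned.
template <typename T>
bool check_alignment(uint32_t addr) {
    if (UNLIKELY((addr & (sizeof(T) - 1)) != 0)) {
        // A real CPU core would raise an address error exception here.
        return false;
    }
    return true;
}
```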
      1a3d77c9
  11. Nov 30, 2021
    • Geo Ster's avatar
      Introducing the IOP · 1d16fdad
      Geo Ster authored
      * So after a week, it's finally here! The initial implementation of the
      IOP has been added to the emulator. You might wonder why it took so
      long. This was mostly because I wanted to make the implementation as complete
      as possible and also test it to ensure it's bug free. It is actually
      based on the MIPS R3000A interpreter I wrote last year for my PS1 emulator.
      So did I just copy the code and call it a day? Hell no, the code in that
      ancient project is awful, even if it works. So I completely rewrote the
      interpreter using our modern techniques of storing state. Rewriting the old
      code also allowed me to test if it actually worked in that environment
      and could boot PSX games.
      
      * Due to this, the implementation is a bit more complete than the EE
      as it includes interrupt support. In addition we have to account for
      the fact that the IOP runs at 36.864MHz, in contrast to the EE which
      clocks at 295MHz. This maps approximately to a 1/8 ratio, which means
      that 1 IOP instruction will run every 8 EE cycles. The current implementation
      of this is hacky and a bit inaccurate because some EE instructions
      can take more than 1 cycle to execute, but it's good enough for now
      (Play! assumes this as well and can boot 40%+ of games).
      
      * Because both CPU emulators share a lot of naming conventions,
      each processor has been separated into its own namespace to avoid confusion,
      so we can always know which CPU we are referring to. Finally, for now
      reads/writes, except for the BIOS and IOP RAM, haven't been implemented
      but will come soon.
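      
      The 1/8 ratio amounts to a loop shaped like the sketch below. The System
      struct only counts ticks instead of running real interpreters, so
      everything in it is a placeholder for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical counters standing in for the two interpreters.
struct System {
    uint64_t ee_cycles = 0;
    uint64_t iop_instructions = 0;

    void run(uint64_t cycles) {
        for (uint64_t i = 0; i < cycles; ++i) {
            ++ee_cycles;                // the EE would execute one instruction here
            if ((ee_cycles & 7) == 0) { // every 8th EE cycle
                ++iop_instructions;     // the IOP would execute one instruction here
            }
        }
    }
};
```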
      1d16fdad
  12. Nov 29, 2021
    • Geo Ster's avatar
      Simplify branch delay detection logic · 6355a7f8
      Geo Ster authored
      * Instead of having a global is_branch_delay that we must manage,
      we can just set the attribute of the next instruction since we have
      it ahead of time.
      
      * In addition, fix an oopsie in the alignment detection logic in op_lw
      that caused an unwarranted exception and an infinite loop
      6355a7f8
  13. Nov 17, 2021
    • Geo Ster's avatar
      Add exception support · 9145d821
      Geo Ster authored
      * This commit adds support for handling exceptions for our virtual
      MIPS CPU. The implementation is based on the provided document's chapter
      on Exception handling, and more specifically on the flowchart of level 1
      exceptions (Section 5.1.1). To check if the current instruction is in a delay
      slot we use a bool and cache it together with the currently executing instruction.
      
      * The current implementation is only partially complete, since it's missing level 2
      exception handling, but I reckon that those exceptions won't be needed
      for a long time. This acts more as a foundation for implementing interrupts
      which are 100% required for emulating even the most basic of systems.
      9145d821
    • Geo Ster's avatar
      Ignore unrecognized reads/writes · 327c71d0
      Geo Ster authored
      * Currently the BIOS only writes to the scratchpad and a few mysterious
      addresses that don't seem to do anything. However, it is important to know
      when it tries to write to the DMAC, for example, so we can implement it. So
      instead of writing everything uncontrollably into a single large buffer,
      let's make an if-else with all the known addresses and how to handle them.
      When the BIOS tries to write somewhere new we will be notified immediately.
      
      * Also rework the disassembly logger to use C FILE* since these are faster
      than std::ofstream. Normally I wouldn't care about this, but in our use case,
      which is very performance sensitive, it makes a noticeable difference.
      327c71d0
  14. Nov 16, 2021
    • Geo Ster's avatar
      Progress further - solve infinite loop · 40e730b7
      Geo Ster authored
      * Add a few more instructions required to progress further into the BIOS
      execution
      
      * After that the BIOS writes something to 0xb000f410 and then enters
      a loop at 0x9fc417b0 that only exits when that address contains a value of 0.
      So unless that address magically changes value on its own, it's either an interrupt
      (highly unlikely since the INT regs aren't touched) or the IOP doing
      this (also impossible since the IOP and EE only communicate with DMA and
      have separate memory regions). So my guess is that the BIOS expects that address
      to always be zero no matter the value written to it. Until I am proven wrong
      let's stick to that to exit that infinite loop and continue...
      40e730b7
    • Geo Ster's avatar
      Move ComponentManger to a separate thread · d8a4a251
      Geo Ster authored
      * Until now the interpreter performance was extremely slow
      due to the handling of window events from glfw blocking the execution
      of new instructions. Moving the execution to a separate thread results
      in massive performance improvements. However, the current implementation
      is only temporary and will be modularized in the future.
      d8a4a251
    • Geo Ster's avatar
      Implement reads/writes MCH_RICM/MCH_DRD · 45db21fd
      Geo Ster authored
      * The BIOS tries to write to 0x1000f430/0x1000f440 which contain the
      registers MCH_RICM/MCH_DRD. Sadly these registers are largely undocumented.
      So the writing logic has been taken from PCSX2:
      https://github.com/PCSX2/pcsx2/blob/master/pcsx2/HwWrite.cpp#L237
      Forgive me, for I have sinned.
      45db21fd
    • Geo Ster's avatar
      Fix bugged scratchpad reads/writes · 688bc87f
      Geo Ster authored
      * Looking at the console output introduced in the previous commit, it
      was obvious that it was very wrong. Booting the same BIOS file with PCSX2
      reports: Initialize memory (rev:3.17, ctm:393Mhz, cpuclk:295Mhz )
      
      * It turns out that the problem wasn't with the CPU, but the string printing
      function. The BIOS writes the numbers it calculates to the scratchpad and
      the print function reads them from there. But there were 2 issues with the
      current implementation:
      
      1. addr & 0x3FFC ignores the last 2 bits, which breaks 8bit
      writes/reads.
      
      2. Always casting to uint32_t* instead of the provided type T* results in
      byte writes overwriting whole 32bit sections of the scratchpad, destroying
      critical data.
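      
      With both issues fixed, a scratchpad access looks roughly like the
      following sketch. The Scratchpad struct is hypothetical, I'm assuming the
      16KB size that the 0x3FFC mask implies, and memcpy is used here in place of
      the T* cast to sidestep alignment concerns.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical 16KB scratchpad; the mask keeps all low address bits.
struct Scratchpad {
    uint8_t data[0x4000] = {};

    template <typename T>
    T read(uint32_t addr) const {
        T value;
        std::memcpy(&value, &data[addr & 0x3FFF], sizeof(T));
        return value;
    }

    // Only sizeof(T) bytes are touched, so a byte write stays a byte write.
    template <typename T>
    void write(uint32_t addr, T value) {
        std::memcpy(&data[addr & 0x3FFF], &value, sizeof(T));
    }
};
```

      The sketch does not guard against a wide access that starts in the last
      few bytes of the buffer; the alignment checks in the CPU are assumed to
      rule that out.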
      688bc87f
    • Geo Ster's avatar
      Wrap log function into a macro and implement BIOS console · c4c3a1ff
      Geo Ster authored
      * Since log output is getting very large, it's common to have to wait
      5+ minutes before any unknown instruction is encountered. Printing to the
      console actually takes a lot of time and slows down interpretation
      significantly. Right now we don't care, since we just want to boot the BIOS,
      but let's have an option to disable all the log messages if we want.
      
      * In addition record all writes to 0xb000f180 which is the BIOS console
      output address, so we can have some output, which will be very useful
      c4c3a1ff
  15. Nov 15, 2021
    • Geo Ster's avatar
      Add more shift (SRA/SLL) instructions · 49cabc82
      Geo Ster authored
      * Now the BIOS enters another infinite loop. However that seems to be
      normal, as it's waiting for COP0_REG[9], the timer which we haven't
      implemented yet. I think it's also time to add support for interrupts as
      well since these go hand in hand with timers
      49cabc82
    • Geo Ster's avatar
      Treat registers as signed in branch instructions with comparisons · eee0af83
      Geo Ster authored
      * The documentation doesn't state this, but it's necessary. In the loop
      at 0xbfc43140 the register s3 is loaded with 0x27 and is used as the counter
      in a for loop. However, because its value wasn't treated as signed, it looked
      more like 0xffffffffffffff27, which is very large, making the loop run forever.
      Fix this by treating registers as signed where needed
      eee0af83
  16. Nov 14, 2021
    • Geo Ster's avatar
      Fix bug in the BEQ/BEQL/BNEL instructions · 9ec5b188
      Geo Ster authored
      * Seems like branches really do love having bugs in them ;)
      The bug was noticed when the BEQ instruction was provided 0xffd1 as the offset.
      Decompiling with ghidra revealed that the offset was -0xbc or -188 as signed
      but with this bug the value would be 261956 which completely broke
      the program. Fix this by first casting to int16_t to let the
      compiler know that we are giving it a 16bit signed int and then convert
      it to int32_t
      
      * In addition, make stores/loads bold so I can notice them better, as
      log output is starting to increase exponentially
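      
      The sign extension fix for the branch offset can be sketched like this.
      branch_offset is a hypothetical helper, not the emulator's function; it
      multiplies by 4 instead of shifting to avoid shifting a negative value.

```cpp
#include <cassert>
#include <cstdint>

// Extract the 16-bit immediate of a branch and turn it into a byte offset.
inline int32_t branch_offset(uint32_t instruction) {
    int16_t imm = static_cast<int16_t>(instruction & 0xFFFF); // reinterpret as signed
    return static_cast<int32_t>(imm) * 4;                     // sign-extend, then scale to bytes
}
```

      For an immediate of 0xffd1 this gives -188, whereas treating it as
      unsigned would give 0xffd1 * 4 = 261956.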
      9ec5b188
    • Geo Ster's avatar
      Minor fixes · eeaa0b20
      Geo Ster authored
      * Add more instructions, now the BIOS starts executing for longer
      without interruption yay!
      * Move some logging messages before the instruction, to avoid printing
      wrong values in case a register is modified with itself.
      eeaa0b20
    • Geo Ster's avatar
      Implement first FPU instruction · 9444ba92
      Geo Ster authored
      * Since we have encountered our first FPU register, add the 32 floating
      point registers to the CPU.
      
      * In addition solve a small bug in the JAL instruction related to the
      return link address. See previous commit for details
      9444ba92
    • Geo Ster's avatar
      Fix bug in jump instructions · 2c622b24
      Geo Ster authored
      * As stated in earlier commits, we prefetch the next instruction before
      the current one gets executed to guarantee that we have it available
      in case a branch instruction changes the PC. So a typical fetch cycle
      for a branch instruction would look like:
      
      /* Cycle 1. */
      instr = <something>
      next_instr = read(PC) -> jump
      PC += 4 (now it points to the branch delay)
      
      /* Cycle 2. */
      instr = jump
      next_instr = read(PC) -> branch delay
      PC += 4 (now it points to the instruction AFTER the delay slot)
      
      <execute branch>
      
      So if a branch uses offsets instead of hard coding the PC, it will point
      to the wrong address since it expects to have the PC pointing to the branch
      delay instruction. To fix this, subtract 4 from the PC.
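      
      In code, the correction is tiny. This is a sketch with a made up
      branch_target helper, where pc is the value at execute time, i.e. already
      advanced past the delay slot by the prefetch shown in the cycles above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: `pc` points to the instruction AFTER the delay slot.
inline uint32_t branch_target(uint32_t pc, int32_t offset) {
    uint32_t delay_slot = pc - 4; // address of the branch delay instruction
    return delay_slot + static_cast<uint32_t>(offset);
}
```

      For a branch at 0x1000 the PC is 0x1008 when the branch executes, so an
      offset of 8 lands on 0x100C rather than 0x1010.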
      2c622b24