Views: 1,070,802 Homepage | Main | Rules/FAQ | Memberlist | Active users | Last posts | Calendar | Stats | Online users | Search 10-26-20 08:57 PM

0 users reading GBAtek addendum/errata | 1 bot

Main - Development - GBAtek addendum/errata New reply

Posted on 04-11-17 04:16 PM (rev. 27 of 07-23-19 09:34 PM) Link | #87
GBAtek is an amazing piece of documentation, but it can be improved upon :)

This is a general pile of findings. I claim no ownership on those, they come from several individuals.

ROM transfers

* transfer time is 8 cycles for a command, 4 cycles per response word (basically 1 cycle per byte) (see ROMCTRL bit27 for cycle duration)
* plus start delay and 0x200-block delay at the start of each 0x200 block
* bit28 allows skipping incoming bytes automatically during delays
* DRQ bit (bit23) is set once a response word has been transferred
* reading from 0x04100010 clears the DRQ bit, and:
** if the transfer is finished: clears the busy bit and triggers IRQ if specified
** if there are more words to transfer: begins transferring the next word


* The main memory display FIFO is a simple circular buffer that holds 16 pixels. The video controller reads from it regardless of whether you fill it. It doesn't get 'empty' or 'full'.
* Writing to the upper halfword of 0x04000068 increments the FIFO write pointer by two (writes to the lower halfword leave it unchanged). The write pointer simply wraps to 0 when reaching the end of the buffer. It is also reset upon VBlank.
* 8-bit writes to 0x04000068 don't work well. TODO: figure out what's happening. eventually.

* Colors are converted early from 5-bit to 6-bit, as such: 6bit = 5bit*2
* Color special effects (brightness, blending) are applied to the 6-bit color components
* In some cases, the MSb of color values is used as LSb for the green component. TODO: find out where this applies. Confirmed to apply to the standard BG palette.

* Bitmap sprite blending follows the same rules as non-bitmap semitransparent sprites, with EVA=alpha+1 and EVB=16-EVA. Except: bitmap sprites with alpha=0 are always hidden.

* 3D layer blending follows rules similar to those of semitransparent sprites (only requires second target bits set in BLDCNT, overrides BLDCNT color effect selection and window 'enable color effect' setting where it applies).
* 3D layer blending uses 5-bit alpha values (from the 3D graphics), such as: EVA=alpha+1 and EVB=32-EVA.
* When the 3D alpha is less than 16, the final color components are incremented by one. (seems to be some hardware glitch??)
* 3D layer pixels with alpha=0 are always hidden (not rendered). They're preserved when capturing the 3D output alone, though.

* BG mode 6 works on both GPUs. On the sub GPU, it only gets 128K of VRAM, so it will repeat the same bitmap 4 times.
* BG mode 7 renders (text-mode) BG0, BG1, and sprites. No BG2/BG3.

* large BG sizes 2-3 are the same as corresponding sizes for regular bitmap BGs (512x256, 512x512)


* Shadow polygons can use textures. In that case, decal blending is applied.
* The stencil buffer can hold two scanlines. It's cleared only when the current scanline contains shadow mask polygons, before rendering a group of shadow mask polygons.
* Stencil buffer bits are set only where the shadow mask is drawn but fails the depth test.
* Visible shadow polygons (polyID>0) are only drawn where stencil buffer bits are set and where the destination pixel's polygonID is different from that of the shadow, regardless of whether that pixel was translucent.

* Drawing visible shadow polygons supposedly resets the stencil bits. TODO: check. I guess not.

* Toon highlight mode uses the following formula: (GBAtek is wrong)
v=vertex t=texture s=tooncolor=toontable[Rv]
R = ((Rt+1)*(Rv+1)-1)/64+Rs ;truncated to MAX=63
G = ((Gt+1)*(Rv+1)-1)/64+Gs ;truncated to MAX=63
B = ((Bt+1)*(Rv+1)-1)/64+Bs ;truncated to MAX=63
A = ((At+1)*(Av+1)-1)/64

* Translucent pixels are only drawn where the destination pixel has a different polygonID OR where the destination pixel was opaque.

* for each separate polygon, W values are 'normalized', they're collectively shifted left or right by 4 until they all fit within 16 bits (if they fit within 12 bits or less, they can be shifted left to use the 16-bit range better)

* conversion for Z values:
** Z-buffering: zbuf = (((Z * 0x4000) / W) + 0x3FFF) * 0x200 (using original W)
** W-buffering: zbuf = W (but it appears to use normalized W)

* conversion for clear depth:
** clearZ = (val * 0x200) + 0x1FF

* There are special depth-test rules for polygon borders. TODO: work it out.
** it seems to only apply to wireframe polygons
** when Z values are equal, left edges have priority over right edges, and top edges have priority over bottom edges (TODO: check wireframe vs normal)
** 'less or equal' depth test has no margin
** (dunno about other orders but they should use the regular rules. Y-sorting gets in the way)

* Cases where 'less than' depth test becomes 'less or equal':
** wireframe polygon borders as mentioned above
** apparently, polygon borders in some other cases too
** when rendering frontfacing polygon pixels over existing opaque backfacing polygon pixels

* in W-buffering mode, 'equal' depth test mode has a margin of 0xFF in either direction. That is, for example, incoming Z range of 0x100-0x2FE is considered equal to an existing Z-buffer value of 0x1FF.
* in Z-buffering mode, margins are +-0x200.

* PUSH/POP/STORE/RESTORE to the modelview matrix always apply to the vector matrix too, even in matrix mode 1.
* "NORMAL/VEC_TEST require matrix mode 2" <- wrong. They work the same regardless of the matrix mode.

* edge marking
Posted by GBAtek
Technically, when rendering a polygon, it's edges (ie. the wire-frame region) are flagged as possible-edges (but it's still rendered normally, without using the edge-color). Once when all opaque polygons (*) have been rendered, the edge color is applied to these flagged pixels, under following conditions: At least one of the four surrounding pixels (up, down, left, right) must have different polygon_id than the edge, and, the edge depth must be LESS than the depth of that surrounding pixel (ie. no edges are rendered if the depth is GREATER or EQUAL, even if the polygon_id differs). At the screen borders, edges seem to be rendered in respect to the rear-plane's polygon_id entry (see Port 4000350h).

-> polygon ID rule for screen edges confirmed
-> at screen edges, the aforementioned depth test uses CLEAR_DEPTH (when testing against a pixel that would be offscreen), even when using a clear bitmap

* antialiasing
** seems to be calculated from edge slopes
** topmost two pixels are retained, antialiasing blends them together including alpha (except color isn't blended when the pixel below has alpha=0)
** during rendering, if an incoming pixel fails the depth test with the topmost pixel, it is checked against the pixel below


* ARM9 DMA start mode 3 is similar to the GBA's 'video capture' DMA, although GBAtek doesn't make it obvious. It is triggered at the start of each scanline from 2 to 193. The enable bit is automatically cleared on scanline 194.
* TODO: find when 'main memory display' DMA starts. Probably 8 pixels (48 cycles) in advance from the actual display. -- DMA starts ~32 cycles after the start of the scanline. Actual display starts ~48 cycles after the start of the scanline.


* repeat mode 3 behaves same as mode 1 (loops)
* TODO: check to see what can be changed while a channel is playing. Format can be changed and that's a whole fucking pile of things to check.


Posted on 07-12-18 08:43 AM Link | #625
I'd actually like to see an open-source documentation (perhaps a melonDS wiki) on this. As you say GBATEK is great but yeah.

Also you did a nice job explaining things, you've always been clear ^^

Posted on 12-01-19 12:59 PM (rev. 3 of 09-15-20 06:36 PM) Link | #1406
Some that need to be confirmed:

### DSP ###

* DSP_PSTS bits 10..12 (REP0..REP2) are active-high (as in, 1=was written by DSP), while GBAtek says they're active-low

* DSP_PCFG bits 12..15 have an undocumented transer mode (7: ARM9 bus loopback): transfers to/from the ARM9 bus, cf. DSP-internal DMA transfer mode 7. This mode requres some additional setup: you first need to set the following DSP-internal DMA registers to the following values (using transfer mode 1):

[0x81BE] = 0 // select channel 0
[0x81C6] = 0xABCD // destination address, high 16-bit
[0x80E2] = 0 | (0<<1) | (1<<4) // configure AHBM (DSP->ARM9 DMA) // example value works for 16-bit transfers (see GBATek/Teakra docs for details)
[0x80E4] = (1<<9) | 1<<8) // resp. mandatory bit, direction (0=read, 1=write) (see GBATek/Teakra docs for details)
[0x80E6] = 1 // enable channel 0

Then perform a transfer as follows (the example writes 0x1337 to 0xABCDEF98):

DSP_PADR = 0xEF98 // destination address, low 16-bit
DSP_PDATA = 0x1337 // for a write, read from this address for a read

Posted on 09-15-20 06:38 PM Link | #2336
Rumble pak detection needs bit 6 set in addition to bit 1 reset (GBAtek only mentions bit 1).
Found empirically:

Main - Development - GBAtek addendum/errata New reply

Page rendered in 0.021 seconds. (2048KB of memory used)
MySQL - queries: 29, rows: 84/84, time: 0.017 seconds.
[powered by Acmlm] Acmlmboard 2.064 (2018-07-20)
© 2005-2008 Acmlm, Xkeeper, blackhole89 et al.