

Main - Development - GPU divider RE

Posted on 04-02-19 09:41 AM (rev. 3 of 04-04-19 07:09 AM) Link | #934
which is where we try to figure out how the GPU does divisions, because it's weird

I don't think they embedded a divider for each purpose tho?


GPU has a general-purpose unsigned 32-bit divider

some special measures are taken before using it, to ensure that the numerator and denominator a) are positive and b) fit within 32 bits

viewport transform

barring overflow cases (when W is greater than 0xFFFFFF, it gets truncated to 24 bits):

sX = ((X + W) * sW) / (W*2)

when W is greater than 0xFFFF, (W*2) loses two bits of precision (effectively taking one bit from W).

so, first, assuming W within 0001..FFFF

X: -FFFF..FFFF (has to be between -W and W)
W: 0001..FFFF
sW: 000..1FF

X+W: 00000..1FFFF -> 17 bits
((X + W) * sW): 26 bits
(W*2): 17 bits
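putting the W within 0001..FFFF case into code, as a sketch (function and variable names are made up here, this isn't actual hardware or melonDS code):

```c
#include <stdint.h>

// viewport transform sketch for the W <= 0xFFFF case:
// sX = ((X + W) * sW) / (W*2), everything fits the 32-bit divider
static uint32_t viewport_x(int32_t x, uint32_t w, uint32_t sw)
{
    uint32_t num = (uint32_t)(x + (int32_t)w) * sw; // (X+W) is 17 bits, *sW -> 26 bits
    uint32_t den = w * 2;                           // 17 bits
    return num / den;
}
```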

next, when W is greater than FFFF

W: 010000..FFFFFF

X+W: 0000000..1FFFFFF -> 25 bits
((X + W) * sW): 34 bits
(W*2): 25 bits (same max value as X+W)

denominator is shifted right by two, same for numerator??

TODO: how is the numerator handled? presumably it always has to fit within the 32-bit unsigned range, which would explain the precision loss on the denominator

the W in the numerator doesn't lose precision
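if numerator and denominator are indeed both shifted right by two in the W greater than FFFF case, it might look like this (speculative sketch matching the open question above; names made up):

```c
#include <stdint.h>

// speculative: W > 0xFFFF case, numerator and denominator both >> 2
// so the division inputs fit back into the 32-bit divider's range;
// the bottom bit of W*2 is always 0, so only one bit of W is lost
static uint32_t viewport_x_wide(int32_t x, uint32_t w, uint32_t sw)
{
    uint64_t num = (uint64_t)(uint32_t)(x + (int32_t)w) * sw; // up to 34 bits
    uint32_t den = (w * 2) >> 2;                              // 25 -> 23 bits
    return (uint32_t)((num >> 2) / den);
}
```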


((x * w0) << shift) / ((x * w0) + ((xmax - x) * w1))

x: 00..FF (in practice, never going to be greater)
xmax-x: same
wn: 0001..FFFF
shift: 8 or 9

numerator: 8+16+shift bits
denominator: 26 bits at most

numerator reaches 32 bits along X, 33 bits along Y

that would explain the interpolation quirks along Y: the W's are reduced to 15 bits so that the numerator fits within 32 bits
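the whole factor computation, as a sketch (again, names are made up; shift is 8 along X and 9 along Y per the above):

```c
#include <stdint.h>

// perspective-correct interpolation factor sketch:
// factor = ((x * w0) << shift) / ((x * w0) + ((xmax - x) * w1))
// the real divider is 32-bit, hence the Y quirk where W's get
// reduced to 15 bits; a 64-bit numerator sidesteps that here
static uint32_t interp_factor(uint32_t x, uint32_t xmax,
                              uint32_t w0, uint32_t w1, int shift)
{
    uint64_t num = (uint64_t)(x * w0) << shift;
    uint32_t den = (x * w0) + ((xmax - x) * w1);
    return den ? (uint32_t)(num / den) : 0;
}
```

at x = 0 the factor is 0, at x = xmax it is 1 << shift, so the result is a fixed-point blend weight between the two endpoints.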

