pbrt-v4 parity is a solid baseline - that codebase already leans hard on NVIDIA so a fair comparison was always going to be messy. surprised wavefront was the harder bit though, i'd have expected BVH tuning to be the nightmare.
To be fair, I was surprised too. But I made a relatively simple, straight port from the AMD rays sdk plus some input from the pbrt-v4 CPU BVH code, and it just worked relatively well out of the box...
This is the main intersection function which is quite simple: https://github.com/JuliaGeometry/Raycore.jl/blob/sd/multityp...
I'm not even using local memory, since it was already fast enough ;)
But I think we can still do quite a lot: large parts of the construction code are still very messy, and I want to polish and modularize it over time.
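For anyone wondering what "quite simple" can look like here, this is a rough Python sketch of a closest-hit BVH traversal that keeps a small private stack per ray. It is not the Raycore.jl code (that is Julia, linked above), and it assumes "local memory" means workgroup-shared memory. The `Node` layout and the names `hit_aabb`, `hit_triangle` and `closest_hit` are made up for illustration.

```python
import math
from dataclasses import dataclass

@dataclass
class Node:              # hypothetical flat node layout, not Raycore.jl's
    lo: tuple            # AABB min corner
    hi: tuple            # AABB max corner
    left: int = -1       # child indices (inner nodes only)
    right: int = -1
    first: int = 0       # first triangle index (leaves only)
    count: int = 0       # triangle count; > 0 marks a leaf

def sub(a, b):   return (a[0] - b[0], a[1] - b[1], a[2] - b[2])
def dot(a, b):   return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
def cross(a, b): return (a[1] * b[2] - a[2] * b[1],
                         a[2] * b[0] - a[0] * b[2],
                         a[0] * b[1] - a[1] * b[0])

def hit_aabb(o, inv_d, lo, hi, t_max):
    """Slab test: does the ray overlap the box within [0, t_max]?"""
    t0, t1 = 0.0, t_max
    for a in range(3):
        ta = (lo[a] - o[a]) * inv_d[a]
        tb = (hi[a] - o[a]) * inv_d[a]
        if ta > tb:
            ta, tb = tb, ta
        t0, t1 = max(t0, ta), min(t1, tb)
    return t0 <= t1

def hit_triangle(o, d, tri):
    """Moller-Trumbore; returns the hit distance, or inf on a miss."""
    v0, v1, v2 = tri
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(d, e2)
    det = dot(e1, p)
    if abs(det) < 1e-12:          # ray parallel to the triangle plane
        return math.inf
    inv = 1.0 / det
    s = sub(o, v0)
    u = dot(s, p) * inv
    if u < 0.0 or u > 1.0:
        return math.inf
    q = cross(s, e1)
    v = dot(d, q) * inv
    if v < 0.0 or u + v > 1.0:
        return math.inf
    t = dot(e2, q) * inv
    return t if t > 1e-9 else math.inf

def closest_hit(nodes, tris, o, d):
    """Return (distance, triangle index) of the nearest hit, or (inf, -1)."""
    inv_d = tuple(1.0 / x if x != 0.0 else math.copysign(math.inf, x) for x in d)
    best_t, best_i = math.inf, -1
    stack = [0]                   # private per-ray stack, root at index 0
    while stack:
        n = nodes[stack.pop()]
        # Passing best_t as t_max culls boxes behind the current best hit.
        if not hit_aabb(o, inv_d, n.lo, n.hi, best_t):
            continue
        if n.count > 0:
            for i in range(n.first, n.first + n.count):
                t = hit_triangle(o, d, tris[i])
                if t < best_t:
                    best_t, best_i = t, i
        else:
            stack.append(n.left)
            stack.append(n.right)
    return best_t, best_i
```

On a GPU the stack would typically be a small fixed-size array per thread, and a tuned version would visit the nearer child first, but the overall loop stays about this short.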