With pragmas, I am getting 0.4s on a 2012 rMBP (macOS 10.14, Xojo 2018r3):
Public Function fib(n as int32) as int32
  // Turn off runtime checks and background task yielding for this method
  #Pragma BackgroundTasks False
  #Pragma BoundsChecking False
  #Pragma BreakOnExceptions False
  #Pragma NilObjectChecking False
  #Pragma StackOverflowChecking False
  return if( n < 2, n, fib( n - 1 ) + fib( n - 2 ) )
End Function
Optimizing is really, really hard. Back in 2017, I prototyped a new image processing algorithm in a weekend. It did exactly what I wanted, but it took 10 minutes to process a 20-megapixel image.
By Summer 2018, I’d got that down to 2 seconds, but it was creating terrible artifacts.
By Spring 2019, I had it at 10 seconds: slower again, but with far fewer artifacts, and a lot of GPU hacks.
Right now, it’s down to about 4 seconds on unreleased code, with the closest results to the original yet. I even have an idea that might save a microsecond per iteration.
Sadly, I could have optimized it a lot quicker if Apple had allowed me to use shared constants across shaders, but they don’t, nor do they have any interest in doing so, which forces me to incur far more reads on textures. Shared constants are supported in newer versions of OpenGL and DirectX, but not in Metal.