Xojo is slower than Python

samRowlands · 2 June 2020 01:35

With pragmas, I am getting 0.4s on a 2012 rMBP (macOS 10.14, Xojo 2018r3)

Public Function fib(n as int32) as int32
  #Pragma BackgroundTasks False
  #Pragma BoundsChecking False
  #Pragma BreakOnExceptions False
  #Pragma NilObjectChecking False
  #Pragma StackOverflowChecking False
  
  return if( n < 2, n, fib( n - 1 ) + fib( n - 2 ) )
End Function

samRowlands · 2 June 2020 01:37

Ha ha ! Yeah I love the speed I get from Objective-C, but when Xcode gives a new error that I’ve never seen before, and I can’t find any information about it online…

Or when I’m trying to build custom CIPlugins with a modern Xcode that no longer has the CIPlugin template. Man, that took ages to figure out all of the settings across the 30-odd pages.

npalardy · 2 June 2020 01:37

Scroll back
Most of these have already been run

in something like Rust the biggest chunk is actually starting the process
the fibonacci calc itself takes next to nothing

samRowlands · 2 June 2020 01:47

Got it to 0.2 s

Really close now.

npalardy · 2 June 2020 01:52

guess how you’re timing things matters

i timed 2 things
total app time and time to run fib (you can see that by my posts way back)

I used this to get "total run to stop" time
perl -MTime::HiRes=time -e 'printf "%.9f\n", time' ; <<insert cmd line to run your app here>> ; perl -MTime::HiRes=time -e 'printf "%.9f\n", time'
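For the second kind of measurement (time to run fib only, excluding process startup and shutdown), a minimal Go sketch could look like the following. The helper name timeFib is made up for illustration, and Go is used here only because a Go version appears later in the thread; this is not the code anyone in the thread actually ran.

```go
package main

import (
	"fmt"
	"time"
)

// fib is the same naive recursive definition used throughout the thread.
func fib(n int) int {
	if n < 2 {
		return n
	}
	return fib(n-1) + fib(n-2)
}

// timeFib (hypothetical helper) runs fib(n) once and returns the result
// together with the elapsed time measured inside the process.
func timeFib(n int) (int, time.Duration) {
	start := time.Now()
	result := fib(n)
	return result, time.Since(start)
}

func main() {
	result, elapsed := timeFib(35)
	fmt.Printf("fib(35) = %d\n", result)
	fmt.Printf("elapsed: %d microseconds\n", elapsed.Microseconds())
}
```

Wrapping the whole run in the two perl timestamps above gives the total app time; the gap between that and the in-process figure is launch and teardown overhead.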

samRowlands · 2 June 2020 01:54

Optimizing is really really hard. Back in 2017, I prototyped a new image processing algorithm in a weekend. It did exactly what I wanted, but it took 10 minutes to process a 20 megapixel image.

By Summer 2018, I’d got that down to 2 seconds, but it was creating terrible artifacts.

By Spring 2019, I’d gotten it down to 10 seconds, with far fewer artifacts and a lot of GPU hacks.

Right now, it’s down to about 4 seconds on unreleased code, with the closest results to the original yet. I even have an idea that might save a microsecond per iteration.

Sadly, I could have optimized it a lot quicker if Apple had allowed me to use shared constants across shaders, but they don’t, nor do they have any interest in doing so, which forces me to incur far more texture reads. Shared constants are supported in newer versions of OpenGL and DirectX, but not in Metal.

samRowlands · 2 June 2020 01:58

1591062935.454566002
9227465
257,093
1591062935.753688097

299,122,095
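The last figure looks like the gap between the two perl timestamps in nanoseconds. A small Go sketch of that subtraction, assuming the timestamps always carry exactly nine fractional digits (the helper names parseNanos and elapsedNanos are made up for illustration):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseNanos (hypothetical helper) converts a "seconds.nanoseconds" string
// into whole nanoseconds, using integers to avoid float64 rounding.
func parseNanos(ts string) (int64, error) {
	parts := strings.SplitN(ts, ".", 2)
	if len(parts) != 2 || len(parts[1]) != 9 {
		return 0, fmt.Errorf("expected 9 fractional digits, got %q", ts)
	}
	sec, err := strconv.ParseInt(parts[0], 10, 64)
	if err != nil {
		return 0, err
	}
	frac, err := strconv.ParseInt(parts[1], 10, 64)
	if err != nil {
		return 0, err
	}
	return sec*1000000000 + frac, nil
}

// elapsedNanos returns end minus start in nanoseconds.
func elapsedNanos(start, end string) (int64, error) {
	s, err := parseNanos(start)
	if err != nil {
		return 0, err
	}
	e, err := parseNanos(end)
	if err != nil {
		return 0, err
	}
	return e - s, nil
}

func main() {
	d, err := elapsedNanos("1591062935.454566002", "1591062935.753688097")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(d) // prints 299122095
}
```

That is roughly 0.299 s for the whole run. If the 257,093 is in microseconds, about 42 ms of the total would fall outside the fib call itself.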

npalardy · 2 June 2020 02:06

FWIW the Rust code I posted above (way back when) is my first attempt at that code and NO invocation of the compiler to optimize for speed

no idea about others’ code and tests

Karen · 2 June 2020 02:10

Measuring just the Fib method (not app startup or shutdown) I got 42,168.27 microseconds for Fib(35) in a console app with pragmas and aggressive compilation on an iMac Mojave 2019 i9

The run method was just this:

Dim result as Integer
Dim s as Double = Microseconds
result = fib(35)
Print(Str(Microseconds - S))

samRowlands · 2 June 2020 02:19

Holy cow! 0.042 seconds!

I am doing something seriously wrong for it to take 6x longer on an 8-year-old i7.

Karen · 2 June 2020 02:23

I edited my post to mention that is an iMac (27") not a laptop, if that makes a difference.

BTW you used 4-byte integers on a 64-bit machine… maybe that slows things?

-Karen

npalardy · 2 June 2020 02:24

Compiled my Rust code as a non-debug build
Yeah the 115 ms was a debug build

1591064524.154890060
fib(35) = 9227465
Millis: 52 ms
1591064524.625057936

samRowlands · 2 June 2020 02:24

It was slightly faster than using Integer… Plus it’s what the Swift code used.

npalardy · 2 June 2020 02:24

4-byte types on a 64-bit machine can be slower

samRowlands · 2 June 2020 02:27

Int32
9227465
240,181

Integer
9227465
246,017

Int64
9227465
242,648

npalardy · 2 June 2020 02:28

what is 240,181 ?

samRowlands · 2 June 2020 03:02

Using Int32 as the datatype

npalardy · 2 June 2020 03:30

what does 240,181 mean ?
what does it represent ?

edit - is that elapsed milliseconds / microseconds ?

samRowlands · 2 June 2020 03:54

Xojo’s Microseconds

DaveDuke · 4 June 2020 13:31

FWIW,

Here I have written a Go version, specifically forcing it to use 1 CPU only.

package main

import "fmt"

// fr returns the n-th Fibonacci number using naive recursion.
func fr(n int) int {
	if n <= 1 {
		return n
	}
	return fr(n-1) + fr(n-2)
}

func main() {
	// Prints fr(0) through fr(34).
	for i := 0; i < 35; i++ {
		fmt.Println(i, fr(i))
	}
}

0.12s user 0.00s system 99% cpu 0.121 total

Forcing Go to use more than one processor (12 to be exact) resulted in
0.00s user 0.00s system 86% cpu 0.005 total
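The post does not show how the CPU count was limited. One way to do it from inside the program, as an assumption rather than what was actually used here, is runtime.GOMAXPROCS (the GOMAXPROCS environment variable is another). The helper name limitToOneCPU is made up for illustration.

```go
package main

import (
	"fmt"
	"runtime"
)

// fr is the same naive recursive Fibonacci as in the post above.
func fr(n int) int {
	if n <= 1 {
		return n
	}
	return fr(n-1) + fr(n-2)
}

// limitToOneCPU (hypothetical helper) restricts the Go scheduler to one
// OS thread running Go code at a time and returns the setting now in effect.
func limitToOneCPU() int {
	runtime.GOMAXPROCS(1)        // set the limit; the previous value is discarded
	return runtime.GOMAXPROCS(0) // an argument of 0 only queries the current value
}

func main() {
	fmt.Println("GOMAXPROCS:", limitToOneCPU())
	fmt.Println(35, fr(35))
}
```

Since fr runs entirely in a single goroutine, this setting would not be expected to spread the recursion itself across cores.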