Efficient Canvas Drawing

It’s taken me a lot longer than I expected to be able to sit down and look at your code.

I’m using 2018r3 for production (as there are some issues with newer versions of Xojo that I haven’t found the time to resolve yet).

The immediate things I see that I would do differently:

  1. Remove everything that isn’t absolutely necessary for drawing, including the events that fire when rendering starts and stops.
  2. Don’t cache the image; draw directly to the canvas. Ensure that the tiles are pixel perfect for the screen, i.e. don’t make macOS scale the images. It looks like if I run this on a Retina machine, the OS has to scale the images to fit the pixel areas.
  3. Reduce the number of steps in your Paint event and get it as close to a single function as possible. Each call to another function or method adds overhead.
  4. Do all tile calculations elsewhere. I see there’s a function “computeVisibleTileIndexes” that gets called when the cache is refreshed; do this at some other time, such as when setting up the map or when an event happens.
  5. Loops have an overhead. When you recalculate which tiles should be onscreen, stuff them into an array along with their positions. When drawing, loop through that single array and extract the pre-computed locations.
  6. Separate the graphics from the tiles, so that tiles which use the same picture/image can share it. The OS does some caching, so if you have 100 tiles that are the same pixels but different image instances, it has to draw 100 different images. Using the same image, it should be able to gain a bit of performance (if Apple still cares about performance). This will also keep your memory use down.
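
Points 4 to 6 can be sketched in a language-neutral way. The Python below is an illustration only: it is not Xojo code and is not taken from the project under discussion. `build_draw_list`, `paint`, `TILE_SIZE`, and the string image keys are all hypothetical names chosen for the example. The visible tiles are computed once, outside the paint step, into a flat list of (image, x, y) entries, and tiles with the same appearance share one image object.

```python
from typing import Callable, Dict, List, Tuple

TILE_SIZE = 32  # hypothetical tile size in pixels

# One shared image object per tile kind, so identical tiles reuse one instance.
# Plain object() stands in for a real picture here.
IMAGES: Dict[str, object] = {"grass": object(), "water": object()}

DrawEntry = Tuple[object, int, int]  # (shared image, x, y)


def build_draw_list(tile_map: List[List[str]], scroll_x: int, scroll_y: int,
                    view_w: int, view_h: int) -> List[DrawEntry]:
    """Run when the map scrolls or changes, never inside the paint step."""
    first_col = max(scroll_x // TILE_SIZE, 0)
    first_row = max(scroll_y // TILE_SIZE, 0)
    last_col = min((scroll_x + view_w - 1) // TILE_SIZE, len(tile_map[0]) - 1)
    last_row = min((scroll_y + view_h - 1) // TILE_SIZE, len(tile_map) - 1)

    entries: List[DrawEntry] = []
    for row in range(first_row, last_row + 1):
        for col in range(first_col, last_col + 1):
            image = IMAGES[tile_map[row][col]]
            entries.append((image,
                            col * TILE_SIZE - scroll_x,
                            row * TILE_SIZE - scroll_y))
    return entries


def paint(draw_list: List[DrawEntry],
          draw_picture: Callable[[object, int, int], None]) -> None:
    """The paint step: one loop, no calculations, just draw calls."""
    for image, x, y in draw_list:
        draw_picture(image, x, y)
```

The paint step then does nothing but walk the list, which is the "as close to a single function as possible" shape described in point 3.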

I understand that you want to make this as cross-platform as possible, but in order to gain the best performance, I would recommend considering doing some platform-specific hacks.

Hope that this helps in some way. Oh and turn off anti-alias.


Thanks ever so much for this input @samRowlands. Some of these I have already implemented in recent branches, and many of the others I will now incorporate. I’m trying to put together a better demo app to showcase what it can currently do.

Btw, how do you turn off anti-aliasing? I’m wondering if this is also causing blurry text when I use Graphics.DrawText on the map too…

Very exciting things you write there @samRowlands .
Does this tip only refer to this special case or do you really suggest to never use a buffer picture in Xojo with graphics programs? Doesn’t that lead to flickering?

In my experience, caching as many calculated values as possible can mean enormous speed increases in rendering.

On macOS, because the OS already buffers things, using an additional buffer picture isn’t required or recommended.

On Windows it may be, since you want to draw as infrequently as possible and then only draw your buffer. Doing otherwise can lead to flicker on Windows.

So you may need slightly different code, with #If TargetWindows and #If TargetMacOS paths, so that Windows uses a buffer and macOS does not.


What about Linux?

g.antialias = false

I believe the text may look blurry for one of several reasons.

  1. Text rendering on 1x displays took a dive in quality on macOS 10.14.
  2. I didn’t notice any handling of Retina assets in your code (that doesn’t mean there isn’t any), so I would assume that when the application runs on a Retina display it’s drawing 1x tiles instead of 2x tiles. This also causes a performance hit, as the tiles are not “pixel perfect” and the OS has to perform scaling.

There’s also another trick you could try, and that’s to change the blending mode (done via declares or a plugin) to a basic Copy; this way you avoid any transparency calculations during the draw. I can create the needed declares for this. I’ve not tested it (yet), so I don’t know how much difference it will make (if any), but in theory it should save some math.

It depends on what you’re doing. Drawing a picture to a canvas graphics is one of the slowest things (in my experience), not to mention if you’re drawing to that picture and then to the canvas in the same run loop.
I find it’s often quicker to use the drawing primitives in a canvas than to create a cached image and draw that. I’ve started to create some of my icons using native Xojo code; the other advantage is that you don’t need to worry about screen scales, as native drawing primitives are vector in nature.
As for flickering, I only work on Mac, which has 3 levels of buffering in the OS, so I haven’t seen flickering for a while, except when I crash the graphics card.

This is something that Apple have been saying for decades, which is to ONLY draw in the paint event, do calculations elsewhere.

Li what ??? :stuck_out_tongue:
Linux should be more like Windows, as its drawing engine doesn’t inherently double buffer as far as I know.
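
The per-platform advice in this thread boils down to a single decision. The sketch below is plain Python, not Xojo, and `needs_back_buffer` is a hypothetical helper; in Xojo the same branch would be made at compile time with #If TargetMacOS, #If TargetWindows and #If TargetLinux. It encodes only what is claimed above, including the "as far as I know" guess for Linux.

```python
import sys


def needs_back_buffer(platform: str = sys.platform) -> bool:
    """Decide whether to draw through an off-screen buffer picture.

    Based on the claims in this thread, not on measurements:
    macOS ("darwin") already buffers in the OS, so draw directly;
    Windows and (as far as is known here) Linux do not, so buffer.
    """
    return not platform.startswith("darwin")


if __name__ == "__main__":
    for name in ("darwin", "win32", "linux"):
        print(name, "buffer" if needs_back_buffer(name) else "draw directly")
```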

I did a small test, simply dragging an image around a canvas. It was a 1x image being displayed on a Retina screen (not pixel perfect).

Anti Alias ON (default): 20,000 Microseconds.
Anti Alias OFF: 6,000~7,000 Microseconds, so about 3x performance gain.

Drawing it pixel perfect, so each pixel of the image matches each pixel on the screen:
AA ON: 3,000 Microseconds.
AA OFF: 1,500~2,000 Microseconds, so up to 13x performance gain over the first test (not a 100% fair comparison).

Pixel Perfect x 4 (to cover as much area as a 1x image not being drawn pixel perfect):
AA ON: 10,000 Microseconds.
AA OFF: 6,000~8,000 Microseconds.

I also experimented with altering various other options, including the compositing mode, but none made such a significant difference as disabling anti-alias.

In summary, it appears that doing pixel perfect graphics and disabling anti-aliasing is the most optimal route I know. If you’re not going to bother with pixel perfect images, at least disable anti-aliasing.
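
The ratios quoted in this test can be checked with a little arithmetic. The Python below uses only the figures from the post. Two things are assumptions added here: where a range was given the midpoint is used for the table, and the share of a frame is worked out against a 30 FPS budget, a rate that comes up elsewhere in this thread.

```python
# Timings in microseconds, taken from the test above.
# Where a range was quoted, the midpoint is used (an assumption made here).
timings = {
    ("scaled", "aa_on"): 20_000,
    ("scaled", "aa_off"): 6_500,         # quoted as 6,000 to 7,000
    ("pixel_perfect", "aa_on"): 3_000,
    ("pixel_perfect", "aa_off"): 1_750,  # quoted as 1,500 to 2,000
}

baseline = timings[("scaled", "aa_on")]
frame_budget = 1_000_000 / 30  # microseconds available per frame at 30 FPS

# The "up to 13x" figure uses the fastest quoted time, not the midpoint.
best_case_speedup = baseline / 1_500

for (mode, aa), micros in timings.items():
    speedup = baseline / micros
    share = micros / frame_budget
    print(f"{mode:13} {aa:6} {micros:>6} us  {speedup:4.1f}x  "
          f"{share:5.1%} of a 30 FPS frame")
```

The "about 3x" figure is 20,000 divided by roughly 6,500, and "up to 13x" is 20,000 divided by 1,500, so both gains are measured against the scaled, anti-aliased case.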

Would you have the example project for download?

@samRowlands: Thank you so much for taking the time to look at this. Everything you have said has been incredibly useful.

I have just pushed new commits to the repo where I’m experimenting with gaming with Xojo. Feel free to download the project and run the desktop test harness. The code contains many fixes that @samRowlands has suggested.

I found by far the biggest performance increase was in drawing directly to the Graphics object that is passed to the Canvas.Paint event. This really surprised me. I had been doing a “big draw” of the map to a Picture and then drawing that into the visible Graphics object. I thought this would be faster, as I could simply redraw a single tile to the buffered Picture if it changed and then copy the whole buffer back to the Graphics object. It turns out that it’s actually faster to redraw every single visible tile at 30 FPS than to do a single copy of a portion of the larger buffer to the Graphics object. Very surprising.

I already only draw tiles that are visible which obviously hugely helps with performance.

In the demo app, I’m updating the Canvas with a timer 30 times a second. The first time the Canvas is painted, the entire visible map is drawn directly to the Graphics object. Simultaneously I also draw every visible tile to a Picture object that is the same size as the Canvas.Graphics object. I cache this picture and then the next time the Canvas.Paint event fires I check to see if any tiles have changed. If they haven’t, I simply draw the cached Picture to the Graphics object. Speed-wise this doesn’t change much, but it hugely reduces the CPU load. The issue, though, is that Retina / HiDPI is broken.

I say this because (in the demo at least) I’m only using Graphics primitives for the tiles (Graphics.FillRectangle). If I draw some text to the Canvas in the Paint event then it looks lovely. However, when the cached Picture is used, it’s blurry. Is there a way to fix this? Perhaps I’m not converting the HiDPI Graphics object to a Picture correctly? The code I’m using to cache it can be found in GameKit.SquareTileMap.RedrawViewport under the comment “// Render this tile to the cache.”.

Here’s a couple of screenshots. The first is the demonstration of GameKit’s customisable route finding using the A* algorithm:

The second screenshot demonstrates GameKit’s abstraction of maps. You can use a single data source for the tiles in a map but display two different size maps on a Canvas at once. The screenshot shows a large scrollable map and a smaller mini map. Both can be scrolled with the mouse and clicking a tile on either map colours it randomly and updates both maps simultaneously.

Is there a way to fix this? Perhaps I’m not converting the HiDPI Graphics object to a Picture correctly?

Make sure your buffering picture is also set to the same dpi as the graphics object when you create it.
I suspect your backing object might be 72 dpi where the Retina screen is 144 or higher.

Ah, possibly. How does one achieve that? Graphics.ScaleX?

Simply set the horizontal and vertical resolutions.
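
As a rough illustration of the sizes involved, the arithmetic looks like this. The sketch is plain Python rather than Xojo, `buffer_spec` and `BufferSpec` are hypothetical names, and it assumes a 72 dpi baseline at 1x, which matches the 72 and 144 figures mentioned above.

```python
from typing import NamedTuple


class BufferSpec(NamedTuple):
    pixel_width: int
    pixel_height: int
    dpi: float


BASE_DPI = 72.0  # assumed resolution of a 1x display


def buffer_spec(point_width: int, point_height: int, scale: float) -> BufferSpec:
    """Size and resolution for a buffer that matches the screen pixel for pixel.

    point_width and point_height are the canvas size in points;
    scale is the display scale factor (1.0 for standard, 2.0 for Retina).
    """
    return BufferSpec(
        pixel_width=round(point_width * scale),
        pixel_height=round(point_height * scale),
        dpi=BASE_DPI * scale,
    )


if __name__ == "__main__":
    print(buffer_spec(400, 300, 2.0))
```

A 400 x 300 point canvas on a 2x display therefore wants an 800 x 600 pixel buffer at 144 dpi. A 400 x 300 pixel buffer at 72 dpi would have to be stretched to fill the same area, which is one plausible cause of the blur described above.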

Apologies for the delayed response. To test this I made the performance widget I’ve always wanted. However, I’m occasionally getting application hangs when using DrawString with it to display fps/ms.

I’ve been trying to figure out why, but I’m at a loss.

I also integrated this widget into my other application, and have been using it to gauge and improve performance. It’s forced me to look at how I’m rendering to Metal, as when I started it was running slower than OpenGL. Some of that is probably because Metal doesn’t support some of the things that OpenGL does, and likes to display solid black screens if pixels have values outside the range 0.0 ~ 1.0, so I have to do extra work on Metal to clamp them after each and every shader. Eurgh!

How much work was it to get a Metal surface on a view @samRowlands? Is this something you could abstract for more general use like Xojo’s OpenGLSurface? I’d pay for that. I think that’s actually how the OpenGLSurface came about - it was a plugin created by TinRocket.

First up, I’m only using Metal (and OpenGL) as a rendering context for Apple’s (soon to be retired) Core Image processing. So there’s enough to do what I need to do; in theory it could be extended enough to replace OpenGLSurface.

  1. Initially, when I first worked on it almost 3 years ago, I was happy that it was much easier to configure than an OpenGL context. What I didn’t realize until last summer was that I’d used incorrect documentation from Apple, and had created a Metal display portal that was backed by OpenGL. While it all worked perfectly on the array of hardware (with various OS versions) here, it failed miserably for many customers.

Once I’d got the combination correct of a Metal display, backed by a Metal render, I realized that it had just moved the complexity from configuring to actual rendering. Sadly all the workarounds I’ve been forced to employ have pretty much obliterated any perceived performance improvement and actually broke a feature that made my app pretty powerful.

If it wasn’t for the fact that OpenGL is deprecated, I’d advise avoiding Metal entirely. That and while Metal was supported from 10.11, only 10.14 guarantees you a Metal device, so for anything lower, you need to support OpenGL and Metal.

  2. Yes it can be extracted and shared, it will take me some time to do so. My final design is pretty slick IMHO, as it uses CALayers, so from one single control at design time it can be either OpenGL or Metal at run time, I even allow the user to choose. It’s worth noting that because it’s only a rendering target for Core Image, it makes this much easier, otherwise you’ll need to have two sets of code, one for Metal and one for OpenGL.

I have several complaints with Metal (when compared with OpenGL).
A. pow( negative value ) with Metal will crash the render.
B. Negative values must be clamped after each shader, or weirdness ensues.
C. Stuttering. OpenGL frame rates are consistent, even with long renders, as they’re synchronous. Metal is asynchronous, passing control back to your application before it’s finished rendering, simulating faster performance. In reality, it fills up its buffer, which holds a maximum of 3 frames, and then your application freezes until all 3 buffers have been rendered. I’m still trying to find a solution for this nightmare.

Of course all my complaints with Metal could simply be because there’s very little documentation on using Core Image with Metal on the Mac.

I’m also not looking forward to having to re-write all my Core Image code and kernels in pure Metal.

Ha ha ha… I decided to Google how to make it synchronous and implemented it; now it’s much smoother as it no longer stutters :slight_smile:


How?

  1. Change the CAMetalLayer to use Core Animation when presenting (eugh, normally Core Animation slows things down).
  2. When “presenting” the CAMetalDrawable, commit the commandBuffer first, then ask the command buffer to “waitUntilScheduled” and call “present” on the CAMetalDrawable.

Each pass takes longer, and asking the CAMetalLayer for a drawable still takes time, but it’s more consistent than loading up 3 frames and pausing for a second or two before anything else can happen.