Simple code execution timer

doj · 10 June 2020 18:28

I am currently messing about with splitting up a long string of data and would like to quantify (very approximately) how long various parts of the code take to run, using a super simple method that will not itself be affected by the code being timed.

I am not bothered about absolute times, only relative ones.

something like:-
start_time=some_way_to_count_time_now

code does its thing

end_time=some_way_to_count_time_now

run_time=start_time-end_time

super simple, better than 1 ms resolution and works on Mac with 19r1, thanks for looking.

npalardy · 10 June 2020 18:34

make start_time and end_time Doubles
some_way_to_count_time_now = the Xojo microseconds function

Dim start_time As Double = Microseconds

// code does its thing

Dim end_time As Double = Microseconds

// end_time is taken after start_time, so subtracting in this order gives a positive value
Dim run_time As Double = end_time - start_time

doj · 10 June 2020 18:54

ok Norman, silly forum code mistake, thanks for the code, I am astonished I missed the ‘microseconds’ function, and there it is in Dash as plain as day, DOH!

npalardy · 10 June 2020 19:01

uh … shit happens ?

OH LOOK NO SWEAR WORDS ELIDED !!!

doj · 10 June 2020 19:16

I often use the wonderful expletive brought to the wider world through Father Ted, feck feck feck right off!!!

so feck is my go-to textual word for many a situation that requires a little more impact than ‘oh gosh!’, even my very old, and most prim, Mum does not appear to balk at this wonderful word.

I am sure that, and some other adult words, may be offensive to some readers…but (yes you guessed it) feck right off! it's the real world, get with the fun and smile at the joke.

perhaps a new topic about one's most favourite not-swear word would be appropriate, having read the forum guidelines recently, which apparently only nine of the membership have actually done! I have, apparently, got a badge to prove it, jajaja

doj · 10 June 2020 19:49

I tried the code and it works great, thanks.

for me the split function is actually working way faster than I thought it might (although I am using a reasonably nimble machine)
for a 1.2 MB file, splitting on the end-of-line character, I got 20,000+ array elements in just over 4 ms.

I tried to upload the text file so anyone could test but it's not allowed for some reason.
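
Since the file could not be attached, here is a minimal sketch of the same kind of measurement using generated data. The sample lines, the line-feed delimiter and the variable names are assumptions for illustration, not the original file, so the timing will differ from the figure above.

```xojo
// Hypothetical sample data: 20,000 short lines joined with a line feed.
Dim lines() As String
For i As Integer = 1 To 20000
  lines.Append("line " + Str(i))
Next
Dim data As String = Join(lines, Chr(10))

// Time only the Split call.
Dim start_time As Double = Microseconds
Dim parts() As String = Split(data, Chr(10))
Dim end_time As Double = Microseconds

// Microseconds counts in microseconds, so divide by 1000 for milliseconds.
Dim run_ms As Double = (end_time - start_time) / 1000.0
System.DebugLog(Str(UBound(parts) + 1) + " elements in " + Format(run_ms, "0.000") + " ms")
```

For relative comparisons it is probably worth running each variant several times and comparing the totals, since a single run of a few milliseconds can be skewed by whatever else the machine is doing.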

npalardy · 10 June 2020 20:03

I use Split on a file that is 15 MB and it's pretty reasonable

s7g2vp2 · 10 June 2020 20:29

If your text is ASCII or UTF-8 you can improve performance further by using SplitB instead of Split.

npalardy · 10 June 2020 20:32

Careful with SplitB on UTF-8 as it can split multibyte characters incorrectly (unless you want to handle all the bytes yourself)
On ASCII or other single-byte encodings it's fine

doj · 10 June 2020 21:57

looking at the parameters I have to parse, I think I am going to need perhaps 4 splits, a trim or 2 and then some parsing of the data inside each array element.

I think I can get it all done well within the time I had expected it to take while still being useable on a dog of a machine too.

thanks for the UTF-8 thought, I am sure it's UTF-8 but who knows.

s7g2vp2 · 11 June 2020 08:14

As long as the UTF-8 data is correctly formed there should be no problem using B functions with multi-byte data. If it isn’t correctly formed you are probably screwed as the standard string functions won’t work correctly either.

Here is the section on self-synchronisation for UTF-8:
“The leading bytes and the continuation bytes do not share values (continuation bytes start with 10 while single bytes start with 0 and longer lead bytes start with 11). This means a search will not accidentally find the sequence for one character starting in the middle of another character.”

We have also tested this and are using it in production with text from several different multi-byte scripts, and it has yet to fail.
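
As a rough illustration of the quoted rule, the sketch below walks the bytes of a short string and classifies each one by its top bits. The sample string and variable names are made up for the example, and it assumes the string literal is stored as UTF-8.

```xojo
// Hypothetical sample: one ASCII letter, one two-byte Greek letter, one ASCII letter.
Dim s As String = "aγb"

For i As Integer = 1 To LenB(s)
  // Numeric value (0-255) of the i-th byte.
  Dim b As Integer = AscB(MidB(s, i, 1))
  Dim kind As String
  If b < &h80 Then
    kind = "single byte (0xxxxxxx)"
  ElseIf (b And &hC0) = &h80 Then
    kind = "continuation byte (10xxxxxx)"
  Else
    kind = "lead byte (11xxxxxx)"
  End If
  System.DebugLog(Hex(b) + ": " + kind)
Next
```

Because a single-byte delimiter such as Chr(9) falls in the first range, it can never match a lead or continuation byte, which is why a byte-level search for it does not land inside a multi-byte character.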

Clothears · 11 June 2020 13:36

I’m actually puzzled now (again). From what the doc has to say about Split and SplitBytes (or SplitB), I don’t understand what the difference between all these is. Either you split into characters (or strings, on a string), or into bytes. What is the point of SplitB (or SplitBytes)?

npalardy · 11 June 2020 13:36

If you’re handling the multibyte sequences yourself then SplitB shouldn’t present issues

My caution was more aimed at “if you’re splitting and expecting ‘characters’ as a result then splitB won’t help you with multibyte UTF-8”

s7g2vp2 · 11 June 2020 14:36

SplitB / ReplaceB etc… are lower level functions which look for matching sequences of bytes. These functions will work with ASCII and UTF-8 multi-byte characters in most situations.

They work with multi-byte UTF-8 characters in most situations because the lead byte and the continuation bytes that make up a single multi-byte character use different ranges. This means you cannot get a false match that is made up of part of one multi-byte character and part of the next.

Example:

  Dim s As String
  Dim a(-1) As String
  Dim s2 As String
  
  s = "abc" + Chr(9) + "γρεεκ" + Chr(9) + "кгышшфт" + Chr(9) + "ોવમ્કતોે્ઠ" + Chr(9) + "ะ้ฟร" + Chr(9) + "クォあsdklじゃsぢ王wq" + Chr(9) + "شقشزهذ"
  
  a = SplitB(s, Chr(9))

However, if you are splitting on an empty string to convert a string into an array of individual characters then SplitB will only work for ASCII and not UTF-8 multi-byte characters.

Example:

  Dim s As String
  Dim a(-1) As String
  Dim b(-1) As String
  Dim s2 As String

  s = "abc" + Chr(9) + "γρεεκ" + Chr(9) + "кгышшфт" + Chr(9) + "ોવમ્કતોે્ઠ" + Chr(9) + "ะ้ฟร" + Chr(9) + "クォあsdklじゃsぢ王wq" + Chr(9) + "شقشزهذ"
  
  'does not work
  a = SplitB(s, "")
  
  'does work
  b = Split(s, "")

It is worth noting that even the standard Xojo string functions don’t handle all situations correctly such as emojis since Xojo string functions don’t work with user perceived characters.

npalardy · 11 June 2020 14:55

Text did this better but …

Clothears · 11 June 2020 17:41

Personally I would have expected .SplitB etc to split on bytes, but no matter as I don’t use them. So far I’ve not had any issues with UTF-8 such as emojis but then I don’t try to split them.

Certainly the way UTF-8 is constructed is very useful. I have a method to verify that a string is valid UTF-8, and to replace wrong bytes with the replacement character. Works fine even with chars such as OLD PERSIAN SIGN AURAMAZDAAHA (𐏊), which is F0 90 8F 8A.

s7g2vp2 · 11 June 2020 23:04

It does

doj · 11 June 2020 23:45

WOW!
thanks for all the replies, I know my case is going to be only ASCII (as valid characters) but I did forget to ensure encoding, which I always do in serial port code but for some reason ignored that here.

everything mentioned here will be thought about and tested, many thanks to all contributors.
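
For anyone else who forgets the encoding step, a minimal sketch of what it might look like before splitting. The variable raw and its contents are placeholders for data read from a file, and ASCII is assumed only because that is the case described above.

```xojo
// raw stands in for the data read from the file; how it is read is outside this sketch.
Dim raw As String = "alpha" + Chr(10) + "beta" + Chr(10) + "gamma"

// DefineEncoding only labels the existing bytes with an encoding, it does not convert them.
raw = DefineEncoding(raw, Encodings.ASCII)

// With single-byte data either Split or SplitB can be used.
Dim fields() As String = SplitB(raw, Chr(10))
System.DebugLog(Str(UBound(fields) + 1) + " fields")
```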