Best way to implement a case sensitive dictionary

I see from Xojo’s language reference that Dictionary now has KeyComparisonDelegate. As usual, the docs are less than stellar and I’m trying to figure out how to implement this.

Basically, I want a custom class (CaseSensitiveDictionary) that is exactly like the native Dictionary class except that String keys should be case sensitive. For example:

var d As New CaseSensitiveDictionary
d.Value("Name") = "Garry"

d.HasKey("Name") // True
d.HasKey("name") // False

Can anyone give me a nudge in the right direction for how to most efficiently handle this? Presumably I just need to subclass Dictionary and implement the KeyComparisonDelegate() function? I did this but I’m getting a compiler error:

Class CaseSensitiveDictionary
  Function KeyComparisonDelegate(leftKey As Variant, rightKey As Variant) As Integer
    If leftKey.Type = Variant.TypeString And rightKey.Type = Variant.TypeString Then
      Return leftKey.StringValue.Compare(rightKey.StringValue, ComparisonOptions.CaseSensitive)
    Else
      Return Super.KeyComparisonDelegate(leftKey, rightKey)
    End If
  End Function
End Class

The compiler is saying:

Expected Integer but got delegate Dictionary.KeyComparisonDelegate

Incorrect. You must force it using a special constructor for such, that exists only in 2019r2+

Dictionary.Constructor(keyComparison As KeyComparisonDelegate)

1 Like

I don’t have a current license, so I can’t test this. But I’m quite sure the else part of your method is wrong. You need to do the comparison for non-strings yourself, so roughly something like this:

...
if leftKey.Type = Variant.TpeString and rightKey.Type = Variant.TypeString then
  return leftKey.StringValue.Compare(rightKey.StringValue, ComparisonOptions.CaseSensitive)
elseif leftKey < rightKey then
  return -1
elseif leftKey > rightKey then
  return 1
else
  return 0
end if
...

No license is necessary to write tests. And your guess is not the problem he is facing. :smiley:

What about to write your own class encapsulating the dictionary but in your version of setting or reading a key, if string, you encode/decode it as a base64 string (preserving the “must be this exact way” property)?

Yes, I know. But since I don’t have a current license, I don’t have a current version on my computer. I’m still using Xojo 2017, and with that I can’t test this.

The problem he is facing is, that he returns a delegate in his else statement, and not an integer. That’s what causes the compiler error. The constructor you mentioned doesn’t solve that problem.

We don’t even know, if @Garry uses that constructor, or not. In case he doesn’t, he needs to put a constructor like this in his CaseSensitiveDictionary:

Sub Constructor()
  super.Constructor(AddressOf KeyComparisonDelegate)
End Sub

Not a bad suggestion but I suspect base64-encoding will negatively impact performance.

I think I know have it working thanks to a combination of the suggestions from @Rick.A and @C-8DO suggestions. Here’s the class in case anyone else needs it:

Class CaseSensitiveDictionary
  Private Function KeyComparer(leftKey As Variant, rightKey As Variant) as Integer
    ///
    ' Delegate used to compare the keys for this dictionary.
    ///
  
    If leftKey.Type = Variant.TypeString And rightKey.Type = Variant.TypeString Then
      // Both keys are strings so do a case-sensitive comparison.
      Return leftKey.StringValue.Compare(rightKey.StringValue, ComparisonOptions.CaseSensitive)
    Else
      If leftKey = rightKey Then
        Return 0
      ElseIf leftKey < rightKey Then
        Return -1
      Else
        Return 1
      End If
    End If  
  End Function

  Public Sub Constructor()
    Super.Constructor(AddressOf KeyComparer)
  End Sub
End Class

You use it just like a normal Dictionary:

Var cd As New CaseSensitiveDictionary
cd.Value("name") = "Garry"
Call HasKey("Name") // False as name <> Name

yeah you had to create the dictionary with the delegate so that for the life time of that dictionary the comparison is consistent

originally they had it so the delegate was NOT required as part of the constructor

so in theory you could create a dictionary, insert a pile of key value pairs, change the delegate then insert some more and now what is your assurance about “each key can only occur once” ?

it was a bad design

that said I wonder about the key comparer and whether you need to check not just for variants that holds string but things that will, or would, easily convert themselves to a string
I dont know if there are other items that also should compare as “the same” with this that might be overlooking

Sure, but, as I said, it´s a way of doing it with previous not 2019r2+ versions.

1 Like

There are definitely items that shouldn’t compare as “the same”, that’s why I put the “equal” result at the end of the if-then-else. And there are also items that shouldn’t be compared at all. At least—depending on what types @Garry expects/accepts as keys—the comparison of non-string values may need some more work.

I think he should only compare items of the same kind, e.g. strings with strings, numbers with numbers and so on. For items that aren’t of the same kind he should then simply return -1.
But then again he’d have to handle cases where leftKey = "123" and rightKey = 123

Let’s say he has one key that’s a string, and one that’s an integer. Does he want to compare them as strings (making the comparison lexicographic), or does he want to compare them as numbers (converting strings that don’t hold a numeric value to 0).

Was just pondering what else a regular dictionary might consider “equal keys” and that garry’s should as well - except when case differs
I think this works as is - I havent done exhaustive testing to be sure though

A bit off topic but I have always thought that rather than using variants for keys, dictionaries should be defined to use specific types…

When i use a dictionary the keys are always of the same type and so are the values (though they often are a different type from the keys)

I know the language does not support anything like this but I would like to see a dictionary defined like this:

Class MyCustomDictionaryClass as SpecificSpecific(KeyType as KeyDataType, ValueType As ValueDataType)

Where:
KeyDataType could be a simple type, class that implements a DictionaryKeyInterface (but casting is under the hood), other classes would just use the instance address or some other unique instance Identifier .

For strings one could have values of String or StringCaseSensitive

ValueDataType could be a simple type, any specific class or an array of those things.

Using templates, at compile time, the IDE would “write” and compile the code for that specific dictionary class using only the specific types and not use variants at all…

This would let the compiler catch more errors in our code, and remove the overheard of using variants.

I know Xojo would never do anything like this, but it is what I would like to see.

-karen

Take a look at @npalardy’s GitHub page: https://github.com/npalardy/TypedDictionary

Maybe that’s what you’re looking for.

I am speaking about something better.

Norm’s class only does run time checking (which is all that is possible without Xojo Inc doing something like i suggest, or writing your own templates from scratch and writing an app to write the actual class code when you specify the data types).

In any case his subclass still uses variants and adds a bit more overhead, rather than having less.

if I knew how to write high performance dictionary code, writing an app to create such dictionary classes might be a useful thing to do… But I don’t have the skill/knowledge for that

why not forgo a “dictionary” and create you own class based on an in memory SQLite database instead?

Read this chapter of Bob Nystrom’s absolutely amazing Crafting Interpreters free online book. He implements a dictionary in C but it’s easy to follow along to.

I absolutely wish Xojo could do that. That’s how Java does it and there are many advantages to it.

For this particular project, I’ll only be storing UTF-8 encoded strings as keys. Specifically I’m storing the keywords and reserved words for the language I’m writing for use in the tokeniser. I may ditch the dictionary altogether and use a trie ultimately but it’s always best to optimise later…

generics

I have something on github that lets you do this BUT it’s all done at runtime because sojo doesnt support any other model at the moment

I would LOVE to be able to do something like

      dim d as New Dictionary( string , boolean)

and that way only strings keys and boolean values would work

at the risk of repeating my self generics

I must be coming down with something :slight_smile:

Tries are damned fast for lots of things since you only have to store the common prefixes of any strings

somewhere I have / had code that create an unoptimized trie

That is pretty much how Swift does it…

 var d  : [String : Bool] = [:]