Some sample code:
Dim TestColl As New Collection TestColl.Add 153, "Greece" TestColl.Add 23, "Japan"now we can use an integer index to retrieve the n-th 1-based item (returns 23)
TestColl.Item(2)or the string to retrieve the item by key (returns 153)
TestColl.Item("Greece")We can also iterate the values:
Dim v As Variant For Each v In TestColl Debug.Print v NextThere is some debate as to whether the collection uses a proper hashed lookup to retrieve the value when given a string key. Many sites/blogs recommend using the Dictionary object in the System.Scripting namespace.
While this may mostly work, over the years I have learned to avoid external dependencies in MS Access wherever possible. Hence I determined to investigate the lookup performance of the native VBA collection for both integer and string based retreival.
The following code builds a collection of 100,000 integer values and then looks up a particular value at the start, middle or end of the collection 10,000 times, so as to return a measurable time.
Public Sub TestCollectionItemLookup() Dim TestColl As New Collection Dim i As Long Dim k As Long Dim StartTime As Single Dim Iterations As Long Iterations = 10000 Debug.Print "Iterations=" & Iterations For i = 0 To 100000 TestColl.Add i, CStr("k" & i) Next ' By Int Index, i=2 StartTime = Timer For i = 0 To Iterations k = TestColl.Item(2) Next Debug.Print "By Int Index, i=2: t=" & Format(Timer - StartTime, "0.000000") ' By Int Index, i=500 StartTime = Timer For i = 0 To Iterations k = TestColl.Item(500) Next Debug.Print "By Int Index, i=500: t=" & Format(Timer - StartTime, "0.000000") ' By Int Index, i=50000 StartTime = Timer For i = 0 To Iterations k = TestColl.Item(50000) Next Debug.Print "By Int Index, i=50000: t=" & Format(Timer - StartTime, "0.000000") ' By Int Index, i=99999 StartTime = Timer For i = 0 To Iterations k = TestColl.Item(99999) Next Debug.Print "By Int Index, i=99999: t=" & Format(Timer - StartTime, "0.000000") ' By String Index, i=500 StartTime = Timer For i = 0 To Iterations k = TestColl.Item(CStr("k" & 500)) Next Debug.Print "By String Index, i=500: t=" & Format(Timer - StartTime, "0.000000") ' By String Index, i=99999 StartTime = Timer For i = 0 To Iterations k = TestColl.Item(CStr("k" & 99999)) Next Debug.Print "By String Index, i=99999: t=" & Format(Timer - StartTime, "0.000000") '***************************************************************** ' By Int Index, random lookup StartTime = Timer For i = 0 To Iterations k = TestColl.Item(CLng(Rnd * 100000)) Next Debug.Print "By Int Index, random lookup: t=" & Format(Timer - StartTime, "0.000000") ' By String Index, random lookup StartTime = Timer For i = 0 To Iterations k = TestColl.Item(CStr("k" & CLng(Rnd * 100000))) Next Debug.Print "By String Index, random lookup: t=" & Format(Timer - StartTime, "0.000000") End SubExecuting this sub printed the following results for me:
Iterations = 10000 By Int Index, i=2: t=0.007813 By Int Index, i=500: t=0.031250 By Int Index, i=50000: t=5.406250 By Int Index, i=99999: t=16.195310 By String Index, i=500: t=0.031250 By String Index, i=99999: t=0.027344 By Int Index, random lookup: t=6.164063 By String Index, random lookup: t=0.027344We can draw the following conclusions:
- lookup in a collection by integer index just uses a linear for/next loop with a value comparison. Values at the start of the collection will be retrieved far more quickly than ones at the end
- lookup by string key IS hashed, and it makes no difference where in the array the values lies, although key lookup is at least an order of magnitude slower than direct array lookup would be (we would expect integer lookup for i=2 to be 2-3 times slower than an array lookup)
- the code also checks 10,000 tries of random retrieval, in case of some kind of caching was speeding up the key lookup after the first retrieval, but the results indicate this not to be the case, with the result very similar to the repeated lookup of a single value
Thus, VBA collections can be recommended for hashed string lookup for large collections, however should never be used for integer indexed lookup for large collections.
If integer lookup was required, it would be the best strategy to create an array and copy the values into it, then use the array for positional retrieval.
No comments:
Post a Comment