Saturday, December 28, 2013

Positional MSAccess OleDb Parameters

A few years back, I wrote an MS Access data provider for SubSonic 2, the DAL generation tool.

It was a rather thankless task - the V2 project was pretty much obsolete by then - but it was extremely educational.
I'd like to share some tricks I was forced to come up with that I don't think I've ever seen published.
Surprisingly, JET SQL called from .NET can come up with the goods in many ways you might not expect.

If you want to check out the DataProvider code directly, it's 'AccessDataProvider.cs' on GitHub here.

Issue #1 - Positional Parameters

When connecting to an MS Access database in .NET, it's best to use an OleDb connection, since these are cited by Microsoft as being optimal for MS Access, and give us the widest range of functionality.
It is well known that parameters in OleDb queries are assigned by position only, and the names are disregarded entirely.
For example:
SELECT CategoryName, [comp 2] as x1, [cat 1] as x2, [comp 2] as x3 From Categories WHERE CategoryName=[cat 1];
would actually expect four individual parameters which would be assigned in order despite parameters 1 and 3, and 2 and 4 having the same names, effectively giving

SELECT CategoryName, ? as x1, ? as x2, ? as x3 From Categories WHERE CategoryName=?;

This makes it very difficult to construct the automated SQL statements required by a query tool such as SubSonic's.

After quite a bit of digging, what I found was that you actually CAN use named, positional parameters in access: the SQL just needs to be formatted in a precise way.
To return to the above example, the following ad-hoc SQL:

PARAMETERS [cat 1] Text(255), [comp 2] Text(255);
SELECT CategoryName, [comp 2] as x1, [cat 1] as x2, [comp 2] as x3 From Categories WHERE CategoryName=[cat 1];

will expect two parameters, and will re-use the values in the specified locations.
The types for the parameters (above: Text(255) ) are standard DDL types, as catalogued by Allen Browne here.

The parameter names must be standard JET SQL identifiers and as such may not contain the @ character, unlike in MS SQL Server's Transact SQL where the @ is mandatory.

When saved as a query in Access, SQL with parameter declarations are marked as a 'Stored Procedure' by OleDb, a different schema category from normal queries which are marked as a 'View' (see examples below on schema data retrieval).

Issue #2 - Fetching Database Metadata

In order to construct a DAL for an MS Access database, it is necessary to retrieve schema information about database objects.
The OleDbConnection offers a GetOleDbSchemaTable() method that returns most comprehensive schema data of any of the connection types.

DataTable dt = autoOleDbConn.GetOleDbSchemaTable(OleDbSchemaGuid.Columns, null);

However there are still two problems with the schema data retrieved.

(1) There is actually no way to determine whether a column is an Autonumber (the equivalent of a MSSQL identity).
An urban myth is circulating that COLUMN_FLAGS = 90 and DATA_TYPE = 3 (Int) means an Autonumber, but actually these conditions are met for any non-nullable Long field (of which an Autonumber is a subset).

(2) MS Access also has the 'Allow Zero Length String' property, which is unique among DBMS's as far as I know. This is not picked up by the schema data.

In the dataprovider, the way I solved this problem was to open the database object using DAO and use the DAO field attributes:
// Use DAO to extract parameters 
DAO.DBEngineClass dbc = new DAO.DBEngineClass();
DAO.Database db = dbc.OpenDatabase(
    GetAccessDBNameAndPathFromConnectionString(autoOleDbConn.ConnectionString), null, false, "");
...

    // if this is a table, extract addl DAO info
    DAO.Field dbField = db.TableDefs[tempTableName].Fields[tempColumnName];
    tempAllowEmptyString = dbField.AllowZeroLength;
    tempAutoInc = ((dbField.Attributes & (int)DAO.FieldAttributeEnum.dbAutoIncrField) != 0);
If any of this is not comprehensive enough, I'd recommend looking at the SubSonic provider code on GitHub as mentioned above.
There are a lot of useful bits and pieces there, including conversion between .NET DbType, .NET native type, DAO enumerated type and DAO DDL type.

Friday, December 27, 2013

VBA Collections: Using as a Hash Table for Fast String Lookup in Excel and Access

VBA collections are pretty simple beasts. They only have three methods: Add, Remove and Item, and a Count property. When adding values to the collection, we can include a string key for lookup.
Some sample code:
Dim TestColl As New Collection
TestColl.Add 153, "Greece"
TestColl.Add 23, "Japan"
now we can use an integer index to retrieve the n-th 1-based item (returns 23)
TestColl.Item(2)
or the string to retrieve the item by key (returns 153)
TestColl.Item("Greece")
We can also iterate the values:
Dim v As Variant

For Each v In TestColl 
  Debug.Print v
Next
There is some debate as to whether the collection uses a proper hashed lookup to retrieve the value when given a string key. Many sites/blogs recommend using the Dictionary object in the System.Scripting namespace.
While this may mostly work, over the years I have learned to avoid external dependencies in MS Access wherever possible. Hence I determined to investigate the lookup performance of the native VBA collection for both integer and string based retreival.

The following code builds a collection of 100,000 integer values and then looks up a particular value at the start, middle or end of the collection 10,000 times, so as to return a measurable time.
Public Sub TestCollectionItemLookup()
Dim TestColl As New Collection
Dim i As Long
Dim k As Long
Dim StartTime As Single
Dim Iterations As Long
    Iterations = 10000
    Debug.Print "Iterations=" & Iterations

    For i = 0 To 100000
        TestColl.Add i, CStr("k" & i)
    Next
    
    ' By Int Index, i=2
    StartTime = Timer
    
    For i = 0 To Iterations
        k = TestColl.Item(2)
    Next
    
    Debug.Print "By Int Index, i=2: t=" & Format(Timer - StartTime, "0.000000")
    
    ' By Int Index, i=500
    StartTime = Timer
    
    For i = 0 To Iterations
        k = TestColl.Item(500)
    Next
    
    Debug.Print "By Int Index, i=500: t=" & Format(Timer - StartTime, "0.000000")
    
    ' By Int Index, i=50000
    StartTime = Timer
    
    For i = 0 To Iterations
        k = TestColl.Item(50000)
    Next
    
    Debug.Print "By Int Index, i=50000: t=" & Format(Timer - StartTime, "0.000000")
    
    ' By Int Index, i=99999
    StartTime = Timer
    
    For i = 0 To Iterations
        k = TestColl.Item(99999)
    Next
    
    Debug.Print "By Int Index, i=99999: t=" & Format(Timer - StartTime, "0.000000")
    
    ' By String Index, i=500
    StartTime = Timer
    
    For i = 0 To Iterations
        k = TestColl.Item(CStr("k" & 500))
    Next
    
    Debug.Print "By String Index, i=500: t=" & Format(Timer - StartTime, "0.000000")
    
    ' By String Index, i=99999
    StartTime = Timer
    
    For i = 0 To Iterations
        k = TestColl.Item(CStr("k" & 99999))
    Next
    
    Debug.Print "By String Index, i=99999: t=" & Format(Timer - StartTime, "0.000000")
    
    '*****************************************************************
    
    ' By Int Index, random lookup
    StartTime = Timer
    
    For i = 0 To Iterations
        k = TestColl.Item(CLng(Rnd * 100000))
    Next
    
    Debug.Print "By Int Index, random lookup: t=" & Format(Timer - StartTime, "0.000000")
    
    ' By String Index, random lookup
    StartTime = Timer
    
    For i = 0 To Iterations
        k = TestColl.Item(CStr("k" & CLng(Rnd * 100000)))
    Next
    
    Debug.Print "By String Index, random lookup: t=" & Format(Timer - StartTime, "0.000000")
End Sub
Executing this sub printed the following results for me:
   
Iterations = 10000
By Int Index, i=2: t=0.007813
By Int Index, i=500: t=0.031250
By Int Index, i=50000: t=5.406250
By Int Index, i=99999: t=16.195310
By String Index, i=500: t=0.031250
By String Index, i=99999: t=0.027344
By Int Index, random lookup: t=6.164063
By String Index, random lookup: t=0.027344
We can draw the following conclusions:
- lookup in a collection by integer index just uses a linear for/next loop with a value comparison. Values at the start of the collection will be retrieved far more quickly than ones at the end
- lookup by string key IS hashed, and it makes no difference where in the array the values lies, although key lookup is at least an order of magnitude slower than direct array lookup would be (we would expect integer lookup for i=2 to be 2-3 times slower than an array lookup)
- the code also checks 10,000 tries of random retrieval, in case of some kind of caching was speeding up the key lookup after the first retrieval, but the results indicate this not to be the case, with the result very similar to the repeated lookup of a single value

Thus, VBA collections can be recommended for hashed string lookup for large collections, however should never be used for integer indexed lookup for large collections.
If integer lookup was required, it would be the best strategy to create an array and copy the values into it, then use the array for positional retrieval.

Wednesday, July 3, 2013

Random Number Generation in ASP .NET

I wrote a previous article about Generating a Random Password using the ASP .NET provider.

Thanks to Joakim Uddholm for the comment that the Random() function was not really secure.
In this case, I think it was secure enough, but that doesn't change the fact that it is certainly a weak point in the system, which could be exploited under certain conditions.

In the previous article I tackled the main problem with Random(), which is that it returns random numbers in a fixed repeatable sequence for a given seed value, and that when created with the default (empty) constructor it is seeded with the current time. This means that if multiple instances of Random() are instantiated in a short period of time (before the time changes), they will return precisely the same sequence of 'random' numbers.

Googling around, I see comments on StackOverflow that agree with Joakim's position that System.Random() simply isn't meant for any security related purpose. This is probably true, however I'm equally certain that people for a variety of reasons will inevitably attempt to use it for security functions.

For this reason I'm posting first an improved version of the static Random() wrapper using a seed that isn't able to be associated with the current time. I'm still using a periodic re-seeding behaviour because I'm concerned that someone may be able to recognise a section of  the fixed 'random' sequence and use that to predict new values. This just emphasises that we shouldn't really be using System.Random() at all.
Anyway, I think this one is probably as good as we'll get to secure random values using Random().

Secondly, I'm posting a method to return a random int up to a given value using System.Security.Cryptography.RNGCryptoServiceProvider
I'm definitely recommending using the second method for any security related purposes.

The improved System.Random wrapper:
private static Random randomNumGenerator = new Random();
private static DateTime lastRandomNumGeneratorSeedTime = DateTime.Now;

public static Random RandomNumGenerator {
  get {
    lock (typeof(Random)) {
      if (randomNumGenerator == null) {
        randomNumGenerator = new Random();
      } else {
        if (DateTime.Now > lastRandomNumGeneratorSeedTime.AddSeconds(1)) {

          randomNumGenerator = new Random(randomNumGenerator.Next(Int16.MaxValue) * DateTime.Now.Millisecond);
          lastRandomNumGeneratorSeedTime = DateTime.Now;
      }
    }
    return randomNumGenerator;
   }
  }
}
And the better, crypto derived method. I've read comments about the performance when retrieving multiple bytes, so I've split it to only retrieve as many random bytes as it needs. The basic code is mainly from this MSDN page.
  private static System.Security.Cryptography.RNGCryptoServiceProvider rngCsp
    = new System.Security.Cryptography.RNGCryptoServiceProvider();

  public static int CryptoRandomNumber(int maxRndValue) {
    // deal with byte and UInt16 values separately for performance reasons
    if (maxRndValue <= Byte.MaxValue) {
      byte[] randomNumber = new byte[1];
      do {
        rngCsp.GetBytes(randomNumber);
      }
      while (!IsFairRoll(randomNumber[0], maxRndValue, Byte.MaxValue));

      return (int)(randomNumber[0] % maxRndValue);
    }

    if (maxRndValue <= UInt16.MaxValue) {
      byte[] randomNumber = new byte[2];
      int rnd = 0;
      do {
        rngCsp.GetBytes(randomNumber);
        rnd = (int)(randomNumber[0] + randomNumber[1] * 256);
      }
      while (!IsFairRoll(rnd, maxRndValue, UInt16.MaxValue));

      return (rnd % maxRndValue);
    }

    int rnd1 = 0;
    byte[] randomNumber1 = new byte[4];
    do {
      rngCsp.GetBytes(randomNumber1);
      rnd1 = (int)(randomNumber1[0] + randomNumber1[1] * 256 
        + randomNumber1[2] * 256 * 256 + randomNumber1[3] * 256 * 256 * 256);
      if (rnd1 < 0) { rnd1 = (rnd1 + 1) * -1; } 
    }
    while (!IsFairRoll(rnd1, maxRndValue, int.MaxValue));

    return (rnd1 % maxRndValue);
  }

  private static bool IsFairRoll(int result, int maxRndValue, int arrayMaxValue) {
    // There are MaxValue / numSides full sets of numbers that can come up 
    // in a single byte.  For instance, if we have a 6 sided die, there are 
    // 42 full sets of 1-6 that come up.  The 43rd set is incomplete. 
    int fullSetsOfValues = arrayMaxValue / maxRndValue;

    // If the roll is within this range of fair values, then we let it continue. 
    // In the 6 sided die case, a roll between 0 and 251 is allowed.  (We use 
    // < rather than <= since the = portion allows through an extra 0 value). 
    // 252 through 255 would provide an extra 0, 1, 2, 3 so they are not fair 
    // to use. 
    return result < maxRndValue * fullSetsOfValues;
  }
The new password generation function from the previous post now looks like this:
public static string GenerateFriendlyPassword(int length) {
    string chars = "abcdefghijkmnpqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ0123456789";
    var password = new StringBuilder(length);

    for (int i = 0; i < length; i++) {
        password.Append(chars[Gbl.CryptoRandomNumber(chars.Length)]);
    }
    return password.ToString();
}
Thanks!

Tuesday, June 11, 2013

Random Password Generation in ASP .NET sites

Recently I put together a new public facing web site for a client. This website ran off an existing private web-based database site I had created for them several years earlier.
There were two minor, but annoying, bumps on the road in getting this working.
Firstly, I wanted to be able to have repeatable password encryption key, meaning if I moved the site to another server, or to my development machine, I wanted the passwords to still work. More importantly, I wanted to be able to reset the password or generate a new account using one web app, and have it usable by the other. This is not the case by default - the encryption is carried out using a default key that is specific to each web server (or site ... had a quick look but I'm not sure).
Secondly, the automated password generated was too complex. The simplest password I could get using the Web.Config settings still frequently had several symbols in it, quite confronting for most average users, who really can't understand what a 'tilde' or 'asterisk' is.

1) Repeatable Password Encryption


Password encryption is controlled by the <machinekey> attribute in the <system.web> element of  web.config
<machinekey decryption="AES" 
decryptionkey="E3134ACE29C6C28A3B9CFD58CFD764D0AA2E2EE3468488C1D64DD331765B256F" 
validation="SHA1" validationkey="220C13FA9033D18C11AF964785D0C06A224B700805B3184E29973FE6A5EA3AF2E7630E81D9E24150D38891BDCACEF075DCCB287271A035993B86663FE940B056">
This would not be worth a comment in itself - this fact is pretty easy to find - but the tedious part is working out how to generate a new key for your site.
Luckily, there are several online generators for that:
      http://aspnetresources.com/tools/machineKey
      http://www.blackbeltcoder.com/Resources/MachineKey.aspx

Now simply ensure that all the sites you want to interoperate have the same Machine Key. Remember that this key needs to be kept secret. Don't go emailing your Web.Config around the place!

2) Auto-generating Simpler Passwords


This is trickier than it seems. The Web.Config settings offer a lot of control over the complexity of passwords entered by the user, but very little over what is auto-generated by the Membership provider itself.  This great article basically shows how to do it, but surprisingly, the methods suggested for generating random passwords out there veer between the wildly over-complicated and the downright crazy.
I thought I'd post a very simple but complete solution here.

Step 1: Create a Password Generation Function


This is a very simple generator function. It simply chooses randomly between an array of approved characters, taking as a single parameter the length of the desired password.
public static string GenerateFriendlyPassword(int length) {
    string chars = "abcdefghijkmnpqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ0123456789";
    var password = new StringBuilder(length);

    for (int i = 0; i < length; i++) {
        password.Append(chars[Gbl.RandomNumGenerator.Next(chars.Length)]);
    }
    return password.ToString();
}
You may notice, however, the call to Gbl.RandomNumGenerator.Next. A lot of the password samples out there use something like
    Random rnd = new Random();
    for (int i = 0; i < length; i++) {
         password.Append(chars[rnd.Next(chars.Length)]);
    }
    ...
This appears at first sight to work, but if generating batches of passwords, you'll quickly find that you get duplicate passwords being returned. Digging into the documentation reveals that Random(), like in pretty much every other language, uses a table of pseudo-random values to generate the numbers, but it uses the current clock tick value as a seed. This means that if multiple instances of Random() are instantiated quickly enough, they'll get the same seed and produce exactly the same string of 'random' numbers.
My solution is to keep a global (static) instance of Random() and use that for all number generation. Here's the code for that (in class Gbl):
private static Random randomNumGenerator = null;
private static DateTime lastRandomNumGeneratorSeedTime = DateTime.Now;

public static Random RandomNumGenerator {
   get {
      lock (typeof(Random)) {
         if (randomNumGenerator == null) {
            randomNumGenerator = new Random();
         } else {
            if (DateTime.Now > lastRandomNumGeneratorSeedTime.AddSeconds(1)) {
               randomNumGenerator = new Random();
               lastRandomNumGeneratorSeedTime = DateTime.Now;
            }
         }
         return randomNumGenerator;
      }
   }
}
This uses a global instance of Random, but also refreshes the seed value if it's been more than one second since the last use of the global instance. That ensures that there is some time-based randomness injected into the seed rather than just reeling out the values from the pseudo-random list.

EDIT: See updated post for a more secure solution to this !

Step 2: Override the Default Membership Provider


This is simple - we just override the password generation function of the provider and keep everything else the same.
using System;
using System.Text;
using System.Web.Security;

namespace MyApp {
   public class MyAppMembershipProvider : System.Web.Security.SqlMembershipProvider {
     public int GeneratedPasswordLength = 6;

     public MyAppMembershipProvider ()
      : base() {
     }

     public override string GeneratePassword() {
        return GenerateFriendlyPassword(GeneratedPasswordLength);
     }

     public static string GenerateFriendlyPassword(int length) {
        string chars = "abcdefghijkmnpqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ0123456789";
        var password = new StringBuilder(length);

        for (int i = 0; i < length; i++) {
           password.Append(chars[Gbl.RandomNumGenerator.Next(chars.Length)]);
        }
        return password.ToString();
     }
   }
}

Step 3: Reference the New MembershipProvider in the Web.Config


The MembershipProvider will be referenced in your Web.Config something like this:
<membership defaultProvider="AspNetSqlMembershipProvider">
  <providers>
    <clear/>
    <add name="AspNetSqlMembershipProvider" 
     type="System.Web.Security.SqlMembershipProvider" 
     connectionStringName="Db" 
     ...
  </providers>
</membership>
All that's needed is to change the
type="System.Web.Security.SqlMembershipProvider"
to
type="MyApp.MyAppMembershipProvider"
That's it!