Wednesday, July 3, 2013

Random Number Generation in ASP .NET

I wrote a previous article about Generating a Random Password using the ASP .NET provider.

Thanks to Joakim Uddholm for the comment that the Random() function was not really secure.
In this case, I think it was secure enough, but that doesn't change the fact that it is certainly a weak point in the system, which could be exploited under certain conditions.

In the previous article I tackled the main problem with Random(), which is that it returns random numbers in a fixed, repeatable sequence for a given seed value, and that when created with the default (parameterless) constructor it is seeded from the system clock (Environment.TickCount). This means that if multiple instances of Random are instantiated in a short period of time (within the same clock tick), they will return precisely the same sequence of 'random' numbers.

Googling around, I see comments on StackOverflow that agree with Joakim's position that System.Random simply isn't meant for any security-related purpose. This is probably true; however, I'm equally certain that people will, for a variety of reasons, inevitably attempt to use it for security functions.

For this reason I'm posting, first, an improved version of the static Random() wrapper using a seed that can't be associated with the current time. I'm still using a periodic re-seeding behaviour because I'm concerned that someone may be able to recognise a section of the fixed 'random' sequence and use that to predict new values. This just emphasises that we shouldn't really be using System.Random at all.
Anyway, I think this one is probably as close as we'll get to secure random values using Random().

Secondly, I'm posting a method that returns a random int up to a given value using System.Security.Cryptography.RNGCryptoServiceProvider.
I definitely recommend this second method for any security-related purpose.

The improved System.Random wrapper:
private static Random randomNumGenerator = new Random();
private static DateTime lastRandomNumGeneratorSeedTime = DateTime.Now;

public static Random RandomNumGenerator {
  get {
    lock (typeof(Random)) {
      if (randomNumGenerator == null) {
        randomNumGenerator = new Random();
      } else {
        if (DateTime.Now > lastRandomNumGeneratorSeedTime.AddSeconds(1)) {
          randomNumGenerator = new Random(randomNumGenerator.Next(Int16.MaxValue) * DateTime.Now.Millisecond);
          lastRandomNumGeneratorSeedTime = DateTime.Now;
        }
      }
      return randomNumGenerator;
    }
  }
}
And the better, crypto-derived method. I've read comments about the performance cost of retrieving multiple bytes, so I've split it up to retrieve only as many random bytes as it needs. The basic code is mainly from this MSDN page.
  private static System.Security.Cryptography.RNGCryptoServiceProvider rngCsp
    = new System.Security.Cryptography.RNGCryptoServiceProvider();

  public static int CryptoRandomNumber(int maxRndValue) {
    // deal with byte and UInt16 values separately for performance reasons
    if (maxRndValue <= Byte.MaxValue) {
      byte[] randomNumber = new byte[1];
      do {
        rngCsp.GetBytes(randomNumber);
      }
      while (!IsFairRoll(randomNumber[0], maxRndValue, Byte.MaxValue));

      return (int)(randomNumber[0] % maxRndValue);
    }

    if (maxRndValue <= UInt16.MaxValue) {
      byte[] randomNumber = new byte[2];
      int rnd = 0;
      do {
        rngCsp.GetBytes(randomNumber);
        rnd = (int)(randomNumber[0] + randomNumber[1] * 256);
      }
      while (!IsFairRoll(rnd, maxRndValue, UInt16.MaxValue));

      return (rnd % maxRndValue);
    }

    int rnd1 = 0;
    byte[] randomNumber1 = new byte[4];
    do {
      rngCsp.GetBytes(randomNumber1);
      rnd1 = (int)(randomNumber1[0] + randomNumber1[1] * 256 
        + randomNumber1[2] * 256 * 256 + randomNumber1[3] * 256 * 256 * 256);
      if (rnd1 < 0) { rnd1 = (rnd1 + 1) * -1; } 
    }
    while (!IsFairRoll(rnd1, maxRndValue, int.MaxValue));

    return (rnd1 % maxRndValue);
  }

  private static bool IsFairRoll(int result, int maxRndValue, int arrayMaxValue) {
    // There are MaxValue / numSides full sets of numbers that can come up 
    // in a single byte.  For instance, if we have a 6 sided die, there are 
    // 42 full sets of 1-6 that come up.  The 43rd set is incomplete. 
    int fullSetsOfValues = arrayMaxValue / maxRndValue;

    // If the roll is within this range of fair values, then we let it continue. 
    // In the 6 sided die case, a roll between 0 and 251 is allowed.  (We use 
    // < rather than <= since the = portion allows through an extra 0 value). 
    // 252 through 255 would provide an extra 0, 1, 2, 3 so they are not fair 
    // to use. 
    return result < maxRndValue * fullSetsOfValues;
  }
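To sanity-check the distribution, you could drop something like the following into a test harness. This is a hypothetical snippet, not part of the original code - it just assumes CryptoRandomNumber is reachable from wherever you run it:
  int[] counts = new int[6];
  for (int i = 0; i < 60000; i++) {
    counts[CryptoRandomNumber(6)]++;   // values 0..5, like a zero-based die roll
  }
  for (int v = 0; v < 6; v++) {
    Console.WriteLine("{0}: {1}", v, counts[v]);   // each bucket should be close to 10,000
  }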
The new password generation function from the previous post now looks like this:
public static string GenerateFriendlyPassword(int length) {
    string chars = "abcdefghijkmnpqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ0123456789";
    var password = new StringBuilder(length);

    for (int i = 0; i < length; i++) {
        password.Append(chars[Gbl.CryptoRandomNumber(chars.Length)]);
    }
    return password.ToString();
}
Thanks!

Tuesday, June 11, 2013

Random Password Generation in ASP .NET sites

Recently I put together a new public-facing website for a client. This website ran off the back of an existing private web-based database application I had created for them several years earlier.
There were two minor, but annoying, bumps on the road in getting this working.
Firstly, I wanted a repeatable password encryption key, meaning that if I moved the site to another server, or to my development machine, I wanted the passwords to still work. More importantly, I wanted to be able to reset a password or generate a new account using one web app, and have it usable by the other. This is not the case by default - the encryption is carried out using an auto-generated key that is specific to each web server (or possibly each site ... I had a quick look but I'm not sure).
Secondly, the automatically generated passwords were too complex. The simplest password I could get using the Web.Config settings still frequently had several symbols in it - quite confronting for most average users, who really can't understand what a 'tilde' or 'asterisk' is.

1) Repeatable Password Encryption


Password encryption is controlled by the <machineKey> element in the <system.web> section of web.config:
<machineKey decryption="AES" 
decryptionKey="E3134ACE29C6C28A3B9CFD58CFD764D0AA2E2EE3468488C1D64DD331765B256F" 
validation="SHA1" validationKey="220C13FA9033D18C11AF964785D0C06A224B700805B3184E29973FE6A5EA3AF2E7630E81D9E24150D38891BDCACEF075DCCB287271A035993B86663FE940B056" />
This would not be worth a comment in itself - this fact is pretty easy to find - but the tedious part is working out how to generate a new key for your site.
Luckily, there are several online generators for that:
      http://aspnetresources.com/tools/machineKey
      http://www.blackbeltcoder.com/Resources/MachineKey.aspx

Now simply ensure that all the sites you want to interoperate have the same Machine Key. Remember that this key needs to be kept secret. Don't go emailing your Web.Config around the place!
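If you'd rather not paste secrets into a web page, here's a minimal sketch of generating the two values yourself (my own illustration, not code from either generator site). The lengths match the example above - 32 random bytes for the AES decryptionKey and 64 bytes for the SHA1 validationKey:
private static string GenerateKeyHex(int byteCount) {
    // hex-encode cryptographically random bytes for use in <machineKey>
    byte[] bytes = new byte[byteCount];
    using (var rng = new System.Security.Cryptography.RNGCryptoServiceProvider()) {
        rng.GetBytes(bytes);
    }
    return BitConverter.ToString(bytes).Replace("-", "");
}

// 32 bytes (64 hex chars) for decryptionKey, 64 bytes (128 hex chars) for validationKey
string decryptionKey = GenerateKeyHex(32);
string validationKey = GenerateKeyHex(64);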

2) Auto-generating Simpler Passwords


This is trickier than it seems. The Web.Config settings offer a lot of control over the complexity of passwords entered by the user, but very little over what is auto-generated by the Membership provider itself.  This great article basically shows how to do it, but surprisingly, the methods suggested for generating random passwords out there veer between the wildly over-complicated and the downright crazy.
I thought I'd post a very simple but complete solution here.

Step 1: Create a Password Generation Function


This is a very simple generator function. It simply chooses randomly from a string of approved characters, taking the length of the desired password as its single parameter.
public static string GenerateFriendlyPassword(int length) {
    string chars = "abcdefghijkmnpqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ0123456789";
    var password = new StringBuilder(length);

    for (int i = 0; i < length; i++) {
        password.Append(chars[Gbl.RandomNumGenerator.Next(chars.Length)]);
    }
    return password.ToString();
}
You may notice, however, the call to Gbl.RandomNumGenerator.Next. A lot of the password samples out there use something like
    Random rnd = new Random();
    for (int i = 0; i < length; i++) {
         password.Append(chars[rnd.Next(chars.Length)]);
    }
    ...
This appears at first sight to work, but if you generate passwords in batches, you'll quickly find that you get duplicate passwords returned. Digging into the documentation reveals that Random, like its equivalent in pretty much every other language, generates its numbers from a deterministic pseudo-random algorithm, and that the default constructor seeds it with the current clock tick value (Environment.TickCount). This means that if multiple instances of Random are instantiated quickly enough, they'll get the same seed and produce exactly the same sequence of 'random' numbers.
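You can see this for yourself with a throwaway snippet like the following (my illustration, not from the original post):
Random a = new Random();
Random b = new Random();
// both are seeded from the same tick count, so this will usually print the same number twice
Console.WriteLine(a.Next(1000));
Console.WriteLine(b.Next(1000));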
My solution is to keep a global (static) instance of Random() and use that for all number generation. Here's the code for that (in class Gbl):
private static Random randomNumGenerator = null;
private static DateTime lastRandomNumGeneratorSeedTime = DateTime.Now;

public static Random RandomNumGenerator {
   get {
      lock (typeof(Random)) {
         if (randomNumGenerator == null) {
            randomNumGenerator = new Random();
         } else {
            if (DateTime.Now > lastRandomNumGeneratorSeedTime.AddSeconds(1)) {
               randomNumGenerator = new Random();
               lastRandomNumGeneratorSeedTime = DateTime.Now;
            }
         }
         return randomNumGenerator;
      }
   }
}
This uses a global instance of Random, but also refreshes the seed value if it's been more than one second since the last use of the global instance. That ensures that there is some time-based randomness injected into the seed rather than just reeling out the values from the pseudo-random list.

EDIT: See updated post for a more secure solution to this !

Step 2: Override the Default Membership Provider


This is simple - we just override the password generation function of the provider and keep everything else the same.
using System;
using System.Text;
using System.Web.Security;

namespace MyApp {
   public class MyAppMembershipProvider : System.Web.Security.SqlMembershipProvider {
     public int GeneratedPasswordLength = 6;

     public MyAppMembershipProvider ()
      : base() {
     }

     public override string GeneratePassword() {
        return GenerateFriendlyPassword(GeneratedPasswordLength);
     }

     public static string GenerateFriendlyPassword(int length) {
        string chars = "abcdefghijkmnpqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ0123456789";
        var password = new StringBuilder(length);

        for (int i = 0; i < length; i++) {
           password.Append(chars[Gbl.RandomNumGenerator.Next(chars.Length)]);
        }
        return password.ToString();
     }
   }
}

Step 3: Reference the New MembershipProvider in the Web.Config


The MembershipProvider will be referenced in your Web.Config something like this:
<membership defaultProvider="AspNetSqlMembershipProvider">
  <providers>
    <clear/>
    <add name="AspNetSqlMembershipProvider" 
     type="System.Web.Security.SqlMembershipProvider" 
     connectionStringName="Db" 
     ...
  </providers>
</membership>
All that's needed is to change the
type="System.Web.Security.SqlMembershipProvider"
to
type="MyApp.MyAppMembershipProvider"
That's it!

Wednesday, October 17, 2012

Automated Foreign Key Cached Dictionary Generation in SubSonic 3 : Part 2

In part one, we tackled the problem of what Foreign Key (FK) relationships in a database could look like when translated into the object world.
An initial, not terribly successful attempt to provide these methods was undertaken.

At this stage, several design points became obvious:

  • there was a need to cache the entire set of objects
  • this would best be done using a static list / dictionary
  • the least confusing way to provide the FK lookups would be for each cached table to provide lookup dictionaries to itself, indexed by FK column value (rather than one table storing data about another)
  • we might want also to look up any indexed column values by index as well as those specifically with a FK
  • it would be great to have the option to cache only some tables/objects and not others, and have the object seamlessly detect and deal with this - retrieve cached values if present, or query the database otherwise

Cached Lists

The cached lists were declared in a normal class, which could later be instantiated as a static member of the application.
This is a sample of the objects for the Order table.
public class DataCache {
    public List<Order> Order_BaseList = null;
    public Dictionary<int, Order> Order_By_OrderID = null;    
    public Dictionary<string, List<Order>> Order_GroupBy_CustomerID = null;
    public Dictionary<int, List<Order>> Order_GroupBy_EmployeeID = null;
    public Dictionary<int, List<Order>> Order_GroupBy_ShipVia = null;

    public DataCache () {
        Order_BaseList = Order.All().ToList();
        foreach (IHasDataCache hdc in Order_BaseList ) { hdc.CachedData = this; } 
        Order_By_OrderID = Order.OrderID_CreateLookupList(Order_BaseList);  
        Order_GroupBy_CustomerID = Order.CustomerID_CreateFkChildList (Order_BaseList);
        Order_GroupBy_EmployeeID = Order.EmployeeID_CreateFkChildList (Order_BaseList);
        Order_GroupBy_ShipVia = Order.ShipVia_CreateFkChildList (Order_BaseList);
    }
}
The 'Base' list could be populated using LINQ from the database.

The 'X_By_XID' dictionary, Order_By_OrderID, serves to retrieve an Order object given an OrderID.
Any database column having a unique index (clearly including a single-column PK) gets a dictionary like this.

The 'X_GroupBy_YID' dictionaries hold lists of Order objects corresponding to a particular FK column value. For example, Order_GroupBy_EmployeeID[employeeId] contains a List<Order> of all the orders associated with that particular employeeID.

You can also see the initialisation code from the constructor. The static methods being called were designed only to be used in the initial creation of the dictionaries from the base list.

Don't worry about the second line, the 'foreach', yet.

It may not be clear yet, but all the functionality we created in Part 1 can easily be derived from these lists. As a bonus, we get this bunch of handy lookup lists to use for any other purpose we please.
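For reference, here's a rough sketch of what those generated 'Create' helper methods might look like for the Order class (hypothetical code - the real versions are emitted by the T4 templates):
public static Dictionary<int, Order> OrderID_CreateLookupList(List<Order> baseList) {
    // unique column value -> single object
    var dict = new Dictionary<int, Order>(baseList.Count);
    foreach (Order o in baseList) { dict[o.OrderID] = o; }
    return dict;
}

public static Dictionary<string, List<Order>> CustomerID_CreateFkChildList(List<Order> baseList) {
    // FK column value -> list of child objects
    var dict = new Dictionary<string, List<Order>>();
    foreach (Order o in baseList) {
        if (o.CustomerID == null) { continue; }   // skip rows with a null FK value
        List<Order> group;
        if (!dict.TryGetValue(o.CustomerID, out group)) {
            group = new List<Order>();
            dict.Add(o.CustomerID, group);
        }
        group.Add(o);
    }
    return dict;
}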

Sharing the Cached Data Between Objects

In order to make the cached objects able to access other cached objects, the objects needed to contain a reference to the cache.
This was done by creating an IHasDataCache interface and modifying all data access objects to implement that interface:

public interface IHasDataCache {
    DataCache CachedData { get; set; }
}

public partial class Order: IActiveRecord, IHasDataCache {
    ...

    private DataCache _dataCache = null;

    [SubSonicIgnore]
    public DataCache CachedData {
        get { return _dataCache; }
        set { _dataCache = value; }
    }

    ...
}

public partial class Product: IActiveRecord, IHasDataCache {
    ...
etc
and now we can see the reason for the 'foreach' line back in the DataCache class constructor, remember:
foreach (IHasDataCache hdc in Order_BaseList ) { hdc.CachedData = this; } 
That line loops through each of the just-loaded cached objects and sets a reference to the parent DataCache object. Now the table classes could access the other related classes as they needed to.

Instantiating the Cache

I usually implement a static class 'Gbl' where I put anything that's global to my application, such as cached data ...
public static class Gbl {
    public static DataCache NorthwindDataCache = new DataCache();
    ...
}
The Access Methods

It's time to unveil the actual methods used to access the FK related objects. The naming conventions are a lot less cryptic. Any single FK object is accessed by a method starting with 'FkParent_...', and any list of FK objects is accessed by a method starting with 'FkList...'.
Here are the methods for the Product class:
List<Order_Detail> orderDetail = p.FkList_Order_Detail_ProductID;
List<Product_Category_Map> pcm = p.FkList_Product_Category_Map_ProductID;
Category category = p.FkParent_CategoryID;
Supplier supplier = p.FkParent_SupplierID;
Under the Hood

The beauty of this actually resides in the caching system, and the implementation of the above methods gives insight into how this system actually works:
public List<Order_Detail> FkList_Order_Detail_ProductID {
    get {
        if (_FkList_Order_Detail_ProductID == null) {
            if (_dataCache != null && _dataCache.Order_Detail_GroupBy_ProductID!=null) {
                if (!_dataCache.Order_Detail_GroupBy_ProductID.TryGetValue(_ProductID, out _FkList_Order_Detail_ProductID)) {
                    // deal with the case where there are no related records and hence no list
                    _FkList_Order_Detail_ProductID = new List<Order_Detail>();
                }
            } else {
                _FkList_Order_Detail_ProductID = (from items in Order_Detail.All()
                    where items.ProductID == _ProductID
                    select items).ToList();
            }
        }
        return _FkList_Order_Detail_ProductID;
    }
}
 
public Category FkParent_CategoryID {
    get {
       if (_FkParent_CategoryID == null) { 
           if (_dataCache != null && _dataCache.Category_By_CategoryID!=null) {
               _FkParent_CategoryID = _dataCache.Category_By_CategoryID[this.CategoryID]; 
           } else {
               _FkParent_CategoryID = Category.SingleOrDefault(x => x.CategoryID == this.CategoryID);
           }
       }
       return _FkParent_CategoryID;
    }
}
So it goes something like this:
  • lazy loading means the FK object or list is only fetched once
  • if the object was created in the DataCache class and was populated using the code in the constructor, then it contains a valid reference to the DataCache object
  • a valid DataCache reference is used to load the wanted information from the cached data dictionaries, if present
  • if no cache is present, the data is loaded from the database via LINQ
This system is pretty flexible. It allows ad-hoc mixing of cached and non cached objects in code, supporting optimally efficient access to all cached objects, but database fetches where that's not desired.
Not only that, but we can use LINQ to query the cached objects without generating a database call.

As mentioned in Part 1, the caching took the run-time of the first program I used this with (where all the tables needed to be cached, but I initially just cached the main ones and LINQ queried the rest) from 3.5 minutes to sub-second - literally finishing before the mouse button had moved back up from the click to start the program.

Here are some snippets:
foreach (Supplier supp in Gbl.NorthwindDataCache.Supplier_BaseList) {
    foreach (Product product in supp.FkList_Product_SupplierID) {
        ...
    }
}

if (product.FkParent_SupplierID.Country != "Australia") { overseasSupplier = true; }

tablesAlphaOrder = Gbl.NorthwindDataCache.Product_BaseList
    .Where(x => x.SupplierID > 0 && x.CategoryID != 25)
    .OrderBy(x => x.ProductName).ToList();
The Code

So where can you get your hands on this little beauty ?
The best place is probably in my SubSonic templates branch here.
You only need the T4 Templates.

Automated Foreign Key List Generation in SubSonic 3 : Part 1

This is another post in the series of SubSonic enhancements.
I'm always tinkering with the class templates, trying to get just that little bit more mileage out of them.
This post is going to be a bit long, and a bit complex, but well worth your while.

The Problem

This post deals with a commonly heard request: 
  using foreign key links to connect to related lists of objects
For example, in the good old Northwind database, a product has a single supplier, but can belong to many orders. Wouldn't it be great if we could write something like the following:
Product p = Products.SingleOrDefault(productId);
if (p!=null) {
    string companyName = p.GetSupplier().CompanyName;
    List<Order> orderList = p.GetOrderList();
    ... do something ...
}
Another complication is that often there is a need to cache data before performing complex operations.
For example, we might want to iterate over all orders and carry out some business logic to do with suppliers, which requires stepping through the product level.

We could use a single SQL dataset to handle all three levels at the same time, but then we denormalise the data and lose all the advantages of object-orientedness. 

We could iterate through the orders one at a time and query (by LINQ or SQL) all the related data for each row, which would be programmatically easy, but highly inefficient, generating multiple queries for each row (RBAR, anyone ?).
Time and time again, I'd find myself fetching all three full result sets as lists and writing something like the following:
foreach (Supplier s in supplierList) {
   foreach (Product p in productList) {
      if (p.SupplierID == s.SupplierID) {
         foreach (Order o in orderList) {
            if (o.ProductID == p.ProductID) {
               ... do something with supplier orders ...
            }
         }
      }
   }
}
At least if you're going to do wildly inefficient reiteration - do it in memory !

But wouldn't it be great if somehow we could automatically set up and pre-cache, with just three queries to the database, all of the related objects in their relationships to the other objects?

Read on ....

Stage 1: Creating LINQ lookups

Let's state the goal clearly at the outset. 
A foreign key is a one-to-many relationship.
In the Northwind example above, the supplier is on the one/parent side of the relationship (a product can have only one supplier), and the order is on the many/child side (a product can be in many orders).
So
  • for each parent FK relationship, we want to provide a single object 
  • for each child FK relationship, we want to provide a generic list of objects
In the first draft of the solution, this was achieved by providing two sets of properties. There was no attempt at caching.
For example:
Product p = Products.SingleOrDefault(productId);
Supplier s = p.FK_SupplierID_Supplier;
foreach (OrderDetail orderdet in p.FK_Order_ProductID) {
    ... do something ....
}
The naming convention was a little cryptic, but  FK links need to be named using the table and column involved in the FK relationship (Remember - there can be two differently named FK columns in a table that link to the same external table).
The order of the table names was swapped for the one-to-many and the many-to-one member.

The implementation was like this (this is the end result: the code was actually done in the T4 template to auto generate what you see below):
private Supplier _FK_SupplierID_Supplier;
public Supplier FK_SupplierID_Supplier {
  get {
      if (_FK_SupplierID_Supplier == null) { 
           _FK_SupplierID_Supplier = Supplier.SingleOrDefault(
             x => x.SupplierID == this.SupplierID);
      }
      return _FK_SupplierID_Supplier;
  }
}

private List<Order> _FK_Order_ProductID;
public List<Order> FK_Order_ProductID {
  get {
    if (_FK_Order_ProductID == null) {
        _FK_Order_ProductID = (from items in Order.All()
            where items.ProductID == _ProductID
            select items).ToList();
    }
    return _FK_Order_ProductID;
  }
}
Problems with Stage 1

Programmatically, this solved the problem.  But performance was terrible.
While this approach used lazy loading to avoid repeat queries, it still generated a query to the database for each related record or set of records when first requested.
When I set it up, I was under the impression that LINQ would in fact query the database at load-time rather than run-time. Not so.
To give us a benchmark, the program I first tried this out on ran a procedure involving object hierarchies that took three and a half minutes to run, and hit the database steadily during the whole process.
This was, indeed, a textbook example of RBAR.

Once the caching described in the following steps was implemented, the same program ran in a fraction of a second. I didn't even bother to measure it, because it was so fast.

See Part 2 of 2



SubSonic 3 Automated Enum Generation

As I've said before, I love the SubSonic Project for generating data access classes from my database.

Probably 12 months ago, I wrote an automated enum generator for it, since that was one feature that was missing.
It's a drop-in - it doesn't affect any other functionality or require modifications to existing code.
I thought that it had been incorporated into the SubSonic 3 Templates trunk, but whoops - it hasn't !
So the two template files (Enums.tt and Enums.ttinclude) can be downloaded at my GitHub branch.

I don't think there's any point rewriting the comments at the start of Enums.tt regarding usage, so here they are:
----------------------------------------------------------------------------------------------------
INSTRUCTIONS
----------------------------------------------------------------------------------------------------

Enum Generator Features
-----------------------
 - auto generates enum values from the row data in the tables
 - will generate regular enums for integer values or an 'enum-like' struct for string values
 - a single setting will generate enums for all lookup tables with a standard prefix, with default enum
   name based on the table name
 - the enum name, and the value and description columns used to create the enum can be customised per-table
 - multiple enums can be generated from the same table 
 - a MULTI mode allows automated enum generation from a MUCK (Massively Unified Code-Key) general purpose lookup table
   (BTW MUCK tables are NOT a good idea, but in the tradition of SubSonic, we let you make the choice)

Typical 'integer valued' table:

  CategoryID  CategoryName   
  int         nvarchar(50)   
  ----------- ---------------
  1           Beverages       
  2           Condiments      
  3           Confections     
  4           Dairy Products  
  5           Grains/Cereals  

Typical 'string valued' table:

  State_Str     State
  nvarchar(10)  nvarchar(50)
  ------------  ----------------------------
  ACT           Australian Capital Territory
  NSW           New South Wales
  NT            Northern Territory
  QLD           Queensland
  SA            South Australia
  TAS           Tasmania
  VIC           Victoria
  WA            Western Australia

Typical 'MUCK' table:

  LookupKey                                          LookupVal    LookupDescLong
  nvarchar(50)                                       nvarchar(20) nvarchar(100)
  -------------------------------------------------- ----------   --------------------------
  AssignStatusStr                                    F            Fully
  AssignStatusStr                                    P            Partly
  AssignStatusStr                                    U            Not
  AssignStatusStr                                    X            n/a
  BatchAutoGenModeStr                                E            Assign to existing batch
  BatchAutoGenModeStr                                N            Make new batch
  BatchAutoGenModeStr                                X            Do not assign to batch
  BatchPackStatusStr                                 C            Cancelled
  BatchPackStatusStr                                 L            Locked
  BatchPackStatusStr                                 P            Packing
  BatchPackStatusStr                                 T            Complete


EnumSettings contains a list of enum generation settings.
NOTE: enum Generation uses CleanUp() from Settings.ttinclude to sanitize names so make sure it's up to scratch

FORMAT:   [table name regexp]:[enum name]:[id column name]:[descr column name]:[sql where clause]

 - all params are optional except the first. if omitting an earlier parameter but using a later parameter then 
   still include the ':' as a placeholder

  [table name regexp] = regular expression matching the table name.  Can be just the table name but is advisable 
      to use the end and/or start RegEx markers.

  [enum name] = the name to use for the enum (default=table name + 'Enum')
   - if the enum name is in the format MULTI=[KeyColName] then the key column values will be used to name 
     the enum and to match the blocks of row values to be used for each enum

  [id column name] = the name of the column to use for the enum value (default=PK col)

  [descr column name] = the name of the column to use for the enum description (default=first text col)

  [sql where clause] = where clause to use when retrieving the enum values (default=empty)

EXAMPLES
string[] EnumSettings = new string[]{
 "lk_.+",
 - generates enums from all tables in the database starting with 'lk_' using default names and columns

 "tblLookupVals:AssignStatusEnumStr:LookupVal:LookupDescLong:where [LookupKey]='AssignStatusStr'",
 - generates the named enum from the designated table, using the designated columns and WHERE

 "tblLookupVals:MULTI=LookupKey:LookupVal:LookupDescLong",
 - generates multiple enums from the 'tblLookupVals' MUCK table; one enum for each block of values in column 'LookupKey'

 "lk_State:StateShortEnum:State_Str:State_Str",
 - generates an enum of 'short' state values only

 "lk_State:StateLongEnum:State:State",
 - generates an enum of 'long' state values only

};
Samples of generated enums are shown below. Note that the tool can generate string 'enums', which are a struct of a type I put together after browsing the many proposals on StackOverflow.
namespace MyNamespace { 

 // string enum derived from database rows: libm_ColType.DotNetSystemTypeName, libm_ColType.DotNetSystemTypeName
 public struct DataTypeEnum {
  public const string Int64 = "Int64 ";
  public const string Boolean = "Boolean ";
  public const string Byte = "Byte ";
  public const string Decimal = "Decimal ";
  public const string DateTime = "DateTime ";
  public const string Double = "Double ";

  public string Value { get; set; }
  public override string ToString() { return Value; }
 }

 // enum derived from database rows: libm_DataView.ViewDescr, libm_DataView.DataViewID
 public enum DataViewEnum {
  Table_Default = -81,
  Default = -18,
  None = -16
 }
}
The string 'enum' can be used in two ways. You can declare your variable as string, and simply use the enum const values:
string s = DataTypeEnum.Decimal;
Or you can declare your variable as a DataTypeEnum and use the .Value property to get and set values. This doesn't actually restrict the values used to those in the enum unless you write some more code (you can add it to the template so it autogenerates - a sketch follows below), but it conforms to the style of enums, and if you only ever assign the matching enum constant values, it's equivalent to an enum.
DataTypeEnum x = new DataTypeEnum();
x.Value = DataTypeEnum.Decimal;
string x1 = x.Value;
string x2 = x.ToString();
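For example, the template could be tweaked to emit a value-checking setter along these lines (a hypothetical sketch, not what the template currently generates):
 public struct DataTypeEnum {
  public const string Int64 = "Int64";
  public const string Boolean = "Boolean";
  // ... remaining constants ...

  private string _value;
  public string Value {
   get { return _value; }
   set {
    // reject anything that isn't one of the generated constants
    if (value != Int64 && value != Boolean /* ... and the rest ... */) {
     throw new ArgumentOutOfRangeException("value");
    }
    _value = value;
   }
  }
  public override string ToString() { return Value; }
 }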
Up to you if it's important enough to add that feature. NOTE: I've also added automatic enum generation to SubSonic 2. I'm an admin of that GitHub project, so the enum generation IS part of the main code branch.

Sunday, October 14, 2012

The Death of the Hard Drive - and the Database Server ?

I'll start this post with a prediction: Hard drives are doomed

Actually, not much to argue with there for the informed observer. The new solid state hard drives can run rings around the mechanical version. Oh, they're not quite there yet - they're smaller and more expensive - but ten years from now you can bet that will all have changed.
And solid state hard drives are actually just memory - the only reason that they are packaged as hard drives is because that is the paradigm that we are locked into.

They're not the same memory as we buy for our motherboards. That's dynamic, or volatile, RAM, where the data disappears with the power. The drives use static, or nonvolatile, memory (flash, a descendant of EEPROM). The write cycles take orders of magnitude longer, and there is a limit to the number of write cycles per bit/byte/unit of storage, so the solid state drives have management units to spread the writing load around so as not to exhaust this limit.
Managing the write cycle limit is the one compelling reason to keep this memory as an independent unit, but I suspect that soon someone will find a way around this limitation, and also a way to greatly decrease the write cycle time.
And it's just a matter of time before the capacity increases beyond the ability of mechanics to compete with.

Once these things change, there will be no reason not to connect static memory direct to the processor. Why pipe it through a slow interface when a 64 bit processor can directly address 18 million terabytes ?
The static memory will replace the role of the hard drive and probably some uses of the dynamic memory, and the dynamic memory will remain the fast, volatile work area it is now.

But the IDE/SATA/SCSI-connected box we call a hard drive will be gone.

And this brings us to the corollary of the first prediction:
Database servers as we know them are doomed

And that's because database servers are designed specifically to take memory based structures and store them in detached persistent storage - ie. a hard drive.

Now let's backtrack a fraction.
Database servers perform a lot of valuable and complex tasks, including not just storing the data, but also navigating it and keeping it consistent - transactions and query optimisation being two great examples.
So there's no way I'm predicting that these functions are going to disappear.

ACID is mandatory for any reliable system, and in fact I think this is a great litmus test for how any database system performs: assuming competent database design and indexing, can it handle an ad-hoc, highly complex query on millions of rows and still return results with low latency and good throughput ?
In the case of MS SQL Server and Oracle, the answer is yes (and server clustering is another game altogether, we won't cover it here). But many other database systems fail this test dismally.

But I do strongly believe that there will be a paradigm shift in how the data is actually stored. What concerns me is that none of the standard vendors appear to be gearing up for this.
Most of the serious RDBMSs utilise modelling of the hard disk topology at a very low level, so as to squeeze the maximum performance out of the disk.

So what happens when the hard drive disappears ? Well, the data will be stored persistently, in place, in memory. But we'll still need transactions. We'll still need a query optimiser.
And we'll still need to house the data store on a separate server and communicate with it somehow, whether by API or by SQL.
How is that going to look, and who's planning for this ? As far as I can see, as the Johnny Cash song goes: nobody.

Tuesday, August 14, 2012

Review of .NET Licensing Solutions

In preparation for my new application FlexMerge, which will go on sale soon (plug: at www.arrow-of-time.com !), I've been investigating licensing/code protection options.
This is a Windows class library which needs to implement serial number and license-based protection using an activation server.
I also wanted the ability to automatically send serials from my ASP .NET website immediately on successful checkout.
I scoured the internet, coming up with a whole bunch of likely candidates:

DeployLX (by Xheo) ($700 +)
CryptoLicensing Professional ($300)
Infralution Licensing System ($170 + $170 license tracker source code)
Eziriz Intellilock ($179)
LomaCons InstallKey ($86 source code edition)
LicenseSpot (subscription)
Soraco Quick License Manager ($400)
Manco .NET Licensing System ($200/$480)

To cut to the chase, I ended up buying CryptoLicensing, as it offered equivalent features to any of the other offerings, along with samples of the serial generator API and an activation server, at a very reasonable price.
I bought it bundled with CryptoObfuscator.
I haven't rolled it out yet: watch this space. But in testing it has performed flawlessly.

Immediately discarded was LicenseSpot, as it offers a hosted, monthly subscription based service which looks great for those looking for a simple, hosted solution, but I was pretty sure the integration with my site was going to be a problem. I'd really rather have everything stored in my own database rather than spread over the internet.

Soraco Quick License Manager was also discarded quickly as its feature set appeared to be smaller than most of the other standard offerings. It didn't appear at first sight to support serial numbers, and I didn't have the time or inclination to go looking further.

Eziriz was an early frontrunner, but then I did run across reports of bugs, particularly with obfuscation, and tardy tech support. That resulted in a knockout too.
Note that price is not much of an object for me - with my hourly rate, it was going to cost WAY more to integrate the software than to buy it.

LomaCons InstallKey looked good, and is so cheap that I bought it to look at the source code rather than stuff around with a demo version. However, there were several bugs in the project which cost me time, and it  required IIS Express to work, which I didn't want to install on my Dev machine. In the end I couldn't get it to run, even WITH source.
It had a web based license manager with source code, but this was all written in old school ASP .NET WebForms style, which I have to say makes the gore rise in my throat. I wanted something I could tweak easily, and I would have ended up rebuilding that from scratch.

DeployLX makes a big effort to make their tools user friendly. But this comes with a price tag. This was easily the most expensive option. Since time is money, I was tempted to go with them for a bit, and benefit from a bit of hand holding, but it turned out that the features I wanted would invoke several more license charges over the base charge, and also that their serial generation API has certain license restrictions.
I think they would have worked, but I thought that the licensing restrictions were a bit onerous considering the other offers out there.

Infralution looks like a good solution. You can also pay a bit more for the source code. This one looked comparable to the Crypto product, but had a WinForms-based license manager. Since I'll definitely be managing licenses online, this put me off a bit. All their ancillary tools dovetail into this license manager.
That said, they supply source, so rolling your own web based manager would have been doable and supported. I would have chosen these guys had Crypto not been around.

The Roundup 

All in all, there were several good players, and a lot of choice. It happens that the CryptoLicensing package was well reviewed, had a complete feature set, and also pairs with CryptoObfuscator which also is a plus.
Where I read of problems with licensing tools in general, it was often during obfuscation that this occurred, so buying a matched license/obfuscation pair (Crypto or DeployLX) makes good sense.

I'd highlight that I think most of these packages perform well for what they do, but being a developer myself, I have fairly demanding requirements.
Also, if you are REALLY short of cash, there are some very cheap solutions out there which I think would perform quite satisfactorily if you were willing to put a bit more time into making them perform.

I think the LomaCons comment that sometimes 'less is more' hits the nail on the head. In the end, I wasn't interested in the fluff of the 'all in one' integrated solutions, I just wanted a good solid bare-metal framework I could use for my own purposes. This meant a good feature set and an easy to use API.