Wednesday, October 17, 2012

Automated Foreign Key List Generation in SubSonic 3 : Part 1

This is another post in the series of SubSonic enhancements.
I'm always tinkering with the class templates, trying to get just that little bit more mileage out of them.
This post is going to be a bit long, and a bit complex, but well worth your while.

The Problem

This post deals with a commonly heard request: 
  using foreign key links to connect to related lists of objects
For example, in the good old Northwind database, a product has a single supplier, but can belong to many orders. Wouldn't it be great if we could write something like the following:
Product p = Products.SingleOrDefault(productId);
if (p!=null) {
    string companyName = p.GetSupplier().CompanyName;
    List<order> orderList = p.GetOrderList();
    ... do something ...
}
Another complication is that often there is a need to cache data before performing complex operations.
For example, we might want to iterate over all orders and carry out some business logic to do with suppliers, which requires stepping through the product level.

We could use a single SQL dataset to handle all three levels at the same time, but then we denormalise the data and lose all the advantages of object-orientedness. 

We could iterate through the orders one at a time and query (by LINQ or SQL) all the related data for each row, which would be programmatically easy, but highly inefficient, generating multiple queries for each row (RBAR, anyone ?).
Time and time again, I'd find myself fetching all three full result sets as lists and writing something like the following:
foreach (Supplier s in supplierList) {
   foreach (Product p in productList) {
      if  (p.SupplierID==s.SupplierID) {
          foreach (Order o in orderList) {
             if  (o.ProductID==p.ProductID) {
                ... do something with supplier orders ...
          }
      }
}
At least if you're going to do wildly inefficient reiteration - do it in memory !

But wouldn't it be great if somehow we could automatically set up and pre-cache, with just three queries to the database, all of the related objects in their relationships to the other objects.

Read on ....

Stage 1: Creating LINQ lookups

Let's state the goal clearly at the outset. 
A foreign key is a one-to-many relationship.
In the Northwind example above, the supplier is on the one/parent side of the relationship (a product can have only one supplier), and the order is on the many/child side (a product can be in many orders).
So
  • for each parent FK relationship, we want to provide a single object 
  • for each child FK relationship, we want to provide a generic list of objects
In the first draft of the solution, this was achieved by providing two sets of properties. There was no attempt at caching.
For example:
Product p = Products.SingleOrDefault(productId);
Supplier s = p.FK_SupplierID_Supplier;
foreach (OrderDetail orderdet in p.FK_Order_ProductID) {
    ... do something ....
}
The naming convention was a little cryptic, but  FK links need to be named using the table and column involved in the FK relationship (Remember - there can be two differently named FK columns in a table that link to the same external table).
The order of the table names was swapped for the one-to-many and the many-to-one member.

The implementation was like this (this is the end result: the code was actually done in the T4 template to auto generate what you see below):
private Supplier _FK_SupplierID_Supplier;
public Supplier FK_SupplierID_Supplier {
  get {
      if (_FK_SupplierID_Supplier == null) { 
           _FK_SupplierID_Supplier = Supplier.SingleOrDefault(
             x => x.SupplierID == this.SupplierID);
      }
      return _FK_SupplierID_Supplier;
  }
}

private List<order> _FK_Order_ProductID;
public List<order> FK_Order_ProductID {
  get {
    if (_FK_Order_ProductID == null) {
        _FK_Order_ProductID = (from items in Order.All()
            where items.OrderID == _OrderID
            select items).ToList();
        }
        return _FK_Order_ProductID;
  }
}
Problems with Stage 1

Programmatically, this solved the problem.  But performance was terrible.
While this approach used lazy loading to avoid repeat queries, it still generated a query to the database for each related record or set of records when first requested.
When I set it up, I was under the impression that LINQ would in fact query the database at load-time rather than run-time. Not so.
To give us a benchmark, the program I first tried this out on, ran a procedure involving object hierarchies that took three and a half minutes to run, and hit the database steadily during the whole process.
This was, indeed, a textbook example of RBAR.

Once implementing the caching in the following steps, the same program ran in a fraction of a second. I didn't even bother to measure it, because it was so fast.

See Part 2 of 2



No comments:

Post a Comment