jump to navigation

Fluent Thinking – C# Plus SQL Equals CouchDB. Maybe. January 23, 2010

Posted by ActiveEngine Sensei in .Net, ActiveEngine, C#, New Techniques, Open Source, Problem Solving.
Tags: , , , , , , ,
3 comments

I’ve gone too far to turn around
It’s hard to reach for you
When I’m lying face down
I can’t relieve my soul
I’m lost in a moment
Lying face down

Saving Abel

There are moments when life brings too much possiblilty to your attention and your mind catapults into overload.  Take the great potential that CouchDB has, as an example.  Elegant, simple.  Damien Katz has ignited a real fire storm with his work and may have single handedly given rise to the not-a-RDBMS movement.  Sometimes developing under the gun, Sensei has to admit that the 3rd normal form restrictions can be a little inhibiting.   Really, a collection of value pairs will do quite nicely, and XML, when it doesn’t run on for more that 1 page can be quite descriptive, it still is not quite the solution for quick configuration solutions.  JSON is just about right, easy to manipulate, and this is why Damien chose this as the structure for a document in CouchDB.

“But Sensei, I have SQL Server.  We aren’t allowed to say the L-word!”

“You mean Linux, right?  Just checking.”

There may be constraints imposed by the  environment you have at your disposal.And if you have a current system in production you might have a hard justifying to the powers that be that you should download a LAMP server from JumpBox just so you can introduce doucment managment into your current production system. Or perhaps you have an extensive data model and are concerned that that the meta data would too complex to store with the documents.

Here’s a simple yet effective way to introduce a “CouchDB lite” version to a current database platform that is applicable to SQL Server, Oracle, MySql and provides a flexible way to store meta data for quick retrieval.   It’s not badass like Sam Jackson, but it’s not ridiculous like the beast picture next to him either.

The goal stated succinctly:

  • Provide the abilty to associate a document to a record in an existing DBMS
  • Provide the abilty to store the meta data related to a specific record or a document type.
  • Ensure that future meta data and document types will not a change in the cardinality of the database schema nor require the addition of additional database attributes.
  • Provide a object oriented framework that adheres to the Open-Closed principal; that is, open to extension but closed to modification.

Previously Sensei has opined regarding fluency, and this solution adheres to those prinicipals.  For existing RDBMS a single table can introduce the simplicity and clarity you seek.  All document meta data can be stored in a table aptly named Document:

CREATE TABLE [dbo].[Document](
	[DocumentId] [int] IDENTITY(1,1) NOT NULL,
	[UserId] [int] NOT NULL,
	[Name] [varchar](255) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
	[PostedDate] [datetime] NOT NULL,
	[DocumentHash] [varchar](255) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
	[Attributes] [varchar](8000) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
 CONSTRAINT [PK_Document] PRIMARY KEY CLUSTERED
(
	[DocumentId] ASC
)WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]

GO
SET ANSI_PADDING OFF
GO
USE [REIMS_DEV]
GO
ALTER TABLE [dbo].[Document]  WITH CHECK ADD  CONSTRAINT [FK_Document_User] FOREIGN KEY([UserId])
REFERENCES [dbo].[User] ([UserId])

In brief, this table stores an identity key for each document, the name of the document, a foreign key that represents the user / person who saved the document, and the date when the document was posted.  The attribute DocumentHash is a hashkey comprised of the DocumentId, the UserId, the Name of the document, and the PostedDate.  The DocumentHash value is prepended to the document name to create a unique document.  Consider it a way to associate the document with a specific record in the database.  In other words, you should be able to re-create a haskey from the data attributes that matches the value pre-pended to the file name of the document.  Another nice by-product is that you can version documents quite easily.

Focus next on the column named Attributes, as this is where the magic happens.  The meta data for each document is stored here as JSON.  This takes the following form:

[{"Key":"Owners","Value":"79,2,4,79"},
{"Key":"BonusId","Value":"95"}
{"Key":"BonusTypeId","Value":"3"},
{"Key":"DealId","Value":"1035"}]

This JSON serilization of an array of Key/Value pairs is a format that allows you store whatever meta data you wish. Notice the first line named “Owners”. This allows you to store the user id’s of all users who have the ability to delete or edit the document.  You can set security based on this attribute, and implement a document centric security framework that can exist in addition to security place on the share where the document is stored.

As stated earlier, any type of data can be stored in the Attributes column.  If you notice line 2 that reads “BonusId” you see that keys to other tables in your database schema that can saved with the document record.  This feature allows you to maintain the relationship of a document back to a particular record or set or records.

You may be asking yourself why the Key Value nomenclature in the JSON string.  The attributes can be deserialized to an object of type List<KVPair> where KVPair  has the following structure:

namespace BusObj
{
    [Serializable]
    public class KVPair<TKey, TValue>
    {
        public TKey Key { get; set; }
        public TValue Value { get; set; }

        public KVPair(TKey k, TValue v)
        {
            Key = k;
            Value = v;
        }

        public KVPair() { }
    }
}

The Key Value pair lends itself to simple querying.  Since the attributes can vary from document to document, the meta data can include identity keys to records in your existing database schema, information regarding folder structure where the documents can be stored, and what ever other piece of important information of you need to retain.  Once the JSON is deserialized to List<KVPair> you can use lambdas to query your documents.  Consider the following code that is used to determine document ownership where the owners are represented as

{"Key":"Owners","Value":"79,2,4,79"}:
public bool IsDocumentOwner(int userId)
        {
            Enforce.ArgumentGreaterThanZero<int>(userId,
                                                "Document.IsDocumentOwner - parameter userId can not be null");
            //  No attributes?
            if(this.DocumentAttributes.Count == 0)
            {
                return false;
            }

            return DocumentAttributes.Find(x => x.Key == "Owners")
                                                    .Value.Split(',')
                                                    .Select(s => int.Parse(s)).ToList()
                                                    .Contains(userId);
        }

As you can see, multiple users can be designated as an owner of a document. Any operation that is to be performed on a document can be quickly validated against the owner list, so actions such as deleting or displaying document can be quickly validated.  Since the attributes are stored as JSON, you can easily augment existing documents.

The possiblities here are quite exciting. Perhaps you are looking for a quick an easy way to store configuration of objects but XML would prove to be too complicated to maintain. You could create an entry in the Document table with the attributes you wished to store and retrieve quite easily.  Or, you can easily create a categorization system for documents that sit out on shares and you need to correlate these documents to records in a accounting package.  There are numerous things you could do to make your life easier.

Full source code will be posted soon where an in depth discussion of the implementation will continue.

Advertisements

The Economics of Developing iPhone Apps August 6, 2009

Posted by ActiveEngine Sensei in ActiveEngine, software economics.
Tags: , , , ,
4 comments

Sensei has an iPhone and it is indeed a great technological achievement.  It just works.  Another attractive aspect to the iPhone is the lowe priced software available from the App Store.  We have all heard of the stories of the kid who made $40K by creating an app and selling it.   At Coding Horror, Jeff Atwood posted his thoughts regarding the effect of lowering the cost of a software product and how that can create a jump in sales.  In short, the lower priced software makes up for the loss with volume.  starwars-iphone

There are other considerations to be kept in mind before diverting your talent to iPhone application development.  The economics of surviving in that envirnoment are hazardous.   (more…)

The Clock is Ticking December 15, 2007

Posted by ActiveEngine Sensei in ActiveEngine, Coaching, Design Patterns, Mythology, Personal Development, Problem Solving.
Tags: , , ,
add a comment

The forth coming documentary movie Two Million Minutes discusses the changing demographics of our global economy:

Meanwhile, both India and China have made dramatic leaps in educating their middle classes – each comparable in size to the entire U.S. population. Compared to the U.S., China now produces eight times more scientists and engineers, while India puts out up to three times as many as the U.S. Additionally, given the affordability of their wages, China and India are now preferred destinations for increasing numbers of multinational high-tech corporations.

The premise of the documentary is that from 8th grade to high school graduation, student has 2 million minutes to prepare to enter the work force, be productive, fight the good fight to win the prize, bring home the bacon and contribute to society.

How do we as developers, architects, project managers spend our time? Some may contend that expansion of knowledge is the best route, that continual acquisition of skill is the key to remaining on top. The way of Bushido is to constantly refine through the repetition of basics. The life of Josh Waitzkin supports the latter theory, as neural pathways of the grand masters are created through analysis and repetition. Can this be done in 1 million minutes? What ways are we learning? What are the essential components to good design, and are they emphasized enough?

Design patterns come to mind as a kata, or set of instructions that when practiced to a high degree lead to increased performance. Design patterns describe quickly how a problem has been solved, and set expectations as to what is in store for you when you open up the code and read what has been done. When done correctly, design patterns will gain back some of those precious minutes.

But back to China and India. Are we, the software and architect community, too cloistered in our blogs and Alt.Net enclaves to contribute to the reduction of the 2 million minutes? Are we even a part of that 2 million minutes? Think about it.

%d bloggers like this: