Data Structures in SQL vs NoSQL – with thoughts to django-nonrel

So I’ve been thinking again about the problem of SQL vs NoSQL joins. For background, I am interested in how one could write another Python package for the django-nonrel project, such that when a Django project asks for a many-to-many query while running on a NoSQL backend like MongoDB, the query is routed to that package. The package would then construct the data structures required to support such a query on the NoSQL store (i.e. work out and build the necessary tables, or their equivalents), and manage the interface between Django’s data syntax, which assumes a standard SQL database, and the mapping so constructed. The application would then be essentially indistinguishable from one running on base Django with, say, a MySQL backend. That in turn would let one use any Django plugin currently available without having to rewrite its code (i.e. without directly altering its implicit data model(s) or data organisation, as described in its file(s)).
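To make the target concrete, here is a minimal sketch of what a many-to-many relation looks like on a SQL backend, using Python’s built-in sqlite3 module. Django stores a many-to-many field as an intermediary join table, and a many-to-many query becomes a join through it. The table names, column names and the helper `books_by` are purely illustrative assumptions, not anything taken from django-nonrel.

```python
import sqlite3

# In-memory database; schema and data are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT);
-- The join table: one row per (book, author) pair.
CREATE TABLE book_authors (
    book_id INTEGER REFERENCES book(id),
    author_id INTEGER REFERENCES author(id),
    PRIMARY KEY (book_id, author_id)
);
INSERT INTO author VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO book VALUES (10, 'Joins'), (20, 'Documents');
INSERT INTO book_authors VALUES (10, 1), (10, 2), (20, 1);
""")


def books_by(author_name):
    """Return the titles of all books by the named author, sorted."""
    rows = conn.execute(
        """
        SELECT book.title
        FROM book
        JOIN book_authors ON book_authors.book_id = book.id
        JOIN author ON author.id = book_authors.author_id
        WHERE author.name = ?
        ORDER BY book.title
        """,
        (author_name,),
    )
    return [title for (title,) in rows]
```

The database does all the work here: a single statement walks from author to join table to book. It is exactly this step that has no direct counterpart on a backend without joins.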

Hence I’ve done some research, and found the following post on the MongoDB blog. For comparison, here is a post describing a simple data structure for pure SQL. The problem then becomes quite clear. In MySQL or SQLite, tables can be joined inside the database on their key columns, and a many-to-many relation is stored as a separate join table of key pairs. Document stores like MongoDB or CouchDB were not built around SQL-style joins. Every MongoDB document has an `_id` field as its primary key, but a relation between documents in different collections has to be represented explicitly, either by embedding one document inside another or by storing references to the other documents’ ids, for instance in a separate collection that associates the key values of the two sides. The join itself then has to be carried out in application code. Note that this is not a matter of normal forms: MongoDB is schemaless and does not require any particular normal form, and if anything document stores encourage denormalised, embedded data, whereas normalisation is a relational design practice. Presumably giving up joins and cross-table constraints is part of what allows such stores to scale out more easily, and consequently to be a platform of choice for big data type services.

So this reduces the NoSQL many-to-many django-nonrel problem to the following: given a query that is essentially predicated on the assumption of a relational join, how can one build boilerplate code to do the behind-the-scenes switch, and create data structures on the NoSQL side (reference lists or a link collection) representing the same information (or query) as the SQL original?
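As a rough sketch of what such a translation layer might do, the snippet below emulates a many-to-many join without any database-side join, using a separate association collection of key pairs. Plain Python dicts and lists stand in for the collections so that the example runs anywhere. The collection names, field names and the helper `books_by` are hypothetical, and this is only one possible design, not how django-nonrel actually works.

```python
# Stand-ins for three collections; all names and data are hypothetical.
authors = {
    1: {"_id": 1, "name": "Ann"},
    2: {"_id": 2, "name": "Bob"},
}
books = {
    10: {"_id": 10, "title": "Joins"},
    20: {"_id": 20, "title": "Documents"},
}
# The association collection: one document per (book, author) pair.
book_authors = [
    {"book_id": 10, "author_id": 1},
    {"book_id": 10, "author_id": 2},
    {"book_id": 20, "author_id": 1},
]


def books_by(author_name):
    """Emulate the SQL join as three separate lookups in application code."""
    # Step 1: find the ids of the matching authors.
    author_ids = {a["_id"] for a in authors.values() if a["name"] == author_name}
    # Step 2: follow the association documents to the book ids.
    book_ids = {
        link["book_id"] for link in book_authors if link["author_id"] in author_ids
    }
    # Step 3: fetch the books themselves and return their titles, sorted.
    return sorted(books[book_id]["title"] for book_id in book_ids)
```

Against a real backend each step would be its own query (with MongoDB, for example, something like a `find` with an `$in` filter on the collected ids), so one SQL join turns into several round trips. An alternative design would store a list of author ids directly on each book document, which saves one lookup at the cost of keeping those lists up to date.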

