Forum Experiment, part one

I now have a database of forums available, at this location: http://myinfo-scs.appspot.com .  This is based on code located here: https://github.com/kjk/fofou and described in more detail here: http://blog.kowalczyk.info/software/fofou/ .

I’ve also been looking at DjangoBB for something similar, as a single-forum package from which I could probably then easily abstract to multiple forums.  DjangoBB is rather more polished than fofou and, of course, is written in the Django framework, which is probably a bit more solid than a free-for-all implementation like the one above.  The packages I am using, apart from Django Bulletin Board, are:

The price of the latter, it would seem, is the difficulty of getting the thing to work.  In fact, I have since discovered that the only way to make the program work properly is to rewrite either djangobb (hard) or django-nonrel (very hard!).  Although it is certainly instructive to have a bit of a fumble with the djangotoolbox and django-nonrel code, there are inherent limitations in NoSQL that make a full solution to some of these problems more or less impossible.  Many of the pluggable components for Django (such as djangobb) implicitly rely on relational queries, which makes it very difficult to run them on a NoSQL datastore such as the one behind GAE.

There are several approaches one can then take.  One is to use Google Cloud SQL, which is not horrendously expensive and where, I believe, a single instance can be shared by multiple apps.  Another is to use alternative forum techs, such as fofou, or another which I discovered recently, gforum.  The latter is otherwise relatively promising, but it comes with a caveat: its user login widget is hungry for user information.  So I am currently looking at defanging that particular piece of code.  For the adventurous/curious, the googlecode repository is here: http://code.google.com/p/gforum/source/checkout .

The third approach is to somehow solve the NoSQL many-to-many field problem and then incorporate the solution into django-nonrel.  Apparently this is regarded as one of those ‘untouchable’ or overly ambitious problems in computer science.  But apparently it is possible to solve it (although the answer is currently not open-sourced), as per this quite recent announcement (September 2012) at http://fatfractal.com/prod/joins-and-nosql/ :

I’m an engineer and not usually given to making sweeping statements like, “we’ve solved the many-to-many relationships problem for NoSQL;” but in this case, I hope you’ll agree, it’s merited.

FatFractal is another engine like GAE.  Apparently the engine solves the problem before the data is loaded into a NoSQL database, which I presume is something like MongoDB or like the Datastore that App Engine uses.  Regardless, simply knowing that the problem can be solved wins half the battle.  Indeed, if FatFractal’s claim is true, the problem is not impossible, so presumably it is only a matter of time before someone independently discovers how to do it and the knowledge becomes public, and thereby applicable to the current django-nonrel distribution / github project (currently at version 1.5 development, 1.4 stable), here: https://github.com/django-nonrel/django/tree/nonrel-1.5-beta .

But until this happens, I think the best strategy for me, and for other non-experts like myself, is to use alternative technologies to Django (as above) to work on specific applications.  The biggest advantage of Django (apart from the built-in admin) is the pluggability of its components, much as with Ruby on Rails, and on App Engine more or less everything breaks and therefore needs to be rewritten (if the data models allow it).  Consequently one might as well use web2py, which is fully supported there.  I’d certainly like to learn a bit more about how SQL and NoSQL model data, though; if nothing else, this would be quite instructive.

As to the question: why NoSQL, if it makes joins so abominably difficult?  The quick answer is speed.  NoSQL stores are not so much a stripped-down SQL as a different trade-off: they give up relational features such as joins, and in return are faster for simple lookups and more horizontally scalable (apparently).  That makes them better suited to applications / services that need colossal amounts of data (e.g. location-specific information, weather data).  Relational (SQL) databases are harder to scale out in that way, so they are more suitable for applications where the data is somewhat more limited, but more highly interwoven and connected (e.g. forums, blogs).
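To make the join difficulty a little more concrete: the usual workaround on a store without joins is to keep each relationship as its own record, the equivalent of a SQL join table, and to index it in both directions so that each lookup is a single key read.  What follows is only a minimal sketch under that assumption.  Plain Python dictionaries stand in for the datastore, and the `LinkTable` class and the user and topic names are hypothetical, not taken from fofou, DjangoBB, django-nonrel or FatFractal.

```python
from collections import defaultdict


class LinkTable:
    """Hypothetical in-memory stand-in for a 'join table' in a key-value store."""

    def __init__(self):
        # Two indexes over the same set of links, so that a lookup in
        # either direction is a single key read rather than a join.
        self._topics_by_user = defaultdict(set)
        self._users_by_topic = defaultdict(set)

    def link(self, user_id, topic_id):
        # Every write has to update both indexes to keep them consistent.
        self._topics_by_user[user_id].add(topic_id)
        self._users_by_topic[topic_id].add(user_id)

    def unlink(self, user_id, topic_id):
        self._topics_by_user[user_id].discard(topic_id)
        self._users_by_topic[topic_id].discard(user_id)

    def topics_for(self, user_id):
        # .get() avoids creating empty entries for unknown keys.
        return sorted(self._topics_by_user.get(user_id, ()))

    def users_for(self, topic_id):
        return sorted(self._users_by_topic.get(topic_id, ()))


subscriptions = LinkTable()
subscriptions.link("alice", "gae-help")
subscriptions.link("alice", "django-nonrel")
subscriptions.link("bob", "gae-help")

print(subscriptions.topics_for("alice"))    # ['django-nonrel', 'gae-help']
print(subscriptions.users_for("gae-help"))  # ['alice', 'bob']
```

The cost is visible in `link` and `unlink`: the application has to maintain both directions itself on every write, which is the sort of boilerplate a relational database handles for you.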


One Response to “Forum Experiment, part one”

  1. Gary Casey Says:

    Hi,

    Thanks for the mention of FatFractal above – I wrote the post you referred to. I’m not claiming any great breakthrough in computer science here 🙂

    What I was describing in the post is how FatFractal’s Backend-As-A-Service product (which is called NoServer) provides a solution for developers using our platform. Anyone can build support for many-to-many relationships, as I described in my post – essentially, one creates the equivalent of a SQL join table to hold the relationship.

    What we’ve done in NoServer is to build in that support on top of a NoSql core, so that every developer using NoServer didn’t have to re-invent it. Having that support built in also eliminates the need for the developer to write a *lot* of boilerplate code in order to build those relationships themselves.

    Hope that helps!

    – Gary
