Monday, March 8, 2010

Thoughts about Document Oriented Databases

Over the last two weeks I have read many articles about Document-Oriented databases. They certainly have potential, although they are not a panacea for all our problems.

After so much reading I started to think about scenarios where this kind of database could be applied efficiently. "Efficiently" is too generic a term, so before describing the scenario I would like to clarify what I want to investigate.

I want to divide it into two parts: development and performance. In the development area I want to find out:
  1. The effort needed to create the data access layer;
  2. As the document/object structure changes over time, how much effort does it take to migrate the data? Is migration necessary at all, or can one database hold different versions of a document?
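
To make the second question concrete, here is a minimal sketch of one possible approach: every document carries a version number and older documents are upgraded when they are read, so different versions can live side by side. Everything in it is an assumption made for illustration (the `schema_version` field, the `name` to `first_name`/`last_name` change, the helper names); it is not tied to any particular database.

```python
from typing import Any, Callable, Dict

Document = Dict[str, Any]

CURRENT_VERSION = 2


def upgrade_v1_to_v2(doc: Document) -> Document:
    """Hypothetical change: v1 stored a single 'name', v2 splits it in two."""
    first, _, last = doc["name"].partition(" ")
    upgraded = {key: value for key, value in doc.items() if key != "name"}
    upgraded.update({"first_name": first, "last_name": last, "schema_version": 2})
    return upgraded


# One upgrade step per old version; documents without a version count as v1.
UPGRADES: Dict[int, Callable[[Document], Document]] = {1: upgrade_v1_to_v2}


def load(doc: Document) -> Document:
    """Upgrade a stored document step by step until it reaches CURRENT_VERSION."""
    while doc.get("schema_version", 1) < CURRENT_VERSION:
        doc = UPGRADES[doc.get("schema_version", 1)](doc)
    return doc
```

With something like this, old documents are never migrated in bulk; they are only converted when the application touches them. Whether that is cheaper than a one-off migration is exactly what I want to find out.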

In the performance area I want to find out:
  1. How much time is spent in a single insert?
  2. How much time is spent in a single update?
  3. How many concurrent inserts can I have?
  4. How many concurrent updates can I have?
  5. How fast is a list query of the documents?
  6. How fast is a list query of the documents during a sequence of inserts?
  7. How fast is a list query of the documents during a sequence of updates?
  8. Run all of these performance tests on a single-node DB and on a multi-node DB.
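
The measurements above could be collected with a small timing harness along these lines. This is only a sketch: the `insert` function writes to a plain Python dict that stands in for a real database, so the numbers it produces mean nothing; the helper names `time_call` and `time_concurrent` are my own.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def time_call(operation: Callable[[], object]) -> float:
    """Return the wall-clock seconds taken by one call to `operation`."""
    start = time.perf_counter()
    operation()
    return time.perf_counter() - start


def time_concurrent(operation: Callable[[int], object], calls: int, workers: int) -> List[float]:
    """Run operation(i) for i in range(calls) on `workers` threads; return per-call latencies."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda i: time_call(lambda: operation(i)), range(calls)))


# Stand-in for a real database client (assumption): a plain in-memory dict.
store: Dict[int, dict] = {}


def insert(i: int) -> None:
    store[i] = {"id": i, "title": f"doc {i}"}


single_insert_seconds = time_call(lambda: insert(-1))
concurrent_latencies = time_concurrent(insert, calls=100, workers=8)
```

Swapping `insert` for a call to a real client, and adding similar functions for update and list, would cover the other items of the list.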

With this analysis I expect to gain a better understanding of the Document-Oriented model, both during development and at application runtime.

OK. Now it is time to define the scenario in which I would like to evaluate these parameters. Let's suppose that you have an application built around a document-based workflow.
This document is a set of information offered to your user/client and, as expected, it has a structure (as complex as your business). Let's ignore the workflow part and stick to the document.
  • We need the four basic operations on the document: create, read, update and delete (CRUD), plus a list view of the documents (showing a small subset of the data);
  • All CRUD operations are executed at the document level, not on parts of the document;
  • The whole application uses this base API, so every insert/update is made on the complete document;
  • It is done this way to simplify the API, and because the user interface used to insert/update the information is built at runtime (I can have everything or just a few fields).
Well, this is the base scenario that I would like to investigate, because in my view a Document-Oriented database would fit here much better than a standard relational database.
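
To give an idea of the shape of that base API, here is a rough sketch using an in-memory dict instead of a real database. The class name `DocumentStore`, the method names and the `summary_fields` parameter are all assumptions for illustration; the point is only that every operation takes or returns a complete document, and that the list view exposes a small subset of fields.

```python
import copy
import uuid
from typing import Any, Dict, List, Optional

Document = Dict[str, Any]


class DocumentStore:
    """In-memory stand-in for a document database (illustration only)."""

    def __init__(self, summary_fields: List[str]) -> None:
        self._docs: Dict[str, Document] = {}
        self._summary_fields = summary_fields

    def create(self, doc: Document) -> str:
        """Store a complete document and return its generated id."""
        doc_id = uuid.uuid4().hex
        self._docs[doc_id] = copy.deepcopy(doc)
        return doc_id

    def read(self, doc_id: str) -> Optional[Document]:
        """Return a copy of the complete document, or None if it does not exist."""
        doc = self._docs.get(doc_id)
        return copy.deepcopy(doc) if doc is not None else None

    def update(self, doc_id: str, doc: Document) -> None:
        """Replace the stored document as a whole; there are no partial updates."""
        if doc_id not in self._docs:
            raise KeyError(doc_id)
        self._docs[doc_id] = copy.deepcopy(doc)

    def delete(self, doc_id: str) -> None:
        """Remove the document; raises KeyError if it does not exist."""
        del self._docs[doc_id]

    def list_summaries(self) -> List[Document]:
        """List view: the id plus the configured subset of fields of each document."""
        return [
            {"id": doc_id, **{field: doc.get(field) for field in self._summary_fields}}
            for doc_id, doc in self._docs.items()
        ]
```

A user interface built at runtime would read the whole document, change whatever fields it shows, and send the whole document back through `update`.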

In the next post I will step into the document definition and the user stories that I would like to implement in order to start the investigation.

See you soon.
