LOCJ on IT world

Monday, January 10, 2011

New home!

Happy new year to everyone!

As usually happens, a new year comes with some news. Here it was no different.

This blog has been hosted on Blogspot since the beginning. But because I created a new domain for my other blog, which is hosted on its own domain, I decided to move this one there too.

You will find it with a new layout, but the idea is still the same: posts about news, crazy ideas, comments and the usual soundtrack for the post.

So I hope to see you at http://onitworld.laerteocj.net!

Wednesday, November 3, 2010

For those who will be in São Paulo on the 11th of November, a good idea is to take part in the World Usability Day, which will happen at UMC (Universidade Mogi das Cruzes).

But if you are not in SP and want to participate, don't despair! The World Usability Day is organized around the globe; here you can find a full map with all the places and dates.

See you there!

Today I go with Out of the Silent Planet, from Iron Maiden's Brave New World album.

Tuesday, October 12, 2010

Time! What is time?

Sooner or later, when you are working on a software project, you will face a little element that can cause quite a big mess: time.
When I say time, I am not talking about the time to finish the project, the time to perform a specific task or the software performance; in this case, time is the time element of the data that you are working with. It doesn't matter if it comes as an hour and minute representation, a timestamp or a millisecond counter; sooner or later it will be there, and even if it looks simple it can evolve into an irritating problem.

The usual answer to the statement "We need a timestamp!!!" is to read the milliseconds from the computer clock and store them in the database, but in almost all cases that is not enough. To know whether it is enough, we first need to understand why we need the timestamp: is it used only as information for the user, such as the date and hour when something happened, or does it have a meaning for the system?

If it is only for user information, maybe a timestamp is not needed: you can use date fields (which include the time) and make your user happy. But when it starts to mean something for the system, things get a little more complicated. If you need it for both purposes, I would suggest splitting it: the user information goes in one field (even if it is created by the system) and the timestamp goes in another. This way you have a field with comprehensible information for the user AND a field that can be managed by the system without magic.
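A minimal sketch of this split, in Python. The `OrderEvent` record, its field names and both helper functions are hypothetical, invented only to illustrate the idea: one field carries the date shown to the user, the other carries the value the system orders by.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class OrderEvent:
    description: str
    occurred_at: datetime  # for the user: a readable date and time
    sequence: int          # for the system: used only to order events


def format_for_user(event: OrderEvent) -> str:
    """Render the user-facing date; the system field is never displayed."""
    return f"{event.occurred_at:%Y-%m-%d %H:%M} {event.description}"


def in_system_order(events: List[OrderEvent]) -> List[OrderEvent]:
    """Order events by the system field, ignoring the wall-clock date."""
    return sorted(events, key=lambda event: event.sequence)


# Both events show the same minute to the user, yet the system can order them.
paid = OrderEvent("order paid", datetime(2010, 10, 12, 9, 30), sequence=2)
created = OrderEvent("order created", datetime(2010, 10, 12, 9, 30), sequence=1)
ordered = in_system_order([paid, created])
```

Where the `sequence` value comes from is a separate question; here it is simply assumed to be supplied by some consistent source.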

So the next question is: why does the system need a timestamp? To explain that, I will start with the properties of time; based on those properties, it is easy to identify where it could be used.

"But what, in fact, is time? If one concentrates on the structural aspects, then the prevalent view of time is that of a set of “instants” with a temporal precedence order satisfying certain obvious conditions such as transitivity, irreflexivity, linearity, and density. Interestingly, however, in most cases when we make use of time and clocks we do not need all these properties – for example, digital clocks obviously do not realize the density axiom but are nevertheless useful in many cases."[1]

So basically, time is something that gives you a temporal idea: did A happen before B? Did something happen between A and B? When you want to enable your system to answer that kind of question, a time element is required. The next question is: how do I do it?

The first reaction is to use the machine timestamp, but doing so is a little dangerous, because two events for the same process can happen at the same time (remember that you are working at the millisecond level, and you can easily have more than one event in the same millisecond). Or you could have more than one machine, even in the server environment; today, with clusters and large-scale processing, that is really likely. Imagine two events for the same process happening on different machines: they could end up with the same timestamp, although they must follow the linearity rule.
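A small illustration of the collision, with made-up numbers. The two microsecond readings and the `to_millis` helper are assumptions for the example, not values taken from any real system.

```python
def to_millis(micros: int) -> int:
    """Truncate a microsecond reading to millisecond resolution."""
    return micros // 1000


# Hypothetical readings: A really happened 0.4 ms before B.
event_a_micros = 1_286_877_600_123_100
event_b_micros = 1_286_877_600_123_500

stamp_a = to_millis(event_a_micros)
stamp_b = to_millis(event_b_micros)
same_stamp = stamp_a == stamp_b  # both events collapse into one millisecond

# Sorting by the stored stamp cannot restore the real order: the sort is
# stable, so tied entries simply keep the order in which they arrived.
arrived = [("B", stamp_b), ("A", stamp_a)]
by_stamp = sorted(arrived, key=lambda item: item[1])
```

Once the two stamps are equal, nothing stored in the database says that A came first; the resulting order depends only on how the records happened to arrive.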

I will skip the notation for time-related events (all those: before, after, depends, not depends); the important rule is: if events are time related they should have a sequence, so I should be able to define which one happened first and which one happened after. For that we need a consistent timestamp, and in most cases the easiest implementation is a logical time. That means you have a time that keeps the ordering properties we need (transitivity, irreflexivity and linearity; as the quote above notes, density is usually not required) but at the same time doesn't depend on a real clock.

There are different ways to implement a logical clock; one of the simplest is a Scalar Clock. In a nutshell, a scalar clock creates a timestamp every time it is asked, and each timestamp generated is greater than the previous one; it never creates something backward, always forward (in real life, can you go back in time and do things that you would have liked to, or undo something? No, so in software land it should not be possible either).

You can see it as a counter that always increases. It looks simple and stupid, but such a clock must guarantee that, no matter what happens, it will not create something in the past. Sequential timestamps are not among those guarantees; in fact, it may skip values.
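A minimal sketch of such a counter in Python, for a single process only. The class name `ScalarClock` and the method `next_timestamp` are hypothetical, and the lock is just one possible way to keep concurrent callers from receiving the same value.

```python
import threading


class ScalarClock:
    """Hands out strictly increasing integer timestamps (hypothetical API)."""

    def __init__(self, start: int = 0) -> None:
        self._last = start
        self._lock = threading.Lock()

    def next_timestamp(self, step: int = 1) -> int:
        """Return a value greater than every value returned before.

        `step` may be larger than 1: gaps are allowed, going backward is not.
        """
        if step < 1:
            raise ValueError("step must be at least 1")
        with self._lock:  # one caller at a time, so no duplicates
            self._last += step
            return self._last


clock = ScalarClock()
first = clock.next_timestamp()
second = clock.next_timestamp(step=10)  # jumps ahead, still valid
third = clock.next_timestamp()
```

Rejecting a `step` below 1 is what enforces "never backward"; allowing a `step` above 1 reflects the fact that consecutive values are not part of the guarantee.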

Simple? Well, not that much. If you have multiple machines in your cluster, such a clock should provide valid timestamps for all of them; in the end it shouldn't matter where the timestamp was created, it must respect the constraints.

A bottleneck? Yes, it could be, but according to Peng and Dabek the Percolator scalar clock "serves around 2 million timestamps per second from a single machine"[2], so once well implemented the scalar clock can be a feasible solution.

You should notice that a scalar clock doesn't differentiate between processes, so even unrelated events will have a time relation. When that is not needed or desired, you can start looking at different types of clocks, like a Vector Clock or a Matrix Clock[1][3].
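To give an idea of the difference, here is a rough sketch of a vector clock in Python. It keeps one counter per process, so two timestamps can be compared as "before", "after" or "unrelated". All names (`VectorClock`, `tick`, `merge`, `happened_before`, `concurrent`) and the process ids are hypothetical and chosen only for this example.

```python
from typing import Dict

Timestamp = Dict[str, int]  # process id -> counter


class VectorClock:
    """One counter per known process; a sketch, not a full implementation."""

    def __init__(self, process_id: str) -> None:
        self.process_id = process_id
        self.counters: Timestamp = {process_id: 0}

    def tick(self) -> Timestamp:
        """Record a local event and return a copy of its timestamp."""
        self.counters[self.process_id] += 1
        return dict(self.counters)

    def merge(self, received: Timestamp) -> Timestamp:
        """Record the receipt of a message stamped with `received`."""
        for pid, count in received.items():
            self.counters[pid] = max(self.counters.get(pid, 0), count)
        return self.tick()


def happened_before(a: Timestamp, b: Timestamp) -> bool:
    """True if a <= b on every counter and a < b on at least one."""
    pids = set(a) | set(b)
    no_greater = all(a.get(p, 0) <= b.get(p, 0) for p in pids)
    some_smaller = any(a.get(p, 0) < b.get(p, 0) for p in pids)
    return no_greater and some_smaller


def concurrent(a: Timestamp, b: Timestamp) -> bool:
    """True if neither timestamp precedes the other.

    Identical timestamps (the same event) also return True here.
    """
    return not happened_before(a, b) and not happened_before(b, a)


p1, p2 = VectorClock("p1"), VectorClock("p2")
a = p1.tick()    # local event on p1
b = p2.tick()    # unrelated local event on p2
c = p2.merge(a)  # p2 receives a message carrying a's timestamp
```

In this run `a` and `b` are unrelated, while `c` comes after both, which is exactly the distinction a single counter cannot express.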

In the end, remember: it is important to have consistent timestamps, and most of the time the system timestamp is not enough to conform to the required ordering constraints (transitivity, irreflexivity and linearity). So if you need a timestamp, before doing a getTimestamp look around and check what other options you have.

[1] Mattern, F. Logical Time. Darmstadt University of Technology.
[2] Peng, D., Dabek, F. Large-scale Incremental Processing Using Distributed Transactions and Notifications, p. 6.
[3] Aguilera, M. K., Merchant, A., Shah, M., Veitch, A., Karamanolis, C. Sinfonia: a new paradigm for building scalable distributed systems. In SOSP '07 (2007), ACM, pp. 159–174.

The natural song choice for this post is Time What Is Time, from Blind Guardian.

Wednesday, October 6, 2010

Honey! It's not what it looks like: CODE REVIEW

Today I will start a brand new series of posts: "Honey! It's not what it looks like". The idea here is to present some concepts behind commonly used terms that are sometimes misunderstood, or just not clear enough to give a common ground for a good conversation.

Some confusion can arise when we use those kinds of terms: you may use them expecting one thing and get something completely different. It is like everyday life: you expect something, and instead you get a rain of popcorn on your head, a dog barking or running at you, and so on. The best we can do is find a common ground about the terms, so every time you see "Honey! It's not what it looks like" it will be me trying to discuss some of those terms.

As a starter I will stick with code review. When someone comes to you and says "we need to do code review", what does it mean? What are the expected outcomes of the code review? Conceptually, which software engineering discipline does it belong to?
Let's just set the ground: every word, or better, every group of words, can have a local meaning. I am not saying that those meanings are wrong, absolutely not; I am just saying that when those meanings are not clear, maybe you will get what you expect, but probably you will end up with popcorn on your head.

Let's look at Pressman's[1] definition of a formal technical review (FTR): "A formal technical review is a software quality assurance activity performed by software engineers (and others). The objectives of the FTR are (1) to uncover errors in function, logic, or implementation for any representation of the software; (2) to verify that the software under review meets its requirements; (3) to ensure that the software has been represented according to predefined standards; (4) to achieve software that is developed in a uniform manner; and (5) to make projects more manageable." Let's focus on item 1: the objective is to uncover errors in function, logic or implementation of any representation of the software, and the code is a representation of the software, as much as a class diagram or an architectural design would be.
So from Pressman's definition we can say that a code review is an FTR applied to the code, which is a representation of the software, so from now on we can use the terms code review and FTR interchangeably.
With the basis set (code review is an FTR), we can look at the other objectives of a code review: verify that the software meets the requirements, guarantee software standards, achieve uniform development and make projects more manageable. Well, that is perfect, right? With a simple review I can achieve all of it?

The answer is yes and no, and here is why: in order to achieve the desired output we need a proper input. Take item 2, for example. Does my software meet its requirements? Yes, a review can give you this answer, but to get it you need proper requirements; otherwise, what would I review my code against? So when you ask for a code review and expect item 2 to be addressed, you need to provide the proper input, and if you don't have it, well, then you have a problem.

I will not extend the discussion to the other items, but all of them require proper input to give you a proper output. I am not saying that proper input is a 100% guarantee of a perfect output, but it is a good trail. On the other hand, it is not mandatory to expect all the outputs. You could say that you just want item 3 or item 4 or even something else, but that needs to be said.

The other element you need to consider when you say "I am going to do code review" is how you are going to do it. According to Pressman[1], an FTR "is actually a class of reviews that includes walkthroughs, inspections, round-robin reviews and other small group technical assessments of software. Each FTR is conducted as a meeting and will be successful only if it is properly planned, controlled, and attended". Are you planning to do all of that? If you are not, then you need to understand that not all the outcomes of an FTR will be there for you, or that they may not be valid.

Let's suppose that you don't have a meeting for the code review: how will you achieve uniform development knowledge in your team? An FTR done by different people is a powerful tool to spread knowledge. Sure, we have alternatives like pair programming, but the alternatives should be investigated, and it is absolutely true that you need to do something; it doesn't come for free (again, if you don't do it properly, the dog will bark at you).

There are alternatives to an FTR, or more specifically to an FTR applied to code, such as the pair programming mentioned earlier, and there are others (you will find some references to other articles and web pages at the end of the post).
The important question to consider is: what do I want to achieve with code review? Meeting the requirements, standard code, spreading knowledge, managing my projects, etc. From this point you need to start filling in what those words mean. And remember, experience has shown that a code review done by more than one person, with different knowledge about the problem, is much more effective than a review by a single person (OK, I know that I should put a proper article reference here, so if you want I can bring it).

And before the end I would like to point out another detail concerning the FTR: where it is placed among the Software Engineering disciplines. It is an instrument of Quality Assurance.
Some could say that in XP pair programming is done by the developers and it is a code review; that is true, but it is still a Quality Assurance instrument. Why is that important? Because it shows what kind of purpose this tool has: it is not a testing tool, it is one of the instruments to guarantee the quality of the code within the process. Remember that one of the purposes of Quality Assurance is to guarantee the quality of the product through a better process.
In this case the better process is achieved, for example, by spreading code knowledge across the developers. In the end, while you are doing an FTR you are also improving your software process, but it should be done properly and with this idea in mind.

As promised, here is a small list of articles discussing FTR, mainly code review, plus a website with a checklist for code review[5] (it could be a start for this kind of checklist):

[1] R.S. Pressman, Software Engineering: A Practitioner's Approach, McGraw-Hill, 2001.
[2] T. Stalhane, C. Kutay, H. Al-Kilidar, R. Jeffery, Teaching the Process of Code Review, Proceedings of the 2004 Australian Software Engineering Conference (ASWEC’04), 2004.
[3] A. Vardhan, Learning to verify system, 2006.
[4] A. Harel, Prof. E. Kantorowitz, Estimating the Number of Faults Remaining in Software Code Documents Inspected with Iterative Code Reviews, IEEE International Conference on Software - Science, Technology & Engineering (SwSTE'05), 2005.
[5] Macadamian, Code Review Checklist, http://www.macadamian.com/insight/best_practices_detail/code_review_checklist/, 2010.

So remember that from time to time we need to recycle our dictionary, our common sense. That said, I will leave you with the song Time after Time, from the Brazilian band Dr. Sin.

Monday, October 4, 2010

Is my browser compatible?

Well, this kind of question is quite common (unless you don't care at all about the subject). There are a lot of methods to check whether a browser is compatible, and for sure none of them replaces the testing approach (test in all the browsers that you want to support).

The Browserscope project can help a little in this matter: it has a set of tests that check the compatibility of browsers against the standards, and it publishes reports for the common browsers, so you can get an idea of whether a feature is supported. The tests are divided into 6 groups: Security, Rich Text, Selectors API, Network, Acid3 and JSKB; and you can even run the tests online in your current browser.

It absolutely doesn't answer the question "Is it going to work in all the supported browsers?", but it can give you a hint about things that for sure are not going to work.

We all know that browser compatibility is a problem without a solution in sight, so we must carry on with it (and keep testing). Let's do it listening to Carry On, from the Brazilian band Angra.

Monday, September 13, 2010

And the week begins...

Today is Monday, so instead of talking about any issue, I would prefer to come back to the series "Nooooo it cannot be true, I haven't read it!".

Software Illusion

Code talks

Well, what can I say?

Enjoy the week ;-)

The music is not related to the post, but I was listening to it yesterday, so here it goes: Long Before Rock 'N' Roll, from Mando Diao.

Thursday, September 9, 2010

Gaming contest

Mozilla Labs has launched a call for a game creation competition; the idea is to create games for the Open Web and the browser.

Check it out at Pascal Finette's blog.

See you there ;-)

I read the blog while enjoying the song Kiss With A Fist, from Florence and The Machine.