21 August, 2012

Agile in small pills - part 1


Introduction

I work for a big international corporation, in the division responsible for providing IT services to the rest of the group. Besides server hosting and running our corporate network, we also build software solutions. We are organized so that other corporate entities are our customers, and we deliver using a standard commercial model, including competition on the market.

My job is to develop software and, after nearly 7 years there, also to coach other developers. Several years ago I adopted agile methods and started using them in my projects, with a continuous line of success. Recently, I was given the task of spreading this know-how to the rest of our software development department, which is very conservative and sticks to the classical waterfall method. What a challenging task!

Our software development method is really conservative, based on the traditional steps: analysis, design, code, test and finally release to production. It of course suffers from the typical illnesses: inability to keep the customer satisfied, a huge amount of legacy code, frustrated developers, ... well, all the typical symptoms.

On the other hand, I have found agile software development to be a very healthy method of building software. A significant advantage is that it is compatible with developers' minds and with customer expectations. This is a very rare match, I have to say, and I do believe it is the core of the whole success of agile. Let developers do their work the way they feel is correct and incorporate customer interaction into such an environment. That is actually quite easy and works nicely.

Now, what is interesting: it seems that the only people who want to keep the traditional development method in their products are the developers themselves (and, of course, the connected project/product management). Our customers are completely supportive when it comes to an eventual switch to agile software development, and even our senior management is giving the green light to this transformation. We are already delivering a small number of our projects in an agile manner and have built quite a large set of success stories. But developers are still quite negative when agile is discussed in the scope of their particular product.

So far, I have observed that it is close to impossible to switch to agile software development in a "Big Bang" style; this is actually a well-known experience. You need to administer only small portions of "agile" to the project teams, one for each project cycle. And this is what this series of blog posts will be about: my experience of introducing agile in this very conservative environment, the specific points I discovered on the way, and recommendations for any eventual followers.

Chapter 1 - First step towards being agile

The first step is always the most difficult one, and the same applies here, so it is very important to select "the best pill". This step should obviously bring a lot of value, to show the team that implementing agile is actually a win-win situation and that improvements can be expected quite early in the adoption process.

It is not very beneficial to start by teaching project teams a whole new agile methodology such as Scrum (not saying that Scrum is bad); it is important to introduce the ideas of agile first. This is the most important part of the whole transition: a change of spirit, a change of the team's mindset.

I found that for this purpose the Agile Manifesto is the very best starting point. It summarizes everything important in 4 simple sentences (accompanied by 12 more practical principles). Unfortunately, when you read it, it looks like bullshit bingo. It is so difficult for bright ideas to shine through ...

So you need to identify something more practical for a real first step (just after showing the manifesto - why not give it a chance anyway). The first step, the first implementation exercise, should then be "iterative development" including reviews with customers. This is the most valuable part of agile. If you implement just this point, you are halfway there and your customers' perception will improve - that is guaranteed.

Iterative development means that you introduce several reviews with your customer during your standard project run. A practical example: one of our major products has a traditional release cycle that spans roughly a whole year (including periods of idling while waiting for customer input). Usually the development team receives business requirements in spring and builds them until early autumn. Customers get to see the actual implementation only very late, in October, just a few days before the scheduled release to production. Switching to iterative development means that the whole cycle is broken into, for example, 2-week 'iterations', and after each iteration the team demonstrates the result of its work to customers (and other interested people). Of course, it is important for the team to plan the content of each iteration so that it will have something to show.

You will find this difficult - especially in so-called legacy products. Teams will say that it is impossible to demonstrate anything after just 14 days (that the installation alone takes 10 days). Well, I said that the first step is the most difficult. Agile software development has the unique feature of exposing weak areas of your product build process that are otherwise hidden (but still present, and you pay for them anyway).

The good news is that you don't have to figure out how to do that - the development team usually already knows what it takes. It only needs space to let it happen. And your customer will appreciate this: he will provide his feedback much sooner, and the team will still have plenty of time to incorporate all his comments. During this period, you will observe a common mutual understanding spreading across the development team and customers.

So to conclude: the first pill towards agile is to break your project build cycle into "iterations".


To be continued ...

09 August, 2012

PTHREAD_PROCESS_SHARED not supported on Mac OS X - what a shame!

I just reopened one of my older projects and ran a few inter-process communication tests on my Mac. Well, I ended miserably at the very beginning of this exercise: I discovered that Mac OS X version 10.6.8 (Snow Leopard) (still) doesn't support PTHREAD_PROCESS_SHARED. It is even written in the relevant man page!

This is a real shame on Apple's part: sharing pthreads synchronization primitives across processes unlocks a great deal of innovative inter-process communication techniques. Now it looks like the only way is to use something proprietary/unique to Mac OS X and sacrifice a portion of portability.

:-(

30 July, 2012

SCGI in Apache HTTP Server and in Lighttpd

Brief description

SCGI is one of a few protocols that provide communication between an HTTP server and your application. This is a very useful concept, especially in advanced architectures where the actual application (aka application server) runs as a separate, standalone process beside the HTTP server. This process can even run on different hardware, allowing the application to scale easily.

The HTTP server (like Apache HTTP Server or Lighttpd) is responsible for serving static content and dispatching requests to application servers. The HTTP server then acts as the front-end part of the application, the application server is in the middle and an eventual database is at the back-end. An important note is that the HTTP server typically acts as a client of the application server.

You can choose among several protocols for connecting an application server to an HTTP server. A few well-known examples are FastCGI, WSGI and SCGI. There is a whole bunch of others, but I will discuss mainly SCGI here. It is modeled after the CGI protocol, which however works on a slightly different premise: the HTTP server executes a standalone application for each request, and the output of this application is sent as the response back to the client. So there is no dedicated application server that is permanently running. This setup has a different performance profile than SCGI, but it is still useful in some cases.

SCGI introduces a communication link (TCP/IP or UNIX socket) between the HTTP server and the application server and uses this link to communicate the request and response in a way very similar to how a CGI request and response are structured. The 'S' in SCGI stands for 'simple' and it is well chosen: the protocol is actually very, very simple - its description fits on 2 pages. The protocol is still very powerful, with practically no compromise. It can also be very fast. On the other hand, there are blank spots in the specification, or just references to the CGI standard. Unfortunately, due to this, the available implementations are not 100% compatible, and this can be a source of surprises. To provide complete information, I have to say that FastCGI shares this concept.
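To show how simple the wire format really is, here is a minimal Python sketch of what an HTTP server sends over the link. It follows the published SCGI layout (a netstring of NUL-terminated name/value pairs, with CONTENT_LENGTH first and SCGI=1 present, followed by the raw body). The function name encode_scgi_request is my own, for illustration only, and this is not code taken from any of the connectors discussed here.

```python
def encode_scgi_request(headers: dict, body: bytes = b"") -> bytes:
    """Serialize one request the way an HTTP server would send it over SCGI."""
    # CONTENT_LENGTH has to be the first header and SCGI=1 has to be present.
    items = [("CONTENT_LENGTH", str(len(body))), ("SCGI", "1")]
    items += [(name, value) for name, value in headers.items()
              if name not in ("CONTENT_LENGTH", "SCGI")]
    # Every name and every value is terminated by a single NUL byte.
    block = b"".join(
        name.encode("latin-1") + b"\x00" + value.encode("latin-1") + b"\x00"
        for name, value in items
    )
    # The header block travels as a netstring: "<length>:<data>,"
    # and the request body follows right after the comma.
    return str(len(block)).encode("ascii") + b":" + block + b"," + body


if __name__ == "__main__":
    request = encode_scgi_request(
        {"REQUEST_METHOD": "POST", "REQUEST_URI": "/deepthought"},
        b"What is the answer to life?",
    )
    print(request)
```

The whole request is one length-prefixed header block plus the body, which is why an application server can parse it with a handful of lines.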

SCGI implementations

mod_scgi for Apache HTTP Server

This is a mature module for Apache HTTP Server (2.0+). Development seems to have been stopped for a while, but the module is very stable and performs very well. It works on all major platforms (including Windows; with a little bit of Googling you can find recent Win32 binaries and skip the compilation stage, which is usually complicated on Windows). Compilation on Linux and Mac OS X is smooth (well - sometimes you have to play some nasty games with autoconf and automake). This implementation uses only TCP/IP - there is no support for UNIX sockets.

mod_proxy_scgi in Apache HTTP Server

The SCGI proxy module was added to Apache HTTP Server in the 2.2 line (version 2.2.14), providing a built-in alternative to the external mod_scgi module (described above). This looks quite promising: the module is part of the Apache HTTP Server source code and is usually distributed with Apache HTTP Server itself, so you don't have to compile anything. On the other hand, there are some deviations from mod_scgi behavior that can eventually make porting an application complicated.

mod_scgi in Lighttpd

There is also support for SCGI in the Lighttpd server - offering a faster and lighter alternative to the quite large Apache project. Very unfortunately, there are a few differences in how the protocol is understood that make use of this alternative quite troublesome, especially if you want some configuration flexibility and "HTTP server platform independence".

Other implementations

There are other implementations of SCGI in various HTTP servers, for example SCGI in nginx, but I haven't tested them so far.

Typical web application mount points

Speaking about SCGI, an important term is the "mount point" of the SCGI protocol (and eventually of the application server). This is the location, or logical path in your web, where the application server is responsible for serving content. All requests that use this location, or any path that starts with it, are handed over via SCGI to the application server, which is then responsible for providing the response. It is a similar concept to the Unix file system and its mounts.
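The dispatch decision itself can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular HTTP server implements it: the function name is_under_mount is hypothetical, and I am assuming that matching happens on whole path segments (so '/nodes' does not fall under a '/node' mount), which real servers may or may not do.

```python
def is_under_mount(path: str, mount: str) -> bool:
    """Return True if a request path should be handed to the SCGI mount."""
    # Normalize "/node/" to "/node"; the root mount "/" becomes "".
    prefix = mount.rstrip("/")
    if prefix == "":
        # Mounted on the root: every request goes to the application server.
        return True
    # Assumption: match on whole path segments, not on raw string prefixes.
    return path == prefix or path.startswith(prefix + "/")


if __name__ == "__main__":
    for path in ("/node", "/node/42", "/nodes", "/index.html"):
        print(path, is_under_mount(path, "/node"))
```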

Based on the position of the mount and eventually a postfix, you can implement various schemes that are then reflected in the URL(s) of your web application. These URLs are usually the ones displayed in the address bar of browsers, so they should be "user friendly"; SEO also requires 'nice' URLs in the application.

Here are a few typical ones:

SCGI mount on the root

In this scheme, your application server is mounted on '/' (root) of your web server. Static content is served from a different host (maybe virtual) or from a dedicated sub-location (e.g. /__static).
This scheme is a little bit complicated to configure but provides the best results for singleton applications - URLs are nice and understandable, without any prefixes or postfixes.

SCGI mount on a logical location in the web application structure

The application server is mounted on e.g. the '/node' location, over static content in the document root served by the HTTP server. This scheme tends to produce slightly less attractive URLs than a mount on the root - there is always a prefix in the URL - but it can be useful if you have more applications (or application entry points) on the same domain.

SCGI invoked based on postfix

This is the scheme known from e.g. PHP - the application server serves only requests whose location ends with a given postfix (e.g. '.php'). As there is usually no file associated with an SCGI request, this scheme makes less sense in this context, and application URLs have to contain the postfix, making them ugly and obscure.

Modern applications require good support for the first two schemes; the last one is more or less inherited from history and should disappear gradually.

SCGI request in detail

When a request hits the HTTP server at a location that belongs to a particular SCGI mount or postfix, the server prepares an SCGI request that is passed through the SCGI protocol to the application server for processing. This request consists of a header and a body. The header is very similar to the HTTP header, even containing the same values prefixed by 'HTTP_' (e.g. HTTP_USER_AGENT), plus a series of CGI 'environment variables' containing important dispatching data like REQUEST_METHOD or CONTENT_LENGTH.

The body (if any) has the same content as the body of the HTTP request. The response produced by the application server for the HTTP server (and indirectly for the client browser) is also plain HTTP, and it is only forwarded to the client (well, in most cases).
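On the application server side, reading such a request can be sketched as follows. This is a minimal Python illustration that assumes the whole request is already in memory as bytes; a real server would read from the socket incrementally. The name parse_scgi_request is mine and not part of any library.

```python
def parse_scgi_request(data: bytes):
    """Split raw SCGI request bytes into (headers, body)."""
    # The header block is a netstring: "<length>:<data>,".
    length, _, rest = data.partition(b":")
    size = int(length)
    block, tail = rest[:size], rest[size:]
    if not tail.startswith(b","):
        raise ValueError("header netstring is not terminated by ','")
    # Names and values alternate, each terminated by a NUL byte, so the
    # split leaves one empty trailing element that has to be dropped.
    fields = [field.decode("latin-1") for field in block.split(b"\x00")[:-1]]
    headers = dict(zip(fields[0::2], fields[1::2]))
    # CONTENT_LENGTH says how many bytes of body follow the comma.
    body = tail[1:1 + int(headers["CONTENT_LENGTH"])]
    return headers, body


if __name__ == "__main__":
    raw = (b"70:CONTENT_LENGTH\x0027\x00SCGI\x001\x00"
           b"REQUEST_METHOD\x00POST\x00REQUEST_URI\x00/deepthought\x00,"
           b"What is the answer to life?")
    headers, body = parse_scgi_request(raw)
    print(headers["REQUEST_METHOD"], headers["REQUEST_URI"], body)
```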

For proper dispatching, the application server usually needs to know the location of the request - it is used to determine what functionality of the application server should be launched and/or what content should be sent back to the client in the response. This is comparable to serving static files.
The application server receives this information in a few variables (coming from the CGI standard):
  • PATH_INFO
  • SCRIPT_NAME
  • SCRIPT_FILENAME
  • ... and a few others
The interpretation of the content of each variable differs a lot from implementation to implementation - and this causes major trouble when migrating an application among these implementations.

Here is one example:

Assuming the application is mounted on '/' (document root) and the request is "http://eiclocal/0p/1p/2p", we can get the following results (the actual result depends on the exact configuration of the HTTP server and its SCGI connector):

mod_scgi (Apache)

PATH_INFO: /0p/1p/2p
SCRIPT_NAME: ''
SCRIPT_FILENAME: not present


mod_proxy_scgi (Apache)

PATH_INFO: /0p/1p/2p
SCRIPT_NAME: ''
SCRIPT_FILENAME: proxy:scgi://127.0.0.1/0p/1p/2p

mod_scgi (Lighttpd)

PATH_INFO: /1p/2p
SCRIPT_NAME: /0p
SCRIPT_FILENAME: /Users/.../.../.../0p


You can see that there is some consistency between the first two connectors (both Apache), but the situation in Lighttpd is quite problematic. Even though the mount point is defined as '/', SCRIPT_NAME is reported as '/0p' (which is the same as if the mount point were '/0p'). This is weird, and it effectively kills any possibility of properly using SCGI with Lighttpd if you need portability and compatibility (e.g. you are the author of a web application server framework or you just want to enable your application to run behind different HTTP front-ends).
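One possible workaround, sketched here in Python, is to ignore how the connector split the path and rebuild it yourself. In all three results listed above, SCRIPT_NAME followed by PATH_INFO gives the full request path, so the application can concatenate them and split again on a mount point it knows from its own configuration. Both the function name split_on_mount and the idea that the application is told its mount point are my assumptions for this sketch, and it is only checked against the three cases shown above.

```python
def split_on_mount(env: dict, mount: str):
    """Return (script_name, path_info) split on the configured mount point."""
    # Assumption: SCRIPT_NAME + PATH_INFO is the full request path, which
    # holds for the three connector results listed in this post.
    path = env.get("SCRIPT_NAME", "") + env.get("PATH_INFO", "")
    prefix = mount.rstrip("/")  # the root mount "/" becomes ""
    if prefix and not (path == prefix or path.startswith(prefix + "/")):
        raise ValueError("request path %r is outside mount %r" % (path, mount))
    return prefix, path[len(prefix):]


if __name__ == "__main__":
    connectors = {
        "mod_scgi (Apache)": {"PATH_INFO": "/0p/1p/2p", "SCRIPT_NAME": ""},
        "mod_proxy_scgi (Apache)": {"PATH_INFO": "/0p/1p/2p", "SCRIPT_NAME": ""},
        "mod_scgi (Lighttpd)": {"PATH_INFO": "/1p/2p", "SCRIPT_NAME": "/0p"},
    }
    for name, env in connectors.items():
        print(name, split_on_mount(env, "/"))
```

With the mount point given as '/', all three environments come out with an empty SCRIPT_NAME and the whole path in PATH_INFO, which is the Apache-style interpretation.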

Having been an SCGI user for more than 5 years now, I tend to like the Apache way of interpreting PATH_INFO. It is perfectly logical and works in every case (including the postfix scheme). Unfortunately, this is not standardized, and it has already started to jeopardize the SCGI protocol.

Few other interesting topics

There are a few more things that are important for proper and correct use of the SCGI protocol, including static file serving through SCGI and local redirects. I can, and maybe will, write another blog post if there is interest.