21 September, 2013

Ramona supervisor: a first year

The first stable release of the Ramona supervisor, version 1.0.0, was published recently, so I guess it is a good time to look back a little and summarise what has been done. I would also like to share a vision of what Ramona can become in the near future.


Motivation


Ramona is neither the only nor the first piece of software meant to supervise an application. Before the first thought of Ramona even emerged, I was a long-time user of another Open-Source product called supervisord. I used it in multiple commercial projects and I found that there is a strong demand for such a software component. However, supervisord's approach is very strange (at least to me). There is nothing wrong with the idea or the software itself, but several times I found myself in a situation where I ran out of supervisord's possibilities or had to finish the configuration of my application in a very weird way. Again and again I was thinking: why is it done this way? It doesn't make sense.

I also noticed one pattern: once you become familiar with the concept of an application supervisor, you start to use it everywhere. I did exactly the same; every new architecture of mine since that time has been built around this concept. It is a great way to unify the 'maintenance user interface' of your application. No more trouble preparing init.d scripts, no more platform-specific scripts, no more questions from application support people about how to start/stop/restart a particular component. There is just one 'command' and it even has a help. And of course, there is much more.

Since I was not satisfied with supervisord, I studied the source code of this product, only to discover that the changes I wanted were just too large, and I didn't want to 'hijack' an Open-Source project. I also wasn't able to find any other feasible alternative. Then two things happened: I finished one project in which we had to implement a 'process roaster' component in a web application server, and I got a contract to build a brand new product, not a web application this time, but a smart application that interacts with users only via SMS and email, based on incoming events from an enterprise bus. It quickly became obvious that the architecture of this new system would consist of several standalone processes cooperating to deliver the required functionality. So … wait a minute … I realised that I had experience from a recent project in how to (and how not to) approach process supervision, I had a very good use-case for such a supervisor, and I also had some portion of a budget to build it. This was a combination that I couldn't resist.

I started the first prototyping work in July 2012 and it quickly turned out that I would not be the only developer. Jan S. joined soon after and he has been actively contributing to the Ramona source code to this day (thank you :-) ).


Architecture


It quickly became obvious that the task of supervising an application is just a neat extension of functions provided by the operating system. This means that with clever choices in the architecture of such a component, you can achieve an elegant, simple design. And that is exactly what I had in mind.

So here is a quick list of these architecture decisions:

Client-server architecture


A user of Ramona will probably not notice that Ramona uses a client-server architecture. But it really does. What you use is the Ramona console - an administrator/maintenance user interface on the command line or through your browser - which represents a unified point of interaction with the Ramona server. The Ramona server typically runs in the background (that is, daemonized) and maintains a list of programs that run as server subprocesses. So you don't need to take care of daemonization; Ramona will do that for you.

Also, the Ramona console can start or go away and your application keeps running independently of these actions. If you are familiar with the UNIX program 'screen', you can imagine Ramona as a 'screen' integrated into your application. This is one of many ways of describing how Ramona operates.

Start/stop scripts (for example init.d ones) just invoke the Ramona console commands 'start', 'stop', 'restart' and others.



Built on POSIX but compatible with Windows too


Ramona is meant to support all major platforms used today in the IT industry. By selecting POSIX as the main OS API, we covered all major UNIX flavours and BSD variations, and we are still close enough to support Windows as well.

This is an excellent match with the technologies that we routinely use in our daily software development and in the production run of the applications we build. Developers tend to use Windows, but there is also Linux and Mac OS X around; production software runs mainly on Linux, but Windows is also selected from time to time.

Today Ramona runs on Linux (Red Hat, Ubuntu, etc.), Mac OS X and Windows. I guess it will run smoothly on FreeBSD etc., but I haven't had a chance to test it.



First-class Open-Source


The Ramona supervisor has been open-source software from day one. I strongly believe that today you can build software for the enterprise segment purely from open-source components and also release it as open-source. So we started developing it on GitHub, published it on Ohloh as well, and we are actively promoting it to the Open-Source community.

Single-threaded, event-driven

Basically, this is a great way to construct simple, uncomplicated yet powerful applications that are polite to system resources. We selected libev / pyev to be the core of Ramona. Even today I have to say that it has been an absolutely correct choice.

This choice collapsed the whole problem space into finding the proper 'events' and writing code that reacts to them correctly. The internal event loop is provided by libev; Ramona spends most of its time sleeping inside a poll() call (or a platform-specific alternative), leaving the system undisturbed for its intended primary use.
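To make the idea more tangible, here is a minimal sketch of a single-threaded, event-driven loop that watches the output of several subprocesses. It is not Ramona code: Ramona builds on libev / pyev, while this sketch swaps in the `selectors` module from the Python 3 standard library, and the function name `supervise` is made up for the illustration. It also assumes a POSIX system, because selecting on pipes does not work on Windows.

```python
import os
import selectors
import subprocess
import sys


def supervise(commands):
    """Run every command as a subprocess and collect its stdout.

    `commands` maps a (hypothetical) program name to its argv list.
    Everything happens in one thread, driven by readiness events.
    """
    sel = selectors.DefaultSelector()
    chunks = {}
    for name, argv in commands.items():
        proc = subprocess.Popen(argv, stdout=subprocess.PIPE, bufsize=0)
        chunks[name] = []
        sel.register(proc.stdout, selectors.EVENT_READ, (name, proc))

    watched = len(commands)
    while watched:
        # The process sleeps here until at least one pipe is readable.
        for key, _mask in sel.select():
            name, proc = key.data
            data = os.read(key.fd, 4096)
            if data:
                chunks[name].append(data)
            else:
                # Empty read means EOF: the child closed its stdout.
                sel.unregister(key.fileobj)
                key.fileobj.close()
                proc.wait()
                watched -= 1
    sel.close()
    return {name: b"".join(parts).decode().splitlines()
            for name, parts in chunks.items()}


if __name__ == "__main__":
    print(supervise({"demo": [sys.executable, "-c", "print('hello')"]}))
```

The whole "supervisor" is one loop around `select()`; adding timers, signals or a console socket would mean registering more event sources, not adding threads.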


Written in Python but agnostic to an application programming language

We decided to write Ramona in Python (2.7+) because we love this language and it makes us very productive. But this places no requirement on the supervised application. We even operate several components controlled by Ramona that are written in Java, PHP, C and other languages.

First year


The vast majority of the source code base was written during August and September of 2012. A steady stream of commits flowed into our Git repository every day and functionality started to become available very quickly.






In October 2012, Ramona was deployed into production with the first project. Since that time, features have been gradually added, and 9 beta versions were released during late 2012 and the first half of 2013. The added features include, for example, an HTTP console, log monitoring and basic email notifications, and the ability to script custom tools that are integrated into Ramona.

The production use of Ramona has been considered a success and a second deployment quickly followed - a customised version of the same system, now serving customers in another country. Since that time, every new system produced by our team has been equipped with Ramona. At the time of writing, Ramona has been integrated into approximately 10 commercial products.

Thanks to this, we received valuable feedback that allowed us to stabilise the product and properly prioritise new features. Finally - after a year of beta - our first public release was published in August 2013.

It was quite fun to build Ramona! I've personally learnt a lot, and this is actually the first time I have published a standalone Open-Source product.


Future


In the short term, we want to release an enhanced notification subsystem; a feature that will actively inform the application administrator via email about issues of the supervised application and its relevant environment. It is based on log scanning and application status monitoring, but it also provides an API that can be extended in various ways, providing creative space for custom monitoring and notifications. The application should, in effect, tell the administrator that it requires some attention.

Ramona already processes the standard streams of the application, so scanning these streams for issues is quite an obvious task for Ramona to deliver. At the time of writing, this feature is already in alpha stage and the first feedback looks quite promising.

As an extension of this concept, various watchdogs can be implemented. These are standalone applications or programs that are launched by Ramona in the very same way as the HTTP frontend, and they implement different checks of the health of the application, the operating system, hardware, network and/or the environment. When an issue is detected, the watchdog program informs Ramona via the API and Ramona produces a relevant notification (or triggers another action based on configuration).

In the long term, Ramona should become able to supervise an application that operates in a cloud environment. This means a multi-node application managed from a single place, allowing smooth addition or removal of nodes, migration of application components across cloud nodes, etc. Implicitly, this will also enable disaster-recovery scenarios completely managed by Ramona.

Physically, this will be done via a network of Ramona servers (one on each node), operating autonomously but also maintaining communication with other nodes, discovering newly added nodes and reacting to nodes that disappear from the network. The architecture of this network is inspired by peer-to-peer networks, mainly to prevent a single point of failure.


Ramona name


People ask from time to time what the name 'Ramona' means and why I chose it for this product. I'm Czech, and there used to be a famous passive radar system produced in the Czech Republic that was able to detect even a stealth aircraft like the F-117. Although this seems like a good way to choose a product name, I was actually not fully aware of this fact. So this is not the real story behind it.

The real story is that I met a girl with this name during the early stage of Ramona development. She came and went away - but her name (or maybe just a nickname, this is not a common name here) was so fascinating (as were our few rendezvous) that I borrowed it when I sought a name for this product.

She will probably never discover that but anyway … thank you.


Links

Ramona supervisor homepage: http://ateska.github.io/ramona/ 

21 August, 2012

Agile in small pills - part 1


Introduction

I'm working for a big international corporation, in a division responsible for providing IT services to the rest of the group. Besides server hosting services and running our corporate network, we also build software solutions. We are organized in such a way that other corporate entities are our customers and we deliver using a standard commercial model, including competition on the market.

My job is to develop software and, after spending nearly 7 years there, also to provide coaching to other developers. Several years ago I adopted the agile method and started using it in my projects with a continuous line of success. Recently, I was given the task of spreading this know-how to the rest of our software development department, which is very conservative and sticks to the classical waterfall method. What a challenging task!

Our software development method is really conservative, based on the traditional steps of analysis, design, code, test and finally release to production. It of course suffers from the typical illnesses: inability to keep the customer satisfied, a huge extent of legacy code, frustrated developers, ... well, all the typical symptoms.

On the other hand, I found agile software development to be a very healthy method of building software. A significant advantage is that it is compatible both with developers' minds and with customer expectations. This is a very rare match, I have to say, and I do believe it is the core of the whole success of agile. Let developers do their work in the way they feel is correct and incorporate customer interaction into such an environment. That is actually very easy and works nicely.

Now what is interesting: it seems that the only people who want to keep the traditional development method in their products are the developers themselves (and of course the connected project/product management). Our customers are completely supportive when it comes to an eventual switch to agile software development, and even our senior management is giving a green light to this transformation. We are actually delivering a small number of our projects in an agile manner and we have already built quite a large set of success stories. But developers are still quite negative when agile is discussed in the scope of their particular product.

So far, I have observed that it is close to impossible to switch to agile software development in a "Big Bang" style; this is actually a well-known experience. You need to administer only small portions of "agile" to the project teams, one for each project cycle. And this is what this series of blog posts will be about - my experience of introducing agile in this very conservative environment, the specific points that I discovered on the way, and recommendations for any eventual followers.

Chapter 1 - First step towards being agile

The first step is always the most difficult one; the same applies here, therefore it is very important to select "the best pill". This step should obviously bring a lot of value, to show the team that implementing agile is actually a win-win situation and that improvements can be expected quite early in the adoption process.

It is not very beneficial to start by teaching project teams whole new agile methodologies such as SCRUM (not saying that SCRUM is bad); it is important to introduce the ideas of Agile first. This is the most important part of the whole transition: a change of spirit, a change of the mindset of the team.

I found that for this purpose, the Agile manifesto is the very best starting point. It summarizes everything important in 4 simple sentences (and, well, recently they added about 12 points with more practical principles). Unfortunately, when you read it, it looks like bullshit bingo. It is so difficult for bright ideas to shine through ...

So you need to identify something more practical for a real first step (just after showing the manifesto - why not give it a chance anyway). And the first step, the first implementation exercise, should then be "iterative development" including reviews with customers. This is the most valuable part of agile. If you implement just this point, you are halfway there and your customer's perception will improve - that is guaranteed.

Iterative development means that you need to introduce several reviews with your customer during your standard project run. A practical example: one of our major products has a traditional release cycle that spans roughly a whole year (including periods of idling, waiting for customer inputs). Usually the development team receives business requirements during spring and builds them until early autumn. Customers are allowed to see the actual implementation only very late, in October, just a few days before the scheduled release to production. A switch to iterative development means that the whole cycle is broken into, for example, 2-week 'iterations', and after each iteration the team demonstrates the result of their work to customers (and other interested people). Of course, it is important for the team to plan the content of each iteration in such a way that they will have something to show.

You will find this difficult - especially in so-called legacy products. Teams will say that it is impossible to demonstrate something after just 14 days (that the installation alone takes 10 days). Well, I said that the first step is the most difficult. Agile software development has the unique feature of identifying weak areas of your product build process that are otherwise hidden (but still present, and you have to pay for them anyway).

The good news is that you don't have to figure out how to do that - the development team usually already knows what it takes. It only needs space to let it happen. And your customer will appreciate this; he will provide his feedback much sooner and the team will still have a lot of time to incorporate all his comments. During this period, you will observe a common mutual understanding spreading across the development team and customers.

So to conclude: the first pill towards agile is to break your project build cycle into "iterations".


To be continued ...

09 August, 2012

PTHREAD_PROCESS_SHARED not supported on Mac OS X - what a shame!

I just re-opened one of my older projects and ran a few inter-process communication tests on my Mac. Well, I failed miserably at the very beginning of this exercise: I discovered that Mac OS X version 10.6.8 (Snow Leopard) (still) doesn't support PTHREAD_PROCESS_SHARED. It is even written in the relevant man page!

This is really a shame on Apple's part: sharing pthreads synchronization primitives across processes unlocks a great deal of innovative inter-process communication techniques. Now it looks like the only way is to use something proprietary/unique to Mac OS X and sacrifice a portion of portability.

:-(

30 July, 2012

SCGI in Apache HTTP Server and in Lighttpd

Brief description

SCGI is one of a few protocols that enable communication between an HTTP server and your application. This is a very useful concept, especially in advanced architectures where the actual application (aka application server) runs as a separate, standalone process beside the HTTP server. This process can possibly run on different hardware, allowing the application to scale easily.

The HTTP server (like Apache HTTP Server or Lighttpd) is responsible for serving static content and dispatching requests to application servers. The HTTP server then acts as the front-end part of the application, the application server is in the middle and an eventual database is at the back-end. An important note is that the HTTP server typically acts as a client of the application server.

You can choose among several protocols for connecting an application server to an HTTP server. A few well-known examples are FastCGI, WSGI and SCGI. There is a whole bunch of others, but I will discuss mainly SCGI here. It is modeled after the CGI protocol, which however works on a slightly different premise: the HTTP server executes a standalone application for each request and the output of this application is sent as a response back to the client. So there is no dedicated application server that is permanently running. This setup has a different performance profile than SCGI but it is still useful in some cases.

SCGI introduces a communication link (TCP/IP or UNIX socket) between the HTTP server and the application server and uses this link to communicate the request and response, structured very similarly to a CGI request and response. The 'S' in SCGI stands for 'simple' and it is well chosen; this protocol is actually very, very simple - its description fits on 2 pages. The protocol is still very powerful, with practically no compromise. It can also be very fast. On the other hand, there are blank spots in the specification, or places that just refer to the CGI standard. Unfortunately, because of this, the available implementations are not 100% compatible, and this can be a source of surprises. For completeness, I have to say that FastCGI shares this concept.
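To show how little there is to the wire format, here is a minimal sketch of an SCGI request encoder and parser. It follows the format from the SCGI specification: the headers travel as one netstring (`<length>:<data>,`) of NUL-terminated name/value pairs, `CONTENT_LENGTH` comes first, `SCGI` is set to `1`, and the body follows the comma. The function names are my own, and the sketch assumes the whole request is already in memory; a real server would read it from the socket incrementally.

```python
def encode_scgi_request(headers, body=b""):
    """Build the bytes an HTTP server would send to the application server."""
    # CONTENT_LENGTH must be the first header and SCGI=1 must be present.
    items = [("CONTENT_LENGTH", str(len(body))), ("SCGI", "1")]
    items += [(name, value) for name, value in headers.items()
              if name not in ("CONTENT_LENGTH", "SCGI")]
    payload = b"".join(
        name.encode("latin-1") + b"\x00" + value.encode("latin-1") + b"\x00"
        for name, value in items
    )
    # Netstring framing: decimal length, colon, payload, comma.
    return str(len(payload)).encode("ascii") + b":" + payload + b"," + body


def parse_scgi_request(data):
    """Split raw request bytes into a header dict and the body."""
    length, colon, rest = data.partition(b":")
    if not colon:
        raise ValueError("missing netstring length")
    size = int(length)
    if rest[size:size + 1] != b",":
        raise ValueError("netstring is not terminated by a comma")
    # Every name and value ends with NUL, so the last split item is empty.
    fields = rest[:size].split(b"\x00")[:-1]
    headers = {
        fields[i].decode("latin-1"): fields[i + 1].decode("latin-1")
        for i in range(0, len(fields), 2)
    }
    body_start = size + 1
    body = rest[body_start:body_start + int(headers["CONTENT_LENGTH"])]
    return headers, body
```

Encoding a POST to `/deepthought` with the body `What is the answer to life?` yields a request starting with `70:CONTENT_LENGTH`, which matches the example given in the SCGI specification.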

SCGI implementations

mod_scgi for Apache HTTP Server

This is a mature module for Apache HTTP Server (2.0+). Development seems to have been stopped for a while, but the module is very stable and performs very well. It works on all major platforms (including Windows; with a little bit of Googling you can find recent Win32 binaries and skip the compilation stage, which is usually complicated on Windows). Compilation on Linux and Mac OS X is smooth (well - sometimes you have to play some nasty games with autoconf and automake). This implementation uses only TCP/IP - there is no support for UNIX sockets.

mod_proxy_scgi in Apache HTTP Server

The SCGI proxy module was added to Apache HTTP Server 2.2 about a year ago, providing a built-in alternative to the external mod_scgi module (described above). This looks quite promising: the module is part of the Apache HTTP Server source code and it is usually distributed with Apache HTTP Server itself, so you don't have to compile anything. On the other hand, there are some deviations from mod_scgi behavior that can make porting an application complicated.

mod_scgi in Lighttpd

There is also support for SCGI in the Lighttpd server - offering a faster and lighter alternative to the quite large Apache project. Very unfortunately, there are a few differences in how the protocol is understood that make use of this alternative quite troublesome, especially if you want some configuration flexibility and "HTTP server platform independence".

Other implementations

There are other implementations of SCGI in various HTTP servers, for example SCGI in nginx, but I haven't tested them so far.

Typical web application mount points

Speaking about SCGI, an important term is the "mount point" of the SCGI protocol (and eventually of the application server). This is the location or logical path in your web where the application server is responsible for serving content. All requests that use this location, or any path that starts with it, are handed over via SCGI to the application server, which is then responsible for providing the response. It is a concept similar to the Unix file system and its mounts.

Based on position and eventually postfix, you can implement various schemes that are then reflected in URL(s) of your web application. These URLs are usually the ones that are displayed in the address bar of browsers and they should be "user friendly"; also SEO requires 'nice' URLs in the application.

Here are few typical ones:

SCGI mount on the root

In this scheme, your application server is mounted on '/' (root) of your web server. Static content is served from a different host (maybe virtual) or from a dedicated sub-location (e.g. /__static).
This scheme is a little bit complicated to configure but provides the best results for singleton applications - URLs are nice and understandable, without any prefixes or postfixes.

SCGI mount on a logical location in the web application structure

The application server is mounted on e.g. the '/node' location, over static content in the document root served by the HTTP server. This scheme tends to produce slightly less attractive URLs than a mount on the root - there is always a prefix in the URL - but it can be useful if you have more applications (or application entry points) on the same domain.

SCGI invoked based on postfix

This is the scheme known from e.g. PHP - the application server serves only requests whose location ends with a given postfix (e.g. '.php'). As there is usually no file associated with an SCGI request, this scheme makes less sense in this context, and application URLs have to contain the postfix, making them ugly and obscure.

Modern applications require good support for the first two schemes; the last one is more or less inherited from history and should disappear gradually.
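The three schemes can be summed up as a small routing decision. The following is only an illustrative sketch of the logic, not the configuration syntax of any particular HTTP server; the function names `under` and `route` and the parameter names are hypothetical.

```python
def under(prefix, path):
    """True if `path` is the prefix itself or lies below it."""
    prefix = prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")


def route(path, mount="/", static_prefix=None, postfix=None):
    """Decide whether a request path goes to the application server
    ("app") or is served by the HTTP server itself ("static")."""
    if postfix is not None:
        # Postfix scheme: only the ending of the location matters.
        return "app" if path.endswith(postfix) else "static"
    if static_prefix is not None and under(static_prefix, path):
        # Root mount with a dedicated sub-location for static content.
        return "static"
    return "app" if under(mount, path) else "static"


# Mount on the root, static files carved out under /__static
print(route("/articles/42", mount="/", static_prefix="/__static"))
print(route("/__static/logo.png", mount="/", static_prefix="/__static"))
# Mount on a logical location
print(route("/node/7", mount="/node"))
# Postfix scheme
print(route("/index.php", postfix=".php"))
```

Note that the prefix check compares whole path segments, so a mount on '/node' does not accidentally capture '/nodes'.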

SCGI request in detail

When a request hits the HTTP server in a location that belongs to a particular SCGI mount or postfix, the server prepares an SCGI request that is passed via the SCGI protocol to the application server for processing. This request consists of a header and a body. The header is very similar to the HTTP header, even containing the same values prefixed by 'HTTP_' (e.g. HTTP_USER_AGENT), plus a series of CGI 'environment variables' containing important dispatching data like REQUEST_METHOD or CONTENT_LENGTH.

The body (if any) has the same content as the HTTP request body. Also, the response produced by the application server for the HTTP server (and indirectly for the client browser) is plain HTTP and is only forwarded to the client (well, in most cases).

For proper dispatching, the application server usually needs to know the location of the request - it is used to determine what functionality of the application server should be launched and/or what content should be sent back to the client in the response. This is comparable to serving static files.
The application server receives this information in a few variables (coming from the CGI standard):
  • PATH_INFO
  • SCRIPT_NAME
  • SCRIPT_FILENAME
  • ... few other
The interpretation of the content of each variable differs greatly from implementation to implementation - and this causes major trouble when migrating an application among these implementations.

Here is one example:

Assuming the application is mounted on '/' (document root) and the request is "http://eiclocal/0p/1p/2p", we can get the following results (the actual result depends on the exact configuration of the HTTP server and its SCGI connector):

mod_scgi (Apache)

PATH_INFO: /0p/1p/2p
SCRIPT_NAME: ''
SCRIPT_FILENAME: not present


mod_proxy_scgi (Apache)

PATH_INFO: /0p/1p/2p
SCRIPT_NAME: ''
SCRIPT_FILENAME: proxy:scgi://127.0.0.1/0p/1p/2p

mod_scgi (Lighttpd)

PATH_INFO: /1p/2p
SCRIPT_NAME: /0p
SCRIPT_FILENAME: /Users/.../.../.../0p


You can see that there is some consistency between the first two connectors (both Apache), but the situation in Lighttpd is quite problematic. Even though the mount point is defined as '/', SCRIPT_NAME is reported as '/0p' (the same as if the mount point were '/0p'). This is weird and it effectively kills the possibility of properly using SCGI with Lighttpd if you need portability and compatibility (e.g. you are the author of a web application server framework or you just want to enable your application to run behind different HTTP frontends).
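One possible workaround on the application side is to ignore how the connector split the path and rebuild it yourself. The sketch below is only that, a sketch: it assumes that SCRIPT_NAME followed by PATH_INFO always gives back the full request path, which holds for the three results listed above but is not guaranteed by any standard, and it assumes the application knows its own mount point from its configuration. The function names are hypothetical.

```python
def request_path(environ):
    """Rebuild the full request path from the CGI-style variables,
    regardless of where the connector put the split."""
    return environ.get("SCRIPT_NAME", "") + environ.get("PATH_INFO", "")


def path_below_mount(environ, mount="/"):
    """Return the path relative to the mount point the application
    was configured with (the Apache-style PATH_INFO)."""
    full = request_path(environ)
    prefix = mount.rstrip("/")
    if full != prefix and not full.startswith(prefix + "/"):
        raise ValueError("request %r is outside mount %r" % (full, mount))
    return full[len(prefix):] or "/"


# The three results from the example, all with the mount point '/'
apache_mod_scgi = {"PATH_INFO": "/0p/1p/2p", "SCRIPT_NAME": ""}
apache_proxy_scgi = {
    "PATH_INFO": "/0p/1p/2p",
    "SCRIPT_NAME": "",
    "SCRIPT_FILENAME": "proxy:scgi://127.0.0.1/0p/1p/2p",
}
lighttpd = {"PATH_INFO": "/1p/2p", "SCRIPT_NAME": "/0p"}

for environ in (apache_mod_scgi, apache_proxy_scgi, lighttpd):
    print(path_below_mount(environ, mount="/"))
```

With this normalization all three connectors hand the dispatcher the same '/0p/1p/2p', at the price of configuring the mount point in two places (HTTP server and application).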

Having been an SCGI user for more than 5 years now, I tend to like the Apache way of interpreting PATH_INFO. It is perfectly logical and works in every case (including the postfix scheme). Unfortunately, this is not standardized and it has already started to jeopardize the SCGI protocol.

Few other interesting topics

There are a few more things that are important for proper and correct use of the SCGI protocol, including static file serving through SCGI and local redirects. I can, and maybe will, write another blog post if there is interest.