Wednesday, December 7, 2011

What agile is about

You are all probably sick and tired of people using the buzzword agile in every possible context and scenario. It has been so overused that trying to redefine it one more time seems like a waste of time, but I don't think it is, because all the sources are slightly confusing. According to them, agile is:
  • individuals and interactions
  • working software
  • customer collaboration
  • responding to change
This is just from the manifesto. From other sources you can infer that it is about:
  • Customer satisfaction by rapid delivery of useful software
  • Welcome changing requirements, even late in development
  • Working software is delivered frequently (weeks rather than months)
  • Working software is the principal measure of progress
  • Sustainable development, able to maintain a constant pace
  • Close, daily co-operation between business people and developers
  • Face-to-face conversation is the best form of communication (co-location)
  • Projects are built around motivated individuals, who should be trusted
  • Continuous attention to technical excellence and good design
  • Simplicity
  • Self-organizing teams
  • Regular adaptation to changing circumstances
  • Collaboration and cultivation of culture
  • Business
  • Getting functionality out
  • Quality and business value
  • Focus
  • and probably a thousand more things like that
However, I think that being agile is about

releasing software when it's good enough

and while I think that all of the points above, from all the sources, make sense and are technically correct, that one point makes all the difference.

1. requirements
The "good enough" approach means that your requirements don't need to be fleshed out to every single detail. That means
  • less time spent on gathering them and discussing what the scope is
  • you need more interaction with the customer
  • you need to find out who your customer really is
You will get constructive feedback when you give the tool to the users, trust me. They will come up with a bajillion improvements to what you delivered the very second they use it. Before that, all the requirements gathering is just guesswork.

2. process
You don't need your testers to try to break everything in the software in every possible way. Being agile means dealing with the most burning issues as opposed to killing every single bug that you possibly can. It means putting a price on every item and picking the ones that make sense both in the short term and in the long term. For instance, you can decide to take on some technical debt if you are solving a problem for the first time, so that you get the functionality out into the wild and get feedback. You need to be aware that the debt accrues, and that you will have to pay it back. A good way to measure this is Sonar, which can actually put a number on your code quality.

3. execution
Fail fast. This is the single most important bit of being agile, I believe. Way too many people don't take it seriously, but this is the essence. If you are doing any sort of proof of concept or prototype work, you need to do as little as humanly possible to either fail or succeed. You need to start with the highest risk and prototype that. After all, what's the point of failing after 3 months of work if you could have failed after a week and spent the rest of the time doing something that brings value?

4. interactions and communication
If you actually follow the rules of scrum (or at least most of them), then you have fewer required paths of communication between the developers and the customers/users. You also communicate with them using working software, and they decide: do you need to extend features that are already there to improve usability/performance/whatever, or do you add extra features? Either way, you release when the software is good enough for the users to accept it.

Friday, August 19, 2011

Git svn switch on windows

This is apparently something relatively obscure, since no one is actually writing about it. Are you people not using git as your only svn client? Anyhow, there are sometimes reasons to move stuff around in the svn repo, which causes horrible things to happen in git:
HTTP Path Not Found: PROPFIND request failed on '/repos/...': '/repos/.. path not found at C:\Program Files\Git/libexec/git-core/git-svn line 4441
This solution seems to be working fine on unix. There is a slight difference on windows of course (remember to escape the slashes):

git filter-branch --msg-filter "sed \"s/old_path/new_path/g\""
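
The slash escaping mentioned above is easy to get wrong by hand. Here is a minimal sketch in Python (not part of the original fix) that builds the sed expression to paste into the --msg-filter command. The function name and the paths are hypothetical, and only `/` and `\` are escaped, so a path containing regex metacharacters or `&` would need extra care:

```python
def sed_expression(old_path, new_path):
    """Build an s/old/new/g expression with slashes escaped for sed."""
    def escape(text):
        # Escape backslashes first, then the "/" used as the sed delimiter.
        return text.replace("\\", "\\\\").replace("/", "\\/")
    return "s/{}/{}/g".format(escape(old_path), escape(new_path))


# Hypothetical repository paths, for illustration only.
expression = sed_expression("repos/old_path", "repos/new_path")
print(expression)
```

The printed expression replaces the plain s/old_path/new_path/g part of the command above.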

Thursday, July 14, 2011

Size of geometry in your postgis table

Here's how to find the total size of the geometry column in a table:

-- On PostGIS 2.2 and later the function is named ST_MemSize.
SELECT pg_size_pretty(CAST(SUM(ST_Mem_Size(geometry_column)) AS bigint)) AS totgeomsum
FROM yourtable;

Wednesday, July 13, 2011

Install Oracle client and SQL*Loader on Ubuntu 11.04

  1. Install the client and verify that it is working
  2. Download the 11g beta version of the database. Take the RPM of course and not .exe ;)
  3. Extract the rpm, then cpio and then contents of the cpio (I know, how insane is that?).
  4. Copy the u01/app/oracle/product/11.2.0/xe/bin/sqlldr to /usr/lib/oracle/11.2/client64/bin/ (or /usr/lib/oracle/11.2/client/bin/ depending on your arch)
  5. Copy the u01/app/oracle/product/11.2.0/xe/rdbms/mesg to /usr/lib/oracle/11.2/client64/rdbms/mesg/

Thursday, June 2, 2011

Loading csv files into postgres from remote machine

Could not find it easily anywhere, so here it goes:
psql -h remote_host -d database_name -U postgres -c "\copy tablename from 'contents_file.csv' with csv header"

Wednesday, March 30, 2011

Releasing from a continuous integration

In the previous post I have described the setup of the continuously integrated system with a hint of problems that you typically encounter when trying to release several components that rely on each other's snapshot versions. Just to give you some idea about this, let's use a drawing:

Now, in this situation we have a releasable component D, which depends on A, B and C; B and C in turn both depend on A. This is of course a trivial example; imagine that there is another component that depends on A. How do you proceed with the releasing (and, probably more importantly, bugfixing) workflow for those? Well, let me outline the regular workflow you would have to follow:
  1. release the lowest level of dependencies (A) with the standard Maven release plugin
  2. change the poms of the intermediate projects (B and C) to point to the just-released version of A
  3. release those
  4. change the version of A in B and C's poms back to the next snapshot version of A
  5. lather, rinse, repeat for just as many levels as you have in your project structure.
That's not really user-friendly, is it? I mean, there is the Maven versions plugin that can automate some of those steps for you, but this still seems odd to use. Now, I found a couple of questions related to the topic, and here's what we came up with:
  • identify releasable parts of the system. Make them as large as possible but not larger. The smaller you make them, the more work you will have to do in the next step
  • release them one by one and actually go through the 5 steps described above. Automate with versions plugin
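
The order in which the modules have to be released follows directly from the dependency graph. Here is a minimal sketch in Python, assuming the A/B/C/D example above and using the standard library graphlib module (Python 3.9+). It only computes the order and does not call Maven; the function name is made up for illustration:

```python
from graphlib import TopologicalSorter

# Module names from the example above: each key lists the modules it depends on.
DEPENDENCIES = {
    "A": [],
    "B": ["A"],
    "C": ["A"],
    "D": ["A", "B", "C"],
}


def release_order(dependencies):
    """Return the modules so that each one comes after all of its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


order = release_order(DEPENDENCIES)
print(order)
```

A is always released first and D last. B and C can go in either order, since they do not depend on each other.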
I think the best way to prepare for a new release would be to actually branch the whole repo when starting the release procedure, so that the rest of the team can continue working on the trunk (the release should be done from the last build that passed the functional tests anyway, I guess). The idea of branches for release candidates seems like a good one too, so that you can make a branch later if you want to create a bugfix release or something of that sort. One of those questions also contained an idea that I found very interesting, namely having the parent pom (or actually the grouping poms, for that matter) in a separate directory that contains links to the other projects' directories via svn:externals. This seems like a really cool solution, but perhaps there are some caveats there too?

Tuesday, March 15, 2011

Continuous integration and deployment using hudson and snapshot versions of artifacts

In my most recent project we are writing a rather large system that consists of at least a dozen components. We started off with true and agile...

big design upfront.

We sat down for a week with the clients and agreed on the interfaces... This seemed counterintuitive, especially from the perspective of agility and just doing the simplest thing that can possibly work. But it turned out to be quite useful. We started with a team of over 20 people who were working their asses off to achieve the goals of the sprint. They were working at full speed from the start because of the head start that the interface definition meeting gave us. I cannot imagine agreeing on those interfaces as we go along, especially if (as in our case) the clients are not in the same office (and not in the same timezone, for that matter). We would never have had any conclusive conversations and the thing would have taken forever to build. With the interfaces defined, everybody could focus on the sprint goal. And the goal of the first sprint was:

an integrated system

To do that we relied on Hudson. Its really cool feature is that it parses the poms of the projects it builds to find out the dependencies between them and automatically constructs build chains from there. So we ended up with a setup in which every commit to SVN triggers a snapshot deployment, which triggers downstream builds and deployments, which triggers building our webapp, deploying it to Tomcat and running functional tests on the deployed instance of the webapp. This is so insanely cool that I want to shout out! Now, I know that most definitions of Continuous Integration just say that the build should be self-testing. They only talk about integrating code, whereas I believe we should not only be integrating code (as in everything compiles and the unit/integration tests pass), but also larger pieces of the whole puzzle's functionality (as in everything compiles + all the parts of the system we are building compile and work together + all the functional tests pass). This bit of integrating small pieces into larger pieces, and those into yet larger pieces, is what continuous integration is all about. But every talk I've ever seen on this concentrated on integrating the code rather than the functionality. Should I mention that pervasive testing is an absolute must to know whether any and all of those pieces work? Now, to be able to do functional tests you need to

continuously deploy

your stuff to the testing environment (be it a database instance, an appserver instance, whatever) and run your functional tests there. In the Maven case this means running everything off a clean repository to eliminate issues with the repo manager and things like that. This is of course simple with Hudson (a.k.a. Jenkins) + Maven and the repo location setting.
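
The build chain idea described above can be illustrated with a small sketch in Python. This is not how Hudson is implemented; it only shows the principle under assumptions: the module names (model, services, webapp) and the function name are hypothetical, and the dependency map stands in for what would be read from the poms. Given a module that changed, it finds everything downstream that needs rebuilding:

```python
from collections import deque

# Hypothetical modules: each key lists the modules it depends on.
DEPENDENCIES = {
    "model": [],
    "services": ["model"],
    "webapp": ["model", "services"],
}


def downstream_of(changed, dependencies):
    """Return every module that directly or transitively depends on `changed`."""
    # Invert the map: for each module, which modules depend on it?
    dependents = {name: set() for name in dependencies}
    for module, deps in dependencies.items():
        for dep in deps:
            dependents[dep].add(module)

    # Breadth-first walk over the inverted graph.
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for module in sorted(dependents[current]):
            if module not in seen:
                seen.add(module)
                queue.append(module)
    return seen


print(sorted(downstream_of("model", DEPENDENCIES)))
```

A commit to the lowest-level module triggers everything above it, while a commit to the webapp triggers nothing further.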
The next part of the puzzle is

Releasing stuff.

This is a major struggle right now. In all the poms we have 0.1-SNAPSHOT dependency versions. This is cool for continuous integration, but it is a pain in the ass for releasing. There is the versions plugin, which can update the versions in the poms to release versions, but in order to actually release the system you need to create the release build chain yourself, by hand. This is suboptimal, and I think that there must be a better solution to the problem. Does anybody have one?

Wednesday, January 12, 2011

Maven is awesome!

As you might have already guessed (or not), I am not a huge fan of Java as a language. But the tooling around it is just fucking amazing. Take Maven for instance. Want to create a webapp? There you go, we have an app skeleton ready-made for you. Want to run the app locally? Sure, use the jetty plugin (or the tomcat plugin if you want to be a real professional ;)). Want to deploy the app to a remote server? Take the cargo plugin and do that. Anything you want to do - there's a plugin for it in Maven. I wish Python had some sort of all-in-one pluggable solution like Maven: dependency management, build, distribution, reporting, almost anything you might imagine connected to your software.