Thursday, October 21, 2010

nXhtml: an Emacs mode for web development

I'm a fan of Emacs but I've often been frustrated when using it to edit PHP files. Although the default Emacs major mode for PHP generally did an okay job with "pure" PHP files (i.e. those only containing PHP code), I would still find myself occasionally struggling against Emacs's insistence on automatically mis-formatting my code. And when the files contained a mixture of PHP and HTML (commonplace in web applications), the auto-indentation could get quite deranged.

Programming is difficult enough sometimes without having your editor working against you as well - in fact for a while on Windows I even switched to using Notepad++ (which solved the problem by not having any auto-indentation at all). The problem with Emacs is two-fold: first, I felt I needed a better PHP mode than the default; and second, it needed to be able to deal with "mixed modes" (when two major modes - in this case PHP and HTML - are used in a single file).

Initially php-mode looked promising, working really nicely with pure PHP - but unfortunately the solution for handling embedded HTML (to toggle manually between PHP and HTML modes as required using the M-x html-mode/M-x php-mode sequences) didn't feel that practical to me.

However the documentation for php-mode also suggested nXhtml, which describes itself as "an Emacs mode for web development". nXhtml includes a version of php-mode along with support for mixed modes, and it does a great job of handling PHP both on its own and with embedded HTML. The automatic formatting is nicely behaved so it works for rather than against me, and the syntax highlighting also distinguishes PHP code sections from those containing HTML. In addition there are some other useful-looking features (such as tag completion) that I haven't really explored yet.

In short, nXhtml looks like the solution I was looking for all along. It's a straightforward install: on Linux, download the nXhtml.zip file, extract the contents to e.g. $HOME/nxhtml/, then edit $HOME/.emacs to include:

(load "your-path-to/nxhtml/autostart.el")

For Windows it's even easier: you can download and install a version of Emacs bundled with nXhtml (note, however, that I couldn't get nXhtml to work with an existing XEmacs 21.4 Windows install).

Although I haven't worked with it extensively yet, so far I've found nXhtml a great improvement for working with my web application code, and I'd recommend that anyone interested in using Emacs for PHP development give it a try.

Wednesday, October 13, 2010

Learning Eclipse: Total Beginners Tutorials

A couple of weeks ago I decided to start learning about the Eclipse IDE, which looked both powerful and daunting on the previous occasions I'd seen it. Luckily around the same time I also stumbled across the free video tutorial Eclipse and Java for Total Beginners by Mark Dexter, which gives an excellent hands-on introduction to Eclipse by leading the viewer through the development of a simple Java application.

There's a substantial amount of material in the 16 videos - each is around 12 to 16 minutes long, but pausing playback to follow along on my own machine effectively doubled the running time, so the total length was more like 8 hours for me (roughly a day's worth of training). Although the tutorial is based on Eclipse 3.3 (it's dated 2007) and is running on Windows, I didn't see any significant differences compared to using Eclipse 3.5 (the version fetched via Synaptic) on Ubuntu Linux 9.10.

The tutorials begin with Mark outlining the main features of the Eclipse workbench (i.e. the main Eclipse window, containing various subwindows such as the editor, package explorer, console and so on). He quickly progresses to writing Java code, showing how Eclipse helps by giving rapid feedback as you type, and then introduces two powerful features, code assist and quick fix:
  • Code assist is invoked using ctrl+space and provides a list of possible completions based on the current context, from which the programmer can select the appropriate one. (Code assist can also insert a template - for example, when creating a new method - which gives the programmer placeholder fields with hints.)
  • Quick fix is invoked using ctrl+1 (or by right-clicking) on an error that has been identified by Eclipse. It then provides a set of suggestions (or "proposals") as to how that error could be fixed (for example, offering to correct a mis-typed variable, or to create new classes or methods). Accepting a proposal performs the correction automatically.
These two features made me feel like I had my very own Java expert to help me whenever I needed, and with practice I found in some cases it really sped up coding - especially when combined with JUnit, the next major component that Mark talks about.

JUnit is Java's unit testing framework, and its tight integration into Eclipse makes it easy to adopt a test-driven development (TDD) approach (where the test cases for new classes and methods are written before the classes and methods themselves are implemented). As well as providing a way of checking that the initial implementations are correct, the test cases also provide a detailed specification of how the implementation should behave.

Writing the test cases first means that they actually get written, which is significant enough on its own (writing tests after the code always feels like a major chore). However the whole test-driven approach steps up a gear once it's coupled with the quick fix mechanism, as it enables the following workflow for rapid development:
  • Write the initial JUnit test cases for a new class and its methods.
  • Eclipse will identify the missing class and methods as errors, and stubs can be automatically created using quick fix.
  • Fill out the stubs with real code, running the unit tests as required to verify that the code is working properly.
This might not sound like much but when put into practice it felt like something of a revelation to me - essentially once the tests are written you can quickly generate the "scaffolding" code for the implementation and then spend most of your time writing the code that's specific to your application.
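The workflow above can be sketched in miniature. The Counter class and its test are hypothetical examples of my own, and a plain assertion stands in for JUnit's assertEquals so the sketch is self-contained:

```java
// A miniature sketch of the test-first workflow, using a hypothetical
// Counter class. In Eclipse the test would live in a JUnit test case and
// the stub would be generated by quick fix; plain Java stands in here.
public class CounterTddSketch {

    // Step 1: write the test first, against a class that doesn't exist yet.
    // At this point Eclipse flags Counter as an error...
    public static void testCounter() {
        Counter c = new Counter();
        c.increment();
        c.increment();
        if (c.getCount() != 2) {
            throw new AssertionError("expected 2, got " + c.getCount());
        }
    }

    // Step 2: ...and quick fix offers to create the missing class and
    // methods as empty stubs.
    // Step 3: fill out the stubs with the real implementation.
    public static class Counter {
        private int count = 0;

        public void increment() {
            count++;
        }

        public int getCount() {
            return count;
        }
    }

    public static void main(String[] args) {
        testCounter();
        System.out.println("test passed");
    }
}
```

With the real JUnit integration, rerunning the whole test suite after each change is a single click, which is what makes this loop so quick in practice.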

Eclipse makes it easy to rerun the tests and identify errors, and Mark shows how it also helps to catch regressions later on when the code is modified or extended (if the tests suddenly stop working then the programmer knows that either they broke the code, or that the tests need to be updated). The process remains the same: first update or extend the test cases, then update the code (using quick fix wherever possible to make the job easier). At this point he also introduces some of Eclipse's refactoring tools to help (for example, automatically turning local method variables into class fields, or extracting blocks of code into new methods).

There are a lot of other useful extras that Mark throws in along the way but which I won't mention here. However I will make an exception for one gem: the scrapbook feature essentially allows you to evaluate snippets of Java code without having to write a full program (with one caveat: code assist doesn't always work in the scrapbook). It's a useful but oddly well-hidden feature (right-click on the current project, select "New..." then "Other...", expand the "Java" group and then the "Java Run/Debug" group - you get the idea), giving you the Java equivalent of firing up an interactive interpreter to test out fragments of code in Python.

In conclusion, I really enjoyed working through these tutorials and definitely learned a lot more by performing the steps alongside watching the videos (especially with test-driven development, an unexpected high-value bonus). There's a lot of material but the pacing and structure are spot-on, Mark's delivery is relaxed and engaging, and Eclipse turns out to be fun to use too. So although there's a lot more to learn about Eclipse (and Java!), I'd really recommend this if you're new to the system and want help getting started (and then follow it up with the much shorter Eclipse Workbench Tutorial, to learn more about manipulating the IDE itself). Great stuff.

Monday, October 11, 2010

PHPNW10

Last weekend I was in Manchester at PHPNW10, the annual conference for the north-west of England PHP community. It was a fairly last minute decision to attend but looking at the conference programme persuaded me that I'd be rewarded with lots of good material, and I wasn't wrong - it's going to take me a while to process the volume of quality information from the talks I attended.

With that in mind I won't try to do more here than just summarise the sessions I was in on Saturday, kicking off with the keynote talk by Lorna Mitchell on professional development ("Teach a man to fish"). While at first this might have seemed a little out of place amidst all the technical content, it was entirely appropriate given that many people (including me) come to these events to learn. Much of the focus was on benchmarking and improving the skills of your team as a whole (making the point that skills gaps only exist in the context of knowing what skills are actually needed), but emphasis was also placed on individuals taking responsibility for their own professional development. There was some solid practical advice for making the business case for training to managers, and suggested alternatives to formal training courses (for example, allowing developers "study days" on company time) if you have "a training budget of zero." And if you're working on your own then the suggestion to make your own "team" sounds like a good one to try.

After the keynote the sessions split into three parallel "tracks" for the remainder of the day, and in general I chose to focus on those talks on basic development tools and methodology (with a plan to catch the talks I missed about geolocation, REST interfaces and so on later online).

Things were off to a great start with Robert Mortimer's talk "Let your toolchain set you free", which dealt with choosing and using appropriate tools for setting up your development environment (Linux-Apache-MySQL-PHP i.e. LAMP), source code control (Subversion), performing code validation (PHP_CodeSniffer), enforcing coding styles (PHP_Beautifier), generating documentation (docblockgen), unit testing, debugging (Xdebug) and developing using an IDE (NetBeans). Accompanied by live demos and full of little tricks to automate things, it was a dizzying talk.

I followed that with Ian Barber's talk "Debugging - Rules and Tools". At the beginning Ian confessed to being something of a debugging nerd and described the satisfaction he felt when tracking down bugs in an application - something I identified with from my former life in software maintenance. His presentation was structured around the nine rules outlined in David J. Agans's book "Debugging", giving practical advice and suggesting useful tools and resources for each rule. As with the earlier talk there was an almost overwhelming number of tips and tricks (a small number of examples: JMeter for load-testing; Selenium for simulating user interactions; MySQL Proxy to isolate your database; Tamper Data to modify HTTP requests...). I also learned a new word, "heisenbug", to describe an intermittent fault.

After lunch and a chat with some fellow attendees I dropped into Marco Tabini's talk describing the development of the php|architect website ("The curious case of php|architect"). Marco focused on higher-level design decisions - and the reasoning behind them - that had been made at various stages in the site's evolution, and reminded the audience that real-world projects don't always follow a smooth path.

Marco was followed by "Developing Easily Deployable PHP Applications", given by John Mertic of SugarCRM. At the beginning John made the important point that you need to define the "support matrix" (i.e. the set of environments defined by combinations of operating system, web server, database and PHP version) for your application, as knowing this will inform your design and test options. He then went on to describe how this is managed for SugarCRM, raising a number of interesting points - for example, providing hooks and other mechanisms to make customising, configuring and extending the application as easy and as maintainable as possible for the end user.

The last real talk was Harrie Verveer's "Database version control without pain", which aimed to address the issues with keeping changes to the database schema in sync with changes to your application code. The problem is that the mechanism used for patching PHP code as part of an update can't be used to update the database - in this case the patches are in the form of SQL code rather than code differences - and Harrie conceded that (contrary to the title) there isn't a magic bullet for painless database version control. Nevertheless he did a great job of outlining the pros and cons of the various options, ranging from a simple patching strategy (probably the way to go for me at the moment) through to tools like Phing, Liquibase, Akrabat DB Schema Manager and Doctrine Migrations.

I was feeling a little brain dead by then, and as entertaining as it was much of the substance of the closing "framework shoot-out" was lost on me (particularly as I'm not a framework user). But no matter - because I'd had a great day at a well-run conference, met some great fellow software developers, and been exposed to far more useful information than I could have hoped to find elsewhere on my own. I know it's going to take me some time to follow up on the various tools and techniques - in the meantime thanks to all the organisers and speakers for making my first PHPNW conference such a memorable and enjoyable one.

Friday, October 8, 2010

Open Source Content and Document Management in the Public Sector

Yesterday I attended another of the Manchester BCS's open lectures, this time an excellent talk by Graham Oakes entitled "Open Source Content and Document Management in the Public Sector". As the title implies, the talk focused on issues associated with using open source solutions for content management systems within public sector organisations and essentially reported the conclusions from a meeting held by the BCS Open Source Specialist Interest Group at the start of the year.

That meeting in turn had been inspired by a Guardian article reporting on a website project by Birmingham City Council ("Why can't local government and open source be friends?", 7th August 2009). The article suggested that the spiralling cost of the project could have been significantly reduced if the council had opted for open source software (with the defining characteristic of being free to install and use for any purpose) for the content management system (CMS) behind the website, rather than a costly proprietary solution.

Certainly there is now plenty of mature and robust open source software that is recognised by industry analysts as being "enterprise-ready" (possibly the most famous example being the Apache webserver, "the most popular web server on the Internet since April 1996"), and this includes open source CMSs being used successfully to run the websites of prominent public bodies (Graham cited several notable examples including various UK police forces, the C.I.A. and even the White House). So open source is certainly a viable option for these applications.

There are also many apparent benefits over proprietary software:
  • Lower front-end costs: the software is free to install and use; also it's cheap to experiment with before committing to it wholesale.
  • Easier to work with: the source code is visible and can be modified as required; also there are no licensing issues for virtualization or cloud computing applications (which can be a real problem when using proprietary software).
  • Favours incremental delivery: you can start small and build up over time, rather than having to deliver one huge project all at once.
Another potential benefit for public sector organisations is that use of open source can demonstrate a commitment to openness by the organisation. However Graham was keen to point out that there are also a number of risk factors which counterbalance these benefits:
  • Misunderstood costs: probably the greatest misconception about open source: just because the software is free to download doesn't mean it costs nothing to run - someone still needs to support the system day to day. But perhaps more significantly, the biggest costs associated with the implementation of any CMS - whether open source or proprietary - are for things like content migration and training. (It's probably in these areas that the aforementioned city council website project actually went wrong, as the CMS software is likely to have been only around 10% of the total budget.)
There are other "misperceptions" (for example, having access to the source code is not an intrinsic benefit - it also requires that your organisation has the expertise to exploit it). Graham also cited some cultural factors that might work against the successful use of open source by public organisations:
  • Mismatch of scale: generally open source operates on a much smaller scale than government organisations, which are better at interacting with large corporations.
  • Broken procurement models: government procurement is geared towards buying licences rather than buying services, and this bias tends to favour proprietary solutions over open source. (An interesting later observation was that this model tends also to favour large programmes over smaller ones - but that in his opinion the larger a project is the greater the chances of failure are.)
(A further point was that people can have a "philosophical bias" towards or against open source, leading to unreasoned decisions about which technology to use, regardless of how well it matches the requirements.)

In conclusion, while there are potential benefits to public sector organisations using open source, it's certainly not a given: in many ways using open source isn't that different from using proprietary software. Graham closed with a few key points:
  • All open source is not the same: there are variations in quality, capabilities and levels of community support, so these should be assessed before committing to a particular solution.
  • Focus on your problem and not on the technology
  • Consider the total life-cycle costs: look beyond just the start-up costs when comparing open source and proprietary software.
  • It's the team and not the technology that creates success: good people will still succeed using mediocre tools.
Hopefully I've accurately communicated the key points of Graham's fascinating talk. I'd add that many of his conclusions chimed with those from "Open Source for the Enterprise" (Dan Woods & Gautam Guliani, O'Reilly Media), which makes the case for open source in the private sector - and reiterates the point that informed consideration of the benefits and risks of any technology is vital in making good choices.

Monday, October 4, 2010

Devministrators and DevOps

Yesterday evening, while reading a free sample of Jeff Barr's "Host Your Web Site in the Cloud" that I'd downloaded from SitePoint, I happened upon a term I'd never seen before: devministrator.

The way the word is used in the book seems to indicate a software developer who also does system administration tasks, so on that basis I guess I could describe myself as a devministrator - I'm the sole sys admin supporting my personal Ubuntu system used for my Linux development work - but a Google search turns up surprisingly little in the way of a concrete definition.

Google does pull up a few blog postings, however, including one by Kris Buytaert, who in turn references a 2008 Cloud Cafe podcast (about an open source configuration management tool called Puppet) which might even be the source of the word: there it describes system administrators who apply software development best practices (e.g. version control, continuous integration testing and automated scripting) to solve their administration problems.

However in looking these up I came across another term which aims to encapsulate a similar idea and which seems to have gained more traction: the DevOp (apparently a contraction of "developer" and "sysop" - itself a contraction of "system operator"). As described in this recent article from IT World, "The New Type of Programmer: DevOp" [warning: link opens with an ad], a DevOp brings together coding expertise with a detailed understanding of how to manage and configure the environments that the code operates in. And bringing me full circle, according to the article, it's the brave new world of cloud computing that's creating the need for this "new type of programmer": "A cloud developer needs to understand the operating environment ... as well as the development environment."

On that basis I'm probably more devministrator than DevOp. But anyway, there's an interesting overview of the DevOp movement (almost a manifesto) by Patrick Debois, What Is This DevOps Thing, Anyway, which offers a much broader perspective than just cloud development - though I'd also recommend the free sample of Jeff Barr's book if (like me) you're a non-expert looking for a reasonably detailed introductory overview of the technical aspects of developing for the cloud.

Saturday, October 2, 2010

MySQL user administration basics

Last week I spent a frustrating few hours setting up a script to copy a MySQL-backed web application from CVS into a local test environment. In the process I somehow broke the database access settings in a way I didn't understand, and - despite the fact that it was probably quite a trivial error - all my attempts at manually resetting the MySQL user and password failed. Finally I resorted to using phpMyAdmin to sort out the immediate problem, but afterwards I felt I needed to learn more about the basics of MySQL user administration.

First, a bit of background: it's typical for applications such as mine to access the database via a dedicated MySQL user explicitly created for the purpose, and with access rights restricted to only the data that it actually needs. This acts as a security measure to limit the potential damage to the database as a whole if the application is compromised. (Any application that uses MySQL for its database backend should also implement its own independent application-specific user management system, but this is a separate issue not covered here.)

The first point is that MySQL's users are entirely distinct and separate from the users defined on the host system - so they also need to be managed separately. The essential SQL commands for doing this are:
  • GRANT: creates a new user and sets its privileges (if the username/hostname pair specified doesn't already exist); modifies the privileges for an existing user (if it does exist):

    GRANT privileges ON data TO user IDENTIFIED BY password;

    For example:

    GRANT SELECT,INSERT,UPDATE,DELETE ON myappdb.* TO 'myappdbuser'@'localhost' IDENTIFIED BY 'quitesecret';

    gives the user myappdbuser@localhost a limited set of privileges for all tables in the myappdb database. (To grant all permissions, privileges can be specified as ALL; data can be specified as *.* to grant the rights to all tables in all databases. Also, note that more recent versions of MySQL seem to support an explicit CREATE USER command in addition to GRANT.)

  • REVOKE: the opposite of GRANT; removes a user's privileges:

    REVOKE privileges ON data FROM user;

    For example:

    REVOKE INSERT,UPDATE,DELETE ON myappdb.* FROM 'myappdbuser'@'localhost';

  • DROP USER: removes a user, for example:

    DROP USER 'myappdbuser'@'localhost';

  • SET PASSWORD: sets or changes the password for a user, for example:

    SET PASSWORD FOR 'myappdbuser'@'localhost' = PASSWORD('newsecret');

  • To get a list of users as username/host pairs: execute a SELECT query on the user table of MySQL's mysql administrative database (where the user data is stored):

    SELECT User,Host FROM mysql.user;
These commands are actually very straightforward; however, based on recent experience, two particular points are worth emphasizing:
  1. Each user is uniquely identified by a username and host pair (not just by the username) - so to MySQL fred@localhost, fred@example.com and fred@% are all different and distinct users.
  2. Each user has its own set of associated privileges, specifying which database operations the user is allowed to perform and on which data - so it's possible for each of fred@localhost, fred@example.com and fred@% to have different privileges.
These are significant when a connection request specifies the user ambiguously (for example, by omitting the hostname), so that more than one entry in the user table could match. In this case MySQL uses the most specific match, which might not be the one that was intended. The connection might then be denied (if the intended and actual users have different passwords), or might succeed but then have problems executing SQL queries (if the two users have different privileges). So I'd also recommend always specifying users explicitly whenever possible, with the full username@hostname pair.
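To make the ambiguity concrete, here's a hypothetical example (the user name fred and the passwords are of course made up):

```sql
-- Two distinct users sharing the name "fred", with different
-- passwords and privileges:
GRANT SELECT ON myappdb.* TO 'fred'@'localhost' IDENTIFIED BY 'localsecret';
GRANT ALL ON myappdb.* TO 'fred'@'%' IDENTIFIED BY 'anysecret';

-- A connection from the local machine as plain "fred" matches both
-- entries, but MySQL picks the more specific 'fred'@'localhost' - so
-- logging in with 'anysecret' fails, and even after logging in with
-- 'localsecret' the session only has SELECT rights, not ALL.
```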

Hopefully this overview has given some insight into the basics of MySQL user administration, although obviously there's a lot more to it than I've outlined here (for example, I've skipped over many of the details of the GRANT command - see the section Account Management Statements in the MySQL manual for more comprehensive information) - and I'd still recommend using phpMyAdmin or similar for database user management (especially if you don't do it that often). However I hope it's still useful - and I'd welcome any comments or corrections.