Category Archives: programming

The conciseness of the funny characters

Hacker News brought me the delights of this post, which makes the point that the Chinese language can express more in fewer characters than English. It is something I can vouch for, and it is what makes translation from English to other languages painful for me.

Anyway, the scary bit: Twitter doesn’t allow 140 Unicode characters!?!

Look, if you say character, you should mean you support any Unicode character! And yes, that may in fact be 560 bytes (140 characters at up to 4 bytes each), but come on, English is not the only language in the world.
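To put numbers on that, here is a minimal sketch assuming UTF-8 as the encoding (the helper name `utf8_size` is made up for illustration). It compares the character count with the byte count for three 140-character messages.

```python
def utf8_size(text):
    """Return (number of Unicode characters, number of UTF-8 bytes)."""
    return len(text), len(text.encode("utf-8"))

english = "a" * 140           # plain ASCII, 1 byte per character
chinese = "\u4e2d" * 140      # U+4E2D, 3 bytes per character in UTF-8
emoji = "\U0001F600" * 140    # outside the Basic Multilingual Plane, 4 bytes each

print(utf8_size(english))     # (140, 140)
print(utf8_size(chinese))     # (140, 420)
print(utf8_size(emoji))       # (140, 560)
```

All three messages are 140 characters long; only the storage cost differs, and the worst case is the 560 bytes mentioned above.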


Every Project Starts with a Deadline

Everyone in software is familiar with the “Iron Triangle”, the triumvirate of Time, Features (Functionality, or Scope), and Resource. In summary, these are the three factors that control any project, and altering any one of them always affects the others. With enough Features, Time, and Resource, you have a high Quality product.

In recent years the various methodologies have focussed on trying to win the “Time” argument: to stop time limits being set up front. With Resource generally fixed (and even an increase may bring less benefit than hoped), a fixed time limit reduces the Functionality, which may in turn lead to a lower quality product.

This has lent some credence to a new myth: that projects don’t need to have deadlines or set time limits, and in fact shouldn’t have them if they are to produce good “Quality” software.

And in the real world…

This is a joke. Let’s start from the very beginning: suppose you are asked to deliver a project. You will generally be asked three questions (never mind all the ones you should be asking):

  • How much ?
  • How long ?
  • Who do you need to hire/what do you need to do it (you could suggest this is how much) ?

Consider that everything you do has a deadline, even if this deadline is not the one for delivery. It is important to set a goal, and often this represents a line in the sand, where you can evaluate the progress and determine if you can proceed.

If you doubt this, consider what you ask the guy who comes to install your cable. If (s)he fails to turn up when they say or takes 2 weeks to wire a plug, you are not going to pay them. When you bought your house, your bank didn’t give you money that you could “pay back when you wanted”, they attached a timeline to it that reflected a best guess according to your capability to pay.

The Punchline

Your project has a deadline. It will always have one. Your aim is to have a realistic one, and that assumes the business understands what it is trying to have delivered. Even then a deadline is not your enemy, it marks a decision point for others to determine if they want the deliverable. It also acts as a focal point for your efforts, provided that the focus is not panic. There is no point in the argument of “it’s done when it’s done”, the money running out stops that dead. Remember the bank who lent you the money for your house, they don’t stop the clock because your job did. The bad news is that the bigger the organisation, the larger the number of levels that the deadline has been established over… so maybe the idea started at the board, so the CEO has a point in time in mind, then the Architects get the idea, and at some point there has to be a business case, and this will always need a deadline!

Practically, you have the following options:

  • Cheap and quick – may never get upgraded/fixed, but will do, and you will soon find out how important the deliverable really is 🙂
  • No time at all – build a prototype, be upfront about it, if it works out, you can work on the full system, if not, well software doesn’t take up landfill.
  • No resource – establish your own timeline of what you need to achieve – if the shortfall is obvious, either you get more resource, or time.
  • Work nights! – this is not an option. Unless you are 20 and can work on 3 hours sleep, you’ll start spending 50% of your productive time fixing what you thought you achieved. Don’t do it. The odd hour fine, night, no.

Or hopefully:

  • Just enough of each of the three parts of the triangle – due to the business understanding what needs to be done and having funded/invested appropriately


  • Too much time – things wander and all three parts can be wasted.

Good luck with the deadline 🙂

The Funny Character Taskforce Rides again!

Amusingly, I was sent a link to Joel Spolsky’s post on Unicode by one of my Italian colleagues. The punch line is that even though we both work for a European company (owned by a US company), we can’t seem to put the accent on the last ‘o’ of his surname – which should be ‘ó’ – in the User Directory of the mail system.

It also made me remember that I’d had this post in my reading list for a long time. In essence, the point being made is that complying with the Unicode standard does not imply the use of UTF-\d{1,2}, although for some reason which escapes me, this is exactly what .NET and Java do by having UCS-style chars which are 2 bytes wide (or wchar). Great. Why ?!

Anyway, the point: Unicode, support it. You aren’t an island, no matter what Ted says. Even if you never release your code to a non-English speaking country, use a platform that supports it, so that on the chance you do, you’re ready. Given the number of places that need this (Hint: it’s the majority!) it’s going to make sense at a programming language level and on your product. If you’re using XML and ASCII, the chances are you’re converting from ASCII to UTF-8/16 to process the XML, even if you’ve specified ISO-8859-1 (Latin 1). If an encoding to support Unicode is there, use it.
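A quick way to see why 2-byte chars are not the same thing as characters is to count 16-bit code units, which is what a Java or .NET `char` holds. The sketch below does this in Python by encoding to UTF-16; the function name `utf16_code_units` is hypothetical.

```python
def utf16_code_units(text):
    """Count the 16-bit code units needed to store text as UTF-16."""
    # Each UTF-16 code unit is exactly 2 bytes; "-le" avoids a byte order mark.
    return len(text.encode("utf-16-le")) // 2

print(utf16_code_units("o"))           # 1
print(utf16_code_units("\u00f3"))      # 1  (o with acute accent, precomposed)
print(utf16_code_units("\U0001F600"))  # 2  (a surrogate pair)
```

Anything outside the Basic Multilingual Plane needs two code units, so code that treats one 2-byte char as one character works right up until somebody types one.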

For those in Europe and pretty much anywhere else in the world, it’s a must. In the US, I guess you can afford to annoy Spanish speaking people, hey ?


Douglas Adams was right, the world is powered by ideas on the back of a napkin. In fact I wish people would do more of this.

I didn’t know that one of its main uses would be to stop the “I can do that in a weekend” point of view. But then estimation is, time and time again (pun not intentional), an issue raised by developers.

Maybe the best way to start a project is in a bistro, with a large supply of napkins and no computers…

Automating Synergy Client Connections

A thing that has been driving me up the wall about my synergy setup at work is this. The main server is the Windows laptop that I use, which is fine when it’s docked, and when it’s mobile the server is disconnected from the clients (counter-intuitive I know but bear with me). The server only runs on the wired ethernet so the client doesn’t pick it up over wireless.

All well and good, when docked I have Windows dual screen with Ubuntu joy over on the third screen.

Now, the only small problem is everything is on DHCP… and every so often the addresses change. Which is annoying. It means reconfiguration by hand on the Ubuntu box, after first running ipconfig on the Windows box (by the way, anyone want to point me at an app that pops up the IP address in a tooltip over the network icon on the tray ? It would make this a whole lot easier…).

The solution ? Automate (finally). Involving the magic of nmblookup (DHCP client on the laptop does not register itself as a DNS entry, so you find the server via WINS, which Ubuntu needs help with).

The script (see the cut) attempts to figure out if the synergy client is running or if it has died. If it has died, it attempts to restart it, after first finding the server. The client should die when the connection is killed (simulate this by clicking on the “Force Reconnection” option in the Windows Synergy server context menu). So essentially, undock the laptop, and the connection is killed. If the laptop is switched off or not connected, the IP address is not available and the script will try again later. The script itself is run via crontab. The IP address is extracted from the nmblookup output using gawk (see after the cut).
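For illustration only, here is a rough Python sketch of the same logic. It is not the shell and gawk script described above: the function names, the server name `LAPTOP`, and the assumption that nmblookup prints the address as `<ip> <NAME><00>` are all mine.

```python
import re
import subprocess

# Assumed nmblookup result line, e.g. "192.168.1.42 LAPTOP<00>"
RESULT_LINE = re.compile(r"^(\d{1,3}(?:\.\d{1,3}){3})\s+\S+<00>")

def parse_nmblookup(output):
    """Return the first IPv4 address found in nmblookup output, or None."""
    for line in output.splitlines():
        match = RESULT_LINE.match(line.strip())
        if match:
            return match.group(1)
    return None

def client_running():
    """True if a process named exactly 'synergyc' is alive."""
    result = subprocess.run(["pgrep", "-x", "synergyc"],
                            stdout=subprocess.DEVNULL)
    return result.returncode == 0

def ensure_client(server_name):
    """Restart the client if it has died and the server can be found."""
    if client_running():
        return "already running"
    lookup = subprocess.run(["nmblookup", server_name],
                            capture_output=True, text=True)
    address = parse_nmblookup(lookup.stdout)
    if address is None:
        return "server not found"   # laptop off or undocked; cron retries later
    subprocess.Popen(["synergyc", address])
    return "started"

# A cron job would call something like: ensure_client("LAPTOP")
```

Run every few minutes from crontab, this does nothing while the client is alive and quietly gives up when the laptop cannot be found, leaving the next run to try again.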


Funny Character Task-force

Well since I’ve been cited (not in the police sense…) I should provide links, but more importantly I have to strongly back Greg’s lesson here. The lesson is simple, there are more characters that you need to support right now, that are not in the ASCII range, the so-called “Funny Characters” and that includes any currency indicators (£,€, etc).

The number of times I have seen that annoying character:


usually in front of a character from the upper, 8-bit range beyond ASCII:


Better still is the little square that says the character isn’t known/rendered, probably due to an unsupported accent. This suggests the lesson isn’t being learned.
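Assuming the stray character in question is the classic ‘Â’ that turns up in front of a ‘£’, one common cause is UTF-8 bytes being read back as Latin-1. The sketch below reproduces that mistake on purpose; the function name is made up.

```python
def misread_as_latin1(text):
    """Encode as UTF-8, then wrongly decode the bytes as ISO-8859-1."""
    return text.encode("utf-8").decode("latin-1")

pound = "\u00a3"      # pound sign
e_acute = "\u00e9"    # e with acute accent

print(misread_as_latin1(pound))    # 'Â£'  two characters instead of one
print(misread_as_latin1(e_acute))  # 'Ã©'
print(misread_as_latin1("plain"))  # 'plain'  ASCII survives untouched
```

ASCII text passes through unharmed, which is exactly why this kind of bug goes unnoticed until the first funny character arrives.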

People, it’s simple, it isn’t 1980 anymore and not everyone speaks or even writes in English or any other Latin language. The fact is ASCII doesn’t even cope with any other Latin or Germanic based language. Let it go, it’s over, there are a few things that are 7-bit safe and few of those are actually conveying meaningful information to be read by humans without being decoded.

Modern programming languages support Unicode natively, and your decisions are going to be based around which of the UTF encoding standards you might follow and how you determine the lexicographic ordering for your local languages.
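The ordering problem is easy to demonstrate. Sorting by raw code point pushes accented words to the end, as the sketch below shows. The `fold_key` helper is a crude, hypothetical approximation that strips accents before comparing; real collation rules differ per language and need a proper locale-aware library.

```python
import unicodedata

def fold_key(word):
    """Sort key that ignores accents and case (not true locale collation)."""
    decomposed = unicodedata.normalize("NFD", word)
    without_marks = "".join(ch for ch in decomposed
                            if not unicodedata.combining(ch))
    return without_marks.casefold()

words = ["zebra", "\u00e9clair", "apple"]   # "\u00e9clair" is 'éclair'

print(sorted(words))                # ['apple', 'zebra', 'éclair']
print(sorted(words, key=fold_key))  # ['apple', 'éclair', 'zebra']
```

The first result is what you get for free; the second is closer to what a French or English reader expects, and other languages will expect something different again.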

But please; for the love of all sanity, if there is one point Greg and I are trying to make, it is this; make this one decision right now before creating any code:


When is a TV program the same ?

We’ve recently hit a problem based around a lack of clarity around the “equivalence” of assets. For people in the broadcast industry, this is an old chestnut, but I think it’s worth exploring here.

Let’s say you have a program, for the sake of argument: “Vertical City” (1st Episode). Let us also assume it has a total of 6 episodes. (For those with déjà vu, it is a real program on Channel 4, but I’m just using this as an example).

Let us also suppose (this is not the case as far as I know) ZDF buys this series, as does NED1, and then RTE.

Let’s introduce a viewer, Bob, who has decided that he wants to watch the entire series. He gets all these broadcasters/channels: RTE, NED1, ZDF, and C4, but he’s come late to the party and C4 is at episode 4 (I’m ignoring the iPlayer/4oD part here :)). Now if you had a TiVo, and told it to get the whole series, it would make this assumption:

A program is the same regardless of channel

So it could reconstruct the series as follows:

  • Ep1 – from RTE
  • Ep2 – from RTE
  • Ep3 – from ZDF
  • Ep4 – from NED1
  • Ep5 – from C4
  • Ep6 – from C4

And that’s the same as the original series on C4 ? Isn’t it ?


If you did this, you’d probably find that NED1 burns subtitles in (it’s a Dutch channel), ZDF might dub the program so the sound track is now German, and RTE might burn a graphic into the program.

The point is that the original asset produced by the program maker (in this case Electric Sky) is the only part that you can make any assertion of equivalence on. This is how a broadcaster like the BBC can say that an asset they buy to put out on BBC1 is the same as the instance they put out on BBC3 a week later/earlier. Once it is on the channel and consumed, it is a different program, even if the viewer might say it’s equivalent. BBC3 has the channel logo, for instance, and the time of showing might be different.

This difference is even more marked in a multi-language environment because of the differing audio/video tracks.

From the point of view of a Broadcast Network owner, such as UPC or Sky, this is frustrating, because you cannot assume equivalence, so each program is different unless the channel operator tells you explicitly that they are equivalent. This affects your approach to PVRs, network PVR, and meta-data. Essentially only the people who fed the tape into the ingest at the transmission stage can tell you if the program is the same.

From an architectural stand-point, this is something to watch where a development view will tell you that there are multiple copies of the same data in the system, even though the “copies” are actually different based on the business rules.

Just because you can run a string ‘==’ on the data with a result of ‘true’, doesn’t make it the same.
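As a sketch of what that means in a data model (every class, field, and identifier below is hypothetical), matching metadata and business equivalence can be kept as two separate questions. The second is only answered by an explicit declaration from the channel operator.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Broadcast:
    broadcast_id: str
    title: str
    episode: int
    channel: str

def titles_match(a, b):
    """The naive test: same title and episode number."""
    return (a.title, a.episode) == (b.title, b.episode)

def equivalent(a, b, declared):
    """Equivalent only if identical or explicitly declared so by an operator."""
    return a == b or frozenset((a.broadcast_id, b.broadcast_id)) in declared

c4_ep3 = Broadcast("c4-ep3", "Vertical City", 3, "C4")
zdf_ep3 = Broadcast("zdf-ep3", "Vertical City", 3, "ZDF")
bbc1_show = Broadcast("bbc1-x", "Some Show", 1, "BBC1")
bbc3_show = Broadcast("bbc3-x", "Some Show", 1, "BBC3")

# Only the BBC pair has been declared equivalent by its operator.
declared = {frozenset(("bbc1-x", "bbc3-x"))}

print(titles_match(c4_ep3, zdf_ep3))            # True
print(equivalent(c4_ep3, zdf_ep3, declared))    # False
print(equivalent(bbc1_show, bbc3_show, declared))  # True
```

The C4 and ZDF episodes pass the string comparison and still fail the business rule, which is the whole point.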