
Security: No Longer OS Dependent.

Windows… it’s a virus-ridden security nightmare.  Apparently.

The server on which this blog is hosted (along with a number of client web sites, email accounts and services) experienced some very short but unexpected unavailability on Sunday.  Without giving too much away about this server, it is a dedicated Windows 2008 server in a professionally run hosting data centre… it’s patched and anti-virus’d (and kept that way), and sits behind a Cisco firewall appliance with strict rules.

The downtime was caused by an attack on a known zero-day exploit.  But not on this server, or on any other Windows server on the same multi-gigabit LAN.  No, this was an exploit in the Apache web server on the Unix boxes on the same LAN, which allowed arbitrary code to be run on those servers.  The arbitrary code in this case manifested itself by consuming 100% of the available LAN bandwidth on the infected machines… effectively performing a DoS attack on the network.

Thankfully our hosting partner, Register 1 (who really are excellent), noticed this very promptly and immediately disabled the affected ports, so that traffic to unaffected servers was restored quickly.  They then patched their other customers’ machines and brought them back online.  Once again, an excellent and very timely response.

Not the norm.

The point I am trying to make is that when you say “hack”, “exploit” or “virus”, people instantly think of Windows.  That may be the norm, but just because you aren’t running Windows doesn’t mean you can get lazy with your patching and security.  Windows has become the de facto target for security exploits for two main reasons.

  • Its prevalence.  It’s everywhere, therefore exploiting it hurts more.
  • Historically it was easier to exploit; early versions of Windows just weren’t as secure as Unix.

Those days are gone. Hackers and exploiters will take what they can get now: be that a BSD box, Mac, Linux or any other flavour of OS or application which sits on it.  So complacency is the killer here, not the OS.

Are all of your servers and their apps all up to date?

To WWW or not to do WWW?

Preferring the robot over the human…

Having to type www before most websites you visit isn’t very friendly, is it?  While I am well aware of the significance of the WWW in DNS / network topology terms, it dates back to the days when the internet was largely used by nerds & geeks (takes one to know one!) – the demographic of the average internet user today really couldn’t be much different.

That, coupled with the fact that most websites start with www, just means it’s needless complication to my mind.  Most big companies realise this, and also find it snappier to advertise just their domain.  They will sort it out for you, try it – type microsoft.com, or dell.com, or bbc.co.uk into your browser…. you get the www. version, don’t you?

The www. has been dropped, just like the http:// has been too.

Government Fail

Unless, that is, you’re trying to get at some of the biggest UK government websites.  Take for example Her Majesty’s Revenue & Customs – possibly the one agency every citizen has no choice but to use.  hmrc.gov.uk fails to resolve.  www.hmrc.gov.uk works fine though.  Same with the Air Accident Investigation Branch, Parliament, the Prison Service and countless others.

I should say that some do work – and importantly the gateway to government websites, direct.gov, does work.

Why would they do this?

Well, it could just be that they forgot to configure it – either in the DNS for the domain, or in the webserver itself.  I’d say that’s lazy or sloppy – every domain I register and host is configured to allow both URLs to be used.
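On the DNS side it really is trivial: supporting the bare domain is just one extra record at the zone apex.  A BIND-style sketch, with a placeholder domain and address:

```
; both the apex (bare) domain and the www host point at the web server
example.gov.uk.      IN  A  203.0.113.10
www.example.gov.uk.  IN  A  203.0.113.10
```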

The more likely reason, I think, is duplicate content.  Search engines, especially Google, penalise duplicate content at different addresses – and this will hurt where your site is positioned in the search results.

Canonical Domains.

How can it be duplicate if it is the same content on the same site?  Well, simple: it’s at two different addresses, and so is indexed twice.  It’s duplicate.  For example, these two URLs are different, but the content is the same:

www.dft.gov.uk/dvla/forms.aspx
dft.gov.uk/dvla/forms.aspx

The way this should be handled is with URL rewriting or a 301 redirect.  That is, the webserver should make sure only one version ever appears on the web, by changing the address and redirecting the user (and thus also Google).  If you go to the second address above, you will find you actually end up at the first one!  The DVLA is using a redirect to ensure only one version is available.
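On Apache, for instance, the whole thing comes down to a few lines of mod_rewrite in the site config or an .htaccess file – a sketch only, not taken from any particular government setup:

```apache
# 301-redirect any request for the bare domain to the www. version
RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]
```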

This is really simple to set up on both Apache and IIS webservers, and it means your users have a much nicer experience – and who is more important, the bots or the users?  Maybe it should be a standard for government departments!

Changing URLs on WordPress

Hiding yourself from Search Engines…

It’s not often you hear someone saying that they want to hide a website which they are working on from Google completely, but that was the scenario which faced me earlier this week. I have a client who runs a couple of websites, each for a specific service his company offers.

While he works on the website content we generally don’t want it indexed, because that could lead search engines to index incomplete or incorrect information, and we have no guarantee of how quickly it would be updated once the site was complete.  In the past, to avoid this, I have set the site up on a different domain name, for instance development.domain.com, which allowed him to use the site normally but made it unlikely (though not impossible) for a search engine spider to find it.

The Move To WordPress

However, I want to move him over to using WordPress installations, because they are easier to maintain, and with the new custom taxonomies and post types in v3 WordPress really comes of age as a CMS (content management system).


I have set up a WordPress install for him to work on his next site – but WordPress isn’t good at changing the URL at which it is hosted… because when you upload images into either posts or pages, it inserts them with a full reference to the URL the site is currently at.

So a picture uploaded during development to development.domain.com will stop working when you change the site over to www.domain.com – because it no longer exists at that address.

WordPress touch on this in the final paragraph of their advice on changing URLs, but not really any more than to say it needs some thought.

There are several plugins which claim to go through the database and fix this for you – but I can’t afford to have this chap’s first WordPress install go anything less than swimmingly; I don’t want him to lose confidence in it… he’s not technical (that’s what he pays me for!).
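For what it’s worth, the fix those plugins perform boils down to a search-and-replace over the post content in the database.  A minimal sketch of the idea in Python against a throwaway SQLite table – the real wp_posts table lives in MySQL, and the domain names here are hypothetical:

```python
import sqlite3

# Stand-in for the wp_posts table (the real one lives in MySQL).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY, post_content TEXT)")
db.execute(
    "INSERT INTO wp_posts (post_content) VALUES (?)",
    ('<img src="http://development.domain.com/wp-content/uploads/pic.jpg" />',),
)

# The core of what the migration plugins do: rewrite the absolute URLs
# embedded in every post and page.
old_url, new_url = "http://development.domain.com", "http://www.domain.com"
db.execute(
    "UPDATE wp_posts SET post_content = REPLACE(post_content, ?, ?)",
    (old_url, new_url),
)

fixed = db.execute("SELECT post_content FROM wp_posts").fetchone()[0]
```

The catch, as the plugins’ authors will tell you, is serialised PHP data in the options table, where a naive replace can corrupt string-length prefixes – hence my caution.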

Robots.txt

My solution is therefore temporary in its nature.  I have installed his development WordPress site at the full URL it will run at when it goes live – but to prevent it being indexed I have created a robots.txt file which instructs search engines not to index the site:

User-agent: *
Disallow: /
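If you want to double-check that those two lines do what you intend, Python’s standard library can parse them for you – a quick sanity check, nothing more (the URL is a placeholder):

```python
from urllib.robotparser import RobotFileParser

# Feed the two robots.txt lines to the standard-library parser.
rp = RobotFileParser()
rp.parse(["User-agent: *", "Disallow: /"])

# No well-behaved crawler should fetch anything on the site.
print(rp.can_fetch("Googlebot", "http://www.domain.com/any/page"))  # False
```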

I am hoping this is temporary as it’s not the neatest fix in the world – but it will work! Anyone else have any other suggestions?


UI Changes in IIS7

or getting old…

As I mentioned earlier, I recently migrated my dedicated server from a trusty 3 year old box to a shiny new one.  The old box was running Windows 2003 and IIS 6, whereas the new box is Windows 2008 R2 and thus IIS 7.5.  And boy, is IIS 7.5 getting on my nerves.  (IIS stands for Internet Information Services, by the way, and is Microsoft’s web server platform.)

New Features.

IIS 7.5 adds loads of really really useful features which I have been waiting to use.

  • URL rewriting, which is a godsend if you have multiple domain names going to the same place (or even a site which is on both www.domain.com and domain.com) – you can use it to ensure that you only have one canonical domain name.  Search engines hate duplicate content on different domains, so this is helpful.  In IIS 6 you had to set up a new site and then 301 redirect it.
  • FTP virtual hosts.  It is now possible to host multiple FTP sites on one IP address, which wasn’t possible before.  This is a nice feature if you host multiple websites and don’t want FTP users seeing each other’s sites.
  • Integrated / better PHP support via FastCGI.  Installing PHP in IIS 6 was a bit of a chore and needed careful thought.  Finally Microsoft realised most people will use both technologies (ASP.NET & PHP) and made it easy to configure PHP inside IIS, using the also-excellent Web Platform Installer.
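With the URL Rewrite module installed, that canonical-domain rule ends up in the site’s web.config – roughly like this sketch (the domain name is a placeholder):

```xml
<!-- web.config fragment: 301-redirect bare-domain requests to www. -->
<system.webServer>
  <rewrite>
    <rules>
      <rule name="Canonical host name" stopProcessing="true">
        <match url="(.*)" />
        <conditions>
          <add input="{HTTP_HOST}" pattern="^domain\.com$" />
        </conditions>
        <action type="Redirect" url="http://www.domain.com/{R:1}"
                redirectType="Permanent" />
      </rule>
    </rules>
  </rewrite>
</system.webServer>
```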

Bad Times

However, they moved everything!  IIS 7 is now sold as being “modular,” and so, it seems, is the administration console.  It’s a real mess in my opinion.  Part of this is born of the fact that I knew where everything was in IIS 6: it was grouped logically in tabs available from the properties menu of the site.  Now, it’s a whole series of icons:

IIS6 Control Panel
IIS 7 Control Panel

You can see how different they are!  While the IIS 7 configuration is grouped loosely between ASP.NET config, IIS config and general management, this is nowhere near as nice as the outgoing tabbed dialog box.  With all its flashy icons I can only believe this is a “dumbing down” of IT – it’s a slippery slope.  More annoying, though, is that once you double-click on one of the new options, it replaces the contents of the main admin pane with the configurable options, and you have to click on the “back” arrow at the top left to get back to the overview screen.  Backspace on the keyboard doesn’t work like it would in a browser, and it’s incredibly frustrating.

The summary pane on the right hand side is often useful as it points you in the direction of common tasks and what would ordinarily have been the options available in the right click context menu.  However, it is too narrow by default for some of the information it tries to display (such as bindings) and consequently looks way too busy.

Thankfully, it’s not just me that found this new way of presenting config options a little hard to get to grips with – someone else did too, and they created this handy guide to help you map the old config options to their new locations!

I’m sure I’ll get used to it over time!

Migrating Servers

I have been a bit quiet on here of late; and there is a good reason for it.  I’ve been a bit busy in the real world, playing with computers.   I host a number of websites (including this one, obviously) for various clients along with associated email accounts on a dedicated server which is in turn hosted for us by the excellent Register 1 at their data centre in Docklands, London.

Renewal.

Certified for Windows 2008R2
New Win 2008 Server

When you sign up for a dedicated server you normally do so for a period of years, and although I wasn’t involved with this server when it was first bought, the hosting was due to expire this week.  We spent some time investigating whether we could do what we needed with a virtualised server (using Hyper-V or similar), but came to the conclusion that we’d feel happier on a “proper” physical box.  Also, the excellent Register 1 were offering a very attractive price on another 3 year deal with them.

We decided to take out that deal, because Register 1 really have been excellent.  Everything from their sales process to the customer & technical support really is first rate – as was to be proved during our migration.    While we waited for the server to be built and the Windows 2008 R2 operating system installed my colleague and I discussed how we’d move everything over.

Rationalisation

Over time, we’d acquired a fair bit of “junk” on the old server, ranging from websites that were no longer used, old databases, 2 instances of SQL, domains we didn’t need and utilities we used once – the usual sort of stuff.  This was a perfect opportunity to cut down on these and only move over the stuff we were interested in, so we compiled a list of our various sites and thought about the move.

Thankfully, my friend installed the various prerequisites on the new server: SQL Server 2008 (two instances, one for mail, one for websites), MySQL (for WordPress), PHP (for WordPress) and some utilities.  Once we had these, we thought we’d copy over the website files, back up and restore the databases, turn off email on the old machine, back it up and restore it to the new machine, move the IP address and hey presto – no DNS changes, nice and simple!

Fail.

hMailServer

Everything went fine, until it came to the email server.  We use hMailServer, and only host around 150 accounts on about 50 domains – it’s perfect for us.  Other than some trouble with an antivirus update last year, it’s been bulletproof.  With such a low user count we thought we could use the built-in backup & restore routine to move the accounts and email to the new server with no loss of email; because we’d have it disabled on the old server while we did this, mail would queue at relays for us until we moved the IP address to the new server.

Then we found out that some of our users are accessing the server using IMAP and leaving old mail on the machine, making the total file size too big for the hMailServer backup/restore process.  Major major headache.

Manually Moving hMailServer

This left us with a big problem: we could import the accounts but not the messages – which clearly our users were relying on being on the server.  After we’d imported all the accounts, we moved the mail across manually – it sits as files on a disk in a complicated directory structure… but then we needed to re-create the database entries which tell the software that this mail exists.

Luckily, hMailServer provide a tool called DataDirectorySynchroniser which will fly through the mail directory structure and re-write all the relevant entries into the database.   The big snag is that this does not recreate IMAP folder structures, or the read / unread flags.  This would’ve meant that one of our users would have had about 5,000 unread mails all of a sudden.

Cue SQL!

This problem was caused because the message flags are held in the database, and the tool can only work on the information it has – the file on the disk.  However we had the old database to hand and with a bit of crafty SQL and some C# code we managed to move everything.  If you’re in the same boat, here’s what we did:

  • Recreate the IMAPfolders table, noting that account IDs will have changed and will need cross-mapping.
  • Update the messages table to reflect the folder ID for each message.
    Be careful here, because the message ID and the folder ID will not be the same.  We linked the message in the new messages table to the old table by using the last 42 characters of the file name – a GUID, so guaranteed to be unique – then we created a lookup table with an “old folder ID” and a “new folder ID” too.
  • Update the messages table to have the correct flags for each message.
    Again looked up by the GUID of the message, and remembering that the flags are stored differently in the latest version: where they used to be stored in individual columns for read, replied etc., they are now binary OR’d – the new values are here.
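To make the cross-mapping concrete, here’s a toy version of the idea in Python against SQLite – the table and column names, folder IDs and flag values are simplified stand-ins, not hMailServer’s real schema.  The join key is the trailing 42 characters of each message’s file name (the GUID plus extension):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE old_messages (filename TEXT, folderid INTEGER, flags INTEGER);
CREATE TABLE new_messages (filename TEXT, folderid INTEGER, flags INTEGER);
CREATE TABLE folder_map  (old_folderid INTEGER, new_folderid INTEGER);
""")

# One message: read (flags=1) and filed in folder 7 on the old server.
tail = "{0f8fad5b-d9cb-469f-a165-70867728950e}.eml"  # last 42 chars of the path
db.execute("INSERT INTO old_messages VALUES (?, 7, 1)", ("D:/old/data/" + tail,))
# The same file after the move: rediscovered by the sync tool, folder and flags lost.
db.execute("INSERT INTO new_messages VALUES (?, 0, 0)", ("E:/new/data/" + tail,))
# Folder 7 on the old box was recreated as folder 42 on the new one.
db.execute("INSERT INTO folder_map VALUES (7, 42)")

# Join old to new on the trailing GUID, mapping old folder IDs to new ones
# and carrying the read/unread flags across.
db.execute("""
UPDATE new_messages SET
  folderid = (SELECT m.new_folderid
              FROM old_messages o
              JOIN folder_map m ON m.old_folderid = o.folderid
              WHERE substr(o.filename, -42) = substr(new_messages.filename, -42)),
  flags    = (SELECT o.flags
              FROM old_messages o
              WHERE substr(o.filename, -42) = substr(new_messages.filename, -42))
""")

folderid, flags = db.execute("SELECT folderid, flags FROM new_messages").fetchone()
```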

After that, and a lot of worry on our part, it all works – with one exception: someone’s iPhone.  For some reason it’s missing some messages out of a user’s Inbox which are 100% present when you look at it in a “full blown” email client.  I suspect this is something to do with either the way that hMailServer implements the SORT function, or the way the iPhone implements IMAP…  I shall let you know what I find out.

Oh, and I think everything else went OK – so if you see any broken links, images or 404s on this site (or any of mine), can you let me know please?

Paddington Bear: New Heli Expert.

One of the problems with the internet, blogging, forums and social media generally is that people think because they have the “cloak” of anonymity which is provided by the internet they can say what they like without thinking about it – often without even leaving their real name.  This is why I moderate comments on my blog – I want them to be useful to other readers.

This morning I awoke to find an email from the blog software informing me of the following comment needing moderation.

You brits and all your biotching LOL, Tea? in Bed with a cup of Tea and the Blades Magazine? lol… good thing you guys dont build helicopters, they’d be in the shop constantly like your piece of shit Jags and Range Rovers.. please! spare the world your assesments 🙂

This was left on an article where I criticised the accuracy of the running costs given by Robinson for their helicopters.  The commenter’s name: Paddington Bear.  Yep, the one and only fictional cartoon character is now a self-appointed expert on British whining, Jags, Land Rovers and helicopters.  Awesome, that bear has come on!

This comment bugged me, not because it’s inaccurate – but because whoever made it doesn’t have the courage of their convictions and won’t put their name to it.

If you’re going to comment at least engage in conversation, make your point without being rude and have the same manners you’d have in the real world.  Or don’t, but if you don’t then don’t expect people to give you the time of day…

… I trashed the comment!  (Not even the emoticon at the end could save it!)

Government Websites

In an age of austerity?  Really?

If you live in the UK you can’t have helped but notice that of late the new Tory / Liberal Democrat coalition government has been cutting anything that stands still for long enough.  We are in an age of austerity, apparently.  This has prompted all sorts of clever questions made under the Freedom of Information Act by journalists about government spending, and tech journalists are no exception.

Business Link Logo
£105m Website

The BBC’s leading technology correspondent is Rory Cellan-Jones – he’s generally a very smart fellow and obviously has a number of good connections.  Today he made a blog post about the Business Link website costing £105,000,000 over a 3 year period.

Yes, £105 million pounds for a website.

Clearly there is righteous indignation all round, and outright amazement that a website can cost £105m.  This then prompted a former civil servant who now runs his own consultancy, Simon Dixon, to comment on what he felt Rory had missed.

How?

In short, Simon says it can cost £35m a year to run a website because:  it can.

And scarily, he’s right.  While I have no experience in the public sector I have seen the same thing happen in the private sector but usually only in large corporations.

I think the reason for it is a little different to that which Simon suggests (that it is because big consultancies get involved and the money is there).  I think it’s because we get involved in my pet hate: I.T. for I.T.’s sake.  This is when we, as IT professionals, do things because we believe that’s how they should be done, or because we want a new tech on our CV, or it’s the current “favourite” – forgetting the core purpose of what our client wants.

We should be about helping our clients (be they public or private sector) improve their output, or achieve their goals in the most cost effective way.  One of the comments on Simon’s blog just about sums it up for me:

Factor in the endless box-ticking requirements generated by the ITIL and PRINCE2 job-creation methodologies…

Clearly I don’t think any sane person would argue against having “best practices” and “methodologies” which allow us to get our jobs done in the most effective way.  But do the likes of ITIL and PRINCE2 really do that?  In my experience the problem with them is that they are too generic and allow themselves to be bent by various people to suit whatever aim they currently have… do they result in better IT projects?  Yes, mainly.  But do they result in our clients producing widgets more efficiently, or getting information out better?  Only as a by-product.

A place for everything and everything in its place.

SEO Experts?

I spend most of my time writing applications for industrial customers, but I also look after some websites for small businesses locally.  Most of them are just static stuff, but some include a small CMS or an online store.  I get a few questions asked of me about SEO – Search Engine Optimisation.

As the name suggests, SEO is the process of optimising your website so that the automated software used by the likes of Google, Yahoo! and Bing is able to index your site properly, and thus your site is returned toward the top of search results for relevant searches.

SEO is, in my mind, somewhat of a black art.  None of the search engines are going to tell you exactly how they rank a page – to do so would mean that people would exploit the information to ensure that their site was returned at the top of searches even if it wasn’t as relevant to that search as one which wasn’t exploiting the information.  What the likes of Google do instead is publish “best practice” guidelines and make a suite of tools available to help you make sure your site is indexed and ranked appropriately. For instance, the Google Webmasters Blog is available and offers a real insight into how Google works.

Experts.

Obviously if you have taken the trouble to create a site then you want it ranked high in search results.  I’m no expert at SEO, but I try to follow the guidelines from Google and make sure that best practice is applied.  If someone has a comment on a site, I’ll check it out and change it if need be.

However, over the last 3 days I have been “bugged” by two so-called “SEO experts.”  I’ll add that they call themselves this, not me.  One even called himself “pre-eminent” – how modest.  While I won’t get into the technical advice they were giving, what really bugged me is that one of them didn’t realise I work in IT and know about this, and was talking to my brother about his website while I was stood in the room.

Apparently the chap has been working on SEO for 15 years.  Really?  Really? Google was formed in 1998, Yahoo in 1995 and MSN (now Bing) in 2005.  But this guy has been optimising websites for search engines for 15 years.

I bet he hasn’t even heard of Lycos (1994), or AltaVista (1995).

More complicated than you’d think…

Well, I am a few days into my blog now and have  spent a great deal of time setting the site up; certainly more than I imagined I would.  Despite WordPress’s famed “5 minute install,” I have found myself spending my spare time over the last few days adding tweaks, plugins, themes and debugging the same to try and get the blog nearer to where I would like it to be.

I’ve read multiple websites and other blogs which contain advice for new bloggers (see interesting links), and most of the advice seems pretty sensible.  In the hope it might help other new bloggers, here’s what I have added to the base WordPress 2.9.2 installation so far, and why.

  • Awsom News Announcement.  My theme doesn’t allow for a static message on the home page, and I wanted a little introduction.  Having used this plugin before elsewhere, it was a must.
  • Contact Form 7.  Not wanting to fill my inbox with spam by putting my email address on the internet, I wanted a contact form, and this was highly recommended.  I did have some problems with using it and permalinks on my server (I use an IIS server and a 404 error handler to handle permalinks) – but found a solution on the WordPress forums.
  • Postalicious.  This required me to sign up for a Delicious.com account too, but came highly recommended as a good way of automatically posting the links which interest me on a daily basis.
  • WP-Cumulus.  Rather than a normal tag  cloud which just lists your tags as plain text, this takes them and turns them into a small flash “movie” which is more interactive, and I think better looking.
  • WP to Twitter. Pretty much as the name suggests, this plug in tweets an announcement of new blog posts on my twitter account.

And there is still more to do!  I need to improve the prominence of my RSS feeds, put more useful information in the right hand column, improve the SEO, and do some layout tweaks at the least…  but this could just be my being a perfectionist.  All / any suggestions greatly appreciated – leave a comment!

Hello world!

Well, it seems appropriate to leave the title of the first entry on my blog as “Hello World!” – partly because it’s nearly always the first example in any computing book, and partly because this is my first foray into the world of blogging.

I’ve been inspired to start blogging as a result of a couple of blogs which I read regularly; some are helicopter related exclusively, but some have mixed content which reflects the author’s life (although I probably started reading them because of the flying content).

I hope my blog proves to be interesting to people, but my motivation is a little selfish too.  I often have thoughts about subjects which I think if I wrote about them it would help me think about the topic in greater depth, rather than being naturally dismissive; a tendency of mine.

I’m now faced with the dilemmas of coming up with good tags & categories to help with the SEO of the site, tweaking the theme, finding the right widgets and all the other stuff which makes for a good usable blog which (hopefully) generates repeat visitors.  So, be prepared for the site to evolve over time… and thanks for visiting!