Blogging Archives - Page 2 of 5

Reducing website errors with HTTP 301 redirects

Posted on Wednesday 19 September 2012 By Mark Wilson

This content is 13 years old. I don't routinely update old blog posts as they are only intended to represent a view at a particular point in time. Please be warned that the information here may be out of date.

A couple of weeks ago, I wrote about a WordPress plugin called Redirection. I mentioned that I’ve been using this to highlight HTTP 404 errors on my site but I’ve also been using the crawl errors logged by Google’s Webmaster Tools to track down a number of issues resulting from the various changes that have been made to the site over the years, then creating HTTP 301 redirects to patch them.

Redirections as a result of other people’s mistakes

One thing that struck me was how other people’s content can affect my site – for example, many forums seem to abbreviate long URLs with … in the middle. That’s fine until the HTML anchor gets lost (e.g. in a cut/paste operation) and so I was seeing 404 errors from incomplete URLs like http://www.markwilson.co.uk/blog/2008/12/netboo…-file-systems.htm. These were relatively easy for me to track down and create a redirect to the correct target.

Unfortunately, there is still one inbound link that includes an errant apostrophe that I’ve not been able to trap – even using %27 in the redirect rule seems to fail. I guess that one will just have to remain.

Locating Post IDs

Some 404s needed a little more detective work – for example http://www.markwilson.co.uk/blog/2012/05/3899.htm is a post where I forgot to add a title before publishing and, even though I updated the WordPress slug afterwards, someone is linking to the old URL. I used phpMyAdmin to search for post ID 3899 in the wp_posts table of the database, from which I could identify the post and create a redirect.

Pattern matching with regular expressions

Many of the 404s were being generated based on old URL structures from either the Blogger version of this site (which I left behind several years ago) or changes in the WordPress configuration (mostly after last year’s website crash). For these I needed to do some pattern matching, which meant an encounter with regular expressions, which I find immensely powerful, fascinating and intimidating all at once.

Many of my tags were invalid as, at some point, I obviously changed the tags from /blog/tags/tagname to /blog/tag/tagname but I also had a hierarchy of tags in the past (possibly when I was still mis-using categories) which was creating some invalid URLs (like http://www.markwilson.co.uk/blog/tag/apple/ipad). The hierarchy had to be dealt with on a case by case basis, but the RegEx for dealing with the change in URL for the tags was fairly simple:

Source RegEx: (\/tags\/)
Target RegEx: (\/tag\/)

Using the Rubular Ruby RegEx Editor (thanks to Kristian Brimble for the suggestion – there were other tools suggested but this was one I could actually understand), I was able to test the RegEx on an example URL and, once I was happy with it, that was another redirection created. Similarly, I redirected (\/category\/) to (\/topic\/).
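For anyone who would rather test this kind of rule in a script than in Rubular, here is a minimal sketch in Python. It assumes Python's `re` module as a stand-in for the Redirection plugin's own matching (which may not behave identically), and the `rewrite_legacy_path` helper and the example URL are made up for illustration.

```python
import re

# Old path segments and their replacements, taken from the rules above.
# The rule list and the helper name are illustrative, not part of Redirection.
LEGACY_SEGMENTS = [
    (re.compile(r"\/tags\/"), "/tag/"),
    (re.compile(r"\/category\/"), "/topic/"),
]


def rewrite_legacy_path(url: str) -> str:
    """Return the URL with old /tags/ and /category/ segments replaced."""
    for pattern, replacement in LEGACY_SEGMENTS:
        url = pattern.sub(replacement, url)
    return url


# Hypothetical legacy URL, used only to exercise the rule.
print(rewrite_legacy_path("http://www.markwilson.co.uk/blog/tags/apple"))
```

URLs that already use /tag/ or /topic/ pass through unchanged, because neither pattern matches them.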

I also created a redirection for legacy .html extensions, rewriting them to .htm:

Source RegEx: (.*).html
Target RegEx: $1.htm

Unfortunately, my use of a “greedy” wildcard meant this also substituted html in the middle of a URL (e.g. http://www.markwilson.co.uk/blog/2008/09/creating-html-signatures-in-apple-mail.htm became http://www.markwilson.co.uk/blog/2008/09/creating-.htm-signatures-in-apple-mail.htm), so I edited the source RegEx to (.*).html$.
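The effect of the end-of-string anchor can be checked with a short Python sketch. This again assumes Python's `re` module rather than the plugin's engine, so the exact mangled URL produced by the unanchored rule may differ from the one quoted above; the sketch only checks whether the URL was altered. The anchored pattern here also escapes the dot (`\.`), which is my own tightening rather than the rule as written, since an unescaped dot matches any character. The `to_htm` helper and the `example-post.html` URL are hypothetical.

```python
import re

UNANCHORED = re.compile(r"(.*).html")   # the first attempt, as written above
ANCHORED = re.compile(r"(.*)\.html$")   # escaped dot plus end-of-string anchor


def to_htm(pattern: re.Pattern, url: str) -> str:
    """Apply a .html -> .htm rewrite rule to a URL."""
    return pattern.sub(r"\1.htm", url)


legacy = "http://www.markwilson.co.uk/blog/example-post.html"  # hypothetical
mid_url = ("http://www.markwilson.co.uk/blog/2008/09/"
           "creating-html-signatures-in-apple-mail.htm")

print(to_htm(ANCHORED, legacy))               # extension rewritten
print(to_htm(ANCHORED, mid_url) == mid_url)   # True: left alone
print(to_htm(UNANCHORED, mid_url) == mid_url) # False: URL was mangled
```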

More complex regular expressions

The trickiest pattern I needed to match was for archive pages using the old Blogger structure. For this, I needed some help, so I reached out to Twitter:

Any RegEx gurus out there who fancy a challenge, please can you help me convert /blog/archive/yyyy_mm_01_archive.htm to /blog/yyyy/mm ?

Mark Wilson (@markwilsonit), Tuesday 18 September 2012 17:23 via TweetDeck

and was very grateful to receive some responses, including one from Dan Delaney that led me to create this rule:

Source RegEx: /blog\/([a-zA-Z\/]+)([\d]+)(\D)(\d+)(\w.+)
Target RegEx: /blog/$2/$4/

Dan’s example helped me to understand a bit more about how match groups are used, taking the second and fourth matches here to use in the target, but I later found a tutorial that might help (most RegEx tutorials are quite difficult to follow but this one is very well illustrated).
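To see what each match group captures, the rule can be run through Python's `re` module. This is a sketch under the assumption that Python's engine treats this pattern the same way the plugin does; the `redirect_archive` helper and the 2008_12 archive URL are illustrative, chosen to follow the yyyy_mm_01 pattern from the tweet. Python writes the target's `$2` and `$4` as `\2` and `\4`.

```python
import re

# Dan's source pattern, unchanged.
ARCHIVE_RULE = re.compile(r"/blog\/([a-zA-Z\/]+)([\d]+)(\D)(\d+)(\w.+)")


def redirect_archive(url: str) -> str:
    """Rewrite an old Blogger archive URL to /blog/yyyy/mm/."""
    return ARCHIVE_RULE.sub(r"/blog/\2/\4/", url)


old = "/blog/archive/2008_12_01_archive.htm"  # illustrative URL
match = ARCHIVE_RULE.search(old)

# Groups 1-5: folder, year, separator, month, remainder.
print(match.groups())
# ('archive/', '2008', '_', '12', '_01_archive.htm')

print(redirect_archive(old))
# /blog/2008/12/
```

Only groups 2 and 4 (year and month) are reused in the target; the others exist to consume the rest of the old URL so that it is replaced in full.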

A never-ending task

It’s an ongoing task – the presence of failing inbound links due to incorrect URLs means that I’ll have to keep an eye on Google’s crawl errors but, over time, I should see the number of 404s drop on my site. That in itself won’t improve my search placement but it will help to signpost users who would otherwise have been turned away – and every little bit of traffic helps.

Redirection – an essential plug-in for WordPress users

Posted on Wednesday 5 September 2012 By Mark Wilson

Last year, a combination of a loss of service from my hosting provider and my appalling backups meant that this website was temporarily wiped off the face of the Internet. It’s never recovered – at least not in terms of revenue – and it taught me an important lesson about backups (it’s all too easy to forget the hours of effort that go into a “hobby” site like this one…).

The blog posts were restored, and I took the opportunity to apply a new theme to the site (it’s probably due another one now…), but some of the images had gone AWOL along the way. I’ve been ignoring that (mostly) but decided I really should do something about it when an old post was picked up by a journalist today and I realised it had a missing graphic.

I remembered a WordPress plugin that I used on another site recently, for managing redirects when access to the .htaccess file is not available. The plug-in, written by John Godley, is called Redirection, and one of its modules will report on HTTP 404 errors, like the ones that my missing graphics will create. I know there are other tools that can do this for me (Google’s Webmaster Tools, for example, or trawling through the web logs) but it’s an easy way to see when a 404 has been returned in order to investigate accordingly. So far this afternoon, I’ve tracked down and replaced around 8 missing graphics and one broken permalink using the logs from Redirection. I’m now scanning through the rest of John’s plugins to see what else I’m missing and will certainly be donating later…

Disabling comments for all posts on a WordPress blog

Posted on Tuesday 17 July 2012 By Mark Wilson

Long-time readers of my blog will know that I used to manage the Fujitsu UK and Ireland CTO Blog (which we’ve recently closed, but have left the content in place for posterity) and I’m still getting the comment notifications (mostly spam). Many of the posts have HTTP 301 redirects to either my blog or David Smith’s (I found a great WordPress plugin for that – Redirection) but, for those that remain, I wanted to turn off comments. Doing this individually for each post seemed unnecessarily clunky but there is, apparently, no way to do this from the WordPress user interface (with database access it would have been straightforward but I don’t have that level of access).

There is a plug-in that globally disables all comments – named, rather aptly, Disable Comments – except that the blog is part of a multi-site (network) install and I’m not sure what the broader impact would be…

No bother, I found a workaround – simply set all of the posts to close comments after a certain number of days. The theme that someone has applied to the site (since I stopped working with it) doesn’t seem to respect that, and still leaves a comment button visible, but anyone with a well-developed theme should be OK…

Adding extra social sharing services to WordPress with JetPack (ShareDaddy)

Posted on Wednesday 9 November 2011 By Mark Wilson

Last night, as part of the rebuild of this site, I reinstated the social sharing links for each post. In the old site they had been implemented as bespoke code using each social network’s recommended approach (e.g. Twitter or Facebook’s official button codes) but presentation becomes problematic, with each button having a slightly different format and needing some CSS trickery to get it right.

I looked into a variety of plugins but they all had issues – either with formatting or functionality – until I stumbled across reference to WordPress.com’s social sharing capabilities. If only I could have that functionality on a self-hosted (WordPress.org) site…

…As it happens, I can – WordPress.com’s social sharing is based on the ShareDaddy plugin, which is part of a collection called JetPack. ShareDaddy is also available as a freestanding plugin but now I have JetPack installed I’m finding some of the other functionality it gives me useful (and it’s not possible to activate ShareDaddy if you have JetPack installed).

I need to make some changes (like working out how to hack the code and turn off the count next to my Tweet/Like/+1 buttons – it’s embarrassing when the number is small!) but I’m happy enough with the result for now. One thing I did need to do though was to add some services that are not yet in the JetPack version of the plugin (one of the major advantages of ShareDaddy is how simple it is to do this).

The first service I added was Google +1, for which Di Turner has produced a plugin to extend ShareDaddy.
The other important one for me is LinkedIn (it’s in ShareDaddy but not yet in the JetPack version). Ryan Markel has created a post to describe the process for adding custom sharing services (Paul Robert Lloyd’s social media icons are useful for this) and he’s listed the settings for some services. Brad Dalton has listed some more, including LinkedIn (the one I needed).
Finally, I found a forum post from airodyssey with details of an improved Print option, using the PrintFriendly service.

Rebuilding my site: please excuse the appearance

Posted on Friday 7 October 2011 (last modified Tuesday 18 September 2012) By Mark Wilson

Regular readers may have noticed that this site is looking a little… different… right now.

Unfortunately, my hosting provider told me last night that they had a disk failure on the server. Normally that wouldn’t be a problem (that’s why servers have redundant components, right? Like RAID on the disks?) but it seems this “server” is just a big PC. I can’t get too mad though… the MySQL database backup scripts have been failing for a month and it was my sloppiness that didn’t chase that up, and it was me who hadn’t made sure I had a recent copy of the file system…

So, as things stand:

I think I have restored all posts ~~from 2004 until almost the end of August 2011~~;
~~I need to restore the later posts and comments (using copies from FeedBlitz, Google Reader, etc.);~~
~~There are no plugins (so things look odd);~~ Some of the plugins have been reinstalled (but things may still look odd);
~~There are no graphics (they were hosted outside WordPress)~~ I’ve restored ~~all~~ most of the graphics and other external media but there are still some I need to track down;
I have not restored the theme (~~so I’m using the WordPress defaults and~~ there is no mobile theme);
~~The theme I’m using does not specify UTF-8 encoding so lots of Â characters;~~ Still some spurious characters appearing on some pages…
There are ~~no~~ fewer ads (which you might be happy about, but I do still need to pay the bills).

Please bear with me whilst I get things back… it may take some time as it needs to fit in between other activities but it might also be a good thing (new theme has been long overdue and I might even get smarter about my backups…).

And, if you spot another problem, please let me know.

[Updated at various points as the site has been restored]

Attempting to track RSS subscribers on a WordPress blog

Posted on Monday 9 May 2011 (last modified Sunday 8 May 2011) By Mark Wilson

As well as my own website (which has precious little content these days due to my current workload), I also manage the Fujitsu UK and Ireland CTO Blog. Part of that role includes keeping an eye on a number of metrics to make sure that people are actually interested in what we have to say (thankfully, they seem to be…). Recently though, I realised that, whilst I’m tracking visitors to the blog, I’m missing hits on the RSS feed (because it’s not actually a page with the tracking script included) – and that’s a problem.

There are ways around this (I use Google Feedburner on my own blog, or it’s possible to put a dummy page with a meta refresh in front of the feed to pick up some metrics) but they have their own issues (for example, the meta refresh method breaks autodiscovery for some RSS readers) and will only help with new subscribers going forwards, not with my legacy issue of how many subscribers I have right now.

There is another approach though: using a popular web-based RSS subscription service like Google Reader to see how many subscribers it tracks for our feed (the same metrics are available from Google’s Webmaster Tools). The trouble is, that’s not all of the subscribers (for example, a good chunk of people use Outlook to manage their feeds, or other third-party RSS readers). If I use my own blog as an example, Google Reader shows that I have 247 subscribers but Feedburner says I have 855. Those subscribers come from all manner of feed readers and aggregators, email subscription services and web browsers (Firefox accounts for almost 20% of them) so it’s clear that I’m not getting the whole picture from Google’s statistics.

Google Reader Subscribers

Google Feedburner Subscribers

Does anyone have any better ideas for getting some subscriber stats for RSS feeds on a WordPress blog using Google Analytics? Or maybe from the server logs?

Google Analytics: Honing in on the visits that count

Posted on Wednesday 30 March 2011 (last modified Thursday 31 March 2011) By Mark Wilson

Every week I create a report that looks at a variety of social media metrics, including visits to the Fujitsu UK and Ireland CTO Blog. It’s developing over time – I’m also working on a parallel activity with some of my marketing colleagues to create a social media listening dashboard – but my Excel spreadsheet with metrics cobbled together from a variety of sources and measuring against some defined KPIs seems to be doing the trick for now.

One thing that’s been frustrating me is that I know a percentage of our visits are from employees and, frankly, I don’t care about their visits to our blog. Nor for that matter do I want my own visits (mostly administrative) to show in the stats that I take from Google Analytics.

I knew it should be possible to filter internal users and, earlier this week, I had a major breakthrough.

I created an advanced segment that checked the page (to filter out one blog from the rest of the content on the site) and the source (to filter anyone whose referral source contained certain keywords – for example our company name!). I then tested the segment and, hey presto – I can see how many results apply to each of the queries and the overall result – now I can concentrate on those visits that really matter.

Google Analytics advanced segment settings to remove internal referrals

Of course, this only relates to referrals, so it doesn’t help me where internal users access the content from an email link (even if I could successfully filter out all the traffic via the company proxy servers, which I haven’t managed so far, some users access the content directly whilst working from home), but it’s a start.
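The logic of that segment (a page condition combined with a source exclusion) can be sketched in a few lines of Python. This is only an illustration of the idea, not how Google Analytics implements segments; the `Visit` type, the `in_segment` function, the `/cto-blog/` path and the `examplecorp` keyword are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    page: str    # request path of the page viewed
    source: str  # referral source reported for the visit


def in_segment(visit: Visit, page_prefix: str, excluded_keywords: list[str]) -> bool:
    """Keep visits to one blog whose source contains none of the keywords."""
    if not visit.page.startswith(page_prefix):
        return False
    source = visit.source.lower()
    return not any(keyword.lower() in source for keyword in excluded_keywords)


visits = [
    Visit("/cto-blog/some-post", "twitter.com"),
    Visit("/cto-blog/some-post", "intranet.examplecorp.com"),
    Visit("/products/", "google"),
]
kept = [v for v in visits if in_segment(v, "/cto-blog/", ["examplecorp"])]
print(len(kept))  # 1
```

As the paragraph above notes, a visit with no referral source at all (an email link, say) would still pass this check, which is why the segment is only a partial fix.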

The other change was one I made a few months ago, by defining a number of filters to adjust the reporting:

Firstly, I exclude traffic from my home IP address (this has become easier now as it no longer requires the creation of regular expressions).
I also force request URIs into lower case.
Finally, I make sure that the full referral address is shown.

Unfortunately filters do not apply retrospectively, so it’s worth defining these early in the life of a website.
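As a rough picture of what the first two filters do to the data, here is a small Python sketch. It is not Google Analytics code: the `apply_filters` function and the hit dictionaries are hypothetical, the address 192.0.2.10 comes from a range reserved for documentation, and the third filter (showing the full referral address) is not modelled.

```python
import ipaddress


def apply_filters(hits: list[dict], excluded_ip: str) -> list[dict]:
    """Drop hits from one IP address and lower-case the remaining URIs."""
    excluded = ipaddress.ip_address(excluded_ip)
    kept = []
    for hit in hits:
        if ipaddress.ip_address(hit["ip"]) == excluded:
            continue  # filter 1: exclude traffic from the given address
        # filter 2: force the request URI into lower case
        kept.append({**hit, "uri": hit["uri"].lower()})
    return kept


hits = [
    {"ip": "192.0.2.10", "uri": "/Blog/Admin"},
    {"ip": "198.51.100.7", "uri": "/Blog/Some-Post"},
]
print(apply_filters(hits, "192.0.2.10"))
# [{'ip': '198.51.100.7', 'uri': '/blog/some-post'}]
```

Lower-casing matters because /Blog/Some-Post and /blog/some-post would otherwise be reported as two separate pages.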

London Bloggers Meetup (#LBM): January 2011 – 10 lessons and tips for blogging

Posted on Thursday 13 January 2011 (last modified Tuesday 8 February 2011) By Mark Wilson

A couple of nights ago, I went along to the London Bloggers Meetup – which is basically a big social event for bloggers! It’s weird: most of the bloggers I normally meet are in tech, and I’d never stopped to think that an event like this doesn’t just attract geeks like me (duh!). I’m a bit shy at these things, but I did meet some great people – as well as lusting after the Dell Vostro V130 laptop that was given away.

The highlight of the evening though was Andy Bargery’s short presentation giving 10 lessons and tips for blogging. Andy has shared the Prezi and I’ve embedded it below:

10 Lessons & Tips on Blogging on Prezi

Blog Recap: 2010 in review

Posted on Friday 31 December 2010 (last modified Tuesday 18 September 2012) By Mark Wilson

A couple of days ago, SQL Server MVP Brent Ozar took a look back at what he’d been posting on his blog in 2010. I thought that was a good idea, so I’m shamelessly stealing it to highlight some of the key posts from the last twelve months on this blog. There were many more technically-focused ones, but these are a good summary of the year’s events:

January

A look forward to SharePoint 2010 – Microsoft’s collaboration platform continues to improve (but so much of the benefit comes down to how it’s implemented).

February

Safer Internet Day: Educating parents on Internet safety for their children.

March

How UK iPhone users can save money – switch to a SIM only deal at the end of the contract.
Desktop virtualisation shake-up at Microsoft
Three phases of Microsoft support (often misunderstood).

April

Office 2010 is released to manufacturing – including some useful resources for those looking at deploying a new Office release.
Introducing Microsoft PowerPivot – a possible answer for collating data from across the organisation?

May

Overview of Windows Phone 7 – some of the details may have changed between this post and launch, but it explains what Microsoft is trying to achieve.
Highlights from the second Dell B2B Social Media Huddle (#dellb2b) – I’m hoping there will be a third one soon!
So you think social media is a fad?

June

Why CEOs don’t blog/tweet (with thanks to Rob Shimmin, who presented this at the Dell B2B Social Media Huddle).
How to be an Internet private eye (based on another session at the Dell B2B Social Media Huddle).
Lies, damn lies, and Apple marketing – how can someone with as much Apple kit as me be called an #ihater?
Installing Ubuntu on Windows Virtual PC – it’s harder than it should be, but it is possible.

July

Move along folks, nothing to see here (well, there were a couple of posts, but nothing really worth shouting about)…

August

Publishing: yet another industry clinging on to an outdated business model (and in danger of falling into the same traps as the music industry).
Yikes! My computer can tell websites where I live (thanks to Google) – Internet privacy is an oxymoron.
Playing with video on the iPad (aka jumping through hoops because of the lack of Flash support…)
Running Spotify and other apps as background tasks on an iPhone 3G (with iOS 3.x – because iOS 4 is too slow for old iPhones).

September

Jailbreaking does not equal piracy (although, from reading the consumer-focused media, you’d be forgiven for thinking it did).
Hyper-V R2 Dynamic Memory: over-subscription vs. over-commitment – trying to cut through the FUD and explain the differences between good and bad resource allocation.
Keeping Windows alive with curated computing – how the applications store model could potentially increase software quality and breathe new life into an aging operating system.

October

How Steve Ballmer told me what to do with my iPad – it seems that Microsoft still believe Windows is a suitable choice for tablets…
After 3 months with my iPad, was it still a good purchase?
Windows Phone 7 will fail if the channel is not ready – let’s hope I’m wrong about the failure… but the channel was certainly not ready!

November

Getting hands on with Windows Touch (with a monitor on loan from HP…)
How tablets will disrupt desktop managed service delivery – a look at why next generation tablets (such as the iPad) have the potential to shake up end user computing.
easyJet’s journey into the clouds – a look at how one of the UK’s leading low-cost carriers has adopted cloud computing within its IT strategy.
Six months to set up a new blog – what were you doing man? (aka, why there hasn’t been much blogging around here recently – I’ve been setting up a new blog platform at work).

December

Tumbleweed (and some geekery) – although there are plenty of posts in the pipeline for next year.

Even though 2010 was a quiet year on the blog (120 posts this year is a record low – especially when considering I averaged almost one a day in 2008!), I did win a Computer Weekly Blog Award, and I have been busy elsewhere:

I switched roles at Fujitsu, moving out of a technology-focused role and into one which concentrates on thought leadership and innovation. As part of that, I’ve been working on getting the Fujitsu UK and Ireland CTO blog off the ground – including editing a fair amount of the content there.
I’ve also seen the last of the videos I produced for Microsoft go live (running Hyper-V Server from a USB drive)
Mark Parris and I continue to try and run Windows Server User Group events. We’ve experienced some “difficulties” this year but it looks as though things are changing for the better at Microsoft UK and hopefully the remaining blockers will be removed soon…
I’ve been a part of most of the IT TweetUps (#ITTU) that we’ve run this year.
I was re-awarded my Microsoft Most Valuable Professional (MVP) status for the third year running.
I gave a presentation on Internet Safety to parents at my son’s school.
I’ve also been prolific on Twitter (@markwilsonit) and I’ll try and post some Twitter highlights over the next few days.

As for 2011, well, expect this blog to remain one of my main online activities but, as I spend less time working directly with technology and more working on strategic IT issues, the focus is changing. Indeed, some people think blogging is dead (it’s not) – others say it is now more about content marketing! Whatever the semantics, I’ll be here for a while yet. Thanks to everyone who reads my “stuff” and engages with me – whether it’s as a blog comment, an e-mail or a tweet – and have a happy and prosperous 2011.
