Tuesday 18 December 2007

You pay for what you get

Civil servants are reeling in the wake of the horrific news that CDs containing the records of Her Majesty's Revenue and Customs (HMRC) database have been lost, and the further news of DVLA data being lost. The full cost to tax paying members of the public may not be fully realised for years to come.


This debacle is not only an example of incredibly poor information management, but also a sign of a wider problem in the UK, that you get what you pay for. Or in this case you don't get what you pay for. 


Information management is, or rather was, at the heart of British life. Travel to former colonies like India or Australia and they'll gladly inform you of the regimented behaviour towards information that led to government structures that have served the sub-continent and prison colony well to date. Yet, those standards have dropped.


As we debated the issue, an IWR reporter asked: how come information of this value was so easy to simply download and burn to a CD? Technology that prevents such blunders is not new and is a basic function of many information management systems.


Revelations of the missing information came a day after a report on the BBC's Today programme that employees of the Driving Standards Agency and the vehicle licensing body, the DVLA, take on average three weeks' sick leave a year. Missing information and low staff morale are examples of a civil service that is poorly funded and poorly managed.

It is too easy to wag the finger of blame at civil servants, when in truth a much wider debate needs to take place.  As tax payers and child benefit recipients we are angry and worried, as information professionals we are dumbfounded that such lapses could have occurred.  What of our role as citizens?  Since the 1980s we've wanted a John Lewis service, but only paid Tesco value brand prices.  If you want John Lewis quality, you pay John Lewis prices.  On the high street this modus operandi fits well with the public, as they choose when they want quality and when they want to increase their spending. So why is it that we expect our state services to manage high level information on a low level budget?

This needs to be a debate about our society and its values, literally, as well as an improvement in information management.

How will a slowdown affect ECM decision making?

With 2007 all but put to bed, a lot of people are looking forward to 2008 with not a little trepidation. And the cause of that fear is the prospect of a decline in the macro economy.


I'm no economist and have a healthy scepticism for many self-proclaimed experts, but there is a lot of consistency in terms of the impact on IT buying. Down markets can be very healthy for technology buyers and very disruptive for the incumbents. In the early 1990s, for example, companies like Microsoft, Dell and Oracle were able to win chunks of market share as firms looked to take advantage of the move to client/server architectures, and shift away from mainframes.


In the wake of the dot-com collapse, software-as-a-service companies such as Salesforce.com took advantage of the squeeze on capex spending by offering subscription-based pricing that let firms get projects up and running very quickly.


In ECM, it's pretty obvious that firms offering low-cost, fast-deployment options stand to prosper if the economy struggles. That could be good news for SharePoint, Alfresco and companies offering freebie tools such as IBM Yahoo OmniFind and Microsoft's Search Server.

Friday 14 December 2007

pdf forgeries

It's not every day an email starts like this:

PDF files have essentially become the standard within the business community because of the need for a protected file. However, the software is useless if users need to edit the document in any way.

Oh ho ho (he says, seasonally), does this mean the sender has a way to edit pdfs undetectably? This I must see. The software is called deskUNPDF Professional and it comes from Docudesk. It promises to:

convert PDFs into Word documents, view data in XML-format, or convert the files into HTML for a web presentation

It actually throws the result out in a huge variety of standard formats including images, csv and Sony Reader's lrf.


The danger appears to be that you could 'round trip' a pdf by exporting to word or an image file, fiddle around a bit and then print to pdf using one of the many pdf writers around.


Fortunately for people trying to protect their pdfs, the exercise proved less than satisfactory. Forgeries are evident. So the real value of the software is that you can move a pdf into another format.


Below, I've round-tripped an image page and a text page from the World Wildlife Fund's "Sustainability at the speed of light". It is utterly evident that I've been up to no good. And that, I believe, should be proclaimed as a valuable feature of the software.


Here's the original and the pdf output going via Word:


[Image: Compare1]


It's shrunk and there's a bit of textual overlap. The Word export has strange column breaks and the different text blocks appear to be in the wrong sequence when editing.


In honour of the Kit Kat tv commercial where the pandas roller-skate when the cameraman's not looking, I thought I'd do a bit of panda substitution using the GIMP.


[Image: Compare2]


I didn't add the speed streaks (and I cannot repeat the phenomenon) but they do look rather nice. However, even had my panda been better executed, this is clearly no way to forge a pdf.


Bear in mind that a lot of pdfs are protected by copyright and you need to be sure you're not going to land yourself in hot water by republishing. (Hopefully my snippets aren't going to get me into trouble.)

Tuesday 11 December 2007

Information professionals guiding you to the best bits of the blogosphere

Ben Toth reveals how he keeps his information intake healthy and why blogging can be more valuable than social networks such as Facebook.

Q Who are you?
A Ben Toth, 48, domiciled on a farm in Herefordshire. I trained as a librarian at University College London about 15 years ago. I used to be the director of the NHS National Knowledge Service when it was part of Connecting for Health. The best known service it runs is the National Library for Health (www.library.nhs.uk). Currently, I’m designing the enterprise architecture for the National Institute for Health Research (www.nihr.ac.uk). I’m also writing a book on Health 2.0, which will be published in parts later this year.

Q Where is your blog?
A You’ll find it at http://nelh.blogspot.com

Q Describe your blog and the categories on it
A It’s just a public notebook really. Its content tends to reflect what I’m working on, but it’s mostly about libraries, health and the web. I could use Microsoft Word to keep my notes. I could use del.icio.us. But a blog is more visible and more in the flow of the things I’m reading, which are almost invariably on the web. A lot of the entries I make are just notings – highlight, right-click and send to Blogger. I use tags but I’m not very strict about categorising things.

Q How long have you been blogging?
A Since about 2001. Eighteen months ago I lost all my entries and had to start again.

Q What started you blogging?
A I was helping my daughter set up a website as part of a Brownie project she was doing. I couldn’t use the National Electronic Library for Health servers and I didn’t want to manage Apache or pay someone to, so we used Tripod. Which worked, but it was difficult to use. And then I read about Evan Williams’ little project, which became Blogger, had a go with it, and haven’t looked back. It’s become a habit, and I haven’t got tired of it yet.

Q What bloggers do you watch and link to, and why?
A These days I follow things through RSS if I can, so my blog-watching is mostly via a feed reader. The only blog I regularly visit is Dave Winer’s (www.scripting.com) because he’s taken blog writing to a level where the argument is developed through the day and so needs to be read on the page. I look at Techmeme (www.techmeme.com), but that’s not really a blog. I used to maintain a list of blogs that I linked to through blogrolling, but I can’t see the point of doing that any more. The social web takes care of that sort of affiliation-showing much better.

Q Do you comment on other blogs?
A I don’t comment much. Sometimes I carp from the sidelines on e-healthinsider (www.e-health-insider.com), but I don’t think there’s much value in commenting or reading comments. That’s not to say that discussion isn’t valuable, but I’d rather read views as blog entries than as comments on someone else’s blog.

Q How does your organisation benefit from your blog presence?
A It’s the best way of keeping in touch with what’s going on, and keeping a blog maintains some visibility to people.

Q How does blogging benefit your career?
A Blogging and RSS are really important for me professionally. They keep me up to date in a way that nothing else can.

Q What good things have happened to you solely because you blog?
A Making professional contacts that I otherwise wouldn’t have and maintaining ones that might otherwise have fallen off. In some ways blogging is more useful than LinkedIn and Facebook as a social networking tool. But it’s really only a matter of time until traditional blogging gets divided up between Facebook, Vlogging and Twitter.

Q Setting work aside, which blogs do you read just for fun?
A The Fake Steve Jobs blog was great (http://fakesteve.blogspot.com). And when I need a chuckle, I check out the Dilbert RSS feed (http://dwlt.net/tapestry/dilbert.rdf).

Q What are the blogs in your sector that you trust?
A The reliably interesting starting points on library matters for me are:
www.earlham.edu/~peters/fos/fosblog.html
http://orweblog.oclc.org
www.philbradley.typepad.com
http://tomroper.typepad.com
And Jon Udell is a first-class technologist who happens to like libraries (http://blog.jonudell.net)

Time for ECM to stop selling fear

Over at CMS Watch, Tony Byrne mentions that he has heard the term “risk of incarceration” being used by reps as a spin on the older and somewhat more traditional meaning of ROI, “return on investment”. That shouldn’t surprise you too much on at least two counts.


First, enterprise content management (ECM) companies have become accustomed to pitching content management as a tool to “keep the CEO out of jail” by providing an audit trail of programs, files, messages and their associated creation, viewing, editing and other interactions. The selling spiel says: “Remember Enron? You don’t want to end up like that so you’d better have a good document management and retention strategy. Oh and if you were scared by Sarbanes-Oxley, there’s a ton more of that stuff coming down the line.”


Second, the company Byrne says he heard using the term was IBM, the company that was the originator, of course, of selling “fear, uncertainty and doubt”.


As I said at the top, I’m not at all surprised but if sarcasm is the lowest form of wit, then this is certainly among the lower echelons of effective sales and marketing. It gets an instant reaction but you might struggle to sell it twice, as many firms are finding out the hard way.


For the last five years, ECM vendors have been touting regulatory compliance and reputation risk as threats to business. This was a crude but effective weapon in the down market after the dot-com collapse but, having made fortunes from scaring the bejeezus out of firms, the sales guys could really do with a hose down and a fresh approach.

Friday 7 December 2007

Icelandic data refuge

We're used to offshoring work to other countries to achieve cost reductions or follow-the-sun working, or both. We're also used to having some of our computing activities and services hosted remotely through web hosts, Salesforce.com, Google, Facebook et al.


Some of these companies run huge data centres and they are concerned about continuity of energy supplies. Some are siting themselves near renewable energy sources, others moving to where they can get the stuff cheap or to locations where they can avoid declaring their energy consumption. (True, but you can do your own research on that one.)


Anyone who's a serious consumer of power is trying to find ways to get the consumption down. Hardware and software suppliers are having a great time selling virtualisation software, efficient new kit and clever new cooling systems. And, when customers have all done this and got themselves sorted out, they'll still find that the need for computing resources will grow and they'll have to find new ways to cope.


Well, with a hat tip to an announcement by Data Íslandia and Hitachi Data Systems, another possibility has surfaced. Why not move all the hosts to safe countries where natural energy abounds? The 'safe' is probably the main challenge. If you think 'solar' then some of the hottest countries also happen to be the least desirable from this perspective.


Iceland, on the other hand, has a rather unusual combination of plenty of renewable geothermal and hydro-electric energy coupled with a cool climate. It has a technically literate population and it is relatively secure. No-one seems to want to invade it, for example.


Data Íslandia specialises in providing disk- and tape-based long-term data archiving services. Yesterday's Hitachi deal is based on its data management services which will enable multinational organisations to address the management, compliance and environmental burden of exploding data volumes. Data Íslandia director, Sol Squires, says "virtualising six-month old information, which is effectively digital toxic waste, is a very poor use of resources." Customers will be able to offload stale data while still having real-time access to it.


No doubt there are a million political reasons why I'm wrong, but Iceland strikes me as an environmentally agreeable and secure place to house our national digitised libraries. Maybe our tax records too.

Thursday 6 December 2007

IWR Information Professional of the Year Award

The IWR American Psychological Association Information Professional of the Year award has been announced and went, deservedly, to Brian Kelly, UK Web Focus for the UKOLN organisation.


The award is judged by a panel of previous winners and the IWR editorial team. As editor of IWR, when I judge the award I look for an individual who is pushing the limits of information and technology, taking the role of the information professional as far as possible and making it an exciting one. When looking through the final results I could see that the other judges felt the same way and Brian was an excellent choice.


Brian's role is that of a national web co-ordinator, an advisory post funded by the educational body JISC and the Museums, Libraries and Archives Council (MLA).


In this role Brian is looking at the web as a central resource for learning and research in higher education, and at ways to make it a successful resource. It is a challenging role, because the web is still very young and is constantly changing, as the recent changes dubbed Web 2.0 show. Brian is going to be pretty busy for some time to come.


Brian is based at the University of Bath, and I know from information professionals I have dealt with in the academic sector that he is very well respected and that his thoughts are often the basis for great debate within the industry. Linked to this is his blog, which is one of the most popular in the sector.


I hope all IWR readers will join me in congratulating Brian on a very well deserved award.

Tuesday 4 December 2007

Jimmy Wales on the role of Wikipedia in society

Jimmy Wales, chairman of Wikipedia, gave the keynote speech at Online Information 2007 with a presentation titled Web 2.0 in action: Free culture & community on the move.


Starts with Britannica editor Charles van Doren, who said in 1962 that the encyclopaedia should be radical, but Wales claims they have been anything but.


Small showing of hands for those that have edited, although Wales believes it’s a good showing, "but not as many as college kids".


"I consider us to be the Red Cross of information," he says as he describes its charitable status. Have 10 full-time staff and will spend about $2 million to $3 million this year, which is tiny compared to the major publishers. Vast majority of the money is from small donations, which he likes because it's grass roots and not dependent on advertisers.


Wales talks about the desire to extend the languages that are in use on Wikipedia, including Hindi and Afrikaans.


Wiki is free in the sense of GNU: it's free to copy, modify and distribute.


Shows a video of his travels to India and how he learnt that the local communities want to use the English version, as the English language is a route out of poverty. His organisation has been out to South Africa teaching students how to edit Wikipedia. "One of the things we have learnt is that if you can get five to 10 editors working together, it can make a great difference." These groups make progress and then they look towards outreach and who they can include. Hence the organisation has set up an academy to find the founding editors. It has begun in India, with 10-20,000 articles a month being put together by academy organisations.


Wikia is his next subject, a separate organisation with 66 languages, including a 67th, Klingon. Wales goes on to demonstrate using Google search results for Muppets: the top result is the official site, but the rest of the results are from web-based conversation, ie Wikipedia pages, forums and fan sites. He shows an article on the Ford motor company and points out that the Muppet Wiki site has an article on Muppet Ford ads, a level of information that would never have been available before.


The search engine is a political statement, in a small P sense, Wales says. The proprietary software of the main players is a mystery in that people have no control of the accountability. The Wikia search will publish its algorithm.


Wales believes that the trust of social networks and setting up trusted networks can be utilised in search.


On the role of collaboration, he asks the audience to imagine that they are designing a restaurant, discussing the idea that we trust the people around us: we don't put people in cages in restaurants because they will be using knives.
The wiki philosophy is to allow people to do good.

ECM needs to get usability - fast

New research from Oracle and IDG suggests that firms are failing to capitalise on unstructured content. Well, with Stellent now added to its acquisitions mountain, Oracle would say that, wouldn’t it? But the data is interesting nonetheless.


According to the report, two-thirds of “senior IT decision makers” in Western Europe think they have the unstructured data issue managed, or are on the right track to cracking the problem. The flip side of that is that 60 per cent say they can’t make business decisions based on unstructured data because it is either too hard to find or because it is sitting among other, irrelevant data.


The average organisation surveyed had 4.28 ECMs in place (!) with many, unsurprisingly, seeking to consolidate. Oracle suggests that this “raises the question as to whether European organisations actually understand that unstructured content is an enterprise-wide issue that requires a strategic enterprise-wide solution”.


That’s a dodgy conclusion. The proliferation of ECMs (and ERPs, databases, BI systems etc) might be better accounted for by the crazy growth patterns and the pace of change in modern technology-driven business. When Oracle itself came along with client/server databases, few smart companies said “sorry, we’ve already standardised on DB2 on the mainframe”.


One other data point is worth examination: 63 per cent of European enterprises “consider email as the primary source for managing unstructured content, with 86 per cent admitting that email is used as the primary source for sharing content”.


That’s refreshingly open but it’s not as “surprising” as Oracle suggests that email is often a vehicle for decision making. The fact that many of us use email as our primary means not only of communication but also of knowledge management, contact information and much else is as much an indictment of ECM usability as anything else.


This research is clearly Oracle positioning itself as the company capable of making ECM palatable for mainstream businesses who are dissatisfied by the big incumbents. Fair enough, the more ECM matters get an airing the better.


But it also suggests to me that ECM is still in its infancy. Alfresco’s John Newton is fitting ECM with social networking integrations to reflect his belief that ECM users will move from being 10 per cent of the organisation to over half of users. This Oracle data backs up the hunch that ECM might have to change fast to fit in with the way users want to work, rather than asking users to adapt to what software designers say is right.