Monday 28 April 2008

Bouncing back to social intelligence

I know it's only been a week or two since I last wrote about enterprise social intelligence, but I just had a sneak preview of the latest release from Trampoline Systems. This is the newest offering in the firm's ongoing quest to provide organisations with the tools to extract and use the latent knowledge of their workforce. And it's called Sonar Dashboard.



Following the firm's earlier releases – Sonar Server, the technology which effectively sucks all the information out of different corporate systems, and Flightdeck, the management diagnostics tool – Dashboard is what the end users actually get to play with.



Chief executive Charles Armstrong told me they've gone for the look and feel of a social networking site to keep user training to virtually zero. If you've been on Facebook or LinkedIn you'll have absolutely no problems using it. The main page looks a bit like a Facebook profile page, with regular updates on what all your contacts are currently working on.



Another page allows you to view the main topics a particular contact has been working on – the ones in larger and bolder type being those they've been involved in most often – and who they've been talking to. There's also an area where you can input information about yourself, upload a CV or connect to your LinkedIn profile. As is Trampoline's wont, relationships between contacts can be viewed in an easy-to-digest graphical format.
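
To give a flavour of the weighting idea, here's a minimal sketch – my own illustration in Python, not Trampoline's code – of how topic mention counts might be mapped to type sizes, tag-cloud style:

```python
# Minimal sketch (not Trampoline's code): scale a contact's topics so that
# more frequently mentioned topics render in larger, bolder type.
MIN_PT, MAX_PT = 10, 28  # smallest and largest font sizes in points

def topic_sizes(counts):
    """Map {topic: mention_count} to {topic: font_size_pt}."""
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # avoid division by zero when all counts match
    return {
        topic: round(MIN_PT + (n - lo) / span * (MAX_PT - MIN_PT))
        for topic, n in counts.items()
    }

print(topic_sizes({"mergers": 42, "privacy": 7, "sonar": 19}))
# {'mergers': 28, 'privacy': 10, 'sonar': 16}
```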



Privacy is also a key factor here. Users receive an email every two weeks listing the information that has been collected about the projects they're working on, and they can then choose which bits they want hidden from their contacts. In these days of uber-sensitivity about privacy and surreptitious data mining, it's an important part of the Trampoline jigsaw.
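
The opt-out step itself is simple to picture. A hypothetical sketch, assuming the system keeps a set of topics each user has ticked to hide (the function and field names are mine, not Trampoline's):

```python
# Hypothetical sketch of the opt-out step described above: before a user's
# activity is shown to contacts, drop anything they have chosen to hide.

def visible_activity(collected, hidden):
    """collected: list of (topic, detail) pairs gathered automatically;
    hidden: set of topics the user ticked in the fortnightly email."""
    return [(topic, detail) for topic, detail in collected if topic not in hidden]

collected = [("merger talks", "emails with finance"), ("team lunch", "calendar")]
print(visible_activity(collected, hidden={"merger talks"}))
# [('team lunch', 'calendar')]
```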



I guess the point with this release is that the vendor is trying to avoid the mistakes of the unwieldy knowledge management systems of the 90s by enabling the automatic extraction and updating of individuals' key information. In other words it does all the heavy lifting, which, as Armstrong says, is why a consumer-style social networking tool is no good for this purpose: it depends wholly on individuals to update their details themselves.


Friday 25 April 2008

Can the growth/sustainability circle be squared?

What does 'sustainability' mean to you? The more time I spend with IT vendors, the more I wonder if they see it as something for other people.


To quote from the frequently-cited Brundtland Report: "development that meets the needs of the present without compromising the ability of future generations to meet their own needs."


Lots of companies pay lip service to that but, somehow, it always results in them making more things for us to buy. It's as if they've had trouble getting to grips with the 'needs' part of 'meet the needs of the present'. Do we ever ask what we really 'need' in order to stay afloat in today's complicated world?


Of course, IT does have the potential to address the 98% of carbon emissions (say) that are not attributable to IT operations. Many manufacturers tell a good tale. They speak of reuse of components in future products, of their adherence to this and that regulation and, especially, of how the application of more IT will help cut someone else's environmental footprint.


In the end, though, there's no hiding the fact that they're still hooked on growth. Understandable, by the way. And while they could possibly achieve this through services, the global vendors see hundreds of millions of souls in developing countries as fine prospects for more 'things', even if they are made of fewer and more easily recycled materials.


Perhaps, by getting in early enough with IT equipment as a travel or printing substitute, for example, these vendors can help the developing countries avoid some of the excesses of the West. Frankly, I'm not optimistic, but I'd love to be proved wrong.

Thursday 24 April 2008

Library & Information Show, e-Book update

There was certainly a buzz around the NEC's Library & Information Show (23-24 April 2008). The topic of technology and e-books was one of the main concerns, writes Peter Williams.


This interest was reflected in the presentation by Caren Milloy, e-books project manager at JISC Collections, on JISC's national e-books observatory project. According to Milloy, interest in e-books in higher and further education extends beyond reference books to include textbooks. One of the problems is that the demand for e-books is hidden, so publishers and aggregators have been slow to meet it, and as a result the development of coherent and workable business models has been equally slow. One key frustration for information professionals is knowing what e-books are actually out there: there is no central site providing a cohesive list.


JISC's research shows that librarians in HE and FE are leading the demand for e-books, but that is a reflection of the demand they are hearing from students and, to a lesser extent, from teachers. The research tested the demand for e-books in business, engineering, medicine and media studies – a deliberately eclectic choice to see if, for instance, medical students differ from media students in their use of e-books.


Looking around the LIS it is clear that the market is responding to e-books. For instance, Swets is due to launch the latest addition to its SwetsWise platform, SwetsWise ERM, which helps information professionals keep track of the licences they have – and that includes keeping tabs on the right to browse e-books.


Information professionals' utopia for e-books (according to JISC research) includes elements such as concurrent usage, a free archive, common standards, great integration with virtual learning environments (VLEs) and great metadata which encompasses not only texts but multimedia, with open access. Simple, eh? According to JISC, e-books are a maturing market. That may be the case, but from the discussions at LIS there is still some way to go in widespread knowledge, understanding and adoption.

Wednesday 23 April 2008

Literacy Project encourages greater collaboration

According to a Google press release that landed in my inbox this morning, not only is 23rd April St George's Day, it is also the date on which both Miguel de Cervantes and William Shakespeare died in 1616.


Somewhat stretching this tenuous coincidence, Google announced that, in honour of such formidable writers, it was promoting 'innovative literacy and reading-related projects' through its Literacy Project initiative. Partners include the World Book Day organisation, Lit Cam and the UNESCO Institute for Lifelong Learning.


For those of you who haven't heard of it before, the Literacy Project is all about using the internet to connect likeminded literacy-promoting organisations so they can collaborate and communicate with each other. Today it gains some new tools to assist in exactly those kinds of initiatives. For example, the project's Literacy Map has been updated so organisations can post news on what they are working on, as well as talk to others through the Literacy Project forum.


What you might find of interest are the academic papers that explore ways of improving literacy. Within the literacy and technology section there is a whole host of grey literature material in Google Scholar. This mass of potentially valuable (but largely unpublished) information can originate from anything organisations produce – booklets, presentations and reports, to name but a few. Information literacy is well served here.


If you will now permit me to insert my own tenuous link, May’s issue of IWR will be examining the possibilities of using grey literature. There will be pointers on where to go and how to get that valuable but obscure information. It’s an underutilised resource and there is an awful lot of information out there ripe for the picking.


In the meantime, there's more on the Literacy Project here.

Monday 21 April 2008

Data, identity and Microsoft

Kim Cameron, Microsoft's chief identity architect, was over in the UK last week, talking to government, analysts, internal folk … and me. Most of his time is currently spent on the massive CardSpace project, which Microsoft hopes will have the same effect as putting chip and PIN on the internet – basically, it is being touted as the answer to our online identity verification woes.



Up until now, solutions to the problems of online fraud and even enterprise identity management have been less than perfect. One-time passcode generating tokens work OK, but there comes a point when your "fistful of dongles", as Cameron calls them, becomes too unwieldy. Cameron's answer revolves around the "Identity Metasystem" – his vision for the underlying architecture on which CardSpace is built. It is cross-technology and cross-provider, and as such probably stands the best chance of living up to its own hype.



It involves interaction between three different parties: identity providers, such as credit card companies, government, or even the individual consumer/web site visitor; relying parties, such as web sites, which require said identities; and subjects, which could be any party about which claims are made.



It can get rather complicated from here, but basically the CardSpace software stores references to a user's digital identities and presents them as so-called Information Cards. When a user visits a site that supports InfoCards, they are presented with the CardSpace UI, from which they can select the appropriate card. Once a card is chosen, the CardSpace software contacts that identity's issuer to obtain a signed token containing all the relevant information. It's all about borrowing concepts of trust and verification from the physical world and making everything as user-friendly as possible.
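
To make the moving parts concrete, here is a toy model of that three-party flow in Python – purely my own illustration of the pattern, not Microsoft's CardSpace API (which uses WS-* protocols and proper PKI rather than the shared HMAC key below):

```python
# Toy model of the three-party flow described above (illustration only):
# the relying party states which claims it needs, the user picks a card,
# and the card's issuer returns a signed token asserting just those claims.
import hmac, hashlib, json

ISSUER_KEY = b"issuer-secret"  # stands in for real PKI in this sketch

def issue_token(issuer_record, requested_claims):
    """Identity provider: release only the requested claims, signed."""
    claims = {c: issuer_record[c] for c in requested_claims}
    payload = json.dumps(claims, sort_keys=True).encode()
    sig = hmac.new(ISSUER_KEY, payload, hashlib.sha256).hexdigest()
    return {"claims": claims, "sig": sig}

def relying_party_accepts(token):
    """Relying party: verify the issuer's signature before trusting claims."""
    payload = json.dumps(token["claims"], sort_keys=True).encode()
    expected = hmac.new(ISSUER_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, token["sig"])

# The 'card' on the user's machine is just a reference to the issuer and the
# claims it can assert -- the sensitive data itself stays with the issuer.
issuer_record = {"over_18": True, "name": "A. User"}
token = issue_token(issuer_record, requested_claims=["over_18"])
print(relying_party_accepts(token), token["claims"])  # True {'over_18': True}
```

Note how the site learns only that the visitor is over 18, not their name – one way the design can avoid trading privacy for security.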



There are obviously serious data protection issues to be faced here too – as Cameron observed, in the past privacy has often had to be compromised to ensure security. It's an issue the team is well aware of: "If it's a spy machine then [this project] goes nowhere," observed Cameron. Well, thanks to some clever algorithms – isn't it always about algorithms these days? – they're able to deliver security without turning the system into that kind of spy machine. Don't ask me how, but it will be interesting to see if CardSpace succeeds where all other verification technologies have not.



IT services group Eduserv was represented at the meeting too, for the work the company is doing with CardSpace. It announced just last week that ten local councils are trialling the software – given the number of data loss incidents in recent times, it's reassuring that local councils are looking at innovative ways to tighten their security practices and ensure the secure sharing of, and access to, data. Practical, real-world applications like this of what is still a juvenile technology will be vital in the coming months and years to hone the technology and processes behind it and win over the sceptics.


Thursday 17 April 2008

Can computers really extract knowledge?

Knowledge management is theoretically impossible. Real knowledge sits between your ears, unseen until it is needed. As happened today. Someone mentioned Battenburg cake to me and all sorts of long forgotten knowledge about tea parties at my grandma's surfaced.


Not exactly a momentous bit of knowledge, but I joined a conversation on the subject on Facebook of all places. (The dyes in the cake are, apparently, dangerous.)


Recently, I visited a company that specialises in testing staff knowledge through questionnaires. The idea is to find out what an employee knows about their job and to determine whether there are any gaps that need filling or good results that need exploiting.


Boards of very large companies have rather taken to this system – a sort of asset register of the staff and their expected performance on the job. They can use it to correct weaknesses or develop strengths. And, should a crisis occur in a particular department, they can quickly pull up staff information to help them figure out what went wrong.


Test results can also be measured against averaged results for other organisations in the same industry - a sort of performance benchmark.


It all sounds terrific in theory. The underpinning technology is fundamentally sound. But, as always, the acid test is in the implementation. And that involves humans.


By the time the strategy and raw information have found their way to the question designers, all intimacy with the subject matter will have been squeezed out. It's like speaking a foreign language: it doesn't matter how perfect your accent, a native will know you are a foreigner within a very short time.


I've just read a blog post by a member of staff at the receiving end of an assessment run by this particular system. Slightly tidied up and anonymised, he said, "The people who designed the questions and answers knew nothing about my line of work. The end result has been questions that don't make sense or which are so ambiguous that one needs to be a professor of English to understand them".


You can see why I've not mentioned the company's name. I will return to it when I've tried the system myself and dug a little deeper into the particular circumstances behind the above comment. But it seems clear that one important step was forgotten: did they try the questionnaires out on people who understood the subject before letting them out into the wild?

Tuesday 15 April 2008

Understanding e-books and information behaviours

A typically well-attended e-book seminar on day one of the London Book Fair (LBF) raised some pertinent questions about e-book growth. Speakers this year were familiar faces such as David Nicholas, director of the School of Library, Archive and Information Studies at University College London (UCL), Sage Publishing's Rolf Janke, and Mark Carden from MyiLibrary – more on that later.


When I blogged on last year's LBF e-book seminar, the talk was of tipping points and a great increase in e-book activity. Over the last 12 months we have certainly seen that from publishers, who continue to march on with a plethora of digitisation initiatives and deals. Then there is the published research from the Centre for Information Behaviour and the Evaluation of Research (CIBER), in which Nicholas played a key part. This came in the form of the joint British Library/JISC report on the "Information behaviour of the researcher of the future".


These changing research behaviours include horizontal rather than vertical searching by the 'Google generation', and the viewing, rather than reading, of onscreen sources of information. Both should be considered when thinking about e-books, learning and the library.


Nicholas opened his presentation by discussing the JISC National e-Books Observatory study, one of the biggest of its kind in the world. This ongoing research has seen a range of e-textbooks placed into 120 UK universities. Once the study has run for two years, expect the wealth of e-book user information to yield some interesting findings.


Nicholas made some pretty honest points on what he thought needed to be considered. Users want 'quick information wins'; they want to 'bounce from one source to the other' and 'power browse'. e-Books, he said, "appeal to people wanting a bit of a book – not all of it, and everyone is just waking up to this".


While e-books are supposed to circumvent the traditional logistic problems of supplying each student with their core textbooks, Nicholas asked what happens when students get all their content this way. What does that mean for the library? Will they even need to come to the library anymore?


Before I attempt to answer that, I should point out that the publishers' presentations, by both Janke and Carden, had something to offer on this dilemma, albeit from their point of view.


Janke admitted that end users – both faculty and students – will go to Google and Wikipedia first for information, rather than to the library and therefore to e-books. His problem as a publisher was to ask: how do you get them to your content?


There was talk of various initiatives, business models and marketing plans, all of which involved the library and publisher making joint efforts to address this. Both Janke and Carden admitted that librarians complain of too many pricing models and collections, although in the experience of both, a one-size-fits-all approach won't be right for librarians either. As Janke pointed out, "Librarians say they aren't there to market publishers' content".


That’s interesting because as was said more than once during the seminar, users don’t care who the publisher is.


With e-books continuing to grow in popularity among both scholars and publishers, the traditional academic library will face challenges to how it works and what it should be there for. The way learning and the processing of information happen in scholarly circles has changed and will continue to do so.


There may be hard questions to ask about what the physical, as well as virtual, nature of academic libraries should be, and that could mean some big changes. But as Nicholas points out, we have "seen a frightening dumbing down of information seeking". There is still a significant and serious role for the library to play in all this. If there was ever a need for information professionals to take a leading role in addressing these issues, the time is now.

Monday 14 April 2008

WCM and web 2.0 - believe the hype

Web content management vendor Vignette has just released a new set of tools allowing firms to offer community and social networking functionality on their customer-facing sites and intranets. It seems the whole wonderful world of Web 2.0 is fast becoming the battleground for differentiation in the super-competitive enterprise content management (ECM) and web content management (WCM) space.



Vignette Community Services includes tools which help firms offer ratings, reviews, tagging and other functionality on their sites. The vendor is also providing content moderation tools – an essential feature for any firm which wants to allow user-generated content on its site. Community Applications, meanwhile, is all about collaborative functionality – think along the lines of blogs, wikis, forums and so on. So why is Vignette doing this? Well, despite all the hype surrounding Web 2.0, and despite the numerous conflicting definitions of what this over-used term actually means, there is some sound reasoning behind offering visitors to your web site a means to interact in a more meaningful and rewarding way.



Web 1.0 was all about firms forcing their messages and marketing on customers. Web 2.0 is all about them providing the means for customers to collaborate and participate, listening to what they want, and then delivering it. Vignette's Guy Westlake is right when he says that providing user-generated content functionality alone can do wonders to help strengthen brands and drive revenue, and provides vital feedback to help firms produce new goods and services.



The vendor is well on the way to achieving its vision for its underlying Web Experience platform, having previously released Vignette Recommendations – software which offers firms the means to personalise their customers' web experiences. Crucially, it works out what content to deliver from the user's intent rather than their history, and dynamically serves up relevant material. The firm has also made moves to ease the management of digital media assets by updating its rich media services product.
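
One plausible reading of 'intent rather than history' is to score content against signals from the current visit rather than a long-term profile. A toy sketch of that idea (my own illustration, not Vignette's algorithm):

```python
# Toy sketch of intent-based recommendation: rank catalogue items against
# keywords from the *current* session rather than the visitor's history.
from collections import Counter

def recommend(session_pages, catalogue, top_n=2):
    """session_pages: keyword lists from pages viewed this visit;
    catalogue: {item: keyword list}. Returns items best matching intent."""
    intent = Counter(kw for page in session_pages for kw in page)
    score = lambda kws: sum(intent[kw] for kw in kws)
    return sorted(catalogue, key=lambda item: score(catalogue[item]),
                  reverse=True)[:top_n]

session = [["laptop", "battery"], ["laptop", "review"]]
catalogue = {"laptop buying guide": ["laptop", "review"],
             "battery recall notice": ["battery"],
             "annual report": ["finance"]}
print(recommend(session, catalogue))
# ['laptop buying guide', 'battery recall notice']
```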



And Vignette isn't the only WCM firm doing this kind of thing. As the analyst community has noted, ECM and WCM functionality is rapidly becoming commoditised: look at most of the big players and they all offer similar things. So this is an opportunity for some early movers to steal a march on their rivals and give forward-thinking clients what they want – a means to create stickier sites that keep their customers coming back. It won't be the end of the Web 2.0 posturing, and eventually this functionality too will become standardised, but in the meantime expect to see all the usual content management suspects trying to differentiate along these lines.

Friday 11 April 2008

Racetrack memory to save us from ourselves?

Today's issue of Science carries a story about a new invention from IBM called Magnetic Domain-Wall Racetrack Memory. It works by storing data in a permalloy nanowire, a thousand times thinner than a human hair. It promises to increase reliability and speed of storage while slashing the amount of energy needed to power it.


The storage density can be hundreds of times that of today's best flash memories. And, because the wires can be made to loop up from the surface of the chip into the third dimension, they sidestep the well-known density restrictions facing conventional planar chips.


Without wishing to demean the technology, the storage wire acts like a chain of magnetised buckets. The read/write wires switch the polarity, or copy the state, of each bucket as it is propelled past using tiny electrical currents. Everything happens at the atomic level.
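
The bucket-chain analogy maps neatly onto a shift register, which makes it easy to model. A toy sketch in Python (my own illustration of the concept, not IBM's design):

```python
# Toy model of the 'chain of magnetised buckets' analogy: current pulses
# shift the magnetic domains along the wire past a *fixed* read/write head,
# instead of moving a mechanical head over the data as a disk drive does.
from collections import deque

class Racetrack:
    def __init__(self, bits):
        self.wire = deque(bits)  # magnetic domains along the nanowire
        self.head = 0            # the read/write head never moves

    def shift(self, steps=1):
        """A current pulse propels every domain along the wire."""
        self.wire.rotate(-steps)

    def read(self):
        return self.wire[self.head]

    def write(self, bit):
        """Flip the polarity of the domain currently under the head."""
        self.wire[self.head] = bit

track = Racetrack([0, 1, 1, 0])
track.shift(2)        # move the third domain under the head
print(track.read())   # 1
track.write(0)        # ...and overwrite it in place
```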


At the moment disk drives store information cheaply but, because they are mechanical, they are relatively slow at reading and writing. Flash memory is fast at reading but slow at writing. Both have reliability issues over the long term. IBM claims that racetrack memory will be inherently stable and durable.


An MP3 player that can contain half a million songs sounds ridiculous, but it gives an idea of the potential capacity of one of these devices. Because they will use a fraction of the energy of conventional disk drives while providing both speed and reliability, they appear to offer significant environmental benefits in all storage situations.


It's a shame we're going to have to wait seven to ten years for them. But, by going public with this development, IBM has just fired the first salvo in a new storage war. Others will be clamouring to get in first with high-density, highly reliable and energy-efficient devices.


This can only be good news for people like us whose lives are so information-centric.

Wednesday 9 April 2008

Unlocking information with social intelligence

A relative newcomer to the IT industry, Trampoline Systems, has just launched an even newer product called Sonar Flightdeck. So far so what, you might be thinking, but what Trampoline is claiming to offer is ever so slightly innovative. The vendor's aim is to help your organisation achieve that holy grail of harnessing the intelligence trapped within your workforce, boosting productivity and providing ready access to important information.



Chief executive of Trampoline, Charles Armstrong, believes a generational shift is occurring in the way businesses achieve productivity gains, and in many ways he is right. From around the 60s through to the early 2000s, process automation was the main means by which firms achieved these ends, but once everything has been automated, where do you turn? Well, the answer according to Trampoline is to the collective or social intelligence of your staff. This is enterprise social networking – understanding how individuals and groups within the organisation interact and the expertise and knowledge they possess.



Now this is way outside the comfort zone of your average organisation, but Armstrong says he has been "impressed and surprised" by the extent to which IT executives have warmed to these themes. It could be viewed as the slow creeping of social networking ideas into the enterprise, just as other primarily consumer-based technologies have crept quietly into business over the years.



So how does the technology work? Trampoline's offerings are all based on the Sonar platform, which mines the important information from users' data sources – emails, document stores, databases and so on – and runs some clever algorithms on it to deduce various things about the people involved. The end result is that you can view the information flows, areas of expertise and social networks in the organisation in a neat graphical display, and then refine the view by various criteria. Flightdeck is especially targeted at managers looking to see where the concentrations of expertise are and where the holes are, which Armstrong added is particularly popular with firms undertaking change management and mergers and acquisitions activities.
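
In outline, the approach looks something like the sketch below – a crude, hypothetical illustration of mining message metadata, not Trampoline's actual algorithms:

```python
# Crude sketch of the general approach: mine message metadata to build a
# social graph (who talks to whom) and per-person topic counts (expertise).
from collections import Counter, defaultdict

messages = [
    {"from": "ana", "to": "ben", "topics": ["merger", "due diligence"]},
    {"from": "ben", "to": "ana", "topics": ["merger"]},
    {"from": "ana", "to": "cai", "topics": ["hiring"]},
]

graph = defaultdict(Counter)      # who talks to whom, and how often
expertise = defaultdict(Counter)  # which topics each person handles

for m in messages:
    graph[m["from"]][m["to"]] += 1
    for topic in m["topics"]:
        expertise[m["from"]][topic] += 1

print(graph["ana"])      # Counter({'ben': 1, 'cai': 1})
print(expertise["ana"])  # Counter({'merger': 1, 'due diligence': 1, 'hiring': 1})
# A Flightdeck-style view could then flag topics with only one expert --
# a concentration of expertise, and a hole if that person leaves.
```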



But does this flash technology have a future, or is it all a bit too far out to win widespread organisational acceptance? Well, it's probably right to liken it to BI a few years back, which was being trialled only by a few very brave and forward-thinking firms – the same could very well be true of enterprise social intelligence. As Armstrong argues, perhaps the advent of a recession will give firms the push they need to investigate new ways of boosting productivity, getting more out of their existing human resources and becoming more agile. Perhaps.

BLOGOSPHERE Information professionals guiding you to the best bits of the blogosphere

Teased about having so much to say about Web 2.0 but not actually having a blog, Brian Kelly got cracking – so effectively that he won IWR’s Info Pro of the Year Award


Q. Who are you?
A.
My job title is UK web focus. I’m based at UKOLN and located at the University of Bath. UKOLN advises and supports the higher and further education communities and cultural heritage organisations. I support those communities in making best use of the web, and for the past two years, as everybody will know, that has focused on Web 2.0.


Q. Where is your blog?
A.
It is hosted on Wordpress.com (http://ukwebfocus.wordpress.com). You can also follow my microblog on Twitter (http://twitter.com/briankelly).


Q. Describe your blog
A.
It has several purposes including engagement with my user communities, dissemination about my activities, experimentation and thinking out loud. The blog reflects the work activities I am engaged in, including web accessibility, standards and Web 2.0.


Q. How long have you been blogging?
A.
Since 1 November 2006; it was my 10th anniversary of working at UKOLN. A year later and the blog contained 264 posts and 1,045 comments. I post three to four times a week, with the occasional break for holidays.


Q. What started you blogging?
A.
I took part in a Web 2.0 panel at the ILI 2006 conference. My fellow panellists – Paul Miller, Phil Bradley and Michael Stephens – teased me for being a prolific speaker on various aspects of Web 2.0 but not actually having a blog. Within a month, I’d launched my blog.


Q. Which bloggers do you watch, link to and why?
A.
John Dale of the University of Warwick (http://blogs.warwick.ac.uk/johndale) was a blogging pioneer in the UK university sector, and his blog illustrates how IT service managers can use blogs to engage with users. Another pioneer is the aptly named Michael Webb, whose blog provides “thoughts from IT and media service” at the University of Wales, Newport. His institution was probably the first in the UK HE sector to adopt a high-profile Web 2.0 strategy. Relatively new is the University of London Computer Centre’s Da Blog (http://dablog.ulcc.ac.uk), which helps bring an understanding of digital preservation issues to a Web 2.0 environment.


Q. Do you comment on other blogs?
A.
Absolutely. Initially this helped me get a feel for blogging etiquette and how to engage in discussions in the blogosphere. Now that my blog is well established I can spot the referrer links from bloggers commenting on my posts. Such alerting mechanisms enable me to engage in distributed conversations across the blogosphere. And my regular Technorati alerts for new blog posts containing search terms such as UKOLN or JISC help me to identify other discussions I may want to engage in. On a number of occasions I have spotted blog posts that have misinterpreted the contents of reports produced by colleagues in UKOLN. Being able to quickly spot and correct such posts can help prevent misunderstandings being propagated across the digital library community.


Q. How does your organisation benefit from your blog presence?
A.
The blog played a significant role in the award I received (IWR’s Information Professional of the Year) at the Online Information 2007 conference. The award helps to highlight my work but also related activities across UKOLN. UKOLN has a high profile across the international digital library community as well as the educational and cultural heritage sector nationally. My blog has significant numbers of readers in the US, Canada and Australia, and helps to maintain UKOLN’s presence across this international development community.


Q. Does your blog benefit your career?
A.
Publishing posts on a regular basis can be very stimulating. I have to keep informed about new developments and have something relevant to contribute to debates on use of the web. And as my blog is open to comments, I also have a valuable feedback mechanism which gives me a better understanding of the topics I raise.


Q. What has happened to you solely because of blogging?
A.
It helped me to win the IWR award. And I will shortly be running a blogging workshop at the Museums and The Web 2008 conference in Canada. But the best thing was the invitation I received to give a plenary presentation at the National Digital Archives Program 2008 conference in Taiwan in March 2008.


Q. Which blogs do you read for fun?
A.
I read, watch films, listen to music and go to the pub for fun! But I also use Twitter for informal discussions with my professional colleagues around the world. And I have found that my former colleague Pete Johnston has mastered the art of being witty in a mere 140 characters, as I described in a blog post (http://rurl.org/gie).


Q. What blogs in your sector do you trust?
A.
There are a great many blogs that cover my areas of interest, including web standards, web accessibility and Web 2.0 developments. I follow a number of blogs, but I also discover new ones by following links to my blog, links included in comments on my blog, and dynamic searches retrieved by my RSS reader. There’s a danger that relying on a small selection of trusted blog authors would lead to a sterile environment, and a failure to be open to new ideas and criticisms of current orthodoxies.

Friday 4 April 2008

Bite sized online learning from DTV

In our busy busy world we barely have time to think, let alone reflect and put aside enough time for learning.


As a trainer for many years, I have watched how it has become increasingly difficult to gather executives together for even half a day's training.


Deliverers.online is a company which, for the past ten years, has been designing and delivering bespoke corporate communications solutions and programmes to thousands of employees at UK and international brands including AstraZeneca, Schering-Plough, Virgin and HMV.


It recognised that an opportunity exists for highly focused, short and sharp professional development packages, and set about incubating a new company, now called Digital Training Videos, or DTV for short. It reckons that an eight-to-ten-minute video can be fitted into anyone's schedule. It's not the same as a live course, with its interaction and live Q&A, but it gets key points across in an effective way.


The videos cover topics such as personal and professional development, management, leadership, coaching, communications, customer service, sales, teamwork and sustainability. And, according to the blurb, they do this in a "fresh, entertaining, accessible and affordable way". (I'm still waiting for an answer on what 'affordable' means, but the signs are promising.) The videos can provide just-in-time learning or form part of an employee performance support system (EPSS) for large organisations.


DTV will reveal its first 20 products on 11 April at HRD 2008 at London's ExCeL centre.



Thursday 3 April 2008

Information should stay out of the skip

In case you’ve forgotten – or indeed if you never knew – may I remind you that 2008 is the national year of reading? writes Peter Williams.


This is a celebration of the cognitive process of understanding a written linguistic message (source www.wordwebonline.com) as opposed to a celebration of Reading in Berkshire, which according to wikipedia is the largest town in England.


The word of the week on the national year of reading web site (www.yearofreading.org.uk) is shenanigans ('suspicious goings-on or mischievous fun', so the site says), which seems apposite given the fuss that has been generated on The Times letters page over books and libraries in the last few days.


Peter Kinsley, a correspondent to the editor of The Times, alerted me to the national year of reading in his letter. However, the main point of the missive was to protest at the alleged dumping of books in a landfill site which is claimed to have happened recently both in Wiltshire and Waltham Forest. The crime was compounded by the allegation that many of the books had plenty of useful life in them.


This letter prompted a response from Roy Clare CBE, chief executive of the Museums, Libraries & Archives Council. As head of the body responsible for promoting best practice in the 3,500 public libraries in England, Clare informed the readers of The Times that it did not condone the disposal of books in landfill sites. Indeed, Clare suggested that 'the electorate should hold councils accountable for lapses in quality, cost-efficiency and propriety'. The MLA is, however, available to offer advice on innovation and improvement to library stocks – and presumably does not hold the phone numbers of too many skip hire businesses.


The exchange of letters provoked reactions expressing horror at both the vandalism and the waste represented by casting books away. Many of us find it difficult to dispose of even the most ill-regarded, trashiest airport-purchase novel. Equally, there was a lot of information about what could be done to recycle and reuse unwanted books as an alternative to throwing them away.


The question information professionals, in whatever sector they work, must be able to answer is this: how can they dispose of material which the organisation no longer requires in a way which does the least damage to the environment and which may do maximum good to other potential information users? The answer may not be easy to ascertain and disposing of unwanted material in a sensitive way may come at a price in terms of time and money. But surely information professionals don’t think the best disposal option is a skip and landfill.


By the way, the Library & Information Show (www.lishow.co.uk), run by IWR's publisher, is on at the end of April, where I'm sure the issue of environmentally sound libraries for the 21st century will be firmly on the agenda.

Wednesday 2 April 2008

Comprende computer?

"The myth of fully automated translation is just that – a myth. Languages are just too complex for us to be able to automate the whole process," said Mark Lancaster, chief executive of SDL, a UK commercial translation company, in today's FT.


If, as an information professional, you are ever tasked with scouring a foreign-language source for vital research data, you may have needed to use professional translation services. But what if the amount of data you need to wade through doesn't make using a team of humans time- or cost-effective for your limited budget? Is automated or machine translation (MT) an option?


MT is certainly not perfected. It has the difficulties Lancaster describes and, as the article points out, significant human input is still required.


For any information professional (those working in patent research, for example), attempting to understand, as well as navigate, a foreign-language resource will be a difficult challenge. Language barriers researchers can come up against include correctly translating sentence structures and word meanings that differ from those of their native tongue. There are many complex factors to consider when relying on MT to handle the subtleties Lancaster mentions.


Using the patent example, a researcher could be searching for a vital, if obscure, piece of information from a South East Asian country. The structure of the Korean, Japanese and Chinese languages is fundamentally different from that of their western, Latin-based contemporaries. The FT article highlights the considerable time taken by "Japanese double byte type projects". Double byte, the FT says, refers to "the number of bytes required to code Japanese, Chinese or Korean characters – English is a single-byte language".
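
The byte arithmetic is easy to check. The FT's figure refers to legacy double-byte encodings such as Shift-JIS; in UTF-8, now common on the web, the gap is wider still:

```python
# Quick illustration of the single- vs multi-byte point.
for text in ["patent", "特許", "특허"]:  # English, Japanese, Korean for 'patent'
    print(text, len(text), "chars,", len(text.encode("utf-8")), "bytes in UTF-8")
# patent 6 chars, 6 bytes in UTF-8
# 特許 2 chars, 6 bytes in UTF-8
# 특허 2 chars, 6 bytes in UTF-8

# In the legacy double-byte encoding the FT describes, each character is two bytes:
print(len("特許".encode("shift_jis")), "bytes in Shift-JIS")  # 4
```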


This all reminds me of a session on MT and patent search I blogged about at the Information Retrieval Facility Symposium (IRFS) in November. There can be several ways to express the colour red in Korean, and the spacing of a Chinese word in a sentence can alter its meaning significantly, so when translating into a differently structured language there are added complications to contend with. The speakers at the event detailed these differences exquisitely, highlighting exactly what is required to get around them and what kind of work still needs to be done. Admittedly, in the patent case, how consistently the patent data was filed was important in achieving good results.


European Patent Office (EPO) head of research Dr Barrou Diallo, the lead speaker on MT at IRFS, gave quite a detailed breakdown of the issues involved. His department develops various tools that focus on helping examiners in their information retrieval searches.


Diallo admits there are plenty of hurdles to be overcome in the EPO's five-year MT research initiative. But he believes the lessons learned from translating European patent information, as well as the technology developed, mean existing translated material is already mature enough to improve the efficiency of patent workers. The EPO is also better placed to deal with the complex differences between European and South East Asian information sources.


The nice thing about Diallo's presentation is that the quite complex difficulties MT needs to overcome are explained in depth. They are challenging (especially to the layperson) and highlight the mountain still to climb. But the positive efforts of European, Korean and Chinese contemporaries in such a testing area of research are worth listening to.


Links to Diallo's twenty(ish)-minute presentation, as well as others, are here, along with accompanying slides.