Sunday, September 23, 2012

Sorting the Stuff Out at World Service & Popup Audio

The Next Web did some interesting and useful interviews in Hall 14 at IBC a few weeks back, including one with George Wright from the BBC R&D department. In the interview above, he describes the challenges facing the BBC World Service in cataloging its vast audio archive of features going back several decades. As with most radio archives, the metadata (descriptions of what was on the tape before it was digitized) is highly variable - most of it lacks detail and some of it is actually wrong. They are using around 300 volunteers in the Global Minds Audience Research Panel to help with pilot tests to improve the metadata. Jamillah Knowles wrote the following after interviewing George.

The International Broadcasting Convention (IBC) took place recently in Amsterdam. Throughout the event, The Next Web streamed live interviews with industry players with the help of LiveU. The BBC was present at the convention doing some great work with archives. George Wright, Head of Prototyping, BBC Research and Development, and his team are working with sixty years’ worth of World Service archives; that’s around 500 terabytes of audio. The aim of the work is to help users find what they need more easily. The historical importance of the collection is considerable and it will become far more useful once it has been properly tagged with data that is searchable.
The archives have almost no metadata, so the team has created a speech recognition system which goes through the archive and adds tags so that users can navigate. Along with the machine recognition, listeners are volunteering to correct and add tags to ensure that it is all correct. The R&D team at the BBC built its speech recognition system on top of existing open source software. The audio from BBC World Service has its own idiosyncrasies that make speech recognition a tricky prospect for accuracy. If you have heard past broadcasts from the global radio network, you’ll spot that people spoke English in a quite different way in the 50s in comparison with the language used today. Add this to the difficulties in recognising proper nouns and foreign place names and you can see what the software is up against. Once the material is properly tagged, it can be used in a number of ways for re-broadcasting, primary source research and to add value to future broadcasts. The archive material is mostly programming and features rather than news reports, but the standard of interviews and the historical value of the archive are unquestionable.
I think it is interesting to connect this to a group which won one of the Knight Foundation grants on Saturday at the Online News Conference in San Francisco.
Pop Up Archive seems to be tackling similar problems, but for audio content recorded by production houses working for US National Public Radio.

"At its core, Pop Up Archive addresses the challenge of enabling any producer to share digital audio content in ways that are meaningful and useful to the public, without the need to employ an archivist. The system also provides a well-documented method for independent producers to store and access content: media files are seamlessly uploaded to the Internet Archive for permanent preservation with the option of social sharing through SoundCloud, all at no cost to the user. The initial phase of Pop Up Archive resulted in software plug-ins scheduled for summer 2012 release through Omeka, an open-source web publishing platform. Phase two of the project focuses on data standards across organizations and web service needs.
Pop Up Archive addresses the growing desire in public media to engage with audiences online in new ways. Our system reinforces existing audience participation habits and encourages new forms of participation, including re-use of content by other independent producers and oral history collections that may lack institutional support and resources. Allowing listeners to help organize content enables producers to understand how audiences interpret their work. At the same time, we recognize the importance of the authority upon which archival descriptions are historically based. To that end, files are catalogued with authoritative metadata in accordance with recognized public media standards."
