TMForum and Open Innovation

Our CEO, Dean Elwood, will be in Dublin next week at the TMForum “Management World” conference, where he will be participating in a keynote panel. The conference opens with keynotes on Monday 21st.

Dean is also appearing on Wed 23rd in Brussels at the European Open Innovation Summit, speaking about innovation in the enterprise, along with O2.

You can find Dean on Twitter as @deanelwood.

Posted in Events, Industry

Threnody for a Microwave Dish

BT Tower, with dishes, in 2011

The Telecom Tower is a London icon, and embodies the incredible power that state-owned telecoms wielded back in the 1960s. It was originally the GPO tower (General Post Office), long before the privatization boom of the 1980s, when the Post Office was split up and British Telecom was formed. In the BT era, the tower has remained largely closed to the public, although kudos is due to ITSPA, who organized a reception there last year. The view from the top was splendid.

In recent months, the tower has received something of a reconfiguration, with the large microwave dishes being removed from the outside of the building. It is somewhat ironic that the building was designed so tall expressly because it would be the line-of-sight long distance link for microwave bearer services including voice and television. In the age of fibre-optics, these services have become obsolete, and in fact the tower is a focal point for a lot of voice traffic that arrives now via fibre, and much of that will be carried as IP traffic.

The dishes were removed a few months ago (see the Daily Telegraph story here), which involved building a platform and cutting the dishes into pieces in situ so that they could be brought down inside the building. Interestingly, English Heritage needed to give permission for this work to be done, since the building (and dishes) have been there for long enough for the building to be “listed”, i.e. officially protected due to its heritage value.

View to the South (Centrepoint and London Eye to the left)

In some ways, the BT tower is a white elephant in this age, not just because of obsolete microwave technology, but also because market deregulation means that BT will often be buying long distance services from other providers (e.g. from wholesale VoIP companies) who will have their own fibre infrastructure. However, it is a unique and beautiful part of the London skyline and long may it tower over us.

Posted in Digital Communication, Industry

Magnum Opus

A new draft arrived this month for the Opus specification, or the Internet Audio Codec, whose home page you can find here. This creditable effort will deliver a codec capable of both narrowband and wideband performance, and it will be scalable to carry not just speech but even music across the internet. Some commonly used codecs (like AMR used in 3G and MP3 used, well, everywhere) are encumbered with patents, so it’s not always possible to build what you want, especially if you have a business model that cannot support high per-station licensing fees at the start.

It’s quite cool the way the scalability works, too. Most codecs today are not pure waveform coders (like the storage of bits on a CD), but nor do they entirely synthesize speech. They are somewhere in the middle, and this is helpfully known as “hybrid”. Using a mathematical model of the human speech system, a hybrid codec can make a very compact description of the sounds, and send this instead of a slavish description of the audio waveform. This part of the Opus algorithm is provided by the Skype SILK algorithm (something we already know well at Voxygen), and this provides wideband performance for speech, up to 8kHz of bandwidth, double that of a “normal” telephone call. In some literature, “wideband” is reserved for a higher range, around 16kHz, but actually 8kHz is pretty good compared to what telephone users are used to.

When even further quality is needed, Opus engages a parallel channel using something called the MDCT (Modified Discrete Cosine Transform). The MDCT is used in encoding MP3 music, and its close relative the DCT is used in encoding JPEG images. It’s a computationally expensive process, but has the useful property that it lets the encoder keep the most important perceptual information and discard the rest. When you look at a JPEG image, it is not a faithful representation of the original bits (say from the CCD in the camera), but what has been discarded is information you would notice the least in any case. The same is true for MP3: relative to the music on a CD, MP3 is much more compact (at least 1/3, perhaps 1/10 the size), and this compression is in part from (perceptually unimportant) information that has been discarded. Running the MDCT algorithm in realtime was not always a possibility, but this is testament to the plentiful and cheap CPU that is available in our mobile phones, tablets and computers.
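To make the MDCT a little more concrete, here is a minimal Python sketch of the textbook transform. It is not the Opus implementation, and the sine window and the block size of 8 are assumptions chosen for illustration. Each block of 2N samples produces only N coefficients, yet overlapping blocks by half and adding the decoded results restores the signal. The transform on its own loses nothing; in a real codec the saving comes from coarsely quantising (or dropping) the coefficients that matter least to the ear.

```python
import math

def sine_window(length):
    """Sine window; w[i]^2 + w[i + length/2]^2 == 1, as overlap-add requires."""
    return [math.sin(math.pi * (i + 0.5) / length) for i in range(length)]

def mdct(block):
    """Forward MDCT: 2N (already windowed) samples in, N coefficients out."""
    n = len(block) // 2
    return [
        sum(block[i] * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
            for i in range(2 * n))
        for k in range(n)
    ]

def imdct(coeffs):
    """Inverse MDCT: N coefficients in, 2N time-aliased samples out.

    The 2/N scaling suits the case where the same window is applied
    before the MDCT and again after the inverse.
    """
    n = len(coeffs)
    return [
        (2.0 / n) * sum(coeffs[k] * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
                        for k in range(n))
        for i in range(2 * n)
    ]

def round_trip(signal, n):
    """Encode and decode `signal` using half-overlapped blocks of 2n samples.

    Assumes len(signal) is a multiple of n.
    """
    window = sine_window(2 * n)
    # One hop of silence at each end, so every real sample sits in two blocks.
    padded = [0.0] * n + list(signal) + [0.0] * n
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n + 1, n):
        block = [padded[start + i] * window[i] for i in range(2 * n)]
        decoded = imdct(mdct(block))
        for i in range(2 * n):
            # Overlap-add: the aliasing in neighbouring blocks cancels out.
            out[start + i] += decoded[i] * window[i]
    return out[n:n + len(signal)]

signal = [math.sin(0.3 * i) + 0.5 for i in range(32)]
restored = round_trip(signal, 8)
max_error = max(abs(a - b) for a, b in zip(signal, restored))
print("largest reconstruction error:", max_error)
```

The reported error should be at the level of floating-point rounding, which shows that any audible loss in a codec comes from what is done to the coefficients, not from the transform.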

So the MDCT encodes high frequencies (above 8kHz), and the hybrid encoder handles the lower 8kHz. The decoder combines the two streams to give something like the original audio experience at the receiving end. Opus is pretty interesting in embracing everything from low quality voice all the way to maximal music quality, depending on application needs, network capacity and available CPU. Wideband audio research was in a pretty dead period until the last few years, but the convergence of factors like wireless, open source and the internet has brought a lot of brilliant new ideas forward.

Posted in Digital Communication, Industry, Mobile, Multimedia, Voxygen Tech

Emergency and the New Wireless

ZigBee is the most famous Personal Area Network (PAN) technology that no-one has heard of. Like Bluetooth (which people have heard of), it belongs to the IEEE 802.15 family of standards, in ZigBee’s case using a physical interface called 802.15.4. This means that it can sit in the 2.4GHz “license exempt” spectrum, and so (like WiFi) allows communication systems to be built without licensing conditions from spectrum managers like Ofcom.

ZigBee is interesting in that it has a lightweight stack (perhaps only 10% of the complexity of Bluetooth); it is low-power, and prices are falling fast. A ZigBee sensor module can be bought over the counter for around £25 today, and in large numbers it will be much lower. It is one of the technologies under consideration for smart meters, or in other words, how the utility meters in your home will in the future communicate in real-time with the electricity or gas companies.

ZigBee has a self-organising capability: i.e. left unbidden it will identify its ZigBee neighbours and use them to relay messages to where they need to go. So although ZigBee is a short-range system, the range can be extended by creating dynamic, ad-hoc networks.
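The relaying idea can be sketched in a few lines of Python. This is an illustration only: real ZigBee stacks use their own route-discovery protocol, whereas here a plain breadth-first search stands in for it, and the node names, positions and 10-unit radio range are all made up for the example.

```python
import math
from collections import deque

def neighbours(nodes, radio_range):
    """Map each node name to the nodes within direct radio range of it."""
    table = {name: [] for name in nodes}
    for a, (ax, ay) in nodes.items():
        for b, (bx, by) in nodes.items():
            if a != b and math.hypot(ax - bx, ay - by) <= radio_range:
                table[a].append(b)
    return table

def route(table, source, target):
    """Return a fewest-hops path from source to target, or None if unreachable."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in table[node]:
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None

# Hypothetical layout: four nodes in a line, 8 units apart, range 10 units.
nodes = {"A": (0, 0), "B": (8, 0), "C": (16, 0), "D": (24, 0)}
table = neighbours(nodes, radio_range=10)
print(route(table, "A", "D"))  # ['A', 'B', 'C', 'D']
```

A cannot reach D directly, but each node can reach the next, so the message is relayed through B and C. If a node disappears, rebuilding the neighbour table and asking for the route again either finds another path or reports that none exists.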

One area that we’ve researched at Voxygen is disaster communications, for example the possibility to drop in a GSM wireless base-station at the scene of a disaster zone where explosion or extreme weather has disabled the communication infrastructure. There is a lot of potential in this area to exploit battery-driven, self-organising systems to bring some kind of infrastructure back online in a hurry.

The EU PEACE project has funded a number of research initiatives in emergency communications, including ad-hoc routing techniques in self-organising networks. One interesting scenario considered was emergency workers (e.g. firefighters) working on a site in small groups. Imagine that each one is equipped with a wireless node, and that networking between them is done in an ad-hoc, self-organising manner. In a damaged building there might be obstacles such as walls and floors to disrupt communication, or electromagnetic sources of interference: in such a situation it is an advantage to have a communication system that is constantly identifying reachable neighbours to allow messages to be routed out by the best available path.

ZigBee wasn’t designed for voice specifically and in fact the target bandwidth is only 250kbps, so there isn’t a lot of headroom to spare. However, simulations have been done by some researchers [Wang et al. 2008] which show that at 3 hops over ZigBee it is possible to support a low-rate codec, G.729a, with tolerable quality. Reduce that to a half-duplex Push-to-Talk system, and it’s possible to push that range up a bit further.
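A back-of-envelope calculation shows why the hop count matters. This sketch is not a reproduction of the cited simulations. It assumes that every hop shares the same radio channel (so each relay uses up airtime again), that speech is sent as one packet per 20ms, and that each packet carries 30 bytes of headers. The header figure in particular is a hypothetical round number, and the estimate ignores acknowledgements, channel contention and retransmissions, all of which make things worse in practice.

```python
CHANNEL_BPS = 250_000      # nominal ZigBee rate quoted in the text
CODEC_BPS = 8_000          # G.729a speech rate
PACKET_INTERVAL_MS = 20    # assumption: one packet per 20 ms of speech
OVERHEAD_BYTES = 30        # assumption: hypothetical per-packet header cost

def airtime_fraction(hops, two_way=True):
    """Rough share of the channel one call occupies across `hops` relays."""
    payload_bytes = CODEC_BPS * PACKET_INTERVAL_MS // 8000   # 20 bytes of speech
    packet_bits = (payload_bytes + OVERHEAD_BYTES) * 8
    packets_per_second = 1000 / PACKET_INTERVAL_MS
    directions = 2 if two_way else 1
    return packet_bits * packets_per_second * hops * directions / CHANNEL_BPS

for hops in (1, 3, 5):
    print(hops, "hops:",
          round(airtime_fraction(hops), 2), "two-way,",
          round(airtime_fraction(hops, two_way=False), 2), "push-to-talk")
```

Under these assumptions a two-way call over 3 hops already occupies roughly half the nominal channel, while push-to-talk, where only one direction is active at a time, needs about half of that, which is why it leaves room for more hops.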

ZigBee technology hasn’t yet reached the smartphone, but the potential is huge when it does. Then the smartphone itself becomes a gateway between ad-hoc and established infrastructure (UMTS, LTE, GPRS), and the drop-in base station becomes almost a disposable item in terms of cost. With disaster so much on the agenda in recent years (touching extreme weather, nuclear disaster, war and humanitarian response) these are simple technological tools that can have a big impact.

Posted in Industry, Mobile, Voxygen Projects

Big Data and the ‘Call Me’ Gap

GPO Phonebox

Voice is the communication method that just won’t die. No matter how many different messaging and collaboration systems and social networks are presented, we often end up “just talking about it”. It comes to the point; it allows us to add nuances of certainty and emotion to the conversation, and it’s a rapid feedback loop that often allows us to close a long message trail.

From the “big data” perspective, though, voice is an “out-of-band” technology that hasn’t been recorded and can’t be used as part of the processing. It isn’t hard to imagine a situation where your calls into a call centre would be recorded, and the information added to the context that the company already has: profile, purchasing record, email trails and so on. Google (and Twitter and other social companies) spend a lot of time and effort using tools like Hadoop to mine information and create a context; asking questions about the data that go above and beyond what can be expressed in SQL queries. Recorded voice can similarly be processed by speech recognition tools and subjected to analysis and contextualisation. For example, Voxygen has already demonstrated the ability to search recorded conversations by keyword, which is a basic step in associating a conversation with its rightful place in the “big data” record of a particular individual, or in identifying common “memes” that group people together.
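As a rough illustration of keyword search over recorded conversations, the sketch below builds a simple inverted index in Python. It is not a description of Voxygen's system: it assumes a speech recogniser has already turned each call into text, and the call IDs and transcripts are invented for the example.

```python
import re
from collections import defaultdict

def build_index(transcripts):
    """Map each lower-cased word to the set of call IDs whose transcript contains it."""
    index = defaultdict(set)
    for call_id, text in transcripts.items():
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            index[word].add(call_id)
    return index

def search(index, *keywords):
    """Return the sorted call IDs whose transcripts contain every keyword."""
    matches = [index.get(word.lower(), set()) for word in keywords]
    return sorted(set.intersection(*matches)) if matches else []

# Hypothetical transcripts, as a speech recogniser might produce them.
transcripts = {
    "call-1": "I want to renew my contract",
    "call-2": "My broadband contract is too expensive",
    "call-3": "Please cancel the order",
}
index = build_index(transcripts)
print(search(index, "contract"))               # ['call-1', 'call-2']
print(search(index, "contract", "broadband"))  # ['call-2']
```

Once calls are indexed like this, the call IDs can be joined against the rest of a customer's record (profile, purchases, email trails) in the same way as any other data source.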

Not that we are suggesting a new “Stasi”, where information is voluminously (and often unnecessarily) kept as a tool of control. Some governments will doubtless try this, as the software tools and massive disk capacities are there at their fingertips. Certainly there are privacy concerns, and controls need to be placed in the hands of the users. The advantages could be great, though: in our vision, processing voice as big data is a tool to make the relationship smarter between individuals and a business or agency.

Many smartphone-equipped people now share their location frequently using tools like Foursquare and Facebook, but politician Malte Spitz went a step further by obtaining usage information from Deutsche Telekom, and then making it available for voters to examine. This is a nice example of accountability, but only scratches the surface in terms of harnessing the usefulness of this kind of data. From an individual point-of-view, this would be a much more useful tool for Spitz if it recorded, transcribed and mined voice conversations, and indexed SMS messages. With time and place as context tools, the data becomes an invaluable database of reminders. It’s an external memory. Veteran Microsoft researcher Gordon Bell has for many years been recording everything about his life in a giant database, including audio and video. It’s a cybernetic memory where everything can be indexed and searched, unconcerned by the limitations of human memory.

From a corporate governance point-of-view the lack of coverage of voice (what we refer to as the “Call-me” gap) is a concern. Many tout VoIP PBX recording solutions as the answer to the “Enron” situation, where the “call-me” gaps were often crucial to the misdeed (incidentally the NYT has a nice article on how non-keyword methods are being used for searching big data for legal cases). VoIP recordings are fine, but we need to go beyond simple storage and into analysis and contextualisation. When Google index your email or page views they are doing more than building a boring history of “you did that; then you did this”, they are seeing patterns. Often the patterns are beneficial from the point-of-view of “herd immunity”: Gmail can identify spam by seeing what a statistical population do when they receive a particular email. The offending content can then be blacklisted before others have to deal with it.

The area where voice meets computing is really under-exploited at the moment. Efforts like Apple’s Siri can be rated as trivially entertaining, but the next move is to make speech genuinely useful as an integrated part of enabling commerce and service.

Posted in Voxygen Tech