Cover Story

SKA: the ultimate big data challenge

The SKA will generate ten times the data trafficked across the global internet daily. So how is it going to be processed?

2 May 2013
Photo: Sean Wilson
Simon Ratcliffe, DOME South Africa, says this project lays the foundation to help the scientific community solve other data challenges such as climate change, genetic information and personal medical data.

The human race is on a journey to deep space. The mission: to explore the origins of the universe. Using the Square Kilometre Array radio telescope (SKA), radio signals from deepest space will be collected for scientists the world over to analyse and process. Their challenge? How to transport, store and process the estimated 14 exabytes of data the antennas will gather every day.

The SKA is a radio telescope being built by ten countries, including South Africa. It’s the largest joint project of its kind ever attempted and will result in the construction of the largest and most sensitive radio telescope ever built.

“The SKA will see back to a time before the first stars lit up,” says IBM. “Optical telescopes see the light from stars. Before stars existed there was only gas; a radio telescope with the sensitivity of the SKA can see back in time to the gas that existed before stars were born.”

The antennas that make up the array will be distributed across South Africa and Australia – literally millions of them. “Because the telescope is to be made up of so many individual antennas, and the antennas are so widely scattered, and such a large volume of data is being gathered, a novel computing system must be developed to manage the process of gathering, storing and analysing data from end to end,” says an IBM whitepaper on the system.

Introducing SKA

The project is led by the SKA Organisation, a not-for-profit company headquartered at Jodrell Bank Observatory, near Manchester, UK. It was established in December 2011 to formalise relationships between the international partners and centralise the leadership of the project.
The SKA will be built in Southern Africa and Australia. There will be 3 000 dish antennas, each about 15m in diameter, as well as two other types of radio wave receptor, known as low- and mid-frequency aperture array antennas. The mid-frequency aperture arrays will be built in South Africa and are envisaged to be a major component of the SKA Phase 2. The antennas will be arranged in five spiral arms and the dishes in Southern Africa will extend to distances of at least 3 000km from the centre of the core region. Construction of the SKA is expected to begin in 2017 and conclude in 2024.
SKA SA was established by the Department of Science and Technology of South Africa and is administered as a business unit of the National Research Foundation (NRF).
Source: IBM

To this end, IBM is collaborating with the Netherlands Institute for Radio Astronomy (ASTRON) on a five-year project called DOME, which aims to design a computing system that could manage the data the SKA is going to generate. South Africa’s National Research Foundation has joined the initiative in a four-year collaboration. DOME will see research conducted into extremely fast but low-power exascale computing systems. The collaboration includes a user platform where organisations from around the world can jointly investigate emerging technologies in high-performance, energy-efficient computing, nanophotonics, and data streaming, says IBM. South Africa has now joined as a user platform partner.


The DOME collaboration brings together a team of scientists and engineers in a partnership of public and private institutions. “This project lays the foundation to help the scientific community solve other data challenges such as climate change, genetic information and personal medical data,” says Simon Ratcliffe, technical coordinator, DOME South Africa.

Scientists from all three organisations will collaborate remotely and at the newly established ASTRON & IBM Center for Exascale Technology in Drenthe, the Netherlands, IBM said in a statement announcing the collaboration.

According to IBM, scientists from SKA South Africa will focus on:
• Visualising the challenge – fundamental research will be conducted into signal processing and advanced computing algorithms for the capture, processing, and analysis of the SKA data so clear images can be produced for astronomers to study;
• Desert-proof technology – the DOME team is researching and prototyping microserver architectures based on liquid-cooled 3D stacked chips. The team in South Africa will extend this research to make the microservers rugged or ‘desert-proof’ to handle the extreme environmental conditions where the SKA will be located;
• Software analytics – the 64 dishes of the MeerKAT telescope in South Africa will be used for the testing and development of a sophisticated software program that will aid in the design of the entire computing system holistically and optimally, taking into account all of the cost and performance trade-offs for the eventual 3 000 SKA dishes.

Breaking it down

ASTRON and IBM have mapped out seven technology projects aimed at dealing with the extreme data-handling requirements of the SKA, IBM says. All have fundamental implications for computing in the future.

“The most strategic of the DOME projects is called Algorithms & Machines. The SKA challenge is so extreme and nobody has designed a data management system to handle anything like this before. So the goal here is to create an ultra-sophisticated software program that will help the team design the system holistically and optimally,” IBM says.

Also of critical importance is the Access Patterns team, responsible for developing a big data repository capable of affordably handling the huge volume of data SKA will generate every day.

Transporting the data hundreds or thousands of kilometres from the antenna that collects it to the datacentre where it will be stored or processed will require a high-speed fibre-optic network that can move data at 100x the rate of today’s internet traffic. The RT Communications project is working on reducing the overhead that data traditionally generates as it travels through a network. The Compressive Sampling project will work towards reducing the data that the SKA creates by compressing it as it streams in.

Processing the data will be handled by hybrid machines combining supercomputers and a processor called an accelerator. Complementing these machines are DOME Microservers, which are very small, inexpensive, energy-efficient microprocessors that can handle data filtering and analysis close to the antennas.

“The real bottleneck comes when the data reaches the computers where it’s processed and analysed,” says IBM. “The computers transport data internally via electronic bits moving on copper wires. So the SKA will be like attaching a fire hose to a garden sprinkler. DOME’s Nanophotonics team, led by IBM researcher Bert Offrein, is taking photonics technology that IBM was already developing for general computing and applying it to the SKA challenge.”
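To put the transport problem in perspective, the estimated 14 exabytes a day can be converted into an average sustained data rate. The short Python sketch below is a back-of-envelope calculation only: it assumes decimal units (1 exabyte = 10^18 bytes) and a perfectly even flow of data across the day, neither of which is stated by IBM or the SKA Organisation, and the function name is purely illustrative.

```python
# Back-of-envelope: turn a daily data volume into an average sustained rate.
# Assumption (not from the article): decimal SI units, 1 exabyte = 10**18 bytes,
# and data arriving evenly over the whole day.

SECONDS_PER_DAY = 24 * 60 * 60
BYTES_PER_EXABYTE = 10**18


def sustained_rate(exabytes_per_day: float) -> dict:
    """Return the average rate implied by a daily volume, in two units."""
    bytes_per_second = exabytes_per_day * BYTES_PER_EXABYTE / SECONDS_PER_DAY
    return {
        "terabytes_per_second": bytes_per_second / 10**12,
        "petabits_per_second": bytes_per_second * 8 / 10**15,
    }


rate = sustained_rate(14)  # the article's estimate of 14 exabytes a day
print(f"{rate['terabytes_per_second']:.0f} TB/s")
print(f"{rate['petabits_per_second']:.2f} Pbit/s")
```

Under those assumptions the figure works out to roughly 162 terabytes, or about 1.3 petabits, every second, which gives a sense of why data reduction close to the antennas matters as much as raw network speed.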

“The DOME research has implications far beyond astronomy. These scientific advances will help build the foundation for a new era of computing, providing technologies that learn and reason. Ultimately, these cognitive technologies will help to transform entire industries, including healthcare and finance,” says Dr Ton Engbersen, DOME project leader, IBM Research. It is these other implications that make the SKA project so significant for the sector, for science, and for humanity.