Copenhagen hospital Rigshospitalet encrypts enormous amounts of DNA data in real time.

Encryption: Security is paramount when human genome data is transmitted.
Cancer patient data is sent quickly and securely from Rigshospitalet to DTU Risø’s supercomputer with hardware encryption.

For the original Danish version, please follow this link

By Laurids Hovgaard lah@ing.dk (Translated and reproduced with permission from Ingeniøren)

While there is a strong focus on security when new machines and software are purchased, large amounts of data are often transmitted unprotected through fibre connections. This is something that worries Danish start-up company Zybersafe, which develops hardware encryption used, among other things, to encrypt large amounts of sensitive DNA data from Rigshospitalet.

“Encrypting a fibre connection probably seems a bit exotic to most people. And when a company buys fibre infrastructure, the focus is on speed, uptime and stability, while security isn’t really prioritised,” says Erik Bidstrup, Zybersafe’s CTO.

He explains how only a small fraction of the fibre connections that exist today is protected:

“There’s often tight control of the units that are used in, for instance, production, but, as soon as you send data through the big fibre highway with all its distributor points in the open landscape, data is unprotected.”

Erik Bidstrup points out that obtaining data from a physical fibre connection is not exactly rocket science:
“It’s light that transports the data and, if you bend the fibre, you can track the exchange of information without anyone noticing it. You can buy equipment to bend a fibre for around DKK 8,000 and on YouTube you’ll find guides that tell you how to do it,” he says.

Fibre connections are tapped every day, for instance when fibre suppliers perform troubleshooting.

“In principle, when a technician performs troubleshooting to configure a system, he gains the ability to record sensitive data and turn it into information,” Erik Bidstrup adds.

Time is extremely critical

In the department for genome medicine at Rigshospitalet they handle some of the most sensitive data that anyone can come across: complete mappings of patients’ genes. This is called sequencing of complete genomes and is used especially when treatment plans are to be prepared in relation to cancer or a range of hereditary diseases.

These illnesses are often so serious that doctors are forced to race against time.

Frederik Otzen Bagger, who is an expert in bioinformatics and a researcher at Rigshospitalet, explains:
“When we diagnose a child with cancer, for instance, we must have an answer quickly, which is why time is extremely critical. By means of sequencing of complete genomes you can look at the entire genome in one comprehensive analysis. Earlier we only mapped the genome defects that we knew were related to known illnesses.”

This means that the hospital now produces far more data than before, and data of a very different kind.

“For instance, our newest sequencing machine takes two days to fill 6 TB of space with sequencing material relating to 48 persons’ genome material,” Frederik Otzen Bagger adds.
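To put those figures in perspective, the short calculation below works out what they imply per genome and as a sustained data rate. It is only a rough sketch: it assumes decimal terabytes (10^12 bytes), a full 48-hour run and an even output rate, none of which the article specifies.

```python
# Figures quoted in the article.
RUN_BYTES = 6 * 10**12        # 6 TB per run, assuming decimal terabytes
RUN_SECONDS = 2 * 24 * 3600   # two days, assuming a full 48 hours
GENOMES = 48                  # persons per run

bytes_per_genome = RUN_BYTES / GENOMES
bytes_per_second = RUN_BYTES / RUN_SECONDS
megabits_per_second = bytes_per_second * 8 / 10**6

print(f"Per genome: {bytes_per_genome / 10**9:.0f} GB")
print(f"Sustained output: {bytes_per_second / 10**6:.1f} MB/s "
      f"({megabits_per_second:.0f} Mbit/s)")
# Per genome: 125 GB
# Sustained output: 34.7 MB/s (278 Mbit/s)
```

Under these assumptions, the machine produces roughly 125 GB per person and about 35 MB of data every second for two days.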

The DNA material takes up so much space that it requires massive amounts of computer power to analyse the many millions of text strings that make up a human being’s total genome.

“This means we need a very strong computer that can put the genome back together after it has been analysed,” says the researcher.

This is possible at DTU Risø, where the supercomputer called Computerome is located. When extremely sensitive data such as DNA material is to be transmitted, it must be encrypted prior to being sent through the high-speed fibre connection between Rigshospitalet and DTU Risø.

Continuous encryption and transmission

Rigshospitalet has looked closely at how things are done in industry, where requirements of response times, stability and reliability are also very tough.

Frederik Otzen Bagger explains:
“We must look out for patients’ data from the moment it’s received at the hospital. We prefer to transmit the large amounts of data in a clever and quick way without prolonging response times by having to pre-process or software-encrypt it here at Rigshospitalet, and this is possible with the help of a hardware encryption solution.”

The solution that has been chosen was developed by Zybersafe and entails two physical boxes being installed, one at each end of the connection: at Rigshospitalet and 40 km to the west at DTU Risø, north of Roskilde.

“We can begin encrypting and transmitting data continuously from the second we start the sequencing machine. If we were forced to wait for the machine to finish processing before we could start encrypting in software, two days would be lost before we’d be able to pack data and transmit it to Computerome at DTU Risø. And data transfer from an intermediate server is also time-consuming,” adds Frederik Otzen Bagger.
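The difference between the two approaches can be illustrated with a simple timing model. The sketch below compares when the last byte would arrive at the receiving end if data is sent only after the run has finished, versus sent continuously during the run. The run length and data volume come from the article; the link rate of 1 Gbit/s is a hypothetical value chosen for illustration, and the model ignores encryption time, packing and intermediate servers.

```python
RUN_HOURS = 48.0              # sequencing run length, from the article
RUN_BYTES = 6 * 10**12        # 6 TB per run, assuming decimal terabytes
LINK_BITS_PER_S = 1 * 10**9   # hypothetical usable link rate, not from the article


def transfer_hours(n_bytes, bits_per_s):
    """Hours needed to push n_bytes through a link of the given rate."""
    return n_bytes * 8 / bits_per_s / 3600


def batch_finish_hours():
    """Wait for the run to end, then send everything in one go."""
    return RUN_HOURS + transfer_hours(RUN_BYTES, LINK_BITS_PER_S)


def streaming_finish_hours():
    """Send while the machine is still producing data.

    The slower of the two (producing or sending) sets the finish time.
    """
    return max(RUN_HOURS, transfer_hours(RUN_BYTES, LINK_BITS_PER_S))


print(f"Batch:     {batch_finish_hours():.1f} h")
print(f"Streaming: {streaming_finish_hours():.1f} h")
```

With these assumed numbers the transfer alone takes a little over 13 hours, so sending afterwards delays the result by that amount, while continuous transmission finishes together with the sequencing run. Any time spent on software encryption or staging data on an intermediate server would widen the gap further.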

Inspired by industry

When data is to be encrypted, two methods exist: it is done either by means of software or by means of hardware. Zybersafe prefers hardware in the shape of the above-mentioned two physical boxes installed at each end of a connection. This solution means there is only one encrypted line and data is encrypted by means of the encryption standard AES-256.

Erik Bidstrup explains:
“The advantage of hardware encryption is that we can encrypt things at the speed with which you transmit data through a fibre connection, which means you know how long it takes. You know how many microseconds it takes from when you receive a packet until encryption is finished and you can send a file.”
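Why a fixed line rate makes timing predictable can be shown with a small calculation: at a constant rate, the time needed to pass a packet of a given size through the link is always the same. The packet size and line rate below are hypothetical values chosen for illustration and do not describe Zybersafe’s equipment, whose actual latency the article does not state.

```python
PACKET_BYTES = 1500            # hypothetical packet size, not from the article
LINE_BITS_PER_S = 10 * 10**9   # hypothetical 10 Gbit/s line rate, not from the article


def serialization_microseconds(n_bytes, bits_per_s):
    """Time to clock n_bytes onto a link running at a constant rate."""
    return n_bytes * 8 / bits_per_s * 10**6


delay = serialization_microseconds(PACKET_BYTES, LINE_BITS_PER_S)
print(f"{delay:.2f} microseconds per packet")
```

Because the result depends only on packet size and line rate, it is the same for every packet of that size, which is the kind of predictability the quote describes.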

Zybersafe’s hardware encryption is, for instance, used on offshore platforms where there is a desire to protect the data that moves between wind turbines and control systems onshore and where response times need to be so fast that technicians can turn off systems immediately if a sensor reacts.

FACTS

The supercomputer at Risø

DTU’s high performance computer, Computerome, is one of the few computers that can be spotted from the air – for instance in Google Maps – because the computer cluster takes up the space of several containers. The system has 16,048 CPU cores and 92 terabytes of memory connected to three petabytes of high performance storage and delivers a performance of more than 483 teraflops.

Encryption: The Advanced Encryption Standard (AES), also known as Rijndael, is a block cipher that has been adopted as a standard by the US government. The algorithm was developed by the two Belgian cryptographers Joan Daemen and Vincent Rijmen.
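The sketch below lists the basic AES parameters and shows what they mean for a data set the size of one sequencing run. It does not perform any encryption (Python’s standard library contains no AES implementation), and it says nothing about the mode of operation used in Zybersafe’s boxes, which the article does not describe. The 6 TB figure assumes decimal terabytes.

```python
import secrets

AES_BLOCK_BYTES = 16                      # 128-bit block for every AES key size
AES_ROUNDS = {128: 10, 192: 12, 256: 14}  # rounds per key size in bits


def block_count(n_bytes):
    """Number of 16-byte AES blocks needed to cover n_bytes."""
    return -(-n_bytes // AES_BLOCK_BYTES)  # ceiling division


key = secrets.token_bytes(256 // 8)       # a random 256-bit key, for illustration only
print(len(key) * 8, "bit key,", AES_ROUNDS[256], "rounds")
print(block_count(6 * 10**12), "blocks in a 6 TB run")
# 256 bit key, 14 rounds
# 375000000000 blocks in a 6 TB run
```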

Sequencing of complete genomes means that a genome’s coding and non-coding regions are mapped. Put simply, the genome is divided into little pieces that are then analysed individually (the relationship between them is not known).

Carina Jørgensen
Genome sequencing