International Industry-Academia Workshop on Cloud Reliability and Resilience

November 7-8, 2016, Berlin, Germany


With the increasing adoption of and reliance on cloud platforms and services, cloud computing is becoming a utility like water, energy, transportation, or telecommunications. This status brings a responsibility for public providers to ensure the development of highly reliable platforms and services.

Nonetheless, a study from Gartner found that 47% of all documented cloud problems were caused by service outages, which lasted between 40 minutes and five days. Another study, from the Ponemon Institute, found that an outage costs US$ 690,204 on average. Compounding these results, the increasing use of commodity hardware to build data centers will further lower the reliability of existing cloud computing platforms.

Thus, the development of new strategies, techniques, and methods to evaluate and increase the reliability and resilience of cloud platforms from a software perspective is fundamental.


This workshop intends to bring together industry, academia, and regulators to identify the most relevant requirements in the field of cloud reliability and resilience on the one hand, and existing state-of-the-art solutions on the other. We invite engineers, scientists, and experts to discuss and contribute to the creation of a new generation of highly reliable cloud platforms.

Topics of Interest

The workshop focuses on the following topics:

  • Challenges of data center reliability
  • Methods and algorithms for failure prediction
  • Damage detection and problem diagnosis
  • Automated repair and recovery of cloud systems
  • Disaster recovery in cloud computing
  • Fault injection as an approach for reliability
  • Evaluation of cloud platform reliability
  • Cloud reliability metrics and benchmarks
  • Service Level Agreement (SLA) and reliability
  • Quality of Service in the cloud
  • Standards, regulations, and legislation

General Chairs

  • Henrik Abramowicz, EIT Digital, Germany.
  • Jorge Cardoso, Huawei ERC, Munich, Germany and University of Coimbra, Portugal.

Steering Committee

  • Dr. Götz Reinhäckel, Head of Cloud Engineering, T-Systems International, Germany.
  • Dr. Jeff Voas, US National Institute of Standards and Technology (NIST), US.
  • Prof. Paulo Esteves Veríssimo, University of Luxembourg, Luxembourg.
  • Michel Drescher, Cloud Computing Standards Specialist, University of Oxford, UK.
  • Valentina Salapura, Chief Architect, Resiliency and Business Continuity, IBM, US.

Invited Speakers

  • Building Blocks for Site Reliability At Google, Sebastian Kirsch, Google, Switzerland.
  • Breaking Azure for Fun and Profit, Pavel Michailov, Microsoft, US.
  • Using Event-driven Automation and Workflows for Auto-remediation, Dmitri Zimine, Brocade, US.
  • High Availability and Disaster Recovery in OpenStack: From humble beginnings to enterprise reliability, Florian Haas, Hastexo, Austria.
  • A Tale of Ice and Fire, or: The Cloud and The Standards, Michel Drescher, University of Oxford, UK.
  • I’m No Hero: Full Stack Reliability at LinkedIn, Todd Palino, LinkedIn, US.
  • Resilient Cloud Storage – The Consistency View, Neeraj Suri, TU Darmstadt, Germany.
  • A Cloud is Not Enough, Reliable Delivery Matters More, Ajay Gulati, ZeroStack, US.
  • Dependable Storage and Computing using Multiple Cloud Providers, Alysson Neves Bessani, University of Lisbon, Portugal.
  • Cloud Based Fault Injection for Anomaly Detection, Craig Sheridan, flexiOPS, UK.


EIT Digital Berlin Co-Location Centre