TERENA Trusted Cloud Drive facility Pilot project report Project coordinator: Péter Szegedi (TERENA) May 2013


Contributors

Maciej Brzeźniak – PSNC (TF-Storage co-chair)
Andres Steijaert – SURFnet (GN3plus SA7 activity leader)
Jan Meijer – UNINETT (TTC member)
Maarten Koopmans – Vrijheid.net (lead developer)
Christos Loverdos and Panos Louridas – GRNET
Jakub Peisar – CESNET
Péter Stefán and Szabolcs Székelyi – NIIF/HUNGARNET
Mario Vandaele – Belnet
João Pagaime – FCCN
Lorenzo J. Cubero – CESCA (left)
Guido Aben – AARNet, Australia
Guilherme Maluf – RNP, Brazil
Christian Sprajc – PowerFolder
Christian Schmitz – OwnCloud

Contact Details

Péter Szegedi, Project Development Officer, TERENA
TERENA Secretariat, Singel 468D, 1017AW Amsterdam, The Netherlands
Phone: +31 20 530 4488
Fax: +31 20 530 4499
Email: [email protected]

© TERENA 2013

All rights reserved.

Parts of this document may be freely copied, unaltered, provided that the original source is acknowledged and the copyright preserved.


Executive Summary

Triggered by the request of the European national research and education networking organisation (NREN) community as well as the recommendation of the TERENA Advisory Council (TAC) and Technical Committee (TTC), the TERENA Secretariat proposed, planned, and executed an 11-month pilot project in which NRENs experimented with data storage capabilities offered through the cloud service delivery model. In the first phase of the project, a storage cloud brokering platform was selected and deployed at the TERENA Secretariat offices. In the second phase, the facility was opened up to all those in the TERENA community willing to experiment and engage with the Trusted Cloud Drive (TCD) concept. Nineteen NRENs, eight universities, three research institutes and five commercial companies around the globe expressed interest in the pilot; 48 user accounts were provisioned on the TERENA platform; eight other test instances of the open source software were deployed and evaluated in different countries.

The pilot initially took an end-user service approach in which federated access to the centralised TCD platform was provided via a simple web interface as well as the standard Web Distributed Authoring and Versioning (WebDAV) protocol to store data. This approach turned out to be inadequate for the majority of the users because the platform offered limited features and was difficult to manage at large scale. Therefore, based on recommendations made by the pilot participants, the TCD changed direction and followed a service provider approach in which platform functions were not exposed to end-users but kept under the control of the domain administrator. The technical and administrative functions of the facility could, as a result, be separated, thereby allowing TERENA and NRENs to perform central service administration and distributed platform management simultaneously.

The final recommendations of the pilot can be summarised as follows:

• TCD should focus on its main asset, which is to maintain trust and privacy of the end-user domain by separating metadata and encryption keys from the storage data at the domain's boundary.
• TCD should not compete with the feature-rich front-end sync&share type applications available on the market; TCD should instead broker them to various public and/or NREN-provided storage back-ends in a trusted way.
• TCD should be a lightweight, thin layer (preferably controlled and operated by NRENs in a distributed way) separating and interfacing end-user application domains and cloud service provider domains; it should be considered as storage middleware.
• TCD should not address interoperability at the cloud infrastructure level but facilitate a multi-vendor approach in the application space (through strategic partnerships with application providers of the users' choice) as well as aggregate data storage demands and relay them to public and private cloud back-ends (available under TERENA's certified framework agreements).


Contents

Executive Summary

1. Motivations and objectives
   Aim of the pilot
   Measures of success
   Delivering the pilot
   Terms and definitions

2. The cloud broker platform
   Platform architecture and technical characteristics

3. Pilot participation and dissemination
   Dissemination
   Open Phase II procedures
   General participation
   Core team and major results

4. Deployment scenarios and use cases
   Preliminary service delivery scenarios
   Trust relationship models
   Use cases identified

5. Business and legal considerations
   Legal aspects
   Pricing model

6. Next steps and future directions

Acknowledgement

References


1. Motivations and objectives

Massive data storage is vital for academic research. Individual researchers and students on campus increasingly use commercial cloud storage offerings (e.g. Google Drive, iCloud, Dropbox) available on the market. However, these public services are not primarily designed for the needs of sensitive research data sets. Therefore, universities and research institutes are seeking partnerships with private storage solution integrators and application developers (e.g. PowerFolder, SpiderOak, OwnCloud) to build and operate their own storage infrastructure on campus, which requires not only capital investment but also operational knowledge and experience. Although these private storage clusters are able to provide the desired performance and data privacy, they cannot always interface with each other or with the public services, due to a lack of standards and sometimes proprietary vendor solutions.

National research and education networking organisations (NRENs) are in a good position to deliver high-performance data storage infrastructure as a service, specially tailored to the research and education community, over their advanced networks at national scale. Moreover, thanks to the European and global NREN collaboration, they can also aggregate demands and facilitate community-provided storage to be shared across TERENA members [1]. Trust is the main asset of NRENs, because they are governed by the universities that are also their major clients. The Trusted Cloud Drive (TCD) service pilot builds on this trust relationship and provides the necessary software tool and know-how to NRENs.

Several NRENs in the TERENA community are increasingly interested in offering cloud services for their constituency, and many national pilots have already been established. Cloud service is clearly the new paradigm that is changing the traditional way of providing services and how these are accessed by customers. For instance, cloud storage supports the demand for outsourcing scientific data preservation services to third parties; for using distributed (i.e. geographically diverse) resources whenever needed; and for offering remote space for users to store their data and access it using different devices. However, when outsourcing to public clouds (e.g. Amazon, Google), privacy and ownership remain a matter of great concern.

TERENA has been asked by several members to take action to support the community experimenting with cloud services. As a result of this community call, a small meeting was organised at the TERENA Secretariat in October 2011. One of the outcomes of this meeting was that TERENA would investigate the feasibility of initiating a pilot project that would provide storage capabilities that follow the cloud delivery model to the participating NRENs [2].

Aim of the pilot

The primary aim of this pilot was to explore possible deployment scenarios for a trusted personal storage service tailored for academia. The pilot was built upon a federated software platform (i.e. the Cloud Broker Platform) which offers the ability to easily connect different storage back-ends (both private and public cloud storage back-ends are supported) and store users' data in the cloud in a secure and privacy-preserving way, thanks to the separation of storage data and metadata as well as the built-in encryption functionality.

Trust is the main factor, and it is very hard to gain in the global (often multi-national) cloud environment. The TCD facility offers a privacy enforcement tool that preserves data privacy in the client domain by keeping the metadata "at home" whilst allowing the encrypted contents to be stored in such a way that they cannot be accessed by governments without notifying the data owner. TCD addresses legal concerns highlighted by the U.S.-EU Safe Harbor Framework, the USA PATRIOT Act, and the European Commission Directive on Fighting cyber crime and protecting privacy in the cloud (see Section 5).


The following features were also explored as part of the pilot:

• longer term sustainability for a potential service (i.e. the community);
• legal aspects and perceived trust issues related to the storage and management of the encryption keys and metadata (i.e. the service model);
• software scalability and performance (i.e. the code).

Measures of success

Objective: The community (longer term sustainability for a potential service)
  Measures:
    • Number of alpha/beta testers (individuals)
    • Number of organisations installing the service platform at their location (other than the TERENA Secretariat)
    • Number of software developers (code co-owners other than the lead developer) who can work on the code (proof needed)
  Targets:
    • At least 30 test accounts
    • At least 5 test instances
    • At least 3 software developers
  Opportunities:
    • Knowledgeable and reliable software development community around the open-source code.
    • Significant number of user communities, specific use cases.
  Threats:
    • Platform developer as single point of failure.
    • Lack of development and support efforts.
    • No significant take up of the service platform.

Objective: The service model (legal aspects and perceived trust issues related to the storage and management of the encryption keys and metadata)
  Measures:
    • Common understanding of the legal implications of storing encrypted data blobs in the cloud
    • Description of the potential service models and service delivery scenarios illustrated by use cases
  Targets:
    • For each measure: post information on the pilot Wiki as well as in the final report
  Opportunities:
    • Cloud platform is widely used to clearly separate the personal data controller role from the storage data manager role.
    • Organisations can pick the service model and delivery scenario that is better suited to their environment and use cases.
  Threats:
    • (Legal) benefits of using the platform are not understood.
    • Perceived as yet another personal cloud storage service.
    • One single service model does not fit all organisations.

Objective: The code (software scalability and performance)
  Measures:
    • Security test results (system intrusion, hacking, data privacy)
    • Compatibility test results (operating system, web browser, APIs)
    • Performance test results (load, volume, stress)
    • Acceptance test results (Alpha, Beta)
  Targets:
    • All the necessary tests to be completed.
  Opportunities:
    • Platform code is robust, secure, and scalable.
  Threats:
    • Platform code is weak, insecure, and rigid.

Table 1: Measures of success

The pilot project has been evaluated against these measures of success and pre-defined targets. For evaluation results see Section 3.

Delivering the pilot

The pilot was carried out in two phases:

• Phase I - A test instance of the platform was deployed at the TERENA Secretariat's offices. During this phase the cloud broker components (the elements in the green box depicted in Fig. 1) were installed and connected to a limited local storage back-end offered by TERENA. A simple web portal and the necessary support for federated access were also implemented. For this phase TERENA sub-contracted the lead software developer, Maarten Koopmans, who provided the necessary support for the installation. The platform was evaluated and tested by a limited number of NREN experts coordinated by TERENA.

• Phase II - NRENs were invited to participate in the pilot by adding their own cloud storage back-end and/or developing new front-end applications for the cloud broker. The Amazon S3 public storage back-end option was also available for testing. It was envisaged that NRENs would invite a limited number of end-users to provide feedback on the usability of the system. Although most of the user requirements were not implemented during the pilot phase, they did help to shape the understanding of the type of services that users are looking for.

The pilot Phase II was operated for a nine-month period, after which an evaluation was made to assess the success of the pilot and to agree on the following steps.

Terms and definitions

Some definitions used in the document can be found in Table 2.

Cloud Broker Platform:
The Cloud Broker Platform is a flexible open-source software tool developed by UNINETT Sigma in 2010 as part of the NEON project. The platform is the basis of the TERENA Trusted Cloud Drive pilot service.

Trusted Cloud Drive:
Trusted Cloud Drive is a pilot service made available by TERENA for evolutionary prototyping, testing and service development purposes.

Pilot:
A pilot experiment, also called a pilot study, is a small-scale preliminary study conducted in order to evaluate feasibility, time, cost, adverse events, and effect size (statistical variability) in an attempt to predict an appropriate sample size and improve the study design prior to implementing it as a full-scale research project. Pilot studies, therefore, may not be appropriate for case studies.

Prototype:
A prototype is an early sample or model built to test a concept or process, or to act as a thing to be replicated or learned from. A prototype is designed to test and trial a new design to enhance precision by system analysts and users. Prototyping serves to provide specifications for a real, working system rather than a theoretical one. Prototype software is often referred to as alpha grade, meaning it is the first version to run. Often only a few functions are implemented; the primary focus of the alpha is to have a functional code base onto which features may be added. Once the alpha grade software has most of the required features integrated into it, it becomes beta software, which is used to test the software as a whole and to adjust the program so that it responds correctly to unforeseen situations.

Evolutionary Prototyping:
The main goal when using evolutionary prototyping is to build a very robust prototype in a structured manner and constantly refine it. The reason for this is that the evolutionary prototype, when built, forms the heart of the new system, on which the improvements and further requirements will be built. Evolutionary prototyping acknowledges that we do not understand all the requirements and builds only those that are well understood.

Grey-box Testing:
In black-box testing there is no information about the internal structure at all. In grey-box testing that information is available when the tests are designed, but it is disregarded when the tests are executed. The TCD pilot testing procedure includes:
• security testing (system intrusion, hacking, data privacy)
• compatibility testing (operating system, web browser, API)
• performance testing (load, volume, stress)
• acceptance testing (Alpha, Beta)

Service Delivery Framework:
A service delivery framework (SDF) is a set of principles, standards, policies and constraints used to guide the design, development, deployment, operation and retirement of services delivered by a service provider with a view to offering a consistent service experience to a specific user community in a specific business context. An SDF is the context in which a service provider's capabilities are arranged into services.

Table 2: Terms and definitions


2. The cloud broker platform

The TERENA TCD is a pilot experiment to determine the feasibility of a targeted personal data storage service that builds on a flexible Cloud Broker Platform. The unique features of the selected software platform are:

• federated access to the service;
• metadata and storage data are kept separate;
• storage data is encrypted and stored in the cloud;
• metadata is stored in a trusted place;
• various cloud storage back-ends can be brokered;
• a standards-based Web Distributed Authoring and Versioning (WebDAV) front-end and the proprietary web application are both included;
• different user platforms (Windows, MacOS, iOS, Android) are supported;
• code and documentation are fully open-source and available on TERENA's GitHub space¹ under the Apache License, Version 2.0.

The software architecture is modular (Fig. 1); each of the platform functionalities can be enabled or disabled according to the administrator's choice.

Fig. 1: The modular Cloud Broker Platform architecture and its components

Platform architecture and technical characteristics

The pilot Phase I focused on prototyping and operating the Cloud Broker Platform, the open software developed by UNINETT Sigma in 2010 as part of the NEON project², at the TERENA Secretariat. This prototype software was built with the basic idea of separating the storage data (i.e. encrypted content) from the metadata (i.e. encryption keys, filenames, size, date, etc.). This feature, unique to the Cloud Broker Platform, makes the use of public clouds particularly appealing. A set of metadata is linked to the user's data for search purposes; the metadata (together with the encryption keys) is meant to be stored and operated by a trusted party, which in practical terms means that the storage data and the metadata can be handled by different parties. By keeping the metadata store "on premises", data confidentiality is guaranteed under the assumption that the premises are inside a "trusted domain" (e.g. TERENA).

The metadata is stored in a metadata store called Voldemort that was developed and open sourced by LinkedIn; this store scales elastically and across data centres. The data itself is encrypted using the 128-bit Advanced Encryption Standard (AES), though any cipher could be used; towards the cloud provider, the file names are replaced by a universally unique identifier (UUID). The mapping between the UUID and the filename takes place in the metadata store. In this way, a stored blob does not reveal any information that could be exploited by a malicious attacker (i.e. "which blob do we need to attack?").

The data stored in the cloud is accessed using WebDAV, a stable and widely supported protocol (MacOS, Linux, Windows); the WebDAV server connects to a web front-end, which makes the WebDAV transparent to the end-users. Though many built-in clients differ slightly in their implementation, all their differences are handled by the custom-designed WebDAV daemon, including those of the most popular iOS and Android applications such as Goodreader and WebDAVNav.

¹ https://github.com/terena
² Åke Edlund and Maarten Koopmans, "NEON – Northern Europe Cloud Computing", Final Report, December 17, 2010, https://notendur.hi.is/~helmut/publications/NEON-Final-Report.pdf

Fig. 2: Cloud broker platform architecture implemented in Phase I
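The separation of storage data and metadata described in this section can be illustrated with a small sketch. It is not the platform's code (which is written in Scala): the class `TrustedDrive`, the two dictionaries standing in for the Voldemort metadata store and the cloud back-end, and the function `keystream_xor` are all hypothetical names introduced here. To stay within the Python standard library, the sketch uses a toy XOR keystream as a stand-in for AES; it is not secure and only marks the place where a real cipher would be applied.

```python
import hashlib
import secrets
import uuid


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stand-in cipher (NOT AES, NOT secure): XOR with a SHA-256 counter keystream.

    Applying the function twice with the same key restores the input.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hashlib.sha256(key + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


class TrustedDrive:
    """Hypothetical broker: metadata stays on premises, encrypted blobs go to the cloud."""

    def __init__(self):
        # Trusted domain: filename -> blob identifier, key and size.
        self.metadata_store = {}
        # Untrusted back-end: opaque UUID -> encrypted bytes only.
        self.cloud_store = {}

    def put(self, filename: str, content: bytes) -> str:
        blob_id = str(uuid.uuid4())      # the provider never sees the filename
        key = secrets.token_bytes(16)    # 128-bit key, kept with the metadata
        self.cloud_store[blob_id] = keystream_xor(key, content)
        self.metadata_store[filename] = {
            "blob_id": blob_id,
            "key": key,
            "size": len(content),
        }
        return blob_id

    def get(self, filename: str) -> bytes:
        meta = self.metadata_store[filename]
        return keystream_xor(meta["key"], self.cloud_store[meta["blob_id"]])


drive = TrustedDrive()
blob_id = drive.put("results-2013.csv", b"sample,value\nA,1\n")
```

The cloud side of the sketch holds nothing but UUID-named ciphertext, so neither the filename nor the key is available to whoever operates that store; only the party holding the metadata store can map a filename to a blob and decrypt it.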

The software has features that are not an essential part of the pilot, such as a public folder where users can share data with the world, a web interface to the file storage, the ability to tag files and folders and search these tags via the web interface, and the ability to store these searches as search folders that are automatically updated when new files are tagged or when tags are removed from existing files. There is no limitation on the type of storage that can be used, as the software is able to support different cloud storage services, such as Amazon S3 and RackSpace Cloud Files, which makes it particularly flexible. When logging in via their federations, users will not be informed of the type of cloud-storage back-end used but will be able to store and retrieve their data as desired. The front-ends are elastically scalable as they are stateless: all data resides in the metadata store. When the NoSQL metadata store is kept on premises, its stable operation and fast connections to it (low latency, low round trip time) improve the end-user experience.
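One way to obtain search folders that update automatically is to store only the query and evaluate it whenever the folder is opened. The sketch below illustrates that idea under stated assumptions: `TagIndex` and its methods are hypothetical names, the index lives in memory, and the source does not say how the platform actually implements the feature.

```python
from collections import defaultdict


class TagIndex:
    """Hypothetical in-memory tag index with saved searches ("search folders")."""

    def __init__(self):
        self._tags = defaultdict(set)  # path -> set of tags
        self._saved = {}               # search folder name -> required tags

    def tag(self, path, *tags):
        self._tags[path].update(tags)

    def untag(self, path, tag):
        self._tags[path].discard(tag)

    def save_search(self, name, *tags):
        # Only the query is stored, never a list of matching files.
        self._saved[name] = frozenset(tags)

    def search_folder(self, name):
        # Evaluated on every read, so the folder always reflects current tags.
        wanted = self._saved[name]
        return sorted(path for path, tags in self._tags.items() if wanted <= tags)


index = TagIndex()
index.tag("/thesis/chapter1.tex", "draft", "physics")
index.save_search("physics drafts", "draft", "physics")
```

Because the result is computed at read time, tagging a new file or removing a tag changes what `search_folder` returns without any extra bookkeeping.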


3. Pilot participation and dissemination

The pilot Phase I was completed by the end of May 2012. The platform was then opened to the participants of Phase II for the following 9 months. To access the pilot service:

• the WebApp interface was available at https://tc2.terena.org;
• the WebDAV interface was available at https://tc1.terena.org.

The following practical limitations applied to the TERENA test service instance (all configurable otherwise):

• 100 GB local data storage (on a 1 TB volume);
• 20 GB metadata store (inside the virtual machine (VM));
• 5 GB single file-size limit;
• no limitation on the number of concurrent users.
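For readers unfamiliar with WebDAV, the sketch below shows what a client request against the WebDAV interface listed above could look like. It only constructs the requests with Python's standard library and does not send them; the pilot endpoint is not assumed to be reachable, and the federated login step is left out. The helper names, the example paths, and the reading of the 5 GB limit as decimal gigabytes are assumptions made for illustration.

```python
import urllib.request

WEBDAV_BASE = "https://tc1.terena.org"  # pilot endpoint named in the report
MAX_FILE_BYTES = 5 * 10**9              # 5 GB limit, assuming decimal gigabytes

PROPFIND_BODY = (
    b'<?xml version="1.0" encoding="utf-8"?>'
    b'<D:propfind xmlns:D="DAV:"><D:allprop/></D:propfind>'
)


def propfind_request(path: str, depth: str = "1") -> urllib.request.Request:
    """Build a WebDAV PROPFIND request that lists a collection (not sent here)."""
    return urllib.request.Request(
        WEBDAV_BASE + path,
        data=PROPFIND_BODY,
        headers={"Depth": depth, "Content-Type": "application/xml"},
        method="PROPFIND",
    )


def put_request(path: str, content: bytes) -> urllib.request.Request:
    """Build a WebDAV PUT request, refusing files above the single-file limit."""
    if len(content) > MAX_FILE_BYTES:
        raise ValueError("file exceeds the single-file size limit")
    return urllib.request.Request(WEBDAV_BASE + path, data=content, method="PUT")


listing = propfind_request("/docs/")
upload = put_request("/docs/notes.txt", b"hello")
```

Passing either object to `urllib.request.urlopen` would perform the call against a live WebDAV server; authentication would have to be added first.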

Dissemination

The features of the platform were presented and demonstrated at the TERENA Networking Conference 2012 in Reykjavik, Iceland. A YouTube video³ was also prepared by Péter Szegedi (Project coordinator, TERENA Secretariat) to show how to access and use the platform. The full list of dissemination activities (eight presentations) can be found in the References section of this document. It includes the presentations given at meetings of the TERENA Storage task force (TF-Storage) [3] [9] and the TERENA task force on Management of Service Portfolio (TF-MSP) [4]; and at larger events such as the European Commission Information Day on Call 8 [5]; the TERENA Networking Conference 2012 [6]; the Cisco Symposium 2012 [7]; the RNP Forum 2012 [8]; and the SUCRE Project Workshop 2013 [10]. TERENA also issued two news items on its website:

1. 23 April 2012 - New pilot project to extend TERENA's cloud activities
   http://www.terena.org/news/fullstory.php?news_id=3140

2. 15 October 2012 - TERENA Trusted Cloud Drive pilot invites phase II participation
   http://www.terena.org/news/fullstory.php?news_id=3269

The third news item will be issued after the publication of this final report.

³ Demonstration video: http://www.youtube.com/watch?v=DfGUX2Utypw

Open Phase II procedures

Phase II was the public phase of the pilot, which ran until the end of March 2013. A discussion mailing list ([email protected]) was set up for Phase II participants. To ensure the widest possible participation from within the TERENA community, a gradual procedure was adopted, where participants could first become involved by experimenting as a user of the service, then act as an administrator of the service, and finally, if they wanted, study the code and become an adapter or developer of the service:

1. Bring your test users and try out the TERENA installation of the service:

All the national federations, as well as the guest federations (social media) such as Google and Facebook, were connected to the federated platform. Users with, for example, a Google account could therefore test this service. Due to service restrictions, all test users needed to be white-listed. A Wiki tutorial explaining the procedure to follow to white-list a federated account was made available at: https://confluence.terena.org/display/CloudStorage/Get+access+to+the+TERENA+installation

An additional Identity Provider (IdP) configuration guide (only for experts) was published to address cases where a user account was white-listed but still faced problems accessing the service: https://confluence.terena.org/display/CloudStorage/IdP+configuration

2. Attach your own storage back-end to the platform installed at the TERENA Secretariat offices:

By default, the service used a local file system at the TERENA offices and had the possibility to store encrypted data in the Amazon S3 cloud. Participants who wanted to connect a local storage back-end needed to develop a bridge between the Cloud Broker Platform and the local data storage facility. The interface/protocol description, detailed technical information, consultation, and support were made available on the mailing list.

3. Get familiar with the code:

The open source code was written in Scala and ran on top of the Java VM. The pilot participants could get familiar with the code step by step. A single downloadable development VM image, a clone of the TERENA system installation (three VMware images on the Ubuntu 12.04 LTS Server platform with OpenJDK 6, MySQL and cadaver, provided as a downloadable ZIP file⁴ with a Readme⁵), and the source code and full documentation were all made available by the pilot through GitHub, under the Apache License, Version 2.0. There was also limited free support via the Google group. Pilot participants (Staszek Jankowski and Maciej Brzeźniak – PSNC, Dirk Dupont – BELNET, and Jakub Peisar – CESNET) also contributed to an Installation Guide available at https://confluence.terena.org/display/CloudStorage/Installation+Guide.
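Step 2 above asks participants to write a bridge between the Cloud Broker Platform and their own storage. The report does not reproduce the actual interface (it was documented on the mailing list, and the platform itself is written in Scala), so the following is only a sketch of the general shape such a bridge could take. `StorageBackend`, its three methods, and `LocalDirectoryBackend` are hypothetical names, not the platform's API.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class StorageBackend(ABC):
    """Hypothetical bridge interface: the broker hands over opaque, encrypted blobs."""

    @abstractmethod
    def put(self, blob_id: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, blob_id: str) -> bytes: ...

    @abstractmethod
    def delete(self, blob_id: str) -> None: ...


class LocalDirectoryBackend(StorageBackend):
    """Stores each blob as one file in a local directory."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, blob_id: str, data: bytes) -> None:
        (self.root / blob_id).write_bytes(data)

    def get(self, blob_id: str) -> bytes:
        return (self.root / blob_id).read_bytes()

    def delete(self, blob_id: str) -> None:
        (self.root / blob_id).unlink()


backend = LocalDirectoryBackend(tempfile.mkdtemp())
backend.put("3f2a9c", b"already-encrypted bytes")
```

Since a back-end of this kind only ever receives an identifier and ciphertext, a bridge to S3, Swift or another object store could implement the same three operations without handling filenames or keys.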

TERENA owns a GitHub space, https://github.com/terena, where the Trusted Cloud Drive project is forked (temporarily) and where GitHub users (developers) can be added to the team. The aim of the aforementioned gradual approach was to build a community of developers who would eventually become "co-owners" of the code. It was expected that these potential co-owner organisations would contribute, through manpower and/or financial support, to the maintenance and development of the code.

General participation

A total of 66 subscribers signed up to the pilot mailing list. Forty-eight user accounts were provisioned (white-listed) on the TERENA service instance. Table 3 includes the list of organisations that expressed an interest in the pilot. More details about their actual interest can be found at https://confluence.terena.org/display/CloudStorage/Phase+II+-+Completed

European NRENs (16): ACOnet, ARNES, Belnet, CARNet, CESNET, CSC, DFN, FCCN, GRNET, HEAnet, NIIF, PSNC, RedIRIS, RENATER, SURFnet, SWITCH

Other NRENs (3): AARNet (Australia), ERNET (India), RNP (Brazil)

Universities (8): École Polytechnique Fédérale de Lausanne, Newcastle University, University of Melbourne, University of Malta, University of Porto, Università Roma TRE, Aristotle University of Thessaloniki, University of Vienna

Regional networks and research institutes (3): CERN, CESCA, Srce

Commercials (7): Amazon, Box, Dell, Joyent, OwnCloud, PowerFolder

Table 3: Phase II interest

⁴ Development VMware images: http://d195twbwndmf02.cloudfront.net/Clouddrive.zip
⁵ Readme: http://d195twbwndmf02.cloudfront.net/README.txt

Photo: TERENA TF-Storage participants discussing the Trusted Cloud Drive pilot in March 2013 in Berlin, Germany

Core team and major results

In addition to the TERENA test service instance, eight other national deployments were implemented, primarily for testing and software development purposes. The list of platform instances, the names of contacts and voluntary developers, and their results are summarised in Table 4.

| Organisation | Contact / Developer | Results |
|---|---|---|
| GRNET | Panos Louridas, Christos Loverdos | The Pithos+ cloud storage back-end of GRNET has been integrated with the TCD platform. The code extensions were added to the GitHub repository. The work was demonstrated and the experiences shared at the 12th TF-Storage meeting in Berlin, Germany. http://www.terena.org/activities/tf-storage/ws14/slides/20130306-Pithos-CloudDrive.pdf |
| RNP | Roberto Araujo, Guilherme Maluf | The OpenStack Java SDK was used for connecting the Swift storage back-end of RNP and interfacing with TCD. The results were added to the GitHub repository. A TCD service integration with RNP's OpenStack Swift-based cloud storage infrastructure is planned. |
| CESCA | Jordi Guijarro, Lorenzo J. Cubero | The implementation of the Jclouds API at the TCD back-end was planned and begun; however, as Lorenzo has left CESCA, it was not completed. |
| CESNET | Jakub Peisar | A test platform instance was deployed using single-VM as well as multiple-VM setups. The performance test showed good scalability and performance, taking into account the actual test conditions. The TCD platform scales horizontally per vCPU and across multiple WebDAV servers. The ~60 Mbit/s upload speed achieved with application-level encryption and compression on one vCPU was good; it could even be tripled on a quad core. Note that both compression and encryption can be turned off, which increases the speed by a factor of 4-8. The detailed results can be found at https://confluence.terena.org/display/CloudStorage/Software+tests |
| Belnet | Mario Vandaele, Jean-Philippe Evrard | The test platform instance was deployed using a single VM setup. The national storage infrastructure tender at Belnet then took priority over further work. |
| PSNC | Maciej Brzeźniak, Stanisław Jankowski | The test platform instance was deployed using a single VM setup. Code security audits and performance tests were planned; the code security audit has been postponed. |
| Srce/CARNet | Damir Žagar, Nikola Garafolic | Srce/CARNet was looking for new storage platforms to migrate from the current GSS installation due to end of product life. TCD was deployed and evaluated but the end-user features of the platform were not sufficient. |
| FCCN | João Pagaime | The test platform instance was deployed using a single VM setup. The end-user features of the TCD platform were found to be “rough around the edges”. |

Table 4: Test/Development platform instances (in addition to the Phase I TERENA instance)
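To put the CESNET figures in Table 4 into perspective, the short calculation below estimates upload times from the reported numbers: roughly 60 Mbit/s on one vCPU with encryption and compression enabled, about three times that on a quad core, and a 4-8x speed-up with both features turned off. The 1 GB file size and the helper name are illustrative assumptions, and the 4-8x factor is assumed here to apply to the single-vCPU baseline; the results are back-of-the-envelope estimates, not measurements.

```python
# Back-of-the-envelope estimate based on the CESNET figures quoted in Table 4.
# The file size and the helper name are illustrative assumptions.

BASELINE_MBIT_S = 60.0       # ~60 Mbit/s, one vCPU, encryption + compression on
QUAD_CORE_FACTOR = 3.0       # "could even be tripled on a quad core"
PLAIN_FACTORS = (4.0, 8.0)   # reported speed-up range with both features off


def upload_seconds(size_bytes: int, rate_mbit_s: float) -> float:
    """Time to upload size_bytes at rate_mbit_s (1 Mbit = 1,000,000 bits)."""
    return size_bytes * 8 / (rate_mbit_s * 1_000_000)


if __name__ == "__main__":
    one_gb = 10**9  # hypothetical 1 GB file
    rates = {
        "1 vCPU, encrypted + compressed": BASELINE_MBIT_S,
        "quad core, encrypted + compressed": BASELINE_MBIT_S * QUAD_CORE_FACTOR,
        "1 vCPU, plain (lower bound)": BASELINE_MBIT_S * PLAIN_FACTORS[0],
        "1 vCPU, plain (upper bound)": BASELINE_MBIT_S * PLAIN_FACTORS[1],
    }
    for label, rate in rates.items():
        secs = upload_seconds(one_gb, rate)
        print(f"{label}: {rate:.0f} Mbit/s, {secs:.1f} s per GB")
```

Under these assumptions, a 1 GB upload takes a little over two minutes at the single-vCPU baseline and well under a minute once the work is spread over a quad core.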

Detailed consultation with this extensive group of pilot participants led to the development of potential deployment scenarios and use cases for the TERENA Trusted Cloud Drive facility, summarised in the next section.


4. Deployment scenarios and use cases
The pilot initially took an end-user service approach where users could have federated access to the centralised TCD platform via a simple web interface as well as the standard WebDAV protocol to store data. This approach turned out to be insufficient for the majority of the users because of its limited features and the difficulty of managing it at large scale. Therefore, based on recommendations made by the pilot participants, the TCD changed direction and followed a service provider approach where the platform functions were not exposed to end-users but kept under the control of the domain administrator. The technical and administrative functions of the facility could, as a result, be separated, thereby allowing TERENA/NRENs to simultaneously perform central service administration and distributed platform management.

Pilot recommendations and service architecture
The final recommendations of the pilot can be summarised as follows:

• TCD should focus on its main asset: to maintain trust and privacy of the end-user domain by separating metadata and encryption keys from the storage data at the domains’ boundary;

• TCD should not compete with the feature-rich front-end sync-and-share applications available on the market. It should instead broker them to various public and/or NREN-provided storage back-ends in a trusted way;

• TCD should be a lightweight, thin layer (preferably controlled and operated by NRENs in a distributed way) separating/interfacing end-user application domains and cloud service provider domains; TCD should be considered as a storage middleware;

• TCD should not address interoperability at the cloud infrastructure level but facilitate a multi-vendor approach in the application space (through strategic partnerships with application providers of the users’ choice) as well as aggregate data storage demands and relay them to public and private cloud back-ends (available under TERENA’s certified framework agreements).

In line with the strategic directions outlined above, TERENA should further explore:

• the service brokering scenarios made available by the TCD-Pithos+ and the TCD-Swift integrations; different NREN storage infrastructures and/or public cloud storage back-ends can be brokered down to user domains in a trusted, privacy-controlled way;

• the service integration scenarios with feature-rich end-user storage applications; both community-developed and commercial solutions (such as OwnCloud and PowerFolder) can be investigated for potential integration with TCD functionality to preserve trust and privacy.

Fig. 3 depicts the desired TCD service architecture, taking into account some of the potential use cases described below.


Fig. 3: TCD service architecture and major use cases

In this architecture the TCD acts as a storage middleware (with the functionality of encryption, compression, and metadata preservation) that separates the client trust domain(s) from the storage provider domain(s), and brokers several cloud storage back-ends (both private and public) to the user-preferred storage application front-ends according to the platform administrator’s choice.
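The separation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the TCD implementation: the class and function names are hypothetical, the back-end is a plain dictionary standing in for a remote storage provider, the compress-then-encrypt order is assumed, and the hash-based stream cipher is a toy used only to keep the example free of third-party dependencies.

```python
import hashlib
import secrets
import zlib
from dataclasses import dataclass


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with SHA-256(key || nonce || counter) blocks.
    Illustration only; a real deployment would use a vetted authenticated cipher."""
    out = bytearray()
    for block_no in range((len(data) + 31) // 32):
        pad = hashlib.sha256(key + nonce + block_no.to_bytes(8, "big")).digest()
        chunk = data[block_no * 32:(block_no + 1) * 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


@dataclass
class LocalRecord:
    """Metadata and key material that stay inside the client trust domain."""
    name: str
    size: int
    key: bytes
    nonce: bytes
    blob_id: str


class Middleware:
    """Hypothetical middleware: only opaque blobs cross the domain boundary."""

    def __init__(self, backend: dict):
        self.backend = backend   # stands in for a remote storage provider
        self.records = {}        # local metadata and keys, never sent out

    def put(self, name: str, data: bytes) -> None:
        key, nonce = secrets.token_bytes(32), secrets.token_bytes(16)
        blob = _keystream_xor(key, nonce, zlib.compress(data))
        blob_id = secrets.token_hex(16)  # random id, unrelated to the file name
        self.backend[blob_id] = blob
        self.records[name] = LocalRecord(name, len(data), key, nonce, blob_id)

    def get(self, name: str) -> bytes:
        rec = self.records[name]
        blob = self.backend[rec.blob_id]
        return zlib.decompress(_keystream_xor(rec.key, rec.nonce, blob))
```

In this sketch the storage provider only ever holds a randomly named, encrypted blob, while file names, sizes and keys remain with the middleware inside the client domain.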

Preliminary service delivery scenarios
The following preliminary service delivery scenarios (Fig. 4) were envisioned at the beginning of the pilot:

1. Hosted service scenarios
A) One cloud broker instance is hosted at a central location (e.g. TERENA offices), metadata is stored at the same place (i.e. inside the broker), storage data is stored in the public cloud contracted by TERENA.
B) One cloud broker instance is hosted at a central location (e.g. TERENA offices), metadata is stored at the same place (i.e. inside the broker), storage data is either stored in the public cloud or in the data storage facilities provided by NRENs participating in the pilot coordinated by TERENA.

2. Brokered service scenarios
A) Several cloud broker instances are hosted by NRENs or end-sites, metadata is stored locally (i.e. inside the distributed brokers), storage data is stored in the public cloud that is brokered to NRENs by TERENA.
B) Several cloud broker instances are hosted by NRENs or end-sites, metadata is stored locally (i.e. inside the distributed brokers), storage data is stored in both public cloud and the data storage facility of NRENs brokered and coordinated by TERENA.


Fig. 4: Preliminary service delivery scenarios

According to the pilot results, the fully centralised service delivery scenarios (1A and 1B), where TERENA hosts and operates a single TCD platform, would not function. A distributed approach, where NRENs or client domain administrators host and operate the TCD instances by accessing one or more storage back-ends provisioned via a centralised TERENA portal, is a more realistic approach. In these cases (2A and 2B), the administrative functions (i.e. the TERENA portal part) are separated from the technical functions (i.e. the distributed TCD platform instances).
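The four scenarios and the pilot's conclusion about them can be written down as a small data structure. The sketch below is only a restatement of the text in code; the class, field and function names are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    label: str
    broker_hosting: str       # "central" (TERENA) or "distributed" (NRENs/end-sites)
    metadata_location: str
    backends: tuple           # where the storage data may end up


SCENARIOS = (
    Scenario("1A", "central", "inside the central broker",
             ("public cloud",)),
    Scenario("1B", "central", "inside the central broker",
             ("public cloud", "NREN storage")),
    Scenario("2A", "distributed", "inside the distributed brokers",
             ("public cloud",)),
    Scenario("2B", "distributed", "inside the distributed brokers",
             ("public cloud", "NREN storage")),
)


def realistic(scenarios):
    """Scenarios the pilot found realistic: brokers run by NRENs or end-sites."""
    return [s for s in scenarios if s.broker_hosting == "distributed"]


if __name__ == "__main__":
    for s in realistic(SCENARIOS):
        print(s.label, "->", ", ".join(s.backends))
```

Filtering on where the broker is hosted leaves 2A and 2B, which matches the separation of the central TERENA portal from the distributed TCD instances described above.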

Trust relationship models
In the distributed service delivery scenarios, the TCD platform can be placed at different levels in the cloud stack. Table 5 shows the possible TCD platform locations against the trust relationship throughout the entire cloud service stack.

| Trust model | No trust | Client domain trust | NREN trust | TERENA trust | Full trust |
|---|---|---|---|---|---|
| Cloud storage provider | No further trust delegation | No further trust delegation | No further trust delegation | No further trust delegation | Back-end encryption |
| TERENA | | | | TCD instance encryption | |
| NREN or Data centre | | | TCD instance encryption | | |
| University or Institute | | TCD instance encryption | | | |
| End-user | Client level encryption | Client | Client | Client | Client |
| | Out of scope | In scope | In scope | In scope | Out of scope |

Table 5: Trust relationships throughout the cloud service stack


It is assumed here that trust becomes weaker and weaker as we move up the cloud service stack from end-user to cloud storage provider. It is therefore recommended that the TCD storage middleware platform be installed as close as possible to the end-user domain. Ideally, TCD can sit at the border of the client domain controlling outgoing storage data (i.e. client domain trust model).
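The recommendation can be illustrated with a small lookup over the stack levels and trust models of Table 5. The mapping of each trust model to its encryption point is an interpretation of that table, and all names in the sketch are illustrative assumptions.

```python
# Levels of the cloud service stack, ordered from the end-user upwards (Table 5).
STACK = (
    "End-user",
    "University or Institute",
    "NREN or Data centre",
    "TERENA",
    "Cloud storage provider",
)

# Level at which data is encrypted under each trust model (as read from Table 5).
ENCRYPTION_POINT = {
    "No trust": "End-user",
    "Client domain trust": "University or Institute",
    "NREN trust": "NREN or Data centre",
    "TERENA trust": "TERENA",
    "Full trust": "Cloud storage provider",
}

# Models marked "In scope" for TCD in Table 5.
IN_SCOPE = {"Client domain trust", "NREN trust", "TERENA trust"}


def sees_plaintext(model: str) -> list:
    """Stack levels that handle unencrypted data under the given trust model."""
    idx = STACK.index(ENCRYPTION_POINT[model])
    return list(STACK[: idx + 1])


def recommended_model() -> str:
    """In-scope model whose encryption point is closest to the end-user."""
    return min(IN_SCOPE, key=lambda m: STACK.index(ENCRYPTION_POINT[m]))


if __name__ == "__main__":
    best = recommended_model()
    print(best, "->", sees_plaintext(best))
```

The fewer levels that appear in the result of `sees_plaintext`, the fewer parties have to be trusted, which is why the client domain trust model comes out as the preferred in-scope option.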

Use cases identified
Based on the extensive consultation with the pilot participants as well as the broader TERENA community (via TF-Storage and TF-MSP), four major TCD use cases have been identified:

1. Public cloud storage broker
2. Private storage infrastructure capacity aggregator
3. Private storage application back-end interface
4. Trusted data replication middleware

The following set of tables summarises the essence of these use cases.


Use case 1. Public cloud storage broker

Scope: Pan-European level (TERENA). NREN as storage service provider.

The case: An NREN wants to burst temporary storage demand peaks into public storage cloud(s), even if that nominally costs more than the in-house data storage facility.

TCD functions: API(s) to public cloud storage back-end(s). Transparent account management via federated authentication and authorisation infrastructure (AAI). Trust domain demarcation (i.e. privacy preservation by metadata separation). Strong (additional) encryption. Storage data compression (optional). Uniform search support on local metadata without touching the actual storage data.

Role of TERENA: TERENA shall conclude framework agreement(s) with appropriate public cloud storage provider(s) – optionally provide technical/legal certification of these – and make them available via the distributed TCD platform(s). Aggregate demands from NRENs, provide administration and accounting – full authentication, authorisation and accounting (AAA) if needed – via an additional centralised TERENA web portal.

Key benefits: To make sure that no personal data leaves the customer domain (i.e. beyond TCD) when bursting to public clouds. To opt in to TERENA framework agreement(s) if the total value of purchase is lower than the limit of national public tendering rules, or to team up for joint procurements with the help of TERENA.


Use case 2. Private storage infrastructure capacity aggregator

Scope: Pan-European level (TERENA). NREN as storage infrastructure provider/customer.

The case: An NREN makes spare data storage capacity of its own infrastructure dynamically available to others for a short or long period. This can typically happen in the early phase of the funding period/deployment cycle, optionally to cover the maintenance cost of the new infrastructure.

TCD functions: (Sometimes proprietary) APIs to private data storage back-end(s). Transparent account management via federated AAI. Trust domain demarcation (i.e. privacy preservation by metadata separation). Strong (additional) encryption. Storage data compression (optional). Uniform search support on local metadata without touching the actual storage data.

Role of TERENA: TERENA shall conclude agreement(s) and collect Terms of Use documents from the participating NRENs – optionally provide technical/legal certification of those offerings – and make them available via the distributed TCD platform(s). Aggregate small dynamic chunks of NREN private storage capacities into a larger consistent pool and provide them to other NRENs. Provide a centralised TERENA web portal for provisioning, administration and accounting.

Key benefits: Ensures that no personal data is exposed to NRENs that offer spare storage capacity to others, so no legal obligations arise. Offers private storage cloud interoperability not at the infrastructure but at the application level. An NREN acting as storage infrastructure customer can dynamically select the storage back-end (made available by other NRENs) that best fits the needs of its customers and applications.


Use case 3. Private storage application back-end interface

Scope: National level (NREN). NREN as storage infrastructure provider to different (commercial/community) private front-end applications.

The case: An NREN wants to provide a storage back-end to various commercial or community private storage front-end applications (such as OwnCloud or PowerFolder) of the end-user’s choice. The assumption is that the NREN is small and its API development budget and effort are limited.

TCD functions: Single back-end API (needs to be developed once by the NREN). Standard WebDAV front-end to various storage applications. Transparent account management via federated AAI. IdP-based group management and accounting (optional). Trust domain demarcation (optional).

Role of TERENA: TERENA shall enter into strategic partnerships with some selected (preferred) commercial and/or community private storage application developers (such as OwnCloud or PowerFolder). Commercial/community storage front-end application domains shall comply with TCD at the back-end or at the storage abstraction layer agreed and coordinated by TERENA.

Key benefits: Win-win strategic partnership with private storage application providers; NRENs can offer their storage infrastructure to third-party private storage application domains; commercial providers can gain greater customer trust by offering NREN storage complying with TCD.


Use case 4. Trusted data replication middleware

Scope: Bilateral agreements (NREN to NREN). NREN as storage service/infrastructure provider/customer.

The case: An NREN wants to improve reliability or offer a better Service Level Agreement (SLA) by replicating specific user data to another NREN’s storage infrastructure (or to commercial cloud storage certified by TERENA). The NREN wants to keep privacy control over the replicated storage data.

TCD functions: API(s) to public or private cloud storage back-end(s). Trust domain demarcation (i.e. privacy preservation by metadata separation). Strong (additional) encryption. Uniform search support on local metadata without touching the actual storage data.

Role of TERENA: Managing all available back-end storage offerings for replication.

Key benefits: Enhanced storage service reliability. Storage service diversification through rich availability, reliability and SLA variations tailored to the customers’ needs (and budget).


Table 6: Summary of the four use cases identified
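Use case 1 describes bursting temporary demand peaks into public storage clouds. A minimal sketch of such a placement rule is given below; the function name, the units and the example figures are hypothetical, and a real broker would also weigh cost, data classification and contractual limits.

```python
def place_demand(demand_tb: float, local_free_tb: float) -> dict:
    """Split a storage demand between in-house capacity and a public cloud.

    In-house capacity is used first; only the overflow (the peak) is burst
    out to the public cloud back-end.
    """
    if demand_tb < 0 or local_free_tb < 0:
        raise ValueError("capacities must be non-negative")
    in_house = min(demand_tb, local_free_tb)
    return {"in_house_tb": in_house, "public_cloud_tb": demand_tb - in_house}


if __name__ == "__main__":
    # Hypothetical figures: 120 TB requested, 100 TB free in-house.
    print(place_demand(120, 100))
```

With these figures only the 20 TB that exceed the in-house capacity would be sent through TCD to the public cloud.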

In all of these use cases, TCD can be considered as a distributed storage middleware platform installed at the edge of the client domains, preserving trust within the domain. For central administration, and in some cases accounting reasons, TERENA might want to provide a centralised web portal (not part of TCD).
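Several of the use cases list uniform search on local metadata without touching the actual storage data. The sketch below shows one way this could work, assuming a simple in-memory index kept inside the client domain; the class names, fields and back-end labels are hypothetical and not taken from the TCD code base.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    name: str
    owner: str
    size: int
    backend: str   # which storage back-end holds the encrypted blob
    blob_id: str   # opaque identifier at that back-end


class MetadataIndex:
    """Metadata kept inside the client domain; no back-end is contacted here."""

    def __init__(self):
        self._entries = []

    def add(self, entry: Entry) -> None:
        self._entries.append(entry)

    def search(self, text: str = "", owner: Optional[str] = None) -> list:
        """Case-insensitive name match, optionally restricted to one owner."""
        text = text.lower()
        return [
            e for e in self._entries
            if text in e.name.lower() and (owner is None or e.owner == owner)
        ]


if __name__ == "__main__":
    index = MetadataIndex()
    index.add(Entry("Report-2013.pdf", "alice", 1200, "public-cloud", "a1"))
    index.add(Entry("report-draft.txt", "bob", 300, "nren-storage", "b2"))
    index.add(Entry("photo.jpg", "alice", 5000, "nren-storage", "c3"))
    for hit in index.search("report"):
        print(hit.name, "@", hit.backend)
```

Because the query runs over the local index only, the result is the same regardless of which back-end holds each blob, and no encrypted data has to be fetched or decrypted to answer it.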


5. Business and legal considerations
All of these use cases see TERENA’s role as acting at the pan-European level. The functions that TERENA can perform are as follows:

• negotiate and coordinate framework agreements with public cloud storage providers (in Use Case 1);
• act as a single contractor and clearing house for joint public cloud storage procurements, aggregate demands (in Use Case 1);
• administer the NREN-offered dynamic storage capacity pool and provision storage to the community (in Use Case 2);
• evaluate, assess and certify cloud storage back-end offerings from a technical, legal and business point of view (in Use Cases 1, 2 and 4);
• enter into strategic partnerships with some selected (preferred) commercial and/or community private storage application providers (in Use Case 3).

For both technical and legal cloud storage certification, as well as storage procurement tendering preparations, TERENA will provide detailed technical specifications and some legal recommendations that take into account the special requirements of the TCD platform integration. These specifications can also be used by NRENs as best practice documents for preparing national tenders or evaluating the tender responses. The specifications will be available on the TCD pilot Wiki page by the second quarter of 2013.

With regard to storage front-end applications, a large number of commercial and community private cloud storage applications are available on the market. Selecting the market leader is difficult and can vary according to country or even region. The best strategy for TERENA would be to pick one (or more) popular, largely community supported/developed, open-source storage application(s) as well as one (or more) smaller commercially developed product(s) – where TERENA can better influence the software/service development. TERENA should enter into agreements with these selected private cloud storage front-end providers to comply with TCD at their back-end. These potential strategic partnership agreements could lead to a win-win situation where:

• TERENA members can benefit from good value-for-money educational licences with pre-configured (i.e. inbuilt TCD support) clients including end-user support;

• Commercial providers can benefit from the single access point to a national research and education community (i.e. aggregated demand) as well as a developed relationship of trust (via TCD), which the community values the most.

Legal aspects
Table 7 summarises those legal aspects that the pilot participants found relevant to TCD deployments, the potential use cases, and cloud storage procurements.


| Reference | Relevance | Conclusion and recommendation |
|---|---|---|
| US-EU Safe Harbor Framework | The US-EU Safe Harbor is a streamlined process for US companies to comply with the EU Directive 95/46/EC on the protection of personal data. Intended for organisations within the EU or the US that store customer data, the Safe Harbor Principles are designed to prevent accidental information disclosure or loss. US companies can opt into the program as long as they adhere to the seven principles outlined in the Directive. The process was developed by the US Department of Commerce in consultation with the EU. http://export.gov/safeharbor/eu/index.asp | Full compliance with the US-EU Safe Harbor Framework principles must be a mandatory requirement in case of public cloud storage tendering. |
| USA PATRIOT Act of 2001 | The USA PATRIOT Act of 2001 is an Act of the US Congress that was signed into law by President George W. Bush on 26 October 2001. It stands for ‘Uniting (and) Strengthening America (by) Providing Appropriate Tools Required (to) Intercept (and) Obstruct Terrorism Act of 2001’. The relevant US legislation offers ample opportunities to request data stored in the cloud. The possibility that foreign governments request information is a risk that cannot be eliminated by contractual guarantees. It is a persistent misconception that US jurisdiction does not apply if the data are not stored on US territory. The key criterion in this respect is whether the cloud provider conducts systematic business in the United States, for example because it is based there or is a subsidiary of a US-based company that controls the data in question. https://confluence.terena.org/download/attachments/39846087/Cloud_Computing_Patriot_Act_2012_EN.pdf | Higher education and research institutions should seek to gain more insight into and keep abreast of the various forms of access to data enjoyed by judicial authorities and intelligence agencies. They should, at the same time, identify the related risks. It is also recommended that the higher education sector produce a risk analysis based on a classification of the various types of data that could be requested. It would then be advisable to develop alternatives for data that might pose an unacceptable risk if they were to come into the possession of a foreign government in a non-transparent manner. |
| EC Directorate-General for Internal Policies: Fighting cyber crime and protecting privacy in the cloud, 2012 | This study addresses the challenges raised by the growing reliance on cloud computing. It starts by investigating the issues at stake and explores how the EU is addressing the identified concerns. The study then examines the legal aspects in relation to the right to data protection, the issues of jurisdiction, responsibility and regulation of data transfers to third countries. These questions have been neglected in EU policies and strategies, despite very strong implications on EU data sovereignty and the protection of citizens’ rights. In the field of cybercrime, the study thus strongly underlines that the challenge of privacy in a cloud context is underestimated, if not ignored. http://www.europarl.europa.eu/committees/en/studiesdownload.html?languageDocument=EN&file=79050 | The main concern arising for private citizens, companies and public administration using cloud technologies is not so much the possible increase in “cyber” fraud or crime as the loss of control over one’s data. From a risk-assessment perspective, the higher risk is indeed to be found in the management of the data contained in data centres, whether this management is of a criminal nature or not. Where cloud computing is possibly most disruptive is where it breaks away from the forty-year-old legal model for international data transfers, jeopardising the rights of the EU citizens. |

Table 7: TCD legal context and recommendations

TCD can be considered a privacy enforcement facility that keeps privacy-sensitive data in the client domain. It offers a technical solution to a non-technical problem. TCD complies with the International Safe Harbor Framework principles, of which the three major points are:

• Choice - Individuals must have the ability to opt out of the collection and forward transfer of the data to third parties.

• Onward transfer - Transfers of data to third parties may only occur if these follow adequate data protection principles.

• Enforcement - There must be effective means of enforcing these rules. TCD is an effective tool for this.

In the context of the USA PATRIOT Act, TCD also offers a solution by providing an alternative storage option for data that might pose an unacceptable risk if they were to come into the possession of a foreign government in a non-transparent manner. TCD keeps control over privacy-sensitive data, which is a basic recommendation of the EC Directorate-General for Internal Policies.

Pricing model
The NREN community prefers flat-rate pricing models as they can provide access to the entire research and education community of a given country, aggregating large demands (at times, several tens of thousands of users). To market TCD, TERENA can apply a simple flat-rate licensing model per TCD instance. The details of the appropriate licensing scheme will be worked out by the TERENA Secretariat, in consultation with the community as well as commercial partners.


6. Next steps and future directions
Although the TERENA TCD pilot project officially ended in April 2013, TERENA is keen to support TCD in the longer term, taking into account the results and recommendations of this final report. The TERENA action plan is to:

• continue to enhance TCD (as a storage middleware platform);
• procure storage through a framework agreement with several suppliers and offer this to the NREN members of TERENA;
• develop a system for monetising the service to cover the costs and provide an income stream;
• provide legal advice to NRENs on their use of TCD;
• facilitate community- (NREN) provided storage to be shared across TERENA members;
• embrace GN3plus SA7, Helix-Nebula and other initiatives (as well as commercial initiatives) with a view to integrating TCD.


Acknowledgement
The TERENA TCD pilot project was made possible by the TERENA members. The results could not have been achieved without the unconditional collaboration and voluntary efforts of the NRENs and other organisations actively participating in the pilot. TERENA gives special thanks to the contributors of this final report.
