That’s Off the Chain – AWS Today vs. Decentralized Cloud Storage

Can AWS optimize its supply chains and gain reliability by leveraging decentralized file storage technology?

Digitized supply chains require transparent, integrated ecosystems so that companies can fully model their networks and anticipate or react to disruptions [1]. Blockchain technology provides a new digital mechanism for supplying file storage. Decentralized file storage and retrieval can significantly improve reliability, security, privacy, supply and demand matching, and throughput time on file retrieval. Unlike traditional cloud file storage, decentralized file storage can connect customers directly to their data storage without any intermediaries. By leveraging decentralized file storage in its web storage supply chain, Amazon Web Services (AWS) can optimize both its customers’ supply chains and its own.

Proof of Supply Chain?

From a supply chain perspective, decentralization offers an advanced logistics system: it efficiently matches users’ requests for storage to hosts’ excess hard-drive space and duplicates files across different geographies, providing greater resiliency. Whereas a typical consumer has no visibility into the movement or storage of digital files with AWS, the nodes in a decentralized storage system can be verified. Moreover, when you store data on Amazon’s Simple Storage Service (S3), it may remain unencrypted on an AWS server, vulnerable to attack or misuse [2]. These issues are a source of variability for AWS and for the customer, resulting in server downtime and decreasing average output rates of file storage.
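To make the resiliency claim concrete, here is a rough sketch of how duplicating a file across hosts changes the chance that it can be retrieved. It assumes every host is online independently with the same probability, which real networks only approximate, and the function name and the numbers (95% host uptime, 3 copies, a 10-of-30 scheme) are illustrative assumptions rather than figures from any particular network.

```python
from math import comb

def file_availability(host_uptime, total_hosts, hosts_needed):
    """Probability that at least `hosts_needed` of `total_hosts` are online.

    Assumes hosts fail independently and share the same uptime (a simplification).
    """
    return sum(
        comb(total_hosts, online)
        * host_uptime ** online
        * (1 - host_uptime) ** (total_hosts - online)
        for online in range(hosts_needed, total_hosts + 1)
    )

# One copy on one host: availability equals that host's uptime.
single_copy = file_availability(0.95, total_hosts=1, hosts_needed=1)

# Three full copies: the file is lost only if all three hosts are offline.
three_copies = file_availability(0.95, total_hosts=3, hosts_needed=1)

# Hypothetical scheme where any 10 of 30 pieces are enough to rebuild the file.
ten_of_thirty = file_availability(0.95, total_hosts=30, hosts_needed=10)

print(single_copy, three_copies, ten_of_thirty)
```

Under these assumptions, each extra host multiplies down the chance that every needed piece is offline at once, which is why spreading files across many modest hosts can rival a single highly reliable server.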

Hashing out Supply Forecasting

Within a decentralized file storage system, tokenization eliminates the need for supply and demand forecasting. Suppliers of storage space are compensated with a cryptographic token (such as SiaCoin), and consumers of file storage pay with the same token. The token’s value is intrinsically linked to the value of the network: when there is more demand, the token is worth more, and when there is more supply, the token is worth less, so the price adjusts until supply meets demand. By leveraging decentralized technology, file storage services like Dropbox or AWS should be able to match supply and demand automatically. When supply and demand are aligned, a company like AWS can have dynamic supply chains, correcting the bullwhip effect that results from inexact supply chain estimates [3]. For example, AWS won’t have to decide how much data center capacity to build before it is needed, and risk under- or over-supplying storage. As the amount of data created increases, more people and organizations will need to store that data, and better storage supply chains will become critical in allowing organizations to use their data to make decisions [4].
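The price mechanism described above can be sketched as a toy simulation. Everything here is an assumption made for illustration: the linear demand and supply curves, the parameter values, and the function name are hypothetical, and real token markets are far noisier. The sketch only shows the direction of the effect: excess demand pushes the price up, excess supply pushes it down, and the price settles where the two meet.

```python
def simulate_token_price(demand_at_zero, demand_slope, supply_slope,
                         price=5.0, step=0.01, rounds=60):
    """Nudge a storage price toward the point where demand equals supply.

    Toy model: demand falls linearly with price, supply rises linearly with price.
    """
    for _ in range(rounds):
        # Renters want less storage as the price rises.
        demand = max(0.0, demand_at_zero - demand_slope * price)
        # Hosts offer more storage as the price rises.
        supply = supply_slope * price
        # Excess demand raises the price; excess supply lowers it.
        price += step * (demand - supply)
    return price

baseline = simulate_token_price(1000, 20, 30)     # settles near 1000 / (20 + 30)
more_demand = simulate_token_price(1500, 20, 30)  # more renters: higher price
more_supply = simulate_token_price(1000, 20, 80)  # more hosts: lower price

print(baseline, more_demand, more_supply)
```

No participant in this loop forecasts anything. The price itself carries the signal, which is the sense in which a token market could stand in for capacity planning.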

Figure 1: Prevalent cloud file storage & retrieval systems.

Figure 2: Decentralized storage & retrieval systems.

Currently, companies like Google, Amazon, and Dropbox have massive data centers, and files stored on these services go onto one of their central servers [10].

Hardware is Hard (When it’s Centralized)

While Amazon and AWS have not documented any effort to implement decentralized file storage, I believe AWS should be concerned about the growing risks to its supply chain. In early 2017, an Amazon data center suffered a meltdown, causing sustained outages for numerous companies that run their businesses on Amazon Web Services, such as Slack and Trello. According to Amazon, the S3 team was debugging an S3 billing issue, and while executing a command, one team member accidentally removed a set of servers that supported two other S3 subsystems. In short, a small user error was able to take down entire internet-based companies [5]. Decentralization would reduce the impact of human error, and consequently reduce variability in Amazon’s supply chain.

Leveraging the Other Kids on the Block

As a short-term measure to improve its supply chains and remain competitive, AWS could use Sia, Swarm, FileCoin (when launched), or other decentralized storage systems as a partial replacement for its existing cloud storage and content distribution networks. In the long term, by using Sia, for example, instead of S3 data centers to store user data, AWS could gain superior reliability and throughput time by storing some document copies in geographic proximity to users.

The Sia network’s design for distributed, decentralized file storage has no central point of failure, and can survive extreme events like natural disasters and war while still allowing users to maintain full control of their data [6]. The Sia network, like other decentralized storage systems, uses a blockchain to create an open marketplace for cloud storage, similar to the open market that exists for Bitcoin mining [7]. Files are encrypted before being stored on the network to ensure user privacy, and Sia uses file contracts to ensure that file hosts stay on the network [8]. Even if a storage host leaves the network, Sia’s redundancy features ensure that no single host or group of hosts becomes a point of failure [6].
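One common way to check that a remote host still holds a piece of a file, in the spirit of the provable data possession work cited in [8], is a Merkle proof: the renter keeps only a short root hash, and the host must produce a segment plus the sibling hashes that lead back to that root. The sketch below is a simplified illustration, not Sia’s actual proof format. The function names are hypothetical, SHA-256 is an arbitrary choice, and details such as separating leaf hashes from internal hashes are omitted.

```python
import hashlib

def _hash(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def _next_level(level):
    # Pair up neighbouring hashes; an odd node out is paired with itself.
    if len(level) % 2:
        level = level + [level[-1]]
    return level, [_hash(level[i] + level[i + 1]) for i in range(0, len(level), 2)]

def merkle_root(segments):
    """Root hash the renter stores before uploading (segments must be non-empty)."""
    level = [_hash(s) for s in segments]
    while len(level) > 1:
        _, level = _next_level(level)
    return level[0]

def merkle_proof(segments, index):
    """Sibling hashes the host returns to prove it holds segments[index]."""
    level = [_hash(s) for s in segments]
    proof = []
    while len(level) > 1:
        padded, parents = _next_level(level)
        sibling = index ^ 1
        # Record the sibling hash and whether it sits to the left of our node.
        proof.append((padded[sibling], sibling < index))
        level, index = parents, index // 2
    return proof

def verify(segment, proof, root):
    """Renter-side check: recompute the path from the segment up to the root."""
    node = _hash(segment)
    for sibling, sibling_is_left in proof:
        node = _hash(sibling + node) if sibling_is_left else _hash(node + sibling)
    return node == root

segments = [b"seg-0", b"seg-1", b"seg-2", b"seg-3", b"seg-4"]
root = merkle_root(segments)              # kept by the renter
proof = merkle_proof(segments, 4)         # produced by the host on request
print(verify(segments[4], proof, root))   # honest host passes
print(verify(b"tampered", proof, root))   # altered or missing data fails
```

A contract could then pay the host only when such a check passes, which is one way to give hosts a reason to stay on the network and keep the data intact.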

What’s Left to Figure Out Before AWS Forks it Over to Decentralization?

  • Can blockchain systems preserve automatic supply and demand matching, but also implement user-friendly geographic targeting (so that users can limit the countries in which hosts can store data)?
  • Are blockchains too wasteful in energy usage as they ensure data integrity? [9]
  • If S3 were to use Sia to augment its current infrastructure, would it still need its data centers? How could S3 be compensated for the fail-safes and data redundancies already built into its data centers?

(wc: 800)

Sources:

[1] “Industry 4.0: How Digitization Makes the Supply Chain More Efficient, Agile, and Customer Focused.” PricewaterhouseCoopers, 2015, p. 2.

[2] “Amazon Simple Storage Service (S3) – Cloud Storage – AWS.” Amazon Web Services, Inc., aws.amazon.com/s3/.

[3] Hau L. Lee, et al. “The Bullwhip Effect in Supply Chains.” MIT Sloan Management Review. April 15, 1997. http://sloanreview.mit.edu/article/the-bullwhip-effect-in-supply-chains/

[4] Richard Harris. “More Data Will Be Created in 2017 than the Previous 5,000 Years of Humanity.” App Developer Magazine, App Developer Magazine, 23 Dec. 2016, appdevelopermagazine.com/4773/2016/12/23/More-data-will-be-created-in-2017-than-the-previous-5,000-years-of-humanity-/.

[5] Rachel King. “Here’s Why Amazon’s Cloud Suffered a Meltdown This Week.” Fortune, 2 Mar. 2017, fortune.com/2017/03/02/amazon-cloud-outage/.

[6] David Vorick. “Sia the Decentralized Storage Network.” October 23, 2017, Cambridge, MA.

[7] Satoshi Nakamoto. “Bitcoin: A peer-to-peer electronic cash system.” 2008.

[8] Giuseppe Ateniese, Randal Burns, Reza Curtmola, Joseph Herring, Lea Kissner, Zachary Peterson, and Dawn Song. “Provable data possession at untrusted stores.” In Proceedings of the 14th ACM conference on Computer and communications security, 2007.

[9] “Bitcoin Energy Consumption Index.” Digiconomist, digiconomist.net/bitcoin-energy-consumption.

[10] Cade Metz. “Google’s Untrendy Play to Make the Blockchain Actually Useful.” Wired, Conde Nast, 3 June 2017, www.wired.com/2017/03/google-deepminds-untrendy-blockchain-play-make-actually-useful/.


Student comments on That’s Off the Chain – AWS Today vs. Decentralized Cloud Storage

  1. 1. Would a centralized cloud deployment (read: AWS) with real-time encryption and some sort of distributed hot standby solve AWS’s current issues? Could this then be superior to a distributed/decentralized cloud architecture?

    2. Who would own, mediate and evaluate supply/demand levels in a given network and ascertain the dynamic equilibrium price of the service? How would the value of this service, priced on a Sia-coin base system, stay cognizant of other fiat currencies/represent the Sia coin’s market value?

  2. I agree that the idea of decentralized file storage has its merits, but I would push back on the notion of large tech companies such as AWS, Google, or Dropbox being the primary beneficiaries or users of the technology.

    Security and privacy are already the utmost concerns for all major technology companies that house their user data. For example, Dropbox uses AES 256-bit encryption, which is the gold standard for security encryption today. Furthermore, the internal anti-hacking product security teams at major companies are more competent and have more manpower than those at startups like Sia or FileCoin. Regarding privacy, all major companies publish privacy and transparency reports that show their commitment to protecting user data, especially in response to government data requests. The Electronic Frontier Foundation publishes an annual report that shows major tech companies like Dropbox, Facebook, and Google all following industry best practices, and all scoring at least 4/5 on their rubric.

    On the point of matching supply and demand, the costs of oversupply become trivialized with Moore’s Law, the idea that the number of transistors that can be fit on a computer chip will double every 18 months. In practice, unit storage costs at companies that build their own data centers effectively halve every 18 months. Thus, these major players need to forecast supply only directionally correctly to continuously reap these cost savings over time. Companies like Dropbox go as far as producing their own hardware and creating custom programming languages to ultra-optimize the efficiency of their servers to capture unit cost savings.

    Regarding service levels, the companies that build their own data centers already design sufficient redundancies to provide 99.9999%+ uptime (“six nines”), as well as survive natural disasters, which are being constantly simulated. To the extent that distributed networks are still reliant on the service levels of major internet service providers, their service levels cannot practically exceed six nines of uptime, so there’s no guarantee that decentralized file storage would have more robust reliability.

    In summary, it’s more likely that smaller startups who lack the scale and resources to capture the benefits of building their own data centers are the ones who will benefit from decentralized file storage technology, as opposed to major file storage players such as AWS or Dropbox.

    Electronic Frontier Foundation report: https://www.eff.org/who-has-your-back-2017
    Cloud storage costs: https://www.cnet.com/news/google-on-cloud-storage-pricing-follow-moores-law/
    Dropbox’s tech blog post on building its own data center: https://blogs.dropbox.com/tech/2016/05/inside-the-magic-pocket/
    Service reliability levels: http://vinciconsulting.com/blog/-/blogs/%E2%80%9Cthe-table-of-nines%E2%80%9D-and-high-availability

  3. Agree that the major file storage players may be less likely to benefit. A few other thoughts:

    – Appreciate the work EFF is doing, but it’s still hard to believe that a 4/5 rating means that our data is safe. Check out https://haveibeenpwned.com/PwnedWebsites

    – Imagine two companies competing in a world where Moore’s law holds up longer than expected. Both benefit, but the distributed company’s cost savings will consistently be bigger, which is a risk to the other, centralized player.

    – So far, AWS was down for at least 4 hours in 2017. 1-(4/(365*24)) = 99.95433%. Do we trust the 99.9999% figure?
