I attended Cloud Camp Boston yesterday. It was a great meeting with some good discussions, and several hundred people attended. What struck me about the general session (when all attendees were present) was how much interest there was in data in the cloud. For example, during the “unpanel” (where people become panelists in real time), 50% of the questions that were up for grabs (5 of the 10) dealt with data in the cloud. That’s pretty significant. The five questions were:
- How do I integrate large amounts of enterprise data in the cloud? (answers mentioned various approaches, ranging from more traditional methods to new vendor technology)
- How do I move my enterprise data into the cloud? (answers included ship it FedEx on a hard drive and make sure there is a proven chain of custody around the transfer)
- How do I ensure the security of my data in the cloud? (no answer – that deserved its own breakout session)
- What is the maximum sustained data transfer rate in the cloud? (answers included “when it takes a server down” and “no one knows”, but a year ago someone mentioned that 8 gigabytes a second took down a cloud provider)
- How do applications (and data) interoperate in the cloud? (answers included that standards need to rule)
There were some interesting breakout sessions as well: one on the aforementioned security (and audit), another an intro to cloud computing (moderated by Judith Hurwitz), one about channel strategies, and a number of others. I attended a breakout session about analytics and BI in the cloud and again, for obvious reasons, much of the discussion was data-centric. Some of the discussion items included:
- What public data sets are available in the cloud?
- What is the data infrastructure needed to support various kinds of data analysis?
- What SaaS vendors offer business analytics in the cloud?
- How do I determine what apps/data make sense to move to the cloud?
The upshot? Data in the cloud – moving it, securing it, accessing it, manipulating it, and analyzing it – is going to be a hot topic in 2010.
8 thoughts on “Top of Mind – Data in the Cloud”
I would actually pose a more basic question. Why do we need to keep our data in the cloud? Wouldn’t it make sense to keep our data on our own premises and only send the necessary data to the cloud, in a transient state, when it has to be used by the cloud functionality? Would it be possible to have my salesforce.com database behind my firewall, rather than in the cloud, where I cannot be at all sure that it is not copied, leaked, etc.? Worth thinking about.
There was zero discussion of public data in the cloud at Cloudcamp St Louis. What did you guys come up with on the topic in Boston?
We briefly discussed some of the public data that is out there in the cloud: US Census, Open Street Maps, some Amazon public data, DBpedia, and the fact that NSF might require scientists who receive its grants to make all of their data public (in the cloud?).
Speaking as a geogeek: OSM is not cloud data at this time. CloudMade is, but that is a derived view rather than OSM itself. The same goes for the NSF requirement: public, but not necessarily cloud.
I’m not sure if these are in the cloud exactly, but DATA.gov has as its mission to increase public access to high-value, machine-readable datasets generated by the Executive Branch of the Federal Government. There is only a limited amount of data right now, but with the emphasis on Open Government, more and more data is expected to be released.
Here’s a newsflash for everyone. OMB is now requiring all federal agencies to use the cloud or justify why not. I believe this is all a part of the administration’s Open Government initiative. See the link below for more information.
[…] couple days ago I commented a blog entry titled “Top of Mind – Data in the Cloud”. Unfortunately, the discussion moved into an other direction, focusing on the government data […]
For the customers of our cloud storage company, Nirvanix, common use cases are: remote archive, long-term near-line storage (think media assets for mastering), and “third copy”.