EMC has moved Data Lake from 1.0 to 2.0, and with that announcement the company plans to address the data needs of the mobile workforce on the edge of the data centre. The centrepiece of the Data Lake 2.0 strategy is the new IsilonSD Edge. However, it’s not the only solution EMC released today. EMC is also providing a next-generation Isilon OneFS operating system designed for unstructured data, and Isilon CloudPools, also for unstructured data, which provides an avenue for moving archival data to the public cloud.
The strategy behind this three-pronged approach for Data Lake 2.0 is an expansion of data storage to the edge with IsilonSD Edge, while OneFS takes on the core and CloudPools pushes less sensitive data out to public clouds such as Amazon Web Services, Virtustream and Microsoft Azure to provide customers with a cheaper alternative to housing data on-premises.
Phil Bullinger, senior vice president of the Isilon product line at EMC’s Emerging Technology Division, said information is exploding to the edge and people are not in the same place all the time. Knowledge workers have a limited amount of data storage at the edge, while the IT department is challenged by governance issues, and the combination has a negative impact on productivity.
“What we are trying to solve is no more disconnected edge work and no more unmanaged storage,” he said.
The new software for the edge will be based on software-defined storage and is able to run on commodity hardware on a VMware hypervisor. No matter if the data is accessed from a Mac, a Windows machine or even Hadoop, all protocols will apply.
OneFS’s data services and protocols can scale up to 36TB, while replicating data that can be distributed from the core. The software-defined IsilonSD Edge system will be made available in the channel free for non-production use and licensed per cluster for production use. The OneFS .Next system is the core or the “nerve centre of the Data Lake,” Bullinger added. This enables data centres to be always up and secure, while providing enterprise-grade computing. OneFS .Next will also provide non-disruptive upgrades and the ability to roll back the Data Lake, reducing risk.
According to Bullinger, EMC has pinpointed three states of data: hot, warm and frozen. Hot data is newly created data. Warm data is data that is being accessed by knowledge workers. Frozen data is not accessed and is kept either for archival reasons or for compliance.
In an example Bullinger provided of a 1,000 TB data centre, Hot data would make up approximately 10 per cent on average, while 40 per cent would be Warm data and 50 per cent Frozen.
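To put Bullinger's percentages in terms of capacity, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function and variable names are made up for this example, and the shares are the approximate averages he quoted, not measurements from a real data centre.

```python
# Illustrative sketch only: the names below are hypothetical, and the
# shares are the approximate averages quoted in the example above.
TOTAL_TB = 1_000
TIER_SHARE_PERCENT = {"hot": 10, "warm": 40, "frozen": 50}


def tier_capacity_tb(total_tb, shares):
    """Return the capacity in TB attributed to each tier."""
    if sum(shares.values()) != 100:
        raise ValueError("tier shares must add up to 100 per cent")
    return {tier: total_tb * pct / 100 for tier, pct in shares.items()}


capacities = tier_capacity_tb(TOTAL_TB, TIER_SHARE_PERCENT)
for tier, tb in capacities.items():
    print(f"{tier}: {tb:.0f} TB")
```

On those figures, half of the data centre, or roughly 500 TB, would be Frozen data.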
In this scenario, Bullinger said the Frozen data is not adding any value to the organization, while still being costly. The simple plan is to move the data to the cloud, and the organization will get cost savings from moving from Capex to Opex. “This changes the financial burden. It looks like a great idea, but it’s not simple. There are too many pitfalls and certainly not a seamless process. The cloud has different file formats for example. Some clouds have poor security and (for the IT department or channel partner) a painful cloud integration situation,” he said.
CloudPools, through partnerships with AWS, Virtustream and Azure, is the solution EMC is proposing in the Data Lake 2.0 plan. CloudPools shifts data to the cloud transparently without any third-party appliances, and it looks like another storage tier in an organization’s environment.
American film producer Jon Landau, who is working on the sequel to Avatar, is an early adopter of the new IsilonSD Edge system.
With Isilon, Landau said he’s now able to work on a 100 per cent digital strategy. “No one in Hollywood is working on film anymore. The next three Avatar movies will be digital and we will be working with data storage and repurposing data that will blow the ceiling off with high frame rates, laser projection off of laser files with better sound,” he said.
Landau added the Data Lake 2.0 approach opens up the possibility to push the technology further and work with the best of the best from around the world.
The production of Avatar 2 is taking place in two locations more than 10,000 kilometres apart. Landau is working with New Zealand-based digital solution provider LightStorm Productions, and part of Avatar 2 is being developed in New Zealand, while the bulk of the movie is done in Los Angeles.
Jeremy Burton, the president of EMC products and marketing, said the data is created locally in the Avatar scenario and the producers need to find a way to get that data to Los Angeles. “A different approach is required,” he said.
“Data Lake 2.0 can empower a global workforce and connect to the core with an integrated system,” Burton said.
EMC IsilonSD Edge, OneFS and CloudPools will be generally available in early 2016. Pricing information was not released at press time. EMC said pricing will be made available upon general availability.