#ibmedge © 2016 IBM Corporation
Analytics for Object Storage Simplified
- Spectrum Scale Object Storage with
File Access for Hadoop
Sandeep Patil, STSM, IBM Spectrum Scale
Tomer Perry, Solution Architect, IBM Spectrum Scale
Smita Raut, Object Development, IBM Spectrum Scale
Acknowledgement: Bill Owen, Ashutosh Mate, Shou Feng, John Gu, Yong Zeng, Piyush Chaudhary, Wei Gong
Agenda
• Introduction to Spectrum Scale
• Introduction to Spectrum Scale Analytics
• Introduction to Spectrum Scale Object Store
• Unified File & Object Access (UFO) Feature Details
• Use Cases Enabled By UFO
• Deep Dive of In-Place Analytics Use Case
• Demo
• Q & A
Spectrum Scale Analytics Introduction
GPFS-FPO Advanced Storage for Map Reduce Data
Hadoop HDFS | IBM GPFS Advantages
HDFS NameNode is a single point of failure | No single point of failure, distributed metadata
Large block sizes – poor support for small files | Variable block sizes – suited to multiple types of data and data access patterns
Non-POSIX file system – obscure commands | POSIX file system – easy to use and manage
Difficulty to ingest data – special tools required | Policy based data ingest
Single-purpose, Hadoop MapReduce only | Versatile, multi-purpose
Not recommended for critical data | Enterprise-class advanced storage features
Use Case: Big Data Analytics
• Problem: Separate storage systems for ingest/distribution
and analysis
• Data movement overhead is a significant part of
my time to insight.
• Increased cost from data duplication & overhead
• Inconsistent results
• Solution: Native HDFS support
• Decreased time to results
• Run Map/Reduce directly
• No waiting for data transfer between storage
systems
• Immediately share results
[Diagram: Spectrum Scale provides File/Object and File/HDFS access for global ingest and distribution, business analytics, custom applications, and packaged applications]
Spectrum Scale Object Storage Introduction
IBM Spectrum Scale
• Avoid vendor lock-in with true Software
Defined Storage and Open Standards
• Seamless performance & capacity scaling
• Automate data management at scale
• Enable global collaboration
Data management at scale: OpenStack and Spectrum Scale help clients manage data at scale
Client needs
• Business: I need virtually unlimited storage.
• Operations: I need a flexible infrastructure that supports both object and file based storage.
• Operations: I need to minimize the time it takes to perform common storage management tasks.
• Collaboration: I need to share data between people, departments and sites with low latency.
Capabilities
• A single data plane that supports Cinder, Glance, Swift, Manila as well as NFS, et al.
• A fully automated policy based data placement and migration tool
• An open & scalable cloud platform
• Sharing with a variety of WAN caching modes
Results
• Converge File and Object based storage under one roof
• Employ enterprise features to protect data, e.g. Snapshots, Backup, and Disaster Recovery
• Support native file, block and object sharing to data.
[Diagram: Spectrum Scale exposing NFS, SMB, POSIX, Swift, HDFS, Cinder, Glance, Manila and Cognitive Services over SSD, fast disk, slow disk and tape tiers]
Spectrum Scale Object Storage
• Basic support added in the 4.1.1 release and enhanced in the 4.2 and 4.2.1 releases
• Based on OpenStack Swift (Juno release)
• REST-based data access
• Growing number of clients due to extremely simple protocol
• Applications can easily save & access data from anywhere using HTTP
• Simple set of atomic operations:
– PUT (upload)
– POST (update metadata)
– GET (download)
– DELETE
• Amazon S3 Protocol support
• High Availability with CES Integration
• Simple and Automated Installation Process
• Integrated authentication (Keystone) support
• Native GPFS Command Line Interface to manage Object service (mmobj command)
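The four operations listed above map directly onto HTTP verbs. The following is a minimal sketch, assuming a hypothetical endpoint (`https://swift.example.com/v1/acct`) and a placeholder token; it only builds the requests with Python's standard library and does not send them, so it is an illustration of the protocol shape rather than a client for any particular cluster.

```python
from urllib.request import Request

# Hypothetical endpoint and token, for illustration only.
STORAGE_URL = "https://swift.example.com/v1/acct"
TOKEN = "example-token"


def swift_request(method, container, obj, data=None, metadata=None):
    """Build (but do not send) an HTTP request for one Swift object operation."""
    headers = {"X-Auth-Token": TOKEN}
    # Custom object metadata travels in X-Object-Meta-* headers.
    for key, value in (metadata or {}).items():
        headers["X-Object-Meta-" + key] = value
    url = f"{STORAGE_URL}/{container}/{obj}"
    return Request(url, data=data, headers=headers, method=method)


upload = swift_request("PUT", "cont", "a.jpg", data=b"image bytes")
update = swift_request("POST", "cont", "a.jpg", metadata={"Camera": "demo"})
download = swift_request("GET", "cont", "a.jpg")
delete = swift_request("DELETE", "cont", "a.jpg")

for req in (upload, update, download, delete):
    print(req.get_method(), req.full_url)
```

Passing any of these objects to `urllib.request.urlopen` would perform the call against a real endpoint; the container and object names here are placeholders.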
Spectrum Scale Object Store – Additional Features
• Unified file and object support with Hadoop connectors
• Support for Encryption
• Support for Compression
• Only Object Store with Tape support for Backup
• Object store with integrated transparent cloud tiering Support
• Multi Region support
• AD/LDAP support for authentication
• ILM support for Object
• Movement of Object across storage tiers based on access heat
• Spectrum Scale Object with IBM DeepFlash provides an object store on an all-flash array for newer, faster workloads
• Spectrum Scale Object with WAN caching support (AFM)
Spectrum Scale Object Vs Cleversafe
The right solution for the workload
Spectrum Scale
Ideal Workloads
• Big Data Analytics
• High Performance Computing, e.g. Engineering Applications
• Performance optimized Backup and Restore
• Multi-Site file collaboration
• Multi-tier File Synch and Share
• Cold data archive with lowest cost data storage tier
Differentiation
• Designed for high performance
• Unified Storage Infrastructure: Native File, Object & Hadoop
• Robust Tiering with policy based data placement and data movement
• Multi site collaboration with advanced routing and caching
• Enterprise Features, e.g. Encryption, Compression, QoS, & Disaster Recovery
IBM Cloud Object Store (Cleversafe)
Ideal Workloads
• Active Archive (warm data, mostly static)
• Cost optimized Cloud backup target
• Web app content
• Remote office storage consolidation
• Storage as a service
Differentiation
• Designed for easy deployment and management at scale
• Always-on architecture
• Geo-dispersed erasure coding for site fault tolerance and DR
• Simple keyless native encryption and multi-tenant security
• Reduced cost and complexity
IBM Spectrum Scale: Unified File and Object Access
Feature Overview
Unified File & Object (UFO) Support
• Challenge
• The world is not converged/file/object/HDFS today!
• and never will be completely…
• Unified Scale-out Content Repository
• File or object in. Object or file out.
• Integrated big data analytics support
• Native protocol support
• High-performance that scales
• Single Management Plane
[Diagram: Spectrum Scale: Redefining Unified Storage. NFS, SMB, POSIX, Swift/S3 and HDFS access over SSD, fast disk, slow disk and tape tiers]
Spectrum Scale Unified File & Object
• Access same content both as a File & as an Object without making a copy or needing File or Object
Gateways!
• File-In-Object-Out and Object-In-File-Out Support
• Support for File Access Protocols (NFS/SMB/POSIX) and Object Access Protocols (Swift/S3)
• Objects ingested into designated Unified Container available as Files and Files ingested into it available as
Objects.
• Support for File & Object ACLs with Unified Mode ID Mapping
Unified File and Object Access – What is it ?
What is Unified File and Object Access ?
• Accessing object using file interfaces
(SMB/NFS/POSIX) and accessing file using object
interfaces (REST) helps legacy applications
designed for file to seamlessly start integrating into the
object world.
• It allows object data to be accessed using
applications designed to process files. It allows file
data to be published as objects.
• Multi protocol access for file and object in the same
namespace (with common User ID management
capability) allows supporting and hosting data oceans
of different types of data with multiple access options.
• Optimizes various use cases and solution architectures
resulting in better efficiency as well as cost savings.
[Diagram: a container on a clustered file system, served by Swift (with Swift on File) over Object (http) and by NFS/SMB/POSIX. File exports are created at container level, or POSIX access is given from container level. Flows: 1. Data ingested as Objects; 2. Objects accessed as Files; 3. Data ingested as Files; 4. Files accessed as Objects]
Flexible Identity Management Modes
• Supports two identity management modes
• Administrators can choose based on their need and use case using the CLI:
# mmobj config change --ccrfile object-server-sof.conf --section DEFAULT --property id_mgmt --value unified_mode | local_mode
Identity Management Modes
Local_Mode
• Object created by Object interface will be owned by the internal “swift” user
• Application processing the object data from the file interface will need the required file ACL to access the data
• Object authentication setup is independent of File authentication setup
• Suitable when auth schemes for file and object are different and unified access is for applications
Unified_Mode
• Object created from Object interface should be owned by the user doing the Object PUT (i.e. the FILE will be owned by the UID/GID of the user)
• Users from Object and File are expected to use common authentication and come from the same directory service (only AD+RFC 2307 or LDAP)
• Owner of the object will own and have access to the data from the file interface
• Suitable for unified file and object access for end users. Leverage common ILM policies for file and object data based on data ownership
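The ownership rule that distinguishes the two modes can be sketched in a few lines. This is an illustrative model only, not Spectrum Scale code: the `User` type, the `file_owner` function and the numeric IDs given to the internal "swift" user are assumptions made for the example; only the mode names `local_mode` and `unified_mode` come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    uid: int
    gid: int


# Stand-in for the internal "swift" user; the numeric IDs are made up.
SWIFT_USER = User("swift", 160, 160)


def file_owner(id_mgmt, putting_user):
    """Return the user that owns the file backing a newly PUT object."""
    if id_mgmt == "local_mode":
        # Every object is owned by the internal swift user.
        return SWIFT_USER
    if id_mgmt == "unified_mode":
        # The file is owned by the UID/GID of the user doing the PUT.
        return putting_user
    raise ValueError(f"unknown id_mgmt value: {id_mgmt!r}")


riya = User("riya", 1001, 2000)
print(file_owner("local_mode", riya).name)
print(file_owner("unified_mode", riya).name)
```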
Use Cases Enabled by Unified File Object
Use case 1 – Enabling “In-Place” analytics for Object
data repository with analytic results available as objects
Analytics on Traditional Object Store
• Data must be copied from the object store to a dedicated cluster, analyzed there, and the result copied back to the object store for publishing.
• Explicit data movement.
• Image source: https://aws.amazon.com/elasticmapreduce/
Analytics With Unified File and Object Access
• Object data is available as files on the same fileset. Analytics systems like Hadoop MapReduce or Spark allow the data to be directly leveraged for analytics.
• No data movement, i.e. in-place, immediate data analytics.
• Results are returned in place and published as objects with no data movement.
[Diagram: data ingested as objects over Object (http) into <SOF_Fileset>/<Device> on a clustered file system, with Spark or Hadoop MapReduce running in-place analytics on the same fileset]
Use case 2 : Process Object Data with File-Oriented
Applications and Publish Outcomes as Objects
[Diagram: Media House OpenStack Cloud Platform (Tenant = Media House Subsidiaries), built on Swift on File]
• Media objects are ingested into Container 1 (Subsidiary 1) and Container 2 (Subsidiary 2).
• Raw media content is sent for media processing, which happens over files (Object to File access): NFS exports on Container 1 and Container 2 are mounted by the VM farm of each subsidiary for video processing.
• Manila shares (NFS) are exported only for Subsidiary 1 and only for Subsidiary 2, respectively.
• Files are converted into objects for publishing (File to Object access): an NFS export on Container 1’ receives the processed output.
• Final processed videos are available as objects in a container which is used for external publishing.
• Publishing channels: final video (as objects) available for streaming.
Use case 3 : Users read/write data via File and Object
with Common User Authentication and Identity
[Diagram: users John and Riya each access data on a clustered file system over NFS, SMB and Object, authenticated against a corporate user directory (Active Directory/LDAP)]
• Access common data using the same user credentials across all protocols.
• Riya’s data read/written from Object should be owned by Riya when accessed from File (SMB/NFS/POSIX).
• Example identity: User: Riya, UID: 1001, GID: 2000, Domain: XYZ
Deep Dive on In-Place Analytics Use Case
Analytics use case
What is In-place Analytics ?
What is unified storage ?
Setup Details
Prerequisites for demo
Demo content
What is In-place Analytics ?
Setup Details
IBM BigInsights with Spectrum Scale Demo Setup
Spectrum Scale Cluster
• viknode1: roles Admin, Quorum, NSD
• viknode2: roles Quorum, NSD, CES Node
• viknode3: roles Quorum, CES Node
• Disks: /dev/dm-2, /dev/dm-3
Ambari Server
IBM BigInsights services: Yarn, Spark, Hive, Oozie, Slider, Knox
Prerequisites For Demo
• Set up a Spectrum Scale cluster with NFS, SMB and Object protocols enabled
• Set up the same authentication for File and Object
• Enable unified access mode
• Enable file access capabilities
• Create a Swift storage policy with file access enabled
• Install BigInsights
• Start the Ambari server
• Configure the Cyberduck client to access the object store
Demo Content
• Upload a file using Cyberduck
• Run analytics on the file
• Download the result using Cyberduck
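The middle step of the demo, running analytics directly on the uploaded data, can be pictured with a small stand-in. This is a minimal sketch, not the BigInsights job used in the session: a plain word count replaces Spark or MapReduce, a temporary directory stands in for the container's path on the file system, and the names `word_count_in_place`, `input.txt` and `result.txt` are hypothetical.

```python
import tempfile
from collections import Counter
from pathlib import Path


def word_count_in_place(container_dir, source_name, result_name):
    """Count words in a file inside the container directory and write the
    result next to it, so no data leaves the file system."""
    container = Path(container_dir)
    counts = Counter(container.joinpath(source_name).read_text().split())
    lines = [f"{word}\t{n}" for word, n in sorted(counts.items())]
    container.joinpath(result_name).write_text("\n".join(lines) + "\n")
    return counts


# A temporary directory stands in for the container path on the file system.
with tempfile.TemporaryDirectory() as container_dir:
    Path(container_dir, "input.txt").write_text("scale object file object\n")
    counts = word_count_in_place(container_dir, "input.txt", "result.txt")
    result_text = Path(container_dir, "result.txt").read_text()
    print(result_text, end="")
```

In the unified setup described in this deck, a result file written through the file interface becomes downloadable as an object once it has been objectized.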
Demo
Spectrum Scale User Group
• The Spectrum Scale User Group is free
to join and open to all using, interested
in using or integrating Spectrum Scale.
• Join the User Group activities to meet
your peers and get access to experts
from partners and IBM.
• Next meetings:
- APAC: October 14, Melbourne
- Global at SC16 : November 13 1pm to 5pm, Salt Lake City
• Web page: http://www.spectrumscale.org/
• Presentations: http://www.spectrumscale.org/presentations/
• Mailing list: http://www.spectrumscale.org/join/
• Contact: http://www.spectrumscale.org/committee/
• Meet Bob Oesterlin (US Co-Principal) at Edge2016: Robert.Oesterlin@nuance.com
Session : Futures of IBM Spectrum Scale
NDA & Customers ONLY
• Who: IBM Spectrum Scale Offering Management
• Carl Zetie, Ron Riffe
• When: Tuesday, September 20, 2016
• 1pm to 2pm
• Where: MGM Grand, Signature Tower 3
• Meeting Room D
• Contact (if any questions)
• douglasof@us.ibm.com, cmukhya@us.ibm.com
Session : How to apply Flash benefits to big data
analytics and unstructured data
NDA & Customers ONLY
• Who: IBM Elastic Storage Server Offering Management
• Alex Chen
• When: Thursday, September 22, 2016
• 1:15pm to 2:15pm
• Where: Grand Garden Arena, Lower Level, MGM, Studio 10
• Contact (if any questions)
• cmukhya@us.ibm.com, douglasof@us.ibm.co
Trial VM
• Download the IBM Spectrum Scale Trial VM from: http://www-03.ibm.com/systems/storage/spectrum/scale/trial.html
References
Write a File, read as an Object: OpenStack Summit, Austin, TX, Apr 2016
https://www.youtube.com/watch?v=6ovLb6aktbM&feature=youtu.be&t=2
Amalgamating Manila and Swift for Unified Data Sharing: OpenStack Summit, Austin, TX, Apr 2016
https://www.youtube.com/watch?v=3MMrMUaA_Mg
Hadoop HDFS Vs Spectrum Scale
https://www.youtube.com/watch?v=kOeEbdO8F4A
From Archive to Insight: Debunking Myths of Analytics on Object Stores – Dean Hildebrand, Bill Owen, Simon Lorenz, Luis Pabon, Rui Zhang. Vancouver Summit, Spring 2015
https://www.youtube.com/watch?v=brhEUptD3JQ
Deploying Swift on a File System – Bill Owen, Thiago Da Silva. BrownBag at OpenStack Paris, Fall 2014
https://www.youtube.com/watch?v=vPn2uZF4yWo
Breaking the Mold with OpenStack Swift and GlusterFS – Jon Dickinson, Luis Pabo. Atlanta Summit, Spring 2014
https://www.youtube.com/watch?v=pSWdzjA8WuA
SNIA SDC 2015
http://www.snia.org/sites/default/files/SDC15_presentations/security/DeanHildebrand_Sasi__OpenStack%20SwiftOnFile.pdf
Spectrum Scale Infocenter
http://www.ibm.com/support/knowledgecenter/#!/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.adm.doc/bl1adm_manageunifiedaccess.htm
OpenStack Summit 2016: IBM Spectrum Scale in an OpenStack Environment Redpaper Published.
http://www.redbooks.ibm.com/abstracts/redp5331.html
Thank You
IBM Spectrum Scale - Unified File and Object Access
Feature Overview
• Multi protocol access for file and object in the same namespace
• Access object as file from POSIX, NFS and SMB
• Access file as object
– Provision to convert files to object automatically via background service called ‘objectizer’
– Provision to explicitly and immediately convert files to objects using CLI
• Feature is specifically made available as an “object storage policy”
• Can coexist with traditional object and other policies
• Create multiple unified file and object access policies
• Since policies are applicable per container , it gives end user the flexibility to create certain containers with Unified File and Object
Access policy and certain without it.
Flexible Identity Management Mode Support
• Local Mode: Suitable when auth schemes for file and object are different and unified access is for applications
• Object created by Object interface will be owned by internal “swift” user
• Unified Mode: Suitable for unified file and object access by end users. Leverage common ILM policies for file and object data based on data
ownership.
• Object created from Object interface should be owned by the user doing the Object PUT (i.e. FILE will be owned by UID/GID of the
user)
• Ability to run in-place analytics of object data using Spectrum Scale Hadoop connectors via POSIX interface.
Filesystem Layout (Traditional Vs Unified File and Object
Access)
• One of the key advantages of unified file and object access is the placement and naming of objects when stored on the file system. Unified file and object access stores objects following the same path hierarchy as the object's URL.
• In contrast, the default object implementation stores the object following the mapping given by the ring, and its final file path cannot be determined by the user easily.
Ingest object URL: https://swift.example.com/v1/acct/cont/a.jpg
Traditional Swift (object ingest of a.jpg): ibm/gpfs0/object_fileset/o/z1device108/objects/7551/125/75fc66179f12dc513580a239e92c3125
Unified File and Object Access (object ingest of a.jpg): ibm/gpfs0/<Sof_policy_fileset>/<device>/AUTH_acctID/cont/a.jpg
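Because the unified layout follows the URL hierarchy, the file path can be derived from the object URL. The sketch below shows that idea under stated assumptions: the function name `unified_path`, the fileset path and the device directory name are hypothetical, and the account segment of the URL is used verbatim as the account directory (the slide shows that directory in the form `AUTH_acctID`).

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def unified_path(object_url, fileset_root, device):
    """Map a Swift object URL to its path under a unified-access fileset."""
    # URL path layout: /<version>/<account>/<container>/<object>
    parts = urlparse(object_url).path.strip("/").split("/", 3)
    if len(parts) != 4:
        raise ValueError("expected /<version>/<account>/<container>/<object>")
    _version, account, container, obj = parts
    return PurePosixPath(fileset_root, device, account, container, obj)


path = unified_path(
    "https://swift.example.com/v1/AUTH_acct/cont/a.jpg",
    "/ibm/gpfs0/sof_policy_fileset",  # hypothetical fileset path
    "device0",                        # hypothetical device directory name
)
print(path)
```

No such function can be written for the traditional layout without consulting the ring, which is the contrast the slide draws.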
Easy Access Of Objects as Files via supported File
Interfaces (NFS/SMB/POSIX)
• Objects ingested are available immediately for File access via the 3 supported file protocols.
• ID management modes give the flexibility of assigning or retaining owners, which file protocols generally require.
• Object authorization semantics are used during object access and file authorization semantics are used during file access of
the same data – thus ensuring compatibility of object and file applications
[Diagram: <Spectrum Scale Filesystem> / <SOF_Fileset>/<Device> / <AUTH_account_ID> / <Container>. 1. Data ingested as Objects over Object (http); 2. Objects accessed as Files over NFS/SMB/POSIX. File exports are created at container level, or POSIX access is given from container level]
Objectization – Making Files as Objects (Accessing File
via Object interface)
• Spectrum Scale 4.2 includes a system service called ibmobjectizer that is responsible for objectization.
• Objectization is a process that converts files ingested from the file interface on unified file and object access
enabled container path to be available from the object interface.
• When new files are added from the file interface, they need to be visible to the Swift database to show
correct container listing and container or account statistics.
[Diagram: unified file and object fileset on a Spectrum Scale file system. 1. Data ingested as Files over NFS/SMB/POSIX; 2. objectization by ibmobjectizer; 3. Files accessed as Objects over Object (http)]
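The idea behind objectization can be pictured as reconciling what is on disk with what the container listing knows. The following is a toy model under loud assumptions, not the ibmobjectizer implementation: the `objectize` function is hypothetical, and a plain dictionary of name to size stands in for the Swift container database.

```python
import tempfile
from pathlib import Path


def objectize(container_dir, container_listing):
    """Add files found on disk but missing from the listing; return new names."""
    root = Path(container_dir)
    on_disk = {
        p.relative_to(root).as_posix(): p.stat().st_size
        for p in root.rglob("*")
        if p.is_file()
    }
    new_names = sorted(set(on_disk) - set(container_listing))
    for name in new_names:
        # Registering the size keeps container statistics consistent.
        container_listing[name] = on_disk[name]
    return new_names


with tempfile.TemporaryDirectory() as container_dir:
    Path(container_dir, "a.jpg").write_bytes(b"ingested as object")
    listing = {"a.jpg": 18}  # already known to the object interface
    Path(container_dir, "b.txt").write_text("written over NFS")
    added = objectize(container_dir, listing)
    print(added, listing)
```

Running such a reconciliation periodically is what makes files written over NFS, SMB or POSIX show up in container listings without an explicit upload.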
Unified File and Object Access – Policy Integration for
Flexibility
• This feature is specifically made available as an “object storage policy” as it gives the following
advantages:
• Flexibility for administrator to manage unified file and object access separately
• Can coexist with traditional object and other policies
• Create multiple unified file and object access policies which can vary based on underlying storage
• Since policies are applicable per container , it gives end user the flexibility to create certain containers
with Unified File and Object Access policy and certain without it.
• Example: mmobj policy create SwiftOnFileFS --enable-file-access
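Since storage policies apply per container, a client selects the unified policy when it creates the container. In OpenStack Swift this is done with the `X-Storage-Policy` header on the container PUT. The sketch below only builds that request; the endpoint, token and container name are hypothetical, and the policy name is the one from the mmobj example above.

```python
from urllib.request import Request


def create_container_request(storage_url, token, container, policy):
    """Build (but do not send) a Swift request that creates a container
    bound to the named storage policy."""
    headers = {"X-Auth-Token": token, "X-Storage-Policy": policy}
    return Request(f"{storage_url}/{container}", headers=headers, method="PUT")


req = create_container_request(
    "https://swift.example.com/v1/acct",  # hypothetical endpoint
    "example-token",                      # hypothetical token
    "unified_cont",                       # hypothetical container name
    "SwiftOnFileFS",                      # policy name from the example above
)
print(req.get_method(), req.full_url, req.get_header("X-storage-policy"))
```

Containers created without the header fall back to the default policy, which is how unified and traditional containers can sit side by side in one account.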
Notices and Disclaimers
Copyright © 2016 by International Business Machines Corporation (IBM). No part of this document may be reproduced or transmitted in any form without written permission
from IBM.
U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM.
Information in these presentations (including information relating to products that have not yet been announced by IBM) has been reviewed for accuracy as of the date of
initial publication and could include unintentional technical or typographical errors. IBM shall have no responsibility to update this information. THIS DOCUMENT IS
DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IN NO EVENT SHALL IBM BE LIABLE FOR ANY DAMAGE ARISING FROM THE
USE OF THIS INFORMATION, INCLUDING BUT NOT LIMITED TO, LOSS OF DATA, BUSINESS INTERRUPTION, LOSS OF PROFIT OR LOSS OF OPPORTUNITY.
IBM products and services are warranted according to the terms and conditions of the agreements under which they are provided.
IBM products are manufactured from new parts or new and used parts. In some cases, a product may not be new and may have been previously installed. Regardless, our
warranty terms apply.
Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without notice.
Performance data contained herein was generally obtained in a controlled, isolated environments. Customer examples are presented as illustrations of how those customers
have used IBM products and the results they may have achieved. Actual performance, cost, savings or other results in other operating environments may vary.
References in this document to IBM products, programs, or services does not imply that IBM intends to make such products, programs or services available in all countries in
which IBM operates or does business.
Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not necessarily reflect the views of IBM. All materials
and discussions are provided for informational purposes only, and are neither intended to, nor shall constitute legal or other guidance or advice to any individual participant or
their specific situation.
It is the customer’s responsibility to insure its own compliance with legal requirements and to obtain advice of competent legal counsel as to the identification and
interpretation of any relevant laws and regulatory requirements that may affect the customer’s business and any actions the customer may need to take to comply with such
laws. IBM does not provide legal advice or represent or warrant that its services or products will ensure that the customer is in compliance with any law
Notices and Disclaimers Con’t.
Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not
tested those products in connection with this publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products.
Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. IBM does not warrant the quality of any third-party products, or the
ability of any such third-party products to interoperate with IBM’s products. IBM EXPRESSLY DISCLAIMS ALL WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING BUT
NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents, copyrights, trademarks or other intellectual
property right.
IBM, the IBM logo, ibm.com, Aspera®, Bluemix, Blueworks Live, CICS, Clearcase, Cognos®, DOORS®, Emptoris®, Enterprise Document Management System™, FASP®,
FileNet®, Global Business Services ®, Global Technology Services ®, IBM ExperienceOne™, IBM SmartCloud®, IBM Social Business®, Information on Demand, ILOG,
Maximo®, MQIntegrator®, MQSeries®, Netcool®, OMEGAMON, OpenPower, PureAnalytics™, PureApplication®, pureCluster™, PureCoverage®, PureData®,
PureExperience®, PureFlex®, pureQuery®, pureScale®, PureSystems®, QRadar®, Rational®, Rhapsody®, Smarter Commerce®, SoDA, SPSS, Sterling Commerce®,
StoredIQ, Tealeaf®, Tivoli®, Trusteer®, Unica®, urban{code}®, Watson, WebSphere®, Worklight®, X-Force® and System z® Z/OS, are trademarks of International Business
Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM
trademarks is available on the Web at "Copyright and trademark information" at: www.ibm.com/legal/copytrade.shtml.

More Related Content

What's hot (20)

PDF
Maginatics Cloud Storage Platform - MCSP 3.0 Technical Highlights
Maginatics
 
PPTX
Directory Write Leases in MagFS
Maginatics
 
PPTX
Big Data on Cloud Native Platform
Sunil Govindan
 
PPTX
2021 March Pravega Community Meeting
Derek Moore
 
PDF
Ceph c01
Lâm Đào
 
PPTX
Software-Defined Storage (SDS)
HTS Hosting
 
PDF
Genomics Deployments - How to Get Right with Software Defined Storage
Sandeep Patil
 
PPTX
Hdp security overview
Hortonworks
 
PPTX
Paul Dix [InfluxData] | InfluxDays Opening Keynote | InfluxDays Virtual Exper...
InfluxData
 
PDF
Red Hat Storage Day Boston - OpenStack + Ceph Storage
Red_Hat_Storage
 
PDF
Future of cloud storage
GlusterFS
 
PPTX
Storage Requirements and Options for Running Spark on Kubernetes
DataWorks Summit
 
PDF
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...
VMworld
 
PDF
Introduction to GlusterFS Webinar - September 2011
GlusterFS
 
PPTX
Nordic infrastructure Conference 2017 - SQL Server on Linux Overview
Travis Wright
 
PPTX
Hadoop ClusterClient Security Using Kerberos
Sarvesh Meena
 
PPTX
Security implementation on hadoop
Wei-Chiu Chuang
 
PDF
2014 sept 4_hadoop_security
Adam Muise
 
PDF
Novinky v Oracle Database 18c
MarketingArrowECS_CZ
 
PDF
Oracle Cloud Infrastructure – Storage
MarketingArrowECS_CZ
 
Maginatics Cloud Storage Platform - MCSP 3.0 Technical Highlights
Maginatics
 
Directory Write Leases in MagFS
Maginatics
 
Big Data on Cloud Native Platform
Sunil Govindan
 
2021 March Pravega Community Meeting
Derek Moore
 
Ceph c01
Lâm Đào
 
Software-Defined Storage (SDS)
HTS Hosting
 
Genomics Deployments - How to Get Right with Software Defined Storage
Sandeep Patil
 
Hdp security overview
Hortonworks
 
Paul Dix [InfluxData] | InfluxDays Opening Keynote | InfluxDays Virtual Exper...
InfluxData
 
Red Hat Storage Day Boston - OpenStack + Ceph Storage
Red_Hat_Storage
 
Future of cloud storage
GlusterFS
 
Storage Requirements and Options for Running Spark on Kubernetes
DataWorks Summit
 
VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cl...
VMworld
 
Introduction to GlusterFS Webinar - September 2011
GlusterFS
 
Nordic infrastructure Conference 2017 - SQL Server on Linux Overview
Travis Wright
 
Hadoop ClusterClient Security Using Kerberos
Sarvesh Meena
 
Security implementation on hadoop
Wei-Chiu Chuang
 
2014 sept 4_hadoop_security
Adam Muise
 
Novinky v Oracle Database 18c
MarketingArrowECS_CZ
 
Oracle Cloud Infrastructure – Storage
MarketingArrowECS_CZ
 

Similar to In Place Analytics For File and Object Data (20)

PPTX
Spectrum scale-external-unified-file object
Sandeep Patil
 
PPTX
Software Defined Analytics with File and Object Access Plus Geographically Di...
Trishali Nayar
 
PPTX
Introducing IBM Spectrum Scale 4.2 and Elastic Storage Server 3.5
Doug O'Flaherty
 
PPTX
Object Storage Overview
Cloudian
 
PPTX
BIOIT14: Deploying very low cost cloud storage technology in a traditional re...
Dirk Petersen
 
PPTX
Elastic storage in the cloud session 5224 final v2
BradDesAulniers2
 
PPTX
SoftLayer Storage Services Overview (for Interop Las Vegas 2015)
Michael Fork
 
PDF
IBM Cloud Object Storage System (powered by Cleversafe) and its Applications
Tony Pearson
 
PPTX
OpenStack Swift In the Enterprise
Hostway|HOSTING
 
PDF
Design - Building a Foundation for Hybrid Cloud Storage
LaurenWendler
 
PPTX
Case Study: Implementing Hadoop and Elastic Map Reduce on Scale-out Object S...
Cloudian
 
PPTX
Webinar: What Your Object Storage Vendor Isn’t Telling You About NFS Support
Storage Switzerland
 
PDF
#MFSummit2016 Operate: The race for space
Micro Focus
 
PPTX
Se training storage grid webscale technical overview
solarisyougood
 
PPTX
SoftLayer Storage Services Overview
Michael Fork
 
PDF
Big Data Architecture Workshop - Vahid Amiri
datastack
 
PDF
S016825 ibm-cos-nola-v1710d
Tony Pearson
 
PDF
HDFCloud Workshop: HDF5 in the Cloud
The HDF-EOS Tools and Information Center
 
PPTX
Cloud computing UNIT 2.1 presentation in
RahulBhole12
 
PPTX
NetApp Se training storage grid webscale technical overview
solarisyougood
 
Spectrum scale-external-unified-file object
Sandeep Patil
 
Software Defined Analytics with File and Object Access Plus Geographically Di...
Trishali Nayar
 
Introducing IBM Spectrum Scale 4.2 and Elastic Storage Server 3.5
Doug O'Flaherty
 
Object Storage Overview
Cloudian
 
BIOIT14: Deploying very low cost cloud storage technology in a traditional re...
Dirk Petersen
 
Elastic storage in the cloud session 5224 final v2
BradDesAulniers2
 
SoftLayer Storage Services Overview (for Interop Las Vegas 2015)
Michael Fork
 
IBM Cloud Object Storage System (powered by Cleversafe) and its Applications
Tony Pearson
 
OpenStack Swift In the Enterprise
Hostway|HOSTING
 
Design - Building a Foundation for Hybrid Cloud Storage
LaurenWendler
 
Case Study: Implementing Hadoop and Elastic Map Reduce on Scale-out Object S...
Cloudian
 
Webinar: What Your Object Storage Vendor Isn’t Telling You About NFS Support
Storage Switzerland
 
#MFSummit2016 Operate: The race for space
Micro Focus
 
Se training storage grid webscale technical overview
solarisyougood
 
SoftLayer Storage Services Overview
Michael Fork
 
Big Data Architecture Workshop - Vahid Amiri
datastack
 
S016825 ibm-cos-nola-v1710d
Tony Pearson
 
HDFCloud Workshop: HDF5 in the Cloud
The HDF-EOS Tools and Information Center
 
Cloud computing UNIT 2.1 presentation in
RahulBhole12
 
NetApp Se training storage grid webscale technical overview
solarisyougood
 
Ad

More from Sandeep Patil (7)

PPTX
Proactive Threat Detection and Safeguarding of Data for Enhanced Cyber resili...
Sandeep Patil
 
PPTX
IBM Spectrum Scale Secure- Secure Data in Motion and Rest
Sandeep Patil
 
PDF
Spectrum Scale Best Practices by Olaf Weiser
Sandeep Patil
 
PDF
IBM Spectrum Scale Networking Flow
Sandeep Patil
 
PPTX
IBM Spectrum Scale Authentication for Protocols
Sandeep Patil
 
PPTX
IBM Spectrum Scale Security
Sandeep Patil
 
PPTX
IBM Spectrum Scale and Its Use for Content Management
Sandeep Patil
 
Proactive Threat Detection and Safeguarding of Data for Enhanced Cyber resili...
Sandeep Patil
 
IBM Spectrum Scale Secure- Secure Data in Motion and Rest
Sandeep Patil
 
Spectrum Scale Best Practices by Olaf Weiser
Sandeep Patil
 
IBM Spectrum Scale Networking Flow
Sandeep Patil
 
IBM Spectrum Scale Authentication for Protocols
Sandeep Patil
 
IBM Spectrum Scale Security
Sandeep Patil
 
IBM Spectrum Scale and Its Use for Content Management
Sandeep Patil
 
Ad

In Place Analytics For File and Object Data

  • 18. Spectrum Scale Object Storage Introduction
  • 19. IBM Spectrum Scale: Data management at scale
      • Avoid vendor lock-in with true Software Defined Storage and open standards
      • Seamless performance and capacity scaling
      • Automate data management at scale
      • Enable global collaboration
      OpenStack and Spectrum Scale help clients manage data at scale. Client needs:
      • Business: I need virtually unlimited storage
      • Operations: I need a flexible infrastructure that supports both object and file based storage
      • Operations: I need to minimize the time it takes to perform common storage management tasks
      • Collaboration: I need to share data between people, departments and sites with low latency
      What the platform provides:
      • A single data plane that supports Cinder, Glance, Swift, Manila as well as NFS, et al.
      • A fully automated policy based data placement and migration tool
      • An open and scalable cloud platform
      • Sharing with a variety of WAN caching modes
      Results:
      • Converge file and object based storage under one roof
      • Employ enterprise features to protect data, e.g. Snapshots, Backup, and Disaster Recovery
      • Support native file, block and object sharing of data
      Diagram labels: Spectrum Scale; protocols NFS, SMB, POSIX, Swift, HDFS, Cinder, Glance, Manila; Cognitive Services; storage tiers SSD, Fast Disk, Slow Disk, Tape
  • 20. Spectrum Scale Object Storage
      • Basic support added in the 4.1.1 release and enhanced in the 4.2 and 4.2.1 releases
      • Based on OpenStack Swift (Juno release)
      • REST-based data access
          – Growing number of clients due to the extremely simple protocol
          – Applications can easily save and access data from anywhere using HTTP
          – Simple set of atomic operations: PUT (upload), POST (update metadata), GET (download), DELETE
      • Amazon S3 protocol support
      • High availability with CES integration
      • Simple and automated installation process
      • Integrated authentication (Keystone) support
      • Native GPFS command line interface to manage the Object service (mmobj command)
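The four atomic operations on slide 20 map directly onto HTTP verbs. As a minimal sketch, assuming a placeholder endpoint (`https://swift.example.com/v1/AUTH_demo`), a placeholder token, and a hypothetical helper named `swift_request` that is not part of any library, the requests could be prepared as follows. Nothing is sent over the network here.

```python
from urllib.request import Request

# Assumptions for illustration only: endpoint, account and token are placeholders.
STORAGE_URL = "https://swift.example.com/v1/AUTH_demo"
TOKEN = "replace-with-a-real-token"


def swift_request(method, container, obj, data=None, metadata=None):
    """Prepare (but do not send) a Swift-style object request."""
    headers = {"X-Auth-Token": TOKEN}
    # User-defined object metadata travels in X-Object-Meta-* headers.
    for key, value in (metadata or {}).items():
        headers["X-Object-Meta-" + key] = value
    url = f"{STORAGE_URL}/{container}/{obj}"
    return Request(url, data=data, headers=headers, method=method)


upload = swift_request("PUT", "cont", "a.jpg", data=b"...image bytes...")
update = swift_request("POST", "cont", "a.jpg", metadata={"Camera": "demo"})
download = swift_request("GET", "cont", "a.jpg")
delete = swift_request("DELETE", "cont", "a.jpg")

for req in (upload, update, download, delete):
    print(req.get_method(), req.full_url)
```

Passing any of these prepared requests to `urllib.request.urlopen` would perform the call against a real endpoint; that step is left out because it needs a running object service and valid credentials.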
  • 21. Spectrum Scale Object Store: Additional Features
      • Unified file and object support with Hadoop connectors
      • Support for encryption
      • Support for compression
      • Only object store with tape support for backup
      • Object store with integrated transparent cloud tiering support
      • Multi-region support
      • AD/LDAP support for authentication
      • ILM support for Object
      • Movement of objects across storage tiers based on access heat
      • Spectrum Scale Object with IBM DeepFlash becomes an object store over an all-flash array for newer, faster workloads
      • Spectrum Scale Object with WAN caching support (AFM)
  • 22. Spectrum Scale Object Vs Cleversafe
  • 23. The right solution for the workload
      Spectrum Scale
      • Ideal workloads
          – Big Data Analytics
          – High Performance Computing, e.g. engineering applications
          – Performance optimized backup and restore
          – Multi-site file collaboration
          – Multi-tier file sync and share
          – Cold data archive with lowest cost data storage tier
      • Differentiation
          – Designed for high performance
          – Unified storage infrastructure: native File, Object and Hadoop
          – Robust tiering with policy based data placement and data movement
          – Multi-site collaboration with advanced routing and caching
          – Enterprise features, e.g. Encryption, Compression, QoS, and Disaster Recovery
      IBM Cloud Object Store (Cleversafe)
      • Ideal workloads
          – Active archive (warm data, mostly static)
          – Cost optimized cloud backup target
          – Web app content
          – Remote office storage consolidation
          – Storage as a service
      • Differentiation
          – Designed for easy deployment and management at scale
          – Always-on architecture
          – Geo-dispersed erasure coding for site fault tolerance and DR
          – Simple keyless native encryption and multi-tenant security
          – Reduced cost and complexity
  • 24. IBM Spectrum Scale: Unified File and Object Access Feature Overview
  • 25. Unified File & Object (UFO) Support
      • Challenge
          – The world is not converged/file/object/HDFS today!
          – and never will be completely…
      • Unified scale-out content repository
          – File or object in. Object or file out.
          – Integrated big data analytics support
          – Native protocol support
          – High performance that scales
          – Single management plane
      Diagram labels: Spectrum Scale: Redefining Unified Storage; protocols NFS, SMB, POSIX, Swift/S3, HDFS; storage tiers SSD, Fast Disk, Slow Disk, Tape
  • 26. Spectrum Scale Unified File & Object
      • Access the same content both as a file and as an object without making a copy or needing file or object gateways
      • File-In-Object-Out and Object-In-File-Out support
      • Support for file access protocols (NFS/SMB/POSIX) and object access protocols (Swift/S3)
      • Objects ingested into a designated unified container are available as files, and files ingested into it are available as objects
      • Support for file and object ACLs with unified mode ID mapping
  • 27. Unified File and Object Access: What is it?
  • 28. What is Unified File and Object Access?
      • Accessing objects using file interfaces (SMB/NFS/POSIX) and accessing files using object interfaces (REST) helps legacy applications designed for file to seamlessly start integrating into the object world.
      • It allows object data to be accessed using applications designed to process files. It allows file data to be published as objects.
      • Multi-protocol access for file and object in the same namespace (with common user ID management capability) allows supporting and hosting data oceans of different types of data with multiple access options.
      • Optimizes various use cases and solution architectures, resulting in better efficiency as well as cost savings.
      Diagram: a clustered file system with Swift (with Swift on File) serving Object (http) on one side and NFS/SMB/POSIX on the other, sharing a <Container>. File exports are created at container level, or POSIX access is given from container level.
          1. Data ingested as objects
          2. Objects accessed as files
          3. Data ingested as files
          4. Files accessed as objects
  • 29. Flexible Identity Management Modes
      • Supports two identity management modes
      • Administrators can choose based on their need and use case using the CLI:
        # mmobj config change --ccrfile object-server-sof.conf --section DEFAULT --property id_mgmt --value unified_mode | local_mode
      Local_Mode
          – An object created by the object interface will be owned by the internal "swift" user
          – An application processing the object data from the file interface will need the required file ACL to access the data
          – Object authentication setup is independent of file authentication setup
          – Suitable when auth schemes for file and object are different and unified access is for applications
      Unified_Mode
          – An object created from the object interface should be owned by the user doing the object PUT (i.e. the file will be owned by the UID/GID of the user)
          – Users from object and file are expected to use common auth and come from the same directory service (only AD+RFC 2307 or LDAP)
          – The owner of the object will own and have access to the data from the file interface
          – Suitable for unified file and object access for end users. Leverage common ILM policies for file and object data based on data ownership
  • 30. Use Cases Enabled by Unified File Object
  • 31. Use case 1: Enabling "In-Place" analytics for an Object data repository, with analytic results available as objects
      • Analytics on a traditional object store: data has to be copied from the object store to a dedicated cluster, analyzed there, and the result copied back to the object store for publishing (explicit data movement). Source: https://aws.amazon.com/elasticmapreduce/
      • Analytics with Unified File and Object Access: object data is available as files on the same fileset. Analytics systems like Hadoop MapReduce or Spark can use the data directly for analytics. No data movement, i.e. in-place, immediate data analytics. Results are returned in place and published as objects with no data movement.
      Diagram labels: Clustered file system; <SOF_Fileset>/<Device>; Object (http); Data ingested as Objects; Spark or Hadoop MapReduce; In-Place Analytics
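The in-place idea on slide 31 can be sketched with plain Python standing in for a Spark or Hadoop MapReduce job. This is only an illustration under stated assumptions: the container directory is a temporary directory playing the role of a unified-access container path, and `count_words_in_place` is a hypothetical helper. The point is that the job reads the ingested object as a file where it already sits and writes its result next to it, with no copy to a separate cluster.

```python
import os
import tempfile
from collections import Counter


def count_words_in_place(container_dir, object_name, result_name):
    """Read an ingested object as a file and write the result beside it."""
    source = os.path.join(container_dir, object_name)
    with open(source, encoding="utf-8") as fh:
        counts = Counter(fh.read().split())
    # The result lands in the same container directory, so no data is moved.
    result = os.path.join(container_dir, result_name)
    with open(result, "w", encoding="utf-8") as fh:
        for word, n in sorted(counts.items()):
            fh.write(f"{word}\t{n}\n")
    return counts


# Stand-in for a container path; a real path would be on the unified fileset.
with tempfile.TemporaryDirectory() as container:
    with open(os.path.join(container, "input.txt"), "w", encoding="utf-8") as fh:
        fh.write("object file object\n")
    counts = count_words_in_place(container, "input.txt", "wordcount.tsv")
    result_exists = os.path.exists(os.path.join(container, "wordcount.tsv"))
```

In the deck's setup, a result file written this way would become reachable through the object interface once it has been objectized.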
  • 32. Use case 2: Process Object Data with File-Oriented Applications and Publish Outcomes as Objects
      Scenario: a media house on an OpenStack cloud platform (tenant = media house subsidiaries), using Swift on File.
      • Ingest media objects into Container 1 (Subsidiary 1) and Container 2 (Subsidiary 2)
      • Raw media content is sent for media processing, which happens over files (object-to-file access)
          – NFS export on Container 1: Manila shares (NFS) exported only for Subsidiary 1; VM farm (virtual machine instances) for Subsidiary 1 for video processing
          – NFS export on Container 2: Manila shares (NFS) exported only for Subsidiary 2; VM farm (virtual machine instances) for Subsidiary 2 for video processing
      • Files are converted into objects for publishing (file-to-object access); the diagram shows Container 1′ with an NFS export on Container 1′
      • Final processed videos are available as objects in the container that is used for external publishing
      • Publishing channels: final video (as objects) available for streaming
  • 33. Use case 3: Users read/write data via File and Object with Common User Authentication and Identity
      • Users John and Riya each access data on the clustered file system over NFS, SMB and Object
      • Access common data using the same user credentials across all protocols
      • Corporate user directory (Active Directory/LDAP), e.g. User: Riya, UID: 1001, GID: 2000, Domain: XYZ
      • Riya's data read/written from Object should be owned by Riya when accessed from File (SMB/NFS/POSIX)
  • 34. Deep Dive on In-Place Analytics Use Case
  • 35. Analytics use case
      • What is in-place analytics?
      • What is unified storage?
      • Setup details
      • Prerequisites for demo
      • Demo content
  • 36. What is In-place Analytics?
  • 37. Setup Details: IBM BigInsights with Spectrum Scale demo setup
      • Spectrum Scale cluster
          – viknode1: roles Admin, Quorum, NSD
          – viknode2: roles Quorum, NSD, CES node
          – viknode3: roles Quorum, CES node
      • Disks: /dev/dm-2, /dev/dm-3
      • IBM BigInsights: Ambari Server, Yarn, Spark, Hive, Oozie, Slider, Knox
  • 38. Prerequisites For Demo
      • Set up a Spectrum Scale cluster with NFS, SMB and Object protocols enabled
      • Set up the same authentication for File and Object
      • Enable unified access mode
      • Enable file access capabilities
      • Create a Swift storage policy with file access enabled
      • Install BigInsights
      • Start the Ambari server
      • Configure the Cyberduck client to access the object store
  • 39. Demo Content
      • Upload a file using Cyberduck
      • Run analytics on the file
      • Download the result using Cyberduck
  • 41. Spectrum Scale User Group
      • The Spectrum Scale User Group is free to join and open to all using, interested in using or integrating Spectrum Scale.
      • Join the User Group activities to meet your peers and get access to experts from partners and IBM.
      • Next meetings:
          – APAC: October 14, Melbourne
          – Global at SC16: November 13, 1pm to 5pm, Salt Lake City
      • Web page: http://www.spectrumscale.org/
      • Presentations: http://www.spectrumscale.org/presentations/
      • Mailing list: http://www.spectrumscale.org/join/
      • Contact: http://www.spectrumscale.org/committee/
      • Meet Bob Oesterlin (US Co-Principal) at Edge2016: [email protected]
  • 42. Session: Futures of IBM Spectrum Scale (NDA & Customers ONLY)
      • Who: IBM Spectrum Scale Offering Management (Carl Zetie, Ron Riffe)
      • When: Tuesday, September 20, 2016, 1pm to 2pm
      • Where: MGM Grand, Signature Tower 3, Meeting Room D
      • Contact (if any questions): [email protected], [email protected]
  • 43. Session: How to apply Flash benefits to big data analytics and unstructured data (NDA & Customers ONLY)
      • Who: IBM Elastic Storage Server Offering Management (Alex Chen)
      • When: Thursday, September 22, 2016, 1:15pm to 2:15pm
      • Where: Grand Garden Arena, Lower Level, MGM, Studio 10
      • Contact (if any questions): [email protected], [email protected]
  • 44. Trial VM
      • Download the IBM Spectrum Scale Trial VM from: http://www-03.ibm.com/systems/storage/spectrum/scale/trial.html
  • 45. References
      • Write a File, read as an Object: OpenStack Summit, Austin, TX, Apr 2016. https://www.youtube.com/watch?v=6ovLb6aktbM&feature=youtu.be&t=2
      • Amalgamating Manila and Swift for Unified Data Sharing: OpenStack Summit, Austin, TX, Apr 2016. https://www.youtube.com/watch?v=3MMrMUaA_Mg
      • Hadoop HDFS Vs Spectrum Scale: https://www.youtube.com/watch?v=kOeEbdO8F4A
      • From Archive to Insight: Debunking Myths of Analytics on Object Stores – Dean Hildebrand, Bill Owen, Simon Lorenz, Luis Pabon, Rui Zhang. Vancouver Summit, Spring 2015. https://www.youtube.com/watch?v=brhEUptD3JQ
      • Deploying Swift on a File System – Bill Owen, Thiago Da Silva. BrownBag at OpenStack Paris, Fall 2014. https://www.youtube.com/watch?v=vPn2uZF4yWo
      • Breaking the Mold with OpenStack Swift and GlusterFS – Jon Dickinson, Luis Pabon. Atlanta Summit, Spring 2014. https://www.youtube.com/watch?v=pSWdzjA8WuA
      • SNIA SDC 2015: http://www.snia.org/sites/default/files/SDC15_presentations/security/DeanHildebrand_Sasi__OpenStack%20SwiftOnFile.pdf
      • Spectrum Scale Infocenter: http://www.ibm.com/support/knowledgecenter/#!/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.adm.doc/bl1adm_manageunifiedaccess.htm
  • 46. OpenStack Summit 2016: IBM Spectrum Scale in an OpenStack Environment Redpaper published. http://www.redbooks.ibm.com/abstracts/redp5331.html
  • 47. © 2016 IBM Corporation. Thank You
  • 48. IBM Spectrum Scale - Unified File and Object Access Feature Overview
      • Multi-protocol access for file and object in the same namespace
      • Access object as file from POSIX, NFS and SMB
      • Access file as object
          – Provision to convert files to objects automatically via a background service called 'objectizer'
          – Provision to explicitly and immediately convert files to objects using the CLI
      • The feature is specifically made available as an "object storage policy"
          – Allows it to coexist with traditional object and other policies
          – Create multiple unified file and object access policies
          – Since policies are applicable per container, the end user has the flexibility to create certain containers with a Unified File and Object Access policy and certain containers without it
      • Flexible identity management mode support
          – Local Mode: suitable when auth schemes for file and object are different and unified access is for applications. An object created by the object interface will be owned by the internal "swift" user.
          – Unified Mode: suitable for unified file and object access by end users. Leverage common ILM policies for file and object data based on data ownership. An object created from the object interface should be owned by the user doing the object PUT (i.e. the file will be owned by the UID/GID of the user).
      • Ability to run in-place analytics of object data using Spectrum Scale Hadoop connectors via the POSIX interface
  • 49. Filesystem Layout (Traditional Vs Unified File and Object Access)
      • One of the key advantages of unified file and object access is the placement and naming of objects when stored on the file system. Unified file and object access stores objects following the same path hierarchy as the object's URL.
      • In contrast, the default object implementation stores the object following the mapping given by the ring, and its final file path cannot easily be determined by the user.
      Ingest object URL: https://swift.example.com/v1/acct/cont/a.jpg
      • Traditional Swift (object ingest of a.jpg): ibm/gpfs0/object_fileset/o/z1device108/objects/7551/125/75fc66179f12dc513580a239e92c3125/a.jpg
      • Unified File and Object Access (object ingest of a.jpg): ibm/gpfs0/<Sof_policy_fileset>/<device>/AUTH_acctID/cont/a.jpg
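The unified layout on slide 49 amounts to reusing the account, container and object components of the URL as directory names. A minimal sketch of that mapping follows; the function name `unified_path`, the fileset root `/ibm/gpfs0/sof_fileset` and the device name `dev0` are hypothetical stand-ins for the `<Sof_policy_fileset>` and `<device>` placeholders, and the account directory is assumed to be the account component of the URL taken as-is.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def unified_path(object_url, fileset_root, device):
    """Map /v1/<account>/<container>/<object> onto a file path."""
    # maxsplit=3 keeps any slashes inside the object name intact.
    parts = urlsplit(object_url).path.lstrip("/").split("/", 3)
    if len(parts) != 4 or not all(parts):
        raise ValueError("expected /<version>/<account>/<container>/<object>")
    _version, account, container, obj = parts
    return posixpath.join(fileset_root, device, account, container, unquote(obj))


path = unified_path(
    "https://swift.example.com/v1/AUTH_acctID/cont/a.jpg",
    fileset_root="/ibm/gpfs0/sof_fileset",  # hypothetical fileset path
    device="dev0",                          # hypothetical device name
)
print(path)
```

No comparable function is shown for the traditional layout because, as the slide notes, that path depends on the ring mapping and cannot be derived from the URL alone.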
  • 50. Easy Access Of Objects as Files via supported File Interfaces (NFS/SMB/POSIX)
      • Objects ingested are available immediately for file access via the 3 supported file protocols.
      • ID management modes (explained later) give flexibility in assigning or retaining owners, which file protocols generally require.
      • Object authorization semantics are used during object access and file authorization semantics are used during file access of the same data, thus ensuring compatibility of object and file applications.
      Diagram: <Spectrum Scale Filesystem> / <SOF_Fileset>/<Device> / <AUTH_account_ID> / <Container>, reached via Object (http) and NFS/SMB/POSIX. File exports are created at container level, or POSIX access is given from container level.
          1. Data ingested as objects
          2. Objects accessed as files
  • 51. Objectization: Making Files as Objects (Accessing File via Object interface)
      • Spectrum Scale 4.2 includes a system service called ibmobjectizer that is responsible for objectization.
      • Objectization is a process that makes files ingested from the file interface on a unified file and object access enabled container path available from the object interface.
      • When new files are added from the file interface, they need to be visible to the Swift database so that it shows the correct container listing and container or account statistics.
      Diagram: Spectrum Scale filesystem with a Unified File and Object fileset, reached via Object (http) and NFS/SMB/POSIX; data ingested as files, objectization by ibmobjectizer, files accessed as objects.
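The bookkeeping problem described on slide 51 can be illustrated with a small sketch. This is not how ibmobjectizer is implemented (the deck does not describe its internals); it only shows, under the assumption of a hypothetical helper `find_unlisted_files` and a temporary directory standing in for a container path, how files added through the file interface differ from what a container listing already knows about.

```python
import os
import tempfile


def find_unlisted_files(container_dir, listed_objects):
    """Return files under container_dir that the container listing lacks."""
    found = set()
    for root, _dirs, files in os.walk(container_dir):
        for name in files:
            rel = os.path.relpath(os.path.join(root, name), container_dir)
            # Object names use forward slashes regardless of platform.
            found.add(rel.replace(os.sep, "/"))
    return sorted(found - set(listed_objects))


with tempfile.TemporaryDirectory() as container:
    # Two files arrive through the file interface.
    for name in ("a.jpg", "b.jpg"):
        with open(os.path.join(container, name), "wb") as fh:
            fh.write(b"demo")
    # Only a.jpg is already known to the (simulated) container listing.
    pending = find_unlisted_files(container, listed_objects={"a.jpg"})
    print(pending)
```

Anything returned by such a scan is a file that still has to be registered before it appears in container listings and statistics.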
  • 52. Unified File and Object Access: Policy Integration for Flexibility
      • This feature is specifically made available as an "object storage policy" because it gives the following advantages:
          – Flexibility for the administrator to manage unified file and object access separately
          – Allows it to coexist with traditional object and other policies
          – Create multiple unified file and object access policies, which can vary based on underlying storage
          – Since policies are applicable per container, the end user has the flexibility to create certain containers with a Unified File and Object Access policy and certain containers without it
      • Example: mmobj policy create SwiftOnFileFS --enable-file-access
  • 53. #ibmedge Notices and Disclaimers 52 Copyright © 2016 by International Business Machines Corporation (IBM). No part of this document may be reproduced or transmitted in any form without written permission from IBM. U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM. Information in these presentations (including information relating to products that have not yet been announced by IBM) has been reviewed for accuracy as of the date of initial publication and could include unintentional technical or typographical errors. IBM shall have no responsibility to update this information. THIS DOCUMENT IS DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IN NO EVENT SHALL IBM BE LIABLE FOR ANY DAMAGE ARISING FROM THE USE OF THIS INFORMATION, INCLUDING BUT NOT LIMITED TO, LOSS OF DATA, BUSINESS INTERRUPTION, LOSS OF PROFIT OR LOSS OF OPPORTUNITY. IBM products and services are warranted according to the terms and conditions of the agreements under which they are provided. IBM products are manufactured from new parts or new and used parts. In some cases, a product may not be new and may have been previously installed. Regardless, our warranty terms apply.” Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without notice. Performance data contained herein was generally obtained in a controlled, isolated environments. Customer examples are presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual performance, cost, savings or other results in other operating environments may vary. References in this document to IBM products, programs, or services does not imply that IBM intends to make such products, programs or services available in all countries in which IBM operates or does business. 
Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not necessarily reflect the views of IBM. All materials and discussions are provided for informational purposes only, and are neither intended to, nor shall constitute legal or other guidance or advice to any individual participant or their specific situation. It is the customer’s responsibility to insure its own compliance with legal requirements and to obtain advice of competent legal counsel as to the identification and interpretation of any relevant laws and regulatory requirements that may affect the customer’s business and any actions the customer may need to take to comply with such laws. IBM does not provide legal advice or represent or warrant that its services or products will ensure that the customer is in compliance with any law
  • 54. Notices and Disclaimers Con’t. Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products in connection with this publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. IBM does not warrant the quality of any third-party products, or the ability of any such third-party products to interoperate with IBM’s products. IBM EXPRESSLY DISCLAIMS ALL WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents, copyrights, trademarks or other intellectual property right. IBM, the IBM logo, ibm.com, Aspera®, Bluemix, Blueworks Live, CICS, Clearcase, Cognos®, DOORS®, Emptoris®, Enterprise Document Management System™, FASP®, FileNet®, Global Business Services ®, Global Technology Services ®, IBM ExperienceOne™, IBM SmartCloud®, IBM Social Business®, Information on Demand, ILOG, Maximo®, MQIntegrator®, MQSeries®, Netcool®, OMEGAMON, OpenPower, PureAnalytics™, PureApplication®, pureCluster™, PureCoverage®, PureData®, PureExperience®, PureFlex®, pureQuery®, pureScale®, PureSystems®, QRadar®, Rational®, Rhapsody®, Smarter Commerce®, SoDA, SPSS, Sterling Commerce®, StoredIQ, Tealeaf®, Tivoli®, Trusteer®, Unica®, urban{code}®, Watson, WebSphere®, Worklight®, X-Force® and System z® Z/OS, are trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. 
A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at: www.ibm.com/legal/copytrade.shtml.