Why and What is Graph Mining?
• Graphs allow us to model complicated structures
• Chemical compounds ( Cheminformatics )
• Protein structures, biological pathways/networks ( Bioinformatics )
• Program control flow and traffic flow
• XML documents, the Web and social network analysis
• Graph search algorithms have been developed in chemical
informatics, computer vision, video indexing and text retrieval.
• Graph Mining has become an active and important theme in data
mining.
Graph Patterns
• Frequent subgraphs
• A subgraph is frequent if its support (occurrence frequency) in
a given dataset is no less than a minimum support threshold
• What is support?
• Intuitively, the number of graphs (transactions) in the dataset
that contain at least one occurrence of the subgraph
• Useful for
• Characterizing graph sets
• Discriminating different groups of graphs
• Similarity search in graph databases.
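The support definition above can be sketched in a few lines. This is a minimal brute-force illustration, not an efficient miner: the graph encoding (a label dictionary plus a set of undirected edges), the helper names, and the toy molecules are all assumptions made for the example.

```python
from itertools import permutations


def make_graph(labels, edge_list):
    """Build a graph as (labels, edges); edges are unordered node pairs."""
    return labels, {frozenset(e) for e in edge_list}


def contains(graph, pattern):
    """True if `pattern` occurs at least once in `graph` as a subgraph.

    Brute force: try every injective mapping of pattern nodes onto graph
    nodes and check that labels and edges are preserved.
    """
    g_labels, g_edges = graph
    p_labels, p_edges = pattern
    p_nodes = list(p_labels)
    for image in permutations(g_labels, len(p_nodes)):
        mapping = dict(zip(p_nodes, image))
        if any(p_labels[n] != g_labels[mapping[n]] for n in p_nodes):
            continue
        if all(frozenset(mapping[n] for n in e) in g_edges for e in p_edges):
            return True
    return False


def support(dataset, pattern):
    """Number of graphs containing the pattern (each graph counts once)."""
    return sum(contains(g, pattern) for g in dataset)


def is_frequent(dataset, pattern, min_sup):
    return support(dataset, pattern) >= min_sup


# Hypothetical toy dataset of three small labeled graphs.
dataset = [
    make_graph({1: "C", 2: "C", 3: "O"}, [(1, 2), (2, 3)]),  # C-C-O
    make_graph({1: "C", 2: "O", 3: "N"}, [(1, 2), (1, 3)]),  # O-C-N
    make_graph({1: "C", 2: "C"}, [(1, 2)]),                  # C-C
]
c_o = make_graph({"a": "C", "b": "O"}, [("a", "b")])
n_o = make_graph({"a": "N", "b": "O"}, [("a", "b")])

print(support(dataset, c_o))         # 2
print(support(dataset, n_o))         # 0
print(is_frequent(dataset, c_o, 2))  # True
```

A graph that contains the pattern several times still contributes only one to the support, which matches the transaction-based definition.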
Methods for Mining Frequent Subgraphs
• Two basic approaches: an Apriori-based approach and a pattern-growth approach
• Apriori-based algorithms for frequent substructure mining include AGM,
FSG, and a path-join method
• AGM shares similar characteristics with Apriori-based itemset mining.
• FSG and the path-join method explore edges and connections in an
Apriori-based fashion.
• The pattern-growth approach extends a frequent subgraph directly, one
edge at a time; PatternGrowthGraph is a simplistic algorithm of this kind.
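The level-wise, Apriori-style idea can be sketched under a strong simplifying assumption: every node label is unique within a graph, so a graph is just a set of labeled edges and subgraph containment reduces to a subset test. This is not AGM or FSG themselves (those need canonical labeling and isomorphism tests); the function names and data are hypothetical.

```python
from itertools import combinations


def edge(a, b):
    """Undirected edge between two uniquely labeled nodes."""
    return tuple(sorted((a, b)))


def is_connected(edges):
    """True if the edge set forms one connected component."""
    edges = list(edges)
    seen = set(edges[0])
    changed = True
    while changed:
        changed = False
        for a, b in edges:
            if (a in seen) != (b in seen):
                seen.update((a, b))
                changed = True
    return all(a in seen and b in seen for a, b in edges)


def apriori_subgraphs(dataset, min_sup):
    """Return {pattern: support} for all frequent connected edge sets."""
    def sup(pattern):
        return sum(pattern <= g for g in dataset)

    singles = {frozenset([e]) for g in dataset for e in g}
    level = {p for p in singles if sup(p) >= min_sup}
    frequent = {}
    while level:
        frequent.update({p: sup(p) for p in level})
        candidates = set()
        # Join two frequent k-edge patterns that share k-1 edges.
        for p, q in combinations(level, 2):
            cand = p | q
            if len(cand) != len(p) + 1 or not is_connected(cand):
                continue
            # Apriori pruning: every connected k-edge sub-pattern
            # of the candidate must itself be frequent.
            subs = (cand - {e} for e in cand)
            if all(s in level or not is_connected(s) for s in subs):
                candidates.add(cand)
        level = {c for c in candidates if sup(c) >= min_sup}
    return frequent


graphs = [
    {edge("A", "B"), edge("B", "C"), edge("C", "D")},
    {edge("A", "B"), edge("B", "C"), edge("A", "D")},
    {edge("A", "B"), edge("B", "C")},
]
result = apriori_subgraphs(graphs, min_sup=2)
for pattern, count in sorted(result.items(), key=lambda kv: len(kv[0])):
    print(sorted(pattern), count)
```

With a minimum support of 2, the edges C-D and A-D are discarded at the first level, so no candidate containing them is ever generated.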
Social Network
• A social network is a heterogeneous and multirelational data set
represented by a graph.
• Nodes correspond to objects, and edges correspond to links
representing relationships or interactions between objects.
• Both nodes and links have attributes.
• Social networks need not be social in context.
• There are many real-world instances of technological, business, economic,
and biological social networks.
• Examples:
• Electrical power grids
• Telephone call graphs
• The spread of computer viruses
• The World Wide Web
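One possible way to hold such a network in memory is sketched below: nodes of several kinds, links of several relation types, and attributes on both. The class names, fields, and sample data are illustrative assumptions, not a standard API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    kind: str                                  # e.g. "person", "phone"
    attrs: dict = field(default_factory=dict)


@dataclass
class Link:
    source: str
    target: str
    relation: str                              # e.g. "calls", "owns"
    attrs: dict = field(default_factory=dict)


class Network:
    """Heterogeneous (many node kinds), multirelational (many link types)."""

    def __init__(self):
        self.nodes = {}
        self.links = []

    def add_node(self, node_id, kind, **attrs):
        self.nodes[node_id] = Node(node_id, kind, attrs)

    def add_link(self, source, target, relation, **attrs):
        for end in (source, target):
            if end not in self.nodes:
                raise KeyError(f"unknown node: {end}")
        self.links.append(Link(source, target, relation, attrs))

    def neighbors(self, node_id, relation=None):
        """Targets of outgoing links, optionally filtered by relation."""
        return sorted({
            link.target for link in self.links
            if link.source == node_id
            and (relation is None or link.relation == relation)
        })


# Hypothetical telephone call graph.
net = Network()
net.add_node("alice", "person", city="Mumbai")
net.add_node("bob", "person", city="Pune")
net.add_node("p1", "phone", carrier="X")
net.add_link("alice", "p1", "owns")
net.add_link("alice", "bob", "calls", minutes=12)
net.add_link("bob", "alice", "calls", minutes=3)

print(net.neighbors("alice"))            # ['bob', 'p1']
print(net.neighbors("alice", "calls"))   # ['bob']
```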
Set-Valued Attribute
• A set-valued attribute may be of homogeneous or heterogeneous type.
• Set-valued data can be generalized by:
1. Generalization of each value in the set to its corresponding
higher-level concept.
2. Derivation of the general behavior of the set, such as the number of
elements in the set.
• Generalization can be performed by applying different generalization
operators to explore alternative generalization paths.
• In that case, the result of generalization is a heterogeneous set.
Set-Valued Attribute Example
• Suppose that the hobby of a person is a set-valued attribute containing the set of
values {tennis, hockey, soccer, violin, SimCity}.
• This set can be generalized to a set of high-level concepts, such as {sports,
music, computer games}, or into the number 5 (the number of hobbies in the set).
• A count can be associated with a generalized value to indicate how many
elements are generalized to that value, as in {sports(3), music(1), computer
games(1)}, where sports(3) indicates three kinds of sports, and so on.
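The two generalization routes for a set-valued attribute can be sketched as follows. The concept hierarchy is a hypothetical lookup table, and the hobby set is assumed to contain five values, including one computer game (SimCity), so that the counts add up to five.

```python
from collections import Counter

# Hypothetical concept hierarchy: value -> higher-level concept.
HOBBY_HIERARCHY = {
    "tennis": "sports",
    "hockey": "sports",
    "soccer": "sports",
    "violin": "music",
    "SimCity": "computer games",
}


def generalize_set(values, hierarchy):
    """Map each value to its concept and count how many values map there."""
    return Counter(hierarchy[v] for v in values)


hobbies = {"tennis", "hockey", "soccer", "violin", "SimCity"}

counts = generalize_set(hobbies, HOBBY_HIERARCHY)
concepts = set(counts)      # route 1: set of higher-level concepts
set_size = len(hobbies)     # route 2: general behavior of the set

print(sorted(concepts))     # ['computer games', 'music', 'sports']
print(counts["sports"])     # 3
print(set_size)             # 5
```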
List-Valued Attribute
• List-valued attributes can be generalized in a manner similar to that for
set-valued attributes except that the order of the elements in the list
should be preserved in the generalization.
• A list can be generalized according to its general behavior, such as the
length of the list, the type of list elements, the value range, the weighted
average value for numerical data, or by dropping unimportant elements in
the list.
List-Valued Attribute Example
• List of data for a person’s education record:
“((B.Sc. in Electrical Engineering, U.B.C., Dec., 1998),
(M.Sc. in Computer Engineering, U. Maryland, May, 2001),
(Ph.D. in Computer Science, UCLA, Aug., 2005))”.
• We can generalize by dropping less important descriptions (attributes) of
each tuple in the list, such as by dropping the month attribute to obtain
“((B.Sc., U.B.C., 1998), ...)”, and/or by retaining only the most important
tuple(s) in the list.
• Example: “(Ph.D. in Computer Science, UCLA, 2005)”.
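Both list generalizations from this example can be sketched in code. The `Degree` record type, the helper names, and the degree ranking used to decide which tuple is "most important" are assumptions for illustration; list order is preserved throughout.

```python
from typing import NamedTuple


class Degree(NamedTuple):
    degree: str
    major: str
    school: str
    month: str
    year: int


record = [
    Degree("B.Sc.", "Electrical Engineering", "U.B.C.", "Dec.", 1998),
    Degree("M.Sc.", "Computer Engineering", "U. Maryland", "May", 2001),
    Degree("Ph.D.", "Computer Science", "UCLA", "Aug.", 2005),
]

# Hypothetical importance ranking of degrees.
DEGREE_RANK = {"B.Sc.": 1, "M.Sc.": 2, "Ph.D.": 3}


def drop_attributes(entries, keep):
    """Keep only the named attributes of each tuple, in list order."""
    return [tuple(getattr(e, name) for name in keep) for e in entries]


def most_important(entries, rank):
    """Retain the single highest-ranked tuple."""
    return max(entries, key=lambda e: rank[e.degree])


print(drop_attributes(record, ("degree", "school", "year")))
top = most_important(record, DEGREE_RANK)
print((top.degree, top.major, top.school, top.year))
```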
Dimensions in Spatial Data Cube
• Three types of dimensions in a spatial data cube:
• A nonspatial dimension
• A spatial-to-nonspatial dimension
• A spatial-to-spatial dimension
• A nonspatial dimension contains only nonspatial data. For example, the nonspatial
dimensions temperature and precipitation can be constructed for the
warehouse.
• Example: both contain nonspatial data whose generalizations are also nonspatial
(such as “hot” for temperature and “wet” for precipitation).
• A spatial-to-nonspatial dimension is a dimension whose primitive-level
data are spatial but whose generalization, starting at a certain high level,
becomes nonspatial.
Dimensions in Spatial Data Cube
• Example: Suppose that the dimension’s spatial representation of Mumbai is
generalized to the string “South Mumbai.” Although “South Mumbai” is a
spatial concept, its representation (a string) is not spatial.
• A spatial-to-spatial dimension is a dimension whose primitive level and all
of its high-level generalized data are spatial.
• Example: the dimension equi_temperature_region contains spatial data, as
do all of its generalizations, such as regions covering 0-5 degrees
(Celsius), 5-10 degrees (Celsius), and so on.
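As a rough sketch of the equi-temperature example, grid cells can be grouped into 5-degree bands so that each band stays a set of cells, that is, spatial data. The grid encoding, the band width, and the function names are assumptions; a real system would also merge adjacent cells into region polygons, which is omitted here.

```python
from collections import defaultdict


def temperature_band(celsius, width=5):
    """Label of the half-open band containing the value, e.g. '[0, 5)'."""
    low = int(celsius // width) * width
    return f"[{low}, {low + width})"


def equi_temperature_regions(cells, width=5):
    """Group cell ids by temperature band; each group stays spatial."""
    regions = defaultdict(list)
    for cell_id, celsius in cells.items():
        regions[temperature_band(celsius, width)].append(cell_id)
    return dict(regions)


# Hypothetical 2x2 grid: (row, column) -> temperature in degrees Celsius.
cells = {(0, 0): 3.2, (0, 1): 4.9, (1, 0): 7.0, (1, 1): 9.5}

regions = equi_temperature_regions(cells)
print(regions)
```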
Plan, Plan Database and Plan Mining
• A plan consists of a variable sequence of actions.
• A plan database, or planbase, is a large collection of plans.
• Plan mining is the task of mining significant patterns or knowledge from a
planbase.
• Plan mining can be used to discover travel patterns of business passengers
in an air flight database or to find significant patterns from the sequences
of actions in the repair of automobiles.
• Plan mining is the extraction of important or significant generalized
(sequential) patterns from a planbase.
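A minimal sketch of mining generalized sequential patterns from a planbase: each action (here, an airport visited) is first generalized through a concept hierarchy, then contiguous subsequences are counted across plans. The airport codes, the size classes ("S" for small, "L" for large), and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical hierarchy: airport code -> airport size class.
AIRPORT_SIZE = {"S1": "S", "S2": "S", "L1": "L", "L2": "L"}


def generalize_plan(plan, hierarchy):
    """Replace each action with its higher-level concept, keeping order."""
    return tuple(hierarchy[action] for action in plan)


def frequent_subsequences(plans, length, min_sup):
    """Contiguous subsequences of a given length found in >= min_sup plans."""
    counts = Counter()
    for plan in plans:
        grams = {plan[i:i + length] for i in range(len(plan) - length + 1)}
        counts.update(grams)  # each plan counts a subsequence at most once
    return {g: c for g, c in counts.items() if c >= min_sup}


planbase = [
    ("S1", "L1", "L2", "S2"),
    ("S2", "L2", "S1"),
    ("S1", "L1", "S2"),
    ("L1", "L2"),
]
generalized = [generalize_plan(p, AIRPORT_SIZE) for p in planbase]
patterns = frequent_subsequences(generalized, length=2, min_sup=3)
print(patterns)
```

At the airport level no pair of consecutive stops occurs in three plans, but after generalization the small-to-large and large-to-small hops each appear in three of the four plans.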
Spatial Aggregation and Approximation
• Aggregation and approximation are especially useful for generalizing attributes
with large sets of values, complex structures, and spatial or multimedia data.
• We would like to generalize detailed geographic points into clustered regions,
such as business, residential, industrial, or agricultural areas, according to land
usage.
• A multimedia database may contain complex texts, graphics, images, video
fragments, maps, voice, music, and other forms of audio/video information.
• Generalization on multimedia data can be performed by recognition and
extraction of the essential features and/or general patterns of such data.
• For an image, the size, color, shape, texture, orientation, and relative positions and
structures of the contained objects or regions in the image can be extracted by
aggregation and/or approximation.
• For a segment of music, its melody can be summarized based on the approximate
patterns that repeatedly occur in the segment, while its style can be summarized
based on its tone, tempo, or the major musical instruments played.
• Generalizing such data draws on technologies developed in spatial databases and multimedia databases, such as:
• spatial data accessing and analysis techniques
• pattern recognition
• image analysis
• text analysis
• content-based image/text retrieval
• multidimensional indexing methods
• Example:
• Suppose that we have different pieces of land for various purposes of
agricultural usage, such as the planting of vegetables, grains, and fruits. These
pieces can be merged or aggregated into one large piece of agricultural land
by a spatial merge.
• However, such a piece of agricultural land may contain highways, houses, and
small stores.
• If the majority of the land is used for agriculture, the scattered regions for
other purposes can be ignored, and the whole region can be claimed as an
agricultural area by approximation.
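The land-use example can be sketched as two steps: aggregate parcels by their generalized usage, then approximate the whole region by the majority usage. The parcel areas, the usage hierarchy, and the 50% majority threshold are assumptions for illustration.

```python
# Hypothetical hierarchy: detailed land use -> general land use.
LAND_USE = {
    "vegetables": "agricultural",
    "grains": "agricultural",
    "fruits": "agricultural",
    "highway": "transport",
    "houses": "residential",
    "stores": "business",
}


def aggregate_by_use(parcels, hierarchy):
    """Spatial merge: total area per generalized land use."""
    totals = {}
    for use, area in parcels:
        general = hierarchy[use]
        totals[general] = totals.get(general, 0) + area
    return totals


def approximate_region(parcels, hierarchy, threshold=0.5):
    """Label the region by its dominant use if its share exceeds threshold."""
    totals = aggregate_by_use(parcels, hierarchy)
    if not totals:
        return None
    use, area = max(totals.items(), key=lambda kv: kv[1])
    return use if area / sum(totals.values()) > threshold else None


# Hypothetical parcels: (detailed use, area in hectares).
parcels = [
    ("vegetables", 40), ("grains", 35), ("fruits", 15),
    ("highway", 4), ("houses", 5), ("stores", 1),
]

print(aggregate_by_use(parcels, LAND_USE)["agricultural"])  # 90
print(approximate_region(parcels, LAND_USE))                # agricultural
```

If no single usage exceeds the threshold, the function returns `None` rather than forcing a label, which leaves the region ungeneralized.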

Graph Mining, Graph Patterns, Social Network, Set & List Valued Attribute, Spatial Data
