Extracting and Visualising Tree-like Structures from Concept Lattices
Cassio Melo1, Bénédicte Le-Grand2, Marie-Aude Aufaure1 and Anastasia Bezerianos1
1 École Centrale Paris – MAS Laboratoire, 2 Laboratoire d’Informatique 6 – LIP6
{cassio.melo, marie-aude.aufaure, anastasia.bezerianos}@ecp.fr, benedicte.le-grand@lip6.fr
Abstract
Traditional software in Formal Concept Analysis
makes little use of visualization techniques, producing
poorly readable concept lattice representations when the
number of concepts exceeds a few dozen. This is
problematic as the number of concepts in such lattices
grows significantly with the size of the data and the
number of its dimensions. In this work we propose
several methods to enhance the readability of concept
lattices, firstly through colouring and distortion
techniques, and secondly by extracting and visualizing
trees derived from concept lattice structures. These
contributions represent an important step in the visual
analysis of conceptual structures, as domain experts may
visually explore larger datasets that traditional
visualizations of concept lattices cannot represent
effectively.
Keywords--- Concept Lattices, Formal Concept
Analysis, Tree Extraction.
1. Introduction
The vast amount of data generated over the last
decades has brought new challenges to the field of
analytics. Visual data analysis and knowledge
representation employ methods such as Formal Concept
Analysis (FCA) in order to identify groupings of patterns
from the analysis process [26]. FCA provides an intuitive
understanding of generalization and specialization
relationships among objects and their attributes in a
structure known as a concept lattice. A concept lattice is
traditionally represented by a Hasse diagram illustrating
the groupings of objects described by common attributes.
A Hasse diagram is a graph where concepts appear as
vertices on the plane connected by line segments or
curves. The layout of the partially ordered set may be
seen as a layered diagram [2]. Lattice visualization
becomes a problem as the number of clusters grows
significantly with the number of objects and attributes.
Interpreting the lattice through a direct visualization of
the line diagram rapidly becomes impossible and more
synthetic representations are needed.
In this work we propose alternatives to the
traditional lattice representation, firstly by enhancing the
readability of concept lattices through colouring and
distortion techniques; secondly by extracting and
visualizing trees derived from the lattice structure. The
tree extraction from the original lattice has some unique
advantages: it eliminates all edge crossings and the
resulting hierarchy is also easier to interpret and to
represent. Moreover, this representation still provides an
overview of the dataset, highlighting significant
properties of the lattice. In order to extract trees from
lattices, we define a set of parent concept selection
criteria, including the stability and support indexes [1,4]
provided by FCA literature, confidence index as well as
topological features of the lattice.
The paper is organized as follows. Section 2
provides background on lattice representations; Section 3
proposes a set of criteria for transforming concept
lattices into trees; Section 4 discusses colouring and
distortion techniques for enhancing interpretations of
lattices. Section 5 presents instantiations of the suggested
criteria and visualizations in the biology domain,
followed by a discussion in section 6. Section 7 finally
concludes and presents perspectives for future work.
2. Visual Representation of Concept Lattices
As mentioned above, FCA produces
lattices, usually represented as layered directed acyclic
graphs, named Hasse diagrams, that illustrate the
groupings of objects described by common attributes.
Hasse diagrams display the partially ordered sets
(posets) between concepts in a hierarchical fashion,
where each concept may have several parent concepts. In
the following example about animal features, the
formal context in table 1 generates the concept lattice
illustrated in figure 1. The partial order among concepts
of the lattice is materialized through the generalization
and specialization relationships: for instance, Concept 4
(representing the set of flying birds, containing the Finch and
Eagle objects) is more specific than Concept 1 (which
contains all birds, flying or not), and thus contains a
smaller number of objects (Concept 1 has an extra one,
the Ostrich). This partial order provides different levels
of abstraction and native navigation links from a given
concept.
2011 15th International Conference on Information Visualisation
1550-6037/11 $26.00 © 2011 IEEE
DOI 10.1109/IV.2011.46
As mentioned earlier, such diagrams are usually
layered graphs, where concept vertices are assigned to
horizontal layers according to the number of common
attributes, and are ordered within each layer to reduce
edge crossings. FCA lattices in particular suffer from
considerable edge crossings, especially if the number of
concepts exceeds a few dozen, as is the case in most
real-world applications [13], which leads to reduced graph
readability and aesthetics [3].
To reduce the complexity of lattices, simplified
diagrams can be produced by condensing or clustering
concepts according to similarity [4]. Visualisations can
also be restricted to portions of the data [5], and concept
size reduction is possible by incorporating conditions
into the data mining process [6]. Finally, conceptual
measures can be applied to identify the most relevant
concepts and filter outliers [7].
To deal specifically with the visual complexity of
Hasse diagrams, several approaches allow users to
dynamically explore and reveal specific parts of the
diagram, using visual query languages [8-10]. However
these techniques do not provide a clear view of the entire
lattice.
Other FCA visualization approaches map the
distances between concepts to visual variables, in order
to highlight patterns. For example in [11] similar
concepts are represented as similarly coloured pixels
placed in the 2D space along a Peano-Hilbert curve, so
that similar concepts are placed close to one another.
Nevertheless in these representations detailed
relationships between concepts are lost. Finally, systems
often provide users with hybrid/combined lattice
visualization, e.g. showing both a general Hasse diagram
and a tag cloud for representing the neighbours of a
specific concept (for a review see [12]).
Our approach consists in representing lattices not as
Hasse diagrams, but as trees. We use different criteria to
extract trees from lattices, and visualize the resulting
trees. Trees are inherently simpler hierarchical structures
than Hasse diagrams and due to their applicability in
many domains, there is a plethora of tree representations.
These include: indented outline trees, sometimes called
“tree lists” (common in file browsers such as Windows
Explorer), traditional layered node-link diagrams in 2D
or 3D (e.g. ConeTrees [14]), spatially transformed tree
diagrams (e.g. Radial [15]) as well as several space
optimization (Space Optimized trees [16]) and space-
filling tree visualization techniques (e.g. TreeMaps [17]).
Table 1. A formal context of animals.
Animal  | Preying | Mammal | Flying | Bird
Lion    |    X    |   X    |        |
Finch   |         |        |   X    |  X
Eagle   |    X    |        |   X    |  X
Hare    |         |   X    |        |
Ostrich |         |        |        |  X
Figure 1. Generated concept lattice for the
animal’s context in table 1.
3. Tree Extraction from Concept Lattices
Trees are a common and easily understandable
visual representation. We consider them as a
visualization alternative to large cluttered concept
lattices, which preserves all lattice entities and some of
its structure. In order for a tree visualization to be an
effective alternative to a lattice, the extraction of the tree
from the lattice needs to preserve the most essential
features of the original structure.
The present approach consists in extracting a tree
from a concept lattice by choosing one single parent
concept for each concept of the lattice. We start from the
most specific concepts, i.e. the parent concepts of the
lower bound of the lattice, at the bottom of the Hasse
diagram (concepts 5 and 6 in Figure 1), select a
single parent concept for each of them, and repeat
this step recursively. Choosing a single parent concept at each
step leads to an information loss. Our goal is to minimize
this loss by selecting parents using the most relevant
criteria according to the kind of analysis performed by
the analyst. Before proceeding, we briefly recall the FCA
terminology [18]. Given a (formal) context
$K = (G, M, I)$, where $G$ is a set of objects,
$M$ is a set of attributes, and the binary
relation $I \subseteq G \times M$ specifies which objects have which
attributes, the derivation operators $(\cdot)'$ are defined for
$A \subseteq G$ and $B \subseteq M$:
\[ A' = \{ m \in M \mid \forall g \in A : g\,I\,m \}, \qquad B' = \{ g \in G \mid \forall m \in B : g\,I\,m \}. \]
A formal concept is a pair $(A, B)$ with $A' = B$ and $B' = A$; $A$ is
called its extent and $B$ its intent.
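As an illustration of the two derivation operators, the following minimal Python sketch applies them to the animal context. It assumes the crosses of table 1 are read as Lion: Preying, Mammal; Finch: Flying, Bird; Eagle: Preying, Flying, Bird; Hare: Mammal; Ostrich: Bird. The names `CONTEXT`, `common_attributes` and `common_objects` are ours and do not come from any FCA library.

```python
# Formal context of table 1 as a mapping object -> attributes (the relation I).
# The placement of the crosses is our reading of the table.
CONTEXT = {
    "Lion": {"Preying", "Mammal"},
    "Finch": {"Flying", "Bird"},
    "Eagle": {"Preying", "Flying", "Bird"},
    "Hare": {"Mammal"},
    "Ostrich": {"Bird"},
}
ATTRIBUTES = frozenset().union(*CONTEXT.values())  # the attribute set M


def common_attributes(objects):
    """A': attributes shared by every object of A (all of M when A is empty)."""
    result = set(ATTRIBUTES)
    for g in objects:
        result &= CONTEXT[g]
    return frozenset(result)


def common_objects(attributes):
    """B': objects that have every attribute of B (all of G when B is empty)."""
    wanted = set(attributes)
    return frozenset(g for g, attrs in CONTEXT.items() if wanted <= attrs)


flying_birds = common_objects({"Flying", "Bird"})
print(sorted(flying_birds))                     # ['Eagle', 'Finch']
print(sorted(common_attributes(flying_birds)))  # ['Bird', 'Flying']
```

Applying one operator and then the other returns the starting set here, which is what makes ({Finch, Eagle}, {Flying, Bird}) a formal concept.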
In the following sections we consider various
strategies for selecting parent concepts, including the
stability and support indexes from FCA literature,
confidence, as well as topological features of the lattice.
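The extraction procedure itself is independent of the criterion used. A minimal Python sketch, assuming the lattice is given as a mapping from each concept to its candidate parents, could look as follows. The concept identifiers follow our reading of Figure 1 (0 = top, 1 = birds, 2 = preying, 3 = mammals, 4 = flying birds, 5 = Eagle, 6 = Lion); the identifiers of concepts 0 and 3 are assumptions, and `extract_tree`, `PARENTS` and `STABILITY` are hypothetical names.

```python
# Cover relation of the animal lattice: concept id -> candidate parents.
# The bottom concept is left out because extraction starts from its parents.
PARENTS = {0: [], 1: [0], 2: [0], 3: [0], 4: [1], 5: [2, 4], 6: [2, 3]}

# Illustrative scores for the competing parents. The stability values of
# concepts 2 and 4 are those quoted in Section 3.1; the value for concept 3
# was derived by hand under our reading of table 1.
STABILITY = {2: 0.25, 3: 0.5, 4: 0.5}


def extract_tree(parents, score):
    """Return child -> single parent, keeping the highest-scoring candidate."""
    tree = {}
    for child, candidates in parents.items():
        if candidates:  # the top concept has no parent and becomes the root
            tree[child] = max(candidates, key=lambda p: score(child, p))
    return tree


tree = extract_tree(PARENTS, lambda child, parent: STABILITY.get(parent, 0.0))
print(tree)  # {1: 0, 2: 0, 3: 0, 4: 1, 5: 4, 6: 3}
```

Because each concept chooses its parent independently, the order in which concepts are visited does not change the result. Ties are resolved in favour of the first candidate listed, which is an arbitrary choice of this sketch.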
3.1. Parent Selection based on the highest
Stability or Support
The stability index measures the proportion of subsets
of objects of a given concept whose derivation is equal to
the intent of this concept [1]. In other words, the stability
indicates the probability of preserving a concept intent
while removing some objects of its extent. We recall the
definition of stability:
Definition 1. Let $K = (G, M, I)$ be a formal context and
$(A, B)$ be a formal concept of $K$, and let Card denote the cardinality
function. The stability index of $(A, B)$ is defined as:
\[ \sigma(A, B) = \frac{\mathrm{Card}(\{ C \subseteq A \mid C' = B \})}{2^{\mathrm{Card}(A)}} \]
Using the context in table 1 as an example, we
calculate the stability of concepts 2 and 4 (0.25 and 0.5
respectively) in order to select a parent for concept 5.
We keep the parent with the highest stability, concept 4, and
therefore remove the edge between concepts 2 and 5.
The idea behind the choice of the parent concept with the
highest stability is that we expect the parent
concept’s meaning to be preserved even if some of its objects or
attributes are removed.
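The two stability values can be checked by brute force. The sketch below enumerates every subset of a concept's extent, as in Definition 1; it assumes our reading of table 1 (Lion: Preying, Mammal; Finch: Flying, Bird; Eagle: Preying, Flying, Bird; Hare: Mammal; Ostrich: Bird), and the function names are ours.

```python
from itertools import combinations

# Formal context of table 1 (placement of the crosses is our reading).
CONTEXT = {
    "Lion": {"Preying", "Mammal"},
    "Finch": {"Flying", "Bird"},
    "Eagle": {"Preying", "Flying", "Bird"},
    "Hare": {"Mammal"},
    "Ostrich": {"Bird"},
}
ATTRIBUTES = frozenset().union(*CONTEXT.values())


def intent_of(objects):
    """Derivation C': attributes common to all given objects."""
    result = set(ATTRIBUTES)
    for g in objects:
        result &= CONTEXT[g]
    return frozenset(result)


def stability(extent):
    """Share of the subsets of the extent whose derivation equals the intent."""
    extent = sorted(extent)
    intent = intent_of(extent)
    preserved = sum(
        1
        for size in range(len(extent) + 1)
        for subset in combinations(extent, size)
        if intent_of(subset) == intent
    )
    return preserved / 2 ** len(extent)


print(stability({"Lion", "Eagle"}))   # concept 2 (preying animals): 0.25
print(stability({"Finch", "Eagle"}))  # concept 4 (flying birds): 0.5
```

The enumeration visits all subsets of the extent, so its cost is exponential in the size of the extent; it is meant for small examples only.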
The support measure, on the other hand, is the ratio
between the number of objects that have all the attributes of an
intent and the total number of objects [4]:
Definition 2. Let $B \subseteq M$. The support of the
attribute set $B$ in $K$ is:
\[ \mathrm{supp}(B) = \frac{\mathrm{Card}(B')}{\mathrm{Card}(G)} \]
Using support as the parent selection criterion may lead
to trees containing concepts that have fewer
specialization levels since, in general, generic concepts
have higher support values than their more specific
counterparts [4]. Concept stability and support measures
have been widely used in FCA, and their combination has
shown promise [1] for reducing the lattice.
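To make the support measure concrete, here is a small Python sketch over the animal context, again assuming our reading of the crosses in table 1; `support` is a name chosen for this illustration.

```python
# Formal context of table 1 (placement of the crosses is our reading).
CONTEXT = {
    "Lion": {"Preying", "Mammal"},
    "Finch": {"Flying", "Bird"},
    "Eagle": {"Preying", "Flying", "Bird"},
    "Hare": {"Mammal"},
    "Ostrich": {"Bird"},
}


def support(attributes):
    """Card(B') / Card(G): share of objects having every attribute of B."""
    wanted = set(attributes)
    matching = [g for g, attrs in CONTEXT.items() if wanted <= attrs]
    return len(matching) / len(CONTEXT)


for intent in ({"Bird"}, {"Flying", "Bird"}, {"Preying", "Flying", "Bird"}):
    print(sorted(intent), support(intent))
```

On this chain of concepts (birds, flying birds, preying flying birds) the support decreases from 3/5 to 2/5 to 1/5, which illustrates why generic concepts tend to win under this criterion.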
3.2. Parent Selection Based on Shared Attributes
and Objects
This approach relies on clustering parent and child
concepts which share most of their attributes or objects.
Parent and child having a great number of attributes in
common are supposed to be grouped together following
the principle of similarity clustering and local
predictability [19]. Its definition is:
Definition 3. Let ConceptParent $(A, B)$ be such that
$A \subseteq G$ and $B \subseteq M$, and let ConceptChild $(C, D)$ be such that
$C \subseteq G$ and $D \subseteq M$. The shared attribute index of an edge
$E : (C, D) \to (A, B)$ is:
\[ \phi(E) = \frac{\mathrm{Card}(B \cap D)}{\mathrm{Card}(M)} \]
In the animal context of table 1, the
potential parent concepts 2 and 4 share the same
number of objects with concept 5, but concept 4 has
more attributes in common with concept 5, so it is chosen
as the unique parent of concept 5.
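The comparison between the two candidate parents of concept 5 can be sketched in Python as follows. The intents are those implied by our reading of table 1 and Figure 1 (concept 2 = {Preying}, concept 4 = {Flying, Bird}, concept 5 = {Preying, Flying, Bird}); the function name is hypothetical.

```python
ATTRIBUTES = {"Preying", "Mammal", "Flying", "Bird"}  # the attribute set M


def shared_attribute_index(parent_intent, child_intent):
    """Card(B ∩ D) / Card(M) for parent intent B and child intent D."""
    return len(set(parent_intent) & set(child_intent)) / len(ATTRIBUTES)


eagle_intent = {"Preying", "Flying", "Bird"}         # concept 5
candidates = {2: {"Preying"}, 4: {"Flying", "Bird"}}  # its parents

scores = {
    concept_id: shared_attribute_index(intent, eagle_intent)
    for concept_id, intent in candidates.items()
}
print(scores)                       # {2: 0.25, 4: 0.5}
print(max(scores, key=scores.get))  # 4
```

Since the intent of a parent concept is always contained in the intent of its child, the intersection equals the parent intent, so this criterion favours the candidate parent with the largest intent.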
3.3. Parent Selection Based on Confidence
The confidence value of a concept estimates how
likely an object which has an attribute set A, also has an
attribute set C [18]. In other words, it tries to measure
how strong the implication of the parent attributes in the
child objects is. For instance, considering the formal
context in table 1, what is the probability of a given
object that is {Bird, Flying} to be also {Bird, Flying,
Preying}? The following paragraph formalizes its
definition.
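Definition 4. Let ConceptParent $(A, B)$ be such that
$A \subseteq G$ and $B \subseteq M$, and let ConceptChild $(C, D)$ be such that
$C \subseteq G$ and $D \subseteq M$. The confidence of an edge
$E : (C, D) \to (A, B)$ is:
\[ \delta(E) = \frac{\mathrm{Card}(C)}{\mathrm{Card}(A)} \]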
An advantage of this method is its consistency with
the interpretation of concept lattices. Taking our animal
context as an example, there is a 50% probability that an
animal that is a flying bird is also a flying and preying
bird. By contrast, an animal that is preying has only a 33%
chance of also being a flying bird.
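Definition 4 translates directly into code. The sketch below reproduces the 50% figure for the edge between the flying birds and the flying, preying birds, and then shows the selection rule on made-up extents; the function names and the extents `"p"` and `"q"` are purely illustrative.

```python
def confidence(child_extent, parent_extent):
    """delta(E) = Card(C) / Card(A) for an edge from child (C, D) to parent (A, B)."""
    return len(child_extent) / len(parent_extent)


def most_confident_parent(child_extent, candidate_extents):
    """Pick the candidate parent whose edge to the child has the highest confidence."""
    return max(
        candidate_extents,
        key=lambda name: confidence(child_extent, candidate_extents[name]),
    )


# Edge from the flying, preying birds (concept 5) to the flying birds (concept 4).
print(confidence({"Eagle"}, {"Finch", "Eagle"}))  # 0.5

# Hypothetical candidates, not taken from the paper's data.
toy_candidates = {"p": {"a", "b"}, "q": {"a", "b", "c"}}
print(most_confident_parent({"a"}, toy_candidates))  # p
```

For a fixed child the numerator is constant, so the highest confidence always goes to the candidate parent with the smallest extent, i.e. the most specific one.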
3.4. Tree Transformation Based on the
Minimum Spanning Tree
This topology-driven approach seeks
the tree configuration of the graph which has the shortest
paths among all concepts. A detailed definition of the
algorithm can be found in [20]. In this case, we choose
the parent concept which minimizes the number of steps
needed to reach the top of the lattice, and therefore the most
generic parents (closer to the top) are preferred over
the specialized ones. This is the only proposed
strategy that takes into account the topology of the lattice,
and it yields concepts with similar depth (specialization)
levels.
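The parent selection rule stated above (fewest steps to the top) can be sketched with a breadth-first search from the top concept. This is only an illustration of that rule, not the distributed algorithm of [20]. Concept identifiers follow our reading of Figure 1 (0 = top, 1 = birds, 2 = preying, 3 = mammals, 4 = flying birds, 5 = Eagle, 6 = Lion), and all names are ours.

```python
from collections import deque

# Cover relation of the animal lattice: concept id -> candidate parents.
PARENTS = {0: [], 1: [0], 2: [0], 3: [0], 4: [1], 5: [2, 4], 6: [2, 3]}


def depth_from_top(parents, top=0):
    """Breadth-first search downwards: fewest edges from the top to each concept."""
    children = {concept: [] for concept in parents}
    for child, candidates in parents.items():
        for parent in candidates:
            children[parent].append(child)
    depth = {top: 0}
    queue = deque([top])
    while queue:
        node = queue.popleft()
        for child in children[node]:
            if child not in depth:
                depth[child] = depth[node] + 1
                queue.append(child)
    return depth


def shallowest_parent_tree(parents, top=0):
    """Keep, for each concept, the candidate parent closest to the top."""
    depth = depth_from_top(parents, top)
    return {
        child: min(candidates, key=depth.__getitem__)
        for child, candidates in parents.items()
        if candidates
    }


print(shallowest_parent_tree(PARENTS))  # {1: 0, 2: 0, 3: 0, 4: 1, 5: 2, 6: 2}
```

Under this rule concept 5 is attached to the preying concept (one step from the top) rather than to the flying birds (two steps). Ties, such as the two parents of concept 6, are broken by listing order in this sketch.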
4. Using extraction criteria to enhance
Lattice and Tree Interpretation through
Drawing, Sizing and Shaping
Common graph drawing techniques include the
assignment of different colours, shapes and sizes to
nodes and edges, according to different dimensions or
properties. This approach is underused in traditional
lattice visualizations, where the main visual variable
used is node/link colour to reflect user selections or node
size to indicate the immediate presence of an extent or
intent, as displayed in ConExp¹.
In our work we use these as well as other visual
variables in a Hasse diagram to represent possible tree
extraction criteria. This provides several benefits to
lattice and extracted tree understanding. First, it enables
users to rapidly associate the dimension/criterion in
question (e.g. stability, support in Figure 2) with
concepts, thus justifying the choices made during the tree
extraction process. Second, visualizing different
extraction criteria using various visual variables, allows
users to compare these criteria in order to choose the one
that better fits their needs. Third, irrespective of the tree
extraction process, matching visual attributes to concept
attributes establishes a benchmark/comparison among
concepts, making it possible to compare at a glance
different concepts, even if they do not have a link in
common, as well as gain insights on the whole lattice
itself. Finally, prominent features of the lattice like
specialization and generalization can be better
understood: for instance the power of implications of
different concepts can be rendered by edge thickness.
The concept node itself can be a visual metaphor for the
intent and extent. In the example of figure 2, a pie chart
replaces the traditional box representation to depict the
proportion of objects (blue) and attributes (yellow). In
this way users can be guided in understanding and
choosing criteria for extracting trees to simplify the
lattice representation.
Figure 2. Animal lattice with nodes as pie charts
sized by stability, and edge thickness by
confidence. Pie charts indicate the ratio
intent/extent of the concept.
¹ ConceptExplorer. http://conexp.sourceforge.net/
5. A Qualitative Analysis of the Proposed
Parent Selection Criteria
In this section we discuss a case study of a concept
lattice to qualitatively examine the nature of the trees
resulting from different criteria. The techniques for
lattice transformation and drawing were implemented in
a visual analytics tool called CUBIST Analytics and
applied to a dataset² containing 8 animals and 9
attributes, which produced a lattice with 19 concepts
(figure 3). Each of the proposed measures revealed
particular aspects of the lattice, as illustrated
in table 2.
Table 2 a) shows the tree generated with stability as
the parent selection criterion. In practice, it resulted in a tree
with very stable concepts, which are more likely to retain their
subsequent children. For instance, the concept {lives on
land} was the preferred parent of the concept that captures
our notion of amphibians, {lives on land, lives in water},
because it is more stable than its counterparts.
The measure of shared objects was the criterion that
generated the tree in table 2 b). Parent concepts sharing
most objects with the child concept were the preferred
candidates. As an example, the concept {lives on land}
shares more objects with {lives on land, needs
chlorophyll} than concept {needs chlorophyll} does,
therefore it was the chosen parent in this case.
Table 2 c) shows the tree generated with the confidence
criterion: child nodes are associated with the
parent with which the confidence relationship is the
highest among the candidates. As a result, for example,
{can move, has limbs} has a stronger implication in
{lives on land} than {lives on land} has in {can move,
has limbs}.
Table 2 d) depicts the tree generated by the
minimum spanning tree criterion. In this configuration,
concepts are arranged so that they are semantically closer
to the rest of the lattice, and hence the tree features
a more symmetric structure than those of the previous
approaches.
Figure 3. Concept lattice of the biology domain.
² The “Needs water to live” dataset is available at
http://www.upriss.org.uk/fca/examples.html
6. Discussion
Some may argue that due to the tree construction,
the present approach breaks the original lattice meaning,
and therefore subsequent mathematical models based on
this structure. It is noteworthy to observe however, that
only the links in the lattice graph structure are removed
and the lattice structure remains semantically valid, since
there is no need to take out the attributes or objects that
concepts have in common with their parents.
Table 2. Trees generated from the lattice in
figure 3 for each of the proposed measures.
Example panels: a) Stability, b) Shared objects, c) Confidence, d) MST
The choice of parent selection criteria for tree
transformation corresponds, to some extent, to a classification
problem. Deciding whether a Lion is more “mammal” than
it is “preying” is not always straightforward, hence we
rely on measures that attempt to keep the context
semantics when looking at the entire concept lattice. For
instance, if more objects described by mammal
are “closer” to Lion than those of other concepts, then it
may be reasonable to choose mammal as its parent. As a general
recommendation, one should use the criterion that best
fits the analysis task (table 3).
In addition to the tree-extraction strategies, the use
of colours, size, shaping and thickness for both nodes
and edges in the original lattice to represent the criteria
metrics (such as stability, support, specialization or
implication) can enhance the interpretation of a concept
lattice, and aid users in their choice and interpretation of
the created trees.
The labelling strategy for identifying concepts
should be taken into account as well. Merely placing
attribute and object names on concepts (as done in most FCA
visualizations) may be cumbersome for the analysis of large
lattices. In this case, it is recommended to
represent the concept’s intent and extent with visual
metaphors like the pie chart shown in figure 2.
7. Conclusions and Future Work
Traditional software in FCA makes little use of
visualization techniques, producing poorly readable
lattice graphs when the number of concepts exceeds a
few dozen. In this work we have presented a
transformation approach to extract trees from concept
lattices, attempting to minimize both semantic and
conceptual loss in favour of readability and
interpretation. We have also presented ways to visually
show the extraction criteria in the original lattice. This is
an important step in the visual analysis of conceptual
structures, as the resulting tree structures are visually
easier to understand than cluttered lattice graphs.
Domain experts can thus visually explore larger datasets
that traditional visualizations of concept lattices cannot
represent effectively. Each of the tree construction
measures proposed in our work provides particular
insights valuable to different analysis tasks, identified in
our paper as recommendations.
In the future we plan to combine two or more criteria
for parent selection with other lattice reduction
techniques (e.g. iceberg lattices [4]). We also plan to
conduct user experiments to understand when users want
to have full lattice views vs. tree views, which metrics
for creating trees are of most interest to them and under
which circumstances, and to assess whether our visual indications
allow users to understand the tree extraction process.
Acknowledgements
This work is partly funded by the CUBIST project
(“Combining and Uniting Business Intelligence with
Semantic Technologies”), funded by the European
Commission’s 7th Framework Programme of ICT, under
topic 4.3: Intelligent Information Management.
Table 3. General guidelines on the usage of the proposed metrics.
Criteria | Description | Rationale | Suitable for
Stability | It measures how likely a concept is to change if some of its attributes or objects are removed. | Stable concepts are less impacted by noise and usually represent strong correlation with real-world entities (e.g. a concept that encapsulates our notion of “mammal”). | Observing real-world analogies
Support | It measures the frequency of the concept itemset. | Frequent concepts are usually generic concepts since they aggregate a larger number of objects than the specialized ones. | Frequent pattern analysis
Shared objects / attributes | It represents the degree of similarity between parent and child nodes. | Concepts that share most attributes or objects should be linked together because they are similar. | Similarity analysis
Confidence | It measures how strong the implication is between a parent concept and a child concept. | Implication is one of the desired interpretations of a concept lattice. | Confidence analysis
MST | It extracts a minimum spanning tree from the concept lattice. | It looks at concepts that have some degree of similarity with all other concepts. | Topological analysis
References
[1] Kuznetsov, S.O.: Stability as an estimate of the degree of
substantiation of hypotheses derived on the basis of
operational similarity. Nauchn. Tekh. Inf., Ser.2
(Automat. Document. Math. Linguist.) 12 (1990) 21–29
[2] Di Battista, G.; Tamassia, R. (1988), "Algorithms for
plane representation of acyclic digraphs", Theoretical
Computer Science 61: 175–178.
[3] C. Ware, H. Purchase, L. Colpoys, and M. McGill.
Cognitive measurements of graph aesthetics. Information
Visualization, 1(2):103–110, 2002.
[4] Stumme, G., Taouil, R., Bastide, Y., Pasquier, N., and
Lakhal, L. Computing iceberg concept lattices with
Titanic. In Data & Knowledge Engineering, Volume 42,
Issue 2, pp. 189-222, 2002.
[5] Ducrou, J., Eklund, P., and Wilson, T. An Intelligent
User Interface for Browsing and Searching MPEG-7
Images Using Concept Lattices. In S. Ben Yahia et al.
(Eds.): CLA 2006, LNAI 4923, pp. 1–21,
Springer-Verlag Berlin Heidelberg 2008.
[6] Zaki, M.J., Hsiao, C-J. Efficient Algorithms for Mining
Closed Itemsets and Their Lattice Structure. In IEEE
Transactions on Knowledge and Data Engineering, Vol. 17,
No. 4, IEEE Computer Soc., 2005.
[7] Le Grand, B., Soto, M., Aufaure, M.-A. (2009)
“Conceptual and Spatial Footprints for Complex systems
Analysis: Application to the Semantic Web”, in 20th
International Conference on Database and Expert
Systems Applications 2009, pp.114-127.
[8] Blau, H., Immerman, N., and Jensen, D. A Visual
Language for Querying and Updating Graphs. University
of Massachusetts Amherst, Computer Science
Department Tech. Report 2002-037. 2002.
[9] Cruz, I. F., Mendelzon, A. O., and Wood, P. T.. A
Graphical Query Language Supporting Recursion. In
Proc. of the Association for Computing Machinery
Special Interest Group on Management of Data, pages
323–330. ACM Press, May 1987.
[10] Consens, M., and Mendelzon, A. Hy+: a Hygraph-based
query and visualization system. SIGMOD Record,
22(2):511–516, 1993.
[11] Michel Soto, Benedicte Le Grand, Marie-Aude Aufaure,
"Spatial Visualisation of Conceptual Data," International
Conference Information Visualisation, pp. 57-61, 2009.
[12] Eklund, Peter, Villerd, Jean. A Survey of Hybrid
Representations of Concept Lattices in Conceptual
Knowledge Processing Formal Concept Analysis.
Lecture Notes in Computer Science 2010, Springer
Berlin/Heidelberg, pp. 296- 311
[13] C. Roth, S. Obiedkov, D. G. Kourie. "Towards Concise
Representation for Taxonomies of Epistemic
Communities", CLA 4th Intl Conf on Concept Lattices
and their Applications. 2006.
[14] George Robertson, Jock D. Mackinlay, Stuart Card.
Cone Trees: Animated 3D Visualizations of Hierarchical
Information. In Proceedings of the ACM CHI 91 Human
Factors in Computing Systems Conference, pages 189--
194, Association for Computing Machinery. 1991.
[15] P. D. Eades. Drawing free trees. Bulletin of the Institute
for Combinatorics and its Applications, 5:10–36, 1992.
[16] Quang Vinh Nguyen; Mao Lin Huang. A space-
optimized tree visualization, Information Visualization,
2002. INFOVIS 2002, pp. 85- 92, 2002
[17] B. Johnson, B. Shneiderman: Tree maps: A Space-Filling
Approach to Visualization of Hierarchical Information
Structures. IEEE Visualization 1991: 284-291
[18] Ganter, B., Wille, R.: Formal Concept Analysis:
Mathematical Foundations. Springer, Berlin (1999)
[19] Hannan, T., Pogel, A.: Spring-based lattice drawing
highlighting conceptual similarity. In: Proceedings of the
International Conference on Formal Concept Analysis,
ICFCA 2006, Berlin. LNCS, vol. 3974, pp. 264–279.
Springer, Heidelberg (2006)
[20] Gallager, R.G., Humblet, P.A., and Spira, P.M. 1983. A
Distributed Algorithm for Minimum-Weight Spanning
Trees. ACM Trans. Program. Lang. Syst. 5, 1 (Jan.
1983), 66-77.
NewMind AI Weekly Chronicles – August ’25 Week III
sbt 2.0: go big (Scala Days 2025 edition)
Architecture types and enterprise applications.pdf
A Late Bloomer's Guide to GenAI: Ethics, Bias, and Effective Prompting - Boha...
Chapter 5: Probability Theory and Statistics
Taming the Chaos: How to Turn Unstructured Data into Decisions
Microsoft Excel 365/2024 Beginner's training
Custom Battery Pack Design Considerations for Performance and Safety
A contest of sentiment analysis: k-nearest neighbor versus neural network
Configure Apache Mutual Authentication
1 - Historical Antecedents, Social Consideration.pdf
From MVP to Full-Scale Product A Startup’s Software Journey.pdf
STKI Israel Market Study 2025 version august

Lattice2 tree

Extracting and Visualising Tree-like Structures from Concept Lattices

Cassio Melo1, Bénédicte Le-Grand2, Marie-Aude Aufaure1 and Anastasia Bezerianos1
1 École Centrale Paris – MAS Laboratoire, 2 Laboratoire d’Informatique 6 – LIP6
{cassio.melo, marie-aude.aufaure, anastasia.bezerianos}@ecp.fr, benedicte.le-grand@lip6.fr

Abstract
Traditional software in Formal Concept Analysis makes little use of visualization techniques, producing poorly readable concept lattice representations when the number of concepts exceeds a few dozen. This is problematic as the number of concepts in such lattices grows significantly with the size of the data and the number of its dimensions. In this work we propose several methods to enhance the readability of concept lattices, first through colouring and distortion techniques, and second by extracting and visualizing trees derived from concept lattice structures. These contributions represent an important step in the visual analysis of conceptual structures, as domain experts may visually explore larger datasets that traditional visualizations of concept lattices cannot represent effectively.

Keywords--- Concept Lattices, Formal Concept Analysis, Tree Extraction.

1. Introduction
The vast amount of data generated over the last decades has brought new challenges to analytics. Visual data analysis and knowledge representation employ methods such as Formal Concept Analysis (FCA) in order to identify groupings of patterns from the analysis process [26]. FCA provides an intuitive understanding of generalization and specialization relationships among objects and their attributes in a structure known as a concept lattice. A concept lattice is traditionally represented by a Hasse diagram illustrating the groupings of objects described by common attributes. A Hasse diagram is a graph where concepts appear as vertices on the plane connected by line segments or curves.
The layout of the partially ordered set may be seen as a layered diagram [2]. Lattice visualization becomes a problem as the number of clusters grows significantly with the number of objects and attributes. Interpreting the lattice through a direct visualization of the line diagram rapidly becomes impossible and more synthetic representations are needed.

In this work we propose alternatives to the traditional lattice representation, first by enhancing the readability of concept lattices through colouring and distortion techniques, and second by extracting and visualizing trees derived from the lattice structure. The tree extraction from the original lattice has some unique advantages: it eliminates all edge crossings, and the resulting hierarchy is easier to interpret and to represent. Moreover, this representation still provides an overview of the dataset, highlighting significant properties of the lattice. In order to extract trees from lattices, we define a set of parent concept selection criteria, including the stability and support indexes [1,4] from the FCA literature, a confidence index, and topological features of the lattice.

The paper is organized as follows. Section 2 provides background on lattice representations; Section 3 proposes a set of criteria for transforming concept lattices into trees; Section 4 discusses colouring and distortion techniques for enhancing the interpretation of lattices. Section 5 presents instantiations of the suggested criteria and visualizations in the biology domain, followed by a discussion in Section 6. Section 7 concludes and presents perspectives for future work.

2. Visual Representation of Concept Lattices
As mentioned above, FCA produces lattices, usually represented as layered directed acyclic graphs named Hasse diagrams, that illustrate the groupings of objects described by common attributes.
Hasse diagrams display the partial order among concepts in a hierarchical fashion, where each concept may have several parent concepts. In the following example about animal features, the formal context in table 1 generated the concept lattice illustrated in figure 1. The partial order among concepts of the lattice is materialized through the generalization and specialization relationships: for instance, Concept 4 (the set of flying birds, containing the Finch and Eagle objects) is more specific than Concept 1 (which contains all birds, flying or not), and thus contains fewer objects (Concept 1 has an extra one, the Ostrich). This partial order provides different levels of abstraction and native navigation links from a given concept.

2011 15th International Conference on Information Visualisation. 1550-6037/11 $26.00 © 2011 IEEE. DOI 10.1109/IV.2011.46
As mentioned earlier, such diagrams are usually layered graphs, where concept vertices are assigned to horizontal layers according to the number of common attributes, and are ordered within each layer to reduce edge crossings. FCA lattices in particular suffer from considerable edge crossings, especially if the number of concepts exceeds a few dozen, as is the case in most real-world applications [13], which leads to reduced graph readability and aesthetics [3].

To reduce the complexity of lattices, simplified diagrams can be produced by condensing or clustering concepts according to similarity [4]. Visualisations can also be restricted to portions of the data [5], and concept size reduction is possible by incorporating conditions into the data mining process [6]. Finally, conceptual measures can be applied to identify the most relevant concepts and filter outliers [7].

To deal specifically with the visual complexity of Hasse diagrams, several approaches allow users to dynamically explore and reveal specific parts of the diagram, using visual query languages [8-10]. However, these techniques do not provide a clear view of the entire lattice. Other FCA visualization approaches map the distances between concepts to visual variables in order to highlight patterns. For example, in [11] similar concepts are represented as similarly coloured pixels placed in 2D space along a Peano-Hilbert curve, so that similar concepts are placed close to one another. Nevertheless, in these representations detailed relationships between concepts are lost. Finally, systems often provide users with hybrid or combined lattice visualizations, e.g. showing both a general Hasse diagram and a tag cloud representing the neighbours of a specific concept (for a review see [12]).

Our approach consists in representing lattices not as Hasse diagrams, but as trees. We use different criteria to extract trees from lattices, and visualize the resulting trees.
Trees are inherently simpler hierarchical structures than Hasse diagrams and, due to their applicability in many domains, there is a plethora of tree representations. These include: indented outline trees, sometimes called a “tree list” (common in file browsers such as Windows Explorer), traditional layered node-link diagrams in 2D or 3D (e.g. ConeTrees [14]), spatially transformed tree diagrams (e.g. Radial [15]), as well as several space optimization (Space Optimized trees [16]) and space-filling tree visualization techniques (e.g. TreeMaps [17]).

Table 1. A formal context of animals.

Animal   Preying  Mammal  Flying  Bird
Lion     X        X
Finch                     X       X
Eagle    X                X       X
Hare              X
Ostrich                           X

Figure 1. Generated concept lattice for the animal context in table 1.

3. Tree Extraction from Concept Lattices
Trees are a common and easily understandable visual representation. We consider them as a visualization alternative to large cluttered concept lattices, which preserves all lattice entities and some of its structure. In order for a tree visualization to be an effective alternative to a lattice, the extraction of the tree from the lattice needs to preserve the most essential features of the original structure.

The present approach consists in extracting a tree from a concept lattice by choosing one single parent concept for each concept of the lattice. We start from the most specific concepts, i.e. the parent concepts of the lower bound of the lattice, at the bottom of the Hasse diagram (concepts 5 and 6 in Figure 1), select a single parent concept for each of them, and repeat this recursively. Choosing a single parent concept at each step leads to an information loss. Our goal is to minimize this loss by selecting parents using the criteria most relevant to the kind of analysis performed by the analyst. Before proceeding, we briefly recall the FCA terminology [18].
Given a (formal) context K = (G, M, I), where G is a set of objects, M is a set of attributes, and the binary relation I ⊆ G × M specifies which objects have which attributes, the derivation operators (·)′ are defined for A ⊆ G and B ⊆ M:

A′ = {m ∈ M | ∀g ∈ A : gIm};
B′ = {g ∈ G | ∀m ∈ B : gIm}.

A formal concept of K is a pair (A, B) with A′ = B and B′ = A; A is called the extent and B the intent of the concept.

In the following sections we consider various strategies for selecting parent concepts, including the stability and support indexes from the FCA literature, confidence, as well as topological features of the lattice.
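The derivation operators recalled above can be illustrated on the animal context of table 1. The following is only a minimal sketch: the dictionary `CONTEXT` is an assumed reading of table 1, and the names `intent_of`, `extent_of` and `is_concept` are hypothetical helpers introduced for illustration, not part of any FCA tool.

```python
# Assumed reading of Table 1: each object mapped to its attributes (relation I).
CONTEXT = {
    "Lion": {"Preying", "Mammal"},
    "Finch": {"Flying", "Bird"},
    "Eagle": {"Preying", "Flying", "Bird"},
    "Hare": {"Mammal"},
    "Ostrich": {"Bird"},
}
OBJECTS = frozenset(CONTEXT)                       # G
ATTRIBUTES = frozenset().union(*CONTEXT.values())  # M


def intent_of(objects):
    """A': attributes shared by every object in A (all of M when A is empty)."""
    shared = set(ATTRIBUTES)
    for g in objects:
        shared &= CONTEXT[g]
    return frozenset(shared)


def extent_of(attributes):
    """B': objects that have every attribute in B."""
    wanted = set(attributes)
    return frozenset(g for g in OBJECTS if wanted <= CONTEXT[g])


def is_concept(objects, attributes):
    """(A, B) is a formal concept when A' = B and B' = A."""
    return (intent_of(objects) == frozenset(attributes)
            and extent_of(attributes) == frozenset(objects))


# The flying birds form a concept; {Finch} alone does not,
# because its intent {Flying, Bird} is also shared by Eagle.
print(is_concept({"Finch", "Eagle"}, {"Flying", "Bird"}))
print(is_concept({"Finch"}, {"Flying", "Bird"}))
```

Representing the context as a mapping from objects to attribute sets keeps both operators to a few lines; a boolean matrix would serve equally well for larger contexts.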
3.1. Parent Selection based on the highest Stability or Support
The stability index measures the proportion of subsets of objects of a given concept whose derivation is equal to the intent of this concept [1]. In other words, the stability indicates the probability of preserving a concept intent while removing some objects of its extent. We recall the definition of stability:

Definition 1. Let K = (G, M, I) be a formal context and (A, B) be a formal concept of K. Card is a cardinality function. The stability index of (A, B) is defined as:

\[ \sigma(A, B) = \frac{\mathrm{Card}(\{ C \subseteq A \mid C' = B \})}{2^{\mathrm{Card}(A)}} \]

Using the context in table 1 as an example, we calculate the stability of concepts 2 and 4 in order to select a parent for concept 5 (0.25 and 0.5 respectively). For concept 2, only the full extent {Lion, Eagle} among its four subsets has the intent {Preying}, giving 1/4; for concept 4, both {Finch} and {Finch, Eagle} have the intent {Flying, Bird}, giving 2/4. We keep the parent with the highest stability, so in this case we remove the edge between concepts 2 and 5. The idea behind choosing the parent concept with the highest stability is that we expect the parent concept to keep its meaning even if some of the objects or attributes are removed.

The support measure, on the other hand, is the ratio between the number of objects that have all attributes of a given attribute set and the total number of objects [4]:

Definition 2. Let B ⊂ M. The support count of the attribute set B in K is:

\[ \varphi(B) = \frac{\mathrm{Card}(B')}{\mathrm{Card}(G)} \]

The use of support as parent selection criterion may lead to trees containing concepts that have fewer specialization levels since, in general, generic concepts have higher support values than their more specific counterparts [4]. Concept stability and support measures have been widely used in FCA, and their combination has proved promising [1] for reducing the lattice.

3.2. Parent Selection Based on Shared Attributes and Objects
This approach relies on clustering parent and child concepts which share most of their attributes or objects.
Parents and children having a great number of attributes in common are grouped together, following the principle of similarity clustering and local predictability [19]. Its definition is:

Definition 3. Let ConceptParent (A, B) be such that A ⊂ G and B ⊂ M. Let ConceptChild (C, D) be such that C ⊂ G and D ⊂ M. The shared attribute index of an edge E: (C, D) → (A, B) is:

\[ \phi(E) = \frac{\mathrm{Card}(B \cap D)}{\mathrm{Card}(M)} \]

In the same animal context in table 1, the potential parent concepts 2 and 4 share the same number of objects with concept 5, but concept 4 has more attributes in common with concept 5, so it should be chosen as the unique parent of concept 5.

3.3. Parent Selection Based on Confidence
The confidence value of a concept estimates how likely an object which has an attribute set A also has an attribute set C [18]. In other words, it measures how strongly the parent attributes imply the child attributes. For instance, considering the formal context in table 1, what is the probability that a given object that is {Bird, Flying} is also {Bird, Flying, Preying}? The following definition formalizes this.

Definition 4. Let ConceptParent (A, B) be such that A ⊂ G and B ⊂ M. Let ConceptChild (C, D) be such that C ⊂ G and D ⊂ M. The confidence of an edge E: (C, D) → (A, B) is:

\[ \delta(E) = \frac{\mathrm{Card}(C)}{\mathrm{Card}(A)} \]

An advantage of this method is its consistency with the interpretation of concept lattices. Taking our animal context as an example, there is a 50% probability that an animal that is a flying bird is also a flying and preying bird. By contrast, an animal that is preying has only a 33% chance of also being a flying bird.

3.4. Tree Transformation Based on the Minimum Spanning Tree
This topology-driven approach seeks to choose the tree configuration on the graph which has the shortest paths among all concepts. A detailed definition of the algorithm can be found in [20].
In this case, we choose the parent concept which minimizes the number of steps needed to reach the top of the lattice, and therefore the most generic parents (closer to the top) will be preferred over the specialized ones. This is the only proposed strategy that takes into account the topology of a lattice, and it yields concepts with similar depth (specialization) levels.

4. Using extraction criteria to enhance Lattice and Tree Interpretation through Drawing, Sizing and Shaping
Common graph drawing techniques include the assignment of different colours, shapes and sizes to nodes and edges, according to different dimensions or properties. This approach is underused in traditional lattice visualizations, where the main visual variable used is node/link colour to reflect user selections, or node size to indicate the immediate presence of an extent or intent, as displayed in ConExp1. In our work we use these as well as other visual variables in a Hasse diagram to represent possible tree extraction criteria. This provides several benefits to the understanding of the lattice and of the extracted tree.

First, it enables users to rapidly associate the dimension or criterion in question (e.g. stability or support in Figure 2) with concepts, thus justifying the choices made during the tree extraction process. Second, visualizing different extraction criteria using various visual variables allows users to compare these criteria in order to choose the one that best fits their needs. Third, irrespective of the tree extraction process, matching visual attributes to concept attributes establishes a basis for comparison among concepts, making it possible to compare different concepts at a glance, even if they do not have a link in common, as well as to gain insights on the whole lattice itself. Finally, prominent features of the lattice like specialization and generalization can be better understood: for instance, the strength of the implications between concepts can be rendered by edge thickness.

The concept node itself can be a visual metaphor for the intent and extent. In the example of figure 2, a pie chart replaces the traditional box representation to depict the proportion of objects (blue) and attributes (yellow). In this way users can be guided in understanding and choosing criteria for extracting trees to simplify the lattice representation.

Figure 2. Animal lattice with nodes as pie charts sized by stability, and edge thickness by confidence. Pie charts indicate the ratio intent/extent of the concept.

1 ConceptExplorer. http://conexp.sourceforge.net/

5. A Qualitative Analysis of the Proposed Parent Selection Criteria
In this section we discuss a case study of a concept lattice to qualitatively examine the nature of the trees resulting from different criteria. The techniques for lattice transformation and drawing were implemented in a visual analytics tool called CUBIST Analytics and applied to a dataset2 containing 8 animals and 9 attributes, which produced a lattice with 19 concepts (figure 3). Each of the proposed measures revealed particular aspects of the lattice, as illustrated in table 2.

Table 2 a) shows the tree generated with stability as parent selection criterion. In practice, it resulted in a tree with very stable concepts, more likely to retain their subsequent children. For instance, the concept {lives on land} was the preferred parent of the concept that holds our notion of amphibians, {lives on land, lives in water}, because it is more stable than its counterparts.

The measure of shared objects was the criterion that generated the tree in table 2 b). Parent concepts sharing the most objects with the child concept were the preferred candidates. As an example, the concept {lives on land} shares more objects with {lives on land, needs chlorophyll} than the concept {needs chlorophyll} does, therefore it was the chosen parent in this case.

In table 2 c) the tree was generated with the confidence criterion, so child nodes are associated with the parent for which the confidence of the relationship is the highest among the candidates. As a result, the relation {can move, has limbs} has a stronger implication in {lives on land} than {lives on land} has for {can move, has limbs}, for example.

Table 2 d) depicts the tree generated by the minimum spanning tree criterion. In this configuration, concepts are arranged so that they are semantically closer to the rest of the lattice, and hence the tree features a more symmetric structure in comparison with the previous approaches.

Figure 3. Concept lattice of the biology domain.

2 The “Needs water to live” dataset is available at http://www.upriss.org.uk/fca/examples.html
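The extraction procedure of Section 3, whose results are compared above, can be sketched on the small animal context of table 1. This is an illustrative sketch under stated assumptions, not the CUBIST Analytics implementation: `CONTEXT` is an assumed reading of table 1, all function names are hypothetical, concepts are enumerated by brute force (only practical for tiny contexts), and ties between candidate parents are broken by an arbitrary alphabetical rule that the paper does not specify.

```python
from itertools import combinations

# Assumed reading of Table 1: each object mapped to its attributes.
CONTEXT = {
    "Lion": {"Preying", "Mammal"},
    "Finch": {"Flying", "Bird"},
    "Eagle": {"Preying", "Flying", "Bird"},
    "Hare": {"Mammal"},
    "Ostrich": {"Bird"},
}
G = frozenset(CONTEXT)
M = frozenset().union(*CONTEXT.values())


def intent_of(objects):
    """A': attributes shared by all objects in A (all of M if A is empty)."""
    shared = set(M)
    for g in objects:
        shared &= CONTEXT[g]
    return frozenset(shared)


def extent_of(attributes):
    """B': objects having every attribute in B."""
    return frozenset(g for g in G if set(attributes) <= CONTEXT[g])


def subsets(items):
    """All subsets of items as frozensets (exponential, toy data only)."""
    items = sorted(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


# Every formal concept is (B', B'') for some attribute set B.
CONCEPTS = {(extent_of(b), intent_of(extent_of(b))) for b in subsets(M)}


def stability(concept):
    """Definition 1: share of extent subsets whose derivation is the intent."""
    extent, intent = concept
    hits = sum(1 for c in subsets(extent) if intent_of(c) == intent)
    return hits / 2 ** len(extent)


def confidence(child, parent):
    """Definition 4: Card(child extent) / Card(parent extent)."""
    return len(child[0]) / len(parent[0])


def parents(child):
    """Upper covers: the smallest concepts strictly above child."""
    above = [c for c in CONCEPTS if child[0] < c[0]]
    return [p for p in above if not any(q[0] < p[0] for q in above)]


def extract_tree(score):
    """Keep, for each concept except top and bottom, its best-scoring parent."""
    bottom_extent = extent_of(M)
    tree = {}
    for child in CONCEPTS:
        # Alphabetical ordering makes tie-breaking deterministic (our choice).
        candidates = sorted(parents(child), key=lambda p: sorted(p[1]))
        if child[0] == bottom_extent or not candidates:
            continue
        tree[child] = max(candidates, key=lambda p: score(child, p))
    return tree


by_stability = extract_tree(lambda child, parent: stability(parent))
for (extent, intent), (_, parent_intent) in by_stability.items():
    print(sorted(intent), "->", sorted(parent_intent))
```

Passing the criterion as a `score(child, parent)` function lets the same loop serve stability, confidence or shared-attribute selection. With stability, the concept {Preying, Flying, Bird} is attached to {Flying, Bird} rather than {Preying}, which matches the choice of concept 4 as parent of concept 5 in Section 3.1.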
6. Discussion
Some may argue that, due to the tree construction, the present approach breaks the original lattice meaning, and therefore subsequent mathematical models based on this structure. It is worth noting, however, that only links in the lattice graph structure are removed, and the lattice structure remains semantically valid, since there is no need to take out the attributes or objects that concepts have in common with their parents.

Table 2. Trees generated from the lattice in figure 3 for each of the proposed measures. Panels: a) Stability, b) Shared objects, c) Confidence, d) MST.

The choice of parent selection criteria for tree transformation corresponds, to some extent, to a classification problem. Deciding whether a Lion is more “mammal” than it is “preying” is not always straightforward, hence we rely on measures that attempt to keep the context semantics when looking at the entire concept lattice. For instance, if more of the objects described by mammal are “closer” to Lion than those of other concepts, then it may be reasonable to choose mammal as its parent. As a general recommendation, one should use the criterion that best fits the analysis task (table 3).

In addition to the tree-extraction strategies, the use of colour, size, shape and thickness for both nodes and edges in the original lattice to represent the criteria metrics (such as stability, support, specialization or implication) can enhance the interpretation of a concept lattice, and aid users in their choice and interpretation of the created trees.

The labelling strategy for identifying concepts should be taken into account as well. Merely placing attribute and object names on concepts, as done in most FCA visualizations, may be cumbersome for the analysis of large lattices. In this case, it is recommended to represent the concept’s intent and extent with visual metaphors like the pie chart shown in figure 2.
Conclusions and Future Work
Traditional software in FCA makes little use of visualization techniques, producing poorly readable lattice graphs when the number of concepts exceeds a few dozen. In this work we have presented a transformation approach to extract trees from concept lattices, attempting to minimize both semantic and conceptual loss in favour of readability and interpretation. We have also presented ways to visually show the extraction criteria in the original lattice. This is an important step in the visual analysis of conceptual structures, as the resulting tree structures are visually easier to understand than cluttered lattice graphs. Domain experts can thus visually explore larger datasets that traditional visualizations of concept lattices cannot represent effectively.

Each of the tree construction measures proposed in our work provides particular insights valuable to different analysis tasks, identified in our paper as recommendations. In the future we plan to combine two or more criteria for parent selection with other lattice reduction techniques (e.g. iceberg lattices [4]). We also plan to conduct user experiments to understand when users want full lattice views versus tree views, which metrics for creating trees are of most interest to them and under which circumstances, and to assess whether our visual indications allow users to understand the tree extraction process.

Acknowledgements
This work is partly funded by the CUBIST project (“Combining and Uniting Business Intelligence with Semantic Technologies”), funded by the European Commission’s 7th Framework Programme of ICT, under topic 4.3: Intelligent Information Management.
Table 3. General guidelines on the usage of the proposed metrics.

Criteria | Description | Rationale | Suitable for
Stability | It measures how likely a concept is to change if some of its attributes or objects are removed. | Stable concepts are less impacted by noise and usually represent a strong correlation with real world entities (e.g. a concept that encapsulates our notion of “mammal”). | Observing real world analogies
Support | It measures the frequency of the concept itemset. | Frequent concepts are usually generic concepts since they aggregate a larger number of objects than the specialized ones. | Frequent pattern analysis
Shared objects / attributes | It represents the degree of similarity between parent and child nodes. | Concepts that share most attributes or objects should be linked together because they are similar. | Similarity analysis
Confidence | It measures how strong the implication is between a parent concept and a child concept. | Implication is one of the desired interpretations of a concept lattice. | Confidence analysis
MST | It extracts a minimum spanning tree from the concept lattice. | It looks at concepts that have some degree of similarity with all other concepts. | Topological analysis

References
[1] Kuznetsov, S.O.: Stability as an estimate of the degree of substantiation of hypotheses derived on the basis of operational similarity. Nauchn. Tekh. Inf., Ser.2 (Automat. Document. Math. Linguist.) 12 (1990) 21–29
[2] Di Battista, G.; Tamassia, R. (1988), "Algorithms for plane representation of acyclic digraphs", Theoretical Computer Science 61: 175–178.
[3] C. Ware, H. Purchase, L. Colpoys, and M. McGill. Cognitive measurements of graph aesthetics. Information Visualization, 1(2):103–110, 2002.
[4] Stumme, G., Taouil, R., Bastide, Y., Pasquier, N., and Lakhal, L. Computing iceberg concept lattices with Titanic. In Data & Knowledge Engineering, Volume 42, Issue 2, pp. 189-222, 2002.
[5] Ducrou, J., Eklund, P., and Wilson, T. An Intelligent User Interface for Browsing and Searching MPEG-7 Images Using Concept Lattices. In S. Ben Yahia et al. (Eds.): CLA 2006, LNAI 4923, pp. 1–21, Springer-Verlag Berlin Heidelberg 2008.
[6] Zaki, M.J., Hsiao, C-J. Efficient Algorithms for Mining Closed Itemsets and Their Lattice Structure. In IEEE Transactions on Knowledge and Data Engineering, Vol. 17, No. 4, IEEE Computer Society, 2005.
[7] Le Grand, B., Soto, M., Aufaure, M.-A. (2009) “Conceptual and Spatial Footprints for Complex Systems Analysis: Application to the Semantic Web”, in 20th International Conference on Database and Expert Systems Applications 2009, pp. 114-127.
[8] Blau, H., Immerman, N., and Jensen, D. A Visual Language for Querying and Updating Graphs. University of Massachusetts Amherst, Computer Science Department Tech. Report 2002-037. 2002.
[9] Cruz, I. F., Mendelzon, A. O., and Wood, P. T. A Graphical Query Language Supporting Recursion. In Proc. of the Association for Computing Machinery Special Interest Group on Management of Data, pages 323–330. ACM Press, May 1987.
[10] Consens, M., and Mendelzon, A. Hy+: a Hygraph-based query and visualization system. SIGMOD Record, 22(2):511–516, 1993.
[11] Michel Soto, Benedicte Le Grand, Marie-Aude Aufaure, "Spatial Visualisation of Conceptual Data," International Conference Information Visualisation, pp. 57-61, 2009.
[12] Eklund, Peter, Villerd, Jean. A Survey of Hybrid Representations of Concept Lattices in Conceptual Knowledge Processing. Formal Concept Analysis. Lecture Notes in Computer Science 2010, Springer Berlin/Heidelberg, pp. 296-311
[13] C. Roth, S. Obiedkov, D. G. Kourie. "Towards Concise Representation for Taxonomies of Epistemic Communities", CLA 4th Intl Conf on Concept Lattices and their Applications. 2006.
[14] George Robertson, Jock D. Mackinlay, Stuart Card. Cone Trees: Animated 3D Visualizations of Hierarchical Information. In Proceedings of the ACM CHI 91 Human Factors in Computing Systems Conference, pages 189-194, Association for Computing Machinery. 1991.
[15] P. D. Eades. Drawing free trees. Bulletin of the Institute for Combinatorics and its Applications, 5:10–36, 1992.
[16] Quang Vinh Nguyen; Mao Lin Huang. A space-optimized tree visualization, Information Visualization, 2002. INFOVIS 2002, pp. 85-92, 2002
[17] B. Johnson, B. Shneiderman: Tree maps: A Space-Filling Approach to Visualization of Hierarchical Information Structures. IEEE Visualization 1991: 284-291
[18] Ganter, B., Wille, R.: Formal Concept Analysis: Mathematical Foundations. Springer, Berlin (1999)
[19] Hannan, T., Pogel, A.: Spring-based lattice drawing highlighting conceptual similarity. In: Proceedings of the International Conference on Formal Concept Analysis, ICFCA 2006, Berlin. LNCS, vol. 3974, pp. 264–279. Springer, Heidelberg (2006)
[20] Gallager, R.G., Humblet, P.A., and Spira, P.M. 1983. A Distributed Algorithm for Minimum-Weight Spanning Trees. ACM Trans. Program. Lang. Syst. 5, 1 (Jan. 1983), 66-77.