Similar Characteristics of Internal Software Quality
Attributes for Object-Oriented Open-Source
Software Projects
PqES- Grupo de pesquisa em Engenharia de Software
http://pesquisa.dcc.ufla.br/pqes/index.php/home-english
Universidade Federal de Lavras – UFLA – Brasil
Mariana de Azevedo Santos, Rodrigo Amador,
Heitor Costa, Paulo Henrique S. Bermejo
mariana@bsi.ufla.br toluenotnt@gmail.com
bermejo@dcc.ufla.br heitor@dcc.ufla.br
AGENDA
 Introduction
 Objective
 Background
 Software domains
 Cluster analysis
 Methodology
 Results
 Related work
 Conclusion
 Threats to validity
 Future work
INTRODUCTION
 Software quality assurance: vital component for software
development
 Organization concerns: low cost and high quality products
 Quality assurance is necessary, but costly:
 it can consume more than 50% of a project's budget
 What to do? Find efficient and cheap methods to get
information about the quality of software projects
INTRODUCTION (CONT.)
 How to predict internal software quality?
 By measuring its internal attributes (source code quality or
complexity [ISO/IEC 25010 2011])
 Why OO measures (metrics)?
 Expectation: OO code should have high quality and
maintainability (organized in classes)
 Why measure OO projects?
 Only empirical studies on the structure of real systems can
provide tangible answers about the project’s quality
OBJECTIVE
Identify similar characteristics among project structures,
considering their different domains through software metrics
 RQ: Do the software domains have structural similarities
with each other in aspects such as modularity,
abstraction, stability, complexity, and specialization?
 Assumption: software components with similar attributes
will have similar quality characteristics
SOFTWARE DOMAINS
 Software projects can be classified into different categories,
related to different application domains
 Content: input/output, determining the nature of an
application or domain
 Generic domains:
 as software complexity grows, specific domain characteristics
become unclear
 Unclear: means that domains can have similar or dissimilar
characteristics!
SOFTWARE DOMAINS (CONT.)
[Figure: SourceForge domains and the software categories of Pressman's approach]
 SourceForge domains: Science & Engineering, Development, System Administration, Audio & Video, Home & Education, Security & Utilities, Graphics, Communication, Business & Enterprise, Games
 Pressman's approach: System software, Business software, Eng. and scientific software, Web-based software (all SourceForge domains that have web-based software), AI software, Personal Computer Software
CLUSTER ANALYSIS
Find groups of objects that are similar to one another and different
from the objects in other groups
 Phases:
1. Selection of entities: 150 software projects
2. Selection of grouping attributes: metrics
3. Selection of clustering algorithm: K-means, Expectation-
Maximization, hierarchical clustering
4. Data interpretation: final classification
Macro-categories: K-means (KM) with Euclidean distance
Micro-categories: K-means (KM) with Manhattan distance
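The clustering step can be illustrated with a minimal sketch of K-means using a selectable distance function. It is not the Weka implementation used in the study. The function names, the fixed seed, and the toy (DIT, RMI) pairs are assumptions made for illustration only. With Manhattan distance the sketch updates each centroid with the component-wise median, since the median minimizes the sum of absolute deviations, while the mean is used for Euclidean distance.

```python
import random
from statistics import mean, median


def euclidean(a, b):
    """Straight-line distance between two equal-length numeric tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def kmeans(points, k, distance=euclidean, iterations=100, seed=0):
    """Return (centroids, labels) for a list of equal-length numeric tuples."""
    # Assumption: the mean is the centroid for Euclidean distance and the
    # component-wise median is the centroid for Manhattan distance.
    center = median if distance is manhattan else mean
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct starting points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: distance(p, centroids[c]))
            for p in points
        ]
        # Update step: recompute each centroid from its members.
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(tuple(center(col) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:  # converged
            break
        centroids = updated
    return centroids, labels


# Hypothetical (DIT, RMI) averages for six projects, not data from the study.
projects = [(1.2, 0.30), (1.3, 0.35), (1.1, 0.32),
            (3.0, 0.80), (3.2, 0.85), (2.9, 0.78)]
_, macro_labels = kmeans(projects, k=2, distance=euclidean)
_, micro_labels = kmeans(projects, k=2, distance=manhattan)
```

On this toy data both distance functions separate the first three projects from the last three. On real metric data the two can produce different partitions, which is one reason to compare them.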
METHODOLOGY
 Sample characterization:
 LOC: 12,178,587
 Number of classes: 69,334
 Repositories: GitHub and SourceForge
METHODOLOGY (CONT.)
 Tools for metrics extraction: Eclipse plugins
 Tool for data analysis:
 Weka
 Parameters: metrics
 Data analysis: hypotheses (H0 and H1) about relationships between
software quality measures
TOOL              MEASURES
Metrics           VG, WMC, NOVM, NOC, DIT, SIX, LCOM, CA, CE, RMI, RMA, NC, NOM, NOA
Vizz Maintenance  CBO, RFC, MPC, DAC, TCC, LOC
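As a reading aid for two of the measures listed above, the sketch below computes instability and abstractness, assuming that RMI and RMA follow Robert C. Martin's package metrics (instability = CE / (CA + CE), abstractness = abstract types / total types). The function names and the sample counts are hypothetical, and the slides do not state how the plugin treats empty packages, so returning 0.0 in that case is an assumption.

```python
def instability(ca, ce):
    """RMI under Martin's definition: CE / (CA + CE).

    ca: afferent couplings (outside classes that depend on the package)
    ce: efferent couplings (outside classes the package depends on)
    """
    total = ca + ce
    return ce / total if total else 0.0  # assumption: 0.0 when uncoupled


def abstractness(abstract_types, total_types):
    """RMA under Martin's definition: abstract types / all types."""
    return abstract_types / total_types if total_types else 0.0


# Hypothetical package: 3 incoming and 1 outgoing dependency,
# 2 abstract classes or interfaces out of 8 types.
rmi = instability(ca=3, ce=1)
rma = abstractness(abstract_types=2, total_types=8)
```

A value of RMI near 0 indicates a package that many others depend on and that is therefore costly to change, which matches the reading of "more stability" used in the results.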
METHODOLOGY (CONT.)
 What is expected? Example:
For H-Depth and descendants:
H0: Software domains whose classes are located deeper in the inheritance
hierarchy (less abstract) differ from software domains whose classes are
less deep in the inheritance hierarchy (more abstract)
H1:
 The selected metrics are not capable of identifying characteristics among
domains in relation to inheritance and abstraction. This means that the
behavior regarding inheritance and abstraction is similar for all domains.
 The hypothesis is partially validated if a pair of metrics presents the
expected behavior.
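To make "depth" and "descendants" concrete, here is a small sketch that derives DIT and NOC from a parent map. It assumes the usual Chidamber and Kemerer reading (DIT as the number of ancestors, NOC as the number of immediate subclasses) and counts a root class as depth 0. Tools differ on that last convention, and the class names are invented for illustration.

```python
from collections import Counter


def depth_of_inheritance(cls, parent_of):
    """DIT: number of ancestors between cls and the root (root = 0)."""
    depth = 0
    while parent_of.get(cls) is not None:
        cls = parent_of[cls]
        depth += 1
    return depth


def number_of_children(parent_of):
    """NOC: count of immediate subclasses for every class that has any."""
    return Counter(p for p in parent_of.values() if p is not None)


# Hypothetical hierarchy: class name -> direct superclass (None for a root).
hierarchy = {
    "Shape": None,
    "Polygon": "Shape",
    "Circle": "Shape",
    "Square": "Polygon",
}
dit = {name: depth_of_inheritance(name, hierarchy) for name in hierarchy}
noc = number_of_children(hierarchy)
```

Averaging these per-class values over a project gives one point per project, which is the kind of input a clustering step can work on.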
RESULTS
 For micro-categories and macro-categories:
 H1-Coupling and cohesion, H2-Complexity and inheritance, H5-
Complexity and overriding were not validated
 H3-Depth and descendants (DIT and RMI), H4-Complexity and size
(WMC and LOC) were partially validated
 H6-Abstraction and stability (RMA and RMI) was fully validated
 Macro-categories: despite showing a smaller error than the same
analysis for micro-categories, these solutions are less
heterogeneous
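The slides do not define the error being compared. One common choice for K-means, sketched here as an assumption, is the within-cluster sum of squared errors: the squared distance from each point to the centroid of its cluster, summed over all points. The function name and the toy values are hypothetical.

```python
def within_cluster_sse(points, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(
        sum((x - c) ** 2 for x, c in zip(point, centroids[label]))
        for point, label in zip(points, labels)
    )


# Toy example: two points around centroid (1, 0), one point on centroid (10, 0).
toy_points = [(0, 0), (2, 0), (10, 0)]
toy_labels = [0, 0, 1]
toy_centroids = [(1, 0), (10, 0)]
sse = within_cluster_sse(toy_points, toy_labels, toy_centroids)
```

A lower value means tighter clusters, but it says nothing about how distinct the clusters are from each other, so a smaller error can coexist with less differentiated groups.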
RESULTS: MICRO-CATEGORIES
 DIT (axis X) x RMI (axis Y):
Cluster 0
 Seem to have a more balanced relationship between abstraction and inheritance
 Major domains: Science & Engineering (SE), Business & Enterprise (BE)
Cluster 1
 Tend to have a higher average number of descendants
 Major domains: Security & Utilities (SU), Graphics (GPH)
Cluster 2
 Tend to have few abstract classes (more stability) and few descendants
 Major domains: Development (D), Communication (C)
Cluster 3
 Have more stable classes than the other clusters do
 Major domains: Games (G) and System Administration (SA)
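One plausible way to arrive at a list of major domains per cluster is to count how often each domain label occurs among the projects assigned to that cluster. The sketch below assumes that reading; the slides do not say how the major domains were chosen. The function name and the toy assignments are hypothetical and reuse only the domain abbreviations from the slides.

```python
from collections import Counter, defaultdict


def major_domains(labels, domains, top=1):
    """Map each cluster label to its `top` most frequent domains."""
    by_cluster = defaultdict(Counter)
    for label, domain in zip(labels, domains):
        by_cluster[label][domain] += 1
    return {
        cluster: [domain for domain, _ in counts.most_common(top)]
        for cluster, counts in sorted(by_cluster.items())
    }


# Toy data, not results from the study: cluster label and domain per project.
toy_cluster_labels = [0, 0, 0, 1, 1, 1]
toy_project_domains = ["SE", "SE", "G", "G", "G", "SE"]
majors = major_domains(toy_cluster_labels, toy_project_domains)
```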
RESULTS: MACRO-CATEGORIES
 DIT (axis X) x RMI (axis Y):
Cluster 0
 Tend to have few abstract classes (more stability) and few descendants
 Major domains: PCS, SS, AIS or Audio & Video, Communication, Graphics, Development, Games...
Cluster 1
 Tend to have a higher average number of descendants
 Major domains: PCS, SS, BS or Audio & Video, Communication, Graphics, Development, Business & Enterprise...
Cluster 2
 Have more stable classes than the other clusters do
 Major domains: PCS, SS, AIS, ESS or Audio & Video, Communication, Graphics, Development, Games, Science & Engineering...
Cluster 3
 Seem to have a more balanced relationship between abstraction and inheritance
 Major domains: PCS, BS, AIS, ESS or Audio & Video, Communication, Graphics, Business & Enterprise, Games, Science & Engineering...
RELATED WORK
 [Romano; Pinzger 2011]: specific metrics for Java interfaces have a strong
correlation with changes in abstract and concrete classes of OO projects
 [Malviya; Yadav 2012]: clustering to identify OO sustainable systems
 [Dallal 2013]: internal quality attributes (size, cohesion, and
coupling) x external quality attribute (maintainability in classes)
 [Souza; Maia 2013]: reference values for a set of coupling metrics,
considering software domains
 Our study:
 Proposes a model that explains the similarity among domains in OO
internal software quality
 Provides a more immediate view of the trends and characteristics
of internal Java software quality
THREATS TO VALIDITY
 Construct validity: other measures could be relevant, and
clustering alone may not be sufficient to fully validate and
detect the characteristics inherent in object-oriented
software projects developed in Java
 Internal validity: the study does not provide in-depth technical
details of the projects (for example, from code inspections)
 External validity: the study analyzes only OO software
developed in Java
CONCLUSIONS
 Some specific domains tend to have similarities relating to four
properties (abstraction, stability, complexity, and specialization).
 In general:
 Systems in the SU and GPH domains tend to have few descendants and
few abstract classes
 Software in the D and C domains can have similar characteristics in
inheritance and abstraction, with a higher average number of descendants
 Software in the SE and BE domains tends to make good use of inheritance
 Software in the G and SA domains can have more stable classes and be
harder to maintain because of its lack of flexibility to change
CONCLUSIONS (CONT.)
 Contribution to SE:
 Provide observations of structural aspects of OO development, such as
specialization, stability, abstraction, and complexity
 Metrics such as WMC, DIT, LOC, RMI, and RMA are relevant to the
characterization of Java internal software quality (similarity between
domains)
 For software developers, the study shows that some domains (for
example, G and SA) tend to have the same characteristics and that
more effort in these aspects is necessary so that systems can
continue to be maintainable
FUTURE WORK
 We suggest:
 Repeating the analyses on a larger sample of software
 The use of other repositories of available projects
 The use of other metrics to obtain new results on
characteristics that have not yet been explored
Thanks!
REFERENCES
 ISO/IEC 25010 (2011) Systems and Software Engineering - Systems and
Software Quality Requirements and Evaluation - System and Software
Quality Models.
 Dallal, J. A. (2013) Object-Oriented Class Maintainability Prediction Using
Internal Quality Attributes. In: Inf. Software Technology 55, 11. pp. 2028-2048.
 Malviya, A. K.; Yadav, V. K. (2012) Maintenance Activities in Object
Oriented Software Systems Using K-Means Clustering Technique: A
Review. In: Sixth International Conference on Software Engineering, pp. 1-5.
 Romano, D.; Pinzger, M. (2011) Using Source Code Metrics to Predict
Change-Prone Java Interfaces. In: International Conference on Software
Maintenance, pp. 303-312.
 Souza, L. B. L. de; Maia, M. de A. (2013) Do Software Categories Impact
Coupling Metrics? In: Working Conference on Mining Software
Repositories. pp. 217-220.