Testing Plug-in Architectures

Arie van Deursen
@avandeursen

Joint work with Michaela Greiler
          Margaret-Anne Storey
The TU Delft Software Engineering Research Group

Education:
• Programming, software engineering
• MSc, BSc projects

Research:
• Software architecture
• Software testing
• Repository mining
• Collaboration
• Services
• Model-driven engineering
• End-user programming

2
Crawljax: Automated Testing of Ajax Applications

Ali Mesbah, Arie van Deursen, Danny Roest: Invariant-Based Automatic Testing of Modern Web Applications. IEEE Trans. Software Eng. 38(1): 35-53 (2012)

3
Plug-in Architectures

Create a series of tailored products by combining, configuring, & extending plug-ins

4
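A plug-in architecture in miniature can be sketched in a few lines of Python; all names here (Plugin, Product, the example plug-ins) are hypothetical illustrations, not Eclipse API:

```python
# Minimal sketch of a plug-in architecture: a tailored product is one
# particular combination of plug-ins, each extending a shared contract.

class Plugin:
    """Contract every plug-in implements."""
    name = "base"

    def run(self, text: str) -> str:
        return text

class TrimPlugin(Plugin):
    name = "trim"

    def run(self, text: str) -> str:
        return text.strip()

class UpperCasePlugin(Plugin):
    name = "upper"

    def run(self, text: str) -> str:
        return text.upper()

class Product:
    """A product = an ordered combination of configured plug-ins."""

    def __init__(self, plugins):
        self.plugins = plugins

    def process(self, text: str) -> str:
        for plugin in self.plugins:   # each plug-in extends the pipeline
            text = plugin.run(text)
        return text

# Two different products assembled from the same plug-in set:
editor = Product([TrimPlugin(), UpperCasePlugin()])
viewer = Product([TrimPlugin()])
print(editor.process("  hello "))  # -> HELLO
print(viewer.process("  hello "))  # -> hello
```

The test implication follows directly: every distinct combination (editor, viewer, …) is in principle a separate product to validate.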
5
WordPress

 “The number one reason people give us
for not upgrading to the latest version of
WordPress is fear that their plugins won’t
            be compatible.”



  http://wordpress.org/news/2009/10/plugin-compatibility-beta/
                                                                 6
http://www.eclipsesource.com/blogs/author/irbull/   7
Eclipse-based Software

Eclipse Plug-in Architecture




                               8
Underneath: OSGi
• Routers, Modems, Gateways, Control Panels, Phones,
  Cars, Trains, Trucks, Healthcare devices…




                                                   9
One Product = Many Plug-ins
   Set of Plug-ins = Many Products



What are the test implications?

     How should we test
    plug-in architectures?


                                     10
[ Plug-in Testing Issues to Consider ]

Fault model?
• Interacting plug-ins
• Plug-in configurations
• Plug-in versions
• Plug-in extensions
• Resource usage
• …

Test approaches?
• Combinatorial
• Multi-version
• As part of product line engineering
• Search-based through interaction space
• …

See related work (slide 5) of #sast2012:
Teste de Linha de Produto de Software Baseado em Mutação de Variabilidades. Marcos Quinaia (UNICENTRO), Johnny Ferreira (UFPR), Silvia Vergilio (UFPR).

11
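One of the approaches above, combinatorial testing, can be illustrated with a short Python sketch (the plug-in names are hypothetical): with n optional plug-ins there are 2^n install combinations, but a greedy covering-array construction finds a much smaller suite in which every pair of plug-in on/off settings still occurs at least once.

```python
from itertools import combinations, product

plugins = ["cvs", "git", "mylyn", "pde"]   # hypothetical plug-in set

def pairs_of(cfg):
    """All (pluginA, pluginB, onA, onB) interactions one configuration covers."""
    return {(a, b, cfg[a], cfg[b]) for a, b in combinations(plugins, 2)}

# Exhaustive space: every installed/not-installed combination (2**4 = 16).
all_cfgs = [dict(zip(plugins, bits))
            for bits in product([True, False], repeat=len(plugins))]
needed = set().union(*(pairs_of(c) for c in all_cfgs))

# Greedy pairwise selection: repeatedly pick the configuration that
# covers the most still-uncovered pairwise interactions.
chosen, uncovered = [], set(needed)
while uncovered:
    best = max(all_cfgs, key=lambda c: len(uncovered & pairs_of(c)))
    chosen.append(best)
    uncovered -= pairs_of(best)

print(f"{len(chosen)} pairwise configurations instead of {len(all_cfgs)}")
```

Pairwise coverage does not rule out faults triggered by three-way interactions, but it shrinks the suite dramatically while still exercising every pair of plug-ins together.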
What do Eclipsers
Think about Testing?




                       12
Research Questions
1. What testing practices are prevalent in the Eclipse
   community?

2. Does the plug-in nature of Eclipse have an impact on
   software testing?

3. Why are certain practices adopted, and why are
   others not adopted?

4. Are there additional compensation strategies used to
   support testing of plug-ins?

                                                          13
25 Interviews




                14
Grounded Theory

• Systematic procedure to discover theory from (qualitative) data

Key techniques:
• Coding (code, concept, category)
• Memoing
• Theoretical sensitivity
• Theoretical sampling
• Constant comparison
• Saturation

S. Adolph, W. Hall, Ph. Kruchten. Using Grounded Theory to study the experience of software development. Emp. Sw. Eng., Jan. 2011.
B. Glaser and J. Holton. Remodeling grounded theory. Forum Qualitative Res., 2004.

15
What’s a Theory?

      “A set of well-developed categories
             (e.g. themes, concepts)
      that are systematically inter-related
      through statements of relationships
       to form a framework that explains
some relevant social, psychological, educational,
             or other phenomenon.”



      Corbin & Strauss, Basics of Qualitative Research, 1998
                                                                16
P       Code                         Comment
P1, P4  The best                     Unit tests are the best
P1      Legacy code                  Legacy code can be problematic to test
P1      Fast, easy to execute        They are fast, and easy to execute
P1      Misused as integration test  Unit tests grow into PDE tests
P1      Refactoring easy             Refactoring is easier when you have a well-designed unit test
P4      More important than          Are more important
        integration test
P4      Limited applicability        Leave parts out that cannot be tested by unit tests; remote calls
P4      High amounts                 You can have many
P4      Frequent execution           Can be executed often
P8      Only complex stuff           Only for complex stuff like state machines
P13     Comfort for refactoring      It gives you a certain level of comfort to know that when you
                                     make a change and you break something, that would be apparent
                                     in your test case
P16     Limited applicability        Concurrency
P20     Limited applicability        For code within browser

17
Resulting Theory
Theory comprises four main categories:

1.   Testing practices used by Eclipsers
2.   Impact of the plug-in characteristic
3.   Factors affecting test practice adoption
4.   The role of the community

Also 12 concepts, 100 codes, 100 memos

     Full codes & quotes: Technical Report TUD-SERG-2011-011   18
Triangulation

1.   Resonance @ EclipseCon




2.   Survey among 151 developers



                                   19
Practices: Unit testing is popular

“Unit testing is where you find the most bugs” (P18)

“Ultimately, unit tests are our best friends” (P3)

“At least 70% of our test effort is spent on unit testing.” (P14)

20
Other forms of testing are less popular

“We think that with a high test coverage through unit tests, integration tests are not necessary.” (P20)

“The Capture and Replay nature of QF-tests was too rigid when the system was evolving” (P18)

“We haven’t been 100% satisfied with capture-replay: too much is captured.” (P14)

21
Findings 1: Practices
• Common practice to have
  no separate test teams

• Eclipsers are proud of their unit testing

• Eclipsers tend to dislike system, integration,
  UI, and acceptance testing
  – Substantially less automation

                                                   22
Automated or Manual?




                       23
Cross plug-in testing is optional

“We do bug-driven cross plug-in testing” (P18)

“We have no automated tests for cross plug-in testing, but we do manual testing.” (P19)

24
Version testing is minimal

“A lot of people put version ranges in their bundle dependencies, and they say they can run with 3.3 up to version 4.0 of the platform.”

“But I’m willing to bet that 99% of the people do not test that their stuff works.” (P13)

25
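In OSGi/Eclipse, such a version range is declared in the bundle manifest. A minimal example: a dependency claiming to run with platform version 3.3 up to (but excluding) 4.0 looks like this in MANIFEST.MF:

```
Require-Bundle: org.eclipse.core.runtime;bundle-version="[3.3.0,4.0.0)"
```

The square bracket includes 3.3.0 and the parenthesis excludes 4.0.0 — exactly the kind of compatibility claim that, per the quote above, is rarely backed by tests across the whole range.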
Findings 2: Plug-ins
• Testing deferred to ‘application engineering’
  – No special effort during ‘product line engineering’


• Integration testing on demand:
  – Bug occurring in the field


• No test effort aimed at integration faults per se
  – Versions, configurations, interactions, …

                                                      26
Testing combinations or versions?

43% don’t test integration of different products;
only 3% test this thoroughly.

55% don’t test for platform versions;
only 4% test this thoroughly.

63% don’t test for dependency versions;
only 10% test this thoroughly.

27
Barriers

“It’s complicated to integrate JUnit with the build. Another framework? I didn’t want to take the trouble” (P4)

“And you never know, once you write a good test, then it will become obsolete with the next version of Eclipse” (P7)

“Especially for plug-ins, we would need some best practices.” (P21)

28
Findings 3: Barriers
•   Responsibility for integration unclear
•   Requirements for composite unclear
•   Lack of ownership
•   Insufficient plug-in knowledge
•   Set-up of test infrastructure too complicated
•   Test execution too long
•   Poor testability of the platform

                                                    29
Community Testing (I)
       “Testing is done by the user community. […]
  We have more than 10,000 installations per month.
If there should be a bug it gets reported immediately.”


                          P12           P17




      “The community helps to test the system for
       different operating systems, and versions.
             They are very active with that.”
                                                          30
Community Testing (II)
      “I would say the majority of the bug reports
            come from the community. […]
      We have accepted more than 800 patches.”


         P20
                                  P19




“We make all infrastructure available, […], so that somebody who
  writes a patch has the opportunity to run the same tests […]”

                                                               31
Downstream Testing
     “We’re a framework. If the user downloads a
    new version and lets his application run with it,
           then this is already like a test.”


         P20
                                  P13



“They have extensive unit tests, and so I am quite sure that when I
break something, somebody downstream very rapidly notices and
                     reports the problem.”
                                                                 32
Findings 4: “Compensation Strategies”


• Community plays key role in
  finding and reporting issues

• Downstream testing (manual and automatic)
  provides additional tests of the upstream framework

• Open test infrastructure facilitates patching

                                                  33
Summary: Findings
1. (Automated) unit testing is widely adopted;
   Integration, system, UI and acceptance
   testing are much less automated

2. The plug-in nature has little direct impact on
   test practices

3. Barriers to adopt techniques include unclear
   ownership, accountability, and test effort &
   execution time

4. Limited integration testing is compensated
   by community
                                                    34
Scope
• Beyond the participants:
  – Challenged results in survey among 150 Eclipsers

• Beyond Eclipse:
  – Open source, developer centric, plug-in
    architecture, services, …

• Beyond the people:
  – Repository mining, code analysis

                                                       35
(Eclipse) Implications
1. Community tolerance for failures determines
   (integration) test effort

2. Need to strengthen community

3. Need to strengthen plug-in
   architecture with “self testing” capabilities

4. Test innovations must address adoption barriers

                                                   36
37
Issue of concern     Positivist View              Interpretive View
Representativeness   Objectivity: free from       Confirmability: conclusions
of findings          researcher bias              depend on subjects, not on
                                                  the researcher
Reproducibility      Reliability: findings        Auditability: process is
                     can be replicated            consistent & stable over time
Rigor of method      Internal validity:           Credibility: findings relevant
                     statistically significant    and credible to people we study
Generalizability     External validity:           Transferability: how far can
of findings          domain of generalizability   findings be transferred to
                                                  other contexts?

“Ameaças à validade” (threats to validity).

38
Impact for Your (SAST 2012) Paper?
1.  Empirical Studies in Software Testing
2.  Experience in Organizing Test Teams
3.  Teste de Linha de Produto de Software Baseado em Mutação de
    Variabilidades
4. Execução Determinística de Programas Concorrentes Durante o Teste de
    Mutação
5. Uso de análise de mutantes e testes baseados em modelos: um estudo
    exploratório
6. Framework para Teste de Software Automático nas Nuvens
7. Usando o SilkTest para automatizar testes: um Relato de Experiência
8. Automating Test Case Creation and Execution for Embedded Real-time
    Systems
9. Controlando a Diversidade e a Quantidade de Casos de Teste na Geração
    Automática a partir de Modelos com Loop
10. Geração Aleatória de Dados para Programas Orientados a Objetos


                                                                       39
Conclusions (1)
           Increasing Dynamism
• We must accept that many deployed
  compositions are in fact untested.

• As the level of dynamism grows, we need to
  move from a priori testing to “in vivo” testing

               Rethink what your test approach
               might mean in a run time setting.
                                                    40
Conclusions (2)
  In Vivo / Runtime / Online Testing

• Continuous assertion & health checking
• Active stimuli upon configuration change
• Learn relevant stimuli from past behavior

First steps in research, little adoption so far

[ DevOps is just the beginning ]
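The three ingredients above can be sketched as code. This is an assumed design in Python, not an existing framework: health checks are registered as runtime assertions and re-run whenever the plug-in configuration changes.

```python
# Sketch of "in vivo" testing: health checks as runtime assertions,
# re-validated on every configuration change (assumed design).

class InVivoTester:
    def __init__(self):
        self.checks = []  # list of (name, zero-argument predicate)

    def register(self, name, predicate):
        self.checks.append((name, predicate))

    def run_all(self):
        """Run every health check; collect failures rather than crash."""
        failures = []
        for name, predicate in self.checks:
            try:
                ok = predicate()
            except Exception:
                ok = False  # a crashing check counts as a failure
            if not ok:
                failures.append(name)
        return failures

    def on_configuration_change(self):
        # Active stimuli upon configuration change: re-validate the composition.
        return self.run_all()

installed = {"core": "4.2", "editor": "1.1"}  # hypothetical deployed plug-ins
tester = InVivoTester()
tester.register("core-present", lambda: "core" in installed)
tester.register("editor-compatible", lambda: installed["editor"] >= "1.0")
print(tester.on_configuration_change())  # -> [] (healthy composition)
```

A real system would additionally learn which stimuli to fire from past behavior; this sketch only shows the continuous-assertion part.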

                                                  41
Conclusions (3):
 Involving the Tester in Your Research

• To understand the problem to begin with
  – Qualitative study to start research


• To evaluate your results



• More painful, … and more rewarding!
                                            42
Summary
• A Grounded Theory study is a great way to
  understand what people are struggling
  with.

• Integration testing in plug-in architectures
  is hard and possibly costly

• A cooperative community is invaluable

• We must and can strengthen post-deployment
  testing and reporting
                                                 43
Further Reading

Michaela Greiler, Arie van Deursen &
 Margaret-Anne Storey.

Test Confessions: A Study of Testing
   Practices for Plug-in Systems

Proceedings of the 34th International Conference
   on Software Engineering (ICSE 2012), IEEE, 2012.

Full report: TUD-SERG-2011-010


                                         44
