www.software.ac.uk



Doing Science Properly in the Digital Age
10 September 2012, Digital Research 2012, Oxford
Neil Chue Hong (@npch)
N.ChueHong@software.ac.uk

Software Sustainability Institute

Four Paradigms of Research




Software is pervasive in research




Just the Nature of the problem?


Statistics courtesy of Jo Hannay et al., “How Do Scientists Develop and Use Scientific Software?”

Maintenance is not fun
Hacking new stuff is fun

Published online 13 October 2010 | Nature 467, 775-777 (2010) | doi:10.1038/467775a

The Software Sustainability Institute



A national facility for cultivating world-class research through software
• Better software enables better research
• Software reaches boundaries in its development cycle that prevent improvement, growth and adoption
• Providing the expertise and services needed to negotiate to the next stage
• Developing the policy and tools to support the community developing and using research software

Supported by EPSRC Grant EP/H043160/1

SSI Drivers and Themes



• Two key drivers which cause people to seek the SSI’s advice:
   - They want to be more productive in their research
   - They don’t want to be embarrassed by appearing worse than their peers

• Broadly, our work falls into a few key themes:
   - The role and reward of software in research
   - Recognition of software career paths
   - Developing the scientific computing / software development skill base

The Foundations of Digital Research




Research
Careers
Recognition / Reward
Skills and Capability




Software Skills for Free-Range Researchers

Seven recommendations to improve learning


1. Space learning over time
2. Interleave worked example solutions with problem-solving exercises
3. Combine graphics with verbal descriptions
4. Connect and integrate abstract and concrete representations of concepts
5. Use quizzing to promote learning
6. Help students allocate study time efficiently
7. Ask deep explanatory questions

Organizing Instruction and Study to Improve Student Learning
http://ies.ed.gov/ncee/wwc/practiceguide.aspx?sid=1

Traditional teaching



• Traditional software teaching is aimed at “caged” students
   - Full-time
   - Able to allocate study time
   - Able to space learning over time
   - Taught in groups
• But still focussed on how and not why

Picture by Farm Sanctuary

The free-range researcher



• The free-range researcher has different requirements
   - Already working at 200%
   - No common “baseline” knowledge
   - Often just one or two researchers
• Why do they care about software engineering?

Picture by Brookford Farm

No common baseline




• Bhargen Basepair, Bioinformatics RA: novice coding in Java and Perl to test groups pattern algorithms
• Fan Fullerene, Chemistry PhD: running analyses on behalf of his supervisor
• Helen Helmet, Mech Eng intern: exploratory coding based on tinkering with legacy code
• Mehrdad Mapping, Forestry student: field data collection, currently correlating using Matlab

Courtesy of Greg Wilson

Case Study: Ligand Binding


• Centre for Computational Chemistry, Bristol
   - New methods for rapid MC sampling of biomolecular systems modelled using QM/MM
   - Developed two codes: ProtoMS (F77) + Sire (C++)
   - Water-Swap Reaction Coordinate method to calculate absolute protein-ligand binding free energies
• SSI’s work is helping to scale development
   - ProtoMS and Sire are both single-developer codes
   - ASPIRE/ACQUIRE framework has multiple devs
      • Split architecture between ASPIRE (adaptive multiresolution hybrid MD simulation) and ACQUIRE (WorkPacket scheduling system with optimisation for time to result vs “green-ness”)
• http://www.siremol.org/adaptive_dynamics

Case Study: Climate Policy Modelling


• CIAS team at Tyndall Centre for Climate Change Research, University of East Anglia
   - Develop linked climate and economic models for detailed analysis
   - Their software was not ready to be used by other groups
      • One researcher/developer at UEA, several users
• SSI’s work means the software is robust enough that it can be installed and used by others
   - Enabled use of the software by the WWFN’s Climascope project and James Cook University
      • Documented software to allow extensions by contributors
      • Made it easier to maintain and back up
      • Added job scheduling to improve modelling throughput
      • New modelling framework enables new models, i.e. new science
• http://www.tyndall.ac.uk/research/cias

Case Study: textual studies


• TextVRE team at CeRCH, King’s College London
   - Developed an environment which is used to integrate various tools used in the e-Humanities textual studies lifecycle
   - Builds on the German TextGrid project, and many other existing tools
• SSI’s work means the software can be run “out of the box” – an important requirement for the researchers
   - Developed a VM image containing the TextVRE installation
      • Improve installation instructions
      • Develop tests to check each installed component
      • Improve modularisation to allow others to contribute and maintain
   - Feeding back work to TextGrid
• http://textvre.cerch.kcl.ac.uk

The modern researcher…



• … worries about:
   - Data management and analysis
   - Reproducible research
   - Scalable simulations
   - Integration of models and workflows
   - Collaboration

Picture of Otto Stern, Emilio Segre Visual Archives

Software Skills Training




[Diagram: training offerings placed on two axes, Basic to Advanced and Programming Focussed to Research Focussed. Labels: Software Carpentry; Summer Schools; HPC Short Courses; MSc in HPC / scientific computing; Advanced HPC Training; Programming 101; Programming 201; “Who fills this gap?”]

A part of the process



• Foundations of scientific computing in undergraduate courses
   - Like presentation skills
• Methods of scientific computing in postgraduate courses
   - Like statistics and ethics
• Show the benefits from the knowledge and methods of digital research
   - Not just programming 101

Supporting the system



• Centres for Doctoral Training
• Institutional support
• Professional bodies
• Career paths for software specialists
• Recognition for software reuse
• Case studies to change practice

Picture by mira66

Methods and mechanisms


• Train and support the trainers
• Open source materials
• Common structure
• Enable online participation
• Encourage local delivery
• Support local follow-up
• Shared spaces for discussion

• Spacing learning over time, asking deep questions
• Teaching software skills for the free-range researcher

Picture by fallsroad

Skills allow us to show off



• Software is a key part of the life of a researcher
• A broad base of software skills is required and must be supported by the institutions
• What skills do we teach now to provide the computational scientists of the future?
• Reward the application of these skills

More Information / Collaborate


• Today 2pm – 4.30pm: Drop-In Software Surgery

• SSI Resources
   - http://www.software.ac.uk/resources
• SSI Fellowship Programme:
   - £3000 to support your work, deadline 20th Sep
   - http://www.software.ac.uk/fellowship-programme
• SSI Open Call for Projects:
   - Collaborate with us, next deadline 30th Sep
   - http://www.software.ac.uk/open-call

• Journal of Open Research Software
   - A metajournal of record for your software and its metadata
   - http://openresearchsoftware.metajnl.com

Doing Science Properly in the Digital Age: Software Skills for Free-Range Researchers


Editor's Notes

  • #3 For thousands of years, research was empirical, using observation and experiment to describe natural phenomena. In the last few hundred years, theory developed using models and generalisations. In the last decades, computational simulation has made it possible to model complex phenomena. In the last few years, data exploration – digital research – has unified experimental data, theory, and computational simulation to analyse the vast amounts of collected and generated information.
  • #4 Images courtesy of projects from the ENGAGE programme http://www.engage.ac.uk/
  • #5 Statistics from Greg Wilson. Are academics software developers? Can research consortia manage production? Are timing constraints different? What is the role of the PI in software development management? Are the skills for software and research the same?
      - More and more researchers use computer software and hardware in their day-to-day research, not just those researchers who could be classed as being computational scientists, yet they find it increasingly difficult to exploit due to a lack of coordination ([Gob10], also observed in [Han09])
      - There is a wide variance in the levels of experience in scientific computing and software development, and hence their use of computing, which is present across all domains and levels of seniority ([Har09], also ongoing as our result with the DIRAC consortium)
      - Software is often treated as if it was disposable, rather than the subject of a £9m per year investment by EPSRC [SaaI12]
  • #6 Software reviews and refactoring, collaborations to develop your project, guidance and best practice on software development, project management, community building, publicity and more… Drawing on pool of specialists to drive the continued improvement and impact of research software developed by and for researchers. Providing services for research software users and developers. Developing research community interactions and capacity. Promoting research software best practice and capability.
  • #11 http://ies.ed.gov/ncee/wwc/practiceguide.aspx?sid=1
      1. Space learning over time. Arrange to review key elements of course content after a delay of several weeks to several months after initial presentation.
      2. Interleave worked example solutions with problem-solving exercises. Have students alternate between reading already worked solutions and trying to solve problems on their own.
      3. Combine graphics with verbal descriptions. Combine graphical presentations (e.g., graphs, figures) that illustrate key processes and procedures with verbal descriptions.
      4. Connect and integrate abstract and concrete representations of concepts. Connect and integrate abstract representations of a concept with concrete representations of the same concept.
      5. Use quizzing to promote learning. Use quizzing with active retrieval of information at all phases of the learning process to exploit the ability of retrieval directly to facilitate long-lasting memory traces.
      6. Help students allocate study time efficiently. Assist students in identifying what material they know well, and what needs further study, by teaching children how to judge what they have learned.
      7. Ask deep explanatory questions. Use instructional prompts that encourage students to pose and answer “deep-level” questions on course material. These questions enable students to respond with explanations and support deep understanding of taught material.
  • #12 http://www.flickr.com/photos/farmsanctuary1/2162602003/
  • #13 http://www.flickr.com/photos/brookfordfarm/4856788448/
  • #17 Transferring software knowledge is not easy. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2882045/ Compare fused pairs of different MR sequences modulated in red-green colour space which enhances tissue discrimination
  • #18 Transferring software knowledge is not easy. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2882045/ Compare fused pairs of different MR sequences modulated in red-green colour space which enhances tissue discrimination
  • #19 Collaboration helps sustainability
  • #20 Collaboration helps sustainability
  • #22 Update slide for surveymapper?
  • #23 Update slide for surveymapper?
  • #24 http://www.flickr.com/photos/esva/2364906768
  • #25 CPD?
  • #27 http://www.flickr.com/photos/suanie/4699242750/
  • #28 Ultimately the Software Sustainability Institute would like to see basic scientific computing taught in the same way that statistics are a fundamental part of any researcher’s toolbox. Likewise an understanding of software programming should be seen as equivalent to the understanding of presenting and disseminating your work which is expected of graduates. A basic syllabus and list of recognised teaching providers ensures there is a way of providing excellent foundation training in scientific computing via the CDTs. Specialist interdisciplinary scientific computing CDTs which concentrate on instilling the best computational, data analysis and software development techniques in their doctoral students will provide the UK with the next generation of world-class scientists.
  • #29 http://www.flickr.com/photos/21804434@N02/4643070344/
  • #30 http://www.flickr.com/photos/fallsroad/6758523
  • #35 cf. work of James Howison
  • #36 Based on study done for Cameron Neylon’s Beyond Impact workshop
  • #37 Is it more important to sustain the software that this workflow references, or the workflow itself?
  • #38 At what level do you reference, at what level do you deposit?
  • #39 Made more difficult than data because of the fluidly changing collaborative nature of software development – not just adding to the contributor pool