Programming by a Sample: Rapidly Creating Web Applications with d.mix
Björn Hartmann, Leslie Wu, Kevin Collins, Scott R. Klemmer
UIST · 10 October 2007
How would you share UIST highlights with colleagues back home?
How would you retrieve the same data programmatically?
It’s easy to understand the sites, but not the services.
d.mix: Programming by a Sample
Web sites and their APIs are correlated… (site ≈ service) …let’s leverage that fact!
[flickr.com]
Give me the code for this!
To retrieve this image, use:
flickr.photos.getInfo(user_id = '73866493@N00', photo_id = '3208312')
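As a rough illustration of what such a call amounts to over HTTP, the sketch below builds a request URL for `flickr.photos.getInfo`. It assumes Flickr's REST endpoint at `https://www.flickr.com/services/rest/` and uses a placeholder API key. The helper name `flickr_request_url` is hypothetical and is not part of d.mix.

```ruby
require "uri"

# Assumed REST endpoint for the Flickr API (an assumption, not taken from the slides).
FLICKR_REST_ENDPOINT = "https://www.flickr.com/services/rest/"

# Hypothetical helper: builds the URL for one Flickr REST method call.
# It only assembles the URL; it does not send a request.
def flickr_request_url(method, api_key, params = {})
  query = { "method" => method, "api_key" => api_key }.merge(params)
  "#{FLICKR_REST_ENDPOINT}?#{URI.encode_www_form(query)}"
end

# "YOUR_API_KEY" is a placeholder; the photo id is the one shown on the slide.
url = flickr_request_url("flickr.photos.getInfo", "YOUR_API_KEY", "photo_id" => "3208312")
puts url
```

Because the arguments are ordinary parameters, the same helper can produce a call for a different photo by changing only the `photo_id` value.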
d.mix active wiki
Source code generated by d.mix is executed in the active wiki
(Figure label: Rendered Page)
Scenario
d.mix users: Lead Users, Web Developers, End Users
Lead users: site owners or lead users define mappings between sites and services (once).
Web developers create d.mix applications.
End users run (and tailor) applications in the d.mix wiki.
d.mix Proxy Architecture
Diagram labels: Original Page; Proxy Server; Site-to-Service Map (hosted on d.mix wiki); Rewritten page with API annotations
Authoring the Site-to-Service Map… Without Help From the Site Owner
1	Map URL to Page Type
2	Identify visual elements in page to annotate (using XPath/CSS selectors)
3	Extract arguments for service calls from page source
4	Bind arguments to web service code snippet
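The four steps above can be sketched as a single map entry. This is a minimal illustration, not d.mix's actual data model: the `MapEntry` struct, its field names, the sample markup, and the regular expressions are all assumptions, and a plain regex stands in for the XPath/CSS selectors that d.mix uses.

```ruby
# Hypothetical site-to-service map entry; the field names are illustrative only.
MapEntry = Struct.new(:page_type, :url_pattern, :element_pattern, :snippet_template)

TAG_SEARCH = MapEntry.new(
  :photo_page,                              # step 1: page type this entry applies to
  %r{flickr\.com/photos/[^/]+/\d+},         # step 1: URL pattern for that page type
  %r{<a class="Plain"[^>]*>([^<]+)</a>},    # steps 2-3: element to annotate; the capture is the argument
  'flickr.photos.search(tags = "%s")'       # step 4: snippet the argument is bound into
)

# Returns one code snippet per matching element, or [] when the URL is a different page type.
def snippets_for(entry, url, html)
  return [] unless url =~ entry.url_pattern
  html.scan(entry.element_pattern).flatten.map { |arg| format(entry.snippet_template, arg) }
end

# Made-up markup that loosely resembles a tag list.
html = '<div class="TagList"><a class="Plain" href="#">yosemite</a> <a class="Plain" href="#">climbing</a></div>'
snippets = snippets_for(TAG_SEARCH, "http://flickr.com/photos/someuser/3208312/", html)
puts snippets
```

Keeping the snippet as a template with a slot for the extracted argument is what lets one entry produce a different working call for every tag on the page.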
Why Not Just Scrape?
Scraping at design time rather than at run time minimizes brittleness
Web service calls can be parameterized
Scraping at run time can lead to lock-out
Page URL: flickr.com/photos/<username>/<photoid>/…
Regular Expression: %r{flickr.com//?photos/[^/]+/\d+/?&script}
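To make the URL-to-page-type step concrete, here is a small sketch that classifies URLs against an ordered table of patterns. The page-type names and the patterns are assumptions written for this example; they are not the expressions d.mix ships with.

```ruby
# Hypothetical page-type table. More specific patterns come first because
# the first match wins.
PAGE_TYPES = [
  [:tag_page,     %r{\Ahttps?://(www\.)?flickr\.com/photos/tags/[^/]+/?\z}],
  [:photo_page,   %r{\Ahttps?://(www\.)?flickr\.com/photos/[^/]+/\d+/?\z}],
  [:photo_stream, %r{\Ahttps?://(www\.)?flickr\.com/photos/[^/]+/?\z}]
]

# Returns the first page type whose pattern matches the URL, or :unknown.
def page_type(url)
  match = PAGE_TYPES.find { |_type, pattern| url =~ pattern }
  match ? match.first : :unknown
end

puts page_type("http://flickr.com/photos/someuser/3208312/")
puts page_type("http://www.flickr.com/photos/someuser/")
puts page_type("http://example.com/")
```

An ordered list is used instead of a hash so that overlapping patterns resolve predictably.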
Photo Title: (doc/"#title_div")
Image URL: (doc/"div.photoImgDiv")
Tag Search: (doc/"div.TagList")
Photo Title:
flickr.photos.getInfo(photo_id = "298655528").title

Image URL:
info = flickr.photos.getInfo(photo_id = "298655528")
URL = "http://farm" + info.farm-id + ".static.flickr.com/" + info.server-id + "/" + info.attributes["id"] + "_" + info.secret + ".jpg"

Tag Search:
Extracted from page source, within <div> for all tags:
tag = div.at("a.Plain").inner_html
flickr.photos.search(tags = "yosemite ...")
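The image URL on the slide is assembled from four fields of the `getInfo` result. The sketch below shows the same concatenation as a runnable function. `PhotoInfo` is a hypothetical stand-in for the API result, and every field value except the photo id is made up.

```ruby
# Hypothetical stand-in for the fields of a flickr.photos.getInfo result.
PhotoInfo = Struct.new(:farm_id, :server_id, :id, :secret)

# Assembles the static image URL from its parts, following the pattern on the slide:
# http://farm{farm-id}.static.flickr.com/{server-id}/{id}_{secret}.jpg
def static_image_url(info)
  "http://farm#{info.farm_id}.static.flickr.com/#{info.server_id}/#{info.id}_#{info.secret}.jpg"
end

# Farm, server, and secret are invented values for illustration.
info = PhotoInfo.new(1, 119, "298655528", "abc123def4")
puts static_image_url(info)
```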
Putting it all together…
def self.annotate_photopage_tags(doc)
  (doc/"div").reject{…}.each do |div|
    tag = div.at("a.Plain").inner_html                # Extract tag name
    src = generate_tag_search_source(...)             # Instantiate source example
    doc.at("body").inner_html += make_context_menu(   # Add annotation to original page
      div.at("a.Plain"),
      ["Images matching tag #{tag}"],
      [src])
  end
end
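The slide code depends on an HTML parser (the `doc/"div"` and `at` calls) and on d.mix helpers whose bodies are elided. For readers who want something they can run, here is a stdlib-only sketch of the same three steps. The regex-based extraction, the helper bodies, the menu markup, and the sample page are all assumptions standing in for the elided parts; this is not d.mix's implementation, and it does no HTML escaping.

```ruby
# Hypothetical stand-in for the elided helper: builds the example call for one tag.
def generate_tag_search_source(tag)
  %(flickr.photos.search(tags = "#{tag}"))
end

# Hypothetical stand-in: renders a menu whose entries carry the generated source.
def make_context_menu(target, labels, sources)
  items = labels.zip(sources).map { |label, src| %(<li data-src='#{src}'>#{label}</li>) }
  %(<ul class="dmix-menu" data-target="#{target}">#{items.join}</ul>)
end

def annotate_photopage_tags(html)
  # Step 1: extract tag names (a regex stands in for the selector on the slide).
  tags = html.scan(%r{<a class="Plain"[^>]*>([^<]+)</a>}).flatten
  menus = tags.map do |tag|
    src = generate_tag_search_source(tag)                          # Step 2: instantiate source example
    make_context_menu(tag, ["Images matching tag #{tag}"], [src])
  end
  html.sub("</body>") { "#{menus.join}</body>" }                   # Step 3: add annotation to the page
end

# Made-up page fragment for illustration.
page = '<html><body><div class="TagList"><a class="Plain" href="/photos/tags/yosemite">yosemite</a></div></body></html>'
annotated = annotate_photopage_tags(page)
puts annotated
```

The block form of `sub` is used so that characters in the generated source are inserted literally and are not interpreted as backreferences.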
Dataflow Summary
Diagram labels: d.mix design time adds annotation (Web site) and sends code to d.mix run time; d.mix run time invokes the Web service
Active Wiki
Wiki editor provides syntax highlighting for Ruby script
Sandboxed execution runs script with limited capabilities
Libraries facilitate invoking web services and manipulating results
Beyond the Desktop Browser
What we borrowed and what we wrote
Prototype Site-to-Service Library
First-use Lab Study (n=8)
All participants had some programming experience and knew HTML
Four had no experience with web APIs
75-minute sessions: demonstration, warm-up, two design tasks
Lessons Learned
How do I know what I can sample?
Lessons Learned
Offer multiple ways to sample information.
Sample from link to content
Sample content directly
Limitations
Proxying the logged-in web is challenging
Limitations
How can one sample APIs that provide interactive widgets instead of data?
Related Work
End-user Page Modification & Automation: Greasemonkey; Chickenfoot [Bolin, UIST 2005]; Koala [Little, CHI 2007]
End-user Page Creation (Authoring Tools): Yahoo! Pipes; Open Kapow; Marmite [Wong, CHI 2007]; IBM QEDWiki; Intel MashMaker [Ennals, SIGMOD 2007]; Relations, Cards, and Search Templates [Dontcheva, UIST 2007]
Deep Copy & Paste: Citrine [Stylos, UIST 2004]; WinCuts [Tan, CHI 2004]; Clip, connect, clone [Fujima, UIST 2004]; Hunter Gatherer [schraefel, WWW 2002]; Facades [Stuerzlinger, UIST 2006]
Finding API Examples (API Search Tools): Mica [Stylos, VL/HCC 2006]; Assieme [Hoffmann, UIST 2007]
Contributions
Search for programming examples in the solution domain, not the code domain.
d.mix instantiates this idea for web service APIs through a site-to-service map.
Integration of page annotation and script hosting enables rapid experimentation.
Current Work
Re:Mix: Reformatting existing web applications for mobile device use
Juxtapose: Exploring design alternatives in parallel
Acknowledgments
Funding: NSF grant IIS-0534662; SAP Stanford Graduate Fellowship; Microsoft New Faculty Fellowship; Intel (equipment donation)
Help: Wendy Ju, Leith Abdulla, Michel Krieger, whytheluckystiff
Images: morguefile.com
hci.stanford.edu/dmix


Editor's Notes

  • #3: Let me start with a question: how would you share this year’s UIST highlights with colleagues back home who could not attend? At your next meeting you may first open the conference web page to project the conference program. Then maybe search for authors’ home pages to get more information. You may then go to a photo site to find images that other attendees tagged with uist. And then show some videos people took at the demo reception two days ago. In short, you may do it by browsing the web.
  • #4: What would you do if you wanted to write a web application that queries the same data? This is possible; all of the sites offer web service APIs. You would probably have to get some information on the web search API first. And the photo site’s API, and hmm, those look nothing alike. You’ll probably go down some blind alleys before finding the right function; but what’s the right set of arguments? And then you’d repeat those steps for the video search site. One frustrating problem of programming the web today is that programming models and documentation conventions vary widely: there is no global type hierarchy to rely on, and there is no javadoc for web services. So while it’s easy to understand the web sites, it is much harder to understand their services.
  • #5: PAUSE Understanding API structure is only half the task. Most existing tools implicitly assume that developers start with a clean slate, and then type their program. However, this is not how casual programming happens in practice. “Programming by example modification” is a successful alternative because examples situate code snippets in a functioning system. However, since search for examples happens in the domain of source code (API packages, function names, etc.), users are faced with a large gulf of execution to translate their goals into system steps.
  • #7: d.mix models the correspondence between HTML pages and API calls to let users create program code by sampling content from a web page. (space) It co-locates two different kinds of information on one page: examples of what functionality a website offers,
  • #8: together with information on how one would obtain this information programmatically. With d.mix, users can generate working code examples by selecting the information on the web site itself. This selection technique allows users to remain in the domain of their task goal, in this case retrieving images.
  • #9: The source code generated through sampling with d.mix can then be copied to an active wiki, a server-side code environment, where it can be edited and safely executed. (click) The active wiki provides a configuration-free environment for rapid experimentation.
  • #10: This scenario will clarify the main interaction techniques of d.mix: A group of rock climbers would like to create a photo page for the group,
  • #11: which should always show the latest images from the group’s trips without requiring manual updates.
  • #14: There are several classes of users that work with d.mix. First, lead users manually establish the correlation between sites and their services. Alternatively, webmasters at sites such as Flickr, Yahoo, or Amazon.com could easily provide such mappings.
  • #15: After a small number of lead users author these mappings, many more web developers can create applications in d.mix and quickly prototype web-based mashups.
  • #16: Finally, end-users can run applications written in d.mix by visiting them from any browser. End-users can also tailor applications created in d.mix by copying and pasting code from one d.mix wiki page to another.
  • #17: This diagram describes the d.mix architecture for sampling content from web pages. d.mix uses a programmable HTTP proxy that rewrites the original web site, adding API annotations. Each API annotation contains an associated snippet of code that generates a working web service call in the Ruby programming language. The programmable proxy knows how to annotate web pages by using the site-to-service map. (space) This map defines the correspondence between web sites and their associated web services.
  • #18: How does a third-party developer author a site-to-service map? (space) First, a developer maps web page URLs to web “page types”. (space) For any given “page type”, a developer defines a combination of XPath and CSS selectors that identify the visual elements to be annotated. (space) They then define selectors that infer web service arguments given the markup on a page. (space) Finally, these arguments are bound to a web service code snippet. Note that it is easier for site owners to provide this mapping, as they do not need to infer these relationships programmatically through scraping. We hope that in the future, site owners will provide such site-to-service mappings for the larger web developer community.
  • #19: So why not rely solely on scraping, without trying to generate web service code? (space) Web site content and markup change quickly. Thus, programs based on scraped content can be brittle. In contrast, web service APIs are often frozen. Since applications are authored once, but executed many more times, scraping once at “application design time” is preferable to scraping at execution time. (space) Furthermore, web service calls can be meaningfully parameterized by changing the arguments used to call them. This is not possible with scraped content. (space) Finally, scraping may not scale. Many sites monitor their traffic and actively lock out clients that request a large number of pages in a short amount of time.
  • #20: Let me illustrate the process of creating a site-to-service map with a concrete example. Here is a single photo page from flickr.com.
  • #21: First, we note that photo pages have a consistent URL structure. Thus, we can define a regular expression that detects similar photo pages, and a set of expressions that furthermore classify pages into distinct page “types”.
  • #22: Here are three items of interest: a photo’s title, the URL for an image, and a way to run a tag search (SPACE, SPACE)
  • #23: Each of these items can be accessed by searching the DOM, the document object model. d.mix uses a combination of CSS and XPath selectors to identify elements, highlight them, and add a context-menu handler to each element. For any given item on the page, there may be multiple matching web service calls that return that element. For example (SPACE), one can either sample “the first image in a photo set” or “this particular image”. d.mix generates multiple entries in the context menu to let the user guide the generalization process.
  • #24: The site-to-service map then specifies how to connect each of these items to a source code snippet. For example (SPACE), to perform a tag search, one needs to pass in the appropriate tag to a specific method. These arguments are also extracted using CSS and XPath selectors.
  • #25: Continuing this example, here is an abbreviated form of the Ruby code that defines part of the site-to-service map for Flickr.com. To summarize, it…
  • #26: …extracts tag names
  • #27: …instantiates working examples for each different tag
  • #28: …and adds code examples as annotations on the original page
  • #29: After adding annotations, working code examples can be sent to the d.mix active wiki. (SPACE) The active wiki enables quick testing and modification of generated code. (SPACE) The code then queries the original web site programmatically.
  • #30: After code is copied or appended to wiki pages, programs can be modified directly in the active wiki. (SPACE) Programs are then run in a sandboxed execution model. For example, a program may make a web service call, but it does not have read/write access to the wiki server’s file system. (SPACE) The active wiki also provides libraries that facilitate calling web services and transforming the results.
  • #31: A proxy that can be programmed with code hosted on wiki pages is a flexible, general tool for prototyping web applications. For example, we have used the d.mix architecture to create an application that reformats web content to better fit mobile devices. It extracts only essential information (here, movie names and show times) from a cluttered web page. This application took half an hour to build using the d.mix system.
  • #32: (PAUSE) d.mix builds on a number of freely available open source toolkits which facilitate front-end and back-end development on the web. In particular, d.mix takes an existing programmable HTTP proxy, the mouseHole, among other software, and adds logic to annotate pages and author site-to-service mappings.
  • #33: (Switch to Bjoern here) To evaluate the authoring approach embodied in d.mix, we wrote a prototype site-to-service library that supports three web sites and their services: Flickr, Yahoo! web search, and the YouTube video site. For each site, our library supports a subset of the API. For example, the library provides methods to perform full-text and tag searches and copy images from Flickr, but it does not support extraction of user profile information.
  • #34: With this library, we then conducted a first-use evaluation study with eight participants, aged 25 to 46. (SPACE) We recruited participants with some programming and web development experience, but only half had programmed in Ruby before. (SPACE) Also, only half had worked with web APIs before. (SPACE) We first demonstrated d.mix’s interface to participants. Next, we gave them three tasks to perform. The first task tested the overall usability of our approach: participants were asked to find media files, send content to the wiki, and change simple parameters there, e.g., how many images to show from a photo stream.
  • #35: The second design task was similar to our scenario: it asked participants to create an information dashboard for a magazine’s photography editor to track image submissions by photographers. This required combining data from multiple users on the Flickr site and formatting the results. This image is a solution one of our participants created. The exciting result here is that everyone successfully built web applications in a very short amount of time. (SPACE)
  • #36: The third task asked participants to create a meta-search engine by querying at least two different web services and combining search results from both on a single page. This image also shows a page created in the study. While participants were generally not familiar with the particular APIs involved in this task, they did successfully modify the code generated by d.mix. In addition, many leveraged their existing web design experience to add formatting commands such as CSS styles to the generated code.
  • #37: The user study also uncovered some important design suggestions for the future. d.mix users would first navigate to a page, then turn on annotation through the “sample this” button.
  • #38: However, because only some page items were supported by our limited library, participants would then discover that the desired data were not retrievable by d.mix. A straightforward solution would be to proxy every single page request. Since annotating pages currently introduces a delay of a second or more, engineering for better performance is required to make this tractable.
  • #39: Some participants were also confused by two different approaches to sampling employed inconsistently: For example, to specify a query for a set of Flickr images matching a specific tag, in d.mix, the user currently must sample from the link to the image set, not the results. However, to select images from a photo stream, the user must sample from the set of images itself, not the link. (SPACE) This suggests that example-based programming tools for the web should consistently enable sampling from both the source and from the target page to support both interaction styles.
  • #40: There are also some architectural limitations of d.mix that we would like to acknowledge. The logged-in web (web-based email, online banking) relies on state maintained in the browser. The current d.mix HTTP proxy does not handle cookies of remote sites as a client browser would. So d.mix cannot currently sample content from the pages that require authentication beyond basic API keys.
  • #41: The most promising solution to this problem is to co-locate a server-side “headless” browser (space) that can handle cookies with our proxy. We have some preliminary results that suggest this is a possible way forward.
  • #42: Third, the examples we have shown focused on sampling content: photos, text, videos. Other web services offer rich interaction widgets, for example for building user interface components and for map navigation. We have not yet addressed how the principal action of selecting annotated items would extend to such services.
  • #44: A good example of research concerned with browser automation is Koala. In Koala, users compose scripts that automate browser sessions by demonstration, e.g. following links or entering form information. d.mix shares with Koala the use of programming-by-demonstration techniques to generate working examples. Both systems also use the social-software mechanism of sharing scripts server-side on a wiki page. d.mix distinguishes itself in two important ways. First, Koala shields users from the underlying representation. d.mix uses visual representations when they are expedient, yet also provides access to source code. Second, Koala and similar systems focus on automating web browsing and changing web pages solely through the DOM; they do not interact with web service APIs directly.
  • #45: There are several end-user authoring tools that lower the expertise threshold required to create applications that synthesize data from multiple pre-existing sources. Yahoo Pipes, for instance, introduces a visual node-and-link editor for manipulating web data sources. Pipes and similar systems like Kapow are themselves used to create web services meant for programmatic consumption. The output is an RSS feed, not a user-facing application. The main advantage of d.mix over such work is the integration of site and service in a single page that d.mix offers.
  • #46: Programmers often create new functionality by finding examples online. Assieme is one example of an augmented search interface to support search for example-based programming. In particular, Assieme integrates information from Web-accessible (JAR) files, API documentation, and pages that include explanatory text and sample code. d.mix differs from this work by searching in the solution domain, rather than the code domain: Users find examples of the kind of output they would like to create and generate matching code from those examples. d.mix additionally offers a lightweight execution environment to test the code it generated.
  • #47: In summary, the underlying research idea of our work is to search for programming examples in solution space, instead of code space, and then map back from application output to code that can generate that output. (SPACE) d.mix is an instance of this idea for programming with data-centric web service APIs. Conceptually, d.mix is enabled by a mapping from HTML pages to the API calls in a site-to-service map. (SPACE) On a technical level, we introduced a prototyping environment for web service code through the integration of a programmable proxy to annotate pages and a wiki to safely execute scripts.
  • #48: We have plans to extend the research ideas investigated by d.mix in two areas. First, we have done preliminary work in using server-side programmable proxies for retargeting existing web applications to mobile devices. Second, design of interactive applications is often about generating different alternative designs and then choosing among them. Beyond enabling rapid experimentation, we are interested in explicitly supporting work with multiple alternative interface designs through better authoring environments.
  • #50: And I invite you to visit our web page. Thank you.