Rants, raves (and occasionally considered opinions) on phyloinformatics, taxonomy, and biodiversity informatics. For more ranty and less considered opinions, see my Twitter feed. ISSN 2051-8188. Written content on this site is licensed under a Creative Commons Attribution 4.0 International license.
Showing posts with label Creative Commons. Show all posts
I chose PLoS Currents Tree of Life because it is (supposedly) quick and cheap. Unfortunately a perfect storm of delays in reviewing together with licensing issues resulted in the paper taking nearly three months to appear. The licensing issues were a headache. PLoS uses the Creative Commons CC-BY license for all its content. Unfortunately, the original submission included maps from Google Maps and OpenStreetMap (OSM), to show that the GeoJSON produced by my tool could work with either. Google Maps tile imagery is not freely available, so I had to replace that in order for PLoS to be able to publish my figures. At first I simply replaced the tiles Google Maps displays with ones from OSM, but those tiles are CC-BY-SA, which is incompatible with PLoS's use of CC-BY. Argh! I got stroppy about this on Twitter:
Eventually I discovered maps from CartoDB that have CC-BY licenses, and so could be used in the PLoS Currents article. After replacing the Google and OSM tiles with these maps (and trimming off the "Google" logo) the figures were acceptable to PLoS. Increasingly I think Creative Commons has resulted in a mess of mutually incompatible licenses that make mashing things up hard. The idea was great ("skip the intermediaries" by declaring that your content can be used), but the outcome is messy and frustrating.
But, enough grumbling. The article is out, the code is in GitHub. Now to think about how to use it.
GBIF is asking for views on how it should license data in the GBIF network. The full consultation document is available from Google Drive and Dropbox. GBIF is:
...seeking input from all GBIF Participants and stakeholders on the following questions:
Do you have any comments on the plan to associate all GBIF-mediated data with a machine readable licence?
Do you have an opinion on the relative merits of Creative Commons, Open Data Commons or other licence types in the context of the GBIF network?
Which of the two options described in section 8 of this document should GBIF pursue? If you support “Option 2”, would your position be modified if it resulted in a significant decrease in data published to the GBIF network?
The two options referred to above are:
Option 1 – Support restrictions on commercial use
Option 2 – Only support fully free-and-open data
If you have opinions on licensing biodiversity data, please read the consultation document and send your thoughts to licensing@gbif.org by 5 September 2013.
In my last post I discussed why I thought the decision of The Plant List to use a restrictive license (CC-BY-NC-ND) was such a poor choice. CC-BY-NC-ND states that
You may not alter, transform, or build upon this work.
To make this point more concrete, I've created this site:
to show the kinds of things that The Plant List's choice of license prevents the taxonomic community from doing. As a first step I'm exploring linking the names in the list to the primary scientific literature, as this video demonstrates:
For example, we can take a name like Begonia zhengyiana Y.M.Shui, parse the bibliographic citation provided by The Plant List (via IPNI), and locate the actual paper online; in this case it's freely available as a PDF:
Now we can see a drawing of the plant, and instead of simply trusting that the compilers of The Plant List have correctly interpreted this paper, we can see for ourselves. Down the track, we could imagine mining this paper for details about the plant, such as its morphology and geographic distribution. This requires the link to the original literature, which The Plant List lacks.
A good chunk of the recent plant taxonomic literature has DOIs, for example journals such as the Kew Bulletin and Novon. Playing with some scripts I've managed to associate nearly 9000 accepted names with a DOI, and that's by looking at only a few journals. There are lots more DOIs to be found, but because of the way botanical nomenclators record references (see my post Nomenclators + digitised literature = fail) it can be something of a challenge to find them. This task isn't helped by the fairly lax way some publishers enter data in CrossRef (Cambridge University Press I'm looking at you). The other obvious source of digitised literature is, of course, BHL, and that's next on the list of resources to play with.
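To give a flavour of what "parse the citation, then look for a DOI" involves, here is a minimal sketch in Python. It is not the script I used: the regular expression, the function names, and the example citation string are all made up for illustration, and it assumes the CrossRef REST API's works endpoint with its query.bibliographic and rows parameters. It only builds the query URL, it doesn't fetch anything.

```python
import re
from urllib.parse import urlencode

# Hypothetical pattern for a nomenclator-style "micro-citation":
# abbreviated journal, volume, optional (issue), page, optional (year).
MICRO_CITATION = re.compile(
    r"^(?P<journal>.+?)\s+(?P<volume>\d+)(?:\((?P<issue>[^)]+)\))?"
    r":\s*(?P<page>\d+)(?:\s*\((?P<year>\d{4})\))?"
)


def parse_micro_citation(text):
    """Split a citation such as 'Kew Bull. 60: 123 (2005)' into its parts.

    Returns a dict of the fields that were found, or None if the string
    does not look like 'journal volume: page (year)'.
    """
    match = MICRO_CITATION.match(text.strip())
    if match is None:
        return None
    return {key: value for key, value in match.groupdict().items() if value is not None}


def crossref_query_url(citation, rows=5):
    """Build (but do not fetch) a CrossRef works query for a parsed citation."""
    parts = [
        citation["journal"],
        citation["volume"],
        citation["page"],
        citation.get("year", ""),
    ]
    params = {
        "query.bibliographic": " ".join(part for part in parts if part),
        "rows": rows,
    }
    return "https://api.crossref.org/works?" + urlencode(params)


if __name__ == "__main__":
    # Made-up citation, purely to exercise the parser.
    parsed = parse_micro_citation("Kew Bull. 60: 123 (2005)")
    print(parsed)
    print(crossref_query_url(parsed))
```

The catch is that a nomenclator usually records only the page on which the name appears, not the article's title or page range, so any candidates a search like this returns still need checking against volume and page before a DOI can be trusted.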
Experiments with The Plant List is very crude, and I've barely scratched the surface of linking names to primary literature. That said, given that there are exactly zero links between names and digital literature in The Plant List, I'd argue that my site adds value to the data in The Plant List. And that's my point: by making data available for others to play with, you enable others to add value to that data. By choosing a CC-BY-NC-ND license, The Plant List has killed that possibility.
So, my question for The Plant List is "why did you do that?"
OK, so that makes getting the complete data set a little tedious (there are 620 plant families in the data set), but we can still do it without too much hassle (in fact, I've grabbed the complete data set while writing this blog post). Then I see that the data is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs (CC BY-NC-ND) license. Creative Commons is good, right? In this case, not so much. The CC BY-NC-ND license includes the clause:
You may not alter, transform, or build upon this work.
So, you can look but not touch. You can't take this data (properly attributed, of course) and build your own list, for example with references linked to DOIs, or to the Biodiversity Heritage Library (which is, of course, exactly what I plan to do). That's a derivative work, and the creators of the Plant List don't want you to do that. Despite this, the Plant List want us to use the data:
Use of the content (such as the classification, synonymised species checklist, and scientific names) for publications and databases by individuals and organizations for not-for-profit usage is encouraged, on condition that full and precise credit is given to The Plant List and the conditions of the Creative Commons Licence are observed.
Great, but you've pretty much killed that by using BY-NC-ND. Then there's this:
If you wish to use the content on a public portal or webpage you are required to contact The Plant List editors at editors@theplantlist.org to request written permission and to ensure that credits are properly made.
Really? The whole point of Creative Commons is that the permissions are explicit in the license. So, actually I don't need your permission to use the data on a public portal, CC BY-NC-ND gives me permission (but with the crippling limitation that I can't make a derivative work).
So, instead of writing a post congratulating the Royal Botanic Gardens, Kew and Missouri Botanical Garden (MOBOT) for releasing this data, I'm left spluttering in disbelief that they would hamstring its use through such a poor choice of license. Kew and MOBOT could have made the Plant List available as open data using one of the licenses listed on the Open Definition web site, such as putting the data in the public domain (for example, by using a Creative Commons CC0 license). Instead, they've chosen a restrictive license which makes the data closed, effectively killing the possibility for people to build upon the effort they've put into creating the list. Why do biodiversity data providers seem determined to cling to data for dear life, rather than open it up and let people realise its potential?
In conjunction with the TV show, the Wellcome Trust has launched the Interactive Tree of Life, a Flash-based view of the tree of life. There's also a blog about the project. Here's a demo of the tree:
The tree looks very nice, and a lot of work has gone into it, but I am somewhat underwhelmed. The tree itself is tiny, and does a poor job of conveying the relative diversity of life (e.g., no plants, bacteria, few arthropods, etc.). It displays the tree on a 2D plane, and the user can move relative to that plane. I'm not convinced this is the best way to display large trees. Something modelled on Perceptive Pixel's demo might be more useful. I blogged about this last year, but the video host service has disappeared. You can see the tree display 50 seconds into the video below:
Out of curiosity I grabbed the code from the web site (a 1.5 GB file) and had a quick look. The bulk of the files are media, such as images, movies, and 3D Maya models. There's some nice stuff here. The actual tree itself is there in New Hampshire eXtended format. Here it is displayed in TreeView X:
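For anyone who hasn't met it, New Hampshire eXtended (NHX) is just a Newick tree description with extra key=value annotations tucked into bracketed [&&NHX:...] comments. As a rough sketch (the tree string below is invented, not the Wellcome tree, and the function names are my own for illustration), this is how you might strip the annotations and list the leaves in Python to see how small a tree is. It assumes unquoted, named leaves, so it is no substitute for a real Newick parser.

```python
import re

# An NHX annotation is a bracketed comment of the form [&&NHX:key=value:key=value].
NHX_TAG = re.compile(r"\[&&NHX:([^\]]*)\]")


def parse_nhx_tags(comment_body):
    """Turn the inside of an NHX comment, e.g. 'S=Homo:B=99', into a dict."""
    return dict(pair.split("=", 1) for pair in comment_body.split(":") if "=" in pair)


def leaf_names(tree_string):
    """Return the leaf labels of a Newick/NHX string.

    NHX annotations are removed first. A leaf is then taken to be any label
    that directly follows '(' or ','; labels that follow ')' belong to
    internal nodes and are skipped. Quoted labels and unnamed leaves are
    not handled.
    """
    plain_newick = NHX_TAG.sub("", tree_string)
    return re.findall(r"[(,]\s*([^(),:;\s]+)", plain_newick)


if __name__ == "__main__":
    # Invented example tree with one annotated leaf and a labelled internal node.
    example = "((A:0.1[&&NHX:S=a],B:0.2)X:0.3,C:0.4)R;"
    print(leaf_names(example))
    print(len(leaf_names(example)), "leaves")
```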