From PMID to BibTeX via BioRuby

Author: nsaunders

Chris writes:

Nothing like searching for an answer (PMIDs->Bibtex) and finding someone else pointing back to your own solution! http://t.co/ZOm0cK6o0d

— Chris Miller (@chrisamiller) March 17, 2015

The blog post in question concerns conversion of PubMed PMIDs to BibTeX citations. However, a few things have changed since 2010.

Here’s what currently works.

Filed under: bioinformatics, programming, ruby Tagged: bioruby, eutils, pubmed

Note to journals: “methodologically sound” applies to figures too

Author: nsaunders

PeerJ, like PLoS ONE, aims to publish work on the basis of “soundness” (scientific and methodological) as opposed to subjective notions of impact, interest or significance. I’d argue that effective, appropriate data visualisation is a good measure of methodology. I’d also argue that on that basis, Evolution of a research field – a micro (RNA) example fails the soundness test.

Figure 1: miRNA publications per year

Let’s start with Figure 1. Equally spaced divisions on the x-axis, but the years are not equally spaced – 1993, 1996, 1997 for example. Even worse is the attempt to illustrate a rapid increase after 2004 using broken bars and a second y-axis. It’s confusing and messy.

@neilfws @thePeerJ someone needs to learn about log scales…

— Chris Cole (@drchriscole) March 17, 2015

Figure 2: language of publication

Some of these crimes are repeated in Figure 2, which also introduces an ugly shading scheme to distinguish languages. When you look at it, do you think “black…aha, black = English” ? No, you do not. There’s no need for different shading or colour here (it’s not even visible for two bars); the bars are readily distinguishable from the x-axis labels.

Figure 3 repeats the shading crime and Table 1 is somewhat superfluous, as it contains much of the same data. Several more tables follow, containing data which might be better presented as charts.

Figure 4: all the previous horrors

Figure 4 combines all the previous horrors into 3 panels. We could go on, but let’s not. You can see the rest for yourself, it’s open access.

Publication on the basis of “soundness” need not mean sacrifices in quality. Ideally, someone at some stage in the process – a mentor before submission, a reviewer, an editor – should notice when figures are not produced to an appropriate standard and suggest improvements. I see a lot of failures like this one in the literature and the causes run right through the science career timeline. It starts with poor student training and ends with reviewers and editors who don’t know how to assess the quality of data analysis/visualisation.

It’s easy to blame “peer review lite”, but there are deeper, systemic issues of grave concern here.

Filed under: publications, statistics Tagged: peer review, peerj, quality, visualisation

Some brief thoughts on the end of FriendFeed

Author: nsaunders

There was a time, around 2009 or so, when almost every post at this blog was tagged “friendfeed”. So with the announcement (which frankly I expected 5 years ago) that it is to be shut down, I guess a few words are in order.

I’m thankful to FriendFeed for facilitating many of my current online friendships. It was uniquely successful in creating communities composed of people with an interest in how to do science online, not just talk about (i.e. communicate) science online. It was justly famous for bringing together research scientists with other communities: librarians in particular, people from the “tech world”, patient advocates, educators – all under the umbrella of a common interest in “open science”. We even got a publication or two out of it.

To this day I am not sure why it worked so well. One key feature was that it allowed people to coalesce around pieces of information. In contrast to other networks it was the information, presented via a sparse, functional interface, that initially brought people together, as opposed to the user profile. There was probably also a strong element of “right people in the right place at the right time.”

It’s touching that people are name-checking me on Twitter regarding the news of the shutdown, given that no trace of my FriendFeed activity remains online. Realising that my activity was getting more and more difficult to retrieve for archiving and that bugs were never going to be fixed, I opted several years ago to delete my account. The loss of my content pains me to this day, but inaccurate public representation of my activities due to poor technical implementation pains me more.

I’ve seen a few reactions along the lines of “what is all the fuss about.” How short is our collective memory. To those people: look at Facebook, Yammer or even Twitter and ask yourself where the idea of a stream of items with associated discussion came from.

Farewell then FriendFeed, pioneer tool of the online open science community. We never did find a tool quite as good as you.

Filed under: networking, open science Tagged: friendfeed

Make prettier documents by reusing chunks in RMarkdown

Author: nsaunders

No revelations here, just a little R tip for generating more readable documents.

Original with lots of code at the top

There are times when I want to show code in a document, but I don’t want it to be the first thing that people see. What I want them to see first is the output from that code. In this silly example, I want the reader to focus their attention on the result of myFunction(), which is 49.

---
title: "Testing chunk reuse"
author: "Neil Saunders"
date: "24/02/2015"
output: html_document
---

## Introduction
Here is my very interesting document. But first, let me show you my long and ugly R function.

```{r chunk1}
# it's not really long and ugly
# it just squares the input
# but imagine that it is long and ugly

myFunction <- function(x) {
  print(x ^ 2)
}

myFunction(7)
```

Function use before definition = error

I could define myFunction() later in the document but of course that leads to an error when the function is called before it has been defined.

---
title: "Testing chunk reuse"
author: "Neil Saunders"
date: "24/02/2015"
output: html_document
---

## Introduction
Here is my very interesting document.

```{r chunk1}
myFunction(7)
```

## This is chunk 2
My long and ugly R function is now down here.

```{r chunk2}
# it's not really long and ugly
# it just squares the input
# but imagine that it is long and ugly

myFunction <- function(x) {
  print(x ^ 2)
}
```

Solution: use the chunk option ref.label to pull the code from chunk2 into chunk1, which is itself left empty. You can also use echo=FALSE to hide chunk1 in the final document, but still see the code (in chunk 2) and its output.

---
title: "Testing chunk reuse"
author: "Neil Saunders"
date: "24/02/2015"
output: html_document
---

## Introduction
Here is my very interesting document.

Chunk 1 is calling chunk 2 here, but you can't see it.
```{r chunk1, ref.label="chunk2", echo=FALSE}
```

## This chunk is unnamed but can now use code from chunk 2
```{r}
myFunction(7)
```

## This is chunk 2
My long and ugly R function is now down here.

```{r chunk2}
# it's not really long and ugly
# it just squares the input
# but imagine that it is long and ugly

myFunction <- function(x) {
  print(x ^ 2)
}
```

The result of calling chunk2 from chunk1

And here’s the result.

Filed under: programming, R, statistics Tagged: how to, knitr, rmarkdown, rstats

Academic Karma: a case study in how not to use open data

Author: nsaunders

Update: in response to my feedback, auto-generated profiles without accounts are no longer displayed at Academic Karma. Well done and thanks to them for the rapid response.

A news story in Nature last year caused considerable mirth and consternation in my social networks by claiming that ResearchGate, a “Facebook for scientists”, is widely used and visited by scientists. Since this is true of nobody that we know, we can only assume that there is a whole “other” sub-network of scientists defined by both usage of ResearchGate and willingness to take Nature surveys seriously.

You might be forgiven, however, for assuming that I have a profile at ResearchGate because here it is. Except: it is not. That page was generated automatically by ResearchGate, using what they could glean about me from bits of public data on the Web. Since they have only discovered about one-third of my professional publications, it’s a gross misrepresentation of my achievements and activity. I could claim the profile, log in and improve the data, but I don’t want to expose myself and everyone I know to marketing spam until the end of time.

One issue with providing open data about yourself online is that you can’t predict how it might be used. Which brings me to Academic Karma.

Academic Karma came to my attention on Twitter via Chris Gunter.

Tipped off to an AcademicKarma profile I did not set up for myself. Looks like years of reviewing/editing mean zilch. http://t.co/45DrnoDvlZ

— Chris Gunter (@girlscientist) February 11, 2015

To which they replied:

@girlscientist Everyone with an @ORCID has an Academic Karma profile, think of it as an Academic directory.

— Academic Karma (@AcademicKarma) February 12, 2015

Everyone with an ORCID? I have one of those. Sure enough, appending my ORCID ID to their URL reveals that I have a profile.

You’ll note that my profile states “no review information shared” and that the data are sourced from ORCID. These are recent changes, brought about by one of my less polite tweets.

and if I don’t want one? More ResearchGate-style bullshit. MT @AcademicKarma Everyone with an ORCID has an Academic Karma profile

— Neil Saunders (@neilfws) February 18, 2015

Karma, apparently, according to someone

Previously, profiles looked like the one shown in the image, right. In my case, as I have not included any reviewing or editorial activity in my ORCID profile, this resulted in a large, prominent “NA” for so-called “karma earnt”. This gave the misleading impression that I am a bad “corporate citizen”.

To their credit, the people behind Academic Karma made changes to profile views very quickly, based on my feedback. That said, they seemed genuinely bemused by my criticism at times.

@neilfws @AcademicKarma Wow! Genuinely trying to improve peer review here Neil. Value your feedback though on what we could do differently.

— Lachlan Coin (@lachlancoin) February 18, 2015

@neilfws @AcademicKarma @ORCID_Org designed for data re-use, what are we misrepresenting?

— Lachlan Coin (@lachlancoin) February 18, 2015

So let me try to spell it out as best I can.

  1. I object to the automated generation of public profiles, without my knowledge or consent, which could be construed as having been generated by me
  2. I especially object when those profiles convey an impression of my character, such as “someone who publishes but does not review”, based on incomplete and misleading data

I’m sure that the Academic Karma team mean well and believe that what they’re doing can improve the research process. However, it seems to me that this is a classic case of enthusiasm for technological solutions without due consideration of the human and social aspects.

Filed under: networking Tagged: academic karma, orcid, researchgate, social networking

Presentations online for Bioinformatics FOAM 2015

Author: nsaunders

Off to Melbourne tomorrow for perhaps my favourite annual work event: the Bioinformatics FOAM (Focus on Analytical Methods) meeting, organised by CSIRO.

Unfortunately, though for good reasons, it’s an internal event this year, but I’m putting my presentations online. I’ll be speaking twice; the first talk, on Thursday, is called “Online bioinformatics forums: why do we keep asking the same questions?” It’s an informal, subjective survey of the questions that come up again and again at bioinformatics Q&A forums such as Biostars and my attempt to understand why this is the case. Of course one simple answer might be selection bias – we don’t observe the users who came, found that their question already had an answer and so did not ask it again. I’ll also try to articulate my concern that many people view bioinformatics as a collection of recipe-style solutions to specific tasks, rather than a philosophy of how to do biological data analysis.

My second talk on Friday is called “Should I be dead? a very personal genomics.” It’s a more practical talk, outlining how I converted my own 23andMe raw data to VCF format, for use with the Ensembl Variant Effect Predictor. The question for the end – which I’ve left open – is this: as personal genomics becomes commonplace, we’re going to need simple but effective reporting tools that patients and their clinicians can use. What are those tools going to look like?
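To give a flavour of what such a conversion involves, here is a minimal Python sketch. It is not the code used for the talk. It assumes the raw data file is tab-separated with four columns (rsid, chromosome, position, genotype) and `#` comment lines. The raw file does not record the reference allele, so a real conversion has to look it up in a reference genome; here the hypothetical `REF_ALLELES` dictionary stands in for that lookup, and the three data rows are made up for illustration.

```python
import io


def parse_23andme(handle):
    """Yield (rsid, chrom, pos, genotype) tuples from raw-data lines."""
    for line in handle:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        rsid, chrom, pos, genotype = line.split("\t")
        yield rsid, chrom, int(pos), genotype


def to_vcf_line(rsid, chrom, pos, genotype, ref):
    """Build one VCF data line; return None for no-calls and indel codes."""
    if not genotype or any(base not in "ACGT" for base in genotype):
        return None  # e.g. "--", "II", "DD", "DI"
    # ALT alleles are the observed bases that differ from REF
    alts = []
    for base in genotype:
        if base != ref and base not in alts:
            alts.append(base)
    alleles = [ref] + alts
    # GT indexes each observed base into [REF, ALT1, ALT2, ...]
    gt = "/".join(str(alleles.index(base)) for base in genotype)
    alt_field = ",".join(alts) if alts else "."
    return "\t".join(
        [chrom, str(pos), rsid, ref, alt_field, ".", ".", ".", "GT", gt]
    )


# Hypothetical input and reference alleles, for illustration only
RAW = "# rsid\tchromosome\tposition\tgenotype\nrs1\t1\t100\tAG\nrs2\t1\t200\t--\nrs3\tX\t300\tT\n"
REF_ALLELES = {("1", 100): "A", ("1", 200): "C", ("X", 300): "T"}

for rsid, chrom, pos, genotype in parse_23andme(io.StringIO(RAW)):
    vcf_line = to_vcf_line(rsid, chrom, pos, genotype, REF_ALLELES[(chrom, pos)])
    if vcf_line is not None:
        print(vcf_line)
```

A complete VCF file would also need the `##fileformat` header and the `#CHROM` column line, and the no-calls and indel codes skipped here would need a decision of their own.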

Looking forward to spending some time in Melbourne and hopefully catching up with this awesome lady.

Filed under: australia, bioinformatics, meetings Tagged: csiro, foam, presentations

SIGCSE2015 Comings & Goings in New Directions

Author: Lisa C. Kaczmarczyk

It’s that time at SIGCSE, when the sheer volume of coffee and unhealthy food is starting to catch up with me. This morning’s egg and cheese on bagel put me over the top. Thank goodness I am not having a cholesterol test any time soon. However, all that protein is good brain food for processing the deluge of activity of the past few days.

There have been some real patterns. I started my teaching career in the community college system and I have never lost a feeling of affinity with them. So I really noticed this year that the Community College contingent is out in force at the conference. The palindromic ACM Committee for Computing Education in Community Colleges (CCECC) has been swooping in to help community college teachers network with one another, work on curriculum development in cyber security, talk about articulation with CS2013 curricular guidelines, host a large networking lunch, and give presentations and workshops in a way that I haven’t seen in all the many long years I’ve been coming to SIGCSE. If you are community college faculty or know some community college faculty in the computing field, and want to find like minds, I’d definitely recommend dropping them a line. They are active all year, and from what I’ve been told are planning on ramping up opportunities to stay connected.

There has also been a lot of talk about the enrollment surge at many if not most CS departments in the US. I don’t have the full picture yet on what the situation is like outside the US, but all but one person I have spoken to from the US has told me they are experiencing record demand for CS courses. Great on one hand, highly problematic on the other hand because personnel and resources are so strained.

I attended a panel on the subject of the enrollment surge and capacity problems yesterday. The dominant theme was stress and worry. There are many ways to respond to the jump in numbers and many of them are not healthy for students or faculty. The way in which different institutions respond is tied to many things, including institutional historical context, what part of the school the department is in (Arts & Sciences, Engineering, Business, etc.), attitude of administration, budgets, and whether the institution is public or private. It is clear, however, that there are already some short-sighted and unsustainable responses, such as requiring overloads, increasing class sizes dramatically, imposing enrollment caps and GPA minimums, and eliminating non-major classes and electives.

What I didn’t hear, and this worries me significantly, were creative, think-outside-the-box ideas for strategically tackling the capacity problem. It’s hard to think creatively and strategically when you are being pressured from all sides on a daily basis to take on more and more. I also noticed that there was a divide in the audience about whether this boom is simply the third Bubble, or permanent as a result of economic changes and the ubiquity of computing. When asked, 1/3 of the audience said they thought this was a Bubble, 1/3 said long term/permanent, and 1/3 had no idea.

Whether you think the boom is a short-term phenomenon is important because it affects how you react to it. We also have to look at history. We’ve been here before; if you were around in the 80s you remember the enrollment surges then and similar responses. As a result of historical memory and contemporary experience and research on the subject, we know that diversity is negatively impacted when we blindly fall back on a “best and the brightest” set of class and programmatic filters. Yet another reason to find a way to get the mental space to come up with creative responses. And to be proactive about sharing those ideas.

I’m on an active search to find people willing to speak out about healthy and creative ways to address the capacity surge. If you have ideas, whether or not they’ve been implemented, especially if your idea is different from the run of the mill ideas, contact me. 

On another note, I’m noticing a generation gap, so to speak, among those at the conference who are plugged in to social media as a mode of communication and those who are not. If you can hold onto your seats until June, you can read in my next ACM Inroads column about why you should pay attention to how communication about our science is taking place on social media. But meanwhile, I’ll point out that there is an active Twitter feed going about this conference #SIGCSE2015 and there you can read a somewhat random but interesting and often informative stream of info about things going on. More importantly, you can get a sense of what people consider important, what they choose to share with others. This matters. Taking the pulse of the community is important to understanding what people care about, what their perspective is, where they are headed.

But when I meet the twitter folk, they are almost always the younger contingent of the SIGCSE conference crowd. Sure, perhaps predictable, and I can only say “YES KEEP COMMUNICATING!”

For the rest of you, those for whom social media is not your best friend and constant companion, consider coming up to speed with some aspect of it. If you want to be plugged in to current and future thought leaders and decision makers and rabble rousers alike, this is a place to go. There is a whole aspect of SIGCSE going on virtually. I’ve met several new and interesting people via SIGCSE twitter exchanges the past few days. We’ve then met in person. People I’d never have met and perspectives I would never have heard. I value all these perspectives. I can, and do, plan on bringing what I have heard into the in person meetings and committees I attend.

Meanwhile, I’m going to get out of this chair and go to…lunch. I hope there is lots of leafy green salad.

Mind Stretching at ACM SIGCAS (Computers & Society) Meeting

Author: Lisa C. Kaczmarczyk

Hidden Near a Freeway, Old Meets New

As always, my SIGCSE (ACM Special Interest Group on Computer Science Education) week hit the ground running. Barely had I gotten to my hotel room and hooked up with my roomie than she and I were plotting and planning. So far this year we have not blown up anything or needed to call hotel mechanics. Such a shame.

My day today started off with a bone chilling walk to explore the Kansas City area in search of…whatever. Bone chilling mostly because it was below 20F and I didn’t have winter clothes. My eyebrows had just about frozen together along with the freezing of my knuckles as I kept whipping out my camera to capture something just too good to pass up (see archaeology picture above) when I ran into three guys with no jackets at all (they must be natives because they didn’t look half as brittle as I did in my fleece jacket) who directed me to a local independent coffee shop where I recuperated while supporting free trade coffee.

Some time later I found myself (by design) in the SIGCAS meeting (Special Interest Group on Computers and Society) which traditionally takes place the day before the start of the SIGCSE Technical Symposium. One interesting presentation after another about incorporating socially beneficial projects and activities into the computer science curriculum. Some projects were very local and some were global. From Latina community concerns to water scarcity allocation modeling to Bangladesh. By the end of the afternoon all sorts of ideas were flying through my head.

For example…

What makes a “Good” computing professional? We’re talking “good” in the sense of socially beneficial, rather than technically good. More to the point, how do we know? How do we evaluate this term? (Do we want to evaluate it? Define it?) It’s interesting to think about this because if we want to encourage the integration of socially / environmentally beneficial considerations into the very heart of the curriculum, how do we know we are doing it well? If we want computing professionals to integrate a social consciousness into their work how do we determine what that looks like? Or, do we even want to do this? It is worth pondering from a first principles perspective.

How are Codes of Conduct interpreted across cultures? Several global organizations such as ACM and IEEE have codes of conduct that professional members are asked to adhere to. It hadn’t occurred to me until today that this could be tricky due to differences in cultural interpretations of what is ethical. The idea was planted in my head because one of the presenters today said a segment of their students (economically disadvantaged, from some developing nations) said the hardest part of these Codes to adhere to would be the prohibition on taking bribes. Really? The hardest. Well, when you think about cultures where taking bribes is endemic, and business is done that way, … sure it might be really hard to imagine bucking the system. I ask myself…how might one determine how “bad” this activity really is? Might one for example need to follow the chain reaction effect of individual bribes? How bad is it if it gets things done? Whoa….

We know that story telling, making content personal, is an effective way of making material (academic in this case) engaging and accessible. Some programming languages (many?) don’t, by their very nature, lend themselves to story telling. Java comes to mind. Python. Scratch? As opposed to a language like Alice. So, how might we talk about incorporating story telling into teaching introductory programming? This sounds like a really interesting challenge. Can it be done? I’d love to see ideas kicked around about this.

I gotta say that this was one dynamic meeting. The group made some decisions about action items to take, which, darn it, I missed due to having to boogie off to a meeting of the ACM Education Council. However, I’ll find out and report back on this at a later date with a followup.

Tomorrow, the SIGCSE conference starts. Turbo charged. Stay tuned.

Blog a Month

Author: Judy

So it is Blog a Month time. Glad to participate, and above is the prompt. Because I have been teaching Educational Research this year, and not my regular educational technology courses, I have not been posting to this blog with my usual regularity. Thus, it is good to see a challenge presented and to set aside time to do another post.

Yes, this quotation does resonate with me, especially the first two sentences. Each day, I do focus on purpose, especially when I’m teaching. Even when not in the classroom, I am thinking of ideas for the classroom, which is one reason why I turn to Twitter, with a steady stream of tweets for gathering and prompting new ideas. I like to be creative, and sometimes just any tweet will spark an idea. So many wonderful educators share ideas and links to blogs, websites, and tech tools on Twitter. Twitter chats keep me even more focused. Just last night, I happened upon the #Read4Fun chat, a new chat, which garnered over 1200 tweets. How do I know that number? Well, after the chat, I went to Storify to gather them. I collected all of them, but the free version of Storify only allows for 1,000 entries in creating a Storify. Later, I joined the California Educators chat, #CAedchat. The topic was on “pimping your lesson plans,” with links to Google Docs to submit a plan for feedback and suggestions.

Tonight, I will be on #teacheredchat, one of the Twitter chats for which I am an organizer. Being an organizer means I am always in search of good guest moderators. All of this reminds me how critical it is to be a connected educator. Why am I a connected educator? I guess it is because I stay focused every day on my purpose as an educator.

Would love to know how the above quotation resonates with others. Hoping you’ll leave a comment, and if you are one of the participants in the Blog a Month challenge, leave a link to one of your posts.

I have several blogs and will cross post this one on one of those.

Looking forward to making connections and learning about how others stay focused.

The Fact is, Sticking Intently to Facts is Not Enough

Author: Lisa C. Kaczmarczyk

Maybe there was a time in my life when I believed that science, that logic, that being rational was what would lead people to make the “right” decision. Especially about how the world worked and how it could work. Because once you had all the facts lined up, the answers about what to do would be clear. I believed the real problem was that people just needed to be shown the facts. Facts were neutral and told the truth of the matter. I learned that in great part from my dad, for whom logic backed up by facts ruled when it came to decision making.

Dad loved a good debate, especially when it came to science – he was an academic to the core and he loved to tell you what he thought was the logical thing to do based on science and logic. He wanted to hear your side as well. If you had more solid evidence than he did, he’d graciously concede. But it was very difficult to win an argument with my dad because he never took on a topic until he had more facts and data behind him than most people could ever hope to marshal. When dad advocated for a position he usually came out on top. If he didn’t have the facts, he’d defer the conversation until he could find them.

I thought of dad this morning after listening to Noah Diffenbaugh at the live-streamed panel “Scientists Communicating Challenging Issues” at the AAAS meeting. Diffenbaugh, a Stanford climate science researcher, called himself a “fundamentalist” about sticking to the facts, much like my dad. Also like my dad, Diffenbaugh said he was pleased to tell you if he didn’t know anything about a topic; I took this to mean he too would not engage in speculation about it. Finally, both Noah and my dad felt that scientists were best positioned to rigorously discover the way the world worked and explain it to whoever wanted to know.

But here’s where Diffenbaugh and my dad would have diverged: Diffenbaugh believes that scientists lose credibility when they suggest policy, or positions, or actions based upon their scientific findings. He lumps the entire spectrum of expressing an opinion or making a suggestion into advocacy, and treats advocacy as a bad thing. I’ll speculate that my dad wouldn’t have liked the word advocacy either, but he certainly believed that he, as a scientist, had a responsibility to recommend actions to solve humanity’s problems when the science provided facts to support those recommendations.

Diffenbaugh appears to be operating (and here I’m speculating) under the historically debunked idea that if people stay out of the way and don’t get involved in politics, good things will happen. Or at least nothing bad will happen. In his somewhat strange example of talking to a US Senator and climate change skeptic, Diffenbaugh explained how after an hour of answering questions with facts, the Senator told his aide to have him “taken off the list”. He didn’t know what “the list” was and didn’t seem particularly interested in knowing; rather, he pointed out that he believed he had “balanced the conversation”, and by implication swayed the Senator’s skepticism of climate change. And, by extension, affected how that Senator would act on the matter in the future.

Lots of assumptions without sufficient facts to back them up. Lots of hopeful thinking that facts would change strongly held views, which, as other panelists pointed out, the research shows is simply not the case. Lots of wishful thinking that staying neutral will earn deep and lasting respect from others. Need we trot out history to drive home this point?

My dad perhaps knew better than anyone that people are irrational beings. As a young boy my father was caught up in Stalin’s ambitions for what the world should look like, spending part of his childhood in Siberia and the rest of it in refugee camps across the Middle East. I suspect the trauma of this experience had a lot to do with why he became an academic and wanted to understand how the world worked. I suspect it also explains why he chose the sciences, where logic and reason provided a well defined and stable framework for understanding the world.

But most importantly, I believe that early confrontation with reality led my father to believe he had a responsibility to society to take a stand on societal and environmental issues and to defend them as strongly as he could. Credibly. With Science. With Facts.

You can study history for endless examples of how refusing to express opinions (political or otherwise) had little effect on other people’s attitudes or behavior. Check the research for the peer reviewed studies on human behavior to back it up.

It’s not merely an academic conversation. It’s about understanding how the world works and acting on that knowledge.