Tuesday, June 29, 2010

Error bars

Error bars are a necessary part of science reporting, but (as so eloquently described here) they are often misunderstood and frequently misused (including of course by myself). Whilst much has been written about them (not to mention the descriptions in actual statistics books...), I came across a short paper on the use of error bars in the field of experimental biology. It proposes eight rules to help with the use and interpretation of error bars. They aren't specific to biology, so I thought I'd summarise them:

Rule 1:
When showing error bars, always describe in the figure legends what they are.

Rule 2:
The sample size/number of independently performed experiments (i.e. n) must be stated in the figure legend.

Rule 3:
Error bars and statistics should only be shown for independently repeated experiments, and never for replicates.

Rule 4:
For very small values of n (e.g. 3), it is better to simply plot the individual data points rather than showing error bars and statistics.

Rule 5:
95% confidence intervals capture the true mean on 95% of occasions, so you can be 95% confident that your interval includes the true mean.

Rule 6:
How standard error bars relate to 95% confidence intervals: when n = 3, and double the SE bars don't overlap, P < 0.05.

Rule 7:
With 95% CIs and n = 3, overlap of one full arm indicates P approx 0.05, and overlap of half an arm indicates P approx 0.01.

Rule 8:
In the case of repeated measurements on the same group (animals, individuals, cultures, or reactions, for example) CIs or SE bars are irrelevant to comparisons within the same group.
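
To make Rules 5 to 7 a little more concrete, here is a minimal sketch (mine, not from the paper) of how SE bars and 95% CI bars are computed for a small sample. The data values and the `summarise` helper are made up for illustration, and the critical values of Student's t are hard-coded from a standard table rather than computed.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t, keyed by degrees of freedom
# (standard table values, rounded to three decimals).
T_CRIT_95 = {2: 4.303, 4: 2.776, 9: 2.262}


def summarise(values):
    """Return n, mean, SD, SE and the 95% CI for a small sample."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)        # sample SD (n - 1 denominator)
    se = sd / math.sqrt(n)               # standard error of the mean
    ci_arm = T_CRIT_95[n - 1] * se       # length of one arm of the 95% CI
    return {
        "n": n,
        "mean": mean,
        "sd": sd,
        "se": se,
        "ci95": (mean - ci_arm, mean + ci_arm),
    }


# Hypothetical measurements from three independent experiments (n = 3).
result = summarise([4.1, 5.0, 5.9])
print(result)
```

For these made-up numbers the SE is about 0.52, while one arm of the 95% CI is about 2.24. With n = 3, a CI arm is roughly 4.3 times the SE, which is why SE bars and CI bars need such different reading in Rules 6 and 7.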

These rules really are quite basic, but it's useful to be reminded of rules 1-5 occasionally. Another part of the paper I quite liked was the single sentence summary of P values (as the result of t-tests for example), but more particularly how to interpret them.

If you carry out a statistical significance test, the result is a P value, where P is the probability that, if there really is no difference, you would get, by chance, a difference as large as the one you observed, or even larger.
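
That definition can be turned almost word for word into code. The sketch below is my own illustration and uses an exact permutation test rather than the t-test mentioned above: if there really is no difference, the group labels are arbitrary, so it tries every way of splitting the pooled values into two groups and counts how often the difference in means is at least as large as the observed one. The function name and the data are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation P value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    indices = range(len(pooled))

    extreme = 0   # relabellings with a difference at least as large as observed
    total = 0     # all possible relabellings
    for chosen in combinations(indices, n_a):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen_set]
        b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance so that ties are not lost to floating-point rounding.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Made-up data: two groups of n = 3 with no overlap at all.
p = permutation_p_value([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(p)
```

With three values per group there are only 20 possible splits, and just 2 of them give a difference as large as the observed one, so P = 0.1 even though the groups don't overlap at all. It's another way of seeing why Rule 4 suggests just plotting the points when n is very small.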

Cumming G, Fidler F, & Vaux DL (2007). Error bars in experimental biology. The Journal of Cell Biology, 177 (1), 7-11. PMID: 17420288

Wednesday, June 23, 2010

A slow shift to cloud computing...

This is my first post in a while, I realise. I can't promise that this is the start of more frequent posting, but I can say that I still like the idea of blogging (in preference to all these Twitter-like update methods, including Facebook updates, which I see as ultimately pointless and a little annoying, except in very limited contexts), and don't want this little blog to die completely...

...and now moving on.

Cloud computing approaches and applications have been around for a while, and are becoming ever more prevalent. The details are quite involved, but essentially these are programs that run on a remote networked server and are accessed from client machines connected to the network. This often includes the storage of content generated by these applications too - for example, word processing using Google Docs. In the past, I've been very resistant to, and a bit mistrusting of, these applications: partly because I like to have everything to hand on my local machine, without having to be dependent on an internet connection (an obvious prerequisite for cloud computing applications), partly because of space limitations on online storage, and partly because of concerns over the security of data.

However, in recent months, I've been using more and more applications that have a cloud-like aspect to them, particularly regarding the storage of data. For example, I've become particularly addicted to using Tomboy Notes as a note-taking program, especially for its cross-platform support and its integration with Ubuntu One (where you can access, view, edit and create notes). Also, I was recently introduced to Mendeley, described as a research management tool (incorporating academic networking tools, literature statistics, etc.), but which for me is most useful as a reference management system (using a desktop component). The reference library in this case is held in the cloud, linked to a personal account, and synced with local machines where desired. I'm not completely familiar with it yet, but I'm getting to like it. Finally, I'm now an avid user and admirer of Dropbox, which is a great document sync tool. If anybody has further experience of any of these tools, or indeed of similar tools I haven't mentioned, then please let me know.

Back to my point, which is that the ability to access personal data, often in a proprietary format, from any computer regardless of operating system and installed programs is very useful. However, the additional functionality afforded by locally running programs (such as the integration between a word processor and a reference manager) means that I'm not going to go completely into the cloud just yet.