Sunday, 4 June 2017
Saturday, 3 June 2017
Friday, 2 June 2017
Hacking the principles of #openscience #workshops
In a previous post, I discussed the key elements that really stood out for me in recent workshops associated with open science, data science, and ecology. Summer workshop season is upon us, and here are some principles to consider that can be used to hack a workshop. These hacks can be applied a priori as an instructor or in situ as a participant or instructor by engaging with the context from a pragmatic, problem-solving perspective.
Principles
1. Embrace open pedagogy.
2. Use current best practices from traditional teaching contexts.
3. Be learner centered.
4. Speak less, do more.
5. Solve authentic challenges.
Hacks (for each principle)
1. Prepare learning outcomes for every lesson.
2. Identify solve-a-problem opportunities in advance and be open to ones that emerge organically during the workshop.
3. Use no slide decks. This challenges the instructor to more directly engage with the students and participants in the workshop and leaves space for students to shape content and narrative to some extent. Decks lock all of us in. This is appropriate for some contexts such as conference presentations, but workshops can be more fluid and open.
4. Plan pauses. Prepare your lessons with gaps for contributions. Prepare a list of questions to offer up for every lesson and provide time for discussion of solutions.
5. Use real evidence/data to answer a compelling question (scale can be limited, approach beta as long as an answer is provided, and the challenge can emerge if teaching is open and space provided for the workshop participants to ideate).
A final hack, which is more of a general teaching principle: consider keeping all teaching materials within a single ecosystem that references outwards only as needed. For me, this has become all content prepared in RStudio, knitted to HTML, then pushed to GitHub gh-pages for sharing as a webpage (or site). Participants can then engage with all the content, including code, data, and ideas, in one place.
June Monthly Fantasy Baseball Writer’s Poll
Editor’s Note: The following poll features MLB picks from the Fantrax writers for the month of June. Want to play fantasy baseball using stats for only the month? Now you can with our Fantrax monthly fantasy baseball leagues. Put together a team of 25 players under a salary cap with payouts up to 90%. Deadline is Monday, June 5!
| C Sleeper | Tom Murphy | Tom Murphy | Andrew Knapp | Tom Murphy | Christian Vazquez | Martin Maldonado | Francisco Cervelli | Manny Pina |
| 1B Sleeper | Justin Bour | Matt Adams | Josh Bell | Rhys Hoskins | Jesus Aguilar | Matt Adams | Matt Adams | Yonder Alonso |
| 2B Sleeper | Whit Merrifield | Neil Walker | Chase Utley | Chris Taylor | Whit Merrifield | Yolmer Sanchez | Yolmer Sanchez | Whit Merrifield |
| 3B Sleeper | Alex Bregman | Adrian Beltre | Matt Davidson | Alex Bregman | Jose Reyes | Yangervis Solarte | Adrian Beltre | Wilmer Flores |
| SS Sleeper | Tim Anderson | Tim Anderson | JT Riddle | Jose Peraza | Chad Pinder | Chris Owings | Jose Peraza | Tim Anderson |
| OF Sleeper | Bradley Zimmer | Ben Revere | Cameron Maybin | David Dahl | Franchy Cordero | Kevin Kiermaier | Aaron Hicks | Matt Holliday |
| SP Sleeper | Sean Manaea | Trevor Bauer | Taijuan Walker | Jameson Taillon | Tyson Ross | Sean Manaea | Jameson Taillon | Robert Gsellman |
| RP Sleeper | Ryan Madson | Felipe Rivero | Koda Glover | Corey Knebel | Cam Bedrosian | Cam Bedrosian | Matt Bush | Corey Knebel |
| MiLB Call-Up | Amed Rosario | Yoan Moncada | Lewis Brinson | Austin Meadows | Lucas Giolito | Reynaldo Lopez | Amed Rosario | Lucas Giolito |
| Most HR | Khris Davis | Nelson Cruz | Giancarlo Stanton | Kris Bryant | Aaron Judge | Jose Bautista | Aaron Judge | Aaron Judge |
| Most SB | Billy Hamilton | Billy Hamilton | Billy Hamilton | Billy Hamilton | Billy Hamilton | Jose Peraza | Billy Hamilton | Billy Hamilton |
| SP Lowest ERA | Carlos Martinez | Clayton Kershaw | Max Scherzer | Clayton Kershaw | Clayton Kershaw | Jacob deGrom | Trevor Bauer | Dallas Keuchel |
| SP Most K's | Max Scherzer | Chris Sale | Chris Sale | Chris Sale | Robbie Ray | Stephen Strasburg | Trevor Bauer | Chris Sale |
| RP Most Saves | Craig Kimbrel | Wade Davis | Craig Kimbrel | Ken Giles | Cody Allen | Dellin Betances | Ken Giles | Craig Kimbrel |
| Team Most Wins | Dodgers | Cubs | Red Sox | Astros | Astros | Indians | Yankees | Astros |
| Team Most Losses | Angels | Padres | Royals | Angels | Angels | Giants | Brewers | Phillies |
The post June Monthly Fantasy Baseball Writer’s Poll appeared first on Fantrax.
The code (and other stuff…)
I’ve received a couple of emails or comments on one of the General Election posts to ask me to share the code I’ve used.
In general, I think the code is a bit dirty and lots could be done in a more efficient way. Effectively, I’m doing this out of my own curiosity, and while I think the model is sensible, it’s probably not “publication-standard” (in terms of annotation etc.).
Anyway, I’ve created a (rather plain) GitHub repository, which contains the basic files (including R script, R functions, basic data and JAGS model). Given time (which I’m not given…), I’d like to add a lot more description and perhaps also write a Stan version of the model code. I could also write a more precise model description; I’ll try to update the material on GitHub.
On another note, the previous posts have been syndicated in a couple of places (here and here), which was nice. And finally, here’s a little update with the latest data. As of today, the model predicts the following seat distribution.
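| Party | mean | sd | 2.5% | median | 97.5% |
| --- | --- | --- | --- | --- | --- |
| Conservative | 352.124 | 3.8760350 | 345 | 352 | 359 |
| Labour | 216.615 | 3.8041091 | 211 | 217 | 224 |
| UKIP | 0.000 | 0.0000000 | 0 | 0 | 0 |
| Lib Dem | 12.084 | 1.8752228 | 8 | 12 | 16 |
| SNP | 49.844 | 1.8240041 | 45 | 51 | 52 |
| Green | 0.000 | 0.0000000 | 0 | 0 | 0 |
| PCY | 1.333 | 0.9513233 | 0 | 2 | 3 |
| Other | 0.000 | 0.0000000 | 0 | 0 | 0 |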
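A summary of this kind (mean, sd, 2.5% quantile, median, 97.5% quantile) is what you get by summarising simulated seat counts, one value per posterior draw. The sketch below is only a minimal illustration in Python, not the code from the GitHub repository (which is R and JAGS); the helper names `quantile` and `summarise_draws`, the linear-interpolation quantile rule, and the toy draws are all assumptions made for the example.

```python
import statistics

def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted list (0 <= q <= 1)."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

def summarise_draws(draws):
    """Return mean, sd, 2.5% quantile, median and 97.5% quantile of the draws."""
    ordered = sorted(draws)
    return {
        "mean": statistics.mean(ordered),
        "sd": statistics.stdev(ordered),
        "2.5%": quantile(ordered, 0.025),
        "median": quantile(ordered, 0.5),
        "97.5%": quantile(ordered, 0.975),
    }

# Toy draws for one hypothetical party (made-up numbers, not model output).
toy_draws = [10, 11, 12, 12, 12, 13, 13, 14, 15, 16]
summary = summarise_draws(toy_draws)
print(summary)
```

With real output there would be one such list of draws per party, and thousands of draws rather than ten.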
Labour are still slowly but surely gaining some ground. I’m not sure the effect of the debate earlier this week (which was deserted by the PM) is visible yet, as only a couple of the polls included were conducted after that.
Another interesting thing (following up on this post) is the analysis of the marginal seats that the model predicts to swing from the 2015 winners. I’ve updated the plot, which now looks as below.
Now there are 30 constituencies that are predicted to change hands, many still towards the Tories. I am not a political scientist, so I don’t really know all the ins and outs of these, but I think a couple of examples are quite interesting and I would venture some comment…
So, the model doesn’t know about the recent by-elections of Copeland and Stoke-on-Trent South and so still labels these seats as “Labour” (as they were in 2015), although the Tories now actually control Copeland.
The prediction takes into account the polls and the impact of the EU referendum (both were strong Leave areas, with 60% and 70% of the preference, respectively), and the Tories did well in both seats in 2015 (36% vs Labour’s 42% in Copeland and 33% vs Labour’s 39% in Stoke). So, the model is suggesting that both are likely to switch to the Tories this time around.
In fact, we know that at the time of the by-election, while Copeland (where the contest was mostly Labour v Tories) did go blue, Stoke didn’t. But there, the main battle was between the Labour and UKIP candidates (UKIP had got 21% in 2015). And the by-election was fought last February, when the Tories’ lead was much more robust than it probably is now.
Another interesting area is Twickenham, historically a constituency leaning to the Lib Dems, which was captured by the Conservatives in 2015. But since then, in another by-election, the Tories have lost another similar area (Richmond Park, with a massive swing) and the model is suggesting that Twickenham could follow suit come next Thursday.
Finally, Clacton was the only seat won by UKIP in 2015, but since then, the elected MP (a former Tory-turned-UKIP) has left the party and is not contesting the seat. This, combined with the poor standing of UKIP in the polls, produces the not surprising outcome that Clacton is predicted to go blue with basically no uncertainty…
These results look reasonable to me, though I’m not sure how life will turn out, of course. As many commentators have noted, much may depend on the turnout among younger voters. Or other factors. And probably there’ll be another instance of the “Shy-Tory effect” (I’ll think about this if I get some time before the final prediction). But the model does seem to make some sense…
Thursday, 1 June 2017
Python and R top 2017 KDnuggets rankings
The results of KDnuggets' 18th annual poll of data science software usage are in, and for the first time in three years Python has edged out R as the most popular software. While R increased its share of usage from 45.7% in last year's poll to 52.1% this year, Python's usage among data scientists increased even more, from 36.6% of users in 2016 to 52.6% of users this year.
There were some interesting moves in the long tail, as well. Several tools entered the KDnuggets chart for the first time, including Keras (9.5% of users), PyCharm (9.0%) and Microsoft R Server (4.3%). And several returning tools saw big jumps in usage, including Microsoft Cognitive Toolkit (3.4% of users), TensorFlow (20.2%) and Power BI (10.2%). Microsoft SQL Server increased its share to 11.6% (up from 10.8%), whereas SAS (7.1%) and Matlab (7.4%) saw declines. Julia, somewhat surprisingly, remained flat at 1.1%.
For the complete results and analysis of the 2017 KDnuggets data science software poll, follow the link below.
KDnuggets: New Leader, Trends, and Surprises in Analytics, Data Science, Machine Learning Software Poll
A Primer in Functional Programming in R (Part 2)
In the last exercise set, we saw how powerful functional programming principles can be, how dramatically they can increase the readability of code, and how easy they are to work with. In this set of exercises we will look at functional programming principles with purrr. purrr comes with a number of interesting features and is really useful for writing clean and concise code. Please check the documentation and load the purrr library in your R session before starting this exercise set.
Answers to the exercises are available here
If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page.
Exercise 1
From the airquality dataset (available in base R), find the mean, median and standard deviation of all columns using map functions.
Exercise 2
In the same dataset, find the 95th percentile of each column, excluding the NA values.
Exercise 3
Load the iris dataset. With the help of pipe and map functions, find the mean of the relevant columns. Keep in mind that the mean is meant for numeric columns, so you may need multiple map-like functions. I expect the output as a data frame.
Exercise 4
I have a vector x
Correcting bias in meta-analyses: What not to do (meta-showdown Part 1)
Previous investigations typically looked only at publication bias or questionable research practices (QRPs), but not both, used non-representative study-level sample sizes, or compared only a few bias-correcting techniques rather than all of them. Our goal was to simulate a research literature that is as realistic as possible for psychology. In order to simulate several research environments, we fully crossed five experimental factors: (1) the true underlying effect, δ (0, 0.2, 0.5, 0.8); (2) between-study heterogeneity, τ (0, 0.2, 0.4); (3) the number of studies in the meta-analytic sample, k (10, 30, 60, 100); (4) the percentage of studies in the meta-analytic sample produced under publication bias (0%, 60%, 90%); and (5) the use of QRPs in the literature that produced the meta-analytic sample (none, medium, high).
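To make "fully crossed" concrete, the five factors can be enumerated as a grid. This is only an illustrative sketch in Python, not the project's simulation code; the variable names and dictionary keys are assumptions, while the factor levels are the ones listed above.

```python
from itertools import product

# Factor levels as listed in the text.
true_effects = [0, 0.2, 0.5, 0.8]        # delta
heterogeneities = [0, 0.2, 0.4]          # tau
study_counts = [10, 30, 60, 100]         # k
publication_bias = [0.0, 0.6, 0.9]       # share of studies produced under publication bias
qrp_levels = ["none", "medium", "high"]  # QRP environment

# One dictionary per simulated research environment.
conditions = [
    {"delta": d, "tau": t, "k": k, "pub_bias": pb, "qrp": q}
    for d, t, k, pb, q in product(
        true_effects, heterogeneities, study_counts, publication_bias, qrp_levels
    )
]

print(len(conditions))  # 4 * 3 * 4 * 3 * 3 = 432
```

Each of the 432 entries describes one research environment in which all bias-correcting methods can then be compared.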
This blog post summarizes some insights from our study, internally called “meta-showdown”. Check out the preprint and the interactive app metaExplorer. The fully reproducible and reusable simulation code is on GitHub, and more information is on OSF.
In this blog post, I will highlight some lessons that we learned during the project, primarily focusing on what not to do when performing a meta-analysis.
Limitation of generalizability disclaimer: These recommendations apply to typical sample sizes, effect sizes, and heterogeneities in psychology; other research literatures might have different settings and therefore a different performance of the methods. Furthermore, the recommendations rely on the modeling assumptions of our simulation. We went a long way to make them as realistic as possible, but other assumptions could lead to other results.
Never trust a naive random effects meta-analysis or trim-and-fill (unless you meta-analyze a set of registered reports)
If studies have no publication bias, nothing can beat plain old random effects meta-analysis: it has the highest power, the least bias, and the highest efficiency compared to all other methods. Even in the presence of some (though not extreme) QRPs, naive RE performs better than all other methods. When can we expect no publication bias? If (and, in my opinion only if) we meta-analyze a set of registered reports.
But.
In any other setting except registered reports, a consequential amount of publication bias must be expected. In the field of psychology/psychiatry, more than 90% of all published hypothesis tests are significant (Fanelli, 2011) despite the average power being estimated as around 35% (Bakker, van Dijk, & Wicherts, 2012) – the gap points towards a huge publication bias. In the presence of publication bias, naive random effects meta-analysis and trim-and-fill have false positive rates approaching 100%:
More thoughts about trim-and-fill’s inability to recover δ=0 are in Joe Hilgard’s blog post. (Note: this insight is not really new and has been shown multiple times before, for example by Moreno et al., 2009, and Simonsohn, Nelson, and Simmons, 2014).
Our recommendation: Never trust meta-analyses based on naive random effects and trim-and-fill, unless you can rule out publication bias. Results from previously published meta-analyses based on these methods should be treated with a lot of skepticism.
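Why a naive average goes wrong under publication bias can be seen in a toy simulation. The sketch below is not the meta-showdown code and makes strong simplifying assumptions: every study has the same per-group sample size, the observed effect is drawn from a normal distribution with a large-sample standard error, and only results that are significant in the expected direction survive. The function name and all parameter values are hypothetical.

```python
import random
import statistics

def simulate_literature(n_studies, true_effect, n_per_group, publish_prob_nonsig, rng):
    """Simulate published standardized mean differences under a crude selection rule."""
    # Large-sample approximation of the standard error of d near d = 0.
    standard_error = (2 / n_per_group) ** 0.5
    published = []
    for _ in range(n_studies):
        observed = rng.gauss(true_effect, standard_error)
        # "Significant" here means significant in the expected (positive) direction.
        significant = observed / standard_error > 1.96
        if significant or rng.random() < publish_prob_nonsig:
            published.append(observed)
    return published

rng = random.Random(2017)
unbiased = simulate_literature(5000, 0.0, 30, 1.0, rng)  # everything is published
biased = simulate_literature(5000, 0.0, 30, 0.0, rng)    # only significant results survive

print(round(statistics.mean(unbiased), 3))  # close to the true effect of 0
print(round(statistics.mean(biased), 3))    # clearly above 0 although the true effect is 0
```

In the biased literature every surviving estimate lies above the significance threshold, so any method that simply averages the published studies will report an effect where none exists.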
Do not use p-curve and p-uniform for effect size estimation (under heterogeneity)
As a default, heterogeneity should always be expected – even under the most controlled conditions, where many labs perform the same computer-administered experiment, a large proportion showed significant and substantial amounts of between-study heterogeneity (cf. ManyLabs 1 and 3; see also our supplementary document for more details). p-curve and p-uniform assume homogeneous effect sizes, and their performance is impacted to a large extent by heterogeneity:
As you can see, all other methods retain the nominal false positive rate, but p-curve and p-uniform go through the roof as soon as heterogeneity comes into play (see also McShane, Böckenholt, & Hansen, 2016; van Aert et al., 2016).
Under H1, heterogeneity leads to overestimation of the true effect:
(additional settings for these plots: no QRPs, no publication bias, k = 100 studies, true effect size = 0.5)
Note that in their presentation of p-curve, Simonsohn et al. (2014) emphasize that, in the presence of heterogeneity, p-curve is intended as an estimate of the average true effect size among the studies submitted to p-curve (see here, Supplement 2). p-curve may indeed yield an accurate estimate of the true effect size among the significant studies, but in our view, the goal of bias-correction in meta-analysis is to estimate the average effect of all conducted studies. Of course this latter estimation hinges on modeling assumptions (e.g., that the effects are normally distributed), which can be disputed, and there might be applications where indeed the underlying true effect of all significant studies is more interesting.
Furthermore, as McShane et al (2016) demonstrate, p-curve and p-uniform are constrained versions of the more general three-parameter selection model (3PSM; Iyengar & Greenhouse, 1988). The 3PSM estimates (a) the mean of the true effect, δ, (b) the heterogeneity, τ, and (c) the probability that a non-significant result enters the literature, p. The constraints of p-curve and p-uniform are: 100% publication bias (i.e., p = 0) and homogeneity (i.e., τ = 0). Hence, for the estimation of effect sizes, 3PSM seems to be a good replacement for p-curve and p-uniform, as it makes these constraints testable.
Our recommendation: Do not use p-curve or p-uniform for effect size estimation when heterogeneity can be expected (which is nearly always the case).
Ignore overadjustments in the opposite direction
Many bias-correcting methods are driven by QRPs – the more QRPs, the stronger the downward correction. However, this effect can get so strong that methods overadjust into the opposite direction, even if all studies in the meta-analysis are of the same sign:
Note: You need to set the option “Keep negative estimates” to get this plot.
Our recommendation: Ignore bias-corrected results that go into the opposite direction; set the estimate to zero, do not reject H₀.
Typical small-study effects (e.g., by p-hacking or publication bias) induce a negative correlation between sample size and effect size – the smaller the sample, the larger the observed effect size. PET-PEESE aims to correct for that relationship. In the absence of bias and QRPs, however, random fluctuations can lead to a positive correlation between sample size and effect size, which leads to a PET and PEESE slope of the unintended sign. Without publication bias, this reversal of the slope actually happens quite often.
See for example the next figure. The true effect size is zero (red dot), naive random effects meta-analysis slightly overestimates the true effect (see black dotted triangle), but PET and PEESE massively overadjust towards more positive effects:
PET-PEESE was never intended to correct in the reverse direction. An underlying biasing process would have to systematically remove small studies that show a significant result with larger effect sizes, and keep small studies with non-significant results. In the current incentive structure, I see no reason for such a process.
Our recommendation: Ignore the PET-PEESE correction if it has the wrong sign.
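As PET-PEESE is commonly described, PET is a weighted regression of the observed effect sizes on their standard errors, PEESE uses the squared standard errors as predictor, and in both cases the intercept (the predicted effect at a standard error of zero) is the bias-corrected estimate. The sketch below illustrates that idea and the sign check recommended above. It is a simplified assumption-laden example in Python, not the implementation used in our simulations: the function names are hypothetical, the toy data are made up, and the conditional rule for choosing between PET and PEESE is left out.

```python
def weighted_line_fit(x, y, weights):
    """Weighted least squares fit of y = intercept + slope * x."""
    total = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total
    s_xy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    s_xx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope

def pet(effects, standard_errors):
    """PET: regress effect sizes on standard errors with inverse-variance weights."""
    weights = [1 / se ** 2 for se in standard_errors]
    return weighted_line_fit(standard_errors, effects, weights)

def peese(effects, standard_errors):
    """PEESE: same regression, but with squared standard errors as predictor."""
    weights = [1 / se ** 2 for se in standard_errors]
    return weighted_line_fit([se ** 2 for se in standard_errors], effects, weights)

def pet_with_sign_check(effects, standard_errors):
    """Return the PET intercept, or None if the slope has the unintended sign."""
    intercept, slope = pet(effects, standard_errors)
    if slope < 0:
        return None  # smaller studies show smaller effects: ignore the correction
    return intercept

# Made-up toy data: four studies with increasing standard errors.
standard_errors = [0.1, 0.2, 0.3, 0.4]
typical = [0.2, 0.3, 0.4, 0.5]       # smaller studies show larger effects
reversed_ = [0.35, 0.30, 0.25, 0.20]  # smaller studies show smaller effects

print(pet_with_sign_check(typical, standard_errors))
print(pet_with_sign_check(reversed_, standard_errors))
```

In the first toy data set the slope is positive and the intercept is pulled below every observed effect; in the second the slope is negative, so the intercept would land above every observed effect, which is exactly the reverse correction that should be ignored.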
PET-PEESE sometimes overestimates, sometimes underestimates
A bias can be more easily accepted if it always is conservative – then one could conclude: “This method might miss some true effects, but if it indicates an effect, we can be quite confident that it really exists”. Depending on the conditions (i.e., how much publication bias, how many QRPs, etc.), however, PET/PEESE sometimes shows huge overestimation and sometimes huge underestimation.
For example, with no publication bias, some heterogeneity (τ=0.2), and severe QRPs, PET/PEESE underestimates the true effect of δ = 0.5:
In contrast, if no effect exists in reality, but there is strong publication bias, large heterogeneity and no QRPs, PET/PEESE overestimates a lot:
In fact, the distribution of PET/PEESE estimates looks virtually identical for these two examples, although the underlying true effect is δ = 0.5 in the upper plot and δ = 0 in the lower plot. Furthermore, note the huge spread of PET/PEESE estimates (the error bars visualize the 95% quantiles of all simulated replications): Any single PET/PEESE estimate can be very far off.
Our recommendation: As one cannot know the condition of reality, it is safest not to use PET/PEESE at all.
Recommendations in a nutshell: What you should not use in a meta-analysis
Again, please consider the “limitations of generalizability” disclaimer above.
- When you can exclude publication bias (i.e., in the context of registered reports), do not use bias-correcting techniques. Even in the presence of some QRPs they perform worse than plain random effects meta-analysis.
- In any other setting except registered reports, expect publication bias, and do not use random effects meta-analysis or trim-and-fill. Both will give you a 100% false positive rate in typical settings, and a biased estimation.
- Under heterogeneity, p-curve and p-uniform overestimate the underlying effect and have false positive rates >= 50%.
- Even if all studies entering a meta-analysis point into the same direction (e.g., all are positive), bias-correcting techniques sometimes overadjust and return a significant estimate of the opposite direction. Ignore these results, set the estimate to zero, do not reject H₀.
- Sometimes PET/PEESE adjust into the wrong direction (i.e., increasing the estimated true effect size).
As with any general recommendations, there might be good reasons to ignore them.
Additional technical recommendations
- The p-uniform package (v. 0.0.2) very rarely does not provide a lower CI. In this case, ignore the estimate.
- Do not run p-curve or p-uniform on
A Partial Remedy to the Reproducibility Problem
Several years ago, John Ioannidis jolted the scientific establishment with an article titled, “Why Most Published Research Findings Are False.” He had concerns about inattention to statistical power, multiple inference issues and so on. Most people had already been aware of all this, of course, but that conversation opened the floodgates, and many more issues were brought up, such as hidden lab-to-lab variability. In addition, there is the occasional revelation of outright fraud.
Many consider the field to be at a crisis point.
At the 2014 JSM, Phil Stark organized a last-minute session on the issue, including Marcia McNutt, former editor of Science, and Yoav Benjamini of multiple inference methodology fame. The session attracted a standing-room-only crowd.
In this post, Reed Davis and I are releasing the prototype of an R package that we are writing, revisit, with the goal of partially remedying the statistical and data wrangling aspects of this problem. It is assumed that the authors of a study have supplied (possibly via carrots or sticks) not only the data but also the complete code for their analyses, from data cleaning up through formal statistical analysis.
There are two main aspects:
- The package allows the user to “replay” the authors’ analysis, and most importantly, explore other alternate analyses that the authors may have overlooked. The various alternate analyses may be saved for sharing.
- Warn of statistical errors, such as: overreliance on p-values; need for multiple inference procedures; possible distortion due to outliers; etc.
The term user here could refer to several different situations:
- The various authors of a study, collaborating and trying different analyses during the course of the study.
- Reviewers of a paper submitted for publication on the results of the study.
- Fellow scientists who wish to delve further into the study after it is published.
The package has text and GUI versions. The latter is currently implemented as an RStudio add-in.
The package is on my GitHub site, and has a fairly extensive README file introducing the goals and usage.










