This Visualisation

This visualisation allows one to view the past, present, and (estimated) future gender ratio of authors on academic publications listed on PubMed. The four buttons at the top allow subsetting of the data by journal, research discipline, the author's country of affiliation, and position in the author list (where 'overall' includes all authors).

The left-hand plot shows the estimated author gender ratio for each subset of the data (e.g. a journal, or a scientific discipline) in a given year. The year is controlled with the slider. The gender ratio was estimated by fitting a curve to the data, as described in the accompanying paper.

Clicking on a data point in the left-hand plot brings up a curve showing our estimate of the past, present, and future gender ratio, together with the author gender ratio in the data and its 95% confidence limits (shown by the error bars, which can be toggled on or off).
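As a rough illustration of what 95% confidence limits on a gender ratio look like, the sketch below computes a Wilson score interval for the proportion of women among the authors with an assigned gender. This is an assumption made for illustration only: the accompanying paper describes the method actually used, which may differ, and the function name and arguments are hypothetical.

```python
import math


def wilson_interval(n_women, n_total, z=1.96):
    """Approximate 95% Wilson score interval for the proportion n_women / n_total.

    Illustrative only: not necessarily the interval used in the paper.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_women <= n_total:
        raise ValueError("n_women must lie between 0 and n_total")

    p = n_women / n_total
    denom = 1 + z ** 2 / n_total
    centre = (p + z ** 2 / (2 * n_total)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / n_total + z ** 2 / (4 * n_total ** 2)) / denom
    )
    return centre - half_width, centre + half_width


# Hypothetical sample: 50 women among 100 authors with an assigned gender.
lower, upper = wilson_interval(50, 100)
print(f"{lower:.3f} to {upper:.3f}")
```

The interval narrows as the number of authors grows, which is why data points based on small samples have wide error bars.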

Hovering the mouse cursor over a data point shows the sample size in terms of the number of men and women authors, and the number of papers.

The Data

The data were collected by downloading all ~27 million records on PubMed and attempting to identify the gender of the authors by matching their given names against the genderize.io database. We assigned each journal on PubMed to a research discipline, using PubMed's own categorisation scheme where possible, and tried to identify the country in which each author was based from the address they provided. For clarity, the data accessible through this web app are limited to combinations for which we had a sufficiently large sample size in terms of the number of papers (at least 100), years (at least 5), and authors (at least 50 per year for 5 or more years).
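The sample-size thresholds above can be sketched as a simple filter. This is one reading of the criteria, not the code used to build the app: the function name, the input layout (a mapping from year to a pair of paper and author counts), and the choice to apply the paper threshold to the total across years are all assumptions.

```python
def has_sufficient_sample(
    yearly_counts,
    min_papers=100,
    min_years=5,
    min_authors_per_year=50,
):
    """Return True if one subset of the data passes all three thresholds.

    yearly_counts maps a year to (n_papers, n_authors) for that year.
    This layout is hypothetical and chosen only for illustration.
    """
    total_papers = sum(papers for papers, _ in yearly_counts.values())
    n_years = len(yearly_counts)
    # Years in which enough authors had an assigned gender.
    well_sampled_years = sum(
        1 for _, authors in yearly_counts.values() if authors >= min_authors_per_year
    )
    return (
        total_papers >= min_papers
        and n_years >= min_years
        and well_sampled_years >= min_years
    )


# Hypothetical journal with 30 papers and 60 gendered authors in each of 5 years.
example = {year: (30, 60) for year in range(2010, 2015)}
print(has_sufficient_sample(example))
```

A subset that fails any one of the three checks would simply not appear in the app.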

The scripts used to collect, parse and analyse the PubMed data are on GitHub.

The scripts and data underlying this visualisation are here, and the full dataset (2.5 GB) is archived as a SQLite3 file here.
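For readers who download the SQLite3 archive, a minimal way to see what it contains is to list its tables and columns with Python's built-in sqlite3 module. The sketch below makes no assumptions about the table names in the archive; the file name in the usage comment is a placeholder, not the real name of the archived file.

```python
import sqlite3


def list_tables(conn):
    """Return {table_name: [column_name, ...]} for every table in the database."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    schema = {}
    for name in names:
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        quoted = name.replace('"', '""')
        columns = conn.execute(f'PRAGMA table_info("{quoted}")').fetchall()
        schema[name] = [column[1] for column in columns]
    return schema


# Usage (placeholder file name; opened read-only so the archive is not modified):
#   conn = sqlite3.connect("file:downloaded_archive.sqlite3?mode=ro", uri=True)
#   for table, columns in list_tables(conn).items():
#       print(table, columns)
#   conn.close()
```

Inspecting the schema first is sensible with a file of this size, since it lets queries be written against specific tables rather than loading everything into memory.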

Associated publication

This visualisation accompanies an article published in PLOS Biology.

Holman L, Stuart-Fox D, Hauser CE (2018) The gender gap in science: How long until women are equally represented? PLOS Biology 16(4): e2004956. DOI: 10.1371/journal.pbio.2004956.

Contacts

The gender data were collected by Luke Holman from the School of BioSciences at the University of Melbourne. This data visualisation was made by Errol Lloyd.

FAQ

How do I find a particular journal?

First, select the discipline on which the journal focuses. For example, select 'Biology' if you'd like to look up PLOS Biology. Then click 'Journals', and either type the name of the journal into the box or look through the list of journal names. If you don't see the journal, either we didn't recover enough data to calculate its gender ratio accurately, or it is classified under a different discipline.

If you still cannot find the journal, please see S1 Data from the associated PLOS Biology publication. Search for the journal of interest, determine which discipline it is in, and then try the web app again. If you cannot find the journal in S1 Data, we did not recover sufficient data for that journal.

You can also access the raw data used in this web app as a .json file here.

The Gender Gap in Academic Publishing