Last updated: 2023-09-08


Knit directory: fitnessGWAS/

This reproducible R Markdown analysis was created with workflowr (version 1.7.0.4). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 617e85d. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .DS_Store
    Ignored:    .Rapp.history
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    .httr-oauth
    Ignored:    .pversion
    Ignored:    analysis/.DS_Store
    Ignored:    code/.DS_Store
    Ignored:    code/Drosophila_GWAS.Rmd
    Ignored:    data/.DS_Store
    Ignored:    data/derived/
    Ignored:    data/input/.DS_Store
    Ignored:    data/input/.pversion
    Ignored:    data/input/dgrp.fb557.annot.txt
    Ignored:    data/input/dgrp2.bed
    Ignored:    data/input/dgrp2.bim
    Ignored:    data/input/dgrp2.fam
    Ignored:    data/input/huang_transcriptome/
    Ignored:    figures/.DS_Store

Untracked files:
    Untracked:  old_analyses/

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/index.Rmd) and HTML (docs/index.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File Version Author Date Message
Rmd 617e85d lukeholman 2023-09-08 workflowr::wflow_publish("analysis/index.Rmd")
html 9281881 lukeholman 2023-09-08 Build site.
Rmd 7d5fa34 lukeholman 2023-09-08 workflowr::wflow_publish("analysis/index.Rmd")
html 9d18b69 lukeholman 2023-05-15 Build site.
Rmd 79f4aa3 lukeholman 2023-05-15 wflow_publish("analysis/index.Rmd")
html 0ddcb5b lukeholman 2023-05-15 Build site.
Rmd 9fb232d lukeholman 2023-05-15 wflow_publish("analysis/index.Rmd")
html 0ddcf25 lukeholman 2023-04-16 Build site.
Rmd 67235d7 lukeholman 2023-04-16 wflow_publish("analysis/plot_models_variant_effects.Rmd")
html b41137d lukeholman 2023-04-16 Build site.
Rmd 41d5725 lukeholman 2023-04-16 wflow_publish("analysis/index.Rmd")
html 71237e8 lukeholman 2023-03-21 Build site.
Rmd be63a3f lukeholman 2023-03-21 wflow_publish("analysis/index.Rmd")
html 79db4a4 lukeholman 2023-03-21 Build site.
Rmd 24aedf3 lukeholman 2023-03-21 wflow_publish("analysis/index.Rmd")
html 3c8a5db lukeholman 2023-03-21 Build site.
Rmd 8027f80 lukeholman 2023-03-21 wflow_publish("analysis/index.Rmd")
html 3174ec1 lukeholman 2023-03-14 Build site.
Rmd d94e22d lukeholman 2023-03-14 New quant gen and other changes
html ea38a0e lukeholman 2022-07-29 Build site.
Rmd 5795d89 lukeholman 2022-07-29 wflow_publish("analysis/index.Rmd")
html f80d48e lukeholman 2022-07-29 Build site.
Rmd b483da1 lukeholman 2022-07-29 wflow_publish("analysis/index.Rmd")
html 434c1d7 lukeholman 2022-07-29 Build site.
Rmd 13fc628 lukeholman 2022-07-29 wflow_publish("analysis/index.Rmd")
html 1d52711 lukeholman 2022-07-29 Build site.
Rmd 8b7b14d lukeholman 2022-07-29 wflow_publish("analysis/index.Rmd")
html c10e53e lukeholman 2022-07-29 Build site.
Rmd 990336e lukeholman 2022-07-29 wflow_publish("analysis/index.Rmd")
html f04d7d3 lukeholman 2021-11-10 Build site.
html 7449a90 lukeholman 2021-10-01 Build site.
Rmd 01226ab lukeholman 2021-10-01 wflow_publish("analysis/*")
html 6494956 lukeholman 2021-09-26 Build site.
Rmd 3cbeccb lukeholman 2021-09-26 Commit Sept 2021
html a7065f6 lukeholman 2021-09-26 Build site.
Rmd 44c520d lukeholman 2021-09-26 Commit Sept 2021
html 8d14298 lukeholman 2021-09-26 Build site.
Rmd af15dd6 lukeholman 2021-09-26 Commit Sept 2021
html a50524c lukeholman 2021-03-04 Build site.
Rmd 3f847f8 lukeholman 2021-03-04 big first commit 2021
html f5c3861 lukeholman 2021-03-04 Build site.
Rmd 937e1ee lukeholman 2021-03-04 big first commit 2021
html 871ae81 lukeholman 2021-03-04 Build site.
html e112260 lukeholman 2021-03-04 Build site.
html 836a780 lukeholman 2021-03-04 Build site.
html 359ff37 lukeholman 2021-03-04 Build site.
html 5506c4b lukeholman 2021-03-04 Build site.
Rmd 390a393 lukeholman 2021-03-04 big first commit 2021
Rmd 8d54ea5 Luke Holman 2018-12-23 Initial commit
html 8d54ea5 Luke Holman 2018-12-23 Initial commit

This website relates to the paper ‘Pleiotropic fitness effects across sexes and ages in the Drosophila genome and transcriptome’ by Wong and Holman, published in 2023 in the journal Evolution.

Click the headings below to see code, results, plots, tables and figures from this study.

1. Setting up a database to hold variant/gene annotations and GWAS results

This script creates a SQLite3 database holding two tables: one with annotations for each SNP/indel variant (created by the Mackay lab), and one with annotations for each gene (from AnnotationHub). In step 5 below, we also add the GWAS results to this database, allowing memory-efficient handling of the results.
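
As a rough illustration of the two-table design described above, the following Python sketch builds a small SQLite database and joins variant annotations to gene annotations inside the database. The study's own scripts are written in R; the table names, column names, and rows here are made-up placeholders, not the real schema or data.

```python
import sqlite3

# Hypothetical schema and toy rows: none of these names or values come
# from the study's actual database.
con = sqlite3.connect(":memory:")  # pass a file path for an on-disk database
con.executescript("""
    CREATE TABLE variants (
        variant_id TEXT PRIMARY KEY,
        fbid       TEXT,   -- gene the variant is annotated to
        site_class TEXT    -- e.g. INTRON, SYNONYMOUS_CODING
    );
    CREATE TABLE genes (
        fbid      TEXT PRIMARY KEY,
        gene_name TEXT
    );
""")
con.executemany("INSERT INTO variants VALUES (?, ?, ?)", [
    ("2L_1000_SNP", "FBgn0000001", "INTRON"),
    ("2L_2000_SNP", "FBgn0000002", "SYNONYMOUS_CODING"),
])
con.executemany("INSERT INTO genes VALUES (?, ?)", [
    ("FBgn0000001", "geneA"),
    ("FBgn0000002", "geneB"),
])
con.commit()

# The join and filter run inside SQLite, so only matching rows are
# loaded into memory.
rows = con.execute("""
    SELECT v.variant_id, g.gene_name, v.site_class
    FROM variants AS v
    JOIN genes AS g ON g.fbid = v.fbid
    WHERE v.site_class = 'INTRON'
""").fetchall()
print(rows)
```

Keeping large result tables in the database and querying only the rows needed is what makes this kind of setup memory-efficient.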

2. Estimating line mean fitness using Bayesian models

This script uses Bayesian mixed models implemented in the package brms to estimate the line means for our four fitness traits, while imputing missing values and adjusting for block effects.

3. Tables of the raw data and estimated line means

To facilitate data re-use, we here provide tables showing the raw data (i.e. the measurements of male and female fitness that were collected on each individual replicate vial), as well as the estimated line means that were calculated in step 2 (Estimating line mean fitness using Bayesian models).

4. Calculating quantitative genetic parameters

We first present a table showing the proportion of variance in fitness explained by ‘DGRP line’, which approximates heritability. We then estimate the correlations among line means for the four fitness traits, which approximate genetic correlations.
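
To make the two quantities above concrete, here is a minimal Python sketch using toy numbers. It swaps in a simple one-way ANOVA (method-of-moments) estimator for the proportion of variance among lines, and a plain Pearson correlation of line means; it is not the model used in the study. All function names and data are hypothetical, and a balanced design is assumed.

```python
from statistics import mean

def variance_explained_by_line(data):
    """One-way ANOVA estimate of the share of variance among lines.

    `data` maps a line name to its replicate measurements. Assumes a
    balanced design (the same number of replicates for every line).
    """
    groups = list(data.values())
    k = len(groups)       # number of lines
    n = len(groups[0])    # replicates per line
    grand = mean(x for g in groups for x in g)
    ms_between = n * sum((mean(g) - grand) ** 2 for g in groups) / (k - 1)
    ms_within = sum((x - mean(g)) ** 2 for g in groups for x in g) / (k * (n - 1))
    var_line = max((ms_between - ms_within) / n, 0.0)  # truncate at zero
    return var_line / (var_line + ms_within)

def pearson(x, y):
    """Pearson correlation between two equal-length lists of line means."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Toy data: three lines, two replicate measurements each.
female = {"line_1": [1.0, 1.2], "line_2": [2.0, 2.2], "line_3": [3.0, 3.2]}
prop_line = variance_explained_by_line(female)

# Toy line means for two traits measured on the same three lines.
r = pearson([1.1, 2.1, 3.1], [0.9, 2.0, 3.4])
print(round(prop_line, 3), round(r, 3))
```

Because the DGRP lines are inbred, variance among lines is largely genetic, which is why the among-line share of variance is read as an approximation of heritability.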

5. Running the GWAS

This script first performs quality control and imputation on the dataset of SNPs and indels for the DGRP (e.g. filtering by minor allele frequency, MAF). Second, it runs mixed model association tests on our four phenotypes using the software GEMMA. Third, it groups SNPs/indels that are in complete linkage disequilibrium (LD) in our sample of DGRP lines. Fourth, it uses PLINK to identify an LD-pruned subset of SNPs that are in approximate linkage equilibrium, for use in downstream analyses.
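
The first and third steps above can be sketched in a few lines of Python. This is an illustration only, not the study's pipeline (which uses R, GEMMA, and PLINK): it assumes fully imputed genotypes coded 0 (reference) or 1 (alternate) for each inbred line, and the function names, the `min_maf` threshold, and the toy variants are all hypothetical.

```python
from collections import defaultdict

def minor_allele_frequency(genotypes):
    """MAF for one variant; genotypes are 0 (ref) or 1 (alt) per inbred line."""
    p = sum(genotypes) / len(genotypes)
    return min(p, 1 - p)

def group_complete_ld(variants, min_maf=0.05):
    """Drop variants below `min_maf`, then group those in complete LD.

    Two biallelic variants are in complete LD in the sample when their
    genotype vectors are identical or exact mirror images of each other.
    """
    groups = defaultdict(list)
    for name, g in variants.items():
        if minor_allele_frequency(g) < min_maf:
            continue
        flipped = tuple(1 - x for x in g)
        key = min(tuple(g), flipped)  # same key for a vector and its mirror
        groups[key].append(name)
    return sorted(groups.values())

# Toy genotypes for six lines (made-up values).
variants = {
    "snp_a": [0, 0, 1, 1, 0, 1],
    "snp_b": [0, 0, 1, 1, 0, 1],  # identical to snp_a
    "snp_c": [1, 1, 0, 0, 1, 0],  # mirror image of snp_a
    "snp_d": [0, 1, 0, 1, 0, 1],
    "snp_e": [0, 0, 0, 0, 0, 0],  # monomorphic, removed by the MAF filter
}
ld_groups = group_complete_ld(variants, min_maf=0.1)
print(ld_groups)
```

Variants in the same group cannot be distinguished statistically in this sample, so an association test only needs to be interpreted once per group.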

6. Applying adaptive shrinkage to the GWAS results

This script uses the R package mashr to perform multivariate adaptive shrinkage on the results of the GWAS, for an LD-pruned subset of loci. This produces corrected estimates of each SNP’s effect size, and allows estimation of the frequencies of different types of loci (e.g. sexually- or age-antagonistic loci).

7. Running the TWAS and applying mashr to the TWAS results

This script uses the transcriptomic data on the DGRP from Huang et al. (2015, PNAS) to run a ‘transcriptome-wide association study’ (TWAS), and then applies mashr to the TWAS results.

Plots, tables, and analyses of the results

8. Plots showing line mean fitness

This script plots the estimated line means for each of the four fitness metrics, i.e. Figure 1 in the paper.

9. Tables of GWAS results

This script presents a searchable HTML table showing a list of significant SNPs and indels from the GWAS, with annotations, effect sizes, and \(p\)-values for each.

10. Tables of TWAS results

This script presents a searchable HTML table showing a list of significant transcripts from the TWAS, with annotations, effect sizes, and \(p\)-values for each.

11. Plots and statistical analyses

Here, we present various plots and statistical analyses of the GWAS and TWAS results, specifically:

  • Hex bin plots showing the correlations in effect sizes from GWAS across the 4 fitness traits
  • A statistical test showing that minor alleles tend to be associated with lower fitness
  • A plot inspired by Boyle et al. 2017 (“An expanded view of complex traits: from polygenic to omnigenic”, Cell) illustrating that fitness is highly polygenic (‘omnigenic’) and pleiotropy between male and female fitness is common.
  • An analysis of mutation load, testing whether the number of putatively harmful alleles present in each DGRP line correlates with fitness.
  • Bar plots showing an estimate of the frequencies of sexually antagonistic and sexually concordant SNPs/indels from GWAS, and transcripts from TWAS (plus the same for age concordant vs age antagonistic variants/transcripts)
  • Statistics indicating that the frequency of sexually antagonistic loci declines with age
  • Plots of the evidence ratio (ER) for each locus and transcript, where the ER compares the evidence for two hypotheses: 1) the locus or transcript is concordant between sexes or ages, and 2) it is antagonistic between sexes or ages. These plots avoid imposing a binary distinction between antagonistic and concordant loci when, in reality, there is a continuum of evidence.
  • Statistical models of the evidence ratios, showing inter alia that candidate sexually antagonistic alleles tend to be more common and to be enriched on the X chromosome.
  • GO enrichment of candidate sexually antagonistic transcripts.
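
The evidence ratio mentioned in the list above can be illustrated with a short Python sketch. It assumes, hypothetically, that an upstream model supplies each locus with a posterior probability of being concordant and a posterior probability of being antagonistic, and that the ratio is taken as concordant over antagonistic. The exact definition and direction used in the study may differ, and the function names are placeholders.

```python
import math

def evidence_ratio(p_concordant, p_antagonistic, eps=1e-12):
    """Ratio of evidence for concordance versus antagonism at one locus.

    Inputs are posterior probabilities assumed to come from an upstream
    model; `eps` guards against division by zero.
    """
    return (p_concordant + eps) / (p_antagonistic + eps)

def log2_evidence_ratio(p_concordant, p_antagonistic):
    """Symmetric scale: positive favours concordance, negative antagonism."""
    return math.log2(evidence_ratio(p_concordant, p_antagonistic))

# Toy posterior probabilities for three loci (made-up values).
loci = {
    "locus_1": (0.8, 0.1),   # evidence favours concordance
    "locus_2": (0.3, 0.3),   # no evidence either way
    "locus_3": (0.05, 0.4),  # evidence favours antagonism
}
log_ers = {name: log2_evidence_ratio(pc, pa) for name, (pc, pa) in loci.items()}
for name, value in log_ers.items():
    print(name, round(value, 2))
```

Working on the log scale keeps the measure continuous and symmetric around zero, which matches the point made above that the evidence forms a continuum and need not be forced into two categories.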