Using PLINK's Fisher's exact test to associate genotypes with phenotypes (GWAS)

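I recently read the paper Genetic subdivision and candidate genes under selection in North American grey wolves, which uses the coat colour of 33 wolves as the phenotype for an association analysis against genotypes.
The Methods section says: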
To test for associations between SNPs near coat colour genes and phenotypic variation within our samples, we performed a case/control association test using both the Fisher’s exact test for allelic association (--fisher) and the full model testing for differences in any genotypes, with permutations for assigning significance (--model --cell 0 --perm) within PLINK
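The recently published pan-genome paper on cabbage (Brassica oleracea) also ran its association analysis this way. The paper:
Large-scale gene expression alterations introduced by structural variation drive morphotype diversification in Brassica oleracea
The Methods section says: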
We adopted the case–control GWAS strategy, which was widely used in disease gene mapping for humans [30,31], to identify SVs that were substantially associated with different morphotypes of B. oleracea. Briefly, a GWAS analysis was performed between the case group (individuals belonging to a specific morphotype) and the control group (individuals belonging to all the other morphotypes). Significance was tested by a two-tailed Fisher’s exact test and adjusted by Bonferroni correction.
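The one-versus-rest design in this quote can be sketched in Python roughly as follows. The sample names, morphotype labels, and the number of tests are made up for illustration. The 1/2 coding follows PLINK's case/control convention (1 = control, 2 = case), and the Bonferroni cutoff is simply the significance level divided by the number of tests.

```python
def one_vs_rest(morphotypes, case_label):
    """Code samples as 2 (case) if they carry case_label, else 1 (control)."""
    return {sample: 2 if label == case_label else 1
            for sample, label in morphotypes.items()}


def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


# Hypothetical samples and morphotypes, for illustration only.
morphotypes = {"s1": "cabbage", "s2": "cauliflower", "s3": "cabbage", "s4": "kale"}
pheno = one_vs_rest(morphotypes, "cabbage")

# Hypothetical number of tested variants.
cutoff = bonferroni_threshold(0.05, 10000)

print(pheno)
print(cutoff)
```

Repeating `one_vs_rest` once per morphotype gives one case/control phenotype per morphotype, each of which can be tested separately.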
A rice pan-genome paper uses the same approach. The paper:
Long-read sequencing of 111 rice genomes reveals significantly larger pan-genomes
The Methods section says:
Fisher’s exact test was used to detect gene PAV-discrete phenotype associations, and the Wilcoxon rank-sum test was used to detect gene PAV-continuous phenotype associations in R v4.0.2. P-values were adjusted using the FDR method, and a threshold of FDR < 0.05 was used to claim a significant gene PAV-phenotype association.
This one runs the association on the gene presence/absence variation (PAV) matrix rather than on SNPs.
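The FDR adjustment mentioned in that quote can be sketched as below. This assumes "the FDR method" means the Benjamini-Hochberg procedure (in R, `p.adjust` treats "fdr" as an alias for "BH"); the paper does not spell this out, and the p-values here are invented for illustration.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values, one per gene PAV.
pvals = [0.001, 0.04, 0.03, 0.5]
adj = bh_adjust(pvals)

# Indices of tests passing the FDR < 0.05 threshold used in the paper.
significant = [i for i, q in enumerate(adj) if q < 0.05]
print(adj)
print(significant)
```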
I searched online for a tutorial on running this analysis with plink and found this one:
https://www.staff.ncl.ac.uk/heather.cordell/mres2020casecon.html
The following link is also a useful reference:
https://cloufield.github.io/GWASTutorial/06_Association_tests/#significant-loci

The input data is a ped file.

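The columns of a ped file: the first six are family ID, individual ID, paternal ID, maternal ID, sex, and phenotype (for case/control data, 1 = control and 2 = case).
Genotype data start at column 7, with two columns (one per allele) for each variant. The example data has 4 variants, so there are 8 genotype columns.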
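To make the layout concrete, here is a small Python sketch that splits one ped line into its fields. The first six fields follow the standard PED layout (family ID, individual ID, father, mother, sex, phenotype), and the remaining columns are paired into one genotype per variant. The toy line is invented for illustration and is not taken from the tutorial data.

```python
from typing import NamedTuple


class PedRecord(NamedTuple):
    fid: str
    iid: str
    father: str
    mother: str
    sex: str
    phenotype: str
    genotypes: list  # one (allele1, allele2) tuple per variant


def parse_ped_line(line):
    """Split a whitespace-delimited ped line into a PedRecord."""
    fields = line.split()
    alleles = fields[6:]  # genotype data start at column 7
    if len(alleles) % 2 != 0:
        raise ValueError("genotype columns must come in pairs")
    genotypes = [(alleles[i], alleles[i + 1]) for i in range(0, len(alleles), 2)]
    return PedRecord(*fields[:6], genotypes)


# Hypothetical individual with 4 variants (8 allele columns); 0 = missing allele.
toy_line = "FAM1 IND1 0 0 1 2 1 1 1 2 2 2 0 0"
rec = parse_ped_line(toy_line)
print(rec.iid, rec.phenotype, len(rec.genotypes))
```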
A map file is also required.

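The map format: one line per variant with four columns (chromosome, variant ID, genetic distance, and base-pair position), listed in the same order as the genotype column pairs in the ped file.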
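A quick sanity check before running plink is that the map file has exactly one line per genotype column pair in the ped file. The sketch below assumes the standard four-column map layout (chromosome, variant ID, genetic distance, base-pair position) and uses invented toy data.

```python
def parse_map_line(line):
    """Parse one four-column map line into a dict."""
    chrom, snp_id, cm, bp = line.split()
    return {"chrom": chrom, "snp": snp_id, "cm": float(cm), "bp": int(bp)}


def n_loci_in_ped_line(line):
    """Number of variants encoded in a ped line (two allele columns each)."""
    return (len(line.split()) - 6) // 2


# Hypothetical map and ped content, for illustration only.
toy_map = ["1 snp1 0 1000", "1 snp2 0 2000", "2 snp3 0 500", "2 snp4 0 900"]
toy_ped_line = "FAM1 IND1 0 0 1 2 1 1 1 2 2 2 0 0"

markers = [parse_map_line(line) for line in toy_map]
consistent = len(markers) == n_loci_in_ped_line(toy_ped_line)
print(consistent)
```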

The plink command for Fisher's exact test:
plink --ped caseconped.txt --map caseconmap.txt --fisher
Output file (plink.assoc.fisher by default, since no --out prefix is given)
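For intuition about what the allelic test does, here is a minimal Python sketch: count the copies of one allele versus all other alleles in cases and in controls, then run a two-sided Fisher's exact test on the resulting 2x2 table. This is the textbook definition of the test, not PLINK's actual code, so details such as tie handling may differ slightly. The genotypes are invented for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    lo = max(0, col1 - (n - row1))
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def allele_counts(genotypes, allele):
    """Count copies of `allele` and of all other non-missing alleles."""
    target = other = 0
    for pair in genotypes:
        for x in pair:
            if x == "0":  # 0 is the missing-allele code in ped files
                continue
            if x == allele:
                target += 1
            else:
                other += 1
    return target, other


# Hypothetical genotypes at one variant for 4 cases and 4 controls.
cases = [("1", "1"), ("1", "2"), ("1", "1"), ("1", "1")]
controls = [("2", "2"), ("1", "2"), ("2", "2"), ("2", "2")]

a, b = allele_counts(cases, "1")     # allele 1 vs other alleles in cases
c, d = allele_counts(controls, "1")  # allele 1 vs other alleles in controls
p = fisher_exact_two_sided(a, b, c, d)
print((a, b, c, d), p)
```

Because the test works on allele counts, each individual contributes two observations to the table.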

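Converting a vcf file to ped and map
The phenotype file has 3 columns: family ID, individual ID, and phenotype. For the first two columns you can simply use the sample ID from the vcf file in both. Columns are separated by a tab or a space.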
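The phenotype file can be generated from the vcf header, for example with a Python sketch like the one below. The toy vcf lines and the phenotype dictionary are invented for illustration. Reusing the sample ID as both family ID and individual ID assumes plink itself derives both IDs from the vcf sample name, which should be checked against the generated ped file (sample names containing underscores may be split differently).

```python
def vcf_sample_ids(vcf_lines):
    """Return the sample IDs from the #CHROM header line of a VCF."""
    for line in vcf_lines:
        if line.startswith("#CHROM"):
            # The first 9 columns are fixed VCF fields; samples follow.
            return line.rstrip("\n").split("\t")[9:]
    raise ValueError("no #CHROM header line found")


def pheno_rows(sample_ids, phenotypes):
    """Build tab-separated 'FID IID phenotype' rows, with FID = IID = sample ID."""
    return ["\t".join([sid, sid, str(phenotypes[sid])]) for sid in sample_ids]


# Hypothetical VCF content and case/control labels (1 = control, 2 = case).
toy_vcf = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsampleA\tsampleB\n",
    "chr1\t100\tsv1\tA\tT\t.\tPASS\t.\tGT\t0/1\t1/1\n",
]
toy_pheno = {"sampleA": 2, "sampleB": 1}

rows = pheno_rows(vcf_sample_ids(toy_vcf), toy_pheno)
print("\n".join(rows))
```

Writing `rows` to a text file, one row per line, gives a file in the three-column layout described above.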

plink --vcf ../rMVP/smoove_filtered.vcf --pheno pheno.txt --recode12 --allow-extra-chr --allow-no-sex --out smoove

You are welcome to follow my WeChat official account:
小明的数据分析笔记本

