Pangenome analysis reveals structural variation associated with seed size and weight traits in peanut
Introduction
Peanut (Arachis hypogaea L.) is a significant oilseed and food legume crop, and seed size and weight are critical traits for domestication and breeding. This page provides the code and result data for the article (mirror URL).
Code
The main pipelines and custom scripts have been uploaded to GitHub.
Link: https://github.com/SJTU-CGM/PeanutPan
Result data
Genome sequences of newly sequenced peanut accessions
Sample | Genome sequence | Gene annotation |
---|---|---|
Adu | Adu.fa.gz | Adu.gff3.gz |
Amon | Amon.fa.gz | Amon.gff3.gz |
H16-5 | H16-5.fa.gz | H16-5.gff3.gz |
mH8 | mH8.fa.gz | mH8.gff3.gz |
NDH108 | NDH108.fa.gz | NDH108.gff3.gz |
ZP06 | ZP06.fa.gz | ZP06.gff3.gz |
- Genome sequence: Long reads (ONT ultra-long / PacBio HiFi) were corrected and assembled into contigs using NextDenovo. The contigs were then polished with Racon and NextPolish using the long reads, and finally clustered, ordered, and oriented into chromosome-scale scaffolds with LACHESIS.
- Gene annotation: Genes were predicted using GeMoMa, PASA, Augustus, and EVidenceModeler, combining de novo, transcript-based, and homologous protein-based strategies.
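After downloading one of the assemblies above, a quick sanity check is to count its sequences and compute the total length and N50. The following is a minimal sketch using only the Python standard library; the function names (`read_fasta_lengths`, `n50`, `fasta_stats`) are illustrative and not part of the PeanutPan repository, and the file is assumed to be a gzip-compressed FASTA in the current directory.

```python
import gzip
from typing import Dict, Iterable, List


def read_fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {sequence name: length} from FASTA-formatted lines."""
    lengths: Dict[str, int] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            header = line[1:].split()
            name = header[0] if header else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def n50(lengths: List[int]) -> int:
    """Largest length L such that sequences of length >= L cover half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty input


def fasta_stats(path: str) -> Dict[str, int]:
    """Summarise a gzip-compressed FASTA file (path is assumed to exist locally)."""
    with gzip.open(path, "rt") as handle:
        values = list(read_fasta_lengths(handle).values())
    return {"sequences": len(values), "total_bp": sum(values), "n50": n50(values)}
```

For example, `fasta_stats("Adu.fa.gz")` would summarise the Adu assembly once the file has been downloaded.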
Pangenome construction of peanut accessions
Dataset | File type | Description | File Link |
---|---|---|---|
SVAss | Raw VCF | Variants from SVAss; SVs from 8 peanut accessions merged using SURVIVOR | P8.SVAss.SURVIVOR.vcf.gz |
SVAssRead | Raw VCF | Variants from SVAssRead; SVs from 8 peanut accessions merged using SURVIVOR | P8.SVAssRead.SURVIVOR.vcf.gz |
MC | Raw VCF | Variants from MC (Minigraph-Cactus), constructed directly from 8 peanut accessions, including small variants | P8.MC.raw.vcf.gz |
MC | Raw GFA | Variant graph from MC (Minigraph-Cactus), constructed directly from 8 peanut accessions, including small variants | P8.MC.raw.gfa.gz |
SVAss | Processed VCF | PanGenie-ready variants from SVAss, prepared with the PanGenie preparation pipeline and annotated with VCFanno | P8.SVAss.PangenieReady.VCFanno.vcf.gz |
SVAssRead | Processed VCF | PanGenie-ready variants from SVAssRead, prepared with the PanGenie preparation pipeline and annotated with VCFanno | P8.SVAssRead.PangenieReady.VCFanno.vcf.gz |
MC | Processed VCF | PanGenie-ready variants from MC, prepared with the PanGenie preparation pipeline and annotated with VCFanno | P8.MC.PangenieReady.VCFanno.vcf.gz |
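To get a first overview of any of the VCF files listed above, the variant types can be tallied record by record. The sketch below is illustrative only: it assumes that SV records carry an `SVTYPE` key in the INFO column (as SURVIVOR output typically does) and otherwise falls back to comparing REF and ALT lengths. The helper names are hypothetical and not taken from the PeanutPan scripts.

```python
import gzip
from collections import Counter
from typing import Iterable


def classify_record(ref: str, alt: str, info: str) -> str:
    """Classify one ALT allele, preferring the INFO SVTYPE tag when present."""
    for field in info.split(";"):
        if field.startswith("SVTYPE="):
            return field.split("=", 1)[1]
    # Symbolic or breakend alleles without SVTYPE cannot be sized from REF/ALT.
    if alt.startswith("<") or "[" in alt or "]" in alt:
        return "SYMBOLIC"
    if len(ref) == len(alt):
        return "SNV" if len(ref) == 1 else "MNV"
    return "INS" if len(alt) > len(ref) else "DEL"


def count_variant_types(lines: Iterable[str]) -> Counter:
    """Count variant types over VCF lines (header lines are skipped)."""
    counts: Counter = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        ref, alts, info = cols[3], cols[4], cols[7]
        for alt in alts.split(","):
            counts[classify_record(ref, alt, info)] += 1
    return counts


def count_variant_types_in_file(path: str) -> Counter:
    """Convenience wrapper for a gzip-compressed VCF on local disk."""
    with gzip.open(path, "rt") as handle:
        return count_variant_types(handle)
```

For example, `count_variant_types_in_file("P8.SVAss.SURVIVOR.vcf.gz")` would return a `Counter` keyed by variant type once the file has been downloaded.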
Variants and genotypes of approximately 269 resequenced peanut accessions
Variant type | Sub-genome | Filtered variant |
---|---|---|
SNP | A | AA_merge.snp.MM05MAF005.vcf.gz |
SNP | B | BB_merge.snp.MM05MAF005.vcf.gz |
SNP | A&B (tetraploid only) | AABB_merge.snp.MM05MAF005.vcf.gz |
SV | A&B (tetraploid only) | AABB230.SVAssRead.evg.c3.force0.vcf.gz |
Indel & SV (variant length >= 10 bp) | A&B (tetraploid only) | AABB230.MCmt10bp.evg.c3.force0.vcf.gz |
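The `MM05MAF005` tag in the SNP file names is not explained on this page. If it denotes a maximum missing rate of 0.5 and a minimum minor allele frequency of 0.05 (an assumption, not confirmed by the source), a per-site filter of that kind could be sketched as follows. The function names and default thresholds are hypothetical, and only biallelic sites are handled.

```python
from typing import List, Tuple


def genotypes_from_line(line: str) -> List[str]:
    """Extract the GT strings from one VCF data line (GT assumed to be the first FORMAT key)."""
    cols = line.rstrip("\n").split("\t")
    return [sample.split(":")[0] for sample in cols[9:]]


def site_stats(genotypes: List[str]) -> Tuple[float, float]:
    """Return (missing rate, minor allele frequency) for one biallelic site."""
    missing = ref = alt = 0
    for gt in genotypes:
        alleles = gt.replace("|", "/").split("/")
        if any(a == "." for a in alleles):
            missing += 1  # treat partially missing calls as missing
            continue
        ref += alleles.count("0")
        alt += sum(1 for a in alleles if a != "0")
    called = ref + alt
    missing_rate = missing / len(genotypes) if genotypes else 1.0
    maf = min(ref, alt) / called if called else 0.0
    return missing_rate, maf


def passes_filter(genotypes: List[str],
                  max_missing: float = 0.5,   # assumed meaning of "MM05"
                  min_maf: float = 0.05) -> bool:  # assumed meaning of "MAF005"
    """Keep a site if it is not too sparse and the minor allele is not too rare."""
    missing_rate, maf = site_stats(genotypes)
    return missing_rate <= max_missing and maf >= min_maf
```

Applying `passes_filter(genotypes_from_line(line))` to each data line of an unfiltered VCF would reproduce this kind of filter; the distributed files are already filtered, so the sketch is mainly useful for checking or re-filtering at stricter thresholds.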
Contact Information
Hongzhang Xue: xuehzh95@sjtu.edu.cn
Chaochun Wei: ccwei@sjtu.edu.cn
Dongmei Yin: yindm@henau.edu.cn
Copyright © 2025
The laboratory of computational genomics and metagenomics in Shanghai Jiao Tong University &
HAU Peanut Team, College of Agronomy, Henan Agricultural University.
All Rights Reserved.