Samtools is used to compute the read depth of positions in target regions. Please install it first and make sure it is under your PATH.
R is used for visualization and statistical tests in the APAV toolbox. Please install R first and make sure R and Rscript are under your PATH.
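Before going further, it can help to confirm that the required programs are actually reachable. The following is a minimal sketch, not part of APAV: `check_tools` is a hypothetical helper name, and it only tests whether each command is found on PATH, not whether its version is suitable.

```shell
# Hypothetical helper (not shipped with APAV): report commands missing from PATH.
# Prints "missing: NAME" for each absent command; returns non-zero if any are absent.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# The three programs named in the requirements above.
check_tools samtools R Rscript || echo "Install the missing tools or add them to PATH."
```

If nothing is printed, all three commands were found.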
APAVplot is an R package designed specifically for visualizing PAV analysis results. For the code and more details, please see here.
$ tar zxvf APAV-v**.tar.gz
Add apav to PATH and add lib/ to PERL5LIB:
$ export PATH=$PATH:/path/to/APAV/
$ export PERL5LIB=$PERL5LIB:/path/to/APAV/lib/
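To check that both variables picked up the new entries, a small test like the one below can be run in the same shell. This is a sketch under assumptions: `path_contains` is a hypothetical helper, and `/path/to/APAV` is the same placeholder used in the export commands, so replace it with your real install directory.

```shell
# Hypothetical helper: succeed if directory $2 is an entry of the
# colon-separated list $1. A trailing slash on either side is ignored.
path_contains() {
    dir=${2%/}
    case ":$1:" in
        *":$dir:"* | *":$dir/:"*) return 0 ;;
        *) return 1 ;;
    esac
}

APAV_DIR=/path/to/APAV   # placeholder; use your real install directory

path_contains "$PATH" "$APAV_DIR" || echo "PATH does not include $APAV_DIR"
path_contains "${PERL5LIB:-}" "$APAV_DIR/lib" || echo "PERL5LIB does not include $APAV_DIR/lib"
```

Note that `export` only affects the current shell session; to keep the settings, the two export lines are usually added to a shell startup file such as `~/.bashrc`.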
You can skip this step if you are going to use APAVplot later to draw plots in the R environment.
## Install "ComplexHeatmap" from BiocManager
$ conda install r-BiocManager ## OR: Rscript -e "install.packages('BiocManager')"
$ Rscript -e "BiocManager::install('ComplexHeatmap')"
## OR install it from bioconda
$ conda install bioconda::bioconductor-complexheatmap
## Install "APAVplot"
$ conda install r-devtools ## OR: Rscript -e "install.packages('devtools')"
$ Rscript -e "devtools::install_github('SJTU-CGM/APAVplot')"
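After the installation commands finish, one way to confirm that the R packages can be loaded is to ask R directly. The sketch below is an illustration, not an APAV command: `check_r_packages` is a hypothetical helper, and it assumes the installed package names are `ComplexHeatmap` and `APAVplot`. It relies only on base R's `requireNamespace()`.

```shell
# Hypothetical helper: print "NAME: ok" or "NAME: MISSING" for each R package.
# Returns 2 if Rscript itself cannot be found on PATH.
check_r_packages() {
    if ! command -v Rscript >/dev/null 2>&1; then
        echo "Rscript not found on PATH" >&2
        return 2
    fi
    for pkg in "$@"; do
        # Rscript exits with status 1 when the package cannot be loaded.
        if Rscript -e "if (!requireNamespace('$pkg', quietly = TRUE)) quit(status = 1)" >/dev/null 2>&1; then
            echo "$pkg: ok"
        else
            echo "$pkg: MISSING"
        fi
    done
}

check_r_packages ComplexHeatmap APAVplot || echo "Install R first, then rerun the check."
```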
$ apav
If you see the following output, congratulations! The APAV toolkit is installed successfully. If not, check that all the requirements are satisfied, or contact the authors for help.
Usage: apav...

Available commands:

  Pipeline:
    geneBatch           Automatically execute main commands for genes
    generalBatch        Automatically execute main commands for the general target regions

  Extract positions:
    gff2bed             Extract the coordinates of target regions from a GFF format file

  Calculate coverage:
    staCov              Calculate coverage of target regions
    mergeElecov         Merge neighboring elements with the same coverage
    covPlotHeat         Plot a heatmap to give an overview of the coverage profile across samples

  Determine PAV:
    callPAV             Determine presence/absence variations based on coverage
    gFamPAV             Determine gene family presence/absence based on the gene PAV table
    mergeElePAV         Merge neighboring elements with the same PAV

  Estimate genome size:
    pavSim              Simulate the size of pan-genome and core-genome from the PAV table
    pavPlotSim          Draw growth curve of genome simulation

  PAV analysis:
    pavPlotStat         Plot a half-violin chart to show the number of regions in each group of samples
    pavPlotHist         Plot a ring chart and a histogram to show the classifications and distribution of target regions
    pavPlotHeat         Plot a complex heat map to give an overview of the PAV profile
    pavPlotBar          Plot a stacked bar chart to show the classifications of target regions in all samples
    pavPCA              Perform PCA analysis for the PAV table and plot results
    pavCluster          Cluster samples based on the PAV table and plot results

  Phenotype association:
    pavStaPheno         Perform Fisher's exact test and Wilcoxon tests to determine phenotype association
    pavPlotPhenoHeat    Show the main result of phenotype association analysis with a heat map
    pavPlotPhenoBlock   Display the percentage of samples containing target regions in each group of a discrete phenotype
    pavPlotPhenoMan     Draw a Manhattan plot to show the results of a given phenotype
    pavPlotPhenoBar     Show the relationship between a specific genomic region and a specific phenotype in a bar plot
    pavPlotPhenoVio     Show the relationship between a specific genomic region and a specific phenotype in a violin plot

  Visualization of element regions:
    elePlotCov          Display the coverage of elements in a specific target region
    elePlotPAV          Display the PAV of elements in a specific target region
    elePlotDepth        Display the depth of elements in a specific target region