Although the EUPAN toolkit integrates more than 10 external programs, most of them are supplied to the main program only when the specific tool is selected. This makes it convenient to work on supercomputers or clusters. To run the main program of the EUPAN toolkit itself, you need only a few programs and packages.
R is used for visualization and statistical tests in the EUPAN toolkit. Please install R first and make sure that R and Rscript are on your PATH.
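The PATH requirement above can be checked before going further. The following is a minimal sketch, not part of the EUPAN toolkit; the function name check_tools is made up for illustration, and it relies only on the standard shell built-in `command -v`.

```shell
# Report every required command that cannot be found on PATH.
# Prints one "missing: NAME" line per absent tool and returns
# non-zero if at least one tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Check the two commands the text asks for.
check_tools R Rscript || echo "Install R and add it to PATH before continuing."
```

If nothing is printed, both R and Rscript were found.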
Download R here.
Several R packages are needed, including ggplot2, reshape2 and ape. Either follow Installation step 3 or install the packages yourself.
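To see which of these packages still need to be installed, one option is to try loading each of them with Rscript. This is a sketch under assumptions: the function name missing_r_packages and the RSCRIPT override variable are illustrative, not part of EUPAN. It uses only `Rscript -e` and the R function `library()`, which stops with a non-zero exit status when a package cannot be loaded.

```shell
# Print the name of each R package that cannot be loaded.
# RSCRIPT may be set to another interpreter path; it defaults to
# the Rscript found on PATH. If Rscript itself is absent, every
# package is reported as missing.
missing_r_packages() {
    for pkg in "$@"; do
        if ! "${RSCRIPT:-Rscript}" -e "library('$pkg')" >/dev/null 2>&1; then
            echo "$pkg"
        fi
    done
}

# Check the three packages named in the text.
not_found=$(missing_r_packages ggplot2 reshape2 ape)
if [ -n "$not_found" ]; then
    echo "R packages still to install:" $not_found
else
    echo "All required R packages load."
fi
```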
Download the EUPAN toolkit here.
Uncompress the EUPAN toolkit package.
tar zxvf EUPAN-vXX.XX.tar.gz
Install necessary R packages.
cd EUPAN-vXX.XX
Rscript installRPac
Compile the source code.
You will find the executable files ccov and bam2cov in the bin/ directory.
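Whether the compilation produced both files can be confirmed with a short check. This is only a sketch; the function name missing_executables is hypothetical, and it assumes the command is run from the top-level EUPAN directory so that bin/ resolves correctly.

```shell
# Print each expected file that is absent or not executable
# in the given directory.
missing_executables() {
    dir=$1
    shift
    for name in "$@"; do
        [ -x "$dir/$name" ] || echo "$name"
    done
}

# Run from the EUPAN directory; no output means both files exist.
missing_executables bin ccov bam2cov
```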
Add bin/ to PATH and add lib/ to LD_LIBRARY_PATH.
To do this, replace /path/to/EUPAN in the following text with your installation path and add the text to the end of your shell startup file (for example, ~/.bashrc):
export PATH=$PATH:/path/to/EUPAN/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/EUPAN/lib
export PERL5LIB=$PERL5LIB:/path/to/EUPAN/lib
source /path/to/EUPAN/bin/eupan_cmd.sh
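If the setup is repeated, the same lines can end up in the startup file more than once. One way to avoid this is to append a line only when it is not already there. The sketch below is an illustration, not part of EUPAN: append_once, EUPAN_HOME and RC_FILE are made-up names, and ~/.bashrc is an assumed startup file. The example calls are commented out so that nothing is modified by accident.

```shell
# Append a line to a file unless that exact line is already present,
# so re-running the setup does not create duplicate entries.
append_once() {
    file=$1
    line=$2
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example calls; EUPAN_HOME and RC_FILE are illustrative, adjust both.
# EUPAN_HOME=/path/to/EUPAN
# RC_FILE=$HOME/.bashrc
# append_once "$RC_FILE" "export PATH=\$PATH:$EUPAN_HOME/bin"
# append_once "$RC_FILE" "export LD_LIBRARY_PATH=\$LD_LIBRARY_PATH:$EUPAN_HOME/lib"
# append_once "$RC_FILE" "export PERL5LIB=\$PERL5LIB:$EUPAN_HOME/lib"
# append_once "$RC_FILE" "source $EUPAN_HOME/bin/eupan_cmd.sh"
```

After editing the startup file, open a new shell or source the file so that the changes take effect.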
Test whether the EUPAN toolkit is installed successfully.
If you see the following output, congratulations! The EUPAN toolkit is installed successfully. If not, check that all the requirements are satisfied, or contact the authors for help.
Usage: eupan <command> ...

Avalable commands:
    qualSta      View the overall sequencing quality of a large number of files
    trim         Trim or filter low-quality reads parallelly
    alignRead    Map reads to a reference parallelly
    sam2bam      Covert alignments (.sam) to sorted .bam files
    bamSta       Statistics of parallel mapping
    assemble     Assemble reads parallelly
    assemSta     Statistics of parallel assembly
    getUnalnCtg  Extract the unaligned contigs from nucmer alignment (processed by quast)
    rmRedundant  Remove redundant contigs of a fasta file
    pTpG         Get the longest transcripts to represent genes
    geneCov      Calculate gene body coverage and CDS coverage
    geneExist    Determine gene presence-absence based on gene body coverage and CDS coverage
    subSample    Select subset of samples from gene PAV profile
    gFamExist    Determine gene family presence-absence based on gene presence-absence
    bam2bed      Calculate genome region presence-absence from .bam
    fastaSta     Calculate statistics of fasta file
    sim          simulation and plot of the pan-genome and the core genome