Chen Sun
Master's student
Department of Bioinformatics and Biostatistics
Shanghai Jiao Tong University
Room 221, Building 4, Biology Complex
800 Dongchuan Road, Minhang District
Shanghai, China, 200240
E-mail: verne91 AT sjtu DOT edu DOT cn
I have graduated from SJTU.
I am now a PhD student at the University of Michigan.
Here is my current personal website.
Education
2014.9 - 2017.3 | M.S. in Bioinformatics (with honors) | Shanghai Jiao Tong University, Shanghai, China |
2010.9 - 2014.7 | B.S. in Bioinformatics | Shanghai Jiao Tong University, Shanghai, China |
2012.2 - 2014.7 | B.A. in History | Shanghai Jiao Tong University, Shanghai, China |
Research Interests
- Statistical learning
- High performance computing for bioinformatics
- NGS data analysis
- Population and comparative genomics
- Genome browser and data visualization
Research Experiences
- 400+ Human Gastric Cancer Genome Project
- In progress.
- We sequenced more than 400 whole genomes of gastric cancer patients at very high sequencing depth in order to find disease-related variants.
- 3000 Rice Genome Project
- Constructed a pan-genome analysis pipeline, including:
- - sequence quality control
- - whole genome de novo assembly for 3,010 rice genomes
- - genomic mapping
- - pan-genome sequences construction
- - gene annotation of pan-genome sequences
- - gene clustering based on sequence homology
- - gene and gene family presence/absence determination
- - functional (GO) analysis
- - novel gene validation using RNA-seq data
- - phylogenetic analysis based on presence/absence variation (PAV) of genes and gene families
- Implemented job submission with the LSF and SLURM systems on the π supercomputer at SJTU (total computing time > 1.3 million CPU core hours).
- 3K Rice Pan-genome Browser (RPAN)
- Constructed a database and visualization tools for the pan-genome analysis results of 3,000 rice genomes.
- Developed webpages with HTML, CSS and JavaScript.
- Implemented several search functions and table browsers with PHP and MySQL.
- Built a visualization page with a dynamic tree browser and the JBrowse framework.
- Integrated the pan-genome reference, gene annotation, gene PAV information and RNA-seq data into the genome browser.
- Toolbox for Eukaryotic Pan-genome Analysis (EUPAN)
- Presented a "map-to-pan" strategy for high-resolution pan-genome studies.
- Integrated available tools and in-house programs at different steps for eukaryotic pan-genome analysis.
- Developed the project website and performed regular maintenance.
- Shigella Pan-genome Project
- Called SNPs from genomic mapping results.
- Designed a pipeline for Shigella pan-genome analysis (including gene family presence/absence analysis, functional enrichment, and the evolutionary history of Shigella spp.).
- Constructed the phylogenetic relationships of more than 700 strains.
- Sensor Based Human Activity Recognition
- Designed a logistic regression model for classification of human activities based on sensor data.
- Implemented the model with Python and obtained high precision and recall (both 0.89).
- Epigenetic Signal Character around TFBS
- Designed a method to detect signal shapes of histone modifications around transcription factor binding sites (TFBS).
- Predicted TFBS using epigenetic signal characteristics.
- Classification of short genomic sequences
- Designed a kth-order Hidden Markov Model (HMM) to classify a set of short genomic sequences from different species.
- Implemented the model with Python to predict the source of new sequences.
- Horizontal Gene Transfer in Human Genome
- Located human genomic regions with higher similarity to amphibians than to mammals from a multiple sequence alignment of 46 mammalian genomes.
- Performed functional analysis with the GO database.
- Epigenetics within Repeat Region in Mammal Genomes
- Surveyed and analyzed epigenetic data and the distribution of repeat sequences using the UCSC Genome Browser.
- Summarized DNA methylation loci in different repeat regions.
- Enzymatic Activity of Microorganisms Associated with Sponge
- Screened for bacteria with higher lipase activity using different culture media.
Publications
- Sun, C. et al. "RPAN: Rice Pan-genome Browser for ~3000 Rice Genomes", Nucleic Acids Res., 2017; 45 (2): 597-605. doi: 10.1093/nar/gkw958. [Full Text]
- Hu, Z., Sun, C. et al. "EUPAN enables pan-genome studies of a large number of eukaryotic genomes", Bioinformatics 2017 btx170. doi: 10.1093/bioinformatics/btx170 (joint first author). [Full Text]
- Huang, W. et al. "Widespread of Horizontal Gene Transfer in the Human Genome", BMC Genomics 2017 18:274 doi: 10.1186/s12864-017-3649-y. [Full Text]
- Wang, W. et al. "Harnessing natural variation in Asian Cultivated Rice (Oryza sativa L.): SNPs, structural variations and pan-genomes", in revision for Nature.
Awards
- Shanghai Municipal Excellent Graduate in 2017
- Graduate Academic Scholarship of SJTU in 2014, 2015 and 2016
- Academic Excellence Scholarship of SJTU in 2012 and 2013
- Excellent Shanghai Undergraduate Innovation Program in 2012
Technical Skills
- Programming Languages:
- Proficient in Python, C, R, Bash, PHP, HTML, CSS, MySQL, JavaScript, LaTeX
- Exposure to Perl, Matlab, C++, Ruby, Pascal, Delphi
- Attending Courses:
- Algorithm and Data Structure, Algorithm Analysis and Theory, Machine Learning
Personal Interests
- Reading: Sci-Fi novels, History, Philosophy, etc.
- Travelling: Hiking, Road trips.
- Sports: Basketball, Table tennis, Swimming, Chinese chess.
- Culinary arts: Chinese food.