contig Search Results


  • 93
    Illumina Inc contigs
    Schematic overview of our implementation of the MrGBP method to characterise the species relationships among metagenomic contigs.
    Contigs, supplied by Illumina Inc, used in various techniques. Bioz Stars score: 93/100, based on 3776 PubMed citations. ZERO BIAS - scores, article reviews, protocol conditions and more
    https://www.bioz.com/result/contigs/product/Illumina Inc
    contigs - by Bioz Stars, 2020-09

    91
    Pacific Biosciences pacbio contigs
    Assembly benchmarking comparisons reveal a high degree of assembly completion. (A) Feature response curves (FRC) showing the error rate as a function of the number of bases in each assembly (CHIR_1.0, CHIR_2.0, and ARS1) and each scaffold test (intermediary assemblies using a combination of Hi-C and Bionano scaffolding). (B) Comparison plots of chromosome 20 sequence between the ARS1 and CHIR_2.0 assemblies reveal several small inversions (light blue circles) and a small insertion of sequence (break in continuity) in the ARS1 assembly. Red circles highlight 9 of the aforementioned inversions and the insertion of sequence in our assembly. The ARS1 assembly contains only 10 gaps on this chromosome scaffold whereas CHIR_2.0 has 5,651 gaps on the same chromosome assembly (gap density histogram on the Y axis). ARS1 optical map scaffolds and Pacbio contigs are represented on the X axis as alternating patterns of blue and green shades, respectively, showing the tiling path that comprises the entire single chromosome scaffold.
    Pacbio Contigs, supplied by Pacific Biosciences, used in various techniques. Bioz Stars score: 91/100, based on 495 PubMed citations. ZERO BIAS - scores, article reviews, protocol conditions and more
    https://www.bioz.com/result/pacbio contigs/product/Pacific Biosciences

    92
    Pacific Biosciences contigs
    Comparisons of the assemblers conducted in this study. SSPACE-LongRead is a scaffolder using single molecule long reads to upgrade pre-assembled contigs constructed from short reads. ALLPATHS-LG and SPAdes are hybrid assemblers which take short reads and long reads as inputs. PBcR pipeline uses short reads to correct long reads by pacBioToCA, and then assembles corrected long reads (PBcR) by Celera assembler (runCA). Hierarchical genome-assembly process (HGAP) and PBcR pipeline via self-correction (PBcR pipeline(S)) take long reads as input to produce non-hybrid assembly.
    Contigs, supplied by Pacific Biosciences, used in various techniques. Bioz Stars score: 92/100, based on 1321 PubMed citations. ZERO BIAS - scores, article reviews, protocol conditions and more
    https://www.bioz.com/result/contigs/product/Pacific Biosciences

    92
    Unigene contigs
    Distribution of BAC hits and contig/marker. (a) the distribution of BAC hits/marker using 1470 markers; (b) the distribution of the numbers of identified contigs of each marker.
    Contigs, supplied by Unigene, used in various techniques. Bioz Stars score: 92/100, based on 707 PubMed citations. ZERO BIAS - scores, article reviews, protocol conditions and more
    https://www.bioz.com/result/contigs/product/Unigene

    94
    Celera contigs
    Sources of linking information between contigs. (A) overlaps, (B) clone mates, (C) alignments to reference genome, (D) alignments to physical maps, (E) conservation of gene synteny.
    Contigs, supplied by Celera, used in various techniques. Bioz Stars score: 94/100, based on 401 PubMed citations. ZERO BIAS - scores, article reviews, protocol conditions and more
    https://www.bioz.com/result/contigs/product/Celera

    92
    Oxford Nanopore contigs
    Alignment of Illumina-assembled contigs, Illumina and Nanopore reads to the annotated Vibrio parahaemolyticus MVP1 pVa plasmid sub-region containing the pirAB Vp genes. Direction of arrow in the annotation indicates transcription orientation. Blue and red arrows in the Nanopore read alignment indicate forward and reverse strands, respectively.
    Contigs, supplied by Oxford Nanopore, used in various techniques. Bioz Stars score: 92/100, based on 80 PubMed citations. ZERO BIAS - scores, article reviews, protocol conditions and more
    https://www.bioz.com/result/contigs/product/Oxford Nanopore

    94
    Bioedit Company contigs
    Contigs, supplied by Bioedit Company, used in various techniques. Bioz Stars score: 94/100, based on 896 PubMed citations. ZERO BIAS - scores, article reviews, protocol conditions and more
    https://www.bioz.com/result/contigs/product/Bioedit Company

    89
    Celera chimeric contigs
    Correlation analysis from assemblies. Hierarchical clustering given by the Spearman correlation matrix of the viral-bacterial (left) and the viral (right) assemblies (A). The gradient indicates the strength and direction of the correlations. Blue squares represent significant negative correlations and red squares positive correlations (P value ≤ 0.05). Green clusters are based on accurate short-length and low-assembled contigs while purple clusters are based on highly-assembled long-chimeric contigs. Ctgs: number of contigs, ChCtgs: percentage of chimeric contigs, ChCtgs.V.B: percentage of viral-bacterial chimeric contigs, GR: genomes recovered, IM: median of percentage of contig identity against its original genome, LC: Largest contig, N50, RA: percentage of reads assembled, ROG: percentage of reads assembled on their original genomes, RVB: percentage of reads within a viral-bacterial hit. Principal Component Analysis for the different assemblies in both metagenomes (B): The two principal components that better explain variation between assemblies are shown for the viral-bacterial (left) and the viral (right) assemblies. The length of the red vectors represents the effect of each component on the stats. The distance between them indicates their correlation. Circles represent two clusters formed due to the close correlations in the matrices in panel A. C: Celera, M: Minimo, N: Newbler, Optimal: optimal assembly.
    Chimeric Contigs, supplied by Celera, used in various techniques. Bioz Stars score: 89/100, based on 30 PubMed citations. ZERO BIAS - scores, article reviews, protocol conditions and more
    https://www.bioz.com/result/chimeric contigs/product/Celera

    92
    Biotechnology Information contig
    Mis-assembled DCC and DOC. Assemblers may mistakenly form two contigs from the two haplotypes, as shown in (a) where contig A contains heterozygous sequence and contig B contains homozygous sequence (light) on both sides of a matching heterozygous region (dark) (with sequencing reads as lines above them). We refer to A as a duplicated contained contig (DCC). We can identify this situation by finding an alignment between contigs A and B that completely covers contig A and comparing contig A's mate pair links in the original location to those same links when contig A is overlaid on contig B at the location of its alignment, as shown in (b). Dashed curves in (a) indicate distances that are significantly shorter (left side of figure) or longer (right) than expected; solid curves indicate distances that are consistent with specifications. In the situation shown here, we would designate contig A as an erroneous duplication likely to have been caused by haplotype differences. Alternatively, heterozygous sequence may be separated into two contigs that each include some homozygous sequence on opposite ends, as in contigs C and D in (c), which we refer to as duplicated overlapping contigs. If a significant alignment exists between the ends of these contigs and the distances between mate pairs pointing right from contig C and left from contig D better match their expected fragment sizes when the contigs are joined, we designate the region as an erroneous duplication and join the contigs as in (d).
    Contig, supplied by Biotechnology Information, used in various techniques. Bioz Stars score: 92/100, based on 118 PubMed citations. ZERO BIAS - scores, article reviews, protocol conditions and more
    https://www.bioz.com/result/contig/product/Biotechnology Information

    94
    Thermo Fisher contigs
    Venn diagram showing distribution of H. glycines BLAST hits by database. Forty-four percent of all 6,860 H. glycines contigs matched sequences in at least one of three databases at a threshold value of 1e-20: (a) All cyst nematodes without H. glycines. (b) All non-cyst nematodes. (c) All non-nematodes.
    Contigs, supplied by Thermo Fisher, used in various techniques. Bioz Stars score: 94/100, based on 1254 PubMed citations. ZERO BIAS - scores, article reviews, protocol conditions and more
    https://www.bioz.com/result/contigs/product/Thermo Fisher

    92
    Gene Codes Inc contig assembly
    Contig Assembly, supplied by Gene Codes Inc, used in various techniques. Bioz Stars score: 92/100, based on 182 PubMed citations. ZERO BIAS - scores, article reviews, protocol conditions and more
    https://www.bioz.com/result/contig assembly/product/Gene Codes Inc

    91
    BioNano Genomics bionano contigs
    Assembly and validation of Drosophila miranda genome. A. Overview of assembly pipeline. The steps include assembly of male PacBio reads followed by scaffolding using Hi-C, and extensive QC using BioNano reads and BAC clone sequencing followed by gene and repeat annotation. B. Hi-C linkage density map. Chromatin interaction maps allow recovery of entire chromosome arms. Note that the Y-linked contigs were scaffolded separately from X-linked and autosomal contigs. Unlinked regions with many contacts indicate repetitive regions. C. Comparison of current (Dmir2.0) versus old (Dmir1.0) D. miranda assembly. Note that the Y/neo-Y was not assembled in Dmir1.0, and the dot plot indicates homology between our neo-Y assembly and the neo-X. Other repeat-rich regions, such as the large pericentromeric block on AD, are also missing from Dmir1.0. D. BAC clone mapping for assembly verification. BAC clones are color coded according to how many genomic regions they map to in our assembly; green lines indicate stitch points of scaffolds based on Hi-C contacts, and the black line gives the local repeat content along the genome. Three hundred sixty-one sequenced BAC clones (97%) map contiguously and uniquely to our genome assembly. BAC, bacterial artificial chromosome; F, female; M, male; QC, quality control; Repeat %, local repeat content.
    Bionano Contigs, supplied by BioNano Genomics, used in various techniques. Bioz Stars score: 91/100, based on 116 PubMed citations. ZERO BIAS - scores, article reviews, protocol conditions and more
    https://www.bioz.com/result/bionano contigs/product/BioNano Genomics

    94
    Thermo Fisher contig express
    Contig Express, supplied by Thermo Fisher, used in various techniques. Bioz Stars score: 94/100, based on 320 PubMed citations. ZERO BIAS - scores, article reviews, protocol conditions and more
    https://www.bioz.com/result/contig express/product/Thermo Fisher

    92
    Illumina Inc contig assembly
    Contig Assembly, supplied by Illumina Inc, used in various techniques. Bioz Stars score: 92/100, based on 83 PubMed citations. ZERO BIAS - scores, article reviews, protocol conditions and more
    https://www.bioz.com/result/contig assembly/product/Illumina Inc

    92
    CodonCode contig assembly
    Bayesian phylogenetic inference of CTV genomes and genome fragments. Unrooted, consensus phylogenetic trees were obtained from 2,000,000 generations of the Markov chain Monte Carlo simulation in Bayesian analysis using a general time-reversal model of nucleotide substitution [33]. The number above each branch indicates the Bayesian posterior probability. The scale bars represent 0.1 expected substitutions per site. Branch lengths are proportional to evolutionary distance. Sequences were aligned using ClustalX [47] and subsequently manually aligned prior to the Bayesian phylogenetic analysis. A, Known CTV genomes and CTV genomes assembled from resequencing analysis of FS2-2 (highlighted orange). The suffix at the end of fs2_2 distinguishes multiple genotypes in the isolate and also indicates the anchor sequence from which the consensus contig was generated by the Phrap program. B, the 5′ proximal 1 kb, and C, p33-coding region of CTV genomes obtained by direct sequencing of RT-PCR clones. In both B and C, Bayesian posterior probability and clones with identical sequences were omitted for clarity. Recombinant sequences are highlighted in green.
    Contig Assembly, supplied by CodonCode, used in various techniques. Bioz Stars score: 92/100, based on 211 PubMed citations. ZERO BIAS - scores, article reviews, protocol conditions and more
    https://www.bioz.com/result/contig assembly/product/CodonCode

    92
    Gene Codes Inc contig
    Distribution of contig sequences after BLASTx analysis. (A) Percentage of sequences related to the main categories of existing viruses: vertebrate (blue), plant/fungal (green), invertebrate (brown), protozoan (yellow) viruses and bacteriophages (gray), and unassigned viral sequences (no data available concerning the taxonomic family, indicated in red). The total number of viral contigs is indicated below each pie chart. (B) Percentage of sequences related to the most abundant viral families, indicated by the same color range for each of the main viral categories as in (A): blue = vertebrate, green = plant/fungal, brown = invertebrate, yellow = protozoan, gray = phage. Viral families accounting for less than 2% of all sequences are pooled in the “other” category (in purple), and read sequences with no available data regarding the taxonomic family are considered to be unassigned (in red).
    Contig, supplied by Gene Codes Inc, used in various techniques. Bioz Stars score: 92/100, based on 115 PubMed citations. ZERO BIAS - scores, article reviews, protocol conditions and more
    https://www.bioz.com/result/contig/product/Gene Codes Inc

    89
    Illumina Inc 454 contigs
    WT and GPC-RNAi plants 12 days after anthesis. (A) WT (left) and GPC-RNAi plants at 12 DAA used to analyze the GPC-dependent transcriptional changes. (B, C) Close-up images of the ears (B) and flag leaves (C) from WT (left) and GPC-RNAi plants (right) at 12 DAA. (D) Expression profile of the GPC genes relative to ACTIN in WT and GPC-RNAi plants across a senescing leaf time course (H = heading, D = days after anthesis). Transcript levels are presented as normalized, linearized values from 10 biological replicates (± SEM) derived from the 2^-ΔΔCt method [36], where Ct is the threshold cycle. * P≤0.05, ** P≤0.01. (E) Sample clustering based on counts of Illumina reads mapped on 454 contigs. Dendrogram represents the hierarchical clustering of samples as determined by Euclidean distance. The heat map shows a false color representation of the Euclidean distance matrix (from red for zero distance to white for large distance).
    454 Contigs, supplied by Illumina Inc, used in various techniques. Bioz Stars score: 89/100, based on 35 PubMed citations. ZERO BIAS - scores, article reviews, protocol conditions and more
    https://www.bioz.com/result/454 contigs/product/Illumina Inc

    Image Search Results


    Schematic overview of our implementation of the MrGBP method to characterise the species relationships among metagenomic contigs.

    Journal: Scientific Reports

    Article Title: A signal processing method for alignment-free metagenomic binning: multi-resolution genomic binary patterns

    doi: 10.1038/s41598-018-38197-9

    Article Snippet: Across-Samples Coverage Information. To obtain the coverage profile for contigs across the longitudinal samples, the Illumina reads were mapped to contigs with Bowtie 2 for each time point.
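
As a rough illustration of what a per-contig coverage profile across time points could look like, the sketch below divides the aligned bases per contig by the contig length for each sample. It is not the article's code: the function name, the input layout (one list of `(contig, aligned_length)` pairs per time point) and the toy numbers are all assumptions, and a real workflow would first derive these pairs from the Bowtie 2 alignments.

```python
from collections import defaultdict


def coverage_profile(contig_lengths, alignments_by_sample):
    """Return {contig: [mean coverage per sample]}.

    contig_lengths: dict mapping contig name -> length in bases (hypothetical input).
    alignments_by_sample: one list per time point of (contig, aligned_length) pairs,
    e.g. extracted beforehand from a read mapper's output (hypothetical layout).
    """
    profile = {name: [] for name in contig_lengths}
    for sample in alignments_by_sample:
        # Sum the aligned bases that landed on each contig in this sample.
        bases = defaultdict(int)
        for contig, aligned_len in sample:
            bases[contig] += aligned_len
        # Mean coverage = aligned bases / contig length; 0.0 if nothing mapped.
        for name, length in contig_lengths.items():
            profile[name].append(bases[name] / length)
    return profile


# Toy data: two contigs, two time points.
contig_lengths = {"contig_1": 1000, "contig_2": 500}
alignments_by_sample = [
    [("contig_1", 100), ("contig_1", 100), ("contig_2", 100)],
    [("contig_1", 100), ("contig_2", 100), ("contig_2", 150)],
]
profile = coverage_profile(contig_lengths, alignments_by_sample)
```

Each contig ends up with one coverage value per time point, which is the kind of vector that binning methods can compare between contigs.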

    Strategy developed for assemblies and haplotype detection. First, Roche-454 and Illumina reads were assembled separately using Newbler and Trinity, respectively. The contigs obtained were co-assembled using Newbler. The length of contigs was enhanced and redundant contigs were removed using custom scripts. Read mapping was done with Bowtie 2 and the different contigs were annotated using three complementary methods: tBLASTx, BLAST2Go and Pfam. Polymorphisms and haplotypes were detected and constructed using mapping data.

    Journal: Genome Biology and Evolution

    Article Title: Reference Transcriptomes and Detection of Duplicated Copies in Hexaploid and Allododecaploid Spartina Species (Poaceae)

    doi: 10.1093/gbe/evw209

    Article Snippet: The number of Illumina contigs obtained using the Trinity assembler is higher in the F1 and the allopolyploid than in the parents (121,733, 110,455, 144,550 for the three hybrid species and 98,455, 76,010 for the parental species) while the number of Roche-454 contigs obtained with Newbler is higher in the parents due to a deeper sequencing and the presence of both normalized and nonnormalized libraries.

    Techniques: Construct

    Comparison of transcription levels estimation. A. Scatterplot of the number of 454 FLX reads used in the assembly of a given contig versus the number of Illumina reads mapping on the same contig. R1 and R2 stand for the correlation coefficients before and after disregarding the points that exhibit an extreme discrepancy between the two technologies (those forming an almost vertical line on the rightmost part of the figure). B. This figure depicts a depth profile showing the reads that map on a given genomic region. The upper part corresponds to 454 FLX, Illumina reads appear in the middle and the last part corresponds to the graphical representation of the corresponding genomic region.

    Journal: BMC Genomics

    Article Title: Transcriptome analysis of the bloodstream stage from the parasite Trypanosoma vivax

    doi: 10.1186/1471-2164-14-149

    Article Snippet: Specifically, we compared the number of 454 FLX reads used in the assembly of a given contig versus the number of Illumina reads mapping on the same contig.
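
The comparison described above (a correlation coefficient before and after setting aside contigs with an extreme discrepancy between the two technologies) can be sketched as follows. This is only an illustration: the snippet does not say which coefficient or outlier rule was used, so the Pearson coefficient, the ratio cutoff `max_ratio`, and the toy counts are all assumptions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def drop_discrepant(pairs, max_ratio=50.0):
    """Keep contigs whose two read counts differ by at most max_ratio-fold.

    max_ratio is a hypothetical cutoff chosen for this example only.
    """
    kept = []
    for a, b in pairs:
        low, high = min(a, b), max(a, b)
        if low > 0 and high / low <= max_ratio:
            kept.append((a, b))
    return kept


# Toy per-contig counts: (454 FLX reads in assembly, Illumina reads mapped).
counts = [(10, 95), (20, 210), (35, 330), (50, 520), (5, 9000)]

r1 = pearson([a for a, _ in counts], [b for _, b in counts])
kept = drop_discrepant(counts)
r2 = pearson([a for a, _ in kept], [b for _, b in kept])
```

In this toy data a single contig with a very large Illumina count is enough to make `r1` negative, while `r2`, computed on the remaining four contigs, is close to 1.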

    Physical map of wheat chromosome 1AS. The figure integrates multiple sequence resources. a. Chromosome 1AS deletion bin map with the three bins shown in yellow, green and gray. ESTs from the three deletion bins which were mapped to Brachypodium reference zipper genes are indicated with boxes with colour of the corresponding bin. If more than one EST mapped to the same Brachypodium gene, the boxes were stacked on top of each other. This information was used to estimate the boundaries of each deletion bin in the Brachypodium reference zipper (dashed lines). b. Brachypodium reference zipper. c. Physical map of the 1AS chromosome arm. BAC contigs are symbolised with blue lines (see enlarged legend at the right). The length of the line reflects the number of putative syntenic genes found on the contig, not its physical size. Syntenic genes are also symbolised by black boxes. The number of non-syntenic genes for each contig is indicated with a stack of red boxes. Grey boxes indicate place holders for contigs that contained no syntenic genes but were anchored by means other than synteny (e.g. genetic markers of centromere-specific repeats). d. Published genetic markers from chromosome 1AS that were used to deduce an estimated genetic map (marker and map names and genetic distances are detailed in Table 1).

    Journal: PLoS ONE

    Article Title: A Physical Map of the Short Arm of Wheat Chromosome 1A

    doi: 10.1371/journal.pone.0080272

    Article Snippet: However, we also considered it important to include BAC and BAC contigs that were leftover from the initial FPC assembly.

    Techniques: Sequencing, BAC Assay, Marker

    Pajek analysis of BAC contigs from chromosome 1AS assembled with the FPC software. Repetitive BAC clones within the BAC fingerprints were problematic for the FPC assembly, leading to non-linear contig patterns. The reassembly of fingerprints using the LTC assembly program resolved non-linear contigs. The dashed line indicates where the non-linear contig was cut into two contig segments, because the two segments are only connected by a single BAC clone (indicated by all connections converging in one point).

    Journal: PLoS ONE

    Article Title: A Physical Map of the Short Arm of Wheat Chromosome 1A

    doi: 10.1371/journal.pone.0080272

    Article Snippet: However, we also considered it important to include BAC and BAC contigs that were leftover from the initial FPC assembly.

    Techniques: BAC Assay, Software, Clone Assay

    The three levels of anchoring used in the construction of the chromosome 1AS physical map. On level 1, genes were anchored to physical BAC contigs using positive hybridisation probe matches, BAC-end sequences and Illumina contigs. Individual anchoring procedures are indicated by capital letters in circles and described in the text. For level 2 anchoring, all BAC contigs containing genes with homologs in the 1AS syntenic region of Brachypodium, rice or sorghum were anchored to the reference zipper. This means that the order of genes in wheat was assumed to be the same as in Brachypodium, rice or sorghum. In the final step (level 3), data from genetic markers were used to anchor BAC contigs to previously published genetic maps.

    Journal: PLoS ONE

    Article Title: A Physical Map of the Short Arm of Wheat Chromosome 1A

    doi: 10.1371/journal.pone.0080272

    Article Snippet: However, we also considered it important to include BAC and BAC contigs that were left over from the initial FPC assembly.

    Techniques: BAC Assay, Hybridization

    Comparison of LTC and FPC assemblies. a. The BAC clones constituting the contig ltc5279 are depicted at the top. Underneath, FPC contigs which cover corresponding regions are displayed. Gray lines connect the start points of corresponding BACs. Contig ltc5279 (approximately 2,161 kb in size) is the fusion product of eight smaller FPC contigs. Overall, the relative positions of BAC clones within LTC and FPC contigs are very similar. b. Example of three small FPC contigs which are merged into one LTC contig (ltc132). This LTC contig also includes BACs which were singletons in the FPC assembly (blue). Note that in a and b the scales are different. c. Size distribution of overlaps of FPC contigs which were merged in the LTC assembly. The x-axis indicates the size range of overlaps of two FPC clones that were merged by LTC. The y-axis shows how many cases were identified in each size range. The gray series shows the size distribution of all overlaps. The blue series shows only those cases where additional singletons were included to merge FPC contigs, while the red series shows the cases where no additional clones were used for the merging.

    Journal: PLoS ONE

    Article Title: A Physical Map of the Short Arm of Wheat Chromosome 1A

    doi: 10.1371/journal.pone.0080272

    Article Snippet: However, we also considered it important to include BAC and BAC contigs that were left over from the initial FPC assembly.

    Techniques: BAC Assay, Clone Assay

    Assembly benchmarking comparisons reveal high degree of assembly completion. (A) Feature response curves (FRC) showing the error rate as a function of the number of bases in each assembly (CHIR_1.0, CHIR_2.0, and ARS1) and each scaffold test (intermediary assemblies using a combination of Hi-C and Bionano scaffolding). (B) Comparison plots of chromosome 20 sequence between the ARS1 and CHIR_2.0 assemblies reveal several small inversions (light blue circles) and a small insertion of sequence (break in continuity) in the ARS1 assembly. Red circles highlight 9 of the aforementioned inversions and the insertion of sequence in our assembly. The ARS1 assembly contains only 10 gaps on this chromosome scaffold whereas CHIR_2.0 has 5,651 gaps on the same chromosome assembly (gap density histogram on the Y axis). ARS1 optical map scaffolds and PacBio contigs are represented on the X axis as alternating patterns of blue and green shades, respectively, showing the tiling path that comprises the entire single chromosome scaffold.

    Journal: Nature genetics

    Article Title: Single-molecule sequencing and chromatin conformation capture enable de novo reference assembly of the domestic goat genome

    doi: 10.1038/ng.3802

    Article Snippet: Optical map scaffolding of PacBio contigs produced an assembly of 333 scaffolds, containing 90.89% of the final ARS1 assembly length, with a scaffold NG50 of 20.623 Mbp and identified 36 misassemblies in the PacBio contigs.

    Techniques: Hi-C, Scaffolding, Sequencing
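
The article snippet above reports a scaffold NG50. As a minimal sketch of how such a statistic is usually computed (the function name `nx50` and the toy lengths are illustrative assumptions, not taken from the cited article), N50 is the length at which the running sum of sequences, sorted longest first, reaches half of the assembly size; NG50 uses half of an estimated genome size instead:

```python
from typing import Iterable, Optional


def nx50(lengths: Iterable[int], reference_size: Optional[int] = None) -> int:
    """Return N50, or NG50 when reference_size (estimated genome size) is given."""
    ordered = sorted(lengths, reverse=True)
    # N50 targets half of the assembled bases; NG50 targets half of the genome size.
    total = reference_size if reference_size is not None else sum(ordered)
    target = total / 2
    running = 0
    for length in ordered:
        running += length
        if running >= target:
            return length
    # The assembly covers less than half of reference_size (or is empty).
    return 0


toy_lengths = [50, 40, 30, 20, 10]  # hypothetical scaffold lengths
print(nx50(toy_lengths))            # N50 of the toy assembly
print(nx50(toy_lengths, 200))       # NG50 against a hypothetical genome size of 200
```

Because NG50 is normalised by genome size rather than assembly size, it is the more comparable figure when assemblies of the same genome differ in total length.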

    Sequence data coverage of the P. micrantha chloroplast genome. Schematic diagram showing the coverage of the P. micrantha chloroplast genome by the seven Illumina contigs (black) and a single PacBio contig (green) following assembly using ABySS and Celera assembler respectively. The red line across the top of the schematic represents the P. micrantha chloroplast genome sequence, blue bold sections indicate the inverted repeat regions of the genome. Sections of contig 1 from both the Illumina and PacBio assemblies corresponding to the non-unique section of the IR are shown in red. Illumina contig 1 spans the start/end point of the linear representation of the circular chloroplast genome.

    Journal: BMC Genomics

    Article Title: An evaluation of the PacBio RS platform for sequencing and de novo assembly of a chloroplast genome

    doi: 10.1186/1471-2164-14-670

    Article Snippet: The PacBio contig contained a total of 139,688 nucleotides.

    Techniques: Sequencing

    Comparisons of the assemblers conducted in this study. SSPACE-LongRead is a scaffolder using single molecule long reads to upgrade pre-assembled contigs constructed from short reads. ALLPATHS-LG and SPAdes are hybrid assemblers which take short reads and long reads as inputs. PBcR pipeline uses short reads to correct long reads by pacBioToCA, and then assembles corrected long reads (PBcR) by Celera assembler (runCA). Hierarchical genome-assembly process (HGAP) and PBcR pipeline via self-correction (PBcR pipeline(S)) take long reads as input to produce non-hybrid assembly.

    Journal: Scientific Reports

    Article Title: Completing bacterial genome assemblies: strategy and performance comparisons

    doi: 10.1038/srep08747

    Article Snippet: Dozens of contigs were generated even if the PacBio long reads were used, and the N50 values obtained from SPAdes were as low as one tenth of the values obtained from ALLPATHS-LG.

    Techniques: Construct

    Alignment of the curated PacBio contigs to the AgamP4 PEST reference [21]. Alignments are colored by the primary PEST reference chromosome to which they align but are placed in the panel and Y offset to which the contig as a whole aligns best. Contig ends are denoted by horizontal lines in the assembly and vertical lines in PEST. However, there are many Ns in PEST not annotated as contig breaks, so the percent Ns per megabase of PEST is overlaid (scale on the right Y axis). There are no Ns in the PacBio assembly.

    Journal: Genes

    Article Title: A High-Quality De novo Genome Assembly from a Single Mosquito Using PacBio Sequencing

    doi: 10.3390/genes10010062

    Article Snippet: For example, a single contig from the new PacBio assembly expanded a tandem repeat region on chromosome 2L that in PEST was collapsed, while also filling in many Ns (gaps) in PEST, and also spanning a break between PEST scaffolds set to 10,000 Ns.

    Techniques:
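
The legend above overlays the percent of Ns per megabase of the reference. A minimal sketch of that calculation, assuming the sequence is available as a plain string (the function name and the tiny window used in the demonstration are illustrative choices, not the article's code):

```python
from typing import List


def percent_n_per_window(seq: str, window: int = 1_000_000) -> List[float]:
    """Percent of N (or n) bases in consecutive, non-overlapping windows."""
    percents = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        n_count = chunk.count("N") + chunk.count("n")
        # The final window may be shorter than `window`, so divide by its real length.
        percents.append(100.0 * n_count / len(chunk))
    return percents


# Toy sequence with a 10-base window instead of 1 Mb, purely for illustration.
print(percent_n_per_window("ACGTNNNNACGTACGTACGT", window=10))
```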

    Alignment of X pericentromeric contigs to PEST, highlighting likely order and orientation issues in the PEST assembly that are resolved by a single PacBio contig (22F).

    Journal: Genes

    Article Title: A High-Quality De novo Genome Assembly from a Single Mosquito Using PacBio Sequencing

    doi: 10.3390/genes10010062

    Article Snippet: For example, a single contig from the new PacBio assembly expanded a tandem repeat region on chromosome 2L that in PEST was collapsed, while also filling in many Ns (gaps) in PEST, and also spanning a break between PEST scaffolds set to 10,000 Ns.

    Techniques:

    Example of a compressed repeat in PEST that has been expanded by the PacBio assembly. Dotted vertical lines represent a gap in the PEST assembly (10,000 Ns) between scaffolds, which is now spanned by the single PacBio contig. The coverage plot of the PacBio subreads aligned to PEST (bottom) highlights the region where excess coverage indicates a collapsed repeat in PEST; in contrast, the coverage of PacBio subreads aligned to the PacBio contig (left) is more uniform.

    Journal: Genes

    Article Title: A High-Quality De novo Genome Assembly from a Single Mosquito Using PacBio Sequencing

    doi: 10.3390/genes10010062

    Article Snippet: For example, a single contig from the new PacBio assembly expanded a tandem repeat region on chromosome 2L that in PEST was collapsed, while also filling in many Ns (gaps) in PEST, and also spanning a break between PEST scaffolds set to 10,000 Ns.

    Techniques:
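
The legend above reads excess read coverage as a sign of a collapsed repeat. One simple way to flag such regions, sketched here under the assumption that per-window read depth has already been computed (the 2x-median threshold and the function name are illustrative assumptions, not the article's method):

```python
from statistics import median
from typing import List


def excess_coverage_windows(depth: List[float], factor: float = 2.0) -> List[int]:
    """Return indices of windows whose depth exceeds factor x the median depth."""
    if not depth:
        return []
    baseline = median(depth)
    return [i for i, d in enumerate(depth) if d > factor * baseline]


# Hypothetical per-window depths; windows 3 and 4 mimic a collapsed repeat.
toy_depth = [30, 31, 29, 95, 102, 30, 28]
print(excess_coverage_windows(toy_depth))
```

Uniform depth, as described for subreads aligned to the PacBio contig, would return an empty list.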

    Distribution of BAC hits and contigs per marker. (a) The distribution of BAC hits per marker using 1470 markers; (b) the distribution of the number of contigs identified for each marker.

    Journal: BMC Genomics

    Article Title: Genetic marker anchoring by six-dimensional pools for development of a soybean physical map

    doi: 10.1186/1471-2164-9-28

    Article Snippet: Of these contigs, 458 (35.1%) have only one unigene, 772 (59.1%) contigs have 2~8 unigenes, and 75 (5.7%) contigs have 9~44 unigenes.

    Techniques: BAC Assay, Marker

    Example of the integrated map view of a ~8 cM region on LG E. The genetic map was redrawn based on the integrated genetic linkage map [11]. The QTL name and position refer to the Soybean Breeders Toolbox [34]. The dashed lines indicate the discrepancies of marker alignments between the physical map and genetic map. The highlighted FPC contigs are questionable. The number above the lines connecting genetic markers and contigs is the number of BAC hits. The white bars between the highlighted bars are gaps between the WSS scaffolds.

    Journal: BMC Genomics

    Article Title: Genetic marker anchoring by six-dimensional pools for development of a soybean physical map

    doi: 10.1186/1471-2164-9-28

    Article Snippet: Of these contigs, 458 (35.1%) have only one unigene, 772 (59.1%) contigs have 2~8 unigenes, and 75 (5.7%) contigs have 9~44 unigenes.

    Techniques: Marker, BAC Assay

    Distribution of ESTs in contigs. Assembly of 12,084 ESTs resulted in 2258 contigs comprising 7333 ESTs. The number of ESTs per contig ranged from 2 to 23. The contig size represents the number of ESTs in the contig.

    Journal: BMC Genomics

    Article Title: Gene discovery from Jatropha curcas by sequencing of ESTs from normalized and full-length enriched cDNA library from developing seeds

    doi: 10.1186/1471-2164-11-606

    Article Snippet: The 2258 contigs were manually checked and the longest EST from each contig was selected as the unigene.

    Techniques:

    A, Frequency distribution of 2,431 unigene contigs across four developmental stages of microspore embryo development. The number of contigs represented by ESTs in cDNA libraries from each of the four stages of development (0 h, 3, 5, 7 d) is indicated.

    Journal: Plant Physiology

    Article Title: Transcript Profiling and Identification of Molecular Markers for Early Microspore Embryogenesis in Brassica napus

    doi: 10.1104/pp.106.092932

    Article Snippet: From this analysis, 467 of the 2,431 unigene contigs (see above) were found to be differentially expressed (significance threshold R < 0.05) across the four stages of embryo development (0 h, 3, 5, and 7 d; Supplemental Table S3).

    Techniques:

    Sources of linking information between contigs. (A) overlaps, (B) clone mates, (C) alignments to reference genome, (D) alignments to physical maps, (E) conservation of gene synteny.

    Journal: Genome Research

    Article Title: Hierarchical Scaffolding With Bambus

    doi: 10.1101/gr.1536204

    Article Snippet: Bambus is required to place all contigs in scaffolds and will thus generally create more scaffolds than Celera Assembler.

    Techniques:

    Detailed information produced by Bambus. Shown are two contigs connected by two valid links (v:2 in the header line), with 1 additional link whose length is outside the estimated range and therefore invalid (l:1). The contigs face away from each other, indicated by the arrows (“→”) in the header. Each pair of linked reads is shown on a separate line, with coordinates indicating the position of the read within its contig. For example, GBRDE74TR is mapped to positions 890-1416 of contig_32, and GBRDE74TF is mapped to positions 413-1207 of contig_38.

    Journal: Genome Research

    Article Title: Hierarchical Scaffolding With Bambus

    doi: 10.1101/gr.1536204

    Article Snippet: Bambus is required to place all contigs in scaffolds and will thus generally create more scaffolds than Celera Assembler.

    Techniques: Produced
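
The legend above distinguishes valid links from links whose length falls outside the estimated range. The sketch below illustrates only that classification step; it is not Bambus code and does not parse Bambus output. The function name, the implied sizes and the size range are hypothetical values chosen to reproduce the 2 valid / 1 invalid split described in the legend.

```python
from typing import List, Tuple


def classify_links(implied_sizes: List[int], size_range: Tuple[int, int]) -> Tuple[int, int]:
    """Count links whose implied clone size is inside / outside the allowed range."""
    low, high = size_range
    valid = sum(1 for size in implied_sizes if low <= size <= high)
    invalid = len(implied_sizes) - valid
    return valid, invalid


# Hypothetical implied clone sizes (bp) for three read pairs joining two contigs,
# checked against a hypothetical library size range of 2,000 to 4,000 bp.
valid, invalid = classify_links([3100, 2950, 7200], (2000, 4000))
print(f"v:{valid} l:{invalid}")
```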

    Alignment of Illumina-assembled contigs, Illumina and Nanopore reads to the annotated Vibrio parahaemolyticus MVP1 pVa plasmid sub-region containing the pirAB Vp genes. Direction of arrow in the annotation indicates transcription orientation. Blue and red arrows in the Nanopore read alignment indicate forward and reverse strands, respectively.

    Journal: bioRxiv

    Article Title: Nanopore long reads enable the first complete genome assembly of a Malaysian Vibrio parahaemolyticus isolate bearing the pVa plasmid associated with acute hepatopancreatic necrosis disease

    doi: 10.1101/861476

    Article Snippet: On the other hand, the Flye assembly using only Nanopore reads produced three contigs all flagged as “complete and circular”.

    Techniques: Plasmid Preparation

    Correlation analysis from assemblies. Hierarchical clustering given by the Spearman correlation matrix of the viral-bacterial (left) and the viral (right) assemblies (A). The gradient indicates the strength and direction of the correlations. Blue squares represent significant negative correlations and red squares positive correlations (P value ≤ 0.05). Green clusters are based on accurate short-length and low-assembled contigs while purple clusters are based on highly-assembled long-chimeric contigs. Ctgs: number of contigs, ChCtgs: percentage of chimeric contigs, ChCtgs.V.B: percentage of viral-bacterial chimeric contigs, GR: genomes recovered, IM: median of percentage of contig identity against its original genome, LC: Largest contig, N50, RA: percentage of reads assembled, ROG: percentage of reads assembled on their original genomes, RVB: percentage of reads within a viral-bacterial hit. Principal Component Analysis for the different assemblies in both metagenomes (B): The two principal components that best explain variation between assemblies are shown for the viral-bacterial (left) and the viral (right) assemblies. The length of the red vectors represents the effect of each component on the stats. The distance between them indicates their correlation. Circles represent two clusters formed due to the close correlations in the matrices in panel A. C: Celera, M: Minimo, N: Newbler, Optimal: optimal assembly.

    Journal: BMC Genomics

    Article Title: Comparison of different assembly and annotation tools on analysis of simulated viral metagenomic communities in the gut

    doi: 10.1186/1471-2164-15-37

    Article Snippet: Cluster 1 is characterized by its positive correlations between the number of contigs, the percentage of reads within a viral-bacterial hit, and the percentage of reads matching their original genome; and by the negative correlations between the largest contig with the N50, the number of reads assembled and the number of chimeric contigs.

    Techniques:

    Lowest Common Ancestor in chimeric contigs. The LCA of each chimeric contig is represented as a fraction of the total number of chimeric contigs on every viral and viral-bacterial metagenome assembly.

    Journal: BMC Genomics

    Article Title: Comparison of different assembly and annotation tools on analysis of simulated viral metagenomic communities in the gut

    doi: 10.1186/1471-2164-15-37

    Article Snippet: Cluster 1 is characterized by its positive correlations between the number of contigs, the percentage of reads within a viral-bacterial hit, and the percentage of reads matching their original genome; and by the negative correlations between the largest contig with the N50, the number of reads assembled and the number of chimeric contigs.

    Techniques:

    Mis-assembled DCC and DOC. Assemblers may mistakenly form two contigs from the two haplotypes, as shown in (a) where contig A contains heterozygous sequence and contig B contains homozygous sequence (light) on both sides of a matching heterozygous region (dark) (with sequencing reads as lines above them). We refer to A as a duplicated contained contig (DCC). We can identify this situation by finding an alignment between contigs A and B that completely covers contig A and comparing contig A's mate pair links in the original location to those same links when contig A is overlaid on contig B at the location of its alignment, as shown in (b). Dashed curves in (a) indicate distances that are significantly shorter (left side of figure) or longer (right) than expected; solid curves indicate distances that are consistent with specifications. In the situation shown here, we would designate contig A as an erroneous duplication likely to have been caused by haplotype differences. Alternatively, heterozygous sequence may be separated into two contigs that each include some homozygous sequence on opposite ends, as in contigs C and D in (c), which we refer to as duplicated overlapping contigs. If a significant alignment exists between the ends of these contigs and the distances between mate pairs pointing right from contig C and left from contig D better match their expected fragment sizes when the contigs are joined, we designate the region as an erroneous duplication and join the contigs as in (d).

    Journal: Genome Biology

    Article Title: Detection and correction of false segmental duplications caused by genome mis-assembly

    doi: 10.1186/gb-2010-11-3-r28

    Article Snippet: Abbreviations bp: base pair; DCC: duplicated contained contig; DOC: duplicated overlapping contig; kb: kilobase; Mb: megabase; NCBI: National Center for Biotechnology Information; SNP: single nucleotide polymorphism; WGS: whole-genome shotgun.

    Techniques: Sequencing
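
The legend above describes two checks for a duplicated contained contig (DCC): the alignment must completely cover contig A, and A's mate-pair distances must fit expectations better once A is overlaid on contig B. The sketch below is a deliberate simplification under stated assumptions: alignment blocks are half-open intervals on contig A, and "better" is reduced to counting distances within a fixed tolerance of the expected fragment size. The function names, the tolerance rule and the toy numbers are hypothetical; the article's actual statistical test is not reproduced here.

```python
from typing import List, Tuple


def fully_covered(contig_len: int, blocks: List[Tuple[int, int]]) -> bool:
    """True if half-open alignment blocks cover positions 0..contig_len with no gap."""
    covered_to = 0
    for start, end in sorted(blocks):
        if start > covered_to:
            return False  # uncovered gap before this block
        covered_to = max(covered_to, end)
    return covered_to >= contig_len


def consistent_links(distances: List[int], expected: int, tol: int) -> int:
    """Number of mate-pair distances within tol of the expected fragment size."""
    return sum(1 for d in distances if abs(d - expected) <= tol)


def looks_like_dcc(contig_len: int, blocks: List[Tuple[int, int]],
                   original: List[int], overlaid: List[int],
                   expected: int, tol: int) -> bool:
    """Flag contig A when it is fully covered and its links fit better when overlaid."""
    improved = consistent_links(overlaid, expected, tol) > consistent_links(original, expected, tol)
    return fully_covered(contig_len, blocks) and improved


# Toy case: a 1,000 bp contig A covered by two overlapping blocks; link distances are
# inconsistent in the original placement and consistent after overlaying A on B.
print(looks_like_dcc(1000, [(0, 600), (550, 1000)],
                     original=[1200, 9000, 800], overlaid=[3000, 3100, 2900],
                     expected=3000, tol=300))
```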

    Venn diagram showing distribution of H. glycines BLAST hits by database. Forty-four percent of all 6,860 H. glycines contigs matched sequences in at least one of three databases at a threshold value of 1e-20: (a) All cyst nematodes without H. glycines. (b) All non-cyst nematodes. (c) All non-nematodes.

    Journal: Genome Biology

    Article Title: Divergent evolution of arrested development in the dauer stage of Caenorhabditis elegans and the infective stage of Heterodera glycines

    doi: 10.1186/gb-2007-8-10-r211

    Article Snippet: Contigs were formed by Affymetrix for the design of the H. glycines group of probesets of the Affymetrix Soybean Genome Array GeneChip and all consensus sequences and contig size details can be accessed at Affymetrix.

    Techniques:

    Assembly and validation of Drosophila miranda genome. A. Overview of assembly pipeline. The steps include assembly of male PacBio reads followed by scaffolding using Hi-C, and extensive QC using BioNano reads and BAC clone sequencing followed by gene and repeat annotation. B. Hi-C linkage density map. Chromatin interaction maps allow recovery of entire chromosome arms. Note that the Y-linked contigs were scaffolded separately from X-linked and autosomal contigs. Unlinked regions with many contacts indicate repetitive regions. C. Comparison of current (Dmir2.0) versus old (Dmir1.0) D. miranda assembly. Note that the Y/neo-Y was not assembled in Dmir1.0, and the dot plot indicates homology between our neo-Y assembly and the neo-X. Other repeat-rich regions, such as the large pericentromeric block on AD, are also missing from Dmir1.0. D. BAC clone mapping for assembly verification. BAC clones are color coded according to how many genomic regions they map to in our assembly; green lines indicate stitch points of scaffolds based on Hi-C contacts, and the black line gives the local repeat content along the genome. Three hundred sixty-one sequenced BAC clones (97%) map contiguously and uniquely to our genome assembly. BAC, bacterial artificial chromosome; F, female; M, male; QC, quality control; Repeat %, local repeat content.

    Journal: PLoS Biology

    Article Title: De novo assembly of a young Drosophila Y chromosome using single-molecule sequencing and chromatin conformation capture

    doi: 10.1371/journal.pbio.2006348

    Article Snippet: HybridScaffold was then used to produce hybrid maps from the BioNano contigs and the genomic scaffolds from our scaffolded PacBio assembly, and IrysView was used to visualize alignments of the BioNano contigs and genomic scaffolds to the hybrid ones. shows coverage of hybrid scaffolds by BioNano contigs and NGS contigs (genomic scaffolds).

    Techniques: Scaffolding, Hi-C, BAC Assay, Sequencing, Blocking Assay, Clone Assay

    Bayesian phylogenetic inference of CTV genomes and genome fragments. Unrooted, consensus phylogenetic trees were obtained from 2,000,000 generations of the Markov chain Monte Carlo simulation in Bayesian analysis using a general time-reversible model of nucleotide substitution [33]. The number above each branch indicates the Bayesian posterior probability. The scale bars represent 0.1 expected substitutions per site. Branch lengths are proportional to evolutionary distance. Sequences were aligned using ClustalX [47] and subsequently manually aligned prior to the Bayesian phylogenetic analysis. A, Known CTV genomes and CTV genomes assembled from resequencing analysis of FS2-2 (highlighted orange). The suffix at the end of fs2_2 distinguishes multiple genotypes in the isolate and also indicates the anchor sequence from which the consensus contig was generated by the Phrap program. B, the 5′ proximal 1 kb, and C, p33-coding region of CTV genomes obtained by direct sequencing of RT-PCR clones. In both B and C, Bayesian posterior probability and clones with identical sequences were omitted for clarity. Recombinant sequences are highlighted in green.

    Journal: PLoS ONE

    Article Title: Persistent Infection and Promiscuous Recombination of Multiple Genotypes of an RNA Virus within a Single Host Generate Extensive Diversity

    doi: 10.1371/journal.pone.0000917

    Article Snippet: These fragments and the quality scores associated with each base call were used in contig assembly by the Phrap program as implemented in the CodonCode Aligner program.

    Techniques: Sequencing, Generated, Reverse Transcription Polymerase Chain Reaction, Clone Assay, Recombinant

    Distribution of contig sequences after BLASTx analysis. (A) Percentage of sequences related to the main categories of existing viruses: vertebrate (blue), plant/fungal (green), invertebrate (brown), protozoan (yellow) viruses and bacteriophages (gray), and unassigned viral sequences (no data available concerning the taxonomic family, indicated in red). The total number of viral contigs is indicated below each pie chart. (B) Percentage of sequences related to the most abundant viral families, indicated by the same color range for each of the main viral categories as in (A): blue = vertebrate, green = plant/fungal, brown = invertebrate, yellow = protozoan, gray = phage. Viral families accounting for less than 2% of all sequences are pooled in the “other” category (in purple), and read sequences with no available data regarding the taxonomic family are considered to be unassigned (in red).

    Journal: PLoS ONE

    Article Title: A Preliminary Study of Viral Metagenomics of French Bat Species in Contact with Humans: Identification of New Mammalian Viruses

    doi: 10.1371/journal.pone.0087194

    Article Snippet: Larger viral contigs were obtained and their position in the genome determined, by pooling contig and read sequences for a given viral family for each sample, and assembling them with Sequencher 5.0 software (Gene Codes Corporation).

    Techniques:
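    The pooling rule described in the legend above (families below 2% of all sequences are merged into an "other" category) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the placeholder family labels, and the counts are all hypothetical.

```python
from collections import Counter


def pool_minor_families(family_counts, threshold=0.02, other_label="other"):
    """Return percentage shares per family, pooling families below `threshold`.

    `family_counts` maps a family label to its number of sequences.
    Families whose share of the total is below `threshold` are summed
    under `other_label`. All names here are illustrative assumptions.
    """
    total = sum(family_counts.values())
    if total == 0:
        return {}
    pooled = Counter()
    for family, count in family_counts.items():
        key = family if count / total >= threshold else other_label
        pooled[key] += count
    # Convert pooled counts to percentages of all sequences.
    return {family: 100.0 * count / total for family, count in pooled.items()}


# Hypothetical counts, chosen only to exercise the 2% cut-off.
counts = {"family_A": 50, "family_B": 30, "family_C": 19, "family_D": 1}
shares = pool_minor_families(counts)
print(shares)
```

    Here `family_D` holds 1% of the sequences, so it is reported under "other" rather than as its own slice.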

    WT and GPC-RNAi plants 12 days after anthesis. (A) WT (left) and GPC-RNAi plants at 12 DAA used to analyze the GPC-dependent transcriptional changes. (B, C) Close-up images of the ears (B) and flag leaves (C) from WT (left) and GPC-RNAi plants (right) at 12 DAA. (D) Expression profile of the GPC genes relative to ACTIN in WT and GPC-RNAi plants across a senescing leaf time course (H = heading, D = days after anthesis). Transcript levels are presented as normalized, linearized values from 10 biological replicates (± SEM) derived from the 2^(-ΔΔCt) method [36], where Ct is the threshold cycle. *P ≤ 0.05, **P ≤ 0.01. (E) Sample clustering based on counts of Illumina reads mapped on 454 contigs. Dendrogram represents the hierarchical clustering of samples as determined by Euclidean distance. The heat map shows a false color representation of the Euclidean distance matrix (from red for zero distance to white for large distance).

    Journal: BMC Genomics

    Article Title: Effect of the down-regulation of the high Grain Protein Content (GPC) genes on the wheat transcriptome during monocarpic senescence

    doi: 10.1186/1471-2164-12-492

    Article Snippet: Alignment of Illumina reads to 454 contigs. To determine the differences in transcript levels of the different isogroups between the WT and GPC-RNAi plants at 12 DAA, we used the Illumina platform, which provided a much greater average sequencing depth per library than the 454 platform (Additional file figure S3).

    Techniques: Gel Permeation Chromatography, Expressing, Derivative Assay
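    The legend for this record names two standard computations: the 2^(-ΔΔCt) method for relative transcript levels and a Euclidean distance matrix between samples. The sketch below shows both in their textbook form. It is an illustration under assumptions, not the article's pipeline: the function names, sample labels, and numeric values are hypothetical, and the choice of calibrator sample is assumed.

```python
import math


def fold_change(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ΔΔCt) method.

    ΔCt = Ct(target) - Ct(reference gene, e.g. ACTIN), computed for the
    sample and for the calibrator (control); ΔΔCt is their difference.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_sample - delta_control)


def euclidean_distance_matrix(profiles):
    """Pairwise Euclidean distances between per-sample count profiles.

    `profiles` maps a sample label to a list of values (one per contig);
    the result is a nested dict: matrix[a][b] = distance between a and b.
    """
    names = list(profiles)
    return {a: {b: math.dist(profiles[a], profiles[b]) for b in names} for a in names}


# Hypothetical Ct values: the target is 2 cycles later (relative to the
# reference gene) in the sample than in the control, i.e. a 4-fold decrease.
ratio = fold_change(26.0, 18.0, 24.0, 18.0)

# Hypothetical two-contig profiles for two samples.
matrix = euclidean_distance_matrix({"WT_1": [0.0, 0.0], "RNAi_1": [3.0, 4.0]})
print(ratio, matrix["WT_1"]["RNAi_1"])
```

    A matrix of this kind is what a hierarchical clustering routine would take as input to build the dendrogram, with zero on the diagonal corresponding to the "zero distance" end of the heat map's color scale.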