Improved haplotype resolution of highly duplicated MHC genes in a long-read genome assembly using MiSeq amplicons
Lund University, Sweden.
Linnaeus University, Faculty of Health and Life Sciences, Department of Biology and Environmental Science. Lund University, Sweden; Nat Hist Museum, UK. ORCID iD: 0000-0002-6139-7828
Lund University, Sweden.
Lund University, Sweden.
2023 (English). In: PeerJ, E-ISSN 2167-8359, Vol. 11, article id e15480. Article in journal (Refereed), Published
Abstract [en]

Long-read sequencing offers a great improvement in the assembly of complex genomic regions, such as the major histocompatibility complex (MHC) region, which can contain both tandemly duplicated MHC genes (paralogs) and high repeat content. The MHC genes have expanded in passerine birds, resulting in numerous MHC paralogs with relatively high sequence similarity, which makes the assembly of the MHC region challenging even with long-read sequencing. In addition, MHC genes show rather high sequence divergence between alleles, causing diploid-aware assemblers to incorrectly classify haplotypes from the same locus as sequences originating from different genomic regions. Consequently, the number of MHC paralogs can easily be over- or underestimated in long-read assemblies. We therefore set out to verify the MHC diversity in an original and a haplotype-purged long-read assembly of one great reed warbler (Acrocephalus arundinaceus) individual (the focal individual) by using Illumina MiSeq amplicon sequencing. Single exons, representing MHC class I (MHC-I) and class IIB (MHC-IIB) alleles, were sequenced in the focal individual and mapped to the annotated MHC alleles in the original long-read genome assembly. Eighty-four percent of the annotated MHC-I alleles in the original long-read genome assembly were detected using 55% of the amplicon alleles, and likewise 78% of the annotated MHC-IIB alleles were detected using 61% of the amplicon alleles, indicating an incomplete annotation of MHC genes. In the haploid genome assembly, each MHC-IIB gene should be represented by one allele. The parental origin of the MHC-IIB amplicon alleles in the focal individual was determined by sequencing MHC-IIB in its parents.
Two of five larger scaffolds, containing 6-19 MHC-IIB paralogs, had a maternal and a paternal origin, respectively, as well as a high nucleotide similarity, which suggests that these scaffolds had been incorrectly assigned as belonging to different loci in the genome rather than as alternate haplotypes of the same locus. Therefore, the number of MHC-IIB paralogs was overestimated in the haploid genome assembly. Based on our findings, we propose amplicon sequencing as a suitable complement to long-read sequencing for independent validation of the number of paralogs in general and for haplotype inference in multigene families in particular.
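
The validation logic described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: exact identity of allele labels stands in for the read-mapping step, and all function names and allele labels below are hypothetical. The sketch computes the two kinds of percentages reported above (share of annotated alleles detected, share of amplicon alleles used), assigns each amplicon allele of the focal individual a parental origin from the parents' allele sets, and summarises the origin of the alleles on a scaffold. The nucleotide-similarity check between scaffolds is not covered.

```python
def detection_summary(annotated, amplicon):
    """Return (share of annotated alleles detected, share of amplicon alleles used).

    Assumption: an annotated allele counts as detected when an identical
    label occurs among the amplicon alleles.
    """
    annotated, amplicon = set(annotated), set(amplicon)
    if not annotated or not amplicon:
        raise ValueError("both allele sets must be non-empty")
    shared = annotated & amplicon
    return len(shared) / len(annotated), len(shared) / len(amplicon)


def parental_origin(offspring, mother, father):
    """Label each offspring allele by the parent(s) in which it was found."""
    mother, father = set(mother), set(father)
    origin = {}
    for allele in offspring:
        in_mother, in_father = allele in mother, allele in father
        if in_mother and in_father:
            origin[allele] = "ambiguous"   # present in both parents
        elif in_mother:
            origin[allele] = "maternal"
        elif in_father:
            origin[allele] = "paternal"
        else:
            origin[allele] = "unassigned"  # found in neither parent
    return origin


def scaffold_origin(scaffold_alleles, origin):
    """Summarise the parental origin of the informative alleles on one scaffold."""
    informative = {origin.get(a) for a in scaffold_alleles} & {"maternal", "paternal"}
    if not informative:
        return "uninformative"
    if len(informative) == 2:
        return "mixed"
    return informative.pop()


if __name__ == "__main__":
    # Toy data only; these labels do not come from the study.
    origin = parental_origin(["x", "y", "z", "w"], mother=["x", "z"], father=["y", "z"])
    print(detection_summary(["A1", "A2", "A3", "A4"], ["A1", "A2", "A3", "B1", "B2"]))
    print(scaffold_origin(["x", "z"], origin), scaffold_origin(["y"], origin))
```

Under these assumptions, two scaffolds whose informative alleles are consistently maternal on one and paternal on the other would be candidates for alternate haplotypes of the same locus, which is the pattern the abstract reports for two of the five larger scaffolds.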

Place, publisher, year, edition, pages
PeerJ Inc., 2023. Vol. 11, article id e15480
Keywords [en]
Haploid genome assembly, Amplicon sequencing, Major histocompatibility complex, MHC diversity, Family, Linkage analysis, Copy number variation
National Category
Evolutionary Biology; Genetics
Research subject
Ecology, Evolutionary Biology
Identifiers
URN: urn:nbn:se:lnu:diva-123652
DOI: 10.7717/peerj.15480
ISI: 001030927800003
PubMedID: 37456901
Scopus ID: 2-s2.0-85168601901
OAI: oai:DiVA.org:lnu-123652
DiVA, id: diva2:1787520
Available from: 2023-08-14 Created: 2023-08-14 Last updated: 2023-11-21 Bibliographically approved

Open Access in DiVA

No full text in DiVA


Authority records

Stervander, Martin
