Extracting “legacy” genes from an invertebrate sequence capture dataset to complement phylogenomic datasets

Caroline Miller

Authors:  Caroline Miller, Michael Forthman, Rebecca T. Kimball, Christine W. Miller

Faculty Mentor: Dr. Michael Forthman

College:  College of Agricultural and Life Sciences


The field of phylogenetics has greatly benefited from the introduction of next-generation sequencing (NGS) and genome reduction approaches, which allow molecular datasets to consist of thousands of loci from model and non-model species. These phylogenetic studies produce rich datasets composed of targeted regions of the genome, but the datasets can also include off-target reads. Molecular datasets built from NGS genome reduction approaches are superseding Sanger-based datasets that target a few well-known loci (“legacy markers”). However, integrating these two types of datasets is of interest because legacy markers can include different types of loci (e.g., mitochondrial, ribosomal, and nuclear protein-coding) across a potentially larger sample of species from past phylogenetic studies. Here, I use existing legacy data for a group of leaf-footed bugs (Hemiptera: Coreoidea), a model group for sexual selection studies, to recover legacy markers from off-target sequences in an existing NGS genome reduction dataset composed of protein-coding ultraconserved elements. Specifically, I use two bioinformatic resources to extract legacy markers from off-target sequences: (1) MitoFinder to retrieve mitochondrial loci and (2) BLAST to retrieve nuclear protein-coding and ribosomal loci.
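As a rough illustration of the BLAST step, the sketch below picks the best-scoring contig for each legacy marker from BLAST tabular output (the standard 12-column `-outfmt 6` layout). It is not the pipeline used for this poster: it assumes the legacy markers were the queries and the assembled contigs the subjects, and the function name `best_hits`, the e-value and length cutoffs, and the toy marker and contig names are all hypothetical.

```python
import csv
import io

# Standard column order of BLAST+ tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(handle, max_evalue=1e-10, min_length=100):
    """Return {query_id: row} keeping the highest-bitscore hit per query.

    The cutoffs are illustrative assumptions, not values from the study.
    """
    best = {}
    reader = csv.DictReader(handle, fieldnames=BLAST_COLUMNS, delimiter="\t")
    for row in reader:
        # Skip weak or very short alignments.
        if float(row["evalue"]) > max_evalue or int(row["length"]) < min_length:
            continue
        current = best.get(row["qseqid"])
        if current is None or float(row["bitscore"]) > float(current["bitscore"]):
            best[row["qseqid"]] = row
    return best


# Made-up toy input: two markers, two candidate contigs each.
example = (
    "28S\tcontig_12\t95.00\t450\t22\t1\t1\t450\t100\t549\t1e-150\t700\n"
    "28S\tcontig_40\t88.00\t120\t14\t0\t10\t129\t5\t124\t1e-30\t150\n"
    "EF1a\tcontig_7\t91.00\t80\t7\t0\t1\t80\t1\t80\t1e-20\t90\n"
    "EF1a\tcontig_9\t90.00\t300\t30\t0\t1\t300\t20\t319\t1e-90\t400\n"
)

hits = best_hits(io.StringIO(example))
for marker, row in sorted(hits.items()):
    print(marker, row["sseqid"], row["bitscore"])
```

In practice the same function could be pointed at a real tabular results file opened with `open(path, newline="")`; keeping only one hit per marker is a simplification, since a real workflow would likely also check for paralogs and contamination before accepting a sequence.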

Dr. Miller (@guest_1136)
1 year ago

Congratulations on a nice poster! It’s so impressive the work that you’ve done here and in your other projects.

Caroline Miller (@guest_3880)
Reply to  Dr. Miller
1 year ago

Dear Dr. Miller,

Thank you so much for your comment, I appreciate it! I am so lucky to be a part of this incredible lab!

Kayli Sieber (@guest_2496)
1 year ago

Really nice! Great job!

Caroline Miller (@guest_3958)
Reply to  Kayli Sieber
1 year ago

Dear Kayli,

Thank you so much, I appreciate it!

Dr Ginny Greenway (@guest_4144)
1 year ago

Nice poster Caroline! Great to see how new techniques are helping to get the most information possible out of existing genomic datasets.

Caroline Miller (@guest_4328)
Reply to  Dr Ginny Greenway
1 year ago

Dear Ginny,

Thank you so much, I appreciate it! Yes, I agree that new techniques and publicly available resources are facilitating some great and interesting research.

Genhsy Monzon (@guest_4542)
1 year ago

Really well done! Such interesting work Caroline!

Caroline Miller (@guest_4906)
Reply to  Genhsy Monzon
1 year ago

Dear Genhsy,

Thank you so much, I appreciate it!

Hadley Owen (@guest_4822)
1 year ago

Your poster looks magnificent! What was your favorite part of working on the project?

Caroline Miller (@guest_5138)
Reply to  Hadley Owen
1 year ago

Dear Hadley,

Thank you so much, I appreciate it! I found it really interesting that we could accomplish what the study aimed to do and successfully extract some legacy data from an existing sequence capture dataset. I think this highlights how we can do some great and interesting research without spending money on baits to target specific loci (i.e., mitochondrial and nuclear) when we may already have these data generated!

Emily Angelis (@guest_5244)
1 year ago

I love your poster, Caroline. It looks great!

Caroline Miller (@guest_5308)
Reply to  Emily Angelis
1 year ago

Dear Emily,

Thank you so much, I appreciate it!

Dr. Michael Forthman (@guest_5674)
1 year ago

Proud of you!

Caroline Miller (@guest_5820)
Reply to  Dr. Michael Forthman
1 year ago

Dear Michael,

This could NOT have all come together without you! Thank you SO much for everything!

Jimmy Peniston (@guest_5854)
1 year ago

Great presentation and poster! You did a great job explaining some complicated concepts in a very understandable way. I definitely learned a lot. Thank you!

Caroline Miller (@guest_6112)
Reply to  Jimmy Peniston
1 year ago

Dear Jimmy,

Thank you so much, I really appreciate it! One of the most difficult parts of preparing this presentation was finding a way to relay this information in a clear, concise, and graphic manner that could be understood by the general public. I am so glad that the concepts came across as understandable!

Sara Zlotnik (@guest_6448)
1 year ago

Hi Caroline,

Great job on making a very visually appealing and super informative poster! You also explained the technical aspects of your methods really clearly. Thanks for sharing your awesome research with everyone!


Caroline Miller (@guest_6836)
Reply to  Sara Zlotnik
1 year ago

Dear Sara,

Thank you so much, I appreciate it! I am glad that the methods flowed well and were easy to follow. I deliberately designed my presentation so that the two methods were presented in parallel, to make them more understandable. Thank you for your support!

Allen Wysocki - Associate Dean CALS (@guest_7548)
1 year ago


You did a great job explaining complex research in terms that this economist could understand.

Doc W

Caroline Miller (@guest_7636)
1 year ago

Dear Dr. Wysocki,

Thank you so much, I appreciate it! I am so glad the presentation translated!