Tag Archives: gedmatch

So I’m starting to explore AncestryDNA’s ‘Shared Matches’ tool…

I’ve recently started to explore AncestryDNA’s ‘Shared Matches’ feature. By recent, I mean only 48 hours ago. Explorations like this make my inner geek happy. I’ve tinkered with digital stuff and technology for years, so tech like this isn’t daunting to me. I fearlessly dive in to see how things work…or at least try to figure out how things like this tool work.

I offer this caveat up-front: I’ll be covering the ‘Shared Matches’ feature as it appears on the new Ancestry / AncestryDNA site. So please don’t be thrown or confused when looking at the screen grabs. They will look very different from the old version of the site, if that’s the one you’re still using.

I’m just exploring and working things out at the moment. There is an end game for this exploration. I hope I can make some in-roads into my Irish and Ashkenazi Jewish genealogy. Yes, that’s right, I have 3 of the most challenging ethnicities to research when it comes to genealogy: African-American, Irish and Central/Eastern European Jewish. Three ethnicities that have undergone a worldwide diaspora with some of the most challenging records to find. That’s me. If there is a genealogy higher power, s/he must be laughing.

What better way to finely hone my pending approach to tackling my unknown Irish and Jewish ancestors than a thorough understanding of how this feature works with my known African-American cousins and known European cousins? I can then apply this insight to DNA cousins I have yet to find common ancestors for. If I can understand the strengths and weaknesses of the ‘Shared Matches’ algorithm and results, I can have a more informed approach to understanding the many Irish and Jewish DNA cousins I have on AncestryDNA. I have a staggering number of each.

That’s my working premise at the moment.

Ok, with that out of the way, let me show you how I’ve been exploring this AncestryDNA tool.

Below is the standard AncestryDNA family matches landing page. No surprises here for those of you familiar with the service. And yes, your eyes aren’t deceiving you. I really do have 49 pages of DNA cousin matches.

My AncestryDNA family DNA match landing page. Please note that I will be protecting the identities of my DNA matches in this post.

Now I have a LOT of cousins with connections to what was the Old Ninety-Six County of South Carolina (this county was dissolved to create Abbeville, McCormick, Edgefield, Saluda, Greenwood, Laurens and Union counties; parts of Spartanburg County; much of Cherokee and Newberry counties; and small parts of Aiken and Greenville counties). I know exactly how I’m related to a number of these 3rd, 4th and 5th cousins. There’s a group of us who are very, very active genealogy researchers and share information on various family groups on Facebook.

I connect to these cousins in a myriad of ways. We are the descendants of enslaved Africans, free people of colour and Quakers who left England and Scotland for Antrim and Ulster in northern Ireland, made their way to Pennsylvania and later the former Old Ninety-Six. Knowing how we’re related tells me a bit about how ‘Shared Matches’ works.

And, in a way, my Edgefield family heritage is a good one when it comes to understanding this DNA matching analysis tool. Whether Quaker, formerly Quaker or African-American, my Edgefield ancestors and relations married within the extended family for generations. So many of my Edgefield-connected cousins and I, regardless of ethnicity, are related to one another a few times over (so far, the winner is a cousin I’m related to at least 4 different ways). Why is this important to note? There are few single sets of common ancestors when it comes to the Edgefield side of my family tree, which makes pinpointing isolated common ancestors a bit tricky. I’m going to find the same pattern for my rural, agricultural Irish ancestors (who married within a clan structure) and my Ashkenazi Jewish ancestors. Edgefield, it turns out, will be a great genetic genealogy proving ground. It really is as complicated and intricate as general genealogy, much less genetic genealogy, gets.

So, I want to see how many DNA cousins there are who also share a connection to Edgefield. So I typed this search term in the search box as shown in the image below:

Image showing how to filter for a specific place on AncestryDNA.

And this is what I got:

image showing how to filter for a specific place on AncestryDNA


Turns out I have 78 at the moment. I’m sure if I had added the other counties created from Old Ninety-Six, this number would have been much, much higher. For this exercise, I want to solely concentrate on Edgefield.

Roughly a third of these 78 matches are cousins with whom I share a common set of 18th Century Quaker x-times-many great grandparents who lived in Pennsylvania. These ancestors didn’t live in Edgefield themselves – but had descendants or extended family members who did. The only reason they appear in the search results above is down to the fact that, like me, they are researching the whole family and not just their own direct line. I already had their ancestors in my tree and they had mine in theirs.

So I’ve placed this group of cousins to one side. For this exercise, I’m solely focussing on cousins with whom I know I share a common set of Edgefield-born ancestors. It’s always a good idea to grapple with a small sample size when dealing with something new, especially something as complicated and intricate as genetic inheritance. It’s like learning how a car engine works for the first time. You wouldn’t tackle how the engine works as a whole. You start with how two parts of it relate to one another and work together, and then add a third engine component and a fourth and a fifth until you finally understand how the whole engine works. This is how I’m approaching the Shared Matches tool.

78 people is just far too many to begin to unravel the mystery of this DNA analysis tool.

So I started hunting around within these results for someone who would match me and only a handful of other DNA cousins on the service within these Edgefield results.

Trial and error: the hunt is on for a DNA match who matched me as well as other Edgefield-based DNA matches.

So I searched around until I found the DNA cousin above. I’ll call her ‘Mary’. 

When I clicked on the ‘Shared Matches’ link on her page, this is what appeared:


Mary matches with me as well as 3 other people. Now this is a sample size I can work with when it comes to analyzing the Shared Match tool!

To re-cap, the 5 of us (Mary, myself and the 3 people who share DNA with Mary and me) share ancestors who lived in Edgefield. Which is exactly what I wanted. Now I have to unravel how, exactly, we’re related. Why are there only five of us, and not all of the other people who match me for Edgefield? Why do my known Edgefield cousins (from families like Matthews/Mathis, Holloway, Settles, Williams, Dorn, Ouzts, Peterson, Timmerman, Harlan/Harling, Gilchrist, Borum, etc) match me, but not Mary or the 3 matches she and I have in common? What family line do the five of us share that none of my other Edgefield relations do? Understanding this not only fills in a genealogy information gap. It will give me a sound insight into how the Shared Matches tool works. Only when I understand how this tool really works can I begin to extrapolate and apply what I’ve learned to other groups of shared matches.
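One rough way to picture what a shared-match list represents is to treat each tester's match list as a set: the shared matches between two people are then the overlap of the two sets. This is only a sketch of the idea. AncestryDNA's actual algorithm and thresholds aren't covered here, and every name in the lists below is invented for illustration.

```python
# Hypothetical match lists: all names are placeholders, not real matches.
my_matches = {"Mary", "Joe", "Match A", "Match B", "Cousin X", "Cousin Y"}
marys_matches = {"Me", "Joe", "Match A", "Match B", "Stranger Z"}

# Shared matches: people who appear on both my list and Mary's list.
shared = my_matches & marys_matches

# Cousins who match me but do not appear on Mary's list (excluding Mary herself).
not_shared = my_matches - marys_matches - {"Mary"}

print("Shared with Mary:", sorted(shared))
print("Match me only:", sorted(not_shared))
```

Seen this way, the question becomes which family line explains the overlap, and which lines explain the cousins left outside it.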

I have a feeling that this Shared Match tool is AncestryDNA’s compromise offering for not having a Chromosome Analysis tool like the ones available from Family Tree DNA and Gedmatch.

A chromosome analysis I ran on DNA cousin match group on Family Tree DNA.

AncestryDNA’s position on not having a chromosome analysis tool is entrenched. Like many others, I think it’s a bad call. Knowing the DNA segment lengths you share with DNA matches can provide critical insights. I have nicknames for parts of my chromosomes that I match others on in Gedmatch and FTDNA: names like my Roane segment or my St. Clair segment, my Josey segment or my Matthews/Mathis segment. Or I name them by region: those are my Arab chromosomes or my Central Asian chromosomes or my Jewish chromosomes. On FTDNA and Gedmatch, I don’t even have to know the name of a person to see how we’re related. More often than not all I need to see is which chromosome segment we match on and share.

While Ancestry’s Shared Matches tool is better than nothing, and will ultimately yield results, it’s not really a substitute for a chromosome analytics tool.

With that said, a few things have already piqued my interest with this group of DNA Shared Matches. Mary and two of the other matches have 100% European ancestry as reported by AncestryDNA. One person has an ancestry that’s as mixed as mine. Which, initially, tells me that the shared ancestor pair we have in common is most likely European. And looking at the major ethnicities of Mary and the two other European-descended matches, this common ancestral pair has the highest likelihood of roots in England, Wales, Scotland or Ireland.

And I’m very excited about the guy with an ancestry as mixed as my own, who I will call Joe. Why? Because I know Joe. I know how we’re related on our known white and black Edgefield lines. The common ancestors Joe and I already know about can’t be shared by Mary and the other 2 people in this shared match result. Which means I can exclude all of the ancestors and relations that Joe and I share in common when it comes to identifying the common ancestral pair that links us to Mary and the 2 others in these results. Somewhere out there is a new white family name I have yet to find. One that Joe and I don’t share with all of our other known Edgefield cousins who have taken the AncestryDNA test to date.
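The process of elimination described here can be sketched with sets as well. The surnames below are placeholders, not real findings: the idea is simply to take the family names that turn up in the shared-match group's trees and strike out the lines already known to be shared with the wider pool of Edgefield cousins.

```python
# Surnames appearing in the public trees of the shared-match group (hypothetical).
surnames_in_group_trees = {"Surname1", "Surname2", "Surname3", "Surname4"}

# Lines Joe and I already know we share with our wider Edgefield cousins (hypothetical).
known_lines_shared_with_joe = {"Surname2", "Surname4"}

# Candidates for the unknown common line: whatever survives the exclusion.
candidates = surnames_in_group_trees - known_lines_shared_with_joe

print("Lines still worth investigating:", sorted(candidates))
```

With a match group this small, the list of surviving candidates stays short enough to check by hand against each tree.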

This is the benefit of working with a small match pool. It narrows the parameters, which results in a narrower field of inquiry. And if this is all beginning to sound like a forensic, CSI-esque kind of experience? Well, it kind of is. Again, it makes my inner geek happy.

The next step is to dive into the family trees for these matches…if their trees are public. Heck, if they even have them at all. In this case, only one of the trees is private. So I have 3 trees to work with. Which is kind of a lucky break. Trust me, 3 trees to work with was more than I could have hoped for.

The next step will be applying what I will be learning about this tool to other Edgefield family match groups that are larger. And when I have a finely-tuned understanding of this tool? I’ll start applying it to the Irish and Jewish DNA cousins where absolutely nothing is known in terms of ancestors we share in common. That is my end-game.

 


Filed under ancestry, genealogy, Genetics

The Folly of Using Small Segments as Proof in Genealogical Research, Part One

Just when I was on the verge of publishing a series of posts based on recent intensive research findings…well, the brown stuff hit the fan in America. So much so that it seems terribly indulgent for me to post about my family’s genealogy. Suffice to say my genealogy and family history mojo is a bit askew. No, me not posting genealogical and family history materials isn’t going to affect the current social climate at play in the US one way or another. It’s my own thing; my headspace.

So, while I won’t be publishing family history related posts in the present climate, I will post general finds – items of interest that appear on my radar.

There’s a blog post by genetic genealogist CeCe Moore that certainly qualifies as just that. For those of you unfamiliar with Ms Moore, she is an independent professional genetic genealogist and television consultant for the PBS series Finding Your Roots.

In her post entitled The Folly of Using Small Segments as Proof in Genealogical Research, Part One, CeCe raises the question: just how reliable are small DNA fragments when it comes to genealogy research? Not very, as it turns out – or at least not in a manner that’s definitive.

It’s a great post. It’s definitely worth a read.

The Folly of Using Small Segments as Proof in Genealogical Research, Part One: http://www.yourgeneticgenealogist.com/2014/12/the-folly-of-using-small-segments-as.html 


Filed under Genetics

The real power of Gedmatch – the reconstruction of slavery-disrupted families in the US

Gedmatch’s best feature just so happens to be its original premise: allowing people who have had their DNA tested through Family Tree DNA, Ancestry.com and 23andme to upload their results and connect with long-lost family members. All of this for free too! The benefit is allowing people to make contact with others who have had their DNA tested with another testing service.

I’ve been blessed to make contact with long lost relations who either live in, or have a connection to, Edgefield County, South Carolina through Gedmatch. The effort that Edgefield-connected African Americans have made in stitching together a slavery-disrupted family tree is phenomenal. The bonhomie, the support and the goodwill in freely exchanging family information have been incredible.

And believe me, with enormous families like the Matthews, Harlings, Petersons, Holloways, Settles, Browns, et al – you tend to need all the help you can get.

The stories that have come to light have been brilliant. Simple little things, really. But they give such an insight into the day-to-day lives of our ancestors in Edgefield. The building of a church, the building of a small school for African American children, cousin so-and-so’s baptism, the pride our ancestors felt for their community – it’s the seemingly mundane stories that provide some of the best glimpses into the world of our grandparents, great-grandparents and beyond.

So, for me, this is the true benefit of Gedmatch. By connecting my DNA to others who haven’t tested with Ancestry, I’m making some great family discoveries.

The cool thing is that this active community of researchers, who are all related to one another, is succeeding. Bit by bit, record by record, snippet by snippet, the pieces of what I’m fondly calling our ‘super family’ are slowly falling into place. This has been no mean feat. ‘Slavery-disrupted’. By that I mean the systematic and ceaseless breaking apart of families, generation after generation, for over 250 years. Not all, I admit, but the majority of American slaves suffered this fate. In African American genealogy you will hit this brick wall sooner or later. You hit a point in black genealogy when you only have first names to go on, with only glimmers, hints and almost whispers to guide you in trying to fathom the identities of an ancestor’s parents, siblings, cousins, aunts and uncles.

So I am grateful that, at least for my Edgefield ancestors, there are many minds at work.

I can’t help but wonder what my Matthews grandmother and Harling great-grandmother would make of our discoveries.


Filed under AfAm Genealogy, Genetics

Gedmatch’s EthioHelix Africa-only DNA admixture test

This is the last post in the series covering free admixture analysis tools and how sub-Saharan admixtures are calculated and reported. You can read the full series of posts here: https://genealogyadventures.wordpress.com/tag/african-dna/

The last Gedmatch DNA admixture analysis test I’ve run is the EthioHelix K10 Africa Only test. It is a very interesting test indeed.

The developer of this test has front-loaded an important caveat on the test page: Results are currently only meaningful for persons who are 100% African. Ironically, considering I’m not 100% African, EthioHelix best represents the geographical spread of my African DNA of all the Gedmatch tests I’ve explored.

EthioHelix K10 Africa Only Admixture Proportions

Population
Nilo-Saharan 2.78%
East-Africa2 17.27%
Mbuti-Pygmy 1.40%
East_Africa1 3.17%
Khoi-San 1.52%
West_Africa 41.58%
Hadza 0.73%
Biaka-Pygmy 1.28%
North-Africa 28.50%
Omotic 1.77%

 

So no, the proportional weightings aren’t quite correct. However, taken as a representational concept, this test shows that many regions of Africa have contributed to my genetic makeup.
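For anyone who would rather look at the regional spread than the raw percentages, a short Python sketch can rank the components from the table above and confirm they add up to 100%. The figures are copied from my results; the variable names are just for illustration.

```python
# EthioHelix K10 Africa Only results, copied from the table above.
ethiohelix_k10 = {
    "Nilo-Saharan": 2.78,
    "East-Africa2": 17.27,
    "Mbuti-Pygmy": 1.40,
    "East_Africa1": 3.17,
    "Khoi-San": 1.52,
    "West_Africa": 41.58,
    "Hadza": 0.73,
    "Biaka-Pygmy": 1.28,
    "North-Africa": 28.50,
    "Omotic": 1.77,
}

# Sanity check: the components should account for the whole profile.
total = round(sum(ethiohelix_k10.values()), 2)

# Rank the components from largest to smallest contribution.
ranked = sorted(ethiohelix_k10.items(), key=lambda item: item[1], reverse=True)

for population, pct in ranked:
    print(f"{population:<14}{pct:>6.2f}%")
print("Total:", total)
```

Reading the ranking rather than the exact numbers fits the way I'd suggest treating these tests: as a picture of which regions contribute, not a precise measurement.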

One additional EthioHelix test I highly recommend exploring is the Admixture Proportions by Chromosome test.

I find the indicative results fascinating:

The thing that fascinates me about this test (which can be run for any of the admixture tests available on Gedmatch) is the peaks for each region in my chromosomal spread. I hold my hand up to say I haven’t grasped the significance of the peaks on specific chromosomes. I’m not a geneticist. I just find it amazing that the admixtures I’ve inherited influence some chromosomes and not others. I have some reading to do on this!

If you’re African-American, it’s definitely worth using this analysis option with the HarappaWorld test. Here are my results:

harappa-chromosome

One of my questions, using the Harappa test above as an example, is why some peoples/regions contribute to almost all of my chromosomes (i.e. Northeast European, Mediterranean and Central Asian/Caucasian) while others are only connected to a handful of chromosomes (i.e. Southeast Asian and Siberian). Does this mean I have fewer ancestors that were Southeast Asian and Siberian? Does time influence a relationship between admixtures and chromosomes? There are so many questions!

So what are my overall takeaways?

It’s definitely worth exploring the Gedmatch tests to see which regions of Africa have contributed to your own admixtures. I think that’s the best way to approach such tests. I wouldn’t concentrate too much on the percentages.

Overall, I’d say there is a distinct need in the marketplace for the development of an admixture analysis tool for African and African-descended peoples. Such a test needs to be developed by a team who have a deep and thorough knowledge of African DNA – especially how African admixtures have developed over the eons and how to give a proper weighting to certain aspects of African admixtures. In other words, develop an African-focused admixture analysis tool that is as sophisticated and as refined as comparable tests for European and Eurasian peoples.

In the next post, I’ll be writing about the true genius of Gedmatch: locating long lost relations who have used different DNA testing services.

 

 


Filed under AfAm Genealogy, Genetics

Gedmatch’s HarappaWorld admixture test answers my West African results question

I have been covering the challenges of African-descended DNA analysis in this series of posts. The children of African descent, particularly those in the Americas and the Caribbean, do not have a straightforward genetic inheritance. Through the centuries we have inherited a smorgasbord of genetics from wildly different populations. Putting our non-African inheritance to one side, slaves came from many parts of the African continent. And this is at the heart of my conundrum with the generously donated and free admixture tests available on Gedmatch.

I have been scratching my head over why my Gedmatch admixture test results were so completely skewed to West Africa. The results bore no resemblance to my West African heritage shown on a more comprehensive DNA test taken a year ago via Genebase.

I had my suspicions, namely that the people who have given their time to developing these free analytical tools could only create their analytical tools with publicly available admixture data sets. In other words, they had to work with what they could get. The fault, if there was one, lay in the data sets with which they were working. But I needed proof.

The smoking gun came in the form of the HarappaWorld test on Gedmatch. I am grateful that the developer of Harappa has been so transparent on the blog he created about this test.

I ran the Harappa test and you’ll see the results below:

Harappa

Population
S-Indian 0.54%
Baloch 3.43%
Caucasian 6.91%
NE-Euro 15.13%
SE-Asian
Siberian 0.12%
NE-Asian
Papuan 0.59%
American 0.94%
Beringian 0.07%
Mediterranean 12.04%
SW-Asian 1.82%
San 0.68%
E-African 2.62%
Pygmy 2.86%
W-African 52.25%
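To put the West African figure in context, here is a small Python sketch that works out how much of the African portion of these results is assigned to W-African. The numbers come straight from the table above (SE-Asian and NE-Asian are left out because no value was reported). Treating San, E-African, Pygmy and W-African as "the African components" is an assumption made for this sketch, not something the test itself defines.

```python
# HarappaWorld results, copied from the table above (blank rows omitted).
harappa = {
    "S-Indian": 0.54, "Baloch": 3.43, "Caucasian": 6.91, "NE-Euro": 15.13,
    "Siberian": 0.12, "Papuan": 0.59, "American": 0.94, "Beringian": 0.07,
    "Mediterranean": 12.04, "SW-Asian": 1.82, "San": 0.68,
    "E-African": 2.62, "Pygmy": 2.86, "W-African": 52.25,
}

# Assumption for this sketch: these four components make up the African portion.
african_components = ["San", "E-African", "Pygmy", "W-African"]

african_total = sum(harappa[name] for name in african_components)
west_share = harappa["W-African"] / african_total * 100

print(f"African components: {african_total:.2f}% of the overall result")
print(f"W-African as a share of the African components: {west_share:.1f}%")
```

On these figures, close to nine-tenths of the African portion lands in a single West African bucket, which is the skew I wanted to investigate.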

Again, West Africa is wildly out of proportion to what I know to be true. So I went to the Harappa blog (http://www.harappadna.org) to investigate. And I found background information on the test that would prove to be a goldmine.

The test uses populations from around the world. Which is great for those of us with complex non-African genetic inheritances. I could get an overall sense once again of just how mixed my admixtures really are. Again, looking at the table above, the non-African results are pretty much in line with what I know about my admixture makeup already. There are some omissions to be sure, but those populations weren’t included in the data sets for this test – it’s always a good idea to really read the background information for these tests to understand what world populations have and haven’t been included and what each test has been designed to measure.

Group results that form this test are shown in the bar chart below.

If I’m understanding this bar chart correctly, populations from the Caribbean and the US are influencing the results for West Africa – and not by a little bit either. The inclusion of either of these populations would drastically skew results for West Africa. Together, the weight they place on this result is dramatic. The Caribbean and the US should be their own categories, and not included within West African results. African descended peoples in either region are the children of Africa, not the progenitors.

Not all slaves who arrived in the Americas and the Caribbean came from West Africa, which is another point to consider. Yes, a sizable percentage of African slaves did come from this region. However, one shouldn’t assume that just because you’re African descended and live in these two regions that the larger part of your African DNA inheritance comes from West Africa. I’m a living example of this with the majority of my African DNA arising from Northwest Africa (Tuareg & Berber) with regions such as North, Central and East Africa contributing far more than my inheritance from either South or West Africa.

Again, kudos to Harappa’s creator for his candor and transparency. In his own words:

“…the admixture components do not necessarily represent real ancestral populations. Also, the names I have chosen for the components should be thought of as mnemonics to ease discussion. I chose them based on which populations in my data these components peaked in. They do not tell anything directly about ancestral populations. The best way to look at these admixture results is by comparing individuals and populations.”

This, in the end, answered my question. I would advocate that such an important qualifier should accompany the test itself.

Casting an eye down the bar chart provided above, making note of the various tribes who comprise results for West Africa, there are many who influence this result. Some I have to question. I will take the Kongo as an example. The larger part of the Kongo admixture (86%) is attributed to West Africa. In actuality, historically, this is a Northeast African tribe. Which is odd as Northeast Africa is only shown as contributing to 5% of this tribe’s admixture. In my Gedmatch DNA test, the Kongo tribe is attributed to Northeast Africa and accounts for roughly 22% of my admixture from Northeast Africa (Egypt is the largest contributor from this region of Africa at 53% via Genebase).

Again, this isn’t a criticism of the person behind the Harappa DNA analysis test. It’s more to do with the originator of the data set that was produced.

I have a strong feeling that this is the reason behind the skewed West African results for the tests I’ve done to-date on Gedmatch.

I have one more Gedmatch test to run and post about. However, what I will say at this point is that a more refined understanding of African admixtures is sorely needed for these kinds of admixture analysis tests. Personally, I would love for someone who understands the intricacies of Africa, its peoples and African admixtures to develop a Pan-African admixture test that is every bit as comprehensive and detailed as the European and Eurasian focused tests found on Gedmatch. That would be an amazing thing indeed.

I’m fortunate. I found a DNA testing service provider that answered this question for me already. My wish is for others to have a refined and accurately reflective picture of their own African genetic inheritance.

I have one last Gedmatch test to report on. It’s the EthioHelix K10 Africa Only Admixture test. In many ways, I saved the best Gedmatch admixture test for African descended peoples to last.


Filed under AfAm Genealogy, Genetics

Gedmatch’s Dodecad DNA analysis and African DNA results

Continuing the series of posts covering the various genetic admixture analytic tools hosted by Gedmatch, this post covers the Dodecad tool.

The team behind Dodecad carried out an extensive K=3 Admixture analysis of around 130 different populations and about 2,000 individuals from Europe, Asia, and Africa. Using the allele frequency results of this analysis, the Dodecad team were able to create an analytical model that represents West Eurasians, Asians, and Sub-Saharan Africans.

Because the tool is based on a relatively small genetic sampling, it’s worth understanding that some results will probably be skewed. I’d advise interpreting the proportional results as representative rather than as actual. While it is not a pan-African continental admixture analytic tool, I was pretty optimistic about the results it would provide.

Before you cast an eye over the results, it’s worth understanding two of the main classification terms used in the Dodecad tests:

  • Palaeo-Africans: Sub-Saharan African tribes including the San, Mbuti and Biaka Pygmy tribes; and
  • Neo-Africans: Sub-Saharan tribes including the Yoruba, Mandenka and Bantu-speaking tribes

Dodecad V3 Admixture Proportions

Population
East_European 3.40%
West_European 16.44%
Mediterranean 9.91%
Neo_African 32.43%
West_Asian 5.46%
South_Asian 1.04%
Northeast_Asian 0.21%
Southeast_Asian 0.95%
East_African 7.10%
Southwest_Asian 0.24%
Northwest_African 3.01%
Palaeo_African 19.81%

 

Africa9 Admixture Proportions

According to the explanatory notes, the number of SNPs for this analysis is small: there is probably noise in the minor components, but the major components of one’s ancestry should be well-defined. As such, this DNA analytical tool should be used by Africans and African-West Eurasian admixed individuals. It is not meant for people with additional admixture (e.g., South/East Asian or Native American).

Population
Europe 22.78%
NW_Africa 9.23%
SW_Asia 11.21%
E_Africa 3.32%
S_Africa 9.90%
Mbuti 2.17%
W_Africa 36.61%
Biaka 3.03%
San 1.74%

Given my own exceedingly mixed genetic inheritance, I was pretty happy with the basic snapshot of African DNA distribution given above – at least within the African populations that this test covers.

World9 Admixture Proportions

This test was designed to measure Amerindian admixtures.

An important caveat for Americans who suspect that they may have an Amerindian ancestor: trace amounts of Amerindian in this analysis might be attributable to ‘noise’. This component is also found in Siberia, and may represent either backflow from the Americas or the common ancestry of Siberian and Amerindian populations. I suspect that this is the case with my results, given what I already know about my genetic links to various Siberian cultures.

Population
Amerindian 1.10%
East_Asian 0.32%
African 56.93%
Atlantic_Baltic 24.20%
Australasian 0.50%
Siberian 0.15%
Caucasus_Gedrosia 6.50%
Southern 9.89%
South_Asian 0.41%

 

Dodecad K7b Admixture Proportions

This test was designed to focus on the analysis of African contributed admixtures.

Population
South_Asian 0.69%
West_Asian 7.24%
Siberian 0.60%
African 57.92%
Southern 8.49%
Atlantic_Baltic 24.53%
East_Asian 0.52%

 

Dodecad K12b Admixture Proportions

This test was designed to focus on the analysis of Eurasian contributed admixtures.

Population
Gedrosia* 3.04%
Siberian 0.75%
Northwest_African 0.73%
Southeast_Asian 0.34%
Atlantic_Med 13.33%
North_European 13.57%
South_Asian 0.82%
East_African 6.14%
Southwest_Asian 1.01%
East_Asian
Caucasus 7.62%
Sub_Saharan 52.65%

* OK, so I had to look this one up. Gedrosia is the Hellenized name of an area that corresponds to today’s Balochistan. It mainly includes southwestern Pakistan, southeastern Iran and a very small section of southwestern Afghanistan.

 

Alongside the Eurogenes K36 Admixture Proportions test, I’m pretty impressed by the suite of Dodecad tests. It’s the closest thing to a pan-African DNA analytical tool that I’ve experimented with to date on Gedmatch.
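As a rough cross-check, the African-labelled components from each of the Dodecad tables above can be added up and compared side by side. The sketch below does that in Python. Deciding which components count as "African" is an assumption made for this sketch (for example, Northwest_African and East_African are included), and the tests use different reference populations, so the totals are indicative only.

```python
# African-labelled components from each Dodecad table above.
# Assumption: the grouping of components as "African" is made for this sketch.
african_components = {
    "Dodecad V3": [32.43, 7.10, 3.01, 19.81],   # Neo, East, Northwest, Palaeo
    "Africa9": [9.23, 3.32, 9.90, 2.17, 36.61, 3.03, 1.74],  # all but Europe, SW_Asia
    "World9": [56.93],                          # African
    "K7b": [57.92],                             # African
    "K12b": [0.73, 6.14, 52.65],                # Northwest, East, Sub_Saharan
}

# Total African share reported by each test, rounded to two decimals.
african_totals = {
    test: round(sum(values), 2) for test, values in african_components.items()
}

for test, total in african_totals.items():
    print(f"{test:<12}{total:>6.2f}%")

# How far apart the highest and lowest totals are.
spread = round(max(african_totals.values()) - min(african_totals.values()), 2)
print("Spread between tests:", spread)
```

The totals sit within about ten percentage points of each other, with Africa9 the highest, which is perhaps unsurprising given that it isn't meant for people with additional admixture.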

The more I read about these free admixture analysis tools, the more I begin to realize that the data used to compile them comes from publicly available sources. In other words, there is limited access to data to compile large data sets which would provide truly refined results. The developers deserve props and kudos for spending an inordinate amount of time in developing free analytical tools.

It’s worth bearing the above in mind. If you’re of mixed African descent, my advice is to approach these free analytic tools as basic, illustrative overviews; unless you plan to have a full DNA test done.

Overall, I continue to be amazed at the additional genetic insights that are available via an Ancestry.com DNA test I uploaded to Gedmatch.


Filed under AfAm Genealogy, Genetics

Eurogenes DNA analysis and African DNA results

Like the MDLP test, the Eurogenes DNA analysis tool hosted on Gedmatch was developed largely to analyze European DNA. With that said, it does provide some very interesting insights into non-European genetic inheritances. On the whole, I found that the Eurogenes tools provided a deeper and more meaningful non-European DNA analysis than the MDLP test for a person with a very mixed genetic background (cue: that would be me).

One rather cool inclusion within Eurogenes is an Amerindian component that includes five native reference populations from North and Central America. This component should be useful for users from the Americas who are wondering about an Amerindian admixture. Indeed, for the first time in any DNA analysis I’ve done, I’m showing trace results of an Amerindian genetic inheritance. Sadly, there’s no indication of what Amerindian tribe(s) comprise these results.

Of all the Gedmatch-hosted DNA analysis tools I’ve used, the Eurogenes K36 Admixture Proportions analysis is the one that I’m the most excited about. I’d still say the results could be more finessed by splitting “West African” into Northwest Africa and West Africa. Some might say this is just semantics; the proverbial splitting of hairs. However, this finessing of the analytical results would better enable me to measure how the Berbers and Tuaregs of Northwest Africa and the West African Bantu-speaking tribes contribute to my DNA profile. It would also, as it so happens, help me measure the West African Jewish results in my DNA – a previously unknown component of my genetic makeup. As it stands, with this test, this differentiation is impossible to assess.

Still, compared to the MDLP test (which wasn’t designed to measure specific African populations), this is a step in the right direction for people who have an African heritage.

Each test below appears to be skewed towards certain populations, based on the reference samples for each analysis. That’s all I can say for now. There really isn’t much supporting information on the different nature of each test nor what, exactly, they’re measuring in terms of populations. Logic says each test must be measuring something different, hence the reason for its existence. I’ll continue to research what the differences are and share them when I know them.

Eurogenes K36 Admixture Proportions

Eurogenes-K36-Admixture-Proportions

Population
Amerindian 0.66%
Arabian
Armenian 0.32%
Basque 0.36%
Central_African 0.43%
Central_Euro 1.74%
East_African 5.29%
East_Asian
East_Balkan 0.20%
East_Central_Asian
East_Central_Euro 1.07%
East_Med
Eastern_Euro 2.56%
Fennoscandian 0.63%
French 1.85%
Iberian 7.32%
Indo-Chinese
Italian 8.34%
Malayan
Near_Eastern 1.32%
North_African
North_Atlantic 4.91%
North_Caucasian 0.51%
North_Sea 6.33%
Northeast_African
Oceanian 0.38%
Omotic
Pygmy 4.16%
Siberian
South_Asian 0.15%
South_Central_Asian 1.30%
South_Chinese
Volga-Ural
West_African 49.08%
West_Caucasian 1.09%
West_Med

One note regarding the above. I know that I have a significant Egyptian heritage from my maternal and paternal lines, so I would naturally expect Egypt to fall within the East African results. I’m puzzled at the absence of any Egyptian results in this category.

The test below was designed to analyze Ashkenazi Jewish admixtures. This is a difficult admixture to analyze for a myriad of reasons. I’d suggest reading about the parameters of this test, which you can do here: http://bga101.blogspot.com/2012/09/eurogenes-ashkenazim-ancestry-test-files.html

Jtest Admixture Proportions

Jtest-Admixture-Proportions

Population
South Baltic 2.69%
Eastern Europe 2.83%
North Central Europe 8.63%
Atlantic 9.80%
Western Mediterranean 3.22%
Ashkenazi 6.97%
Eastern Mediterranean 3.10%
West Asian 2.03%
Middle Eastern 1.41%
South Asian 1.46%
East African 10.86%
East Asian
Siberian 0.52%
West African 46.47%

A note about the above: You learn something new every day. I remember how surprised I was when I discovered Indian Jewish DNA in my results. I learned that India was home to an ancient Jewish community. Now I’ve discovered that West Africa also had an ancient Jewish community: the Jews of the Bilad al-Sudan (אַהַל יַהוּדּ בִּלַדּ אַל סוּדָּן).

There were West African Jewish communities who were connected to known Jewish communities from the Middle East, North Africa, or Spain and Portugal (what we would term Sephardic Jews) – as well as trading and living alongside the Berbers. Various historical records attest to their presence at one time in the Ghana, Mali, and Songhai empires, then called the Bilad as-Sudan from the Arabic meaning Land of the Blacks. These are people and places I have ancient genetic links to.

Jews from Spain, Portugal, and Morocco in later years also formed communities off the coast of Senegal and on the Islands of Cape Verde. These communities continued to exist for centuries – but have since disappeared due to changing social conditions, migration and the Trans-Atlantic Slave trade. A lost tribe if ever there was one. You can read more about the Western African Jewish community and its history here: http://en.wikipedia.org/wiki/Jews_of_Bilad_el-Sudan

Eurogenes K13

euroegenes-13

Population  Key:

North European – Lithuania
West African – Nigeria
Mediterranean – Sardinia
Northeast African – Ethiopia Oromo
North Eurasian – Central Siberia
South Asian – South India
Southwest Asian – Bedouin
Pygmy – Mbuti Pygmy
Caucasus – Georgia
East Siberian – Koryaks
East Asian – Eastern China
Amerindian – South America
West Central Asian – Balochistan

Population
North_Atlantic 17.98%
Baltic 6.20%
West_Med 4.73%
West_Asian 3.07%
East_Med 7.01%
Red_Sea 0.73%
South_Asian 0.93%
East_Asian
Siberian
Amerindian 1.15%
Oceanian 0.98%
Northeast_African 2.59%
Sub-Saharan 54.62%

Eurogenes EUtest V2 K15 Admixture Proportions

Eurogenes-EUtest-V2-K15-Admixture-Proportions

Population
North_Sea 10.70%
Atlantic 13.45%
Baltic 3.65%
Eastern_Euro 0.78%
West_Med 2.29%
West_Asian 4.28%
East_Med 3.28%
Red_Sea 1.44%
South_Asian 0.69%
Southeast_Asian
Siberian
Amerindian 0.91%
Oceanian 0.88%
Northeast_African 2.53%
Sub-Saharan 55.13%

Eurogenes K9b Admixture Proportions

Eurogenes-K9b-Admixture-Proportions

Population Key:

South Asian – South India
Caucasus – Georgia
Southwest Asian – Bedouin
North Amerindian + Arctic – Northwest America
Siberian – Central Siberia
Mediterranean – Sardinia
East Asian – Eastern China
West African – Nigeria
North European – Lithuania

Population
Southwest_Asian 6.75%
Native_American 1.18%
Northeast_Asian 0.10%
Mediterranean 8.52%
North_European 26.00%
Southeast_Asian 0.46%
Oceanian 0.56%
South_African 3.09%
Sub-Saharan_African 53.34%

Eurogenes K9 Admixture Proportions

Eurogenes-K9-Admixture-Proportions

Population
South Asian 0.95%
Caucasus 7.51%
Southwest Asian 1.48%
North Amerindian + Arctic 0.82%
Siberian
Mediterranean 11.32%
East Asian
West African 57.99%
North European 19.93%

Eurogenes K10 Admixture Proportions

Eurogenes-K10-Admixture-Proportions

Population Key:

South Asian – South India
Caucasus – Georgia
Southwest Asian – Bedouin
North Amerindian + Arctic – Northwest America
Siberian – Central Siberia
Mediterranean – Sardinia
East Asian – Eastern China
West African – Nigeria
East European – Belarus
North Atlantic – Ireland

Population
South Asian 0.89%
Caucasus 7.48%
Southwest Asian 1.48%
North Amerindian + Arctic 0.81%
Siberian
Mediterranean 6.95%
East Asian
West African 57.98%
East European 4.33%
North Atlantic 20.07%

Eurogenes K11 Admixture Proportions

Eurogenes-K11-Admixture-Proportions

Population Key:

South Asian – South India
Caucasus – Georgia
Southwest Asian – Bedouin
North Amerindian + Arctic – Northwest America
Siberian – Central Siberia
Mediterranean – Sardinia
East Asian – Eastern China
West African – Nigeria
Volga-Ural – Western Volga
South Baltic – Lithuania
North Atlantic – Ireland

Population
South Asian 0.89%
Caucasus 7.46%
Southwest Asian 1.45%
North Amerindian + Arctic 0.80%
Siberian
Mediterranean 6.67%
East Asian
West African 57.98%
Volga-Ural 0.69%
South Baltic 4.63%
North Atlantic 19.44%

Eurogenes K12 Admixture Proportions

Eurogenes-K12-Admixture-Proportions

Population Key:

South Asian – South India
Caucasus – Georgia
Southwest Asian – Bedouin
North Amerindian + Arctic – Northwest America
Siberian – Central Siberia
Mediterranean – Sardinia
East Asian – Eastern China
West African – Nigeria
Volga-Ural – Western Volga
South Baltic – Lithuania
Western European – Western Ireland
North Sea – Southern Norway

Population
South Asian 0.87%
Caucasus 7.28%
Southwest Asian 1.41%
North Amerindian + Arctic 0.79%
Siberian
Mediterranean 6.00%
East Asian
West African 57.97%
Volga-Ural 0.26%
South Baltic 3.54%
Western European 11.71%
North Sea 10.17%

Eurogenes K12b Admixture Proportions

Eurogenes-K12b-Admixture-Proportions

Population Key:

Western European – Cornwall
Siberian – Central Siberia
East African – Masaai
West Central Asian – Balochistan
South Asian – South India
West African – Nigeria
Caucasus – Georgia
Finnish – Eastern Finland
Mediterranean – Sardinia
Southwest Asian – Bedouin
North European – Lithuania
East Asian – Eastern China

Population
Western European 14.86%
Siberian 0.14%
East African 5.19%
West Central Asian 1.12%
South Asian 1.48%
West African 52.54%
Caucasus 6.04%
Finnish 2.60%
Mediterranean 7.17%
Southwest Asian 1.85%
North European 6.79%
East Asian 0.23%

A note about the above: Well, you can imagine my surprise when I saw Cornwall in these results, considering I lived there for nearly 13 years and never had a clue. I absolutely loved it. To say I felt an affinity with the place would be an understatement. What I really think this is saying is that there is an English Celtic genetic inheritance. Like the Welsh, the Cornish are the last remnants of the Celtic peoples who inhabited England.

If I am indeed descended from the specific Roane family line that I think I am, then this result would make sense. This line of Roanes is descended from Robert the Bruce of Scotland and Edward I “Longshanks” of England, both of whom had genetic links to Wales and Cornwall.

EUtest Admixture Proportions

EUtest-Admixture-Proportions

Population
Southern Baltic 2.76%
Eastern European 3.26%
Northern & Central European 9.15%
Atlantic 10.71%
Eastern Mediterranean 4.08%
Western Mediterranean 4.89%
Western Asian 3.11%
Middle Eastern 2.57%
Southern Asian 1.47%
East African 11.02%
Eastern Asian
Siberian 0.56%
West African 46.41%

Eurogenes Hunter-Gatherer vs. Farmer Admixture Proportions

Eurogenes-Hunter-Gatherer-vs.-Farmer-Admixture-Proportions

Population Key:

Anatolian Farmer – Western Caucasus
Baltic Hunter Gatherer – Lithuania
Middle Eastern Herder – Bedouin
East Asian Farmer – Eastern China
South American Hunter Gatherer – South America
South Asian Hunter Gatherer – South India
North Eurasian Hunter Gatherer – Central Siberia
East African Pastoralist – Masaai
Oceanian Hunter Gatherer – Papua New Guinea
Mediterranean Farmer – Sardinia
Pygmy Hunter Gatherer – Mbuti Pygmy
Bantu Farmer – West Africa

Population
Anatolian Farmer 7.14%
Baltic Hunter Gatherer 16.85%
Middle Eastern Herder 1.93%
East Asian Farmer
South American Hunter Gatherer 0.87%
South Asian Hunter Gatherer 0.94%
North Eurasian Hunter Gatherer
East African Pastoralist 4.57%
Oceanian Hunter Gatherer 0.56%
Mediterranean Farmer 13.42%
Pygmy Hunter Gatherer 3.99%
Bantu Farmer 49.73%

My overall feeling on the Eurogenes tests is that the DNA analyses they provide are useful for people with a significant African genetic inheritance, more so than the MDLP test. The K13 test is especially useful.

My main note is I think the results for West Africa are too skewed and warrant closer scrutiny and development.  In all my other genetic tests, Nigeria only contributes a very marginal amount to my overall genetic makeup.  However, to look at my Eurogenes results, one would be forgiven for believing that Nigeria was my ancestral African homeland. So this is something to be mindful of.

Again, this could mirror the point I made in my previous post about the MDLP test: African data sets tend to get lumped together within the analytical database. As such, the tests that I’ve been using to date don’t provide a true representation of Africa’s heterogeneous populations as they do with European and Asian populations.
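To put a number on how much the West African figure moves from one test to another, here is a minimal Python sketch. It uses only the percentages reported in the tables above for the tests that label the component "West African"; the dictionary keys are my own shorthand for the test names, not anything Gedmatch outputs.

```python
# West African percentages copied from the result tables in this post.
# The short test names used as keys are shorthand, not Gedmatch identifiers.
west_african = {
    "K36": 49.08,
    "Jtest": 46.47,
    "K9": 57.99,
    "K10": 57.98,
    "K11": 57.98,
    "K12": 57.97,
    "K12b": 52.54,
    "EUtest": 46.41,
}

# Find which tests report the smallest and largest West African share.
lowest = min(west_african, key=west_african.get)
highest = max(west_african, key=west_african.get)

# Difference between the two extremes, in percentage points.
spread = round(west_african[highest] - west_african[lowest], 2)

print(f"Lowest:  {lowest} at {west_african[lowest]}%")
print(f"Highest: {highest} at {west_african[highest]}%")
print(f"Spread:  {spread} percentage points")
```

The same DNA file yields a West African share that differs by more than eleven percentage points depending on the test, which is why I treat any single figure with caution.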

Filed under AfAm Genealogy, AfAm History, Genetics

The problem with sub-Saharan Africa and DNA analysis tools

This is the first post in a series that covers issues I’ve experienced with reporting of sub-Saharan African results in DNA analysis. This series of posts will have a particular emphasis on DNA testing for African Americans. Over the next series of posts, I’ll be looking at the strengths and weaknesses of DNA admixture analysis tools – with tips for things to look out for.

I recently had the opportunity to upload my Ancestry.com DNA results to Gedmatch.com. And what a revelatory experience Gedmatch.com has been. To be honest, this DNA analysis service is proving fascinating. There is just so much to explore and comprehend. I have been doing a LOT of research in order to get my head around all of the information Gedmatch has provided.

My experience with Gedmatch has better enabled me to finely tune a quibble I’ve had with my Ancestry.com results. Don’t get me wrong, Ancestry’s DNA test has done exactly what I wanted it to – put me in touch with distant (and not so distant) relations from my various family lines. It’s allowed me to find my Sheffey 4x great-grandfather. And it put me on the right track towards identifying my Roane 4x great-grandfather.

My niggle with Ancestry’s results has to do with my admixtures and the countries it genetically tied me to. These results were always going to be general in nature. Ancestry.com states as much. The quibble I had has to do with Africa. And my recent experience with Gedmatch has allowed me to better understand the nature of my quibble.

DNA test results are based on data sets. These data sets are compiled into DNA test result databases. A database can only be as precise as the data that’s put into it. In this case, precise DNA results rely on large numbers of a population 1) having a DNA test and 2) having those results added to a data set which is imported into a database. For instance, a data set with 200,000 DNA results from the Baltic region of Eastern Europe will provide more precise insights than a data set of 50,000 individuals from the same region. It also depends on how each individual is classified and sub-classified (e.g. Bulgarian, Caucasian Bulgarian, Central Asian Bulgarian, Altaic Bulgarian, etc).
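As a rough illustration of why the larger data set is more precise, here is a minimal Python sketch. It rests on a simplifying assumption of mine: that a reference data set is being used to estimate how common a genetic marker is in a region, and that the samples are independent. The 30% marker frequency is made up for the example, and real reference panels are far more complicated than this.

```python
import math


def standard_error(frequency: float, sample_size: int) -> float:
    """Standard error of a frequency estimated from independent samples."""
    return math.sqrt(frequency * (1 - frequency) / sample_size)


# Hypothetical marker carried by 30% of the regional population.
frequency = 0.30

se_small = standard_error(frequency, 50_000)   # the smaller data set
se_large = standard_error(frequency, 200_000)  # the larger data set

# How many times less uncertain the larger data set is.
ratio = se_small / se_large

print(f"Uncertainty with 50,000 samples:  +/- {se_small:.4%}")
print(f"Uncertainty with 200,000 samples: +/- {se_large:.4%}")
print(f"Improvement factor: {ratio:.1f}x")
```

Under these assumptions, four times as many samples halves the uncertainty, because the error shrinks with the square root of the sample size. That is also why thinly sampled regions end up lumped into broad categories.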

This brings me to my quibble about Africa. The way African DNA test results are classified, you would think Africa was one large country populated by a homogeneous people. This simply is not the case. The continental African population is arguably one of the most heterogeneous populations in the world. The admixture analysis tools and reports I’ve used on Ancestry.com and Gedmatch simply don’t reflect this diversity of African peoples.

For instance, I know that the central African pygmy populations have contributed roughly 2% to my genetic makeup. This comes from my mother’s mtDNA as well as through my paternal grandmother’s DNA as evidenced by my Genebase Y-DNA and mtDNA tests as well as my father’s mtDNA test.

Now where things get tricky is what’s classed as ‘Sub-Saharan Africa’.

Map of Africa

Ancestry.com, along with a number of Gedmatch’s DNA analysis tools, takes the literal approach: all countries below the Sahara desert. Genebase, on the other hand, does not. Genebase, for instance, has categorized the territory from Western Sahara to Niger and south to Nigeria as Northwestern Africa. On its service you will also find North Central Africa, West Africa, Eastern Africa, Central Africa and so on. These sub-classifications of Sub-Saharan regions (and their peoples) allow for far more accurate interpretation for DNA analysis purposes. They are also much more meaningful.

Based on this classification, my 18% African result is primarily spread across: Northwest (4%), Western (2%), Northern (5%), North Central (3%) and Eastern (4%) Africa. This is more meaningful than a report that simply says either 18% African or, specifically, 12% sub-Saharan African.

For someone who is developing a travel-adventure series based on his DNA results, I’m a stickler for DNA reporting accuracy.

Gedmatch & the MDLP DNA analysis tool

So first up is the MDLP DNA analysis tool, which can be found on Gedmatch.

MDLP is a bio-geographical analysis project for the territories of the former Grand Duchy of Lithuania. Lithuania should have been my first clue. It was only after I saw the first set of results that I discovered that MDLP was designed for individuals with European and some Eurasian ancestry (mostly Finno-Uralic and Altaic). This tool is not recommended for inferring African-American, East-Asian etc. ancestry.

You’ll see why this tool wouldn’t be particularly useful to people of largely African or East Asian ancestry:

MDLP World-22 Admixture Proportions

MDLP-World-22-results

Population  
Pygmy 2.63%
West-Asian 3.99%
North-European-Mesolithic 0.53%
Indo-Tibetan
Mesoamerican
Arctic-Amerind
South-America_Amerind 0.09%
Indian 1.86%
North-Siberean 0.31%
Atlantic_Mediterranean_Neolithic 13.71%
Samoedic
Indo-Iranian 1.61%
East-Siberean
North-East-European 12.89%
South-African 0.78%
North-Amerind 1.38%
Sub-Saharian 54.86%
East-South-Asian
Near_East 5.30%
Melanesian 0.08%
Paleo-Siberian
Austronesian

The sub-Saharan results were all out of proportion to what I already knew, which made me go back to do some more research on this particular analysis. That’s when I found it was actually created to analyze European and Eurasian admixtures. Basically, this tool takes quite a literal and generous view of what’s meant by sub-Saharan.

However, where this tool has been interesting, for me, is in analyzing exactly what it was meant to – my European and Eurasian admixtures.

Variations of this test can be found below. Each has a different emphasis. I’m still researching what the emphasis of each actually is. There isn’t much information available. My DNA contact is off doing his research about this series of tools. The basic clue is in the name: “proportions”. However, I’m in the dark about what’s being proportionally measured – or why results for each geographical region can differ so staggeringly from one sub-test to another.
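One thing the tables themselves do show is that each test’s percentages add up to roughly 100%. Here is a minimal Python sketch that checks this for two of the tables in this post, World-22 and K=5. The variable names are mine. My reading of the result is an assumption rather than something documented by the tool’s authors: if every test has to divide the same DNA among its own list of categories, then a test with only a handful of categories has to fold everything into broader buckets, which may be one reason the figures shift so much.

```python
# Percentages copied from the MDLP tables in this post.
# Categories with no reported value are left out (treated as zero).
mdlp_world22 = {
    "Pygmy": 2.63,
    "West-Asian": 3.99,
    "North-European-Mesolithic": 0.53,
    "South-America_Amerind": 0.09,
    "Indian": 1.86,
    "North-Siberean": 0.31,
    "Atlantic_Mediterranean_Neolithic": 13.71,
    "Indo-Iranian": 1.61,
    "North-East-European": 12.89,
    "South-African": 0.78,
    "North-Amerind": 1.38,
    "Sub-Saharian": 54.86,
    "Near_East": 5.30,
    "Melanesian": 0.08,
}

mdlp_k5 = {
    "East-Eurasian": 24.68,
    "West_Eurasian": 4.08,
    "Caucasian": 32.99,
    "South-Asian": 12.02,
    "Paleo_Mediterranean": 26.24,
}


def total(proportions: dict) -> float:
    """Sum of all reported percentages, rounded to two decimals."""
    return round(sum(proportions.values()), 2)


for name, table in [("World-22", mdlp_world22), ("K=5", mdlp_k5)]:
    print(f"{name}: {len(table)} non-zero categories, total {total(table)}%")
```

Both totals land within a few hundredths of 100%, with the small excess coming from rounding in the reported figures. Note that the K=5 test has no African category at all, so the DNA that World-22 reports as Sub-Saharian has to be reported under one of its five labels.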

If anyone out there actually understands what aspects of a person’s admixtures these analyses measure, feel free to post in the comment section below.

MDLP World Admixture Proportions

MDLP-World-Admixture

Population
Caucaus_Parsia 5.26%
Middle_East 5.45%
Indian 2.04%
South_and_West_European 17.20%
Melanesian 0.07%
Sub_Saharian 49.22%
North_and_East_European 11.00%
Arctic_Amerind 0.74%
East_Asian
Paleo_African 8.48%
Mesoamerican 0.56%
North_Asian

 

MDLP K=5 Admixture Proportions

MDLP-K=5-Admixture

Population
East-Eurasian 24.68%
West_Eurasian 4.08%
Caucasian 32.99%
South-Asian 12.02%
Paleo_Mediterranean 26.24%

 

MDLP K=6 Admixture Proportions

MDLP-K=6-Admixture

Population
South_Asian 11.92%
Caucasian 32.59%
North_West_Eurasian 4.29%
West_Eurasian 1.85%
Paleo_Mediterranean 26.01%
East_Euroasian 23.34%

 

MDLP K=7 Admixture Proportions

MDLP-K=7-Admixture

Population
Volga_Uralic 3.78%
Paleo_Mediterranean 25.80%
Altaic_Turkic 22.87%
South_Central_Asian 11.78%
Caucasian 32.27%
Paleo_Scandinavian 1.97%
West_Eurasian 1.54%

 

 MDLP K=8 Admixture Proportions

MDLP-K=8-Admixture

Population
Altaic_Turkic 22.81%
Paleo_Scandinavian 1.38%
South_Central_Asian 11.65%
East_European
West_European 10.73%
Caucasian 25.41%
Paleo_Mediterranean 24.75%
Volga_Finnic 3.27%

My question with the above results is: Where has the Eastern European component from the other results gone? It disappears from this point onwards.

MDLP K=9 Admixture Proportions

MDLP-K=9-Admixture-Proportions

Population
Paleo_Balkanic 0.39%
Caucasian 25.06%
East_European
Volga_Finnic 3.32%
South_Central_Asian 11.62%
Paleo_Mediterranean 25.54%
Altaic_Turkic 22.72%
West_European 9.97%
Paleo_Scandinavian 1.38%

 

MDLP K=10 Admixture Proportions

MDLP-K=10-Admixture-Proportions

Population
Altaic_Turkic 22.62%
South_Central_Asian 11.56%
Paleo_North_European 1.28%
Paleo_Mediterranean 25.44%
Iberian 5.23%
Caucasian 23.00%
East_European
Paleo_Balkanic 0.40%
British 7.42%
Volga_Finnic 3.05%

 

MDLP K=11 Admixture Proportions

MDLP-K=11-Admixture-Proportions

Population
Paleo_Balkanic 0.39%
Celto_Germanic 7.37%
Caucasian 22.80%
Volga_Uralic 1.22%
Iberian 5.04%
Altaic_Turkic 22.56%
Paleo_North_European 1.27%
South_Central_Asian 11.47%
East_European
Uralic_Permic 2.55%
Mediterranean 25.34%

 

 MDLP K=12 Admixture Proportions

MDLP-K=12-Admixture-Proportions

Population
East_European
Paleo_Mediterranean 25.19%
Iberian 5.08%
Caucasian 22.52%
Uralic_Permic 2.63%
Balto_Finnic 1.21%
Paleo_Balkanic 0.37%
Celto_Germanic 7.23%
Paleo_North_European 0.25%
South_Central_Asian 11.48%
Volga_Uralic 1.27%
Altaic_Turkic 22.77%

So, while not particularly insightful for my African DNA associations, it has been very insightful for others. The Paleo Mediterranean results are largely in line with my Genebase results and incorporate my results associated with Sicily, Smyrna (Greece), and what we would think of as the Phoenicians (Malta, Cyprus and present day Lebanon).

The other Paleo findings are new. So I’m definitely looking forward to finding out more about them.

I remain absolutely fascinated by my Altaic and Caucasus results…a probable legacy from the ancient Silk Road trade route.

If you’re African American and your Ancestry.com or 23andMe results are showing European and/or Eurasian results, this DNA analysis tool is worth investigating.

Filed under AfAm Genealogy, Genetics