Tag Archives: Twitter Names

Turkish Onomastics and Migration Patterns

Next week at Regent’s University Turkish Migration Conference (TMC2014, London), Elian Carsenat will present breakthrough data mining technology to apply onomastics (the recognition of personal names) to the discovery of new migration patterns.


As states struggle to provide timely and accurate data to international organizations (such as the OECD, the IOM, the United Nations High Commissioner for Refugees (UNHCR), …), these organizations can turn to Big Data to identify and monitor new trends. What can Twitter, LinkedIn, Google, Facebook, D&B, Thomson WoS … tell us about the changing migration patterns of highly educated professionals and entrepreneurs? We’ll present how applied onomastics and Big Data can be a game changer in migration studies, with vast implications for how countries or even regions can engage their Diaspora (to attract FDI and remittances, to build networks of expertise, …).

We look forward to seeing you at Regent’s University Turkish Migration Conference (TMC2014, London). Full program here.

Download the supporting presentation: 20140530_TMS2014_Pitch_vFf.pdf



Filed under EthnoViz

DataViz of the Dutch Digital Diaspora

As the final map in our Twitter GEOnomastics series, today we present the Dutch e-Diaspora.

To prepare the mapping, we recognized Twitter names as Dutch, Turkish or Spanish and filtered those having a geotag (~3% of tweets).
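The filtering step can be sketched in a few lines of Python. This is only an illustration, assuming each tweet has already been labelled with a name origin; the record layout, field names and sample values below are invented for the example and are not NamSor data or a NamSor API.

```python
from collections import Counter

# Hypothetical records: field names and values are invented for illustration.
# "origin" stands for the output of a name classifier, "geo" for an optional geotag.
tweets = [
    {"name": "Jan de Vries", "origin": "Dutch", "geo": (52.37, 4.90)},
    {"name": "Ayse Yilmaz", "origin": "Turkish", "geo": None},
    {"name": "Carlos Garcia", "origin": "Spanish", "geo": (40.42, -3.70)},
    {"name": "Acme Corp", "origin": None, "geo": (51.51, -0.13)},
]


def geotagged_by_origin(records, origins):
    """Keep records whose name origin is in `origins` and that carry a geotag."""
    return [r for r in records if r["origin"] in origins and r["geo"] is not None]


selected = geotagged_by_origin(tweets, {"Dutch", "Turkish", "Spanish"})

# One map layer per origin: count how many points each layer would receive.
layer_counts = Counter(r["origin"] for r in selected)
```

Each remaining record has both a recognized origin and a coordinate pair, which is all a map layer needs.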

Emigration from the Netherlands has been happening for at least the last eight centuries. In several former Dutch colonies and trading settlements, there are ethnic groups of partial Dutch ancestry. Emigrants from the Netherlands since the Second World War went mainly to the United States, Canada, Australia, New Zealand, and until the 1970s South Africa. There are recognisable Dutch immigrant communities in these countries. Smaller numbers of Dutch immigrants can be found in most developed countries. In the last decade, short-range cross-border migration has developed along the Netherlands borders with Belgium and Germany. Source: Wikipedia

To access the interactive map: http://cdb.io/1fsjItu

Dutch Digital Diaspora

Finally, we present the summary of the different Twitter GEOnomastics mappings we’ve published so far:

 

NamSor Applied Onomastics is a European vendor of name recognition software (NamSor sorts names), which aims to help understand international flows of money, ideas and people. namsor.com

NamSor will be at Big Data Paris on 2 April 2014 and will present, at 5 PM, the potential benefits of mining Big Data to reduce inequalities and to promote Foreign Direct Investment in less favoured territories through Diaspora Marketing. Meet us there!


Filed under EthnoViz, General

Hispanic, French, German names in the United States

NamSor has mapped Hispanic Twitter accounts around the world. Not just Hispanic: French and German as well.

This interactive world map of the Hispanic, French and German e-Diasporas was produced using Twitter account data.

To access the interactive map, click here: http://cdb.io/1dqVd2n


Twitter is an interesting source because about 3 per cent of Twitter accounts opt in to show the location of their Tweets (using GPS from a smartphone), so those Tweets can be visualised on a map.

Our method of anthroponomical classification can be summarized as follows: judging from the Twitter name only and the publicly available list of all ~150k Olympic athletes since 1896, for which team would the person most likely run (of France, Spain, Germany)?
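To make the idea concrete, here is a minimal sketch of one way such a question could be answered: a character-trigram naive Bayes classifier trained on a reference list of names per team. It is an assumption for illustration, not the NamSor algorithm, and the tiny reference list below is invented toy data, not taken from the Olympic athlete list.

```python
import math
from collections import Counter, defaultdict

# Invented toy reference list standing in for a real list of athletes per team.
REFERENCE = {
    "France": ["Jean Dupont", "Marie Lefevre", "Pierre Moreau", "Luc Girard"],
    "Spain": ["Jose Garcia", "Maria Fernandez", "Carlos Rodriguez", "Luis Martinez"],
    "Germany": ["Hans Schmidt", "Klaus Schneider", "Karl Fischer", "Heinrich Wagner"],
}


def trigrams(name):
    """Split a lower-cased, space-padded name into overlapping 3-character chunks."""
    padded = f"  {name.lower()} "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def train(reference):
    """Count trigrams per team and collect the overall trigram vocabulary."""
    counts = defaultdict(Counter)
    for team, names in reference.items():
        for name in names:
            counts[team].update(trigrams(name))
    vocab = {gram for team_counts in counts.values() for gram in team_counts}
    return counts, vocab


def most_likely_team(name, counts, vocab):
    """Return the team with the highest add-one-smoothed log-likelihood for the name."""
    scores = {}
    for team, team_counts in counts.items():
        total = sum(team_counts.values())
        # The extra +1 in the denominator reserves mass for trigrams never seen in training.
        scores[team] = sum(
            math.log((team_counts[gram] + 1) / (total + len(vocab) + 1))
            for gram in trigrams(name)
        )
    return max(scores, key=scores.get)


COUNTS, VOCAB = train(REFERENCE)
guess = most_likely_team("Juan Fernandez", COUNTS, VOCAB)
```

With a reference list this small the result is only a toy; the point is that the decision rests on the name alone, compared against names already associated with each team.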




Filed under EthnoViz

NamSor and @FDIMagnet wish you a Happy New Year 2015!

Happy New Year!

About: self-organized, FDI-related Twitter names; dataviz done using Gephi.


Filed under General

Making sense of Big Data : mining Twitter names

Millions of geo tweets in various languages, discussing anything from ‘hey, I’m here‘ to finance, geopolitics or marketing. How do you make sense of them?

We’ve used name recognition (applied onomastics) to filter information and produce unique maps of the e-Diasporas. Where are the digitally connected Italians, Turks and Russians today? They may be migrants, tourists, business travellers, students, visiting scientists…

To jump directly to the interactive map, click here : http://cdb.io/1iSeWw2 or read more about our methodology.

Italian, Russian, Turkish Twitter

TIP: Filter out layers and zoom in/out.
Below, we filtered out the Turkish Twitter layer to visualize where Russian and Italian tourists go on holiday in Turkey.

Russian, Italians in Turkey


The Italian America :

Italian America




Filed under EthnoViz

What’s in a Twitter name? A glance at the Irish digital Diaspora

To jump directly to the interactive map, click here : http://cdb.io/1beWaVB

(onomastics.co.uk reblog)

It’s been a while since I published my first ‘Feature of the Month’ on onomastics.co.uk, and I can measure the progress made since. The article, published in March 2013, showed maps of French and English investments in Africa, established by recognizing the names of Company Directors, instead of the traditional measurement of capital flows (FDI).

At the time, NamSor Applied Onomastics software was new and I was still exploring how such a data mining tool, which recognizes personal names, could be useful. I was uncertain whether the social benefits would exceed the risks inherent in such powerful technology.

Names are a Code and contain a lot of information about an individual, but there is no determinism. Human groups of different levels can be recognized through names, but human societies are fractals. Each group can be broken down again and again, from different angles. A first name, a last name, a Twitter handle are part of a person’s identity and may indicate a social intent, the belonging to an ethnic/linguistic group, a geographic origin, beliefs, … However, at the finest-grained level, every individual is unique and an exception to the group.

Genetic code, at one point, was thought to contain all the information needed to ‘build’ an individual from the physical point of view. After years of research, it seems that part of the information and the ‘algorithm’ are elsewhere… Still, there is huge interest in applied services such as 23andMe that ‘decrypt’ the genetic code to provide insights into a person’s ancestry, as well as hints about potential health issues.

The Name Code and the Genetic Code share the same ability to fascinate : each can somehow statistically be recognized to have an influence on your life, social status, average income, career… both relate to a family history. Each Code can be misleading and yet insightful. Fleur Pellerin, the French SME & ICT Minister, was born Kim Jong-suk in South Korea. She is both truly French and truly Korean, one name indicating a culture, the other a phenotype and genetic heritage. Considering only the Genetic Code would be denying a part of our humanity, which comes from being a child, a teenager, experiencing life, interacting socially, being part of a country and a culture, making one’s choices.

Twin studies would tell a lot about the links between those two codes (Name, Genetic) – if only there were more twins. Even though identical twins possess the same genetic makeup, they may go through different experiences throughout their lives that shape their personality, behaviour, and psychopathology in ways that make them unique relative to each other (Hughes et al., 2005). Twins will have different first names. Twins might also have different last names if, hypothetically, one twin was raised in Russia and the other twin was adopted and raised in the United States. In that case, what would the Name Code and the Genetic Code tell us about potential health issues (smoking or alcohol addiction, obesity and diabetes, life expectancy, etc.)?

An article published last month caught my eye, ‘Scientists seek volunteers willing to have genetic code published on internet‘: the hunt is on for 100,000 British volunteers to post their genetic information online in the name of science, as a North American open-access DNA project arrives in Europe. Personal Genome Project UK’s mission is ‘to make a wide spectrum of data about humans accessible to increase biological literacy and improve human health‘. The organization recognizes that ‘Even if a person’s name, home address or facial photograph is specifically excluded, a dataset like the one we are building is far from anonymous. It is simply too easy for someone to connect the dots and reveal a person’s identity.’ Genetic Code is very personal data. Would you like to see yours published along with your Name Code and Identity? Yet if the identity of participants can be protected, I can see huge scientific value in such Open Data.

The Name Code, as such, is not personal data. Personal data is any information about yourself that you should be allowed to keep confidential. A name is given to you as a communication tool, to interact with the World. There is a social intent in giving a child a common name, or a rare name that will more immediately identify a person – though I believe that one should be allowed to change names, just as Casanova did (who named himself Chevalier de Seingalt). There are legitimate reasons to keep one’s name and identity secret sometimes: you should be free to do so, unless that freedom infringes on someone else’s rights. A personal name (except possibly when it becomes a trademark) doesn’t belong to anyone: it’s been used before, it’ll be used again, it’s often shared by several people, it’s found in the press, it’s made up for fiction books … Could a democracy work without citizens knowing their politicians’ names? How could historians do their research if we were to erase all personal names from the archives?

We see potential social benefits in applied onomastics and name data mining that clearly exceed the risks of misuse: not just in social sciences research, but also in economic development, tourism, marketing, health, urban planning … We’ve helped one EU country reach out to its Diaspora in the US to originate foreign direct investments (FDI) and create jobs. We’re currently helping a BioTech scientific cluster raise its game through a better understanding of where the talent lies in that field, and where the brain juice flows internationally. We’re trying to find local partners to launch AgroDiaspora, an economic development initiative in Africa to foster stronger links between Sustainable Agriculture Transformation Projects and top-level BioTech scientists of African heritage, who could help make local plants climate-change resistant, among other benefits. We are also very excited and enthusiastic about a paper we submitted to ICOS 2014, the XXV International Congress of Onomastic Sciences, which will take place in Glasgow in August – as we foresee a very positive outcome from that research.

In last month’s onomastics.co.uk feature ‘The Impact of Diasporas on the Making of Britain‘, Eleanor Rye mentions very interesting research into what surname-based sampling can reveal about historic male migrations in the UK and Ireland.

We are currently conducting similar applied research on Twitter. I love Twitter. The freedom to choose one’s handle and name. The limited amount of structured information that goes with an account: a location, a language, a short profile, a few pictures. What’s in a Twitter name or handle? Anything: real names, company names, fancy names, pictograms, … The amount of information produced through Twitter is enormous, but it’s possible to filter this ‘bigdata’ in a way that makes sense of it. We created geographic maps of e-Diasporas, by recognizing the Twitter names of geotagged tweets: Irish, Swedish, Russian, etc. We call this Twitter GEOnomastics, borrowing a term from Dr. Evgeny Shokhenmayer. Below is the map of the Irish e-Diaspora, along with Swedish and Russian.

Irish Twitter GEOnomastics


Click here to access the interactive map:
http://cdb.io/1beWaVB

How does it work? The software accurately recognizes that ‘NamSor Applied Onomastics’ (@NomTri) is probably a trade mark or a company name, whereas ‘Elian Carsenat’ (@ElianCarsenat) is probably a personal name – and most likely a French name. Fancy names are also recognized and filtered out.

We see wide applications for such maps. When Captain James Cook explored the seas in the 18th century, having accurate maps could mean life or death for a ship and its crew. The method for working out latitude had been known for centuries, but measuring longitude was still tricky and inaccurate. In today’s digital world, I see latitude as ‘recognizing the semantics’ of a message expressed in a particular language and longitude as ‘recognizing the culture’ of the target audience. We’re curious to find out how and to whom this map can be useful, possibly Twitter itself. We’re going from Paris to Dublin in two weeks to find out: we hope to meet people at Twitter’s European Headquarters. Twitter has just completed its IPO, but it is still not clear how it will make money. We’ll also meet Irish urban planners, people working in the tourism industry, investment analysts and Diaspora experts.

Read our next posts to discover more Twitter GEOnomastics maps showing Irish, French, German, Spanish, Russian, Turkish, Swedish, Italian, Dutch e-Diasporas (or cultural influence).

NB. The maps are currently interactive, so you can zoom in and out of a particular territory; however, this feature may be shut down in a month or two.

[onomastics.co.uk | get a pdf version | academia.edu] Related : Can name data mining help economic development?


Filed under General