Tag Archives: onomastics mille-feuille

NamSor presented during Symposium on Academic Excellence

Our friend Tania Vichnevskaia of the French National Institute for Health (INSERM) presented the paper ‘Applying onomastics to scientometrics’ yesterday at the IREG international symposium organised by the University of Maribor and Shanghai Jiao Tong University.

As a private start-up company, NamSor was approached in 2014 by a European country to help measure the ‘brain drain’ affecting its competitiveness in the BioTech sector and to produce a global map of its scientific Diaspora (who they are, where they are and what they are doing). The objective was to build up the country’s international scientific cooperation and to engage its Diaspora.

Serendipity led analysts to discover interesting patterns in the way scientists’ names affect co-authorship and citation, not just for this particular country, but globally.

Last year, during the ICOS2014 conference at the University of Glasgow, we presented how data mining millions of scientific articles in the PubMed/PMC life sciences database uncovered striking patterns in the way scientists’ names correlate with whom they publish and whom they cite in their papers.

We were interested in mining the large commercial bibliographic databases (Thomson WoS, Scopus) because, compared to PubMed, they offer better data quality on citations and useful additional information:

– firstly, they have the full name in addition to the short name cited with just initials; this significantly reduces the error rate of onomastic classification

– secondly, they link scientists to research institutions (affiliations) and geographies (country of affiliation); this allows additional analysis on the topic of Diasporas and brain drain, comparing, for example, the research output of Chinese and Chinese American scientists in the US with that of scientists in Mainland China;

– thirdly, those databases have a larger coverage in terms of scientific disciplines, allowing comparison between different fields of research.

So a collaboration started between NamSor and bibliometric experts at INSERM (the French National Institute for Health) to evaluate and visualize the effects of migration, Diaspora engagement and possibly cultural biases in Science.

This is Tania’s presentation at the conference:

What does the ‘onomastic millefeuille‘ of the global Cancer Research community look like?

201501_ThomsonWoS_CancerResearch

On this same topic:

The agenda of the Symposium is presented below

2nd Maribor Academicus Event

Academic Excellence: BETWEEN HOLY GRAIL AND MEASURABLE OBJECTIVES

International symposium organised by the University of Maribor and Shanghai Jiao Tong University

within the IREG Project on Academic Excellence

19-20 January 2015, Maribor, Slovenia

Higher education can importantly benefit from the rankings and league tables when used in a context with clear perspective of what ranking actually reflects (Prof. Jan Sadlak, President of IREG)

Active participants at the conference will be:

  • Prof. Jan Sadlak, President of IREG,
  • Prof. Gero Federkeil, CHE (Coordinator of Multi-Ranking),
  • Prof. Nian Cai Liu, Jiao Tong University in Shanghai (Author of the Shanghai ranking list),
  • Prof. Seeram Ramakrishna, National University of Singapore,
  • Prof. Santo Fortunato, Aalto University,
  • Prof. Karin Stana Kleinschek, University of Maribor,
  • Prof. Henryk Ratajczak, member of Czech Academy of Sciences,
  • Prof. Edvard Kobal, Slovenian Science Foundation,
  • Roberta Sinatra, PhD, Northeastern University,
  • Tania Vichnevskaia, French National Institute for Health (INSERM),
  • Prof. Andrée Sursock, Senior Adviser at EUA,
  • Prof. Øivind Andersen, University of Oslo.

About NamSor

NamSor™ Applied Onomastics is a European vendor of Name Recognition Software (NamSor sorts names). NamSor’s mission is to help understand international flows of money, ideas and people.

NamSor launched FDIMagnet, a consulting offering to help Investment Promotion Agencies and High-Tech Clusters leverage a Diaspora to connect with business and scientific communities abroad.

What’s in a name in 1914, in 2014?

(an onomastics.co.uk reblog)

This month, starting 25th of August, the University of Glasgow will host the 25th International Congress of Onomastic Sciences, the premier conference in the field of name studies.

For this occasion, we have started to calibrate NamSor software to recognize Scottish names. This is work in progress, but I’d like to share some preliminary data visualizations of regional names.

2014 marks 100 years since the start of the First World War. All across Europe and beyond, families lost dear ones, children were raised without knowing their father and grandchildren were born in the aftermath of this trauma, only to live through another global war, WWII. Let’s respect the people who died in both wars, and let’s also listen to the message their names convey to us about who they were, about who we are.

What do personal names tell us about the world in 1914?

In 1914, Europe was composed of Nations and Nations of Regions with deeply rooted people. This was the situation before the massive rural exodus and before the international migration flows caused by either decolonization or what we call today ‘globalization’. This first global war was fought by local people who lived close by among themselves, married in their local community, often spoke their own local language…

Scottish names

We’ve analysed the Commonwealth War Graves Commission (CWGC) database to see if we could correlate onomastics and regiments. The result is presented below:

20140801_Scottish_WWI_Onomastic_Millefeuille_v002

We’ve found a majority of Scottish names in regiments such as the Gordon Highlanders, the Mercantile Marine Reserve, the Royal Scots, the Cameron Highlanders, the Seaforth Highlanders, the King’s Own Scottish Borderers and also the Royal Flying Corps.

The onomastic mille-feuille is dense but hard to understand. You can think of it as a sorted list of pie charts, like this one:

20140801_Scottish_WWI_Onomastic_PieChart_MercantileMarineReserve_v001

This pie chart tells us that the Mercantile Marine Reserve was composed mostly of Scottish and Welsh soldiers.
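As a rough illustration of the "sorted list of pie charts" idea, here is a minimal Python sketch. It assumes each record has already been given an onomastic class; the regiment labels, the classes and the counts are made up for the example and are not taken from the CWGC data, and the function names are ours.

```python
from collections import Counter, defaultdict

# Hypothetical records: (regiment, onomastic class inferred from the name).
# Illustrative values only, not real CWGC figures.
records = [
    ("Regiment A", "Scottish"), ("Regiment A", "Scottish"), ("Regiment A", "English"),
    ("Regiment B", "Scottish"), ("Regiment B", "English"),
    ("Regiment B", "English"), ("Regiment B", "Irish"),
    ("Regiment C", "Welsh"), ("Regiment C", "Welsh"), ("Regiment C", "Welsh"),
    ("Regiment C", "English"), ("Regiment C", "Scottish"),
]

def class_shares(rows):
    """Return {regiment: {class: share}}, i.e. one 'pie chart' per regiment."""
    counts = defaultdict(Counter)
    for regiment, name_class in rows:
        counts[regiment][name_class] += 1
    return {
        regiment: {c: n / sum(ctr.values()) for c, n in ctr.items()}
        for regiment, ctr in counts.items()
    }

def millefeuille(rows, sort_class):
    """Sort the per-regiment pies by the share of one class, largest first."""
    shares = class_shares(rows)
    return sorted(
        shares.items(),
        key=lambda item: item[1].get(sort_class, 0.0),
        reverse=True,
    )

layers = millefeuille(records, "Scottish")
for regiment, pie in layers:
    print(regiment, {c: round(s, 2) for c, s in sorted(pie.items())})
```

Each printed row is one layer of the mille-feuille; stacking the rows as horizontal bars would give a chart of the kind shown above.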

By looking at the soldiers’ ranks for that particular regiment, we can produce a new onomastic mille-feuille: names DO matter when it comes to rank in 1914.

20140801_Scottish_WWI_Onomastic_MilleFeuille_MercantileMarineReserve_v001

In more easily understandable pie chart language, this means that the Firemen were mostly Scottish and Welsh, whereas the Carpenters were English.

20140801_Scottish_WWI_Onomastic_Ranks_PieChart_MercantileMarineReserve_v001

Indian names

The First World War started as a European war, but populations from Africa and Asia were immediately mobilized by the colonial powers of the time (the British Colonial Empire, France, …). Many soldiers came from far away to meet their death in the trenches of eastern France.

The Indian names in CWGC are indicated without any given name, but with the son’s and father’s name, for example:

sonName              fatherName            place             regiment
PURANBAHADUR GHARTI  KAMANSING GHARTI      NEPAL             9th Gurkha Rifles
PUNE THAPA           NAIN SING THAPA       GULMI NEPAL       4th Gurkha Rifles
RADHA KISHN          GANGA RAM             RAJPUTANA         Bharatpur Infantry
SITARAM SAWANT       NILU SAWANT           BOMBAY            117th Mahrattas
NAMDAR KHAN          HAYAT KHAN            N W F PROVINCE    21st Punjabis
SHAHAB UDDIN         KARAM ILAHI           PUNJAB            53rd Sikhs (Frontier Force)
RAM RAKHA            CHHOTE                PUNJAB            Sirmur Imperial Service Sapper Corps
AMAR SINGH           GURDITT SINGH         PUNJAB            15th Ludhiana Sikhs
LALITBIKRAM THAPA    RAMBIKRAM THAPA       NEPAL             5th Gurkha Rifles (Frontier Force)
PANCHAM              DHUNDA                UNITED PROVINCES  Army Bearer Corps
CHINNASWAMI          DURUGAYA              MYSORE            2nd Queen Victoria’s Own Sappers and Miners
LAKKHI               JAHANGIR              UNITED PROVINCES  Indian Royal Artillery
SHIU DAS DUBE        RAM SEWAK DUBE        UNITED PROVINCES  3rd Brahmans
BATAN SINGH          BELA SINGH            PUNJAB            57th Wilde’s Rifles (Frontier Force)
KALU GHALE           KAMI GHALE            NEPAL             8th Gurkha Rifles
ISMAIL HAIDAR        MANUBUDDIN SIKDAR     BENGAL            Indian Railway Department
FATTEH KHAN          DIL DOST KHAN         PUNJAB            82nd Punjabis
SUJAWAL KHAN         BAHADUR KHAN          PUNJAB            38th King George’s Own Central India Horse
MAHABIR MURAI        LACHHMAN MURRAI       UNITED PROVINCES  3rd Sappers and Miners
SURENDRANATH RIAWA   CHANDI CHARAN BISWAS  BENGAL            Indian Labour Corps

So we have used a different algorithm to automatically cluster Indian names into onomastic classes. Some onomastic classes might be related to geography, to Indian castes, to social status or to religious beliefs…

We can again use an onomastic mille-feuille to visualize the correlation between names and geography, but here a classic geographical map would probably tell a better story.

20140801_Indian_WWI_Onomastic_Millefeuille_v001

Distinctive patterns are recognized in names from Bombay, Madras, Delhi or Peshawar, allowing the software to cluster them into distinct onomastic classes.

Again, we can then look at regiments to visualize how ethnically and linguistically diverse they were:

20140801_Indian_Regiments_WWI_Onomastic_Millefeuille_v001

Italian names

All regions of Italy paid a heavy price in the Great War:

2014_Italian_WWI_Casualties

Italian regional names are particularly well differentiated, as can be seen in the following onomastic millefeuille:

2014_Italian_WWI_Onomastics

We display here some examples of typical names from different regions. Can you see how different they are?

  • IT/Abruzzi e Molise: MEZZACAPPA GIUSEPPE DI ANTONIO, PAOLILLI-TREONZE PASQUALE DI DOMENICO, BONITATIBUS ERMANNO DI ANGELO, FIDELIBUS ANGELANTONIO DI EUGENIO, PAOLILLI-TREONZE DONATO DI GAETANO, VASQUENZ AUGUSTO ANGELO DI ANTONIO, AMMAZZALORSO ANTONIO DI ANGELO.
  • IT/Basilicata: LATERZA GIOVANNI DI GIUSEPPE, SCAMORCIA GIUSEPPE DI GAETANO, ALAGIA NICOLA DI GIUSEPPE, CLAPS VITO CANIO DI GAETANO, CLOROFORMIO VITO DOMENICO DI TADDEO, SCANDIFFIO DOMENICO DI INNOCENZO, CLAPS ANGELO VITO DI VITANTONIO, CASAMASSIMA FRANCESCO PAOLO DI GIOVANNI, PENNIMPEDE GIUSEPPE DI PIETRO.
  • IT/Calabria: PROCOPIO FRANCESCO DI NICOLA, CANDREVA FRANCESCO DI GIUSEPPE, SCICCHITANO FRANCESCO DI GIUSEPPE, SPACCAROTELLA GIOVANNI DI ANGELO, CICCIU CONSOLATO DI ANTONIO, LULJ GIUSEPPE DI VINCENZO, TRUNCELLITO DOMENICO PASQUALE DI GIUSEPPE, DAVOLOS DOMENICO DI PASQUALE, CHIDICHIMO GIOVANNI DI SALVATORE.
  • IT/Campania: ANNUNZIATA GIOVANNI DI ANTONIO, PISCOPO GIOVANNI DI ANTONIO, PISCOPO GIUSEPPE DI ANTONIO, SARRAPOCHIELLO LORENZO DI NICOLA, GENETIEMPRO GIUSEPPE DI MATTEO, VALIANTAE ANIELLO DI CARMINE, DONNIACUO ALFONSO DI GIUSEPPE.
  • IT/Emilia-Romagna: SCHIAVAZAPPA BONFIGLIO DI CRISTOFORO, SAVRIE ADELCHI DI GIUSEPPE, VACONDIO BONFIGLIO DI PIETRO, GUAGLIUMI GEMINIANO DI CESARE, ASTROLOGI GIOVANNI DI FERDINANDO, SAVRIE GIUSEPPE DI PRIMO, GUAGLIUMI GIOVANNI DI LEANDRO, MANSERVIGI GIOVANNI DI SALINGUERRA.
  • IT/Lazio: ASTROLOGO ANGELO DI PACIFICO, FAPERDUE SALVATORE DI VALENTINO, CENTOSCUDI NAZZARENO DI SANTE, CARLODALATRI UMBERTO DI FRANCESCO, CAPPADOCIA GIUSEPPE DI GIOVANNI, SCHIETROMA GIUSEPPE DI PASQUALE, PALAMIDES GIOVANNI DI GIUSEPPE, GIANFERMI GIOVANNI BATTISTA DI DOMENICO, CAPPADOCIA AMEDEO DI GIUSEPPE, PIETROBONO GUGLELMO DI BENIAMINO.
  • IT/Liguria: GAGGERO GIOVANNI BATTISTA DI GIUSEPPE, KONIG GIOVANNI BATTISTA DI GIOVANNI BATTISTA FILIPPO, MONTEGHIRFO GIOVANNI DI LUIGI, MAGIONCALDA GIOVANNI BATTISTA DI GIOVANNI, BACIGALUPO GIOVANNI BATTISTA DI DOMENICO, REDEGOSO GIOVANNI BATTISTA DI BARTOLOMEO, KONIG GUGLIELMO DI PIETRO, ARBOCO GIOVANNI BATTISTA DI EMANUELE VINCENZO.
  • IT/Lombardia: SANTAMBROGIO GIUSEPPE DI FRANCESCO, RUEFF GIOVANNI DI GIOVANNI, RECALCATI GIUSEPPE DI AMBROGIO, TAGLIABUE GIUSEPPE DI ANGELO, RANZENIGO FRANCESCO DI GIOVANNI, PIANTANIDA ANTONIO DI FELICE, SALMOIRAGHI GIUSEPPE DI ATTILIO, CONSONNI GIUSEPPE DI DOMENICO.
  • IT/Marche: CUCCU GIUSEPPE DI FRANCESCO, FIORDOLIVA GIUSEPPE DI PACIFICO, CINGOLANI NAZZARENO DI PIETRO, ANGELOME MARONE DI GIUSEPPE, VOLTATTORNI NAZZARENO DI FRANCESCO, CARSTANJEN GUSTAVO DI PAOLO, MENGHI-CERRA NAZZARENO DI DAVID, VOLTATTORNI CIRIACO DI LUIGI, CARSTANJEN EDOARDO DI PAOLO, BRUZZECHESSE DOMENICO DI FRANCESCO.
  • IT/Piemonte: DESTEFANIS GIOVANNI DI GIUSEPPE, RIVOIRA GIOVANNI DI PIETRO, CUTTICA GIUSEPPE DI CARLO, BELLINO-ROCI GIUSEPPE DI NICOLAO, NEPOTE GIOVANNI DI DOMENICO, AIMAR BARTOLOMEO DI BARTOLOMEO, LANTELME GIORGIO DI FRANCESCO, GUELPA GIOVANNI DI GIOVANNI, VALSANIA GIOVANNI DI ANTONIO, ARNEODO GIUSEPPE DI GIOVANNI.
  • IT/Puglia: SPAGNULO COSIMO DAMIANO DI FRANCESCO, VANTAGGIATO GIUSEPPE DI VINCENZO, SEMERARO GIOVANNI DI GIUSEPPE, EPICOCO DOMENICO DI GIOVANNI, AGHILAR RUGGIERO DI LUIGI, CANNABONA CROCIFISSO DI PASQUALE, BAGLIVO CROCIFISSO DI ORONZO, SPEDICATO CROCEFISSO DI SALVATORE, GIANCANE CROCIFISSO DI RAFFAELE.
  • IT/Sardegna: MARONGIU SALVATORE DI ANTONIO, PORCU GIOVANNI DI FRANCESCO, MARONGIU FRANCESCO DI SALVATORE, PUTZOLU GIOVANNI DI GIUSEPPE, DESOGUS GIOVANNI DI ANTONIO, MURTAS GIOVANNI DI GIUSEPPE, LAMPIS ANTIOCO DI FRANCESCO.
  • IT/Sicilia: RAPISARDA SALVATORE DI GIUSEPPE, GIONFRIDDO PAOLO DI SALVATORE, MACALUSO GIUSEPPE DI GIUSEPPE, SPAMPINATO ANTONINO DI GIUSEPPE, PRIVITERA ANTONINO DI GIUSEPPE, SCACCIANOCE SALVATORE DI ROSARIO, RAPISARDA SALVATORE DI CARMELO, CANGIALOSI ANTONINO DI MICHEL.
  • IT/Toscana: SCHIUMARINI IACOPO DI ANTONIO, DIOLAIUTI FERRUCCIO DI GIULIO, MAZZEI EFREM DI GIUSEPPE, DELL’EUGENIO ANGIOLO DI ANTONIO, DELL’ARINGA GABBRIELLO DI DANIELE, PISTOI ASTAROTTE DI OLIMPIO, BIENTINESI MILZIADE DI GIOVANNI, ANZEMPAMBER FILIPPO DI ADOLFO, BEMPORAD DUILIO DI POLICARPO, DELL’OMODARME RANIERI DI DEMETRIO.
  • IT/Trentino-Alto Adige: DALPIAZ GIUSEPPE, ANDERLE GIOVANNI, DEVIGILI GIUSEPPE, PONTALTI GIUSEPPE, CASAGRANDA GIUSEPPE, FLAIM GIOVANNI, PALLAORO GIUSEPPE, STEDILE GIUSEPPE, DETASSIS GIUSEPPE, DELVAI GIUSEPPE.
  • IT/Umbria: DESANTIS GIUSEPPE DI DOMENICO, MAGARINI-MONTENERO DOMENICO DI BONAVENTURA, QUONDAM GIOVANNI DI NAZZARENO, GAMBELUNGHE SALVATORE DI CESARE, CENTOGAMBE DOMENICO DI FELICE, QUONDAM CASTORINO DI GIUSEPPE, BESTIACCIA GIOVENALE DI GIUSEPPE, BELLACHIOMA ASTORRE DI ALBERTO, SFORNA CRISPOLTO DI NAZZARENO, CENTOGAMBE GIUSEPPE DI PIETRO.
  • IT/Veneto: DELL’OSBEL GIOVANNI DI ANTONIO, MESTRINER GIOVANNI DI GIUSEPPE, RODIGHIERO GIOVANNI DI ANTONIO, BOF GIOVANNI DI LUIGI, DALL’OSTO GIUSEPPE DI PIETRO, SKREZENEK GIUSEPPE DI CARLO, FILOSOFO GIOBATTA DI PAOLO, MENEGUZ GIOBATTA DI ANTONIO, MESCALCHIN GIOBATTA DI ANDREA, CIPOLAT-GOTET GIOVANNI DI GRAZIADIO.

French names

The equivalent of the CWGC database in France is the Mémoire des Hommes database. We’ve used it to calibrate NamSor’s recognition of French regional names. After calibration, about 70% of names can be allocated to a particular region and we can produce the following onomastic mille-feuille, sorted according to the relative number of Bretons (people from Brittany):

20140801_France_WWI_Millefeuille_v001

We can also view the total number of casualties, broken down according to the onomastic class. It shows the large number of people originally from Brittany who died during WWI, regardless of their birthplace. However, this remains debatable, as ~30% of names could not be specifically allocated to a region of origin (only recognized as French).
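To see why the unallocated names matter, one can bracket a region's true share between two extremes: none of the unallocated names belong to it, or all of them do. The sketch below does this in Python. The counts are hypothetical (chosen only to match the 70% / 30% split mentioned above) and the function name is ours.

```python
def share_bounds(class_count, allocated_total, unallocated_total):
    """Lower and upper bound on a class's share of all names.

    Lower bound: none of the unallocated names belong to the class.
    Upper bound: all of the unallocated names belong to the class.
    """
    total = allocated_total + unallocated_total
    lower = class_count / total
    upper = (class_count + unallocated_total) / total
    return lower, upper

# Hypothetical counts: 1,000 names, of which 700 are allocated to a region (70%)
# and 300 are recognized only as French (30%); 140 allocated names are Breton.
low, high = share_bounds(class_count=140, allocated_total=700, unallocated_total=300)
print(f"Breton share lies between {low:.0%} and {high:.0%}")
```

With these made-up numbers the share could be anywhere between 14% and 44%, which is why the regional breakdown should be read with caution.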

20140801_France_WWI_RegionalBreakdown_v001

Baptiste COULMONT, a sociologist, published a very interesting study on given names, analysing the results of students at the French Baccalaureate in 2014. We’ve used a similar dataset to compare regional names in 1914 and in 2014. Unfortunately, we didn’t have enough time to align the geographic mappings, but the result is visual and self-explanatory. We can see how rural exodus and internal migration have eroded the regional identity in personal names. Still, even in 2014 the correlation between onomastics and geography remains strong, especially in Brittany, in the North of France, in Alsace, in Lorraine, in Loire, in Lyon, in Aquitaine and in Corsica.

20140801_France_Millefeuille_1914_2014_v001

What do names tell us about the world in 2014?

A lot! Some say: too much!

Enough to make ICOS2014 a very exciting and current event. We look forward to being in Glasgow on 24th August and meeting you there. Long live onomastics.co.uk

Feel free to contact us at contact@namsor.com

About

NamSor™ Applied Onomastics is a European designer of name recognition software. Our mission is to help make sense of Big Data and understand international flows of money, ideas and people.
http://namsor.com/

NamSor is committed to promoting diversity and equal opportunity and has launched GendRE API, a free API to conduct analysis of gender equality using open data.

Assessing the Gender Gap in the Film Industry

IMDb, the Internet Movie Database, is a massive resource offering a unique picture of the global film industry. The IMDb list of Actors and Actresses is one of the many databases that we’ve used to evaluate how accurate our GendRE API is in predicting gender from personal names.

IMDb also provides a list of world film directors. The file does not include a title or a gender. We’ve set a task for ourselves: to independently measure the gender gap among movie directors globally.

A previous study found, from assessing the gender of 11,197 directors, writers, producers, cinematographers, and editors, that only one quarter of all narrative content creators were female. Can we confirm the findings of this study by enriching the records of 337,548 IMDb film directors, in all countries of the world, with gender information?

IMDb Movie Directors – the Gender Gap

Our findings confirm the previous study. We were able to infer gender from 98% of the names in minutes. That’s the power of OpenData combined with APIs: they can do a lot of work for you!

2014_IMDb_MovieDirectors_GenderGap

2014_04_IMDB_table_vFLarge
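For readers who want to try a similar measurement, here is a minimal Python sketch of the two numbers involved: the coverage (share of names for which a gender could be inferred) and the female share among the names that were classified. It does not call the GendRE API; the small lookup table stands in for a name-gender service, and the table, the sample names and the function names are all made up for illustration.

```python
from collections import Counter

# Stand-in lookup table. A real run would query a name-gender service instead;
# this dictionary is purely illustrative.
GENDER_BY_GIVEN_NAME = {
    "anna": "female",
    "marie": "female",
    "john": "male",
    "pierre": "male",
    "akira": "male",
}

def infer_gender(full_name):
    """Return 'female', 'male' or 'unknown' based on the first token of a name."""
    tokens = full_name.split()
    given = tokens[0].lower() if tokens else ""
    return GENDER_BY_GIVEN_NAME.get(given, "unknown")

def gender_gap(names):
    """Return (coverage, female share among names with a known gender)."""
    counts = Counter(infer_gender(name) for name in names)
    known = counts["female"] + counts["male"]
    coverage = known / len(names) if names else 0.0
    female_share = counts["female"] / known if known else 0.0
    return coverage, female_share

# Made-up director names, not taken from IMDb.
directors = ["Anna Example", "John Sample", "Pierre Placeholder", "Akira Demo", "Xq Nomatch"]
coverage, female_share = gender_gap(directors)
print(f"coverage: {coverage:.0%}, female share: {female_share:.0%}")
```

Reporting coverage next to the female share makes it clear how many names were left out of the estimate.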

A previous study: ‘Exploring the Barriers and Opportunities for Independent Women Filmmakers’

We reproduce the main quantitative findings below:

Executive Summary

Sundance Institute and Women In Film Los Angeles
Women Filmmakers Initiative

Exploring the Barriers and Opportunities for Independent Women Filmmakers
Stacy L. Smith, Ph.D., Katherine Pieper, Ph.D. & Marc Choueiti

Annenberg School for Communication & Journalism

University of Southern California

The purpose of this research is to examine how females are faring in American independent film. Studies have been conducted in the past on women in the mainstream U.S. film industry, but little research has yet been done in the U.S. independent film arena. To this end, we developed a research strategy with a two-prong approach.
First, we quantitatively document the involvement of female content creators of U.S. films at the Sundance Film Festival, assessing the gender of 11,197 directors, writers, producers, cinematographers, and editors across 820 films classified as U.S. narratives (534 films) or documentaries (286 films) between 2002 and 2012.
The second prong documents the qualitative experiences of female filmmakers through interviews with emerging and seasoned content creators as well as key industry gate-keepers. Here, we surveyed 51 individuals to unpack the specific obstacles that face female directors and producers in the independent film arena. We also assessed participants’ perceptions of opportunities that may increase women’s involvement behind the camera. Below is a summary of key findings.

Quantitative Findings: American Films at the Sundance Film Festival 2002-2012
• At the Sundance Film Festival from 2002-2012, one quarter (25.3%, n=1,911) of all narrative content creators (i.e., directors, writers, producers, cinematographers, editors) were female and 39.1% (n=1,422) of all documentary content creators were female. This translates into a behind-the-camera gender ratio of 2.96 males to every 1 female in narratives and 1.56 males to every 1 female in documentaries.
• Females were half as likely to be directors of U.S. narratives (16.9%) than U.S. documentaries (34.5%). Similar disparity by storytelling platform (narrative vs. documentary) was found among female writers (20.6% vs. 32.8%), female producers (29.4% vs. 45.9%), female cinematographers (9.5% vs. 19.9%), and female editors (22% vs. 35.8%).
[…]

The full report can be downloaded here.
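The male-to-female ratios quoted in the summary follow directly from the percentages of female content creators. A small Python check (the function name is ours, the two percentages are the ones quoted above):

```python
def males_per_female(female_percent):
    """Convert a female share in percent into a males-per-female ratio."""
    return (100.0 - female_percent) / female_percent

for label, pct in [("narratives", 25.3), ("documentaries", 39.1)]:
    print(f"{label}: {males_per_female(pct):.2f} males per female")
```

This gives about 2.95 for narratives and 1.56 for documentaries. The report states 2.96 for narratives; the small difference presumably comes from the percentage having been rounded to one decimal.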

Considering cultural differences

Can we dig deeper? As in any global industry, cultural differences are key to explaining any phenomenon. But in the case of a cultural industry, the gender gap is not just a phenomenon, it’s also a causal factor: films reproduce cultural clichés from one generation to the next, or conversely introduce cultural revolutions in a particular country. The role of foreign film directors in this process is essential.

At NamSor, we invented a simple yet efficient tool to represent cultural differences and diversity, the onoma(s)tic millefeuille:

2014_04_IMDB_vFLarge

Combining gender and cultural information, we found, for example, that women represent 26% of film directors having a French name, whereas for Algeria that figure drops to 19% and for Japan to 14%.

Meet us on 29 April 2014 at DataTuesday Paris with Girls in Tech Paris, on the topic ‘Women & Data’.
