James Lee

M. Dubremetz

Computational Linguist, PhD

Contact Me

About Me

ATTENTION! My IDs were stolen on 2023-02-06: my bag containing my Swedish and French identity cards disappeared in the streets of Lille. See message here. I am an engineer in Digital Humanities based in Uppsala and Lille. My original background is in NLP. This is a professional website. I have neither WhatsApp nor Facebook; the best way to contact me is by email. Wondering what I am up to? Have a look at my now page.

Projects List



Back in my native Lille, I noticed that there was no local club dedicated to women who code. So, with the help of Laetitia, Jacqueline, Thanh Lan, and other fellow women developers, we organize 100% female tech meetups. Each month we vote on a topic and give a presentation. We communicate through a mailing list: chtitedev@lists.friposts.org


Our mailing list
List of our past events


Your quotes online

Have you ever wished to share a book you like with a friend? If only you could remember that great quote! This project exports quotes and highlights directly from your ebook reader to your personal website via a GitLab page.


View on gitlab
Read example of quotes on my personal website


Cyrillic to Latin

What do you do when you get thousands of txt files in Cyrillic and need to transliterate them into the Latin alphabet? I wrote a program that converts all the files in your folders and subfolders. Everything is explained for beginners opening their terminal for the first time.


View on gitlab
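
As a rough illustration of the idea (not the project's actual code), a recursive transliteration pass could be sketched in Python as follows. The mapping table is a simplified, assumed Russian-only scheme rather than an official standard, and the names `transliterate`, `transliterate_tree` and the `.latin.txt` suffix are hypothetical choices made for this sketch.

```python
from pathlib import Path

# Assumed, simplified Russian-only mapping (not an official standard).
CYRILLIC_TO_LATIN = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "y", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "kh", "ц": "ts", "ч": "ch", "ш": "sh", "щ": "shch",
    "ъ": "", "ы": "y", "ь": "", "э": "e", "ю": "yu", "я": "ya",
}


def transliterate(text: str) -> str:
    """Replace each Cyrillic letter; leave every other character untouched."""
    out = []
    for ch in text:
        latin = CYRILLIC_TO_LATIN.get(ch.lower())
        if latin is None:
            out.append(ch)                  # not in the table: keep as-is
        elif ch.isupper():
            out.append(latin.capitalize())  # preserve initial capitals
        else:
            out.append(latin)
    return "".join(out)


def transliterate_tree(root: str) -> int:
    """Write a .latin.txt copy next to every .txt file under root."""
    count = 0
    # Materialise the file list first so newly written copies are not revisited.
    for path in sorted(Path(root).rglob("*.txt")):
        if path.name.endswith(".latin.txt"):
            continue                        # skip outputs of a previous run
        text = path.read_text(encoding="utf-8")
        target = path.with_name(path.stem + ".latin.txt")
        target.write_text(transliterate(text), encoding="utf-8")
        count += 1
    return count
```

Writing copies instead of overwriting the originals is a deliberate choice in this sketch: a wrong mapping can then be fixed and the pass rerun without losing data.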


Epub study in Swedish

I wrote an ebook analyser. It preprocesses *.epub files: conversion to .txt, removal of editorial metadata, lemmatization, stop-word removal... and all of that for Swedish texts. This project is a professional partnership with a researcher in Swedish literature analysis.


View on gitlab


Every color in...

I do color analysis of texts. The first one covers the colors of the New Testament.


View image


Karaoke "In the shell script"

I wrote a parody of "In the Navy" called "In the shell script". The song is sung by an artificial voice. It is meant to teach shell commands in a class I give for the Center for Digital Humanities.


Contribute on gitlab
View lyrics and movie


Map libraries

In this professional project, carried out in collaboration with the Department of Scandinavian Languages, I made a network representation of manuscripts in Sweden. This project required extensive graph generation and web scraping expertise.


View on gitlab
Example network


XML feeds

RSS feeds used to be on every website before Facebook and other closed platforms decided to ban this universal protocol. As a former trainee in information intelligence, I launched my own instance of feed-me-up-scotty to keep watch over the strategic sources of information in my life.


View on gitlab
List of "rssified" websites


Uppsala women coding

I was a founder of the first meetup for women who code in Uppsala. To stay informed about our events, follow the links below!


Join the Uppsalatech slack
View on gitlab
Join the meetup!


Print @ Uppsala University

I noticed that researchers at my workplace ended up with black margins and other printing surprises due to poor printer settings, causing a lot of stress and waste. These scripts help you keep control over your print jobs at UU. From your terminal (Linux, Mac) you can print in color, grayscale, double-sided...


View on gitlab


Mycroft voice assistant

Mycroft is an open source platform for voice home assistance. It is like Alexa / Cortana, except that it is made by the open source community and respects privacy. I created my own Mycroft skill to turn a projector on and off via SSH and a Raspberry Pi. While installing it and developing the skill, I also helped fix bugs and correct the public documentation.


View on github



This personal project uses computational linguistics, web scraping, and image recognition to present the lunch menus of Uppsala city each day. 100% open source. Want to know what's for lunch today but too lazy to search all the restaurants? Go to https://lunch.uppsala.ai .


View on gitlab
Link to website



Vtml is a language that helps tune synthetic voices for a more natural reading. Writing a vtml file can be repetitive, so I created vtml-tag shortcuts for the Sublime Text 3 editor.


View on github


ACL anthology

In this professional project for the Association for Computational Linguistics, I helped archive old conference papers. This project represents: 200+ inconsistent webpages from the 2000s converted into CSV files; 3000+ PDFs inserted into the new scientific database; 10 000+ names of scientists, article titles and PDF links scraped and normalised.


View on github


Mustache website

This website has been completely rewritten to use Mustache templating. This language lets you separate HTML from content, and thus edit your website through a simple YAML file.


View on gitlab


Chiasmus detector

I designed a literature analysis tool that detects the figure of speech called chiasmus.


View on github
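
To make the figure concrete: a chiasmus of words repeats two words in reverse order (A ... B ... B ... A). The Python sketch below is only a naive candidate extractor written for illustration, not the tool linked above; the function name `chiasmus_candidates` and the `window` parameter are hypothetical.

```python
import re
from itertools import combinations


def chiasmus_candidates(text, window=30):
    """Return sorted (A, B) word pairs that occur in the order A ... B ... B ... A."""
    tokens = re.findall(r"\w+", text.lower())
    found = set()
    # Outer pair: two occurrences of the same word A, at most `window` tokens apart.
    for i, l in combinations(range(len(tokens)), 2):
        if tokens[i] != tokens[l] or l - i > window:
            continue
        # Inner pair: two occurrences of a different word B strictly between them.
        for j, k in combinations(range(i + 1, l), 2):
            if tokens[j] == tokens[k] and tokens[j] != tokens[i]:
                found.add((tokens[i], tokens[j]))
    return sorted(found)


print(chiasmus_candidates("All for one, and one for all"))
# [('all', 'for'), ('all', 'one'), ('for', 'one')]
```

Even on this short motto the naive matcher returns three overlapping candidates, which shows why plain pattern matching over-generates and further filtering or scoring is needed.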


(Ep)anaphora/Epiphora detector

Martin Luther King used one in his famous speech in Memphis. Epanaphora is the figure of speech that consists in starting successive sentences with the same words. Epiphora is the same, but at the end of the sentences. I made a detector for both.


Link to thesis
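
The two definitions above can be sketched in a few lines of Python. This is a minimal illustration that compares adjacent sentences only, not the detector described in the thesis; the helper names and the `min_words` threshold are hypothetical.

```python
import re


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def words(sentence):
    """Lowercased word tokens of a sentence."""
    return re.findall(r"\w+", sentence.lower())


def shared_prefix(a, b):
    """Longest common prefix of two word lists."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return a[:n]


def find_repetitions(text, min_words=2):
    """Report adjacent sentences that share at least min_words at the start or end."""
    sentences = [words(s) for s in split_sentences(text)]
    hits = []
    for i in range(len(sentences) - 1):
        first, second = sentences[i], sentences[i + 1]
        start = shared_prefix(first, second)
        if len(start) >= min_words:
            hits.append(("epanaphora", i, " ".join(start)))
        # Epiphora is the same test applied to the reversed sentences.
        end = shared_prefix(first[::-1], second[::-1])[::-1]
        if len(end) >= min_words:
            hits.append(("epiphora", i, " ".join(end)))
    return hits


print(find_repetitions("We will walk together. We will stand together."))
# [('epanaphora', 0, 'we will')]
```

Each hit gives the figure, the index of the first sentence of the pair, and the repeated words.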


CoreNLP French Lemmatizer

I made an (unofficial) script to get lemmatization into the XML output of Stanford CoreNLP.


View on github


Journal Article

Dubremetz, Marie and Nivre, Joakim (2018). Rhetorical Figure Detection: Chiasmus, Epanaphora, Epiphora. Frontiers in Digital Humanities. 5:10. doi: 10.3389/fdigh.2018.00010


Long Article

Dubremetz, Marie and Nivre, Joakim (2016). Syntax Matters for Rhetorical Figure Detection: the Case of Chiasmus. In Computational Linguistics for Literature (CLFL 2016). San Diego, United States.

PDF
BIB
Video

Long Article


Litteræ et Linguæ. Rhetoric Workshop (February 2016). Uppsala, Sweden.


Long Article

Dubremetz, Marie and Nivre, Joakim (2015). Rhetorical Figure Detection: the Case of Chiasmus. In Computational Linguistics for Literature (CLFL 2015). Denver, United States.


Long Article

Dubremetz, Marie and Nivre, Joakim (2014). Extraction of nominal multiword expressions in French. In Proceedings of the 10th Workshop on Multiword Expressions (MWE). Gothenburg, Sweden.


Long Article

Dubremetz, Marie (2013). Vers une identification automatique du chiasme de mots. In Actes de la 15e Rencontres des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RECITAL’2013), (pages 150–163). Les Sables d’Olonne, France.



Command Line workshops

A beginner workshop for learning the terminal, suitable for both Windows and Unix users.

Cultural analytics 2022

I am one of the main instructors, together with Karl Berglund and Ekta Vats, for the course "Cultural analytics". I teach command line and web scraping.

Grundläggande textanalys 2016

Basic text analysis. Lectures given in English. I taught computational linguistics techniques such as lemmatisation, tokenization, finite state transducers, HMMs and more. I was the main person responsible for this course, with duties such as managing the schedule, designing and correcting assignments, giving individual support to students, and preparing labs. The audience was first-year bachelor students with heterogeneous profiles. I developed pedagogical tools such as team-based MCQs and flash cards.


Below you can see the recording of the second talk I gave on chiasmus detection.