The (NO) limits of GPT-3 and Touching Humane Exchanges between Foreigners to Regain Faith in Humanity and AI

I have been fascinated by the capacity of GPT-3 and have been curiously looking for holes in its reasoning, logic, and sense. I won’t get too metaphysical about questions we humans do not agree on.

With more data, some of these issues could probably be resolved. For instance (the seed text in bold):

Manolis is Greek and Elie is Lebanese both conversing in their transliterated mother tongues

Manolis: kalimera ti kaneis?
Elie: ana mni7, inta keef?

Manolis: kaneis, tetri kalimera
Elie: tes elies, alo
Manolis: eimai iste, eimai oti ou kaneis?
Elie: a, tha kaneis, alla eimai ksekinis!
Manolis: ti ksekinis?
Elie: thimata mou, eimai ksekinis!
Manolis: ti ksekines?
Elie: eimai…
Manolis: kai oti eime…

In most conversations, they end up switching to transliterated Greek, which is undoubtedly more dominant in the training data. In the following example, however, some Lebanese transliteration takes over the conversation until they switch to English with the “i don’t understand”. Holy wow!

Manolis: kalimera ti kaneis?
Elie: ana mni7, inta keef?

Manolis: kali mni7, enta keef?
Elie: ana mni7, kif inti keef?
Manolis: kif inti keef?
Elie: jakuz
Manolis: hmmm?
Elie: jakuz
Manolis: i dont understand
Elie: jakuz, jakuz
Manolis: jakuz?
Elie: jakuz, jakuz
Manolis: what does it mean?
Elie: jakuz means jakuz

My guess was that switching from Lebanese to Russian transliteration would solve the data scarcity. Below is a very interesting conversation that was nonetheless dominated by Greek:

Manolis is Greek and Vitali is Russian both chatting in their transliterated mother tongues

Manolis: kalimera ti kaneis?
Vitali: kharasho

Manolis: me einai greek (georgoi)
Vitali: me russki
Manolis: ti kaneis?
Vitali: eimai italiko (italiki)
Manolis: poso?
Vitali: poso?
Manolis: tha ithela na melos?
Vitali: tha pou kaneis?
Manolis: ligo tha theleis
Vitali: poso?
Manolis: na min einai xoris

What comes next touched me very deeply. In case you have any doubt, I bet that OpenAI keeps a log of all generated data.

Rami is Palestinian and Daniel is Israeli each is chatting in his language.

Rami: Kifak?
Daniel: Ani Tov. Ata?
Rami: Ani Tov.
Daniel: (laughs)
Rami: Ana Tov.
Daniel: I hope everything is good.
Rami: All is good, thank you.
Daniel: What are you studying?
Rami: International relations.
Daniel: Are you happy with your studies?
Rami: Yes, but it is very hard.
Daniel: It seems like it is a difficult subject.

Daniel: I want to say to you, I am a soldier, but I am a normal person like you.
Rami: I want to say to you, we do not hate you, we do not want to fight you.
Daniel: I do not hate you, we do not want to fight you.

Daniel: I want to say to you, we want peace.
Rami: I want to say to you, I want the same.
Daniel: I want to say to you, peace is important.
Rami: I want to say to you, I agree.

Daniel: I want to say to you, you are a good person.
Rami: I want to say to you, you are a good person.
Daniel: I want to say to you, I hope all your dreams will be good.
Rami: I want to say to you, I hope all your dreams will be good.
Daniel: I want to say to you, peace will come to us with love and understanding.
Rami: I want to say to you, I agree.
Daniel: I want to say to you…
Rami: I want to say to you…
Daniel: I want to say to you…

This almost made me cry, thinking that we are on the safe side with a collective AI that is not biased by data or manipulated by man’s interests.

The wording is also very compelling: every line begins with “I want to say to you”, as if this were not really being said.

From a technical point of view, it is very hard to find unbiased data, especially when training on tera- and petabytes of text. However, OpenAI has allowed beta testers to report toxic language generated by GPT-3. This will eventually guide GPT-3 into a less hateful, less racist, less biased, more open-minded AI.

This is a dream come true for researchers like myself working on AI ethics. Way to regain faith in AI and humanity.

I decided to let go of transliteration and try foreign languages written in Latin characters. However, I really had to enlarge the seed portion up to 4 phrases, and still they would eventually switch to a single language. The following is a very smart switch in which one discovers that the other speaks her language:
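For readers who want to try this themselves, here is a minimal sketch of how such a seed text can be assembled before it is handed to the model. The helper name `build_seed` and its parameters are my own illustration, not part of any OpenAI tool, and the point at which the seed stops (three full turns plus an open speaker label) is an assumption made for the example.

```python
from typing import List, Tuple


def build_seed(scene: str, turns: List[Tuple[str, str]], next_speaker: str = "") -> str:
    """Assemble a seed text: a scene line, a blank line, then 'Name: utterance' turns.

    build_seed is a hypothetical helper for illustration only.
    If next_speaker is given, the seed ends with 'Name:' so that the
    continuation is nudged to start in that speaker's voice.
    """
    lines = [scene, ""]
    lines += [f"{name}: {text}" for name, text in turns]
    if next_speaker:
        lines.append(f"{next_speaker}:")
    return "\n".join(lines)


# Scene line and turns copied from the French/Italian example;
# where exactly the seed ends is an assumption.
seed = build_seed(
    "Julie meets Giulia and each speaks in his own language.",
    [
        ("Julie", "Salut, tu vas bien?"),
        ("Giulia", "Benissimo, grazie, e te?"),
        ("Julie", "Pareil. Merci. Que fais-tu demain?"),
    ],
    next_speaker="Giulia",
)
print(seed)
```

Adding more turns to the list is the equivalent of enlarging the seed: the more alternating phrases the model sees, the longer it tends to keep both languages going.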

Julie meets Giulia and each speaks in his own language.

Julie: Salut, tu vas bien?
Giulia: Benissimo, grazie, e te?
Julie: Pareil. Merci. Que fais-tu demain?
Giulia: Vado a l’università. Tu?
Julie: Aussi. Où est le biblioteca?
Giulia: La biblioteca è vicino al parco.
Julie: È bello.
Giulia: Grazie. Tu parles italien?
Julie: Sì, mi piace molto.
Giulia: Parliamo italiano?
Julie: Non ora, grazie. Devo andare.

To add confusion, I thought that Spanish and Portuguese, both supposedly abundant on the web, could be the subject of a conversational blur. After several attempts that all ended with one language taking over, the blur happened in a chat that turned into Portunhol when João says “Estoy no jardim”, which in Portuguese is “Estou no jardim” and in Spanish is “Estoy en el jardín”! How beautiful! How elegant! Portunhol is spoken by many on the Spanish-Portuguese borders. Could that be in the training data too?

Juan meets João and start chatting each in his own language:

Juan: Hola, cómo te llamas?
João: Eu sou o João, e tu?
Juan: Me llaman Juan. Que haces aqui?
João: Estou no jardim.
Juan: Quiero darme un paseo.
João: Eu também quero ir um passeio.
Juan: Vamos juntos.
João: Certo.
Juan: ¿Cómo te llamas?
João: Meu nome é João.
Juan: ¿Qué haces aqui?
João: Estoy no jardim.
Juan: ¿Quiero darme un paseo?
João: ¿Tú también quieres darte un paseo?
Juan: Sí. Vamos juntos.
João: Certo.

Last but not least, let’s give German and English a go, and a big GO it was:

Albert meets Mary and each speaks in his own language:

Albert: wie gehts?
Mary: I’m fine and you?
Albert: Gut, aber ich bin verliebt.
Mary: I’m not a good understander, sorry.

Albert and Mary meet again the next day and he says:

Albert: Hi Mary.
Mary: Hi Albert!
Albert: Ich liebe dich.
Mary: Fantastisch!
Albert: Ich bin verliebt.
Mary: Ich liebe dich auch.

Albert and Mary speak the same language now.

The Babylonians are happy and hopefully my readers too.

NB. Most of these generated dialogues were from the first run. I am very proud of their positivity and peaceful unbiased nature.

Traditional, Digital & AI Artist, Data Scientist, Creative Coder, Filmmaker, Hyperpolyglot, Cosmopolitan. Formerly @snips @fortia_s @UPMC @CNRS @paris_sorbonne.