Speaking our language 11

Càit a bheil thu a' dol?
 - Tha mi a' dol ...

Càit a bheil sibh a' dol?
 - Tha sinn a' dol ...

dhan bhaile
dhan sgoil
dhan bhùth
dhan oifis
dhan stèisean
dhan bhanca

Steòrnabhagh
Inbhir Nis
Lunnainn

Disathairne

Càit a bheil thu a' dol an-diugh?
Càit a bheil thu a' dol a-nochd?
Càit a bheil thu a' dol a-màireach?

Tha mi a' dol a-mach.
Tha mi a' dol a chèilidh air Anna.
Tha mi a' dol gu dannsa.
Tha mi a' dol a dh'obair.
Tha sinn a' dol dhachaigh.

BBC: German cathedral bones 'are Saxon queen Eadgyth'

Speaking our language 10

Tìoraidh!
Tìoraidh an-dràsda!
Tìoraidh ma-tha!
Mar sin leibh!
Mar sin leibh an-dràsda!

Feumaidh mi falbh.
Tha mi duilich.
Slàn leibh!

Mar sin leat!
Slàn leat!
Oidhche mhath!

Carson?
Tha mi sgìth.
Tha film air an telebhisean.
Tha cabhag orm.
Tha mi a' falbh.
Tha an trèana a' falbh.
Tha am bus a' falbh.
Tha am bàta a' falbh aig còig uairean.

Chì mi a-rithist thu.
Chì mi a-rithist sibh.

Today's Rhoda:

Speaking our language 8

Dè an uair a tha e?
 - Tha e ...

uair
dà uair
trì uairean
...
deich uairean
aon uair deug
dà uair dheug

gu bhith dà uair
cairteal gu dhà
cairteal gu dà uair
leth-uair an dèidh dhà

Cuin a tha am film air?
 - Tha am film air aig dà uair.
Cuin a tha am ball-coise air?

Cuin a tha sibh a' fosgladh?
Cuin a tha a' bhùth a' fosgladh?
Cuin a tha an oifis a' fosgladh?

What's Rhoda wearing?

BBC: Genetic study sheds light on Jewish diaspora

"The researchers analysed genetic samples from 14 Jewish communities across the world and compared them with those from 69 non-Jewish populations. Their study, published in Nature, revealed that most Jewish populations were 'genetically closer' to each other than to their non-Jewish neighbours. It also revealed genetic ties between globally dispersed Jews and non-Jewish populations in the Middle East."

Speaking our language 7

mac
nighean
balach
dithis
triùir
ceathrar
balach beag
nighean bheag

A bheil clann agad?
 - Tha.
 - Tha, tha ... agam.
 - Chan eil.
 - Chan eil clann agam.
 - Chan eil fhathast.

A bheil clann agaibh?
 - Tha, tha ... againn.
 - Chan eil clann againn.
 - Chan eil fhathast.

Greas ort!
Greasaibh oirbh!

What's Rhoda wearing today?

Speaking our language 9

Dè an obair a th' agaibh/agad?
 - 'S e ... a th' annam.

nurs
tidsear
clèireach
saor
portair
iasgair

Càit a bheil sibh/thu ag obair?
 - Tha mi ag obair aig an taigh.
 - Tha mi gun obair an-dràsta.
 - Tha mi ag obair ann an/am ...

Comar nan Allt 
oifis
banca
bùth
sgoil
ospadal
stèisean

Tha mi trang.

Here is Rhoda today:

Speaking our language 6

snàmh
iasgach
còcaireachd
dràibheadh
ball-coise
iomain

Is toigh leam ...
Cha toigh leam ... (idir).

An toigh leat ... ?
An toigh leibh ... ?
 - Is toigh l'.
 - Cha toigh l'.

Here is Rhoda today:

Nadine Gordimer

Saturday's Guardian contained the following quote by Nadine Gordimer as an answer to her own question "What is the most important lack in your life?":

I've lived that life in Africa without learning an African language. Even in my closest friendships, literary and political activities with black fellow South Africans, they speak only English with me. If they're conversing together in one of their mother tongues (and all speak at least three or four of each other's), I don't understand more than a few words that have passed into our common South African use of English. So I'm deaf to an essential part of the South African culture to which I'm committed and belong.

Conversational inclusion

Given a society consisting of N people, there will be (N^2 - N)/2 possible dialogue pairs (i.e. subsets of cardinality 2) and 2^N - N - 1 possible multilogue groups (i.e. subsets of cardinality greater than or equal to 2).

N        (N^2 - N)/2    2^N - N - 1
10       45             1013
100      4950           1.27 x 10^30
1000     499,500        ...
10000    49,995,000     ...
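
The two counts can be checked with a short Python sketch. The function names `dialogue_pairs` and `multilogue_groups` are made up for this illustration; the formulas are the ones given above.

```python
def dialogue_pairs(n: int) -> int:
    """Subsets of cardinality 2: (n^2 - n) / 2."""
    return (n * n - n) // 2


def multilogue_groups(n: int) -> int:
    """Subsets of cardinality >= 2: all 2^n subsets, minus the n singletons and the empty set."""
    return 2 ** n - n - 1


for n in (10, 100, 1000, 10000):
    groups = multilogue_groups(n)
    # The group count grows too fast to print in full, so report its number of digits.
    print(n, dialogue_pairs(n), f"{len(str(groups))} digits")
```

Python integers are arbitrary precision, so the group counts are exact even for the rows left as "..." in the table.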

The dialogue inclusion of society S is defined as the proportion of the possible dialogue pairs which are communicating pairs (i.e. the two members have a common language). The multilogue inclusion of society S is defined as the proportion of the possible multilogue groups which are communicating groups (i.e. all members have a common language).

In order for a society to have a multilogue inclusion of 1, there must be a universal language which is common to all members - although it may still be possible for particular subsets to have 'private' conversations in non-universal languages. However, this is not true for dialogue inclusion: imagine a society of 3 (groups of) people A, B and C where A speaks English and French, B speaks English and German, and C speaks French and German. Every pair shares a language, so dialogue inclusion is 1, yet no language is common to all three. Thus, multilogue inclusion appears to be a better measure of conversational inclusion.
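
Here is a minimal brute-force sketch of both measures, applied to the A, B, C example. It assumes a society is a dict mapping each person to the set of languages they speak; that representation and all function names are choices made for this illustration. It enumerates every subset, so it is only usable for very small societies.

```python
from itertools import combinations


def has_common_language(group, languages):
    """True if all members of the group share at least one language."""
    return bool(set.intersection(*(languages[p] for p in group)))


def inclusion(languages, min_size, max_size):
    """Proportion of groups of size min_size..max_size that have a common language."""
    people = sorted(languages)
    total = communicating = 0
    for k in range(min_size, max_size + 1):
        for group in combinations(people, k):
            total += 1
            communicating += has_common_language(group, languages)
    return communicating / total  # assumes at least two people in the society


def dialogue_inclusion(languages):
    return inclusion(languages, 2, 2)


def multilogue_inclusion(languages):
    return inclusion(languages, 2, len(languages))


society = {
    "A": {"English", "French"},
    "B": {"English", "German"},
    "C": {"French", "German"},
}
print(dialogue_inclusion(society))    # 1.0
print(multilogue_inclusion(society))  # 0.75
```

All three pairs can talk, but the one group of three cannot, so multilogue inclusion is 3 out of 4.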

NB: Can we come up with models which penalise private languages? Or models which reward them? Multilogue exclusion - the proportion of the multilogue groups which have a private language wrt the society as a whole? How many languages for a society S whose multilogue exclusion is 1? 2^|S| - |S| - 1. In other words, about 10^30 for a society of 100 people! Note that if multilogue exclusion is 1 then so is multilogue inclusion. However, in a monolingual society (or even a perfectly multilingual one where everyone speaks every language) multilogue exclusion will be 0.

Communicative capital

Person P's communicative capital is the number of people in the world P can hold a dialogue with.

Society S's communicative capital is the sum of the communicative capital of all its members.
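
A small sketch of these two definitions, under some assumptions of my own: the "world" is a toy dict from person to the set of languages they speak, two people can hold a dialogue if they share at least one language, and a person is not counted as their own dialogue partner. The function names are hypothetical.

```python
def communicative_capital(person, languages):
    """Number of other people who share at least one language with `person`."""
    mine = languages[person]
    return sum(
        1
        for other, theirs in languages.items()
        if other != person and mine & theirs
    )


def society_communicative_capital(members, languages):
    """Sum of the communicative capital of every member of the society."""
    return sum(communicative_capital(p, languages) for p in members)


# Toy world, invented for the example.
world = {
    "A": {"English", "French"},
    "B": {"English", "German"},
    "C": {"French", "German"},
    "D": {"Gaelic"},
}
print(communicative_capital("A", world))                    # 2
print(communicative_capital("D", world))                    # 0
print(society_communicative_capital({"A", "B", "C"}, world))  # 6
```

Note that the society sum counts each communicating pair inside the society twice, once from each side.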

Linguistic capital

Person P's linguistic capital is the number of languages P has.

Society S's linguistic capital is the sum of the linguistic capital of every P in S.

Note that this definition implies that simply knowing a language is enough, even if you don't use it to talk to anyone.
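
The same kind of sketch for linguistic capital, again assuming a toy dict from person to the set of languages they have (names and data invented for the example):

```python
def linguistic_capital(person, languages):
    """Number of languages the person has."""
    return len(languages[person])


def society_linguistic_capital(members, languages):
    """Sum of the linguistic capital of every member of the society."""
    return sum(linguistic_capital(p, languages) for p in members)


# Toy world, invented for the example.
world = {
    "A": {"English", "French"},
    "B": {"English", "German"},
    "D": {"Gaelic"},
}
print(linguistic_capital("D", world))            # 1
print(society_linguistic_capital(world, world))  # 5
```

D has a linguistic capital of 1 even though nobody else in this toy world speaks Gaelic. English is counted twice in the society total, once for A and once for B, because the definition sums over people rather than counting distinct languages.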

Foundation for Endangered Languages

The Foundation for Endangered Languages exists to support, enable and assist the documentation, protection and promotion of endangered languages. In order to do this, it aims:

  1. to raise awareness of endangered languages, both inside and outside the communities where they are spoken, through all channels and media
  2. to support the use of endangered languages in all contexts: at home, in education, in the media, and in social, cultural and economic life
  3. to monitor linguistic policies and practices, and to seek to influence the appropriate authorities where necessary
  4. to support the documentation of endangered languages, by offering financial assistance, training, or facilities for the publication of results
  5. to collect and make available information for use in the preservation of endangered languages
  6. to disseminate information on all of the above activities as widely as possible

SOILLSE - National Network for Gaelic Research

The aim of the SOILLSE project is to build a world-class research network for Gaelic across Scotland, in order to inform government policy and induce a "paradigm shift" in the use of Gaelic in Scotland. It has £5.29 million of funding over seven years from the SFC, BnG and HIE, as well as UoE, UoG, UoA and UHIMI. The themes are:

  • Gaelic as a family and community language
  • Gaelic in education
  • assessment of policies directed towards the revitalisation of Gaelic

There are two transversal themes:

  • Gaelic identity and self-confidence
  • Gaelic language use

NLP Technologies

NLP Technologies is a Montreal-based company, run by Atefeh Farzindar.

NLP is a North American leader in Natural Language Processing, Electronic Document Summarization of structured texts such as legal decisions, Qualitative Search solutions and Automatic Translation. We offer services that streamline the traditional cumbersome and time-consuming processes of reading, analyzing, and researching texts by using our patent-pending intelligent summarization and translation technologies. NLP Technologies’ current focus is to provide its innovative solutions to lawyers and other legal practitioners, associations, governments and courts.

Joel on Unicode

A quite brilliant post by Joel Spolsky on Unicode.

And here are Bill Poser's Unicode lecture notes.

Here's another nice article.

Phaistos disc script

Wikipedia - 241 tokens, 45 types.

Unicode: 101D0 - 101FF (i.e. 48 code points).
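
A quick way to check the range from Python, using only the standard library. The `<unassigned>` fallback is there because a given Python build's Unicode database may not name every code point in the range.

```python
import unicodedata

FIRST, LAST = 0x101D0, 0x101FF

# Every code point in the range, inclusive at both ends.
block = [chr(cp) for cp in range(FIRST, LAST + 1)]
print(len(block))  # 48

for ch in block:
    name = unicodedata.name(ch, "<unassigned>")
    print(f"U+{ord(ch):04X}", name, ch)

# These code points lie outside the Basic Multilingual Plane, so each one
# takes 4 bytes in UTF-8 and a surrogate pair (2 code units) in UTF-16.
print(len(block[0].encode("utf-8")))           # 4
print(len(block[0].encode("utf-16-le")) // 2)  # 2
```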

Aegean font - description, font

𐇐 𐇑 𐇒 𐇓 𐇔 𐇕 𐇖 𐇗 𐇘 𐇙 𐇚 𐇛 𐇜 𐇝 𐇞 𐇟

𐇠 𐇡 𐇢 𐇣 𐇤 𐇥 𐇦 𐇧 𐇨 𐇩 𐇪 𐇫 𐇬 𐇭 𐇮 𐇯

𐇰 𐇱 𐇲 𐇳 𐇴 𐇵 𐇶 𐇷 𐇸 𐇹 𐇺 𐇻 𐇼 𐇽 𐇾 𐇿

Gaelic BBC news corpus so far

         Tokens   Types   Tokens-so-far   Types-so-far   new-types/new-tokens
Jan       10238    2619           10238           2619                   0.26
Feb        5110    1554           15348           3375                   0.14
March     15162    3344           30510           5298                   0.13
April     13160    3060           43670           6676                   0.10
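
For reference, a sketch of how columns like these could be computed. It is not the script actually used for the table. It assumes each month's text has already been split into a list of tokens (how the real corpus was tokenised is not stated here), and the two "months" below are toy data, not the BBC corpus.

```python
def growth_table(monthly_tokens):
    """monthly_tokens: list of (label, list_of_tokens) in chronological order.

    Returns one row per month:
    (label, tokens, types, tokens_so_far, types_so_far, new_types / new_tokens)
    """
    seen = set()          # all types encountered in earlier months
    tokens_so_far = 0
    rows = []
    for label, tokens in monthly_tokens:
        types = set(tokens)
        new_types = types - seen   # types not seen in any earlier month
        seen |= types
        tokens_so_far += len(tokens)
        ratio = len(new_types) / len(tokens) if tokens else 0.0
        rows.append((label, len(tokens), len(types), tokens_so_far, len(seen), ratio))
    return rows


# Toy data: whitespace-split phrases standing in for a month of news text.
toy = [
    ("Jan", "tha mi a dol".split()),
    ("Feb", "tha sinn a dol dhachaigh".split()),
]
for row in growth_table(toy):
    print(row)
```

In the toy run, Feb has 5 tokens of which 2 types (sinn, dhachaigh) are new, giving a ratio of 0.4. For the first month the ratio is simply types over tokens, which matches the Jan row of the table (2619 / 10238 is about 0.26).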

SALTMIL'98 Workshop

review by Nicholas Ostler - "A language is a dialect with a dictionary, grammar, parser and a multi-million word corpus of texts - and they'd better all be computer tractable."

Headlines 13 May

Bile na Croitearachd ga dheasbad - Debate on crofting bill

  • bile (masc) - (parliamentary) bill
  • croit (fem) - croft
  • croitear (masc) - crofter
  • deasbad (fem/masc) - debate

Dachaighean cùram ùra "eu-coimearsalta"

  • dachaigh (fem) - home
  • cùram (masc) - care/responsibility/trust/worry
  • ùr - new
  • ùraich - to renew/refresh/modernise

Iomairt a' leantainn an aghaidh losgadair sgudail

  • iomairt (fem) - enterprise/initiative/campaign/venture
  • lean - to follow/continue/pursue
  • aghaidh (fem) - face/facade
  • sgudal (masc) - rubbish/trash

Sgoiltean gus dùnadh an Earra-Ghàidheal

  • Earra-Ghàidheal - Argyll
  • dùnadh (masc) - closure/closing/ending

Brataichean Gàidhlig air nochdadh an Glaschu

  • bratach (fem) - flag/banner
  • nochd - to appear/reveal/show

Trì iolairean gam marbhadh air taobh sear Chataibh

http://www.bbc.co.uk/scotland/alba/naidheachdan/story/2010/05/100512_skye_health.shtml