Good post Karl, I am familiar with the language situation in Norway, and personally, I think there are good lessons there for us too.
I'm also quite fascinated by some of the attempts to unify the Gallo-Italic languages, particularly Michael Dallera's Lombard Restruiturad Stàndard (LoReS)... he's in the Facebook group but I'm not sure if he's here in the forum. I don't want to speak for his work and risk being incorrect on something, but it seems really cool.
We have a subforum for Gallo-Sicilian, if you're a speaker you're welcome to open conversations there and introduce the relevant people to that area.
You certainly may! That's an excellent example. Hmm ... They used Perl for their project. That's what I'm using. I wonder if we could borrow some of their libraries? ... According to their about page: I clicked through the EDD's "Tjenester og verktøy", but I could not find any information about the database. But I did find a link to some more EDD dictionaries. Cool stuff! Thank you for sharing this.
Yes. They are. Here's how Dr. Cipolla defines the rule (Mparamu, p. 64):

I would define the rule differently. I would say:

All Sicilian verbs have an unstressed "stem" and a stressed "boot."

The infinitive reflects either the stem or the boot:
    stem + ari
    stem + iri (sc)
    boot + iri

The stem appears in all of the conjugations except:
    the present indicative -- 1st S., 2nd S., 3rd S., 3rd P.
    the imperative -- 2nd S.
The boot replaces the stem in those locations.

finiri -- stem: fin, boot: finìsc
    finisciu __ finemu
    finisci __ finiti
    finisci __ finìscinu

sèntiri -- stem: sint, boot: sènt
    sentu __ sintemu
    senti __ sintiti
    senti __ sèntinu

aspittari -- stem: aspitt, boot: aspètt
    aspettu __ aspittamu
    aspetti __ aspittati
    aspetta __ aspèttanu

mòriri -- stem: mur, boot: mòr
    moru __ muremu
    mori __ muriti
    mori __ mòrinu

allungari -- stem: allung, boot: allòng
    allongu __ allungamu
    allonghi __ allungati
    allonga __ allònganu

parrari -- stem: parr, boot: pàrr
    parru __ parramu
    parri __ parrati
    parra __ pàrranu

rispùnniri -- stem: rispunn, boot: rispùnn
    rispunnu __ rispunnemu
    rispunni __ rispunniti
    rispunni __ rispùnninu

crìdiri -- stem: crid, boot: crìd
    cridu __ cridemu
    cridi __ criditi
    cridi __ crìdinu
I just rewrote my scripts to implement that rule and to implement Fissatu's corrections. It should be a big improvement. Attached is a ZIP file containing an XLSX spreadsheet and PDF printout.

There are 32 conjugated verbs -- allungari, arrispùnniri, aspittari, aviri, capiri, crìdiri, dari, diri, èssiri, fari, finiri, jiri, manciari, mèttiri, mòriri, ntènniri, pàriri, parrari, pèrdiri, pòniri, purtari, putiri, ripètiri, rispùnniri, sapiri, sèntiri, stari, studiari, tèniri, vèniri, vìdiri, vuliri.

Does anyone have fresh eyes for me? Thanks in advance!
Hi Fissatu, Thanks for the quick response. Bonner and Cipolla both put the stress on the "a" -- pàriri. Importantly, if the stress fell on the penultimate, then it would be an exception to the rule laid out above ... which might require me to rethink the rule. So if you notice any more cases like that, please tell me. Thanks!
I was just double-checking Piccitto: it states that one locality in Enna pronounces it pàriri; otherwise it's pariri. Camilleri shows both parìri and pàriri, so I guess both must be acceptable.
Good work! Also note that Dieli lists both mòriri and muriri. So we now have two examples of a caveat to the "boot + iri" part of the rule. Good eye! Thanks!
This project is going to fail and that's a good thing. It's a good thing because the result is going to be better.

Dr. Dieli compiled a great vocabulary list, but to take his work forward, we need to completely rewrite it. The result is going to be a better dictionary. And it's going to be a better dictionary because the community will rewrite it.

Specifically, Dr. Dieli's list needs a deeper level of definition. Part of speech and English/Italian translation is a great start. Now we need to add detail: usage notes, preferred forms, examples, verb conjugations, etc. Adding detail requires a new dictionary.

For example: "vìviri" and "vìviri." Both verbs have the same infinitive, but they're two different verbs ("Iu vissi," I lived, and "Iu vippi," I drank). Dr. Dieli's dictionary does not contain enough detail to handle this situation.

So we either need to make an endless series of small edits to Dr. Dieli's work or we need a complete rewrite. Bite the bullet and rewrite it. At the very least, a complete rewrite is probably the best way to implement the new orthographic standard. And at best, a complete rewrite gives the community an opportunity to become involved.

The only question is: "How to rewrite it?"

I propose a flow "dû Sicilianu Spirimentali ô Sicilianu Cadèmicu" (from Experimental Sicilian to Academic Sicilian). Let's use these Perl hashes to create a bunch of (experimental) spreadsheets -- a sheet of verbs, a sheet of nouns, a sheet of adjectives, etc. Then let's ask the community to correct the spreadsheets. Then we can load the approved version ntô Dizziunariu Ufficiali dâ Cadèmia Siciliana (into the Official Dictionary of the Cadèmia Siciliana).

For the official dictionary, I'm going to recommend SQL -- not my Perl hashes. Perl is great for running experiments, but the community needs to store its work in a proper database. And then the community will have a dictionary that truly belongs to them.
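To make the "vìviri" problem concrete, here is a minimal sketch of how an SQL table could keep two verbs with the same infinitive apart by giving each its own row and id. It uses Python's built-in `sqlite3` with an in-memory database. The table name, column names and the English glosses are my own assumptions for illustration, not an agreed schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One row per verb, not per spelling: the id tells the two apart.
# Table and column names are hypothetical.
conn.execute(
    """
    CREATE TABLE entry (
        id        INTEGER PRIMARY KEY,
        headword  TEXT NOT NULL,
        pos       TEXT NOT NULL,
        gloss_en  TEXT,
        past_1s   TEXT
    )
    """
)

# The glosses are assumed translations; the past forms come from the post.
conn.executemany(
    "INSERT INTO entry (headword, pos, gloss_en, past_1s) VALUES (?, ?, ?, ?)",
    [
        ("vìviri", "verb", "to live", "vissi"),
        ("vìviri", "verb", "to drink", "vippi"),
    ],
)


def lookup(headword):
    """Return every entry that shares this headword."""
    rows = conn.execute(
        "SELECT id, gloss_en, past_1s FROM entry WHERE headword = ? ORDER BY id",
        (headword,),
    )
    return rows.fetchall()


for row in lookup("vìviri"):
    print(row)
```

Because the headword is not the key, usage notes, examples and conjugations can later hang off the id without the two verbs colliding.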
Well put, @Eryk. I think perhaps what we should focus on then is:
1) How best to facilitate information collection/compilation (spreadsheets are a neat idea)
2) How best to present this information in a flexible interface that deals with the specific issues of Sicilian
3) How to manage this project.
We made an official GitHub for the Cadèmia, maybe we can start working out of there? github.com/cademia
I'm very happy that @Eryk has come to this decision! I agree SQL is the way to go. As for the correcting approach: we could show a random word/verb/... on page load and let users check it (✔/❌ plus a correction proposal). Should we allow anonymous users? If we do, we should think of a Commission checking whether these crowdsourced entries are OK. We could also do a mix: auto-approve authenticated users (we should decide whom to give this power to), and let anonymous users post suggestions. I could make a Python web frontend for this kind of thing, if needed.
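Here is a rough sketch of the mixed approval idea, just to show the logic without any web framework. Everything in it (`ReviewQueue`, `Suggestion`, the `trusted` set, the status strings) is a hypothetical name chosen for illustration; a real frontend would store this in the database.

```python
import random
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Suggestion:
    word: str
    proposal: str            # corrected form, or the same word for a plain OK
    user: Optional[str]      # None means an anonymous visitor
    status: str = "pending"


@dataclass
class ReviewQueue:
    words: list
    trusted: set = field(default_factory=set)
    suggestions: list = field(default_factory=list)

    def pick_word(self, rng=random):
        """Choose the word to show on page load."""
        return rng.choice(self.words)

    def submit(self, word, proposal, user=None):
        """Record a check or correction; auto-approve trusted users."""
        status = "approved" if user in self.trusted else "pending"
        suggestion = Suggestion(word, proposal, user, status)
        self.suggestions.append(suggestion)
        return suggestion

    def pending(self):
        """Suggestions still waiting for the Commission."""
        return [s for s in self.suggestions if s.status == "pending"]
```

Anonymous submissions land in `pending()` for the Commission, while submissions from users in `trusted` go straight through. Who ends up in `trusted` is exactly the decision mentioned above.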
Thank you for the encouragement, @Paul and @dapal. I wonder if we can "grow a seed into a flower."

The "seed" is the information already available to us, in this case: Dr. Dieli's dictionary. The "flower" is an SQL dictionary database with Python frontend.

To "grow" the seed into the flower, we need a set of functions. The argument to the functions will be the already available information plus the information that we add. The values of the functions will be the spreadsheets that we will examine, correct and load into the SQL database.

The functions that I have in mind are no different from the ones you learned in math class. Just with words instead of numbers.

For example, suppose that f is a function of the variables: x and y.

    f(x,y) = x^2 + y^2

When x=2 and y=3, the value of the function is 13:

    f(x=2,y=3) = 2^2 + 3^2
    f(x=2,y=3) = 13

Now, suppose that conjugate is a function of the variables: stem, boot, conjugation, tense and person. One value that it might return is "finìscinu":

    conjugate( stem="fin", boot="finìsc", conjugation="iri", tense="present", person="3rd plural" ) = finìscinu

With that in mind, let's consider what information is available to us and what information we want to add.

From Dr. Dieli's dictionary, we already have:
    the word itself -- finiri
    the part of speech -- verb
    English translations -- to finish, to end
    Italian translations -- finire, smettere

Based on the spelling, we can infer that the verb's conjugation is "iri." The information that we must add is that the boot is "finìsc".

And there's lots of information that we may want to add: example usage, regional variations, preferred forms, synonyms, antonyms, .... I think it would be really cool if we provided a Sicilian language definition for each word.

So our first task is to specify exactly what information we want to collect.
For example:
    the word itself
    part of speech
    etymology, regional variations, preferred forms
    Sicilian language definition
    synonyms, antonyms
    English/Italian translation
    examples of usage
    other notes

For verbs, we also need:
    stem, boot
    conjugation
    irregular forms

For nouns and adjectives, we also need:
    irregular singular and plural forms

Let's begin making that list. Once we have a formal list, this project will take on a life of its own. That would be awesome. I will organize my work and write a README for it.
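One way to pin the list down is to write it as a record type and a function that turns records into spreadsheet rows. The following Python sketch is only a starting point: the class names, field names, the `COLUMNS` selection and the `infer_conjugation` helper are all assumptions of mine, and the noun/adjective fields are left out for brevity.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VerbInfo:
    stem: str
    boot: str
    conjugation: Optional[str]
    irregular_forms: dict = field(default_factory=dict)


@dataclass
class Entry:
    word: str
    part_of_speech: str
    english: list = field(default_factory=list)
    italian: list = field(default_factory=list)
    definition: str = ""          # Sicilian language definition
    synonyms: list = field(default_factory=list)
    antonyms: list = field(default_factory=list)
    examples: list = field(default_factory=list)
    notes: str = ""
    verb: Optional[VerbInfo] = None


# Columns written to the review sheet (a subset, chosen for illustration).
COLUMNS = ("word", "part_of_speech", "english", "italian",
           "stem", "boot", "conjugation")


def infer_conjugation(word):
    """Guess the conjugation from the infinitive's spelling."""
    for ending in ("ari", "iri"):
        if word.endswith(ending):
            return ending
    return None


def to_row(entry):
    """Flatten an entry into one spreadsheet row."""
    verb = entry.verb
    return {
        "word": entry.word,
        "part_of_speech": entry.part_of_speech,
        "english": "; ".join(entry.english),
        "italian": "; ".join(entry.italian),
        "stem": verb.stem if verb else "",
        "boot": verb.boot if verb else "",
        "conjugation": (verb.conjugation or "") if verb else "",
    }


def write_sheet(entries):
    """Return the entries as CSV text that reviewers can open and correct."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(to_row(e) for e in entries)
    return buffer.getvalue()
```

For finiri, the seed supplies the word, part of speech and translations, `infer_conjugation("finiri")` supplies "iri", and a person only has to add the boot "finìsc".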
@Eryk, I notice you mention Dr. Dieli's dictionary. Have you considered Wiktionary? I personally consider it a better source: even if it's less organised, it has many more words and many variants as well.
It's a great source. To be exact, Sicilian Wiktionary has 21,841 words while Dr. Dieli has 12,060. You can download the whole Sicilian Wiktionary from dumps.wikimedia.org/scnwiktionary. The one to focus on is the one marked: "Articles, templates, media/file descriptions, and primary meta-pages." Each page is rolled up into a gigantic XML file, from which we could extract a lot of information. The difference is that Dr. Dieli's lists are simple HTML tables, so it's super easy to work with them. And frankly, I admire Dr. Dieli's work because he poured so much of his heart into developing the language. Ultimately, we will collect information from a lot of different lists. Dr. Dieli's list is just a great place to start.
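For anyone curious what extracting from the dump could look like, here is a minimal Python sketch that streams page titles and wikitext out of a MediaWiki XML export. It assumes the usual export layout (`<page>` elements containing `<title>`, `<ns>` and `<revision><text>`), and the function names are hypothetical. It does not parse the wikitext itself, which is where the real work would be.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(source):
    """Yield (title, wikitext) for each main-namespace page in a dump."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title = ns = text = None
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "title":
                title = child.text
            elif name == "ns":
                ns = child.text
            elif name == "text":
                text = child.text or ""
        if ns == "0":      # namespace 0 holds the ordinary entries
            yield title, text
        elem.clear()       # free memory; the real dump is large
```

The dumps are distributed compressed, so in practice the `source` would be something like a file object from `bz2.open(path, "rb")` rather than a plain file.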
Eryk's reasoning is sound: a lot of info is there, and I'd agree that, given Dr Dieli's knowledge, it would be quality data. However, the extra info being sought could get tied down for years while we try to fill all those gaps. Even if one person was allocated one item of data to do, say they were tasked with doing the etymology of 11,000 words, that alone could take one person years. If two people were doing it, you might get that down to a couple of years, etc. This probably needs some sort of group decision as to how ambitious we want to be with this, there being a trade-off between completeness (a very good thing) and having something useful up and running as quickly as possible (also very useful). Not an easy decision.
If we write the dictionary to write itself, we could have a complete work in a relatively short amount of time.

Think of it like "mail merge." You put the addresses into a spreadsheet and your word processor prints out hundreds of letters. We do the same thing here, but with the added twist that we automatically collect the information too. The spreadsheet writes itself. The letters write themselves. And pretty soon you have enough mail to fill a whole post office.

In this specific case, we program our computer to collect information from Dieli's dictionary, from Wiktionary, etc. The computer then populates a spreadsheet with information that it collected. A human being then compares the information in the spreadsheet with their own knowledge of the language, with textbooks, etc. and makes corrections to the spreadsheet. We then load the corrected spreadsheet into an "official dictionary."

The important step is to rigorously define exactly what information we want to collect. For example, below is information that I collected on four verbs. That little bit of information on diri, vistiri and vistirisi correctly produces whole conjugations. We only need a little bit of information because the information that we are collecting is so well-defined. (Èssiri requires more information because it is almost entirely irregular).

I just finished defining what information to collect on verbs. My next steps are to define what information to collect on other parts of speech and to define what information to collect on words in general. Once we know exactly what information to collect, we will know exactly what assistance to ask for and this dictionary will grow rapidly.

In the meantime, one thing that I will ask for is examples. For example, in Salvatore's video about taliari, he gives the examples:
    "Talìu i picciriddi ca jòcanu."
    "Taliamu a partita ô stàdiu."
Those are excellent examples. They really help you understand how to use the verb taliari.
Good examples like that will help people learn the language quickly.

Code:
    %{ $vnotes{"diri"} } = (
        verb => {
            conj => "xxiri",
            stem => "dic",
            boot => "dìc",
            irrg => {
                inf => "diri",
                pai => { quad => "dìss" },
                pap => "dittu",
                adj => "dittu",
            },
        },
    );

Code:
    %{ $vnotes{"vistiri"} } = (
        verb => {
            conj => "xxiri",
            stem => "vist",
            boot => "vèst",
            irrg => {
                inf => "vistiri",
            },
        },
    );
    %{ $vnotes{"vistirisi"} } = (
        reflex => "vistiri",
    );

Code:
    %{ $vnotes{"èssiri"} } = (
        dieli => ["essiri"],
        verb => {
            conj => "xxiri",
            stem => "ess",
            boot => "èss",
            irrg => {
                pri => { us => "sugnu", ds => "sì", ts => "è", up => "semu", dp => "siti", tp => "sunnu" },
                pim => { ds => "sia", ts => "fussi", up => "semu", dp => "siti", tp => "fùssiru" },
                pai => { us => "fui", ds => "fusti", ts => "fu", up => "fomu", dp => "fùstivu", tp => "foru" },
                imi => { us => "era", ds => "eri", ts => "era", up => "eramu", dp => "eravu", tp => "eranu" },
                ims => { us => "fussi", ds => "fussi", ts => "fussi", up => "fùssimu", dp => "fùssivu", tp => "fùssiru" },
                fti => { stem => "sa" },
                coi => { stem => "sa" },
                pap => "statu",
                adj => "statu",
            },
        },
    );
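To show why so little information can be enough, here is a Python sketch of one way such notes might be consumed: generate the regular forms by rule, then lay the listed exceptions on top. This is my reading, not a description of the actual Perl scripts. In particular, I am assuming that us/ds/ts/up/dp/tp are the six person slots and that "pri" is the present indicative, and both function names are hypothetical.

```python
# Assumed meaning: 1st/2nd/3rd singular, then 1st/2nd/3rd plural.
PERSONS = ("us", "ds", "ts", "up", "dp", "tp")


def overlay_irregular(regular, irregular):
    """Copy the rule-generated forms, then substitute any listed exceptions."""
    merged = {tense: dict(forms) for tense, forms in regular.items()}
    for tense, override in irregular.items():
        if not isinstance(override, dict):
            continue  # whole-word entries such as "pap" are not handled here
        # Keep only per-person forms; keys like "stem" or "quad" are skipped.
        person_forms = {p: f for p, f in override.items() if p in PERSONS}
        if person_forms:
            merged.setdefault(tense, {}).update(person_forms)
    return merged


def resolve_reflexive(notes, word):
    """Follow a 'reflex' pointer to the notes of the base verb."""
    entry = notes[word]
    base = entry.get("reflex")
    return notes[base] if base else entry
```

With this shape, vistirisi needs nothing beyond a pointer to vistiri, while èssiri replaces nearly every rule-generated form. Alternate stems such as `fti => { stem => "sa" }` would need their own handling, which the sketch leaves out.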
I'm still working on the dictionary. I added some words and created some "ricoti di palori" (gatherings of words). (It seems to me that they are more "ricoti" than "cullizzioni," i.e. collections.) But my most important work is the part that you can't see: I added another class of verbs to the Perl hashes. Now I have to work on the nouns, adjectives, adverbs, ... How do you say in Sicilian: "Just keep truckin' on"??