Topic: Yoruba language and ICT
Views: 18033, Unique: 9461 
Subscribers: 20
Mike Maxwell  327
11-01-2007 09:32 AM ET (US)
I apologize for bringing up old stuff, but I just ran across something that may have a bearing on this.

QT - Remi-Niyi Alaran wrote:
> I would like to share a writing system that I have developed for
> the Yoruba language.
> ...
> The Ajayi script... has proved difficult to convert
> Yoruba into computer machine code because of the diacritical
> marks.

Actually, it can be *encoded* on a computer without any problem, using Unicode. I have seen problems displaying it, due to inappropriate diacritic placement. A font problem, not an encoding problem (and certainly not a problem with Unicode). But that isn't the real point I wanted to bring up:

> The FaYe system does away with diacritical marks altogether.

There was some discussion in this list about the difficulties of getting a script like this encoded in Unicode, since the script is (as far as I know) not in general use. It turns out there is a systematization of the Unicode Private Use Area (PUA) called Conscript:
   http://www.evertype.com/standards/csur/
One could submit a request to set aside the necessary code points in this area for FaYe. I don't know how much of a standing this has as something _official_, but it's probably better than picking your own area in the PUA and issuing a font for it.

(Apologies if this was already brought up; I don't recall seeing it mentioned on this list, but my memory gets shorter as my years get longer.)
--
 Mike Maxwell
 maxwell@ldc.upenn.edu
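
The distinction drawn above between encoding and display can be illustrated with a minimal Python sketch. It assumes only the standard `unicodedata` module; the sample letter (e with dot below and acute accent) is chosen for illustration and is not taken from the thread. The letter has no single precomposed code point, so it is stored as a sequence, yet it encodes and round-trips without loss. Whether the marks are drawn in the right place is then up to the font.

```python
import unicodedata

def code_points(s):
    """Return the code points of s in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in s]

# e + COMBINING DOT BELOW + COMBINING ACUTE ACCENT (illustrative sample)
typed = "e\u0323\u0301"
nfc = unicodedata.normalize("NFC", typed)  # composed as far as possible
nfd = unicodedata.normalize("NFD", typed)  # fully decomposed

print(code_points(nfc))  # ['U+1EB9', 'U+0301']
print(code_points(nfd))  # ['U+0065', 'U+0323', 'U+0301']

# The text survives a UTF-8 round trip unchanged: encoding is not the problem.
round_trip = nfc.encode("utf-8").decode("utf-8")
print(round_trip == nfc)  # True
```

Both normalization forms represent the same letter; software that compares or searches Yoruba text would typically normalize to one form first.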
Remi  326
10-31-2007 06:03 AM ET (US)
Currently reviewing the Unicode website and considering the DejaVu font for FaYe possibilities.
Mike Maxwell  325
10-30-2007 08:11 PM ET (US)
> First of all,
> only scripts with established use are approved.

Or scripts which were in established use in millennia past
(hieroglyphics, cuneiform, the Tagalog Brahmi script...). But the general point is correct: Unicode does not set aside blocks for any arbitrary script, otherwise they would be swamped with requests to encode lots of one-person scripts.

> Secondly, the
> developing of a solid proposal for encoding is a bit of work and
> there is a backlog of scripts that have not yet been encoded

While that's true, I believe the reason has to do with the fact that all the easy cases have already been done. What's left are probably under-documented scripts, relatively rare scripts, or ancient symbol sets for which it's not even clear if they really were a form of writing (as opposed, say, to decorations).
--
 Mike Maxwell
 maxwell@ldc.upenn.edu
BisharatNet  324
10-30-2007 12:34 PM ET (US)
Hi Remi, One possible approach for fonts might be to encode the characters in the "Private Use Area" (PUA) of Unicode.* This way any special font you make (which would be required anyway to read text in your alphabet) will let you mix with other scripts for translation or explanatory notes, and it won't have incompatibilities with other Unicode text.

Does anyone else have any guidelines on use of PUA in this area? Could Remi do this, say, in the PUA of DejaVu font and rename it, say, DejaVu-FaYe?

As for getting your alphabet in Unicode/ISO-10646, that is not an easy matter at all, from what I understand. See http://www.unicode.org/pending/proposals.html . First of all, only scripts with established use are approved. Secondly, the developing of a solid proposal for encoding is a bit of work and there is a backlog of scripts that have not yet been encoded (hence the Script Encoding Initiative http://linguistics.berkeley.edu/sei/ ).

Don

*See:
Chapter 13 in Unicode 5.0 book (section 13.5) http://unicode.org/charts/PDF/UE000.pdf
Unicode chart "Private Use Area Range: E000–F8FF Disclaimer Terms of Use" http://unicode.org/charts/PDF/UE000.pdf
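
As a rough sketch of the PUA idea above, the following Python snippet reserves a run of code points inside the range E000–F8FF for a 48-character script (38 letters plus 10 numerals, the figures given for FaYe in this thread). The starting point `FAYE_BASE` is purely hypothetical; a real allocation would have to avoid ranges that other fonts or registries already use.

```python
import unicodedata

# Basic Multilingual Plane Private Use Area, as given in the Unicode chart.
PUA_FIRST, PUA_LAST = 0xE000, 0xF8FF

def in_bmp_pua(ch):
    """True if ch is a Private Use Area code point in the BMP."""
    return PUA_FIRST <= ord(ch) <= PUA_LAST

FAYE_BASE = 0xE000      # hypothetical starting code point (assumption)
FAYE_SIZE = 38 + 10     # letters + numerals, per the FaYe description
faye_block = [chr(FAYE_BASE + i) for i in range(FAYE_SIZE)]

print(all(in_bmp_pua(ch) for ch in faye_block))          # True
print({unicodedata.category(ch) for ch in faye_block})   # {'Co'}

# PUA characters mix freely with ordinary text in one Unicode string.
mixed = "FaYe: " + faye_block[0] + faye_block[1]
print(sum(in_bmp_pua(ch) for ch in mixed))               # 2
```

Such text displays correctly only with a font that supplies glyphs for those code points, which is why the font and the allocation have to be distributed together.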
Remi  323
10-26-2007 06:15 AM ET (US)
Edited by author 10-26-2007 06:18 AM
All
My apologies for the delay since posting the Yoruba FaYe system. Thank you for all comments and support so far. There is some work to do before submitting for UTF. I will particularly appreciate help in creating
1) a Linux-compatible and Windows-compatible Yoruba FaYe font
2) a keyboard mapping,
3) virtual keyboard so that users can click 'n' write.
4) integrate font onto OpenOffice suite and Firefox browser
With these four components, it should become easier for people to use FaYe to generate and understand literature.

This is not just about the alphabet. A fatal flaw of the Ajayi script is its dependence on adding diacritical marks to the Roman script. It is difficult to read or understand Yoruba without those marks. Unfortunately, it is not easy to add them to online or print literature due to lack of software vendor (no Microsoft code) support for digitisation.

Adé, ẹ wa ni bẹ
Because of lack of diacritical marks, these five words CAN MEAN
Adé, the letter 'ẹ' is there;
Adé, you are there;
Adé, you come to the place;
Adé, you will look for someone to cut;
Adé, you look for someone to beg (someone-else)
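
The loss of information described above can be sketched in Python: once the marks are stripped, distinct spellings collapse into one string. The `strip_marks` helper is written for this illustration. The four sample words and their glosses are a commonly cited textbook set and are an assumption here, not taken from this thread.

```python
import unicodedata
from collections import defaultdict

def strip_marks(s):
    """Remove all combining marks (tone marks and under-dots) from s."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# Commonly cited examples (assumed glosses); \u0323 = dot below,
# \u0301 = acute (high tone), \u0300 = grave (low tone).
WORDS = {
    "oko": "farm",
    "o\u0323ko\u0323": "husband",
    "o\u0323ko\u0323\u0301": "hoe",
    "o\u0323ko\u0323\u0300": "vehicle",
}

groups = defaultdict(list)
for word, gloss in WORDS.items():
    groups[strip_marks(word)].append(gloss)

print(dict(groups))  # {'oko': ['farm', 'husband', 'hoe', 'vehicle']}
print(strip_marks("Ad\u00e9, \u1eb9 wa ni b\u1eb9"))  # Ade, e wa ni be
```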

PS. I am not a linguist or any sort of language expert. My interest is in helping the next generation of Yoruba learners and in promoting wider cultural / commercial interest in written Yoruba.

I found the Fontforge font-editing software. Can anyone please advise on how to create non-Latin fonts with it? Or recommend other agreeable open source software for attaining 1 - 4 above? In due course, I may seek advice on UTF procedures.
Remi  322
10-26-2007 05:23 AM ET (US)
E wa ni be
Mike Maxwell  321
09-14-2007 10:32 PM ET (US)
To echo Don Osborn's thoughts:
 > An alternative argument would be that even a sub-optimal but
 > workable script that is already established is better to
 > keep working with than to change.

IMO, the best writing system is one people use. It doesn't matter how bad it is, if it's used, it's probably better than throwing it away and starting with a new one. Look at English, with its awful spelling. Many alternative alphabets have been proposed over the last century or two, and none caught on. Why? Because people could already read and write. And no matter how much better a reformed alphabet would have been (believe me, we waste a lot of time in school learning to spell), none of the alternatives ever caught on.

Or worse, look at Chinese. It would be hard (IMO) to think of a worse way to write; the only advantage is that it allows people speaking mutually unintelligible languages (such as Mandarin and Cantonese) to communicate by writing. Despite the existence of workable alphabetic alternatives, the Chinese continue to use their character system. And they publish books, magazines, newspapers, and have a substantial presence on the Internet.

I would say that much better than trying to devise a new and better writing system, one's efforts could be put to teaching people to use the one they have, and then encouraging them to do so. A thriving literature written in a bad alphabet is much more important to a language and culture than a lack of literature with a perfect alphabet.
The best writing system is one people use.
--
 Mike Maxwell
 maxwell@ldc.upenn.edu
BisharatNet  320
09-14-2007 09:18 PM ET (US)
Edited by author 09-14-2007 09:20 PM
Dear Remi-Niyi, Thanks for bringing the new FaYe system to our attention (message 317). There have been several new alphabets discussed in recent years (one each in Senegal, Gambia and Cameroon), not to mention older African writing systems. Each one proposes new advantages, but have there been any efforts to compare it with the others? This is not to discourage you, just to ask.

An alternative argument would be that even a sub-optimal but workable script that is already established is better to keep working with than to change. (A similar argument was made in the case of Bambara of Mali concerning several changes in the orthography over the last 30 years - people who learn one system have to learn the new one and old publications have to be revised, etc., even though the older systems were not bad.)

In any event, I am sure that in the longer run, computer tools for transliteration would enable transforming text in a particular language from one script to another. So work on a proposed new script need not slow work in the existing one.

Similarly, the current problems with diacritics on Latin characters should not be overstated. Many more complex scripts are already fully used on computers, the internet, cellphones, SMS. Computer systems are being improved in their ability to handle combining diacritics, complex scripts, etc.

As for Adé's question about Unicode, my understanding is that a script needs to be pretty well established before getting into the pipeline. The Mandombe script used in parts of D.R. Congo (invented in the 1970s) is still not in process at all, as far as I know. Nor does it have an identifier code. Again, this is not to discourage you, but rather to outline the situation.

On the other hand, I suppose that if the Nigerian government were to decide that FaYe is what it will use in all schools for Yoruba and whatever other languages, the importance of the new writing system for encoding in Unicode would be significant. This is a little bit like what happened with Tifinagh - there was a proposal to encode this ancient but still used script in Unicode/ISO 10646, but only when the Moroccan government decided it was going to use this in schools for Tamazight instruction did it get enough attention to finalize a version of the proposal for approval.

Hope this is of help. Good luck.

Don Osborn
Bisharat.net
PanAfriL10n.org
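
The point above about transliteration tools can be sketched as a table-driven converter. This is only an illustration under stated assumptions: the target code points are hypothetical Private Use Area values starting at `BASE`, not the real FaYe assignments, and tone marks are simply passed through, whereas an actual FaYe converter would need the script's own rules for tone.

```python
import unicodedata

# The 25 letters of the standard Yoruba alphabet, including the digraph "gb".
LETTERS = ["a", "b", "d", "e", "e\u0323", "f", "g", "gb", "h", "i", "j",
           "k", "l", "m", "n", "o", "o\u0323", "p", "r", "s", "s\u0323",
           "t", "u", "w", "y"]
BASE = 0xE000  # hypothetical PUA starting point (assumption)

# Keys are stored decomposed (NFD) so that input in either form matches.
TO_PUA = {unicodedata.normalize("NFD", letter): chr(BASE + i)
          for i, letter in enumerate(LETTERS)}
FROM_PUA = {pua: latin for latin, pua in TO_PUA.items()}

def to_script(text):
    """Map Latin-script Yoruba to the hypothetical PUA script."""
    s = unicodedata.normalize("NFD", text.lower())
    out, i = [], 0
    while i < len(s):
        pair = s[i:i + 2]
        if pair in TO_PUA:            # longest match first: gb, dotted letters
            out.append(TO_PUA[pair])
            i += 2
        else:                         # single letter, or pass through as-is
            out.append(TO_PUA.get(s[i], s[i]))
            i += 1
    return "".join(out)

def from_script(text):
    """Map the hypothetical PUA script back to Latin-script Yoruba."""
    latin = "".join(FROM_PUA.get(ch, ch) for ch in text)
    return unicodedata.normalize("NFC", latin)

sample = "\u1eb9 wa ni b\u1eb9"
print(from_script(to_script(sample)) == unicodedata.normalize("NFC", sample))  # True
```

Because both directions are driven by one table, text written in the existing orthography stays usable whichever script a reader prefers.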
Adé  319
09-06-2007 06:10 PM ET (US)
Rẹmi,

I see that you accounted for ọ and ṣ (that is s with sub-dot) but not ẹ. Also what about the nasal n sound?

Do you plan to submit your system to the Unicode Consortium for inclusion in the Unicode Standard?
maxwell@ldc.upenn.edu  318
09-06-2007 12:31 PM ET (US)
Quoting QT - Remi-Niyi Alaran <qtopic+15-KKgbRqJUAR8@quicktopic.com>:
> Research indicates that humans can only
> delineate between about 34 unique sounds.

What research is this? (Warning in advance: I don't think this is correct, although I suppose it depends on what you mean by "unique sounds".)

> The FaYe system...is phonetic, so that every sound in Yoruba is now
> represented by a unique character.

I suspect you mean phonemic, not phonetic. It is certainly possible to write Yoruba (or any other language) phonetically, using e.g. the IPA system, but that typically multiplies the number of "unique sounds" over what a phonemic system would require.

   Mike Maxwell
   CASL/ U MD

Remi-Niyi Alaran  317
09-06-2007 11:24 AM ET (US)
I would like to share a writing system that I have developed for the Yoruba language.
Called Yoruba FaYe (meaning "draw it so we can understand it"), it may also be extended to other languages. It comprises 48 characters: 38 letters and 10 numerals. This is a smaller character set than the 62 symbols (26 capitals, 26 lower-case letters and 10 numerals) of standard English, which uses the Roman script.

The present Yoruba script was developed by the missionary Ajayi Crowther in 1836. The Ajayi script features a standard Roman script with the addition of diacritical marks to reflect Yoruba tonality and accent, so the Ajayi script is larger than the 62 characters used in English. It has proved difficult to convert Yoruba into computer machine code because of the diacritical marks. Without the diacritical marks, it is very difficult for even accomplished Yoruba linguists to read or write Yoruba efficiently.

The FaYe system does away with diacritical marks altogether. It is phonetic, so that every sound in Yoruba is now represented by a unique character. Research indicates that humans can only delineate between about 34 unique sounds. Yoruba FaYe has a 38 character alphabet and it has a natural rhythm able to accommodate the sophistication of Yoruba's complex and rich oral literature.

It is hoped that the FaYe system helps in considerably improving the quantity and quality of literature available to document African Literary Heritage. You can view or download the FaYe system at www.ijebudrums.blogspot.com
BisharatNet  316
06-18-2007 11:02 PM ET (US)
FYI, the Teaching and Learning with Technology site of the Pennsylvania State University has a page on "Yoruba Accent Codes" at http://tlt.its.psu.edu/suggestions/interna...anguage/yoruba.html

Don Osborn
Bisharat.net
PanAfriL10n.org
BisharatNet  315
05-03-2007 01:41 PM ET (US)
FYI, a Yoruba dictionary online that hasn't been mentioned here is at http://freelang.net/dictionary/yoruba.html . I haven't looked at it yet (it requires downloading). Don
BisharatNet  314
05-03-2007 12:56 PM ET (US)
Belated thanks to Mike and Samuel for the feedback. I think that advanced technologies such as machine translation in the case of Yoruba need some long-term planning. Knowing the practical steps, such as the utility of parallel texts, and the paths, such as context-specific work, helps.

While I think it's important to discuss these things, one side effect is that people sometimes get the impression that it's a project actively underway.

Ultimately if such projects - from basic issues like corpora to advanced applications - are to really progress, I think there would need to be more training of Nigerians in Nigeria in aspects of language and ICT. Not sure how much of that is going on already, but given how multilingual Africa is, any strategy to advance use of ICT should consider this aspect. Just a thought.

Don Osborn
Bisharat.net
PanAfriL10n.org
Mike Maxwell  313
03-16-2007 06:49 PM ET (US)
QT - Dr. Samuel Olamijulo wrote:
> For the special attention of all who are working on Machine
> Translation for Yoruba,

I'm not sure that anyone is...

> I am sure you will find very useful this relatively recent
> Yoruba and English publication. Corresponding Yoruba and English
> translations are on each half of 1978 pages. The publishers must
> have it in digital format.
>
> Title: BIBELI MIMO – HOLY BIBLE King James Version

I neglected to mention in my earlier post that the bilingual text one uses for statistical MT should be in the same genre that one hopes to translate. So parallel news text if you hope to use MT for news, parallel text of computer manuals if you hope to use MT to translate computer manuals, etc. The Bible is indeed a source, and is available in print form (if not in electronic form) for most written languages. But it would probably work well as an MT source only if you wanted to translate other texts of a religious sort. (I think one might be able to extract other kinds of information, e.g. about morphology, from a Bible; but then Yoruba doesn't have much in the way of morphology.)
--
 Mike Maxwell
 maxwell@ldc.upenn.edu
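
One practical reason a bilingual Bible is attractive as parallel text is that verse references give a ready-made alignment key. The sketch below is a minimal illustration, not a description of any existing project: the `align_by_verse` helper and the dictionary layout are assumptions, and the Yoruba side is a labeled placeholder rather than real text.

```python
def align_by_verse(source, target):
    """Pair up texts that share a (book, chapter, verse) key.

    Returns the aligned pairs plus the keys found on one side only.
    """
    shared = sorted(source.keys() & target.keys())
    pairs = [(source[key], target[key]) for key in shared]
    unmatched = sorted(source.keys() ^ target.keys())
    return pairs, unmatched

english = {
    ("Gen", 1, 1): "In the beginning God created the heaven and the earth.",
    ("Gen", 1, 2): "<English text of Gen 1:2>",   # placeholder
}
yoruba = {
    ("Gen", 1, 1): "<Yoruba text of Gen 1:1>",    # placeholder
}

pairs, unmatched = align_by_verse(english, yoruba)
print(len(pairs))   # 1
print(unmatched)    # [('Gen', 1, 2)]
```

The genre caveat above still applies: pairs aligned this way would mainly help a system meant to translate similar religious text.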
Dr. Samuel Olamijulo  312
03-16-2007 05:38 AM ET (US)
 Yoruba-English Bilingual Texts on Same of 1978 Pages

For the special attention of all who are working on Machine Translation for Yoruba,

I am sure you will find very useful this relatively recent Yoruba and English publication. Corresponding Yoruba and English translations are on each half of 1978 pages. The publishers must have it in digital format.

Title: BIBELI MIMO – HOLY BIBLE King James Version

Produced year 2004 by
 
Bible Society of Nigeria

Tel: 234 1 545 7524 ; 587 6471

Website: http://www.biblesociety-nigeria.org/

Contacts: http://www.biblesociety-nigeria.org/nigeria-3.htm

18 Wharf Road,
Apapa
Lagos, Nigeria

Thank you.
Dr. Samuel Kayode Olamijulo