Topic: Yoruba language and ICT
Views: 18025, Unique: 9458 
Subscribers: 20
BisharatNet  360
07-24-2009 08:56 PM ET (US)
I received a request for Unicode text of African languages written in Latin script with extended characters. The purpose is testing some new fonts. Does anyone have access to such digital text of Yoruba?

Don Osborn
Bisharat.net
BisharatNet  359
06-12-2009 12:26 PM ET (US)
Excerpt from a message posted by Manuela Noske on A12n-collaboration:

FYI that Microsoft Corporation has released the Windows Vista Language Interface Packs for Hausa, Igbo and Yoruba on June 2nd. The LIPs are available at the locations below. They are free downloads that install on any PC that runs Windows Vista with either SP1 or SP2.

Yoruba Details Page and .mlc:

http://www.microsoft.com/downloads/details...7f5b&displaylang=yo

http://download.microsoft.com/download/A/E...F5EB9/LIP_yo-NG.mlc

...


(See also /m353 on this message board.)
Olalekan (YorubaWorld)  358
05-19-2009 06:39 AM ET (US)
Mr Osborn,

In response to your post number 355 below, I quote:

"Ringtones that are set to "say" something, such as the name of the caller"

I've outlined how it can be done below

YORUBA RINGTONES ALERT
----------------------

This needs only a small change to the phone's contact list. Older handsets may not have this feature, so check the phone manual for the "CONTACT LIST" settings.

1. Record your ringtone, e.g. "ADE npe o, gbe foonu re" (meaning "ADE is calling you, pick up your call"), in .mp3, .mp4 or .3gp format with an audio program such as Audacity ( http://audacity.sourceforge.net/ ), then copy it onto your phone's memory card.

2. Alternatively, use the phone's voice recording feature.

3. Rename the recorded audio clip to the name of the person, e.g. "ADE".

4. In the phone's contact list, choose "ADD NEW CONTACT" and add the person's name.

5. Scroll to the section that says "RING TONE" and click it.

6. Browse the ringtone list, look for "ADE", then select it.

7. You have now assigned the recorded audio clip "ADE.mp3" as the "RINGTONE" for ADE in your contact list; it will alert you to all of ADE's incoming calls.

8. Whenever ADE calls, you will hear "ADE npe o, gbe foonu re" (meaning "ADE is calling you, pick up your call").

That's all, you're done.


Olalekan
YorubaWorld Information Service
Olalekan (YorubaWorld)  357
05-19-2009 05:47 AM ET (US)
Yeah! This is a good idea. As I've posted on several Yoruba sites and forums, the best way to promote Yoruba language and culture online at the moment is to publish Yoruba content in multimedia format, because support for Yoruba orthography is still in its infancy. The use of aural and visual elements along with textual content will therefore aid understanding of the Yoruba language.

In 2002, the YorubaWorld forum at Yahoo published an interactive multimedia Yoruba Alphabet & Number Tutor, which received an overwhelming number of downloads and positive comments. We also modified the Windows operating system sound scheme into a "Yoruba sound scheme" applied to events in Windows and programs, i.e. greetings in Yoruba on Windows opening (Ekaabo) and closing (Oda bo), and alerts: OK (Beeni), NO (Beeko) and so on.

In 2003, TeachME ABD, an instructional program, was designed to teach the Yoruba language online; it is visually interactive, with on-screen demonstration.


A full-fledged, free, interactive Yoruba e-content offering will be featured on our forthcoming web site, YorubaWorld Information Service.
Please watch out.
BisharatNet  356
05-18-2009 11:30 PM ET (US)
The Google web search engine now has a Yoruba interface. See http://www.google.com/intl/yo/ . It seems that they did not use any subdot characters, but do have some tone marks.
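For anyone who wants to check an interface or a page for this themselves, here is a minimal sketch in Python (standard library only). It decomposes the text and counts subdots and tone marks separately. The function name audit_marks is made up for illustration, and it assumes that only the grave and acute accents are used as tone marks, so any other marking (a macron for mid tone, say) would be missed.

```python
import unicodedata
from collections import Counter

DOT_BELOW = "\u0323"            # COMBINING DOT BELOW (the subdot)
TONE_MARKS = {"\u0300", "\u0301"}  # grave and acute; an assumption of this sketch

def audit_marks(text):
    """Count subdots and tone marks in text (illustrative helper)."""
    counts = Counter()
    # NFD splits precomposed letters into base letter + combining marks,
    # so the marks can be counted no matter how the text was typed.
    for ch in unicodedata.normalize("NFD", text):
        if ch == DOT_BELOW:
            counts["subdot"] += 1
        elif ch in TONE_MARKS:
            counts["tone"] += 1
    return counts

# "Yorùbá" carries tone marks but no subdots.
print(audit_marks("Yor\u00f9b\u00e1"))
# "ẹ wa ni bẹ" carries subdots but no tone marks.
print(audit_marks("\u1eb9 wa ni b\u1eb9"))
```

A text that reports tone marks but zero subdots would match what is described above for the Google interface.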
BisharatNet  355
03-01-2009 11:53 PM ET (US)
Given the tonal nature of Yoruba and the talking drum tradition/technology, and noting the work that Tunde Adegbola has been doing with that, what might be the possibility of doing something creative with cellphone technology that would be unique to Yoruba? Such as:
  • ringtones that are set to "say" something, such as the name of the caller
  • SMS messages that are tone sequences

Don Osborn
Bisharat.net
BisharatNet  354
01-02-2009 02:11 PM ET (US)
Edited by author 01-02-2009 02:12 PM
The A12n-collaboration list has had a thread of possible interest under the title "Tech support for Yoruba orthography." It began (or resumed) with a posting by Dr. Samuel Olamijulo on 26 Dec. at http://lists.kabissa.org/lists/archives/pu...ation/msg01174.html

The full A12n-collab archives are accessible at http://lists.kabissa.org/lists/archives/pu...a12n-collaboration/ and mirrored on Linguist List (since 2004) at http://listserv.linguistlist.org/archives/a12n-collaboration.html

Don Osborn
Bisharat.net
BisharatNet  353
12-13-2008 02:19 PM ET (US)
Microsoft's Local Language Program is working to localize Windows Vista in Yoruba.

See this story on 'Gbenga Sesan's Oro blog:
"Vista, MS Office in Hausa, Igbo, Yoruba"
Friday, December 12th, 2008
http://www.gbengasesan.com/blog/?p=307

The translation of terminology in Yoruba is available for review at:
http://www.pinigeria.org/microsoft/yorubaglossary.pdf

You can check out more information about Windows Vista Language Support in Yoruba and other languages at:
http://www.microsoft.com/globaldev/vista/V...nguage_Support.mspx

Don Osborn
Bisharat.net
BisharatNet  352
08-02-2008 08:57 AM ET (US)
Edited by author 08-02-2008 09:02 AM
Hi, re /m350 I neglected to put the URLs for the abstracts. They're on the ACM Digital Library Portal at:

A modular holistic approach to prosody modelling for Standard Yorùbá speech synthesis
http://portal.acm.org/citation.cfm?id=1288084

A fuzzy decision tree-based duration model for Standard Yorùbá text-to-speech synthesis
http://portal.acm.org/citation.cfm?id=1221595.1221970

These are only abstracts. Apparently, one can subscribe (not free) to read the full text of the articles.

Unfortunately it looks like the ACM site is Unicode-challenged - note how some words in Yoruba seem garbled. I assume that subdot characters or combining diacritics were involved.
Sir Lawie  351
08-02-2008 07:28 AM ET (US)
Thanks a lot, I'll check it out.
BisharatNet  350
08-02-2008 03:13 AM ET (US)
FYI (but not light reading) - 2 articles on text to speech for Yoruba:

A modular holistic approach to prosody modelling for Standard Yorùbá speech synthesis
Computer Speech and Language
Volume 22, Issue 1 (January 2008)
Pages 39-68
Year of Publication: 2008
ISSN: 0885-2308
Authors:
Ọdẹ́túnjí A. Ọdẹ́jọbí, Room 109, Computer Buildings, Computer Science and Engineering Department, Ọbáfẹ́mi Awólọ́wọ̀ University, Ilé-Ifẹ̀, Nigeria
Shun Ha Sylvia Wong, Computer Science, Aston University, Aston Triangle, Birmingham B4 7ET, UK
Anthony J. Beaumont, Computer Science, Aston University, Aston Triangle, Birmingham B4 7ET, UK

A fuzzy decision tree-based duration model for Standard Yorùbá text-to-speech synthesis
Computer Speech and Language
Volume 21, Issue 2 (April 2007)
Pages 325-349
Authors:
Ọdẹ́túnjí A. Ọdẹ́jọbí, Computer Science, Aston University, Aston Triangle, Birmingham B4 7ET, UK and Room 109, Computer Buildings, Computer Science and Engineering Department, Ọbáfẹ́mi Awólọ́wọ̀ University, Ilé-Ifẹ̀, Nigeria
Shun Ha Sylvia Wong, Computer Science, Aston University, Aston Triangle, Birmingham B4 7ET, UK
Anthony J. Beaumont, Computer Science, Aston University, Aston Triangle, Birmingham B4 7ET, UK
 
Messages 349-348 deleted by topic administrator between 07-11-2008 06:26 PM and 07-09-2008 07:28 AM
Sir Lawie  347
07-07-2008 07:41 AM ET (US)
RE: Yoruba Text to Speech

Thanks Mr. Adegbola
Sir Lawie  346
07-07-2008 07:20 AM ET (US)
All you need is 'Total Video Converter'. This program is good at encoding and decoding both video and audio files for the Web, DVD, portable players, mobile phones or PDAs (Pocket PC).

Follow this link for free trial:

http://www.effectmatrix.com/total-video-converter/index.htm

Sir Lawie
[YorubaWorld Information Service]
http://groups.yahoo.com/group/yorubaworld/
kalison  345
07-07-2008 04:21 AM ET (US)
Need new Rip DVD to AVI ?
Rip DVD to AVI
Have a nice surfing!
BisharatNet  344
06-22-2008 03:30 PM ET (US)
Thanks Tunde, BTW I just came across a report of a recent meeting at which you and your colleague presented about Yoruba language and ICT:

Daily Sun
"How computer can boost learning of Yoruba language and culture"
By SOLA BALOGUN
Sunday, May 25, 2008
http://www.sunnewsonline.com/webpages/feat...-25-05-2008-001.htm

Don
Tunde Adegbola  343
06-20-2008 10:51 AM ET (US)
One of our associates, Dr. Tunji Odejobi of the Computer Science and Engineering Department at Obafemi Awolowo University, is working on Yoruba speech synthesis and has developed software that works.
He is at present at the University of Cork in Ireland, where he will be for the next year. I shall make him aware of this demand.
Tunde
Sir Lawie  342
06-18-2008 07:24 AM ET (US)
Is there any 'text to speech software' that can read Yoruba text?
BisharatNet  341
06-11-2008 07:06 AM ET (US)
Here's part of a reply Andrew sent to Samuel's inquiry (also last Nov.):

Not all Yoruba Unicode keyboard layouts produce the same sequence of Unicode code points, so testing with alternative character sequences should be done.

For example, in Unicode the letter Ọ́ can be represented as <U+004F U+0323 U+0301> (NFD), <U+004F U+0301 U+0323>, <U+1ECC U+0301> (NFC) or <U+00D3 U+0323>. These are canonically equivalent and should display the same, but it is necessary to test fonts with all combinations.
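The equivalence of those four sequences can be checked with a short Python sketch using only the standard unicodedata module. The labels and the helper name codepoints are illustrative; the four sequences are the ones listed above. Note that this only confirms the sequences normalize to the same form. It says nothing about whether a given font renders them identically, which still has to be tested by eye.

```python
import unicodedata

# The four spellings of the letter quoted in the message above.
variants = {
    "O + dot below + acute": "\u004F\u0323\u0301",
    "O + acute + dot below": "\u004F\u0301\u0323",
    "O-with-dot-below + acute": "\u1ECC\u0301",
    "O-with-acute + dot below": "\u00D3\u0323",
}

def codepoints(s):
    """Render a string as a list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)

# Canonically equivalent strings collapse to one form under NFC and NFD.
nfc_forms = {unicodedata.normalize("NFC", s) for s in variants.values()}
nfd_forms = {unicodedata.normalize("NFD", s) for s in variants.values()}

for label, s in variants.items():
    nfc = unicodedata.normalize("NFC", s)
    print(f"{label:26} {codepoints(s):22} NFC: {codepoints(nfc)}")

print("distinct NFC forms:", len(nfc_forms))
print("distinct NFD forms:", len(nfd_forms))
```

For font testing, the un-normalized strings in variants are the useful ones, since a rendering engine may receive any of them.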

Andrew
BisharatNet  340
06-10-2008 10:15 AM ET (US)
Last November, Samuel Olamijulo sent this question concerning new Gentium fonts to selected lists and individuals. It included a copy of an announcement by the creator of these fonts. I'm posting it all here with a question as to what kind of response there was and any further thoughts about Gentium and Yoruba. Don

---------------
Please, what do you think of these new Gentium Basic and Gentium Book Basic fonts in relation to the still much-needed enhancement of Yoruba language display and storage on the Internet, complete with tonal signs and underdots?
 
Thank you for your time.
Dr Samuel Kayode Olamijulo
**********************************
----- Forwarded Message ----
From: Gentium-Announce List <owner-gentiumlist@lists.sil.org>
To: Gentium-Announce <gentium-announce@lists.sil.org>
Sent: Thursday, November 15, 2007 6:51:51 AM
Subject: [Gentium] Update #5 - Gentium Basic and Gentium Book Basic available for testing

Gentium-Announce List
Update #5 - Gentium Basic and Gentium Book Basic available for testing
- - - - - - - -

Dear friends of Gentium,

Great news! We have been hard at work to complete the first versions of Gentium that include bold and bold italic. For the first time in many years we have a major release for you - in a preliminary test version.

We have two new font families in the Gentium clan: Gentium Basic and Gentium Book Basic. Each has a complete set of four weights: regular, italic, bold and bold italic. Gentium Book Basic is generally heavier than the original Gentium and better for some publishing uses. Both families also include a few OpenType and Graphite smart font features, including optimized diacritic positioning. I've appended parts of the Gentium Basic FONTLOG below to give you more detailed information on these fonts.

The new fonts are called 'Basic' because they support a smaller set of characters than the full Gentium fonts. They only support basic Latin and a handful of extended Latin characters. There is no Greek or Cyrillic, or even full IPA. The purpose is to provide early versions of the new weights that meet the needs of most Latin script users.

Never fear - we haven't abandoned the main Gentium fonts. Our next task will be to return to them and complete an update of the existing regular and italic to add extended Cyrillic, ancient Greek glyphs, Unicode 5.1 updates, and smart font capabilities. We'd hoped to have this completed by now, but wanted to get the new weights to you as soon as we could. After that we plan to expand the main Gentium family to include these new weights and smart font code.

The new Basic fonts are only available in beta test right now. They contain known bugs, so we don't yet recommend them for everyday production use, or as the source for derivative versions. The most serious one is that the lowercase 'z' has too much space in the heavier italic weights. We plan to release a fixed, final release of the Basic fonts in a month or two, once initial broad testing is done. So we welcome your bug reports and general opinions on the design of the heavier faces.

The beta test fonts are available at:

    http://scripts.sil.org/Gentium_basic

A few requests:

- please note the limitations and known problems
- please do not ask us to expand the Basic character set, as those needs will be met by the complete Gentium font family
- please report problems to me at the email address below, not via the download feedback form on the main Gentium download page

Thanks again for your interest in Gentium, and the many encouraging emails you have sent.

Victor Gaultney
Gentium /at/ sil.org
BisharatNet  339
06-06-2008 12:10 AM ET (US)
I have been given to understand that there is a project involving University of Wisconsin and University of Oregon concerning Yoruba fonts. Am seeking more information. Does anyone know anything about it?

Don
   338
06-03-2008 11:23 PM ET (US)
Deleted by topic administrator 06-06-2008 12:08 AM
BisharatNet  337
04-06-2008 05:47 PM ET (US)
Edited by author 04-06-2008 05:47 PM
FYI, the Aflat.org site has an "Automatic Diacritic Restoration" utility that can be used for Yoruba. See http://www.aflat.org/?q=node/184 , and let us (and them) know what you think!

Don Osborn
Bisharat.net
Tunde Adegbola  336
01-29-2008 12:10 AM ET (US)
The small vertical line below is supposed to be always contiguous with the base letter. The rationale is that the character should be seen as one whole character rather than a base character and a modifier.

If I may reiterate the background, the vertical line was proposed instead of the dot because, with use, the dot wore off and disappeared from the die before the base character did. This was in the days of die casting. Hence, the contiguity of the small vertical line and the base character promoted longevity of the die.

Now that printing technology has changed radically, thanks to ICT, the problem of the wearing dot has disappeared, but few have taken notice. The underdot now seems to be tolerated because the dot does not wear.

I think the change in printing technology may instigate or ultimately force a change in the standard.
BisharatNet  335
01-18-2008 06:50 PM ET (US)
Edited by author 01-18-2008 06:54 PM
Happy New Year all (belatedly)!*

I have a quick question about the positioning of the small vertical line below (U+0329) in the "classical" usage (as opposed to the more commonly used dot under/subdot). Is it:
a) always contiguous with the base letter
b) usually contiguous with the base letter
c) sometimes contiguous, sometimes separate

It's a question relevant to any effort to (1) provide for positioning of the vertical line or (2) design a combined glyph. This is not a question about line vs. dot (I think that was pretty much resolved a while back) but on what the "correct" or "ideal" use of the line is, when it is used.
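One reason positioning matters more for the line than for the dot: a base letter plus the dot below (U+0323) has a precomposed form in Unicode, but a base letter plus the vertical line below (U+0329) does not, so the line always depends on the font's mark positioning. A small Python sketch (standard library only; the helper name describe is illustrative) shows the difference:

```python
import unicodedata

DOT_BELOW = "\u0323"
VERTICAL_LINE_BELOW = "\u0329"

def describe(mark, base="e"):
    """Return (Unicode name, combining class, NFC length) for base + mark."""
    composed = unicodedata.normalize("NFC", base + mark)
    return unicodedata.name(mark), unicodedata.combining(mark), len(composed)

for mark in (DOT_BELOW, VERTICAL_LINE_BELOW):
    name, ccc, nfc_len = describe(mark)
    # nfc_len == 1 means a single precomposed character exists;
    # nfc_len == 2 means the mark stays a separate combining character.
    print(f"{name}: combining class {ccc}, NFC length {nfc_len}")
```

Both marks attach below the base letter, but only the dot collapses into one code point; a combined glyph for the line would have to be built inside the font.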

A separate question is whether any other languages in the region use (or have used) the line under and if so how they position it.

Thanks in advance for any feedback.

Don Osborn
Bisharat.net
PanAfriL10n.org

*It's International Year of Languages. See http://tinyurl.com/yofvdf
BisharatNet  334
12-15-2007 10:25 AM ET (US)
The One Laptop Per Child project (see /m282 & /m331) has a page for people who want to work on localization at http://wiki.laptop.org/go/Pootle#Sign-up . For languages like Yoruba not yet in the table, it looks like you will have to add appropriate rows in order to enter your name.

Note link to their page on Yoruba.

Don Osborn
Bisharat.net
PanAfriL10n.org
BisharatNet  333
12-15-2007 10:23 AM ET (US)
You're welcome Remi. This message board was set up over 5 years ago in response to some questions about using Yoruba on computers (see /m1). Over the years it has had a lot of input by various people working on various initiatives or ideas. I've also personally tried to post relevant info that I come across with the thought that it is useful to have as much info on Yoruba language and ICT in one place. Or at least one place - this QuickTopic board does not have to be the only one, and different forums or websites can have different approaches.

The particular advantage of this board is that it is somewhat free-form like a public discussion, letting people post as they need to without subscriptions or log-ins, and without the need to conform to a particular structure (for example, you don't have to find the right category to ask a particular question).

Maybe it is a good time to ask those who have not posted lately for updates on their projects or efforts to use Yoruba on computers, the internet, and mobile devices.

Don
Remi  332
12-14-2007 06:12 PM ET (US)
Thanks for the updates.
BisharatNet  331
11-23-2007 11:16 AM ET (US)
Edited by author 11-23-2007 02:00 PM
During the last four weeks there has been a discussion about keyboard layouts on the a12n-collaboration list. The main topic is plans by the One Laptop Per Child project for a multilingual keyboard for its "XO" laptop. The keyboard layout would be intended to support languages of Nigeria (including Yoruba) or the whole of West Africa.

If you are interested, see http://lists.kabissa.org/lists/archives/pu...a12n-collaboration/ (Note- this list is not an official OLPC forum)

The OLPC layouts are shown at:
http://wiki.laptop.org/go/Image:WAfrica-Alt-1.png
&
http://wiki.laptop.org/go/OLPC_Nigeria_Keyboard

See also /m282

Don
BisharatNet  330
11-23-2007 11:09 AM ET (US)
Edited by author 11-23-2007 11:13 AM
Hi Remi,

There are two separate issues. First, you could develop a font by either using the PUA in Unicode (see /m324 & /m327) or by modifying the upper range of an old 8-bit font. This would enable you to produce documents. You could also, with an 8-bit FaYe font, drive the font from the server for a webpage. Such things have been done with 8-bit fonts by people using the N'Ko alphabet to write Manding (Malinke/Bambara).

That would give you some utility but is obviously limited.

The second issue is that for a script to be recognized across devices, platforms and applications in the way you want, the only standard is Unicode. Without a common coding standard, of course, it is impossible to have such intercompatibility. But as we have said, Unicode can't accept proposed scripts whatever their virtues may be - there are just so many and new ones continue to be invented.

The N'Ko alphabet, to use that example again, has just been incorporated in Unicode, but it has been in use for half a century with quite some publication and ongoing newsletters etc. Getting N'Ko in Unicode involved demonstrating that it is in active use.

Sorry there is not better news.

Don
Remi  329
11-20-2007 04:38 PM ET (US)
Thanks all for comments so far.
What I would like to achieve with Yoruba Faye, for now, is to be able to type the characters online, or on a mobile phone, just like I am typing these characters.
World domination through Unicode can wait ;-) unless Unicode is the only way to get Yoruba Faye characters into a word processor or virtual keyboard.
Please advise accordingly.
BisharatNet  328
11-17-2007 01:00 PM ET (US)
 
Several Nokia mobile phone models apparently have text menu support in Yoruba. Does anyone have any experience with using this? How good is it?

See http://www.panafril10n.org/wikidoc/pmwiki.php/PanAfrLoc/Nokia for a list of models.

TIA for any info.

Don Osborn
Bisharat.net
PanAfriL10n.org
Mike Maxwell  327
11-01-2007 09:32 AM ET (US)
I apologize for bringing up old stuff, but I just ran across something that may have a bearing on this.

QT - Remi-Niyi Alaran wrote:
> I would like to share a writing system that I have developed for
> the Yoruba language.
> ...
> The Ajayi script... has proved difficult to convert
> Yoruba into computer machine code because of the diacritical
> marks.

Actually, it can be *encoded* on a computer without any problem, using Unicode. I have seen problems displaying it, due to inappropriate diacritic placement. A font problem, not an encoding problem (and certainly not a problem with Unicode). But that isn't the real point I wanted to bring up:

> The FaYe system does away with diacritical marks altogether.

There was some discussion in this list about the difficulties of getting a script like this encoded in Unicode, since the script is (as far as I know) not in general use. It turns out there is a systematization of the Unicode Private Use Area (PUA) called Conscript:
   http://www.evertype.com/standards/csur/
One could submit a request to set aside the necessary code points in this area for FaYe. I don't know how much of a standing this has as something _official_, but it's probably better than picking your own area in the PUA and issuing a font for it.

(Apologies if this was already brought up; I don't recall seeing it mentioned on this list, but my memory gets shorter as my years get longer.) --
 Mike Maxwell
 maxwell@ldc.upenn.edu
Remi  326
10-31-2007 06:03 AM ET (US)
in = reviewing Unicode website and considering Dejavu font for Faye possibilities = mode
Mike Maxwell  325
10-30-2007 08:11 PM ET (US)
> First of all,
> only scripts with established use are approved.

Or scripts which were in established use in millennia past (hieroglyphics, cuneiform, the Tagalog Brahmi script...). But the general point is correct: Unicode does not set aside blocks for any arbitrary script, otherwise they would be swamped with requests to encode lots of one-person scripts.

> Secondly, the
> developing of a solid proposal for encoding is a bit of work and
> there is a backlog of scripts that have not yet been encoded

While that's true, I believe the reason has to do with the fact that all the easy cases have already been done. What's left are probably under-documented scripts, relatively rare scripts, or ancient symbol sets for which it's not even clear if they really were a form of writing (as opposed, say, to decorations).
--
 Mike Maxwell
 maxwell@ldc.upenn.edu
BisharatNet  324
10-30-2007 12:34 PM ET (US)
Hi Remi, One possible approach for fonts might be to encode the characters in the "Private Use Area" (PUA) of Unicode.* This way any special font you make (which would be required anyway to read text in your alphabet) will let you mix with other scripts for translation or explanatory notes, and it won't have incompatibilities with other Unicode text.
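To make the PUA idea concrete, here is a rough Python sketch (standard library only). The assignment is purely hypothetical: the names faye_letter_1 through faye_letter_38 are placeholders, the choice of starting at U+E000 is arbitrary, and nothing here reflects an actual FaYe encoding or font. The count of 38 comes from the alphabet size given in /m317.

```python
import unicodedata

# Basic Multilingual Plane Private Use Area range.
PUA_START, PUA_END = 0xE000, 0xF8FF

# HYPOTHETICAL assignment: one PUA code point per letter of a 38-letter alphabet.
# The names are placeholders, not real FaYe letter names.
HYPOTHETICAL_FAYE = {
    f"faye_letter_{i + 1}": chr(PUA_START + i) for i in range(38)
}

def is_private_use(ch):
    """True if ch lies in the BMP Private Use Area."""
    return PUA_START <= ord(ch) <= PUA_END

# PUA characters mix freely with ordinary Unicode text in one string.
sample = "FaYe: " + "".join(list(HYPOTHETICAL_FAYE.values())[:3]) + " (Yor\u00f9b\u00e1)"

for ch in sample:
    if is_private_use(ch):
        # General category "Co" marks private-use code points.
        print(f"U+{ord(ch):04X}", unicodedata.category(ch))
```

Text like this only displays meaningfully where a font with matching glyphs at those code points is installed, which is the limitation of any PUA approach.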

Does anyone else have any guidelines on use of PUA in this area? Could Remi do this, say, in the PUA of DejaVu font and rename it, say, DejaVu-FaYe?

As for getting your alphabet in Unicode/ISO-10646, that is not an easy matter at all, from what I understand. See http://www.unicode.org/pending/proposals.html . First of all, only scripts with established use are approved. Secondly, the developing of a solid proposal for encoding is a bit of work and there is a backlog of scripts that have not yet been encoded (hence the Script Encoding Initiative http://linguistics.berkeley.edu/sei/ ).

Don

*See:
Chapter 13 in Unicode 5.0 book (section 13.5) http://unicode.org/charts/PDF/UE000.pdf
Unicode chart "Private Use Area Range: E000–F8FF Disclaimer Terms of Use" http://unicode.org/charts/PDF/UE000.pdf
Remi  323
10-26-2007 06:15 AM ET (US)
Edited by author 10-26-2007 06:18 AM
All
My apologies for the delay since posting the Yoruba FaYe system. Thank you for all comments and support so far. There is some work to do before submitting to Unicode. I would particularly appreciate help in creating:
1) a Linux-compatible and Windows-compatible Yoruba FaYe font,
2) a keyboard mapping,
3) a virtual keyboard so that users can click 'n' write,
4) integration of the font into the OpenOffice suite and Firefox browser.
With these four components, it should become easier for people to use FaYe to generate and understand literature.

This is not just about the alphabet. A fatal flaw of the Ajayi script is its dependence on adding diacritical marks to the Roman script. It is difficult to read or understand Yoruba without those marks. Unfortunately, it is not easy to add them to online or print literature because of the lack of software vendor support (no Microsoft code) for digitisation.

Adé, ẹ wa ni bẹ
Because of the lack of diacritical marks, these five words CAN MEAN:
Adé, the letter 'ẹ' is there;
Adé, you are there;
Adé, you come to the place;
Adé, you will look for someone to cut;
Adé, you look for someone to beg (someone-else)
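How much is lost when marks are dropped can be seen by stripping them programmatically. The sketch below (Python, standard library only) decomposes the text, discards the combining marks and recomposes what is left. The function name strip_marks is just illustrative.

```python
import unicodedata

def strip_marks(text):
    """Remove all combining marks (tone marks and subdots) from text."""
    # NFD separates each base letter from its combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Category "Mn" (nonspacing mark) covers accents and the dot below.
    kept = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    return unicodedata.normalize("NFC", kept)

marked = "Ad\u00e9, \u1eb9 wa ni b\u1eb9"   # the sentence quoted above
print(strip_marks(marked))
```

Every differently marked spelling of this sentence reduces to the same bare letters, of the kind seen in /m322, so the reader has to recover the intended meaning from context alone.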

PS. I am not a linguist or any sort of language expert. My interest is in helping the next generation of Yoruba learners and in promoting wider cultural / commercial interest in written Yoruba.

I found the FontForge font-editing software. Can anyone please advise on how to create non-Latin fonts with it? Or recommend other agreeable open source software for attaining 1 - 4 above? In due course, I may seek advice on Unicode procedures.
Remi  322
10-26-2007 05:23 AM ET (US)
E wa ni be
Mike Maxwell  321
09-14-2007 10:32 PM ET (US)
To echo Don Osborn's thoughts:
 > An alternative argument would be that even a sub-optimal but
 > workable script that is already established is better to
 > keep working with than to change.

IMO, the best writing system is one people use. It doesn't matter how bad it is, if it's used, it's probably better than throwing it away and starting with a new one. Look at English, with its awful spelling. Many alternative alphabets have been proposed over the last century or two, and none caught on. Why? Because people could already read and write. And no matter how much better a reformed alphabet would have been (believe me, we waste a lot of time in school learning to spell), none of the alternatives ever caught on.

Or worse, look at Chinese. It would be hard (IMO) to think of a worse way to write; the only advantage is that it allows people speaking mutually unintelligible languages (such as Mandarin and Cantonese) to communicate by writing. Despite the existence of workable alphabetic alternatives, the Chinese continue to use their character system. And they publish books, magazines, newspapers, and have a substantial presence on the Internet.

I would say that much better than trying to devise a new and better writing system, one's efforts could be put to teaching people to use the one they have, and then encouraging them to do so. A thriving literature written in a bad alphabet is much more important to a language and culture than a lack of literature with a perfect alphabet.
The best writing system is one people use.
--
 Mike Maxwell
 maxwell@ldc.upenn.edu
BisharatNet  320
09-14-2007 09:18 PM ET (US)
Edited by author 09-14-2007 09:20 PM
Dear Remi-Niyi, Thanks for bringing the new FaYe system to our attention /m317. There have been several new alphabets discussed in recent years (one each in Senegal, Gambia and Cameroon), not to mention older African writing systems. Each one proposes new advantages but have there been any efforts to compare with the others? This is not to discourage you, just to ask.

An alternative argument would be that even a sub-optimal but workable script that is already established is better to keep working with than to change. (A similar argument was made in the case of Bambara of Mali concerning several changes in the orthography over the last 30 years - people who learn one system have to learn the new one and old publications have to be revised, etc., even though the older systems were not bad.)

In any event, I am sure that in the longer run, computer tools for transliteration would enable transforming text in a particular language from one script to another. So work on a proposed new script need not slow work in the existing one.

Similarly, the current problems with diacritics on Latin characters should not be overstated. Many more complex scripts are already fully used on computers, the internet, cellphones, SMS. Computer systems are being improved in their ability to handle combining diacritics, complex scripts, etc.

As for Adé's question about Unicode, my understanding is that a script needs to be pretty well established before getting into the pipeline. The Mandombe script used in parts of D.R. Congo (invented in the 1970s) is still not in process at all, as far as I know. Nor does it have an identifier code. Again, this is not to discourage you, but rather to outline the situation.

On the other hand, I suppose that if the Nigerian government were to decide that FaYe is what it will use in all schools for Yoruba and whatever other languages, the importance of the new writing system for encoding in Unicode would be significant. This is a little bit like what happened with Tifinagh - there was a proposal to encode this ancient but still used script in Unicode/ISO 10646, but only when the Moroccan government decided it was going to use this in schools for Tamazight instruction did it get enough attention to finalize a version of the proposal for approval.

Hope this is of help. Good luck.

Don Osborn
Bisharat.net
PanAfriL10n.org
Adé  319
09-06-2007 06:10 PM ET (US)
Rẹmi,

I see that you accounted for ọ and ṣ (that is s with sub-dot) but not ẹ. Also what about the nasal n sound?

Do you plan to submit your system to Unicode for inclusion in the standard?
maxwell@ldc.upenn.edu  318
09-06-2007 12:31 PM ET (US)
Quoting QT - Remi-Niyi Alaran <qtopic+15-KKgbRqJUAR8@quicktopic.com>:
> Research indicates that humans can only
> delineate between about 34 unique sounds.

What research is this? (Warning in advance: I don't think this is correct, although I suppose it depends on what you mean by "unique sounds".)

> The FaYe system...is phonetic, so that every sound in Yoruba is now
> represented by a unique character.

I suspect you mean phonemic, not phonetic. It is certainly possible to write Yoruba (or any other language) phonetically, using e.g. the IPA system, but that typically multiplies the number of "unique sounds" over what a phonemic system would require.

   Mike Maxwell
   CASL/ U MD

Remi-Niyi Alaran  317
09-06-2007 11:24 AM ET (US)
I would like to share a writing system that I have developed for the Yoruba language.
Called Yoruba FaYe [meaning "draw it so we can understand it"), it may also be extended to other languages. It comprises 48 characters = alphabet (38) and numerals (10). This is a smaller character set than the 62 symbols (26 capitals, 26 lower-case and 10 numerals) in the standard English language, which uses Roman script.

The present Yoruba script was developed by the missionary Ajayi Crowther in 1836. The Ajayi script features a standard Roman script with the addition of diacritical marks to reflect Yoruba tonality and accent, so the Ajayi script is larger than the 62 characters used in English. It has proved difficult to convert Yoruba into computer machine code because of the diacritical marks. Without the diacritical marks, it is very difficult for even accomplished Yoruba linguists to efficiently read or write Yoruba.

The FaYe system does away with diacritical marks altogether. It is phonetic, so that every sound in Yoruba is now represented by a unique character. Research indicates that humans can only delineate between about 34 unique sounds. Yoruba FaYe has a 38 character alphabet and it has a natural rhythm able to accommodate the sophistication of Yoruba's complex and rich oral literature.

It is hoped that the FaYe system helps in considerably improving the quantity and quality of literature available to document African Literary Heritage. You can view or download the FaYe system at www.ijebudrums.blogspot.com
BisharatNetPerson was signed in when posted  316
06-18-2007 11:02 PM ET (US)
FYI, the Teaching and Learning with Technology site of the Pennsylvania State University has a page on "Yoruba Accent Codes" at http://tlt.its.psu.edu/suggestions/interna...anguage/yoruba.html

Don Osborn
Bisharat.net
PanAfriL10n.org
BisharatNetPerson was signed in when posted  315
05-03-2007 01:41 PM ET (US)
FYI, a Yoruba dictionary online that hasn't been mentioned here is at http://freelang.net/dictionary/yoruba.html . I haven't looked at it yet (it requires downloading). Don
BisharatNetPerson was signed in when posted  314
05-03-2007 12:56 PM ET (US)
Belated thanks to Mike and Samuel for the feedback. I think that advanced technologies such as machine translation in the case of Yoruba need some long-term planning. Knowing the practical steps, such as the utility of parallel texts, and the paths, such as context-specific work, helps.

While I think it's important to discuss these things, one side effect is that people sometimes get the impression that it's a project actively underway.

Ultimately if such projects - from basic issues like corpora to advanced applications - are to really progress, I think there would need to be more training of Nigerians in Nigeria in aspects of language and ICT. Not sure how much of that is going on already, but given how multilingual Africa is, any strategy to advance use of ICT should consider this aspect. Just a thought.

Don Osborn
Bisharat.net
PanAfriL10n.org
Mike Maxwell  313
03-16-2007 06:49 PM ET (US)
QT - Dr. Samuel Olamijulo wrote:
> For the special attention of all who are working on Machine
> Translation for Yoruba,

I'm not sure that anyone is...

> I am sure you will find very useful this relatively recent
> Yoruba and English publication. Corresponding Yoruba and English
> translations are on each half of 1978 pages. The publishers must
> have it in digital format.
>
> Title: BIBELI MIMO – HOLY BIBLE King James Version

I neglected to mention in my earlier post that the bilingual text one uses for statistical MT should be in the same genre that one hopes to translate. So parallel news text if you hope to use MT for news, parallel text of computer manuals if you hope to use MT to translate computer manuals, etc. The Bible is indeed a source, and is available in print form (if not in electronic form) for most written languages. But it would probably work well as an MT source only if you wanted to translate other texts of a religious sort. (I think one might be able to extract other kinds of information, e.g. about morphology, from a Bible; but then Yoruba doesn't have much in the way of morphology.)
--
 Mike Maxwell
 maxwell@ldc.upenn.edu
Dr. Samuel Olamijulo  312
03-16-2007 05:38 AM ET (US)
 Yoruba-English Bilingual Texts on Same of 1978 Pages

For the special attention of all who are working on Machine Translation for Yoruba,

I am sure you will find very useful this relatively recent Yoruba and English publication. Corresponding Yoruba and English translations are on each half of 1978 pages. The publishers must have it in digital format.

Title: BIBELI MIMO – HOLY BIBLE King James Version

Produced year 2004 by
 
Bible Society of Nigeria

Tel: 234 1 545 7524 ; 587 6471

Website: http://www.biblesociety-nigeria.org/

Contacts: http://www.biblesociety-nigeria.org/nigeria-3.htm

18 Wharf Road,
Apapa
Lagos, Nigeria

Thank you.
Dr. Samuel Kayode Olamijulo
BisharatNetPerson was signed in when posted  311
03-15-2007 11:22 PM ET (US)
Dear Samuel, Andrew, all,

Yes, there was an effort last year that produced basic locale data for OpenOffice, which was then rewritten for submission to CLDR. This (along with locales for some other Nigerian languages including Hausa and Igbo) was facilitated by Alberto Escudero-Pascual and Louise Berthilson using the locale generator at http://www.it46.se/localegen/ . So there is something usable for basic localization.

CLDR, however, is on a different site. See http://unicode.org/cldr/ - there is some introductory information.

The way that CLDR is set up it is almost like a glossary of basic terms and it would help to have someone review those. See another view of the data at http://www.unicode.org/cldr/data/charts/summary/yo.html (this is different than the page Andrew gave, but with the same info in a different way).

Not sure if there are established Yoruba words for some of the territories or languages listed, or how those will help localize, say, a cellphone or browser or web-page. Nevertheless, the Yoruba language experts should have a look.

BTW, out of the total 2136 lines, the entire alphabet is listed on line 1629 (with auxiliary characters, such as used for loan words in line 1628).

Don Osborn
Bisharat.net
PanAfrican Localisation project
Mike Maxwell  310
03-15-2007 10:50 PM ET (US)
QT - BisharatNet wrote:
> The issue of machine translation (MT) for Yoruba and other
> African languages per /m303 and /m304 deserves a lot more
> attention. Good to see what LDC and a few other centers are
> doing, but one has the impression that if some serious resources
> were made available, it should be possible to have MT in a
> relatively short time. I am no expert, just looking at what is
> available now for language pairs like English Chinese, or
> Japanese, or Arabic. Unfortunately there is not a lot of ready
> material to work with as Mike suggests, even for a language like
> Yoruba that has some literature. Don
 
Not sure which Mike that was (probably not me), but the amount of material in Yoruba is precisely the problem for statistically-based MT. The necessary resource is machine-readable bilingual text. I was at the LDC until a year or so ago, and at least at that point there was hardly any *monolingual* Yoruba text, much less bilingual text, in electronic form. At that point, the plans at LDC for getting electronic bilingual text were to buy printed newspapers and key them in.
Primitive, to put it mildly.
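
To make "machine-readable bilingual text" a little more concrete, here is a minimal Python sketch that pairs up two texts segment by segment. Everything in it is an assumption made for illustration: the tab-separated "segment id, then text" layout, the function names, and the placeholder strings. It does not describe any existing Yoruba corpus or the LDC's tooling.

```python
def parse_keyed_lines(lines):
    """Map 'segment_id<TAB>text' lines to {segment_id: text} (hypothetical format)."""
    table = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or "\t" not in line:
            continue  # skip blank or malformed lines
        seg_id, text = line.split("\t", 1)
        table[seg_id.strip()] = text.strip()
    return table


def align(source_lines, target_lines):
    """Return (segment_id, source_text, target_text) for ids present in both texts."""
    src = parse_keyed_lines(source_lines)
    tgt = parse_keyed_lines(target_lines)
    return [(seg_id, src[seg_id], tgt[seg_id]) for seg_id in src if seg_id in tgt]


# Placeholder content only; real text would have to come from keyed-in
# or otherwise digitised sources.
yoruba = ["doc1:1\t<Yoruba segment 1>", "doc1:2\t<Yoruba segment 2>"]
english = ["doc1:1\t<English segment 1>", "doc1:3\t<English segment 3>"]

pairs = align(yoruba, english)
print(pairs)
```

Only segments present on both sides survive, which is why hand-keyed text with no reliable segment ids is so much harder to use than text that already comes in matched pairs.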

For Chinese, Japanese and Arabic, on the other hand, there is "tons" of bilingual text available on the Internet. And for many other major languages, there is at least monolingual text: Tagalog, Cebuano, other major Philippine languages; Swahili, and maybe Zulu (not sure about the other languages of South Africa, apart of course from Afrikaans); Hindi, Bengali, Tamil, many other Indic languages; Amharic and to a lesser extent Tigrinya; all European languages; Thai, Bahasa Indonesian, Vietnamese, Persian/ Farsi, and so forth. There are even some indigenous languages of the Americas that have some Internet text, such as Guarani and maybe some of the Quechua languages.

FWIW, I suspect Igbo and Hausa are in the same situation as Yoruba, although I haven't checked lately.

There is also rule-based MT. I believe the African Languages Technology Initiative, a group in Nigeria, was looking at that, but last I heard they didn't have any funding for MT.

For the record, here's what was available in the way of
computer-readable resources when I was at the LDC:
   http://lodl.ldc.upenn.edu/found.cgi?lan=YORUBA
The list of potential resources is a template we used; as you can see, it's virtually blank.

Here's a more recent survey, done about a year ago:
   http://lodl.ldc.upenn.edu/LCTL/Yoruba_harvest.html
--
 Mike Maxwell
 maxwell@ldc.upenn.edu
BisharatNetPerson was signed in when posted  309
03-15-2007 10:37 PM ET (US)
Andrew Cunningham's response:

Having a quick look at http://unicode.org/cldr/apps/survey?_=yo, it would appear that most of the work that needs to be done, is adding terminology, e.g. country names, language names, currency names, calendar/time terminology, etc.

The basic character requirements have been done. Not sure about number formats, etc.

Andrew
BisharatNetPerson was signed in when posted  308
03-15-2007 10:33 PM ET (US)
The Common Locale Data Repository (CLDR) is accepting new locale data and, in the case of Yoruba which has a locale, additional information and corrections. See http://lists.kabissa.org/lists/archives/pu...forum/msg00581.html . Samuel Olamijulo mailed the following appeal out to several people and lists. I repost it here for the record. Don

Respected Yoruba People and friends everywhere, good evening.
 
Please read the request below. The time is short but the request appears of tremendous importance for future easier more effective Internet communication in Yoruba than at present.
 
I earnestly BEG Yoruba people with Language, IT and other relevant competencies to urgently and openly discuss and cooperate on this one for Harmonious Competent input.
 
I am a Yoruba Pediatrician and I BEG more competent volunteers to coordinate the discussion and submission of the best input possible on Yoruba.
 
Thank you,
Dr. Samuel Kayode Olamijulo
BisharatNetPerson was signed in when posted  307
03-15-2007 10:27 PM ET (US)
Thanks, Tom, for the info on the Mac keyboard /m305. I posted it on the Yoruba language profile at http://www.panafril10n.org/wikidoc/pmwiki.php/PanAfrLoc/Yoruba (see 7.2). Don
BisharatNetPerson was signed in when posted  306
03-15-2007 10:25 PM ET (US)
The issue of machine translation (MT) for Yoruba and other African languages per /m303 and /m304 deserves a lot more attention. Good to see what LDC and a few other centers are doing, but one has the impression that if some serious resources were made available, it should be possible to have MT in a relatively short time. I am no expert, just looking at what is available now for language pairs like English <-> Chinese, or Japanese, or Arabic. Unfortunately there is not a lot of ready material to work with as Mike suggests, even for a language like Yoruba that has some literature. Don
Tom Gewecke  305
02-21-2007 12:07 PM ET (US)
In case anyone needs a Yoruba keyboard for Mac OS X, I have an experimental one available. You can find the link here:

http://m10lmac.blogspot.com/2007/02/typing-yoruba.html

Suggestions for improvements, etc. welcome.
Mike Maxwell  304
02-17-2007 08:14 AM ET (US)
QT - kúnlé wrote:
> Sir, I don't know If I can get any software or machine
> translator that can translate Yoruba language.

I very much doubt that this exists. There was hope for a Yoruba MT project at http://www.alt-i.org/projects.htm several years ago, but their website says:

     We were scheduled to start a project on machine
     translation of Yoruba into English and vice versa
     during 2004. However, We were not able to commence
     this project due to funding constraints. Efforts
     are still on to raise funds for this project.

The LDC has two pages of "resources" for Yoruba, at
    http://lodl.ldc.upenn.edu/LCTL/Yoruba_harvest.html
and
    http://lodl.ldc.upenn.edu/found.cgi?lan=YORUBA
but nothing for MT (and the latter is virtually empty). According to these listings, there are virtually no Yoruba-English bilingual texts on the web that could be fed into a statistical MT machine.

I would be happy to be proven wrong on this pessimistic appraisal!
--
 Mike Maxwell
 maxwell@ldc.upenn.edu
kúnlé  303
02-17-2007 04:45 AM ET (US)
Sir, I don't know if I can get any software or machine translator that can translate the Yoruba language. I need it for my work: I have a thick volume of English articles to translate for my community. Thanks.
BisharatNetPerson was signed in when posted  302
02-10-2007 12:50 PM ET (US)
The PanAfriL10n.org page on Yoruba has been updated. See http://www.panafril10n.org/wikidoc/pmwiki.php/PanAfrLoc/Yoruba (corrections, more updates are invited).

Don Osborn
Bisharat.net
PanAfrican Localisation project
BisharatNetPerson was signed in when posted  301
02-10-2007 12:42 PM ET (US)
Dear Samuel, I hope your efforts work out. One thing that I have noticed among people in the US and Africa is the tendency to underestimate the potential for children to learn two (or sometimes more) languages fluently. Some parents have the idea that learning more than one language means learning less well, but that is not true. Bilingualism can be a long-term advantage for children (and best to start with the language(s) that the parents are fluent in).

Don
Dr. Samuel Olamijulo  300
01-29-2007 06:06 AM ET (US)
 
Yoruba Home Training - Web Help List by Olamijulo S.K.

http://www.hopeafricaepublisher.com/yohtlist0107.html
 
From this year 2007 onwards, it is a cardinal imperative that Yorubas unite in deliberately encouraging and assisting one another to promote the speaking, reading and writing of Yoruba in Yoruba People homes worldwide, irrespective of what other languages Yorubas are able to speak, read or write. We require no further Government or other official intervention to personally get on with this widely neglected personal and family responsibility.

Following are some suggestions and helpful Internet links for all who seriously desire to Read, Write, Speak, Sing, Watch, Learn, Think and Dream in Yoruba. Please encourage and assist all others worldwide who have, or should have, the desire to do likewise.

WE SHALL CONSTANTLY PRAY AND LABOR FOR PEACE, UNITY AND PROGRESS AMONG ALL YORUBAS FROM WHEREVER WE ARE LOCATED WORLDWIDE.

Read more about:

P2. YORUBA INTERNET RADIO

P3. YORUBA FOOD

P4. YORUBA CULTURE QUICK REMINDERS

P5. YORUBA ON COMPUTERS AND INTERNET

P6. YORUBA DICTIONARIES

P7. YORUBA PROVERBS

P8. YORUBA PEOPLE PUBLICATIONS

P9. USEFUL WEBSITES

P10. YORUBA CURRENT WORLD POPULATION ESTIMATE
OVER 100 MILLION

P11. YORUBA E-GROUPS

P12. BIBELI YORUBA ATOKA-YORUBA REFERENCE BIBLE ONLINE

P13. YORUBA ART ON THE WEB

P14. UNIVERSITIES WITH NET ACCESSIBLE YORUBA PROGRAMS

P15. RELATED LINKS

At: Yoruba Home Training - Web Help List by Olamijulo S.K.

http://www.hopeafricaepublisher.com/yohtlist0107.html
 
Thank you.
From Dr. Samuel Kayode Olamijulo
BisharatNetPerson was signed in when posted  299
01-24-2007 09:12 PM ET (US)
Dear Isaac, Sorry for the slow reply. I hope you looked through the list of messages which has some keyboard information (Konyin production keyboard and various keyboard drivers for Tavultesoft Keyman, MSKLC and others).

See also:
* http://www.panafril10n.org/wikidoc/pmwiki.php/PanAfrLoc/Yoruba
* http://www.panafril10n.org/wikidoc/pmwiki.php/PanAfrLoc/Tools

Perhaps others will have some further suggestions.

Don Osborn
Bisharat.net
PanAfrican Localisation project
Isaac Jadesimi  298
01-03-2007 01:00 PM ET (US)
Dear Sir,
I thank you very much indeed for your
e-mail message.

As a matter of the utmost urgency,
 I need, very desperately, (almost instantly!!!), your advice, suggestions, etc., as to how one could gain access to a web site/source, etc., in connection with the setting up of the key-board, in terms of the Yoruba language------in a straightforward format (involving the shedding of tears, as little as possible!!!!)-------hitherto, one's experience, relative to the above-mentioned key board, has been really very frustrating------to say the least!!!!

I would greatly appreciate receiving your advice, suggestions, etc., as soon as you are able to do so.

Thanking you in advance for your response to my request, as outlined above, Sincerely,
Isaac Jadesimi
E-mail address: jades@tamcotec.com
BisharatNetPerson was signed in when posted  297
01-03-2007 10:32 AM ET (US)
Happy New Year 2007! Hope the holiday season was good (whichever holidays you observed)!

I've opened up a new "wikigroup" on the PanAfriL10n.org website/wiki for Nigerian localisation: http://www.panafril10n.org/wikidoc/pmwiki.php/NG-L10n/HomePage

The object is to provide a more flexible space for Nigerian localisers to list their contact details, websites, and projects. This is in some ways an online "virtual plaza/market," interactive, with the added advantage that it links with the larger PanAfrican Localisation wiki and other country-specific "wikigroups."

Among other things, I am trying to set up an RSS feed from this forum to the new wikigroup.

Don Osborn
Bisharat.net
PanAfrican Localisation project
Samuel OlamijuloPerson was signed in when posted  296
11-16-2006 08:41 AM ET (US)
Read and Write Good Yoruba on the Internet-11.16.06- by Olamijulo S.K.
 
These are some general suggestions from a student in the field.

1. Use as modern a computer as you can possibly get to use.

2. Use a recent operating system like Windows XP on your computer.

3. Get Yoruba-capable, Unicode-compatible fonts like Tahoma, or Charis SIL, which can be downloaded free from
   http://scripts.sil.org/cms/scripts/page.ph...=CharisSIL_download

4. Get one of the many Yoruba keyboards now available, like the ABD Yoruba Keyboard, downloadable for free from
   http://www.africanportal.net/Publications/ABD/mktut1.htm
   OR the ALT-I Yoruba Keyboard Layout from
   http://www.alt-i.org/projects.htm
   OR the keyboard from the Learn Yoruba website at
   http://www.learnyoruba.com/keyboard.htm
   OR buy a Yoruba-capable physical keyboard like KONYIN from
   http://www.konyin.com/?page=home
   etc.

   Read and follow the user manual or tutorial for each font or keyboard of your choice.

5. Start and PRACTICE FREQUENTLY to read and write good Yoruba on suitable computers. There is no other way at present to become proficient.

6. You can easily save and print your writings from your computer hard disc, floppy, CD or other removable storage device.

7. For Yoruba Internet text communication, there are unfortunately still persistent Yoruba display issues with many e-mail programs, e-groups and multilingual non-Yoruba websites. Therefore, if you want to circumvent these issues, you can e-mail your Yoruba texts in Adobe Acrobat format as e-mail attachments. That guarantees that every recipient with the free Adobe Reader can read them clearly.

8. Please contact product manufacturers about technical issues with their products, or relevant experts closest to you who are familiar with your setup, for advice on specific operational challenges at your end.
 
Kind wishes,
 
From Dr. Samuel Kayode Olamijulo
Ade Oyegbola  295
11-07-2006 10:46 PM ET (US)
How to make sure your Yoruba texts show correctly, as typed, in most emails using Microsoft Word

After typing your message in a Word document:
Go to Tools > Language > Set Language - select any language but English (there is an option to select Yorùbá also)

After that now do the following:

Go to > Tools > Options
Click the - Save tab
Check the - Embed TrueType fonts
Check the - Embed Characters in use only
Check - Embed smart tags
Check - Save smart tags as XML properties in Web pages.
Click - OK

then:
Click > File > Send to > Mail Recipient

Adé Oyegbọla
Dr. Samuel Olamijulo  294
11-07-2006 09:16 PM ET (US)
Yoruba Alphabets Online with Tone Marks and Sub dots by Olamijulo S.K.

Please Compare pdf format at URL:

http://www.hopeafricaepublisher.com/abdyo.110906.pdf

Free Font : Charis SIL downloaded from

http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&item_id=CharisSIL_download

Free ABD Yoruba Keyboard with usage tutorial downloaded from

http://www.africanportal.net/Publications/ABD/mktut1.htm

Written MS Word to Adobe PDF format

A a À à Á á B b D d E e È è É é Ẹ ẹ Ẹ̀ ẹ̀ Ẹ́ ẹ́

F f G g Gb gb Ì ì Í í H h J j K k L l M m

N n Ò ò Ó ó Ọ ọ Ọ̀ ọ̀ Ọ́ ọ́ P p R r S s Ṣ ṣ

T t Ù ù Ú ú W w Y y

From Dr. Samuel Kayode Olamijulo
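
For anyone curious how the letters in a list like this are stored, here is a minimal Python sketch using only the standard `unicodedata` module. The helper name `describe` and the choice of sample letters are assumptions for illustration. It prints the code points of each letter after NFC normalization, which shows that some Yoruba letters (for example e with subdot and grave) have no single precomposed code point and rely on a combining mark, so correct display depends on the font positioning that mark properly.

```python
import unicodedata


def describe(letter):
    """Return the NFC form of `letter` as a list of (code point, Unicode name)."""
    nfc = unicodedata.normalize("NFC", letter)
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in nfc]


# Sample letters written as base letter + combining marks:
# U+0300 grave, U+0301 acute, U+0323 dot below.
samples = ["e\u0300", "e\u0323", "e\u0323\u0300", "o\u0323\u0301", "s\u0323"]

for letter in samples:
    print(unicodedata.normalize("NFC", letter), describe(letter))
```

Letters that come back as a single entry have a precomposed form; those that come back as two entries are a base letter plus a combining tone mark.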
Ade Oyegbola  293
11-07-2006 07:01 PM ET (US)
Edited by author 11-07-2006 08:12 PM
FYI… Microsoft Outlook: The trick to get your accented letters and tonal marks to show correctly even in your email is to do the following:

After typing your message… click the Edit > Select All

Then click Tools > Language and change the Language setting from English (XX) to another language, you can also select Yorùbá.

This will, in most cases, disable the ISO default setting and use the UTF-8 setting.

This process should work about 85% of the time. Your email will arrive with all the text intact at the receiver's end.
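
Why the ISO versus UTF-8 setting matters can be shown with a short Python sketch. It assumes the "ISO default" is ISO 8859-1 (Latin-1), which is a guess about the mail client's configuration, and the helper name `can_encode` is made up for the example. The sample string is a name with an acute accent and a subdotted o.

```python
# "Adé Oyegbọla": é (U+00E9) exists in ISO 8859-1, ọ (U+1ECD) does not.
text = "Ad\u00e9 Oyegb\u1ecdla"


def can_encode(s, encoding):
    """Return True if every character of `s` can be represented in `encoding`."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


for enc in ("iso-8859-1", "utf-8"):
    print(enc, can_encode(text, enc))

# A lossy fallback: characters the encoding lacks are replaced with "?".
lossy = text.encode("iso-8859-1", errors="replace").decode("iso-8859-1")
print(lossy)
```

The accented é survives either way, but the subdotted letter is lost under the ISO encoding, which matches the kind of corruption people report in received mail.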

Adé Oyegbọla
Mike Maxwell  292
11-07-2006 06:47 PM ET (US)
QT - Dr. Samuel Olamijulo wrote:
> At the very minimum, usage of acute and grave tone signs on
> A a; E e; E e with subdots ; I i ; O o ; O o with subdots ; U u
> in addition to S s with subdots are ALL essential to correct
> reading of written Yoruba.

I'm sure they are standard, but often one (particularly a native speaker) can read a text perfectly well even if not all the phonemic contrasts are indicated. English is a good example of this: we mark maybe half of the vowel contrasts, and those we do mark are not marked consistently. Of course I wouldn't wish our orthography on anyone--any language where spelling bees are taken to be an indication of intelligence has its priorities in the wrong place :-). Many Semitic languages do without marking vowels entirely, although from what I hear this does not make for fluent reading, and it could even contribute to a lower literacy rate.

Other languages do without marking phonemic tone, nasalization, etc., particularly where these do not have a large "functional load"--that is, where there aren't many minimal pairs for those particular contrasts. In these cases, the reader easily fills in the missing phonological information based on context. (If it isn't easy for reasonably fluent readers to fill in that missing information, this is a good indication that the functional load of the information is higher than you thought.)

Of course I'm not advocating dropping tone or subdots for Yoruba, nor am I in a position to advocate this. It's just that there's seldom a "right" and a "wrong" to orthographies; and when you try to choose among relatively good orthographies, it's usually possible to find advantages and disadvantages to each choice. A history of use of a particular orthography usually trumps the other considerations (which is why English spelling is unlikely to change in the foreseeable future).
--
 Mike Maxwell
 maxwell@ldc.upenn.edu
Mike Maxwell  291
11-07-2006 07:42 AM ET (US)
QT - Andrew wrote:
> as Adé indicated, there is a greater shift to Unicode and UTF-8
> with web services and web applications. GMail uses UTF-8, the
> new Yahoo Mail beta also uses UTF-8. The weak point is that
> these sites tend to opt for using windows core fonts for
> displaying text.

I'm a little confused here--probably my lack of knowledge about CSS etc.
You're saying these sites specify UTF-8 _and_ they specify a font?
I had thought that the font to be used with a language was specified by the browser. Firefox, for example has in its Options dialog, under the Content tab, a button called 'Advanced' which brings up a second dialog box associating a language with several fonts. (It should really associate a script with a font, but the settings are expressed in Firefox at least as if a language had only one script. I imagine that's because users would be confused if it referred instead to the script.)
But as I think about it, it's unclear what they do with character encodings. Unless the browser internally maps the ISO 8859 etc. encodings to Unicode, the font has to differ depending on which encoding it is, doesn't it? (The dialog box does have a check box to allow web pages to override the font choice.)
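
On the encoding question, the idea of mapping every incoming encoding to Unicode first, so that font choice never depends on the page's encoding, can be sketched in a few lines of Python. This is only an illustration of the principle, not a claim about what any particular browser does internally; the byte strings are hand-written examples.

```python
# The same three characters, "Adé", serialised in two different encodings.
latin1_bytes = b"Ad\xe9"      # ISO 8859-1: é is the single byte 0xE9
utf8_bytes = b"Ad\xc3\xa9"    # UTF-8: é is the two bytes 0xC3 0xA9

a = latin1_bytes.decode("iso-8859-1")
b = utf8_bytes.decode("utf-8")

# After decoding, both are the same sequence of Unicode code points.
print(a == b, [f"U+{ord(ch):04X}" for ch in a])

# Decoding with the wrong label is what produces garbled text.
wrong = utf8_bytes.decode("iso-8859-1")
print(wrong)
```

Once text is in this common form, a single Unicode font can serve it regardless of how the bytes arrived; the encoding label only has to be right at the decoding step.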

So can CSSs also override the language->font mapping established in the browser?

And if I un-check the box in Firefox's options for allowing pages to choose their own fonts, does that solve the "weak point" you mention above? (While introducing other problems, I would imagine--like the bizarre proprietary encodings used in India.)
--
 Mike Maxwell
 maxwell@ldc.upenn.edu
BisharatNetPerson was signed in when posted  290
11-05-2006 09:05 PM ET (US)
Thanks Mike, Adé, Samuel and Andrew,

All of this is helpful. I confess I have not kept up with the changes at SIL (I thought some of the fonts were being sold). I will try to incorporate this information on the http://www.panafril10n.org/wikidoc/pmwiki.php/PanAfrLoc/Yoruba wiki page.

I would also like to get an RSS window for this group either there or on the new (you heard about it here first) wikigroup for localization in Nigeria at http://www.panafril10n.org/wikidoc/pmwiki.php/Ng/HomePage . This will be a more open format as kind of a wiki-gateway to the current and planned L10N activities of the Nigerian "localisation communities" content than the existing Nigeria profile at http://www.panafril10n.org/wikidoc/pmwiki.php/PanAfrLoc/Nigeria . More info soon - but please note that there is a lot of flexibility on the appearance of the page with the PmWiki software, and each country page can have its own look & working language, but retain the common search and information base with everyone else.

On the topic of the fonts etc., we're able to do so much more now than when this group started 4 1/2 years ago. I'm looking forward to the next 4-5 years!

Don Osborn
Bisharat.net
Andrew  289
11-05-2006 05:45 PM ET (US)
Dear Dr. Olamijulo,

As Adé indicated, there is a greater shift to Unicode and UTF-8 with web services and web applications. GMail uses UTF-8, and the new Yahoo Mail beta also uses UTF-8. The weak point is that these sites tend to opt for Windows core fonts for displaying text, which creates display problems in Internet Explorer and ugly display issues in Firefox and Opera.

In Firefox, I tend to use the Stylish extension and a site specific set of css rules to override the fonts chosen by Google and Yahoo.

A similar approach can be used with any unicode based web service.

Current web internationalization best practice and web accessibility best practice require web pages to have languages appropriately marked up. For those websites that do have language tagging, it's possible to develop user CSS sheets in Firefox and Opera that will use your preferred fonts on a language-by-language basis.

But as I said, this approach will only work with sites that actually bother to use language tagging.

For instance in Stylish I'd have a global rule:
:lang(yo) {font-family: "Charis SIL" !important;}
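
Since a rule like this only helps on pages that carry language tags, it can be useful to check whether a page has any. Here is a minimal Python sketch using the standard library's `html.parser`; the class and function names and the sample markup are assumptions for illustration, and it only looks at plain `lang` attributes (not `xml:lang` or HTTP headers).

```python
from html.parser import HTMLParser


class LangCollector(HTMLParser):
    """Collect (tag, language) pairs for every element with a lang attribute."""

    def __init__(self):
        super().__init__()
        self.langs = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "lang" and value:
                self.langs.append((tag, value))


def declared_languages(html):
    """Return the list of (tag, lang) pairs found in an HTML string."""
    parser = LangCollector()
    parser.feed(html)
    return parser.langs


# Hypothetical sample page: one paragraph tagged as Yoruba, one untagged.
sample = '<html lang="en"><body><p lang="yo">Yor\u00f9b\u00e1</p><p>untagged</p></body></html>'
print(declared_languages(sample))
```

An empty result for a page means a `:lang(yo)` rule has nothing to match there, and a site-specific rule would be needed instead.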

Andrew
Andrew  288
11-05-2006 05:35 PM ET (US)
Don,

Doulos SIL and Charis SIL are under an OFL licence. There are moves to make this a standard license for distributing open source fonts. SIL also have a new font under development (Andika). Other fonts include James Kass' Code2000, and Chris Harvey's fonts. There are a few others, possibly LeedsUni 2.0 when it's released. Other open source fonts that are building in OpenType tables include the DejaVu family, esp. DejaVu Sans. I don't use it much and haven't tested it with Yoruba, but it should work; if not, Denis and others would probably be willing to update it to do so.

For commercial fonts, there are the core fonts on Windows Vista.

As you indicate, Don, there is a critical need for additional professionally developed fonts. Locally here, we were looking at getting Tiro Typeworks to develop a family of fonts to our specifications; unfortunately the funding source fell through. We are still trying to secure funding to go ahead with that project. There are design issues that would need to be thrashed out, i.e. design harmony between scripts: since our primary projects tend to be multilingual and multiscript, it would be beneficial to have a Latin typeface that is design-wise compatible with typeface designs for other scripts used in Africa.

At the moment my preference is for Charis SIL, the only font in this list that has a family: regular, bold, italic and bold-italic.

Don, I'll send some links to you offline about opentype tables wrt Latin script and combining diacritics.

Andrew
Dr. Samuel Olamijulo  287
11-04-2006 04:13 PM ET (US)
Yoruba Language Tone Marks and Subdots
Dear Dr. Don Osborn, greetings.
As a very interested, non-technical native Yoruba reader and writer, I suggest you access:
 
Yoruba Reference Bible Online - Bibeli Yoruba Atoka
 
at http://www.africanportal.net/ABO/BibeliAtoka/
 
for a more helpful view of "tone marks and subdots" usage in Yoruba language texts than what you have observed at Wikipedia.
 
At the very minimum, usage of acute and grave tone signs on
 
A a; E e; E e with subdots ; I i ; O o ; O o with subdots ; U u in addition to S s with subdots are ALL essential to correct reading of written Yoruba.
 
Written Yoruba display on the skilled writer's desktop ; sent on the net in PDF format and on purpose built webpages can be correct and beautiful.
 
Until now, display of the Yoruba-specific letters listed above by email programs and at e-groups like MSN, Yahoo, Google and even MS Outlook, even when these are Unicode compatible or aware, remains irritatingly unsatisfactory.
 
Thank you.

  From Dr. Samuel Kayode Olamijulo
Adé Oyegbọla  286
11-04-2006 02:49 PM ET (US)
Hi Don /m283, /m284

Don, I actually had someone from my office use our keyboard for Nigeria (KONYIN Nigeria Multilingual Keyboard) to add tonal marks and sub dots to the Yorùbá page on wikipedia a while ago. But some bone-head in Nigeria that is part of a group advocating non tonal mark usage went and edited out the tonal marks. This is the problem with Yorùbá intellects.

SIL fonts are free and the best font for Yorùbá is Doulos SIL font or another free font called Junicode.

I think we have flogged the technical issues to death in prior postings to this site, so it is not worth going over them again. Anybody who wants to catch up should go back and read prior postings, especially some by Andrew.

There is no need to reinvent the wheel: all the tonal marks and sub-dots needed to type correct Yorùbá orthography are readily available, some for free and others for sale.

I noticed recently that more and more of the servers used by Yahoo and AOL are now Unicode UTF compliant, so more of the emails using these servers are being received without corrupted text. But it is going to be a long while before UTF becomes common.

One more thing, the letters with sub-dot are considered as part of the standard alphabets in Yorùbá.

Adé Oyegbọla
Mike Maxwell  285
11-04-2006 09:39 AM ET (US)
QT - BisharatNet wrote:
> Is there a list of fonts that do have this? I assume Charis SIL
> is one (but if memory serves, that's not a free one).

Don't know whether it has what you want, but it (like nearly everything else SIL makes) is free:

http://scripts.sil.org/cms/scripts/page.ph...em_id=CharisSILfont
Specifically, from SIL's license agreement for this font (and many other SIL fonts):
    The SIL Open Font License (OFL) is a free and
    open source license specifically designed for
    fonts and related software based on our experience
    in font design and linguistic software engineering.
--
 Mike Maxwell
 maxwell@ldc.upenn.edu
BisharatNetPerson was signed in when posted  284
11-04-2006 09:16 AM ET (US)
Looking at the Yoruba Wikipedia at http://yo.wikipedia.org/ (which has a small but growing set of articles) I noted that there was more attention to tone marks in some text than to subdots. This is apparently due to lack of a way of adding the subdots in the editing bar (and the users' not having the facility on their own systems to compose with them and then cut and paste in the editing window). But it does raise a question: Are tone marks more important than the subdots?

In some correspondence it has been suggested that the tone marks are not as essential as the subdots on selected letters. So I ask just to get a clearer idea of priorities - or confirmation that they all are equally valuable.

Another more complicated question would require a technical analysis of Yoruba texts and meaning. Are there some tone-vowel combinations that tend to be more critical for meaning differences than others? That is, that the "a" or "ọ" (just to take 2 random examples) tend to carry more meaningfully important tone distinctions than other vowels?

Don Osborn
Bisharat.net
BisharatNetPerson was signed in when posted  283
11-04-2006 09:01 AM ET (US)
Hi Andrew, Re /m281 -

Is there a brief description of "OpenType tables that support the use of combining diacritics"?

Is there a list of fonts that do have this? I assume Charis SIL is one (but if memory serves, that's not a free one).

By the way, since in the use of languages like Yoruba we are faced with the problems of fewness of fonts and licensing for a number of those, let me say for the record that if a large philanthropic institution wanted to facilitate bridging the digital divide, a bold (and probably not too expensive in their terms) move to "buy out" the licenses of quality extended Latin fonts to make them freely available could have a longterm benefit. It would remove a persistent hurdle for computing in languages that use extended scripts.

Don
Don Osborn
Bisharat.net
BisharatNetPerson was signed in when posted  282
10-26-2006 07:21 PM ET (US)
Kọ̀ǹpútà Alágbèéká fún Ọmọ Kọ̀ọ̀kan
http://laptop.org/index.yo.html
Andrew  281
10-09-2006 11:57 PM ET (US)
Re /m280

Neither Arial Unicode MS nor Gentium has OpenType tables that support the use of combining diacritics.

There are a very limited number of fonts capable of supporting African languages that require the use of combining diacritics and stacking of combining diacritics.

re /m277 Samuel's test looks like it's a mix: UTF-8 text plus NCRs and a couple of invalid character sequences displaying as iso-8859-1.

Whereas in /m278 the text used in the test is iso-8859-1 (probably actually windows-1252) with NCRs and displays correctly. The default font used in the forum can't position the combining diacritics correctly.
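The mixed display described for /m277 can be reproduced in a few lines. This is only a minimal sketch in Python (not something used in the thread): it assumes the text was stored as UTF-8 and then read back as windows-1252, and the sample strings are chosen purely for illustration.

```python
import html

# A few Yoruba letters typed as real Unicode text, then stored as UTF-8 bytes.
original = "À à Ẹ ẹ"
utf8_bytes = original.encode("utf-8")

# A reader that wrongly assumes windows-1252 shows two or three junk
# characters for every non-ASCII letter. errors="replace" covers the few
# byte values that windows-1252 leaves undefined.
misread = utf8_bytes.decode("cp1252", errors="replace")

# Decoding with the correct charset restores the text.
correct = utf8_bytes.decode("utf-8")

# A numeric character reference (NCR) is plain ASCII, so it survives any
# 8-bit charset; it only becomes a character when the markup is interpreted.
ncr_text = html.unescape("Oyegb&#7885;la")

print(misread)
print(correct)
print(ncr_text)
```

The first line printed begins with the same "Ã€" sequence that appears in /m277, which is why a page mixing raw UTF-8 and NCRs can look partly right and partly corrupted under an 8-bit charset.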

My web browser display shows the characters correctly. I'm using Firefox with the Stylish extension to override the CSS on the pages on the www.quicktopic.com domain forcing the pages to use Charis SIL.

Andrew
BisharatNetPerson was signed in when posted  280
10-09-2006 08:46 PM ET (US)
Edited by author 10-09-2006 09:10 PM
Thank you Samuel and Mike,

Arial Unicode MS is apparently not being updated. The Gentium font is a good one.

The issue of "rendering" or positioning of diacritics is illustrated by Samuel's text in /m278 (Samuel, this comes through legibly on this side, reading on Firefox 1.5.0.7). In this case, the precomposed accented letters are used with the dots-under added, instead of using the precomposed dot-under characters with tone marks (accents) added. But the positioning of the dots is not quite accurate.
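To make the two composition orders concrete, here is a small sketch using Python's standard unicodedata module (the choice of the letter "e" and the helper name codepoints are just for illustration). It shows that Unicode normalization form NFC turns "accented letter plus dot-under" into "dot-under letter plus tone mark".

```python
import unicodedata

ACUTE = "\u0301"      # COMBINING ACUTE ACCENT
DOT_BELOW = "\u0323"  # COMBINING DOT BELOW

# Sequence A: precomposed accented letter with the dot-under added afterwards.
seq_a = "\u00e9" + DOT_BELOW   # e-acute + combining dot below
# Sequence B: precomposed dot-under letter with the tone mark added afterwards.
seq_b = "\u1eb9" + ACUTE       # e-dot-below + combining acute


def codepoints(text):
    """Return the code points of text in U+XXXX notation (illustrative helper)."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


for seq in (seq_a, seq_b):
    nfc = unicodedata.normalize("NFC", seq)
    print(codepoints(seq), "->", codepoints(nfc))

# Both inputs normalize to the same string, which is sequence B.
print(unicodedata.normalize("NFC", seq_a) == seq_b)
```

Both sequences are canonically equivalent, so a renderer should in principle draw them identically; the positioning differences seen in practice come from the font and rendering engine.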

Newer software is supposed to handle this better. Microsoft apparently has or will have a "Uniscribe" method of rendering such diacritics correctly (as Andrew has mentioned in the past). Not sure about OpenOffice.

MS also are / will be expanding many of their Latin fonts to include the necessary characters for Yoruba and other languages.

Fonts are still a problem for a lot of languages, but I expect that this will be less the case in the future. We do need to encourage new Latin fonts to include the full set of Latin extended ranges to facilitate multilingual computing.

As for positioning, this hopefully is in the process of resolution. Unfortunately, older systems will continue to have problems with this (as far as I know).

Hope this helps.

Don Osborn
Bisharat.net
Mike Maxwell  279
10-06-2006 07:38 PM ET (US)
QT - BisharatNet wrote:
> Russell Southwood of Balancing Act forwarded this question from
> Ramsey:
>
> Do you know where I can find a yoruba font working on Mac OSX?
>
> My response:
>
> The short answer is that you will need a Unicode font that
> includes the "Latin Extended Additional" and "Combining
> Diacritics." The diacritics may be used for the tone marks if
> you are including those.

I had a bad experience a couple years ago with trying to display Yoruba characters with the Arial Unicode MS font, a fairly large Unicode font from Microsoft. (Don't know whether it can be used on a Mac.) For the record, the placement of the combining acute accent on the dotted upper case E and O was lousy (the accents were through, not over, the vowels). The placement of combining dot under the acute accented upper case E and O was better, but not ideal; I believe it also violates the Unicode normalization standard. The rest of the characters looked fine, at least to my eye.

In some languages (like Spanish), one doesn't normally display accents on upper case vowels. (Ironically, I suppose that might be because placing an accent over an upper case vowel was difficult on a typewriter!) I don't know whether one normally displays accents on upper case vowels in Yoruba (when one uses accents in Yoruba at all, that is).
--
 Mike Maxwell
 maxwell@ldc.upenn.edu
Dr. Samuel Olamijulo  278
10-06-2006 11:36 AM ET (US)
Internet Written Yoruba Practice 10.07.06

Free ABD Yoruba Keyboard from
http://www.africanportal.net/Publications/ABD/mktut1.htm

+ Free Gentium Font

À à Á á È è É é Ẹ ẹ Ẹ́ ẹ́ Ẹ̀ ẹ̀ Ì ì Í í

Ò ò Ó ó Ọ ọ Ọ̀ ọ̀ Ọ́ ọ́ Ù ù Ú ú Ṣ ṣ

Looks perfect on MS Word 2003 at my end.

How about your end?

Dr. Samuel Kayode Olamijulo
Dr. Samuel Olamijulo  277
10-06-2006 11:15 AM ET (US)
Internet Written Yoruba Practice by Olamijulo S.K.- 10.07.06
   
  Free ABD Yoruba Keyboard from
   
  http://www.africanportal.net/Publications/ABD/mktut1.htm
   
  + Free Gentium Font
   
  Ã€ à à á È è É é Ẹ ẹ Ẹ́ ẹ́ Ẹ̀ ẹ̀ ÃŒ ì à í
  Ã’ ò Ó ó Ọ ọ Ã’̣ ọ̀ Ọ́ ọ́ Ù ù Ú ú Ṣ ṣ
  Looks perfect on MS Word 2003 at my end.
   
   How about your end ?
   
  Dr. Samuel Olamijulo
BisharatNetPerson was signed in when posted  276
10-06-2006 09:40 AM ET (US)
Russell Southwood of Balancing Act forwarded this question from Ramsey:

Do you know where I can find a yoruba font working on Mac OSX?

My response:

The short answer is that you will need a Unicode font that includes the "Latin Extended Additional" and "Combining Diacritics." The diacritics may be used for the tone marks if you are including those.
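As a rough illustration of why those ranges matter, the sketch below (Python; the block table, the function names and the sample string are assumptions made for this example, though the range boundaries themselves are the standard Unicode block ranges) reports which Unicode blocks a piece of Yoruba text draws on.

```python
# Start, end, and name of a few Unicode blocks relevant to Yoruba text.
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
    (0x0300, 0x036F, "Combining Diacritical Marks"),
    (0x1E00, 0x1EFF, "Latin Extended Additional"),
]


def block_of(ch):
    """Return the name of the block containing ch, or 'other'."""
    cp = ord(ch)
    for start, end, name in BLOCKS:
        if start <= cp <= end:
            return name
    return "other"


def blocks_needed(text):
    """Return the set of block names used by the non-space characters of text."""
    return {block_of(ch) for ch in text if not ch.isspace()}


# Dotted letters with combining tone marks, plus two precomposed accented vowels.
sample = "\u1eb8\u0301 \u1ecd\u0300 \u1e63 \u00e0 \u00fa"
print(sorted(blocks_needed(sample)))
```

A font that lacks glyphs for any block in the result will fall back to another font or show missing-glyph boxes for those characters.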

A good directory for Unicode fonts is at http://www.alanwood.net/unicode/fonts.html (you probably want to look under Mac OSX, and Large fonts, Latin fonts, or Pan-European fonts for ones that include the above ranges). Gentium is a good choice as it has Mac, Windows, and Linux versions.

Unicode fonts are becoming more the norm, and in the future more of them will have more complete Latin ranges to accommodate languages like Yoruba. I would avoid old 8-bit fonts unless your work will never be exchanged with others in digital format or posted on the internet.

Maybe others will have more suggestions?

Don Osborn
Bisharat.net
BisharatNetPerson was signed in when posted  275
10-06-2006 08:59 AM ET (US)
Hi Natalie, Sorry about the slow reply to your message /m272. There is a map at http://www.molli.org.uk/yoruba/1_about_yoruba/maps.htm but it does not indicate specific communities. I think that most communities in the indicated areas of Nigeria and perhaps Benin are Yorubaphone, though this region, like many other parts of Africa, is probably multilingual to varying degrees.

There is a page with general information on Yoruba for localization purposes at http://www.panafril10n.org/wikidoc/pmwiki.php/PanAfrLoc/Yoruba . Maps are of interest for localizing ICT, especially for planning and marketing localized software and targeting internet content in Yoruba.

Don Osborn
Bisharat.net
BisharatNetPerson was signed in when posted  274
10-06-2006 08:46 AM ET (US)
Edited by author 10-06-2006 08:46 AM
Hi Ade, Sorry for the slow reply to your message /m271 - there is not to my knowledge any Yoruba translation software - yet. I have heard that someone is working on a simple machine translation program, but this is not far advanced.

A very short gateway page for information on machine translation and African languages (including Arabic) is available at http://www.bisharat.net/Trans/

Don Osborn
Bisharat.net
Dr. Samuel Olamijulo  273
09-14-2006 03:43 PM ET (US)
Yoruba Internet Radio List by Olamijulo S.K. -Updated 09.14.06

HOPE AFRICA E-PUBLISHER

First Published December 28, 2005

Please share the link at:

http://www.hopeafricaepublisher.com/yirlist1205.html

with your family, friends and others who might or should be interested.

Thank you.

Dr. Samuel Kayode Olamijulo
Natalie Ceperley  272
09-13-2006 01:54 PM ET (US)
Does anyone know of a detailed map of the yoruba speaking world? In particular I am interested in lists of villages where yoruba is the main language and maps of its use in the americas.
Sorry I am straying from ICT a bit....hope someone can help!
Andulganiyu Ade  271
08-16-2006 01:02 AM ET (US)
Is it possible to supply me with a Yoruba software CD that has translation to English and Arabic?

Adeniran
BisharatNetPerson was signed in when posted  270
08-01-2006 10:35 AM ET (US)
Paa Kwesi Imbeah asks on A12n-collaboration* :
Would you know if there are standard Yoruba strings for computer terms like "File", "Menu", "Settings", etc? Where would I find these?

Have such localization terms been translated and more or less agreed on? (And by who?)

Don Osborn
Bisharat.net

* http://lists.kabissa.org/lists/archives/pu...ation/msg00922.html
qasim tunde  269
04-19-2006 05:45 AM ET (US)
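n ko ni ero nipa ibeeere yii, e jowo e je ki n gbo n pa e ti e ba ti ri gbo
[English: I have no idea about this question; please let me hear about it once you have heard something.]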


  
AKSHA@AOL.COM  268
04-16-2006 09:02 AM ET (US)
Good morning to the list:

I have begun researching compounding in Yoruba. I am looking for English language web resources or texts that speak in depth about the morphemes and morphology of Yoruba Compounding.

Does anyone have pointers?

Thank you in advance
AKSHA
KONYIN  267
03-11-2006 10:59 AM ET (US)
Issue: Now I can type words from Nigerian native languages on my PC, but I cannot save these words into my custom dictionary. Every time I tried to save words with Hausa, Ìgbo or Yorùbá characters, I get a message that: “The custom dictionary is full. The word was not added.”

Solution: The custom dictionary is a Notepad text file that defaults to ANSI encoding. To save non-ANSI characters, such as those in Hausa, Ìgbo or Yorùbá words that include Ẹẹ, Ịị, Ọọ, Ụụ or tonal marks, you will need to change the custom dictionary encoding from ANSI to Unicode.

How to change Custom Dictionary encoding in Windows XP:

1. Click - START button at the bottom left of your screen;
2. Click - My Computer from the menu;
3. Open – Local Disk (C:) from the menu;
4. Open – Documents and Settings folder;
5. Open – [Your User Profile] folder;
6. Open – Application Data folder;
7. Open – Microsoft folder;
8. Open – Proof folder;
9. Open - CUSTOM file;
10. Click – File at the top left corner;
11. Click – Save As from the menu;
12. On the line for Encoding – click the dropdown and select Unicode;
13. Click – Save button;
14. Click – Yes button;
15. Click –CLOSE ALL FOLDERS
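For anyone who prefers a script to the manual steps, the same conversion can be sketched in Python. This is an illustration only, under stated assumptions: that "ANSI" on the machine means windows-1252, that the dictionary is a plain list with one word per line, and that Notepad's "Unicode" option means UTF-16 little-endian with a byte order mark. The function names are made up for this example; work on a backup copy of the file.

```python
import codecs
from pathlib import Path


def convert_ansi_to_utf16(path):
    """Rewrite a text file from ANSI (assumed windows-1252) to UTF-16 LE with a BOM."""
    path = Path(path)
    # Strict decoding: raises an error if the file is not valid windows-1252.
    text = path.read_bytes().decode("cp1252")
    path.write_bytes(codecs.BOM_UTF16_LE + text.encode("utf-16-le"))


def add_word(path, word):
    """Append one word on its own line to a UTF-16 word list."""
    path = Path(path)
    # Decoding as "utf-16" reads the BOM to pick the byte order, then drops it.
    text = path.read_bytes().decode("utf-16")
    if text and not text.endswith("\n"):
        text += "\r\n"
    text += word + "\r\n"
    path.write_bytes(codecs.BOM_UTF16_LE + text.encode("utf-16-le"))
```

Calling convert_ansi_to_utf16 once on a copy of the CUSTOM file, and then add_word with a word such as "Yorùbá", mirrors steps 9 to 13 above followed by adding a word from the word processor.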

KỌNYIN Nigeria Multilingual Keyboards
http://www.konyin.com
Andrew  266
03-05-2006 03:30 PM ET (US)
Hi Adé

sounds very interesting. Good luck.

Although Yoruba, German, Igbo, Spanish, English, Zulu would be trivial cases.

It would be more interesting to see how the system works with more challenging languages.

Andrew
KONYIN  265
03-05-2006 09:13 AM ET (US)
Edited by author 03-05-2006 09:13 AM
Andrew,

Our research is still in the early stages, but I can only tell you that a major mainstream OS platform company is seriously involved in the theory.

In our theory, users will only need to set their preferred script and any language that uses that script will be fine. Glyph variants will depend on the properties of the font in use. (E.g. Yoruba, German, Igbo, Spanish, English, Zulu, etc. all use the Latin script; the keyboard scancodes do not really need to know which language is going to use the Unicode codepoint for "A" to translate it for rendering.)

To really grasp the theory, you have to think of three levels between keyboard scancode input and font rendering and not the current two levels.

We are probably 12-24 months away from a system level code demo.

Will keep you updated.

Adé
Andrew  264
03-05-2006 08:33 AM ET (US)
Hi Adé,

You said: "That is exactly the point I am trying to make and the purpose of our research, the keyboard layout should be decoupled from other OS functions."

I know that is your point. My point is that decoupling input from other OS functions would create more problems than it solves. There are languages spoken around the world where it is necessary to know what language is being input to get the correct shaping or the correct glyph variants for the language.

Anyway, you don't need to change anything in the OS. Just define a custom "Nigerian" or "African" locale in Windows Vista. Select that for use with your keyboard. It will achieve what you need.

I don't see a need to remove input locales from the OS. It would be a retrograde decision for many different languages and writing scripts where it is important to know the language context of the characters being typed. The Latin script is a very simplistic writing script in many ways, and probably the worst script to use as a basis for developing input models (because it doesn't reflect the needs or challenges of more complex writing scripts).

Although this discussion is purely theoretical, since it's unlikely Microsoft would do anything detrimental to their OS considering the needs of the Chinese, Korean and Japanese markets.

Andrew
KONYIN  263
03-04-2006 10:51 PM ET (US)
Edited by author 03-04-2006 10:54 PM
Andrew /m261

Andrew's Q: "Using your product, how would someone indicate/change the input locale while continuing to use your keyboard?"

Our product will not impact your input locale at all; that is the basis of the neutrality. All the settings on your PC are left intact.... [other info is secret cannot disclose]

In the case of Nigeria - Yoruba and others, for example: we give users the option to modify the regional settings to allow the Naira sign as the default currency symbol and, if the user has US (English) settings, to change the date format from MM/DD/YYYY to DD/MM/YYYY. All other regional/language settings remain as set by the user.

Andrew's Statement: "My point is that having to select and change a keyboard layout does more than change the keyboard layout."

That is exactly the point I am trying to make and the purpose of our research: the keyboard layout should be decoupled from other OS functions.

Adé
Andrew  262
03-04-2006 08:53 PM ET (US)
re /m260

As Adé indicates, the Wikipedia site is well done in most aspects of web internationalization. The one area for improvement would be to meet the WAI accessibility requirement for marking up changes in language, i.e. that the Yoruba words are marked up as Yoruba words.

Currently, the way the web page is marked up, the entire page is indicated as being English; the Yoruba words are marked up as English words, and any software processing the page would be expected to treat the Yoruba words as English words, at least in terms of text processing, since that's the way they've been marked up.

In theory the Yoruba words should be contained in a span element with the lang or xml:lang attribute set to the language code for Yoruba.
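A minimal sketch of what that markup could look like, generated here from Python. The helper name mark_language is hypothetical; "yo" is the ISO 639-1 code for Yoruba.

```python
import html


def mark_language(text, lang_code="yo"):
    """Wrap text in a span that declares its language (hypothetical helper)."""
    escaped = html.escape(text)  # protect &, <, > and quotes in the text
    return f'<span lang="{lang_code}" xml:lang="{lang_code}">{escaped}</span>'


print(mark_language("Yorùbá"))
```

With the words wrapped this way, software reading an otherwise English page can tell which words are Yoruba, for example to choose a suitable font or to skip English spell checking for them.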

Andrew
Andrew  261
03-04-2006 08:45 PM ET (US)
Hi Adé,

I did understand what you said. You said

"Our current product is language neutral. When you install the driver, it will not change any of the language settings on the OS."

My point is that having to select and change a keyboard layout does more than change the keyboard layout. It changes the input locale. Other programs such as Microsoft Office applications and others use this change of input locale to make other changes including which proofing tools are used, which typographic options kick in, etc.

Although this is somewhat academic for most African languages, since proofing tools do not exist for them, it will impact African users who use a mix of languages including French, English and Arabic.

Using your product, how would someone indicate/change the input locale while continuing to use your keyboard?

Hope the question makes sense?

Andrew
KONYIN  260
03-04-2006 01:51 PM ET (US)
Check out the update on this site:

http://en.wikipedia.org/wiki/Yoruba

I have just spent some time making sure that most of the Yoruba words are correctly represented.

This is a very good example of a well designed and scripted website.

Adé
Dr. Samuel Olamijulo  259
03-03-2006 01:53 AM ET (US)
RE: ABD Yoruba Keyboard Issues

Dear Dr. Jadesimi,

I have had some personal access and posting issues with QuickTopic recently which I shall try to resolve.

Many thanks for your mail of Jan 24, 2006 which I just noticed in the QuickTopic digest I received today.

Sir Lawie, as usual, obviously did a very good job addressing the issues you raised.

Should you need any additional help, please feel free to contact directly

ABD Yoruba Keyboard Publisher at

publisher@africanPortal.net

Thank you for your interest in Yoruba Language and People Development.

Dr. Samuel Olamijulo
KONYIN  258
03-02-2006 09:41 PM ET (US)
Edited by author 03-02-2006 09:43 PM
Response to Andrew's post /m256

Andrew, I think you missed my starting point, because I did not do a good job explaining it. In short, there is a need to separate character generation from input locale and all other rendering processes.

We already achieved some of that in our current product. Our current product is language neutral. When you install the driver, it will not change any of the language settings on the OS. We actually took a roundabout route to write code that leaves the input locale and other language settings in place while making our keyboard layout the default for the OS. (All other keyboard layout creators require a user to select a language; ours does not.)

What we are now doing is to follow a logical backward mapping for a specific platform, find the bottleneck and provide a solution.

What I am suggesting is to create another intermediate processing point between the input scancodes, the character translation and all associations with language locale or regional settings. I really cannot get too much into the details of the programming. But I can tell you that it will not impact any of the items you listed.

One issue I have come to disregard is the retrospective application of patches for Africa-based users. My reason is simple: there is not that much computer penetration in these countries for now. (I don’t think the ownership and usage volume of computers among Yorùbás, as of today, will justify the expenditure to develop patches for old OSes; in any case, it will be another cottage industry for those that want to.) These countries are still largely based on Windows 98. I think that in the next couple of years there will be tremendous growth in computing and computer ownership; this is the wave whose destiny we should help to shape.

My two cents

Adé
Mike Maxwell  257
03-02-2006 08:24 PM ET (US)
QT - Andrew wrote:
> What do you think the first steps forward are? What's the best
> way to increase Yoruba's presence on the internet?

As someone who has looked for texts in lots of languages on the Internet, including Yoruba (I was building corpora for the Linguistic Data Consortium of the University of Pennsylvania), the answer to your question is simple. (But that doesn't mean you'll like the answer...) The answer is simply to put more Yoruba text out there.

Well, that was a dumb answer, I know. Unfortunately, there's no easy way around it.

So how can you do this? I'm assuming you are located in Nigeria (I'm in the US). How do Yoruba-language newspapers and magazines in Nigeria create text? Do they use typewriters, or non-computerized typesetting? I doubt it, although it's possible. We (Yiwola Awoyale, myself and others at the LDC) looked at a couple Yoruba newspapers from several years ago. Except for the front page banner, none of the text had tone marks, and what underdots there were had clearly been put in by hand. That is, someone had created a printed version of the text, perhaps by computer printout, and had used a pen to mark the underdots.

I'm hoping things have improved since that newspaper, and that Yoruba-language newspapers are composed on computer with the underdots, and maybe even with tone marks. If so, is there any way the publishers could be persuaded to put their papers--or the articles they contain--on-line? Now the on-line readership in Nigeria is probably too small to justify this, but I would bet that there is a large number of Yoruba speakers living abroad who have Internet access, and hunger for news from home.

And it doesn't--I think--take a huge investment to put up a web page of your newspaper, plus an archive of past issues. (Without the archive, just putting today's newspaper on-line doesn't do much to increase the presence of Yoruba on the web.) This is commonplace now, and not just in the US or Europe. It's done in Bangladesh, so I suspect that if they can do it, a Nigerian newspaper could do it.

At a more personal level, why not create a Yoruba-language blog? A year ago, I did a search for blogs in various languages. I found them in most languages I looked for (Thai, Bengali, Panjabi...), but not in Yoruba. It doesn't mean they don't exist, just that they're not common.

Another thing that could be done would be to put the text of out-of-copyright Yoruba-language books on-line. (Be very careful about the copyright, or getting written permission from the copyright holder, lest you spend a lot of work putting a text up, only to have to take it down again. I'm assuming there are out-of-copyright Yoruba-language books, i.e. that it's been a written language, with a more or less standard form, for long enough that some of the older books are public domain.) There are plenty of places that you could upload such books to, once you've keyed them in (I don't think OCR would work very well, at least not for text that uses the underdot and accents). I just ran across a website today which has books in around 20 languages. (Unfortunately, I didn't keep the URL, but I'm sure I could find it again.)

Schools could do these projects, too, and benefit from the experience. I realize that investing in computers in the classroom, and web access, can be an expensive proposition, but I think the payoff would be great: think of being the first Nigerian High School students to post your work to the World Wide Web!

Finally, it might be interesting to translate out-of-copyright books into Yoruba, and make them web-accessible. The disadvantage is that this isn't promoting the Yoruba/ Nigerian culture, and I suspect you're interested in that, too. But one advantage is that you can provide works which may otherwise be inaccessible to monolingual Yoruba speakers. (Well, you might want to make them available in print form, too.) For example, some years ago there was a book published in English and Spanish called "Where there is no doctor" ("Donde no hay doctor"), intended as a guide for semi-trained medical providers in poorer areas. The text was intentionally made simple to translate, but so far as I know it was only ever translated into Spanish. I suspect--but don't know--that the author would be more than happy to let it be translated into Yoruba. There are doubtless other books which would be similarly helpful, and whose authors would freely grant permission for translation. (Hey, I'd let you translate any of my linguistic papers, but I don't think there would be a market...)

Another advantage of posting translations into Yoruba on a website, is that you could use that website to advertise your translation ability. Possibly this would get you more business doing translation. And further down the road, it's at least possible that a large quantity of parallel (Yoruba-English) text could serve as the fuel for statistical machine translation programs that would do "translation" (scare quotes intentional) between Yoruba and English.
Andrew  256
03-02-2006 03:31 PM ET (US)
Dear Adé,

An interesting proposal. Personally I have mixed feelings. I can see how such an OS development would be useful initially in Africa. I can also see how such a proposal could create chaos in East Asia, and how it might not be an ideal solution for Central Asian languages.

Although, to some extent that is what we already have by accident for Unicode input for African languages, under Windows Vista (from my understanding).

Although personally, I wish there were parts of the OpenType spec that Microsoft would implement in their OS. One issue is glyph variation (for the same unicode character). An OpenType font can contain alternative glyphs for a character and display each alternative based on the language in use. Microsoft haven't implemented this yet to any wide degree.

Also your proposal would fundamentally break operating system support for languages such as Chinese (Simplified and Traditional Scripts), Japanese and Korean where characters may be Unified across writing scripts and language (and country identification) is fundamentally important.

Languages such as Sindhi, Pashto and Urdu share letters but may have different glyphs for some of them. The ideal situation is that there is one single font which will be used to render different glyphs based on language.

Currently, spell checking and proofing tools, and a range of other tools actually use the input locale to switch languages. Word and RTF files where language is marked up use input locales for the language tagging.

I can't see the British, New Zealanders or Australians wanting to do away with input locales. The last thing we want is Microsoft products defaulting to US-style English.

For those of us who work with multilingual documents, lack of automatic input locale detection would be more of a hindrance.

If you divorce keyboards from locale then you will need to address how applications will operate without being able to automatically detect input locales.

Personally I have no problems with using a generic keyboard that is based on NFD. I actually have a few keyboard layouts I use that will reorder combining diacritics based on combining class where combining diacritics don't interact typographically, i.e. produce NFD output.
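The reordering by combining class can be seen with Python's standard unicodedata module. This is just a sketch of the idea, not the keyboard layouts themselves; the typed sequence is an example.

```python
import unicodedata

# Keystroke order: base letter, then the tone mark, then the dot below.
typed = "o" + "\u0301" + "\u0323"

# Each combining mark has a canonical combining class (ccc). The dot below
# (ccc 220) sits under the letter and the acute (ccc 230) sits over it, so
# the two marks do not interact and may be reordered.
for ch in typed[1:]:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch))

# NFD sorts such marks by ascending combining class, giving one stored order
# regardless of the order in which the keys were pressed.
nfd = unicodedata.normalize("NFD", typed)
print([f"U+{ord(ch):04X}" for ch in nfd])
```

Marks that share a combining class, such as two accents stacked above the letter, keep their typed order, since their order affects how they stack.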

I'm not sure what you mean by "system separation of characters as vowels or consonants". As far as I was aware, the OS doesn't have any awareness of orthographic features such as vowels or consonants.

If anchor points have been added to consonants in the font, then you can combine diacritics with them; e.g. in the Doulos SIL font you can combine the letter "R" with the combining breve to form a letter in a Vietnamese minority language.

Your proposal also assumes that all diacritics always have the same shape and would always use the same anchor point on a base character.

For instance the acute accent is commonly centred over the base character in most languages. In one Pasifika language the diacritic is further to the right, at the edge of the character.

For some languages that use both a diaeresis and an acute, the acute is centred above the diaeresis; in at least one African orthography, the acute is placed at the same height as and before the diaeresis. In other languages the acute is at the same height and between the dots of the diaeresis.

Vietnamese uses different typographic conventions for stacking diacritics.

If you divorced input locale from keyboard input, and relied on the font for correct rendering, then you'd need multiple instances of all the operating system's core fonts ... maybe ten to twenty different versions of Times New Roman, Arial, Courier New, Verdana, Tahoma, etc. to accommodate all the different orthographic and typographic conventions.

It would make more sense to have one single font, and implement language awareness in the font rendering system (to utilise the existing OpenType feature for language-specific rendering). This would require applications to tag documents according to language. Easiest way of doing that is tracking input locale.

If your system were implemented, or the system I'm suggesting, the effect for Nigerians would be the same. They'd have to upgrade to the new operating system to use the new features.

The crux of the issue is that the fonts and font rendering systems on current operating systems don't work for Yoruba. Also, current operating systems can't identify Yoruba as an input locale.

Windows Vista will have the fonts and the font rendering system. You can also develop custom locales for Vista, breaking the cycle of not being able to support a locale because Microsoft hasn't implemented it.

The remaining issue, keyboard input based on combining diacritics and not on precomposed diacritics, could be a convention adopted by companies developing keyboard solutions.

So on Vista you'd be able to achieve most, if not all of your objectives, ie developing a generic Latin script keyboard layout using combining diacritics.

Either way, as you said, "I think it will be very painful in the short term, but makes life easier for everyone on the long run." Regardless of whether MacOS, Windows or Linux is used, the solution will end up being the same: phasing out old operating systems and replacing them with new OSes that support Yoruba and other African languages. Unfortunately, OS vendors don't tend to retrospectively fit solutions to old versions of an OS. They release a new version instead.

I'd be interested in hearing more about the tests KONYIN have been doing. It sounds very interesting.

What we do need is more experimentation: looking at the needs of African languages and developing solutions suited to them.

Thank you, Adé.

Andrew
KONYIN  255
03-02-2006 02:03 PM ET (US)
Edited by author 03-02-2006 02:03 PM
Andrew,

The real solution is for mainstream platforms to decouple character generation from languages or regional groupings and make keyboard-to-OS character generation strictly script based (e.g. instead of selecting languages or regions, users will simply select a script: Latin, Cyrillic, etc.). Then remove from the standard scripts all precomposed single codepoints that represent a character plus tonal mark, leaving only base characters and combining tonal marks.

This will mean that all characters and combining diacriticals will be generated as their raw Unicode codepoints, and font rendering will depend on the actual properties of the font in use by an application. It will also get rid of the system separation of characters as vowels or consonants. All characters will be combinable with any combining diacritical marks. Character storage as Unicode codepoints in databases will also become universal and simple.

It will be a radical departure from the present approach of trying to fix font rendering problems without uprooting the base structure of character encoding in all OSes.

I think it will be very painful in the short term, but makes life easier for everyone on the long run.

We are presently testing this logic on some platforms in our Lab.

Adé
Andrew  254
03-02-2006 07:44 AM ET (US)
One last thought:

It may also be useful to lobby Microsoft, to encourage them to release a service pack or an update for Internet Explorer 6 that would install an updated version of Uniscribe locally in IE6's directory.

Andrew
Andrew  253
03-02-2006 07:26 AM ET (US)
continuation of /m252 ...

Possible solutions for web based email:
  * use gmail
  * lobby Yahoo to add a UTF-8 based mail interface
  * build an appropriate web based email service in Nigeria

For web services:
  * follow W3C Internationalization guidelines
  * use Unicode throughout the service
  * use appropriate fonts or allow users to change fonts.
  * use language tagging
  * use software or modules (PHP, CGI etc) that will correctly handle Yoruba Unicode Text.
  * build appropriate tools and services locally tailored for Yoruba.

At least those are my thoughts ... feel free to disagree.

Hopefully, next week, I'll get a chance to put phpBB on a server here and see if I can get this discussion board software to correctly display Yoruba and some other African languages.

When I have time, I need to do some testing on blogging and wiki software.

What do you think the first steps forward are? What's the best way to increase Yoruba's presence on the internet?

Andrew
Andrew  252
03-02-2006 07:17 AM ET (US)
Dear Dr. Olamijulo

I finally have some time to respond to your posting /m238 in more detail. I'll quickly note down some thoughts.

With respect to 1) displaying emails and websites and 2) having webservices work correctly with Yoruba ....

1) EMAIL CLIENTS

Major email clients such as Outlook, Outlook Express, Mozilla, Thunderbird, etc support Unicode. To successfully send and receive Yoruba Unicode emails, there are some things that are necessary:

  a) have an appropriate OpenType font (these exist)
  b) have a keyboard or keyboard layout that supports typing in Yoruba (these exist)
  c) your application or operating system supports correct rendering of combining diacritics. (The core of the problem)
  d) you correctly configure your email client to use appropriate fonts to display or compose emails. (Straightforward)

The rendering issue is the core problem. If you have the right fonts like Doulos SIL and others, how do you know it will render correctly?

Outlook, Outlook Express, Thunderbird, Mozilla and Opera all depend on the font rendering of the operating system. In the case of Windows, this means Windows needs to be using an appropriate version of Uniscribe (usp10.dll). This is available on Windows XP Service Pack 2. It will also be on Windows Vista.

Alternatives: Doulos SIL has both OpenType and Graphite tables. Graphite is an open source rendering system that is still under development. Versions of Thunderbird are available for Windows and Linux which have Graphite support enabled. These versions of Thunderbird should render Yoruba Unicode emails on older versions of Windows. I haven't tried it yet, but I suspect it should work. Have a look at http://sila.mozdev.org/grFirefox.html

WEB BASED EMAIL

The first issue here is having web browser support. What I said about email clients also applies to web browsers. You need the right fonts, an input method and appropriate rendering. Internet Explorer, Mozilla, Firefox and Opera use the operating system's rendering ... so on WinXP-SP2 and Windows Vista, you should be OK. You just need an appropriate way of specifying fonts.

There is also the Graphite alternative, with a Graphite enabled version of Firefox available.

Additionally, if you copy an updated version of usp10.dll to Internet Explorer 6's directory on your hard disk, Internet Explorer 6 will use that local copy rather than the system version. This is the principle that Sinhala for Internet Explorer 6 uses (http://www.fonts.lk/down.html). In theory this kit should also enable combining diacritic support.

The core problem is the web interface to email systems. English language Yahoo and Hotmail interfaces use the Western European character sets. If you type in Unicode and send the email, the users at the other end will not automatically see the Yoruba displayed correctly. The email header will not identify the email as UTF-8 encoded email. It is necessary to manually select the UTF-8 encoding. This may or may not work.

From what I understand Gmail supports Unicode, so this may be a better choice for a web based email service than using Yahoo or Hotmail. I haven't used Gmail, so I can't guarantee it. Alternatively, when I need to send a UTF-8 email in Yahoo, I switch to the Vietnamese language Yahoo user interface. This is UTF-8 based and allows me to send UTF-8 emails from Yahoo.

Alternatively, you could enter the Yoruba as HTML numerical character references. In this case email clients that can display HTML embedded in emails should display the Yoruba text correctly. Recipients reading text based emails, on the other hand, will not see the characters; rather they will see the list of numerical character references.
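
For anyone who wants to try the numerical character reference route, here is a minimal Python sketch of the conversion. The function name is made up for this example; the encoding step uses Python's built-in "xmlcharrefreplace" error handler, which writes decimal NCRs.

```python
def to_ncr(text: str) -> str:
    """Replace every non-ASCII character with a decimal NCR such as &#7865;."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# Characters taken from the test postings in this thread
print(to_ncr("Ẹ ẹ Ọ ọ Ṣ ṣ"))
```

Plain ASCII letters pass through unchanged, so only the extended characters and combining marks are turned into references.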

What is necessary is a web based interface that will allow you to specify the encoding of your emails, and that would preserve the correct encoding in the email header when the email is sent. There are also other features, such as being able to specify the font you want your messages to display with, etc.
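
To make the point about the email header concrete, here is a small sketch using Python's standard email package. It only shows how a message can be labelled as UTF-8 so the receiving client knows how to decode it; the addresses and subject are placeholders, and this is not how Yahoo or Hotmail build their messages.

```python
from email.message import EmailMessage


def build_message(body: str) -> EmailMessage:
    """Build a plain-text message whose header declares UTF-8."""
    msg = EmailMessage()
    msg["Subject"] = "Yoruba Unicode test"
    msg["From"] = "sender@example.org"      # placeholder address
    msg["To"] = "recipient@example.org"     # placeholder address
    msg.set_content(body, charset="utf-8")
    return msg


msg = build_message("À à Á á Ẹ ẹ Ọ ọ Ṣ ṣ")
print(msg["Content-Type"])
```

The Content-Type header printed at the end is the piece of information that goes missing when a web interface assumes a Western European character set.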

Personally I prefer to use Firefox or Opera rather than Internet Explorer. If IE uses the wrong font to display a page, and that font is missing necessary characters, you will just see boxes instead of characters.

In Firefox and Opera, if the character is missing from the font being used, the browser will swap fonts and use a different font to display that character.

2) When developing websites, it is necessary to correctly identify the character encoding. UTF-8 should be used. It is also very useful to indicate the primary language of the web page and to mark up any change in languages. For more information look at the resources at http://www.w3.org/International/
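
One way to check whether a page does both of these things is sketched below in Python, using only the standard library's HTML parser. The class and function names and the sample page are made up for this example; "yo" is the ISO 639-1 language code for Yoruba.

```python
from html.parser import HTMLParser


class DeclarationChecker(HTMLParser):
    """Record the declared charset and the primary language of a page."""

    def __init__(self) -> None:
        super().__init__()
        self.charset = None
        self.lang = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "html":
            self.lang = attributes.get("lang")
        elif tag == "meta":
            if attributes.get("charset"):
                self.charset = attributes["charset"].lower()
            elif (attributes.get("http-equiv") or "").lower() == "content-type":
                content = (attributes.get("content") or "").lower()
                if "charset=" in content:
                    self.charset = content.split("charset=", 1)[1].strip()


def check_page(markup: str) -> tuple:
    """Return (charset, lang) as declared in the markup, or None for each if absent."""
    checker = DeclarationChecker()
    checker.feed(markup)
    return checker.charset, checker.lang


# Hypothetical sample page
sample_page = """<html lang="yo">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Yoruba test</title>
</head>
<body><p>Ẹ ẹ Ọ ọ Ṣ ṣ</p></body>
</html>"""

print(check_page(sample_page))
```

A page that returns None for either value is leaving the browser to guess, which is where many of the display problems start.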

Firefox and Opera allow the user to specify CSS rules that can override how a web page is displayed. For some African languages, I can put in a CSS rule that will force a font change in the page for any text in a specific language. If I was viewing a web page in Yoruba, and the web developer had included a language tag identifying the content as Yoruba text, I could then write a rule for Firefox that would tell the browser to use the Charis SIL font for any Yoruba text.

For web services, blogs, discussion boards, wikis and other online tools: they need to support Unicode (both in the web page, but also in all scripts and programs that handle data in the backend of the site), and they also need to perform Unicode normalization on any text input. It is also useful if they indicate primary language or any change in the text processing language. Web services should use appropriate fonts to display Yoruba, or allow users to set their own font preferences.
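
The normalization step matters because the same Yoruba letter can arrive from different keyboards as different code point sequences. A minimal Python sketch, assuming the service chooses NFC as its storage form (NFD would work equally well as long as it is applied consistently); the function name is made up for this example:

```python
import unicodedata


def normalize_input(text: str) -> str:
    """Normalize user input to NFC before storing or comparing it."""
    return unicodedata.normalize("NFC", text)


# Three ways a keyboard might send "o with dot below and acute accent"
typed_a = "\u1ecd\u0301"    # precomposed o-dot-below + combining acute
typed_b = "o\u0323\u0301"   # o + combining dot below + combining acute
typed_c = "o\u0301\u0323"   # o + combining acute + combining dot below

print(typed_a == typed_b)
print(normalize_input(typed_a) == normalize_input(typed_b) == normalize_input(typed_c))
```

Without normalization the three inputs compare as different strings, so searches and duplicate checks on a forum or wiki would silently fail.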

I guess that's my take on the situation at the moment.

Possible solutions for web based email:
Lekan (Sir Lawie)  251
02-26-2006 08:27 PM ET (US)
Dear Mr. Jadesimi

RE: UNINSTALLING ABD-YORUBA KEYBOARD
-------------------------------------
Thanks for your prompt reply. ABD Yoruba Keyboard was successfully installed and uninstalled on three different computers running Windows XP.


HINTs:

You need to re-install WIN XP in order to correct the error message.

Insert WIN XP CDROM to update missing / corrupted reference or file.

Thanks for your time.

Lekan

YorubaWorld
http://groups.yahoo.com/group/yorubaworld/
--------------

Can you send me the screen captur
Isaac Jadesimi  250
02-26-2006 01:09 PM ET (US)
For The Attention of:
Lekan (Sir Lawie)
=================
Dear Sir,
I thank you very much for your e-mail message.

I set out below the relevant ABD download page link (s):

I right-clicked on this link: ABDYooba.exe
Saved target as "Desktop"
Double-clicked on the downloaded ABDYooba.exe, and clicked on "Extract" to extract all 4 files.
Clicked on "My Computer"----C Drive
New folder "ABDYooba" created, via C Drive----Double-clicked to open the folder
Right-clicked on ABDYooba.msi, clicked on "Install"-----clicked "Close"-----I followed the remaining steps of the installation process...........as far as:
Select "English (United States)--ABDYoruba"------then clicked OK
via, http://www.africanportal.net/Publications/ABD/mktut4.htm

I look forward very much to hearing from you.
Thanking you in advance,
Sincerely,
Isaac Gboyega Jadesimi.
jades@tamcotec.com
Lekan (sir Lawie)  249
02-26-2006 11:18 AM ET (US)
Dear Mr. Jadesimi

I would like to troubleshoot the problem on my system, to see if it's a Windows or an ABD software bug.

Please include the ABD download page link in your reply.

Thanks for your time.

Lekan (Sir Lawie)

YorubaWorld
http://groups.yahoo.com/group/yorubaworld/
Isaac Jadesimi  248
02-26-2006 02:42 AM ET (US)
>
< replied-to message removed by QT >
Isaac Jadesimi  247
02-25-2006 04:17 PM ET (US)
For The Attention Of:
Lekan (Sir Lawie)
==================
Dear Sir,
I thank you very much indeed for your most valuable e-mail message.
I have tried to work through the steps which you kindly outlined-----i.e., Tip 1 and Tip 2.
Unfortunately, the name of the program (ABD Yoruba), did not appear on the list, in each case, with regard to both Tips.
ABD Yoruba only appears on the list, via the Control Panel-----and, unfortunately, I tried to remove it, but I still keep getting the same message "Fatal error during Installation". (I did succeed in deleting it from the desktop, as well as the "C" drive.) The only problem is the Control Panel, "Add & Remove" section.
I would greatly appreciate and value your further assistance.
Thanking you in advance,
Sincerely,
Isaac Jadesimi.
jades@tamcotec.com
Lekan (Sir Lawie)  246
02-24-2006 06:09 PM ET (US)
Dear Mr. Jadesimi

RE: UNINSTALL PROGRAM FROM WINDOWS
----------------------------------
I read your posting regarding the above heading. I hope these tips may solve your problem.

TIP:1

REMOVING PROGRAM FROM STARTUP
-----------------------------

1) select START -> then RUN -> type MSCONFIG in the white box.-> select OK

2) SYSTEM CONFIGURATION UTILITY POPS UP -> then click on STARTUP TAB -> LISTING OF FILES APPEARS -> LOOK FOR THE PROGRAM NAME -> then UNCHECK IT.

3)then click on APPLY -> then click on OK.

4) YOU NEED TO RESTART WINDOWS, before changes can apply.

*********************************************************** **

TIP:2
REMOVING PROGRAM REFERENCE FROM REGISTRY
----------------------------------------

1) Go to Start -> select RUN -> a dialogue box appears -> type REGEDIT in the white space, then click OK.

2) The registry window appears -> a series of FOLDERS is lined up in the left pane.

3) Then select HKEY_LOCAL_MACHINE -> the folder listing shows -> select & open the SOFTWARE folder -> a long folder listing appears; look for the PROGRAM NAME you wish to remove -> select the PROGRAM NAME FOLDER -> THEN DELETE IT.

4) EXIT FROM REGISTRY.

5) RESTART YOUR SYSTEM.

*********************************************************
 
If the problem still persists, please do not hesitate to get in touch.

Lekan (Sir Lawie)

YorubaWorld
http://groups.yahoo.com/group/yorubaworld/
Isaac Jadesimi  245
02-24-2006 03:46 PM ET (US)
>
< replied-to message removed by QT >
Isaac Jadesimi  244
02-24-2006 03:32 PM ET (US)
Dear Dr. Olamijulo,

I thank you very much indeed for your e-mail message-----which came as an answer to my desperate prayers, in the sense that I happen to be one of the "many ordinary users" to whom you referred, in your statement, to the effect that "On global internet applications like.........................e-mails, and many websites, Yoruba Language Display remains frustratingly problematic".
In my case, the very severe frustrations of several days, (involving even sleepless nights !!!), revolve around a countless number of attempts to remove the ABD Yoruba program, from my WinXP computer, via the Control Panel. The only recurring message I kept, and still keep, getting on the computer screen has been, and still is, "Fatal error during installation"-----consequently, it has proved to be virtually impossible to remove the program, with a view to "uninstalling, and restarting the process of installation", as I was advised to do.

I would greatly appreciate your advice, suggestions, and tips, with regard to overcoming the above-mentioned problem------especially as I need to set up the Yoruba keyboard, very desperately----very soon!!!.
Thanking you in advance,
Sincerely,
Isaac. Gboyega Jadesimi, M.A., Ph.D.
E-mail address: jades@tamcotec.com
Dr. Samuel Olamijulo  243
02-24-2006 02:44 AM ET (US)
Yoruba Language Display by Emails and Website Applications -
Test 02.16.06


Mr. Andrew Cunningham and Compatriot Ade Oyegbola, greetings.
 
Your communication, as usual , is very helpful.
 
At my end on my desktop, Yoruba Language Display is beautifully flawless.
 
On global Internet applications like quicktopic, yahoogroups, emails, and many websites, Yoruba Language Display remains frustratingly problematic for many ordinary users.
 
Your suggestions below give useful options to work with.
 
Thank you.
 
Dr. Samuel Kayode Olamijulo
-------------------------------------------------------------- --
------------------------------------------------------------ ----

From: Dr. Samuel Olamijulo Time: 06:18 AM
Yoruba Language Display by Emails and Website Applications -
Test 02.16.06

Using Free ABD Yoruba Keyboard

at: http://www.africanportal.net/Publications/ABD/mktut1.htm

With
--------------------------------------------------------------
-----------
Free Arial Unicode MS Font

À à Á á È è É é Ẹ ẹ Ì ì Í í Ò à Ó
ó Ọ ọ Ù ù Ú ú S s

-----------------------------------------------------------
 --------------

Free Doulos SIL Font

À à Á á È è É é Ẹ ẹ Ì ì Í í Ò à Ó
ó Ọ ọ Ù ù Ú ú Ṣ ṣ
--------------------------
--------------------------------- ---------

Free Charis SIL Font

À à Á á È è É é Ẹ ẹ Ì ì Í í Ò à Ó
ó Ọ ọ Ù ù Ú ú Ṣ ṣ
--------------------------
--------------------------------- --------------------------

Free Gentium Font

À à Á á È è É é Ẹ ẹ Ì ì Í í Ò à Ó
ó Ọ ọ Ù ù Ú ú Ṣ ṣ

-----------------------
---------------------------------- ----------------------------


Dr. Samuel Kayode Olamijulo
February 16, 2006
------------------------------------------------------------
From: Andrew Time: 11:19 AM
Dr. Olamijulo,

It would appear that quicktopic has mangled some of your test
characters. It converted some of the bytes in the sequence to
HTML Numerical Character References (NCRs).

For a similar test, have a look at

http://www.openroad.net.au/languages/african/yoruba/sample.html

or

http://www.openroad.net.au/languages/afric...uba/sample_nfd.html

Something I threw together some time ago to test web browser
rendering.
 
Should render well (using Doulos SIL and Charis SIL)
if you are on WinXP SP2, a Windows Vista Beta, or if you are
using a web browser with appropriate rendering support.

-----------------------------------------------------------

From: KONYIN Time: 02:13 PM
If you add this line to the <head> section of your web page, you
will get all these characters showing correctly.

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

I don't know who the webmaster for this quickTopic is, but he
can simply add the above line to the <head> section of the
template and all the characters will show correctly.

One more thing, it will also depend on the default font in use
by the website. If the character is not present in the font or
the combining diacritical marks are not present, then you may
get a box or a badly aligned combination of character and
tonal mark. (e.g. Áá, Èé, Ẹẹ, Ẹ́ẹ̀ Ìí, Óó, Ọ̀ọ́, Úù, Ṣṣ)
(The box is Ss with subdot)

All the postings by Dr Ọlamijulọ are actually being displayed
in a single font, this site's default font. (Arial Font)
------------------------------------------------------------
From: Andrew Time: 09:45 PM
Alternatively, you can convert Yoruba Unicode text to Numerical
Character References (NCRs) and QuickTopic allows you to add a
font tag to specify an appropriate font.

A a Á á À à E e É é È è
Ẹ ẹ Ẹ́ ẹ́
Ẹ̀ ẹ̀ I i Í í Ì ì
O o Ó ó Ò ò Ọ ọ
Ọ́ ọ́ Ọ̀ ọ̀
Ṣ ṣ U u Ú ú Ù ù

Although the ideal situation would be a UTF-8 forum or
discussion board that supported stylesheets customised for
Yoruba.

Andrew
Andrew  241
02-23-2006 09:45 PM ET (US)
Alternatively, you can convert Yoruba Unicode text to Numerical Character References (NCRs) and QuickTopic allows you to add a font tag to specify an appropriate font.

A a Á á À à E e É é È è Ẹ ẹ Ẹ́ ẹ́ Ẹ̀ ẹ̀ I i Í í Ì ì O o Ó ó Ò ò Ọ ọ Ọ́ ọ́ Ọ̀ ọ̀ Ṣ ṣ U u Ú ú Ù ù


Although the ideal situation would be a UTF-8 forum or discussion board that supported stylesheets customised for Yoruba.

Andrew
KONYIN  240
02-23-2006 02:13 PM ET (US)
Edited by author 02-23-2006 02:30 PM
If you add this line to the <head> section of your web page, you will get all these characters showing correctly.

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">


I don't know who the webmaster for this quickTopic is, but he can simply add the above line to the <head> section of the template and all the characters will show correctly.
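
To see why the charset declaration matters, here is a small Python sketch (an illustration only, nothing to do with the QuickTopic software itself) of what happens when UTF-8 bytes are read as a Western European single-byte encoding. ISO-8859-1 is assumed here as the wrongly applied encoding, and the function names are made up for this example.

```python
def misread_as_latin1(text: str) -> str:
    """Show how UTF-8 text looks when a reader wrongly assumes ISO-8859-1."""
    return text.encode("utf-8").decode("latin-1")


def repair(garbled: str) -> str:
    """Undo the misreading, provided no bytes were lost along the way."""
    return garbled.encode("latin-1").decode("utf-8")


original = "\u1eb9"                    # e with dot below
garbled = misread_as_latin1(original)
print(garbled)                          # three unrelated Latin-1 characters
print(repair(garbled) == original)
```

Each Yoruba letter outside ASCII takes two or three bytes in UTF-8, so a page without the declaration shows two or three stray characters in place of every such letter.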

One more thing, it will also depend on the default font in use by the website. If the character is not present in the font or the combining diacritical marks are not present, then you may get a box or a badly aligned combination of character and tonal mark. (e.g. Áá, Èé, Ẹẹ, Ẹ́ẹ̀ Ìí, Óó, Ọ̀ọ́, Úù, Ṣṣ) (The box is Ss with subdot)

All the postings by Dr Ọlamijulọ are actually being displayed in a single font, this site's default font. (Arial Font)
Andrew  239
02-23-2006 11:19 AM ET (US)
Dr. Olamijulo,

It would appear that quicktopic has mangled some of your test characters. It converted some of the bytes in the sequence to HTML Numerical Character References (NCRs).
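
If text has been turned into NCRs like this, the references themselves can be turned back into characters. A minimal Python sketch using the standard library (the function name is made up for this example; it only helps where whole characters were converted, not where individual bytes were damaged):

```python
import html


def from_ncr(text: str) -> str:
    """Convert decimal or hexadecimal NCRs back into Unicode characters."""
    return html.unescape(text)


print(from_ncr("&#7864; &#7865; &#x1E62; &#x1E63;"))
```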

For a similar test, have a look at

http://www.openroad.net.au/languages/african/yoruba/sample.html

or

http://www.openroad.net.au/languages/afric...uba/sample_nfd.html

Something I threw together some time ago to test web browser rendering. Should render well (using Doulos SIL and Charis SIL) if you are on WinXP SP2, a Windows Vista Beta, or if you are using a web browser with appropriate rendering support.
Dr. Samuel Olamijulo  238
02-23-2006 06:18 AM ET (US)
Yoruba Language Display by Emails and Website Applications - Test 02.16.06

Using Free ABD Yoruba Keyboard
 
at: http://www.africanportal.net/Publications/ABD/mktut1.htm
 
With
-------------------------------------------------------------------------
Free Arial Unicode MS Font

À à Á á È è É é Ẹ ẹ Ì ì Í í Ò à Ó ó Ọ ọ Ù ù Ú ú S s

-------------------------------------------------------------------------

Free Doulos SIL Font

À à Á á È è É é Ẹ ẹ Ì ì Í í Ò à Ó ó Ọ ọ Ù ù Ú ú Ṣ ṣ
-------------------------------------------------------------------------

Free Charis SIL Font

À à Á á È è É é Ẹ ẹ Ì ì Í í Ò à Ó ó Ọ ọ Ù ù Ú ú Ṣ ṣ
-------------------------------------------------------------------------

Free Gentium Font

À à Á á È è É é Ẹ ẹ Ì ì Í í Ò à Ó ó Ọ ọ Ù ù Ú ú Ṣ ṣ

-------------------------------------------------------------------------
 
Dr. Samuel Kayode Olamijulo
February 16, 2006
KONYIN  237
02-22-2006 12:02 PM ET (US)
Edited by author 02-22-2006 12:03 PM
The site below has perfect fonts for Yoruba

http://scripts.sil.org/cms/scripts/page.ph...nload#FontsDownload
Doig Simmonds  236
02-22-2006 11:09 AM ET (US)
I have designed a Yoruba font for System 8 (now obsolete) on the Mac. But I now need a Yoruba font with all diacritics for System 10.4.4 on the Mac. Can anybody point me in the right direction? <doig.simm@btinternet.com>
Isaac Jadesimi  235
02-02-2006 03:46 AM ET (US)
Yes, I am currently teaching an online course for Beginners, learning the Yoruba language.
I would be happy to send you some relevant information, relative to the course.
Initially, I would suggest that you contact the Registrar at the University of Cincinnati, (Ohio)------since that University is the Institution which is offering the course, and I was appointed to teach the course.
Sincerely,
Dr. Isaac Jadesimi
E-mail address: jades@tamcotec.com
Dr. Samuel Olamijulo  234
01-31-2006 05:13 PM ET (US)
SUBJECT: Yoruba Oduduwa Radio-YO Radio-new URL 02.01.06

YORUBA VERSION

Yoruba Oduduwa Redio -YO Redio - Adiresi Titun 02.01.06
 
A ki gbogbo omo Yoruba, Afirika ati awon ore wa kaakiri agbaye.

Adiresi titun fun Yoruba Oduduwa Redio - YO Redio (Redio iran Yoruba ni gbogbo agbaye) ni:
 
http://www.yoradio.org
 
E JOWO SAMI SI. E maa te botini LISTEN ni webusaiti Yoruba Oduduwa Redio-YO Redio lojojumo lati tunbo gbe itumo ati igbadun aye yin laruge.
 
A si tun be yin lati darapo mo egbe titun fun awon ololufe Yoruba Oduduwa Redio ni


 http://groups.yahoo.com/group/Yoruba_Oduduwa_Radio
 
lati le maa fi ikunlukun fun itesiwaju gbogbo omo Yoruba ati omo Afirika.
 
E jowo e se eto omoluwabi tiyin gege bi enikan lati fi iroyin ayo yi to Ebi, Ara, Ore ati Gbogbo Agbaye leti .

 Ire o. Lati owo:
 
 Igbimo Oludari Yoruba Oduduwa Redio - YO Redio
 
----------------------------------------------------------- -----
 
ENGLISH TRANSLATION
 
Yoruba Oduduwa Radio -YO Radio - new URL 02.01.06
 
Greetings to All Yoruba, African People and Friends in the Global Community.
 
The new URL for Yoruba Oduduwa Radio - YO Radio (Internet Radio for ALL Yoruba Global Community) is:
 
http://www.yoradio.org
 
PLEASE BOOKMARK . Click on the LISTEN button at this Yoruba Oduduwa Radio-YO Radio website homepage regularly to enhance the quality of your life.
 
You are warmly invited to join a new Yoruba Oduduwa Radio e-group and participate for progress at:
 
http://groups.yahoo.com/group/Yoruba_Oduduwa_Radio
 
Please do your own personal patriotic duty; share this happy news widely with your Family Members, Friends and the General Public.
 
Kind regards , from
 
Yoruba Oduduwa Radio -YO Radio -Working Group
Olalekan (Sir Lawie)  233
01-30-2006 02:49 PM ET (US)
Bawo ni, tangye_mcdaniel

Visit the YorubaWorld group at Yahoo. Access to the resource section of the group page is by registration; sign up, then make use of the interactive programs and the audio, textual and visual learning materials, also available in Yoruba-English and English-Yoruba.

The Links section of the group page lists Yoruba language web resources.

Also visit:

LEARN YORUBA (AUDIO & TEXTUAL): www.learnyoruba.com

AKOYE (VIDEO & AUDIO TUTORIALS): http://www.africa.uga.edu/Yoruba/

OMNIGLOT:
http://www.omniglot.com/writing/yoruba.htm

HOPEAFRICAEPUBLISHER.COM:

1.COMPREHENSIVE LISTS OF YORUBA HARDWARE & CONTENT DEVELOPERS
http://www.hopeafricaepublisher.com/yoruba...tributionlinks.html

WEB LINKS ON YORUBA LANGUAGE RESOURCES
http://www.hopeafricaepublisher.com/yoruba-digital.html



Thanks for your time.

Olalekan (Sir Lawie)
http://groups.yahoo.com/group/yorubaworld/
Awobajo  232
01-29-2006 07:26 PM ET (US)
Edited by author 01-29-2006 07:40 PM
Bawo ni,

I would like more information on 2. Yoruba language learning program (medium -> CDROM). My email address is tangye_mcdaniel@msn.com. My husband is Yoruba and I would like to be more efficient in it.


O se
Sir Lawie (YorubaWorld  231
01-24-2006 11:54 AM ET (US)
Bawo ni Mr. Ganiyu

I learnt that you are in need of Yoruba software. Could you please define your needs clearly, so we can know how to help you better?
Are you looking for language learning programs or application software to wordprocess your document in Yoruba?

SELECT FROM OPTIONS BELOW
1. Yoruba wordprocessing application.
2. Yoruba language learning program (medium -> CDROM)
3. Yoruba language learning program (medium -> PDA)
4. Yoruba language learning Program (medium -> Mobile Phone)
Kenny  230
01-20-2006 09:31 AM ET (US)
Sir/Ma,
I was informed that Yoruba Language Alphabets now has about 11 New Letters. Some have been scrapped off while others still remain.

If Yes or No, Pl's send as soon as possible to me at info_kenny@yahoo.com
KONYIN  229
01-16-2006 04:03 PM ET (US)
NITDA to promote Nigerian keyboard

Everest Amaefule, Abuja

The National Information Technology Development Agency has pledged to promote the study of Nigerian languages through the use of information technology.

Director-General, NITDA Prof. Cleopas Angaye, made the commitment on Friday in Abuja when a team from Lancor Management Limited, presented a special computer keyboard and software known as Konyin, that can write most Nigerian languages to the agency.

Angaye said the promotion of Nigerian languages through IT has become imperative at a time when several languages are becoming extinct because of generations that have been alienated by their mother tongue.

He congratulated the Lancor team for integrating both software and hardware to achieve the feat of the special Nigerian keyboard and pledged to promote the keyboard especially in the public sector.

Full Story... http://www.punchng.com/computer/article05

The PUNCH, Monday, January 16, 2006
ganiyu adeniran  228
12-29-2005 02:22 PM ET (US)
Dear Sirs,
   
  I will like to take out of the Keyboards and please help me to get the Software in Yoruba language.
  Thanks, best regards
  Adeniran.

Dr. Samuel Olamijulo  227
12-29-2005 01:55 PM ET (US)
Subject: Yoruba Internet Radio List by Olamijulo S.K.
   
  HOPE AFRICA E-PUBLISHER
   
  December 28, 2005
   
  Please share the link
   
  http://www.hopeafricaepublisher.com/yirlist1205.html
   
  with family, friends and others who might or should be interested.
   
  Thank you.
   
  Dr. Samuel Kayode Olamijulo
Dr. Samuel Olamijulo  226
12-29-2005 01:46 PM ET (US)
Re Krio Translator

Krio is a language with native speakers in Sierra Leone West Africa.

Krio has very large Yoruba language content.
The best Krio translators will be native Krio speakers from Sierra Leone.

I suggest that you contact Prof. Antonia Yetunde Schleicher of the University of Wisconsin
 
email: ayschlei@facstaff.wisc.edu

for more specific advice.

Regards.

Dr. Samuel Kayode Olamijulo
Sitaram  225
12-28-2005 10:33 PM ET (US)
I have a request to find someone to be a translator in Krio, which is a dialect of Yoruba.

Please e-mail voicesofafricaunited at yahoo dot com

or post at

http://www.voicesofafricaunited.myfreeforum.org
Abdulganiyu Adeniran  224
12-21-2005 05:27 AM ET (US)
Dear Sirs,

Please let me know if you can send Yoruba software to me, and the cost, because I'm not connected to the Internet yet.
Awaiting your reply.

Adeniran
KONYIN  223
12-20-2005 10:58 AM ET (US)
Can technology lead the way back?

Extract from Punch 12/20/2005

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
What was Stella’s Edo name?


While I appreciate the way and manner Nigerians mourned the sudden demise of the First Lady, Mrs. Stella Obasanjo, especially via the obituaries in the print and broadcast media, one thing that baffled me most is what appeared to be a conspiracy of silence over the late First Lady’s Edo name.

Even the Edo State Government and Stella’s own Abebe family couldn’t care less about using her given Edo name, as if her marriage to a Yoruba man obliterated her Edo ancestry.

This is one typical instance of how Nigerians relinquish their native names and identities for whatever reasons, preferring to take on foreign names and identities, such that it becomes virtually impossible to trace their origin at critical moments.

Nowadays, especially among Yorubas, it is considered crude to speak the native language, much less to give one’s children the traditional Yoruba names.

Children are given foreign names, taught to speak foreign languages and eat foreign (unhealthy) foods, to the point that the children become citizens of no world in the long run.

Someone said that if you want to annihilate a people, make them forget their language, religion and culture.

Most Nigerians are on their way to self-annihilation, and unless we make a U-turn for the better, it is only a matter of time before we lose our peculiar identity as indigenous peoples.

Olajide Laniyonu,
27, Aluko Street,
Felele, Ibadan,
Oyo State.

The PUNCH, Tuesday, December 20, 2005
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

KONYIN
KONYIN  222
11-06-2005 02:49 PM ET (US)
Here is an update to the conversation posted by Adé at /m220

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
KONYIN 187
 
11-06-2005 11:44 AM ET (US)
Edited by author 11-06-2005 01:22 PM
Hi Andrew,

Your passion for correct internationalization practices for website and web-content developers shows through in your latest write-up and, like I said before, neither I nor my company are experts in this arena. Your scripting suggestions are very informative and I hope that web developers reading your posting will use them effectively.

Our single focus is that if the input device hardware (keyboard) allows you to easily type the character, then the character can be used anywhere in general programming.

My question is will most of these scripting issues go away if we have websites and contents developed using default fonts, especially OpenType fonts, with complete features for GSUB and/or GPOS layout tables?

Some housekeeping issues I would like to address: KỌNYIN Multilingual keyboard is not a Nigerian keyboard layout project. The technology behind the 63 alphanumeric keycaps with 4 shift keys was created by LANCOR Technologies of Boston, MA, United States (http://www.lancorltd.com/Konyin.html). The hardware created by the company can be used for any combination of character sets regardless of the script, from Cyrillic, to syllabic, to Latin alphabets. As I said when explaining that the input/output relies on the Unicode standard, if the character has a code point in Unicode we can create a layout mapping in our driver for it. The keyboard driver compilation is language neutral.

We have used the LANCOR Multilingual Keyboard Technology (LMKT) to create a multilingual keyboard for the United States, which includes all Latin characters and tonal marks for English, Español, Deutsch, Français, Gaelic, Italiano, 'Ōlelo Hawai'i, Polski, Português, etc. The keyboard is currently on sale and is expected to be available in some major retail stores in the US in January (http://www.konyin.com). We are also in the final stages of releasing the KỌNYIN South American Multilingual Keyboard that will represent all the Latin characters and tonal marks for Spanish (Español), Portuguese (Português), English, Italian (Italiano), German (Deutsch), Quechua (Runasimi), Aymará, French (Français), Guaraní, Dutch and other Latin alphabetic languages.

Just thought it is very important to make the distinction. LANCOR Technologies is a division of Lagos Analysis Corporation (LANCOR) and the privately held company is owned by two US based Nigerians, which should explain why the LMKT was first used to create a multilingual keyboard for all Nigerian languages combined and why the product carries a Nigerian name. KỌNYIN in Yorùbá means “a drop of honey.”

So much for my rambling about our company, the issues of correct representation of extended and combining Latin characters in website and web content development, which is better understood by people like you, should be disseminated through information exchange. Have you thought about writing a book on this, something like “Proper techniques for developing internationalized website and contents for dummies?” I will buy it.

Adé G Oyegbọla

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
 Andrew 186
 
11-05-2005 11:25 PM ET (US)
 Thanks Adé.

I believe that your reply has answered my initial query, thank you. I now have a much better sense of how your product compares with others with respect to the character sequences generated. Considering the limitations Microsoft-related technologies place on keyboard layouts, the output scenario you describe for KỌNYIN is a practical approach that has avoided some of the problems other Nigerian keyboard layout projects sometimes have.

As you noted, this forum, much like a lot of other projects including web-based email, discussion boards, blogs and wikis, does hold traps for Nigerian languages. The in-house script you describe for converting to precomposed characters is useful. Within our projects we have similar scripts allowing us to convert to NFC or NFD and also to NCRs for HTML. Sometimes they prove to be useful.

WRT QuickTopic ... the server is identifying the character encoding as ISO-8859-1, so the only effective way of adding text in African languages is to add any character with a code point above U+00FF (decimal 255) as an NCR. Mozilla, Firefox and Opera handle fonts somewhat differently than IE. If a character is missing in IE, it will show the "missing glyph" glyph from the font being used. Current typographic convention is to display a square box or rectangle. Mozilla, Firefox and Opera on the other hand will instead change the font for that one character.
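
As a rough illustration of the NCR workaround described above, this Python sketch rewrites every character that ISO-8859-1 cannot encode as a numeric character reference. The function name `to_ncr` is only illustrative; the `xmlcharrefreplace` error handler it relies on is part of Python's standard codec machinery.

```python
def to_ncr(text: str) -> str:
    """Replace every character that ISO-8859-1 cannot encode with a decimal NCR."""
    # Characters up to U+00FF are encodable and pass through unchanged;
    # anything above is rewritten as &#NNNN; by the xmlcharrefreplace handler.
    return text.encode("latin-1", errors="xmlcharrefreplace").decode("latin-1")


print(to_ncr("Yor\u00f9b\u00e1"))  # ù and á exist in ISO-8859-1, so nothing changes
print(to_ncr("K\u1eccNYIN"))       # Ọ (U+1ECC) is written as &#7884;
```

Combining marks such as U+0301 are also above U+00FF, so they would be written as NCRs too; whether they then display well is the separate font question discussed in this message.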

Either approach results in problems with combining diacritics.

And if the website is using a font that has codepoints for the diacritics, but isn't an appropriate OpenType font, then the display will be sub-optimal.

The only successful way to render such text is with an appropriate OpenType font and an appropriate version of usp10.dll (i.e. use WinXP SP2), or alternatively to use a Graphite font and a Graphite-enabled version of Firefox (on any Windows or Linux), or appropriate solutions on the MacOS.

With Firefox and the URIid extension (https://addons.mozilla.org/extensions/moreinfo.php?id=563) it's possible to write website-specific CSS rules in the userContent.css file to override QuickTopic's default fonts.

In my userContent.css file I have the following rule:

body#www-quicktopic-com div.messagecell {font-size: 1.1em !important; font-family: Doulos SIL !important;}

This is a quick hack. The ideal situation would be to use web services that are well internationalized and meet the needs of African languages. Instead we have to work around implementations.

The chromEdit extension (http://cdn.mozdev.org/chromedit/) is a useful tool for editing the userContent.css file.

Likewise in Firefox and Mozilla it is possible to add CSS language pseudo-selectors to match a language and override the font of the website. This approach is useful, BUT only when web developers mark up the primary language of a website or mark up any change of language. Some do, some don't.

For example, you could add the rules:

*[lang|="ig"] {font-family: "Charis SIL",serif !important;}
:lang(ig) {font-family: "Charis SIL",serif !important;}

These CSS rules would kick in if the web developer had language tags in the web page that identified the page or part of the page as Igbo.

I have been working with the issues arising from multilingual web development, multilingual public access services, electronic multicultural library services and resource discoverability of multilingual government documents since late 1994. Things have got a lot easier over time. Unfortunately there are still issues that need resolving as African language web content increases.

Re the issue of getting the KỌNYIN Africa Multilingual keyboard: I tend to use multiple operating systems, Win98SE, Win2000, WinXP SP2, and we're also using Ubuntu Linux. When working with languages that use combining diacritics I predominantly use WinXP SP2. I've had mixed results on Ubuntu. I need to test on other Linux distros, but to date I've found that I get the best results in Linux if I use Graphite-enabled software (I haven't had a chance to apply Graphite patches to Pango/Gnome yet). Ubuntu doesn't seem to handle OpenType support for combining diacritics.

Andrew

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
KONYIN 185
 
11-04-2005 09:29 AM ET (US)
 Edited by author 11-04-2005 12:09 PM
Hello Andrew, /m184

Another good description and recap of issues relating to multilingual computing outside the “AmeriEuro” languages circle, I totally agree with all your observations, however, it is important to always distinguish between the input device itself, the translation functionality of the OS and font text rendering that is application dependent.

The question I answered arose because of the way you framed the issue in /m178
“In the case of KONYIN, KONYIN's approach is interesting, but like most Igbo/Nigerian input projects, KONYIN doesn't give any idea of what output their keyboard layout generates.”

I think you differentiated it better in your reply /m184 item 4: “rendering of text is OS specific and also specific OpenType, AAT or Graphite fonts (depending on what OS and what font rendering technology is being used).”

Other than this observation I really don’t have much to add to your recap of the issues relating to rendering and display of extended characters both precomposed and combining ones.

In the case of keyboard layout mapping, the correct use of Hex or Unicode code points is paramount (like the famous saying: garbage in, garbage out). KỌNYIN uses Unicode code points for its character mapping, both precomposed ones and combining diacritical marks. All the characters labeled on keycaps use precomposed Unicode code points in the mapping. You can only achieve characters with tonal marks by combining a base character with a tonal mark. All the tonal marks labeled on keycaps use combining diacritical Unicode code points. We believe this approach gives consistent and uniform results for text handling and rendering regardless of the application. The technology is fully reliant on the Unicode standard, which is why the keyboard driver is only designed for Windows 2000 or above. (We are presently in the final development of the Mac OS X and Linux versions.)
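
To make the precomposed-plus-combining scheme concrete, here is a small Python sketch using the standard `unicodedata` module. The three input strings are assumptions about what different input methods might emit for "ọ́" (o with dot below and acute tone); they are not taken from any particular driver. Unicode has no single precomposed code point for this letter, so even NFC output keeps a combining mark.

```python
import unicodedata

# Three plausible ways of ending up with the letter ọ́ (all hypothetical inputs):
base_plus_tone = "\u1ecd\u0301"   # precomposed ọ, then combining acute
all_combining = "o\u0323\u0301"   # o, combining dot below, combining acute
tone_first = "\u00f3\u0323"       # precomposed ó, then combining dot below

variants = [base_plus_tone, all_combining, tone_first]

# The raw strings differ from one another...
raw_equal = len(set(variants)) == 1

# ...but after NFC normalization they collapse to one sequence.
normalized = {unicodedata.normalize("NFC", v) for v in variants}

print("raw strings identical:", raw_equal)
print("distinct NFC forms:", len(normalized))
print("NFC code points:", [f"U+{ord(c):04X}" for c in normalized.pop()])
```

The point of the sketch is only that the precomposed base letter followed by a combining tone mark is already the NFC form for such letters, while other typing orders need normalizing to reach it.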

For our own in-house uses, we do have a script that maps any combining character to a corresponding precomposed glyph for internet communication purposes, like typing into this quicktopic portal. As you know, this portal does not translate combining characters properly and the default font does not include all extended Latin characters, so when you see “Adé” in my typing it is actually in Unicode code point lingo as U+0041, U+0064 & U+00E9; even though what I typed using the keyboard originally is U+0041, U+0064, U+0065 & U+0301.
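
The in-house script itself is not shown, so the following is only a guess at the general idea, written in Python with the standard `unicodedata` module: NFC normalization turns the typed sequence for "Adé" into the precomposed one quoted above. The helper name `as_code_points` is made up for this illustration.

```python
import unicodedata


def as_code_points(text: str) -> str:
    """Spell a string out in U+XXXX notation."""
    return ", ".join(f"U+{ord(ch):04X}" for ch in text)


typed = "Ade\u0301"                           # what the keyboard produced
posted = unicodedata.normalize("NFC", typed)  # what gets pasted into the forum

print(as_code_points(typed))   # U+0041, U+0064, U+0065, U+0301
print(as_code_points(posted))  # U+0041, U+0064, U+00E9
```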

Like you said, the answer is proper knowledge of scripting by web developers and better application development techniques by application programmers.

For Ìgbo and other tonal languages, the ability to freely use tonal marks is mandatory; and an input device without these capabilities will not be universally accepted. Trust me, we know, we spent over 7 years to arrive at that conclusion.

On the issue of getting KỌNYIN Africa Multilingual keyboard, we are presently waiting for final agreement of the characters to be included from our team of linguists. This is due before the end of this month. Currently the driver is for Windows 2000 or above, but we expect the beta for Mac OS X and Linux to be ready before the end of this year. So if you like to wait for the Linux version, we should begin shipping in February.

Que sera sera

Adé

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Andrew 184
 
11-03-2005 06:28 PM ET (US)
 Hi Adé,

Good to hear from you. Thank you for your response.

1) I'd be interested in the possibility of getting KỌNYIN to work with SCIM or even KMFL on the Linux platform, esp. the African keyboard under development.

2) re /m183

I wasn't asking a question. Just raising the issue that people may need to be aware of further down the track:

1) currently different input solutions for Igbo do produce different output. One input solution (NITDA) does produce invalid sequences.

2) most applications DO NOT normalize text, regardless of whether they should. You mention Dreamweaver (one of the few tools that does have an option to normalize text). Unfortunately Dreamweaver is an exception rather than the norm.

3) major web services like Google and others do not normalize data, despite Unicode and W3C recommendations that NFC data should be used on the web.

Web authors are the ones who should be responsible for normalizing their web pages, and currently most web tools don't allow normalization. But then a lot of commonly used HTML and XML editors can't handle Unicode or have broken Unicode implementations.

4) rendering of text is OS specific and also specific OpenType, AAT or Graphite fonts (depending on what OS and what font rendering technology is being used).

You say
(E.g. if you use one keyboard to type "Ẹ" it may look different using another keyboard)

I'm not discussing whether the output appears the same; that's a font and font rendering issue. I'm discussing which Unicode characters are in the data.

As I said before I don't care what input solution people use to type. They can use any keyboard driver or keyboard they like.

As you indicate, it is an issue for content developers and authors. Therefore content developers and authors need to know what their tools do or do not do. They need to be aware that different keyboard drivers will produce different character sequences. They need to know that their web pages may be searchable through Google with certain drivers and not with others; this is mainly a problem for people who use tones. They need to be aware that they should normalize their pages or their data. They can do this with Dreamweaver or they can do it with other tools. Generally most websites are scripting language and database driven, so it usually should be done in the scripts before data is saved and before a search is conducted.
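
A minimal Python sketch of the discoverability problem and the suggested fix, assuming a site that stores text exactly as typed. The stored string and the query are invented test data; the only library call is the standard `unicodedata.normalize`.

```python
import unicodedata


def nfc(text: str) -> str:
    """Normalize to NFC, as recommended for web content."""
    return unicodedata.normalize("NFC", text)


# Stored by an author whose keyboard emits combining marks only (hypothetical).
stored = "Ade\u0301 G Oyegbo\u0323la"
# Typed by a searcher whose keyboard emits a precomposed ọ (hypothetical).
query = "Oyegb\u1ecdla"

naive_hit = query in stored                # compares raw code point sequences
normalized_hit = nfc(query) in nfc(stored) # compares after normalizing both sides

print("naive search finds it:", naive_hit)
print("normalized search finds it:", normalized_hit)
```

Normalizing once when data is saved and again when a query arrives means neither the author nor the searcher has to know which sequences their keyboard produces.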

People creating Word documents, converting them to correctly generated Unicode PDF files and uploading them to the web need to know that there are discoverability issues.

It will also impact spell-checking in word processors. The issue has already been seen in some Microsoft Office proofing tools for languages for which Microsoft uses combining diacritics.

Passing off the issue as a purely application or developer issue ignores the fact that most applications and web services do not normalize data. It ignores the fact that the content developers and authors need to be aware of the issue. It ignores the fact that end users will have resource discoverability issues.

NITDA is a bad example here since their keyboard actually uses deprecated characters, i.e. some of the characters it uses are the wrong characters (characters that the Unicode standard says should NOT be used).

Yes, in theory, I'd agree with your statement that "the action between an input text and what is displayed in the application or internet browser is not directly related to the keyboard used".

This assumes that application developers have done the right thing. But since most Latin script languages only use precomposed characters, and the need for normalization is only critical for some African, Central Asian, and a handful of South East Asian languages, which most developers are unaware of ....

You end up with a situation where very few software developers are aware of the need for normalization in their tools.

The whole point of my replies, isn't really about the relative merits of different keyboards or keyboard drivers (except for NITDA who really DO need to fix those flaws in their keyboard layout).

The issue I have is that end users, esp. people who will be developing content for the web, whether it's X/HTML, XML, RTF, DOC, PDF, text or some other format, need to know that

1) different character sequences may be produced by different input solutions (most notably with tones). I believe it is useful for a developer or author to know which format their favourite input solution uses, since this will inform their need to normalize data (if they are producing content only). If their site also collects data or allows searching, normalization should always occur.

More importantly, they need to be aware of the issue. They need to be aware that they need to take normalization issues into account, because other services or applications, including web browsers, do not.

2) a small number of tools can normalize data. You've mentioned Dreamweaver already. SC Unipad can convert data. SIL produce an encoding converter called TECKit which can be used to normalize a Microsoft Word document. If you use Perl, ASP, ASP.Net, VB.Net or Python there are ways of doing it as well.

Ultimately, there needs to be a guide containing information on web development issues when the content is in African languages.

Most tutorials and books on web development are from an anglocentric or eurocentric perspective, and rarely touch web internationalization issues.

It's rare to find a book or tutorial on creating web sites that mentions the word normalization, or why it's important or critical for some languages.

Andrew
Dr. Samuel Olamijulo  221
11-04-2005 08:13 AM ET (US)
SUBJECT: Yoruba and African Languages Display by Computers and Internet Applications.
 
Please permit me, as a lay but very interested IT user, to commend the article below by Compatriot Adegbola, KONYIN CEO. Everyone of us must continue to work hard and use every opportunity at every level to improve the display of Yoruba and other African languages by Computer and especially Internet Applications.
 
Olodumare a fun wa se o - Amin
 
Thank you.
 
Dr. Samuel Kayode Olamijulo
KONYIN  220
11-03-2005 08:18 AM ET (US)
Edited by author 11-03-2005 10:00 AM
The question I am asked to answer is: Is there a direct relationship between the text input using a specific input device and the text displayed in an application?

(E.g. if you use one keyboard to type “Ẹ” it may look different using another keyboard)

My answer is: NO!

This is like asking a person, how do you cook Ọgbọnọ with goat meat? Of course, you cannot, but you can use goat meat as part of the ingredients in Ọgbọnọ, if you want.

Please follow the roadmap to my answer.

1. Input Device: the most common input devices are the keyboard and mouse, and I will reference the keyboard in the physical and virtual sense. The keyboard is an input device and nothing more. When a user presses a key on the keyboard, the keyboard sends a scan code to the attached computer’s operating system (“OS”) (e.g. the user presses the key with letter U on the keyboard and, let's say, the scan code for the key is 0x44).

2. Operating System: when the OS receives the scan code from the keyboard, it will use the corresponding keyboard driver, which usually includes a keyboard layout mapping, to translate the scan code. The translated value can be expressed either as a hex value or as a Unicode code point. (E.g. the Unicode value for Latin small letter “u” is U+0075 and the hex value is 0x75.)

So, if the keyboard layout mapping was done in Unicode, the OS will translate the scan code 0x44 to mean U+0075. (I assume here that we are only dealing with a Unicode-compliant OS and I will not delve into codepages.) The OS now sends the new code to the open software application (e.g. MS Word, WordPerfect, FrontPage, PageMaker, Dreamweaver, etc.).

3. Applications: all applications use fonts to render and display text values received from the OS. When an application receives a text value from the OS (in this case U+0075) it will use the active font available to render and display the glyph representing the text value received. If an application does not recognize the text value, it will return a “?” mark value to the font for display. In this case the glyph for U+0075 is “u”, and if the font does not recognize the text value it will display a box.

That in short version is the roadmap from text input by a keyboard to text display by an application.
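
The roadmap above can be sketched as a toy pipeline in Python. Everything in it is hypothetical: the scan codes, the layout table and the font coverage are invented for illustration (0x44 is simply the made-up scan code used in the text), and real drivers and renderers are far more involved.

```python
# Hypothetical values throughout: real scan codes, layouts and font coverage
# depend on the hardware, the driver and the installed fonts.
LAYOUT = {
    0x44: "\u0075",  # key labelled U -> LATIN SMALL LETTER U
    0x45: "\u1eb9",  # key labelled with e-dot-below -> U+1EB9
}
FONT_GLYPHS = {"\u0075"}  # characters the (hypothetical) active font can draw
MISSING_GLYPH = "\u25a1"  # the box shown when the font lacks a glyph


def driver_translate(scan_code: int) -> str:
    """Step 2: the keyboard driver maps a scan code to a character."""
    return LAYOUT[scan_code]


def application_display(char: str) -> str:
    """Step 3: the application draws the glyph, or a box if the font has none."""
    return char if char in FONT_GLYPHS else MISSING_GLYPH


for scan_code in (0x44, 0x45):
    char = driver_translate(scan_code)
    shown = application_display(char)
    print(f"scan 0x{scan_code:02X} -> U+{ord(char):04X} -> {shown}")
```

In this toy model the character handed to the application is fixed by the layout table, while what appears on screen depends on the font; the two can be inspected separately.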

I answer the question to address Andrew’s frequent reference to “what an output a keyboard layout generates…” The answer is it depends on the OS, the application and font in use.

The main issue of character normalization and NFC or NFD is purely application to internet browser relations and has no direct relevance to the type of input device used.

Example: Dreamweaver web development application is fully Unicode compatible and it will properly display both precomposed and combining characters, but developers have to know to set the preferences specifically for internet browsers. Also, web developers have to do additional scripting to account for disparate treatment of text values by all the internet browsers. (My text example “u” is U+0075 in Unicode, 0x75 in hex and &#x75; in HTML (hex).) Most browsers use language codepages to support text translation, but most extended Latin characters are not contained in any codepage. This will also be true for combining diacritics, so while browsers like Explorer can handle combining glyphs, Firefox cannot. A smart web developer has to write scripts to account for Firefox web browser shortcomings (normalization of text display across internet browsers). The answer to this problem is standardization under Unicode; if all the browsers are fully Unicode compliant, I think the problem will go away. (Not an expert in this area.)
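
For reference, the three notations for the example letter can be produced with a few lines of Python. This is only a convenience sketch; the function name `representations` is invented here.

```python
def representations(char: str) -> dict:
    """Return common notations for a single character's code point."""
    cp = ord(char)
    return {
        "unicode": f"U+{cp:04X}",    # Unicode notation, at least four hex digits
        "hex": f"0x{cp:X}",          # plain hexadecimal value
        "html_hex": f"&#x{cp:X};",   # hexadecimal NCR for HTML
        "html_dec": f"&#{cp};",      # decimal NCR for HTML
    }


for name, value in representations("u").items():
    print(f"{name}: {value}")
```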

In conclusion, I say the action between an input text and what is displayed in the application or internet browser is not directly related to the keyboard used. Users will always get what the keyboard layout is mapped to give the application, what a user sees in the application is a different matter. (See NITDA)

Adé G. Oyegbọla
Co-President/CEO
Lagos Analysis Corporation
http://www.lancorltd.com
Paradigm International  219
10-03-2005 11:04 AM ET (US)
Edited by author 10-03-2005 11:05 AM
Dear Ganiyu Adeniran:

Our Paradigm Lingua software is for content creation, especially in African languages. One may produce a document with different paragraphs in different languages using the standard keyboard. You don't have to switch layouts or memorize several things; it is VERY simple to use. Please download the evaluation version from http://www.paradigmint.net/lingua.htm. It is a STAND ALONE word processor that also assists with translation.


Don:

We are yet to localize the Lingua Interface as that was not the initial priority.
BisharatNetPerson was signed in when posted  218
10-03-2005 10:06 AM ET (US)
FYI, from http://www.conferencealerts.com/seeconf.mv?q=ca1h8600 . I have not been able to access the site so don't have any details on topics to be covered. DZO

LANGUAGE,CULTURE AND GLOBALIZATION
21 to 24 November 2005
Owerri, Imo State, Nigeria

Website: http://www.apnilac.4t.com
Contact name: ANOPUE CALISTUS CUSSONS
E-mail: callycussons_AT_yahoo.com (to e-mail the conference organizers, please
replace _AT_ with @)

Organized by: ASSOCIATION FOR PROMOTING NIGERIAN LANGUAGES AND CULTURE.
Deadline for abstracts/proposals: 20 October 2005 (Check the event website for latest details.)
   217
09-30-2005 01:06 PM ET (US)
Deleted by topic administrator 10-03-2005 10:04 AM
KONYIN  216
09-26-2005 06:48 PM ET (US)
Edited by author 09-26-2005 06:49 PM
/m215 Says: (Ọpẹ ni fun Ọlọhun Ọba ti o da gbogbo ede (language) mo ni fẹ lati ra software ti o jẹ ede Yorùbá pẹlu Keyboards, fonts ati applications ti oba seese lati ri
Mo nreti esi. Emi ni Ọmọ Iya yin ni Ede Yorùbá.)

FONTS: If you are using Windows OS 2000 or above, you will not need to buy fonts separately, there are fonts included in Office products that you can use. If you want more font varieties, there are free fonts like: Gentium, NITDA, Ariya and Code2000.

KEYBOARDS: KỌNYIN Nigeria Multilingual keyboard is the only physical keyboard in the market that you can buy for Yorùbá typing. (http://www.konyin.com) However, there are some keyboard layouts that you can download for free (http://www.africanportal.net/Publications/ABD/mktut1.htm, http://www.nitda.gov.ng/projects/kbd/index.php, and http://tavultesoft.com/keyman/downloads/keyboards/index.php ) just to name a few. With these keyboard layouts you can use your existing physical keyboard by adding some labels to identify Yorùbá alphabet locations and key commands.

APPLICATIONS: There are many software applications for learning Yorùbá on the internet; you have to be more specific about what kind of applications you are looking for.

Good Luck
Adé
Abdu Ganiyu Adeniran  215
09-26-2005 06:21 PM ET (US)
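Ope ni fun OLohun Oba ti o da gbogbo ede (language)mo ni fe lati ra software ti o je ede Yoruba pelu Keyboards, fonts ati applications ti oba seese lati ri

Mo nreti esi.

Emi ni Omo Iya yin ni Ede Yoruba.

[English: Thanks be to God the King, who created all languages. I would like to buy software in the Yoruba language, along with keyboards, fonts and applications, if they can be found. I await a reply. I am your brother in the Yoruba language.]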
KONYIN  214
09-26-2005 05:47 PM ET (US)
Don asks, are there any software commands translated into Yorùbá? /m206

The First step to having most commands translated into Yorùbá by software developers in Windows OS environment is getting Microsoft to create a National Language Support (NLS) for Yorùbá.

The Microsoft Language Enabling Pack has the template and forms required to submit a request for Yorùbá NLS. To get more information about this subject go to http://www.microsoft.com/globaldev

Our experience with Microsoft is that the effort to create an NLS for a language is better undertaken by a Government organization (in Nigeria, NITDA or similar institutions).

Adé Oyegbọla
http://www.konyin.com
KONYIN  213
09-20-2005 09:52 PM ET (US)
Edited by author 09-20-2005 09:55 PM
Dear Andrew, /m212

Simply type KONYIN and Agfa in your search engine and you will get a copy of the press release.

http://www.businesswire.com/webbox/bw.050802/221282121.htm

Why don't you get a copy of the keyboard to find out, test it and report your experience to all.

LANCOR Technologies recommends the following fonts as best for your daily computing:
1. Microsoft Sans Serif
2. Arial Unicode MS
3. Tahoma
4. Gentium

These fonts have been tested and have all the alphabets, tonal marks and currency symbols needed for Nigerian languages.

Adé
Andrew  212
09-20-2005 09:44 PM ET (US)
Dear Adé,

Thank you for your correction. My apologies. I misunderstood.

WRT KONYIN output, is it NFC, NFD or some hybrid output?

I couldn't access the Monotype link, although considering it's a 2002 PR, and as recently as this year Monotype Imaging staff have indicated that Monotype themselves do not currently produce OpenType fonts supporting combining diacritics (i.e. fonts suitable for Yoruba), I'm not sure why I'd be looking at that.
KONYIN  211
09-20-2005 09:32 PM ET (US)
If your kids speak only English and you want to teach them Yorùbá using your PC, there is only one keyboard in the world that gives you all Yorùbá alphabets and all English alphabets on a single layout for easy direct access typing.

The keyboard is KỌNYIN Nigeria Multilingual Physical Keyboard made by LANCOR Technologies of Boston, MA (http://www.konyin.com)

KỌNYIN Multilingual Keyboard
KONYIN  210
09-20-2005 09:26 PM ET (US)
Dear Andrew,

KỌNYIN was designed and developed in fully Unicode compliant environment from the beginning dating back to June 1998. See Agfa Monotype Press release (http://www.monotypeimaging.com/about/pr_di...asp?year=2002&pr=98) (Messages /m183, /m182, /m179, /m177)

There seems to be a real misconception out there that our product is similar to some virtual keyboard layouts created using MSKLC. Our technology is patented and far superior to Microsoft keyboard layout creator (MSKLC).

You should get a copy of the physical keyboard and experience it before you make any judgment or continue to make references to the product.

Thanks
Adé
Andrew  209
09-20-2005 06:15 PM ET (US)
Thanks Adé,

for the update in /m208. It's good to hear that Konyin made the transition to Unicode.

Andj.
KONYIN  208
09-20-2005 04:49 PM ET (US)
Edited by author 09-20-2005 05:04 PM
Please take notice that the information provided in Andrew's post /m207 is incorrect with regard to KỌNYIN.

KỌNYIN Multilingual Keyboard is fully Unicode compliant and the link to the website is (http://www.konyin.com)

It is also very important to take notice that KỌNYIN Multilingual keyboards are physical keyboards. This distinction is very important and differentiates our product from the virtual keyboard layouts that are uni-lingual (ADB & Alt-I) or tri-lingual (NITDA) and are built with kbd compilers using Microsoft Keyboard Layout Creator. (See the Press Release below)

04-22-2005 (Boston) LANCOR Technologies, a division of Lagos Analysis Corporation today announced the release of its KỌNYIN Physical Multilingual Keyboard - The First Truly Complete Computer Keyboard designed to accommodate combined character-sets for Multiple Language Groups on a single keyboard layout using LANCOR Technologies’ multi-function enabled kbd driver. The new physical keyboard is based on the QWERTY layout with additional spaces for twenty-six (26) alphabets and twenty-one (21) combining diacritical marks from any combination of languages.

The new multi-function enabled kbd driver technology is particularly appropriate for most countries with more than one Official Language, and Institutions that deal in multiple languages at the same time. The new kbd driver technology does away with the need to switch keyboard layouts from one language to another during regular typing.

Most multilingual input devices in the market today use the virtual multi-layout keyboard approach which allows users to switch from one language layout to another in order to access character-sets for various languages. These virtual keyboard layouts cannot be considered as true multilingual input devices, because users can only input one or two language character-sets at a time; and users have to understand and remember multiple combination keystrokes and shortcuts during typing. These problems have been eliminated with the introduction of the first multi-function enabled single-layout multilingual physical keyboard from LANCOR Technologies. Simply put, KỌNYIN multilingual keyboard users do not have to switch keyboard layout to type in any language. The keyboard truly does not change how users type today, and does not use the "dead key" typing process.

Thanks
Adé G Oyegbọla
KỌNYIN Multilingual Keyboard
Andrew  207
08-28-2005 09:20 PM ET (US)
Edited by author 08-28-2005 09:28 PM
re /m206

This is from memory, so I could be wrong:

NITDA (http://www.nitda.gov.ng/projects/kbd/index.php): Unicode based, but originally had some invalid Unicode character sequences (i.e. used deprecated characters). Haven't tested recently.

Alt-I (http://www.alt-i.org/projects.htm) has two alternative keyboard layouts that replace Latin characters not used by Yoruba. Not sure what output they produce. I suspect, from the keyboard layout, that they were developed with MSKLC, so it's likely that they're using single characters for sub-dotted characters and combining diacritics for tones. Just a guess though. Non-NFC, Non-NFD. Optimised for typing in Yoruba (based on character frequency). If I'm right about its output, then this keyboard arrangement would need to use OpenType fonts. Haven't tested it.

Konyin (http://www.africanportal.net/Publications/ABD/mktut1.htm) took a non Unicode approach.

ADB: also used MSKLC to create a keyboard layout, although they take a different approach to Alt-I. Rather than using deadkeys or diacritic keys, all extended Latin Yoruba letters exist as single AltGr keystrokes. It would appear that they use precomposed characters for all letters with tones, and a combining diacritic for the sub-dotted letters. Non-NFC, Non-NFD. Qwerty based layout with additional characters on AltGr key sequences. I suspect that this character output arrangement was an attempt to get Yoruba output from non-OpenType fonts.

Open Road (my stuff): an old experiment created for testing purposes. Uses Keyman. Versions for creating NFC or NFD output. Most of our more sophisticated and recent layouts have been for Igbo and Sudanese languages, rather than Yoruba.

NITDA, Alt-I and ADB all produce different character sequences. Alt-I also produces a spell checker for MS Word, which would be incompatible with the NITDA and ADB keyboard layouts.
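
To make the "Non-NFC, Non-NFD" labels concrete, here is a minimal Python sketch using the standard unicodedata module. The two sample strings are only reconstructions of the output guessed at above for Alt-I and ADB (neither was tested), and the function name is made up for this example.

```python
import unicodedata

def normalization_status(text):
    """Report which of NFC / NFD a string already conforms to."""
    return {
        "NFC": unicodedata.normalize("NFC", text) == text,
        "NFD": unicodedata.normalize("NFD", text) == text,
    }

# Guessed Alt-I style: precomposed sub-dotted letter + combining tone mark,
# and a plain vowel + combining tone mark.
alt_i_style = "\u1EB9\u0301" + "e\u0301"

# Guessed ADB style: precomposed toned letter + combining dot below,
# and a precomposed toned plain vowel.
adb_style = "\u00E9\u0323" + "\u00E9"

print(normalization_status(alt_i_style))  # neither form
print(normalization_status(adb_style))    # neither form

# The two byte-different strings are the same text once normalized.
same_after_nfc = (
    unicodedata.normalize("NFC", alt_i_style)
    == unicodedata.normalize("NFC", adb_style)
)
print(same_after_nfc)
```

If the guesses are right, a spell checker built against one keyboard's raw output would fail to match text from the other unless both sides are normalized first.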

re /m205:

I tend to work with multiple languages and with web services. For a long time the preferred normalization form for the web has been NFC. So it's easiest for me to create NFC output; I also create NFD versions for use by libraries. MARC21 tends to use decomposed character sequences and most major library management systems follow suit.

The character order you specify in /m205 is the only real practical way of creating a keyboard using MSKLC, due to its limitations. I.e. to make a simple, efficient keyboard design for Yoruba it isn't possible to create an NFC layout with MSKLC. You could create a fully decomposed version, but you'd have no control over whether a user typed a tone first or a sub_dot first. Different typists may type different character sequences.

Keyman is capable of more control. It is possible to guarantee NFC or NFD output.

Really comes down to what you want to do with the data. If I was developing a Yoruba web service, I'd put in a normalization layer and normalize all data to either NFC or NFD based on the needs of the web service being developed, since different keyboard solutions for Yoruba would produce different character sequences and cause problems when processing data.
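
A rough idea of what such a normalization layer might look like, as a minimal Python sketch. The class and method names are invented for illustration; a real web service would apply the same step wherever text enters or is looked up.

```python
import unicodedata

class NormalizingStore:
    """Toy key-value store that normalizes every key on the way in and out."""

    def __init__(self, form="NFC"):
        self.form = form      # "NFC" or "NFD", chosen per service
        self._data = {}

    def _key(self, text):
        return unicodedata.normalize(self.form, text)

    def put(self, key, value):
        self._data[self._key(key)] = value

    def get(self, key, default=None):
        return self._data.get(self._key(key), default)

store = NormalizingStore("NFC")

# Stored from a keyboard that emits accented e + combining dot below ...
store.put("\u00E9\u0323", "entry typed on one keyboard")

# ... and looked up from a keyboard that emits e + dot below + acute.
found = store.get("e\u0323\u0301")
print(found)
```

Without the `_key` step the lookup would miss, because the two keyboards produce different code point sequences for the same letter.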

If I was going to make a Yoruba keyboard today, I'd be tempted to base one on the Alt-I keyboard arrangements and develop a Keyman layout that I could use with Keyman on Windows, or with KMFL and SCIM on Linux. I'd force output to NFC or NFD. Probably NFC.

Should be all of 5 minutes work.

At least that would be my preference at the moment.

Depending on the needs sequence-checking could be built in.

Just my 2 cents worth.

Andrew
BisharatNetPerson was signed in when posted  206
08-28-2005 08:37 PM ET (US)
Am I correct that none of the currently available software that facilitates input of Yoruba (e.g., Konyin, Paradigm) has the commands translated into Yoruba?

Has anyone done an independent evaluation/comparison of these and the various proposed keyboards (Alt-I, NITDA, ABD, OpenRoad [Andrew's])? There has been some interesting discussion on this board and elsewhere (see for example http://lists.kabissa.org/lists/archives/pu...forum/msg00233.html), but to my knowledge nothing where all have been discussed together.

TIA...

Don Osborn
Bisharat.net
BisharatNetPerson was signed in when posted  205
08-28-2005 08:21 PM ET (US)
Edited by author 08-28-2005 08:28 PM
Hi Andrew. I've a question whether one should use composed (NFC) forms for tone marks over vowels at all. For the sake of argument, if you have, say "e with acute accent" (acute accent marking a high or rising tone) and "e dot-below with acute accent," does it make sense to use the NFC for the first when the NFC for the latter adds the accent as a combining diacritic? Note that I'm not arguing that the "e dot-below with acute accent" should be e-with-acute-accent + dot-below, just wondering about consistency.

Re /m203, by the same reasoning, for the NFD (decomposed) combination of "e with vertical line and acute accent," should it be e + vertical_line_below + accent?

I.e., it seems to make sense that all tone marks be handled as combining diacritics (decomposed characters) added on. On the other hand, common usage will certainly continue to rely on the precomposed simple vowels plus accents where they are available.

So basically I guess the issue comes down to whether it matters, or if there can be algorithms or whatever that treat proper NFC and NFD forms equally (and for that matter handle or change the incorrect forms).
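
On whether an algorithm can treat the forms equally: Unicode normalization is exactly that kind of algorithm. A minimal Python sketch (the helper name is made up for this example) that compares strings by canonical equivalence rather than by raw code points:

```python
import unicodedata

def canonically_equal(a, b):
    """True if two strings are canonically equivalent (same text, any form)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

precomposed_e_acute = "\u00E9"          # e with acute, one code point
decomposed_e_acute = "e\u0301"          # e + combining acute
dotted_e_acute = "\u1EB9\u0301"         # dotted e + combining acute
wrong_order = "e\u0301\u0323"           # e + acute + dot below

print(precomposed_e_acute == decomposed_e_acute)                  # raw comparison
print(canonically_equal(precomposed_e_acute, decomposed_e_acute))
print(canonically_equal(dotted_e_acute, wrong_order))
print(canonically_equal(precomposed_e_acute, dotted_e_acute))     # different letters
```

So the mixed appearance of NFC (precomposed for plain vowels, combining accent for dotted ones) is only a storage detail; software that normalizes before comparing sees all the variants as the same text.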

Don Osborn
Bisharat.net
Andrew  204
08-15-2005 07:07 PM ET (US)
At one stage I created a table for either cc or TECKit to convert NFC (dot-below) to/from NFC (vertical-line-below). I'll have to have a look through my old files to see if I still have a copy of it.
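
For anyone without cc or TECKit to hand, the same conversion can be approximated in a few lines of Python. This is not the table mentioned above, just a sketch of the idea with made-up function names: decompose, swap the under-mark, recompose. It assumes every dot below in the text is a Yoruba sub-dot.

```python
import unicodedata

DOT_BELOW = "\u0323"
VERTICAL_LINE_BELOW = "\u0329"

def swap_under_mark(text, old, new):
    """Decompose, swap one combining under-mark for another, recompose to NFC."""
    decomposed = unicodedata.normalize("NFD", text)
    return unicodedata.normalize("NFC", decomposed.replace(old, new))

def dot_to_line(text):
    return swap_under_mark(text, DOT_BELOW, VERTICAL_LINE_BELOW)

def line_to_dot(text):
    return swap_under_mark(text, VERTICAL_LINE_BELOW, DOT_BELOW)

word = "\u1EB9\u0301"            # NFC, dot-below convention
converted = dot_to_line(word)    # NFC, vertical-line-below convention
round_trip = line_to_dot(converted)

print([f"U+{ord(c):04X}" for c in converted])
print(round_trip == word)
```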

Andrew
Andrew  203
08-13-2005 08:33 AM ET (US)
re /m200

yes, 1) is the correct NFD and 3) is the correct NFC (if you use the dot below).

If you use the vertical line below on the other hand:

NFD would be e + vertical_line_below + accent

while NFC would be e-accent + vertical_line_below (since there is no precomposed form with e and a vertical line below, so e and the accent would combine).
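
This is easy to confirm with Python's standard unicodedata module (a quick sketch; the helper name is just for this example):

```python
import unicodedata

def codepoints(text):
    """Return a readable 'U+XXXX U+XXXX' listing of a string."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)

# e + acute accent + vertical line below, typed accent-first
typed = "\u0065\u0301\u0329"

nfd = unicodedata.normalize("NFD", typed)
nfc = unicodedata.normalize("NFC", typed)

print("NFD:", codepoints(nfd))   # base, then the below-mark, then the accent
print("NFC:", codepoints(nfc))   # e and accent combine, line below stays separate
```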

Andrew
BisharatNetPerson was signed in when posted  202
08-12-2005 08:38 AM ET (US)
Edited by author 08-12-2005 09:00 AM
Mike, thanks for your messages and for the thanks. Actually though, with this board now having over 200 messages, if you look through them my contributions have not been the most important. There are a lot of people to thank even if there are, as your questions indicate, some issues yet to resolve.

The question you raise is similar to one discussed in messages /m31 and following, and /m53 and following. But it is good to raise the issue again. Personally I like the option 3 (precomposed dot-unders plus combining diacritic tone marks). My thought (not as well reasoned as you have put it) is that it makes sense to treat tone marks as add-ons that may not always be necessary. The first two decomposed options seem like a lot of unnecessary work.

However, there is the issue of the alternate form of small-vertical-line under, or bar as you put it, for which there is no precomposed anything.

We may be talking about two standards for input/composition of Yoruba text. Or perhaps algorithms that would take whatever bunch of codes is thrown at it (selections 1-4 in your example, or the simpler case of composed characters with no dot under but with accents (tone marks) or decomposed versions; or the s subdot precomposed or decomposed) and interpret them in one (or two if the small vertical line is used) standard ways.

Ideally NITDA or some authority on Yoruba language would take the lead in making such choices regarding handling of Yoruba text in Unicode fonts. But in the absence of that, perhaps you have to make the best decision possible.

Perhaps others have other ideas?

Don Osborn
Bisharat.net
Mike Maxwell  201
08-11-2005 03:28 PM ET (US)
Looks like my post got cut off. I'm assuming that's due to a size limit, so here's the second half of the posting (with a little overlap):
First, I believe (1) is the correct Unicode Normalization Form D (= Decomposition) (and therefore (2) is the INcorrect decomposition), because:
Dot under: U+0323, combining class 220
Acute accent: U+0301, combining class 230
   (again, there are other acutes, but they are definitely
   the wrong characters)
and 220 < 230.

(And similarly for the bar under and grave accent, which are
respectively parallel to the dot under and acute accent:
Bar under: U+0329, combining class 220
   (there is another bar under in the IPA area, U+02CC, but it
   has a combining class of zero, and is the wrong character
   to use for Yoruba)
Grave accent: U+0300, combining class 230
   (again, there are other graves, but they are definitely
   the wrong characters)
)

Given that (1) is the correct decomposition, that implies that (3) is the correct Unicode Normalization Form C (= Composition), and (4) is INcorrect.

Does anyone see any holes in my reasoning?
--
 Mike Maxwell
 Linguistic Data Consortium
 maxwell@ldc.upenn.edu
Mike Maxwell  200
08-11-2005 03:11 PM ET (US)
I'm converting Yoruba text that uses a hacked font into Unicode, and I have a question concerning normalization (specifically, Unicode normalization form C = Composition). There was some previous discussion of this on the kabissa list, see
http://lists.kabissa.org/lists/archives/pu...forum/msg00144.html.
In Yoruba, as you all know there are two vowels that use a dot under (or a vertical bar, but for the moment I'm using the dot under). These can also occur with acute or grave accent, as can the plain (dotless) vowels.
Unicode includes the vowels (in both upper and lower case) with the dot under (and also the 's' with the dot under), and it also includes vowels with either acute or grave accent. But it does not include any characters with both an accent and a dot under (and is not likely to in the future).

Thus, I could represent the e-dot-accent and the o-dot-accent in any of four ways:

(1) e + dot + accent
     (U+0065 U+0323 U+0301)
(2) e + accent + dot
     (U+0065 U+0301 U+0323)
(3) dotted e + accent
     (U+1EB9 U+0301)
(4) accented e + dot
     (U+00E9 U+0323)

(and likewise for the 'o' and the upper case variants)

First, I believe (1) is the correct Unicode Normalization Form D (= Decomposition) (and therefore (2) is the INcorrect decomposition), because:
Dot under: U+0323, combining class 220
Acute accent: U+0301, combining class 230
   (again, there are other acutes, but they are definitely
   the wrong characters)
and 220 < 230.

(And similarly for the bar under and grave accent, which are
respectively parallel to the dot under and acute accent:
Bar under: U+0329, combining class 220
   (there is another bar under in the IPA area, U+02CC, but it
   has a combining class of zero, and is the wrong character
   to use for Yoruba)
Grave accent: U+0300, combining class 230
   (again, there are other graves, but they are definitely
   the wrong characters)
)

Given that (1) is the correct decomposition, that implies that (3) is the correct Unicode Normalization Form C (= Composition), and (4) is INcorrect.
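
The reasoning can be checked mechanically. A minimal sketch using Python's standard unicodedata module (Python and the variable names are chosen only for illustration): it prints the combining classes quoted above and normalizes all four candidate sequences.

```python
import unicodedata

# The four candidate sequences for e + dot below + acute accent
candidates = {
    1: "\u0065\u0323\u0301",  # e + dot + accent
    2: "\u0065\u0301\u0323",  # e + accent + dot
    3: "\u1EB9\u0301",        # dotted e + accent
    4: "\u00E9\u0323",        # accented e + dot
}

def codepoints(text):
    """Return a readable 'U+XXXX U+XXXX' listing of a string."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)

# Canonical combining classes of the marks discussed above
for mark in ("\u0323", "\u0329", "\u0301", "\u0300"):
    print(codepoints(mark), unicodedata.combining(mark))

nfd_results = {n: unicodedata.normalize("NFD", s) for n, s in candidates.items()}
nfc_results = {n: unicodedata.normalize("NFC", s) for n, s in candidates.items()}

for n in candidates:
    print(n, "NFD:", codepoints(nfd_results[n]), "| NFC:", codepoints(nfc_results[n]))
```

Whatever sequence goes in, NFD comes out as (1) and NFC comes out as (3), so any of the four can be repaired by running the text through a normalizer.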

Does anyone see any holes in my reasoning?
--
 Mike Maxwell
 Linguistic Data Consortium
 maxwell@ldc.upenn.edu
Mike Maxwell  199
08-10-2005 11:32 AM ET (US)
Donald Osborn--I forgot to say, thank you! (And I'm saying it here, so it's public!)
--
 Mike Maxwell
 Linguistic Data Consortium
 maxwell@ldc.upenn.edu
Mike Maxwell  198
08-10-2005 11:31 AM ET (US)
QT - BisharatNet wrote:
>> BTW, we have a few older (c. 2001) Yoruba newspapers, and it
>> appears that in the text, although not in the headlines, the
>> underdots have been added in by hand. None of these papers marks
>> tone, except in the name of the newspaper itself, at the top of
>> the front page.
>
>
> Interesting. Having said that the typesetting is computerized
> didn't of course mean that the systems are cutting-edge. As for
> the tone marks, it would be interesting to know how readers find
> the texts. Probably a fluent speaker/reader would sort out the
> context at least most of the time. Are there a few tone marks -
> seemingly random to the non-speaker but probably for
> disambiguation - or none at all?

None (except in the name of the newspaper, on the front page). Of course these are older papers, so it may be that the underdots and/or tone marks are now added.

--
 Mike Maxwell
 Linguistic Data Consortium
 maxwell@ldc.upenn.edu
BisharatNetPerson was signed in when posted  197
08-09-2005 11:36 PM ET (US)
Edited by author 08-10-2005 12:13 AM
Mike, Sorry for the slow reply...

You asked (here in italics):
1) Does anyone know of any radio sites that have significant amounts of Yoruba text (preferably transcripts, although that seems unlikely)? I know of Change Radio, which does have significant amounts of text.

No, unfortunately. It may not be a priority of organizations involved in other media to maintain such internet sites with Yoruba news.

Marcel Diki-Kidiri and Edema Atibakwa Baboya, in an article on the presence of African languages on the web in Cahiers du Rifal No. 23 (Nov. 2003) http://www.rifal.org/cahiers_rifal/rifal23.pdf had a list of 24 sites relating in one way or another to Yoruba, but not a lot of text among them:


2) How are newspapers in Nigeria produced? Are they typeset on computers? If so, it might be possible to negotiate with the publisher for rights to the use of the electronic form of the text, even though this is not on-line.

I don't know for sure, but I would imagine from what I've gathered in Mali and Niger (without having investigated the question) that all newspapers in Africa use computers for typesetting. Getting rights to use copy is another issue. If you haven't checked AllAfrica.com, they have links or contact info for a lot of papers, but mainly in English and French. You could ask through a list like H-West-Africa for more info.

The only other thing I can think of is to put you in touch with an old contact at Michigan State, Folu Ogundimu, a journalism professor who would have much more relevant information and contacts in this area.

You also wrote:
BTW, we have a few older (c. 2001) Yoruba newspapers, and it appears that in the text, although not in the headlines, the underdots have been added in by hand. None of these papers marks tone, except in the name of the newspaper itself, at the top of the front page.

Interesting. Having said that the typesetting is computerized didn't of course mean that the systems are cutting-edge. As for the tone marks, it would be interesting to know how readers find the texts. Probably a fluent speaker/reader would sort out the context at least most of the time. Are there a few tone marks - seemingly random to the non-speaker but probably for disambiguation - or none at all?

The issues surrounding production of Yoruba text with available typesetting or desktop systems, which is to say the inconvenience of dealing with the diacritics (dots or tone marks) when one is not aware of systems, fonts, or keyboards that facilitate use of same, would seem to be a factor in hindering more (& better) production.

Hope this helps.

Don Osborn
Bisharat.net
Mike Maxwell  196
07-27-2005 08:22 AM ET (US)
QT - BisharatNet wrote:
> Mike, Thanks for your note (/m194). Re news sites and newspapers
> in Yoruba, or the lack of same, I would not make too much of
> this. Journalistic enterprises in Africa live very much on the
> edge financially, so go with the mass market such as it is.
> Which means for print, that even if most people don't speak/read
> English (or French) in a country, most of the people who can
> read and might pay for a paper know this language. In the case
> of Yoruba the main mass market would have to be radio. A website
> of a radio with Yoruba content might include Yoruba text (and
> some newspapers elsewhere have had sections online in African
> languages) but that takes a lot of effort to keep up to date.

Two follow-up questions to all this, if I may.

1) Does anyone know of any radio sites that have significant amounts of Yoruba text (preferably transcripts, although that seems unlikely)? I know of Change Radio, which does have significant amounts of text.
2) How are newspapers in Nigeria produced? Are they typeset on computers? If so, it might be possible to negotiate with the publisher for rights to the use of the electronic form of the text, even though this is not on-line.

BTW, we have a few older (c. 2001) Yoruba newspapers, and it appears that in the text, although not in the headlines, the underdots have been added in by hand. None of these papers marks tone, except in the name of the newspaper itself, at the top of the front page.
--
 Mike Maxwell
 Linguistic Data Consortium
 maxwell@ldc.upenn.edu

 "When I get a little money I buy books;
           and if any is left I buy food and clothes."
 --Erasmus
BisharatNetPerson was signed in when posted  195
07-25-2005 07:47 PM ET (US)
Mike, Thanks for your note (/m194). Re news sites and newspapers in Yoruba, or the lack of same, I would not make too much of this. Journalistic enterprises in Africa live very much on the edge financially, so go with the mass market such as it is. Which means for print, that even if most people don't speak/read English (or French) in a country, most of the people who can read and might pay for a paper know this language. In the case of Yoruba the main mass market would have to be radio. A website of a radio with Yoruba content might include Yoruba text (and some newspapers elsewhere have had sections online in African languages) but that takes a lot of effort to keep up to date. But there too, I think that a lot of Yorubaphones who might create a market of sorts for Yoruba text online can't access the internet for one reason or another, or if they can, the cost may be an issue. Another factor certainly is the relative lack of attention to first language literacy in schools.

One might see such an issue as an emerging one. As internet becomes more accessible the natural tendency is for first language content - and localized software - to get more interest and show its practical value. In the case of Yoruba we're talking of course about tens of millions of speakers, which is on a par with some European languages.

At the same time, English as the official language of the Federal Republic of Nigeria, as a LWC (language of wider communication) and as a "language of the belly" (i.e., people see it as a ticket to opportunity) can't help but retain its importance.

Don Osborn
Bisharat.net
Mike Maxwell  194
07-24-2005 09:49 PM ET (US)
QT - BisharatNet wrote:
> Also a quick request. Would it be possible to contribute the
> info that you gather to a wiki page on Yoruba at
> http://www.bisharat.net/wikidoc/pmwiki.php/PanAfrLoc/Yoruba ? It
> would be most appreciated.

I should be able to get an OK to do that (and I'll have a look at what's already there).

BTW, I did a fair amount of googling on Friday, looking for Yoruba language news sites, and came up empty handed. The only thing seems to be Change Radio, and they apparently haven't added anything since around 2002. By "news sites" I mean on-line versions of newspapers, magazines etc., such as exist in abundance in English (including many English-language newspapers in Nigeria). My guess is that there aren't any, because English is the dominant language in Nigeria and other countries where Yoruba is spoken; and although there are some Yoruba-language print newspapers, there is no market for on-line versions in Nigeria (or elsewhere). I would expect the same holds true for Igbo and Hausa. But I would love to be proved wrong!
--
 Mike Maxwell
 Linguistic Data Consortium
 maxwell@ldc.upenn.edu
BisharatNetPerson was signed in when posted  193
07-24-2005 04:31 AM ET (US)
Isaac, Thanks for your question (ref /m189). Although this message board mainly deals with information technology & Yoruba language, there are some language learning sites that have been mentioned. Feel free to browse or search the board for information. Below are a couple of references - we'd be interested in knowing about more web-based Yoruba language instruction initiatives, and how they present Yoruba text online.

Don Osborn
Bisharat.net

Learn Yoruba
http://www.learnyoruba.com

Yoruba at the University of Wisconsin-Madison
http://lang.nalrc.wisc.edu/nalrc/yoruba/index.html
BisharatNetPerson was signed in when posted  192
07-24-2005 03:29 AM ET (US)
Mike, Andrew, all,

Thanks for this news re the work at Penn's LDC and comments (ref /m188, /m190, /m191).

There are a few 8-bit fonts referred to on this message board, as you've seen. Though it lacks a search function one can click "All messages" at the head of the column of messages and then do an ordinary Ctrl-F search on strings. You can also look at the A12n page at http://www.bisharat.net/A12N/#font , under language-specific fonts (this whole page needs updating however).

Re the nature of the fonts - in principle, since Yoruba has three mark-under character pairs and then a series of possible tone marks, you are right to assume that the number of characters would not dictate a need for modifying the ASCII range (aka Basic Latin). I haven't evaluated the Yoruba fonts enough to be sure. Again in principle I think that using the upper ANSI range (aka Latin-1 Supplement) with a few modifications one could have a workable 8-bit font with all necessary combinations. Sorry I can't give any more specifics but maybe Andrew or someone else can, and if not we'd be interested to know your findings.

That said, there is another reason why the ASCII/Basic Latin range has been modified in 8-bit fonts for some languages, and that has to do with provisions for input. At least one set of fonts was created by replacing the glyphs of characters "not used" in Bambara with the extended characters needed to write the language. Not sure that this was ever done for Yoruba. Naturally, as you point out, this is an inconvenience if you want to include words or text in another language too. (Not to mention the lack of compatibility issue of course.)

There's a whole set of other issues that have come up in such discussions before relating to standardized ways of handling characters in Yoruba. In the absence of a full repertoire of precomposed diacritical combinations for Yoruba (and arguably, in the absence of a need for same, if dynamic composition indeed lives up to its promise) there are different ways one can arrive at, say a lower case o with dot under and a high tone (acute accent). All of these require reaching beyond the 256 character ANSI (Basic Latin + Latin-1 Supplement) character set in one way or another. A useful convention would be to use precomposed dot-under characters from the extended Latin ranges plus combining diacritics for the tone marks. That is, if it is agreed to use dots under rather than the small vertical line for standard Yoruba (the latter could be used as an option that we dubbed "classic Yoruba" but that is merely a suggestion). Basically there are some standardization issues with regard to handling Yoruba text for which there are to my knowledge no "official" decisions/guidelines.

One other point: If you're not aware of it, there is a pretty well established project for the conversion of legacy fonts to Unicode funded by Agence Intergouvernementale de la Francophonie called RIFAL. They have a lot of experience in this issue with other languages of West Africa, though not Yoruba I'm sure, but might be worth contacting if you haven't already.

Also a quick request. Would it be possible to contribute the info that you gather to a wiki page on Yoruba at http://www.bisharat.net/wikidoc/pmwiki.php/PanAfrLoc/Yoruba ? It would be most appreciated.

Thanks for letting us know about this.

Don Osborn
Bisharat.net
Mike Maxwell  191
07-21-2005 11:19 PM ET (US)
QT - Andrew wrote:
> I'll go through my collection of fonts and see what I have.
> Although from memory I don't have many 8-bit ones. Been
> concentrating on Unicode support.
>
> Encoding issues really depend on the purpose you are putting the
> fonts to. The first issue is that none of the 8-bit fonts in use
> for Yoruba follow any documented international standard. Which
> essentially means that all existing encoding conversion tools
> will not work with Yoruba text. It would be necessary to develop
> custom mapping tables and use tools like TECKit to convert to
> and from these encodings.

Yes, we're quite familiar with that problem in Indic languages, and we've built encoding converters before. It should be a lot simpler with Yoruba, I would hope; the problem lies in figuring out what each code point means, and how it (or in the worst case, a series of code points) should get turned into Unicode.

> It's easier to convert 8-bit fonts to unicode and work with
> unicode these days.

Yes, that's exactly the plan. If it's a "bilingual" font, meaning that the ASCII (lower 128) characters are left intact, then we can run the whole html page through the encoding converter. Alternatively, if it's a "monolingual font", as some of the Indic ones are, meaning that all code points are potentially used for special characters, then we extract the text from the html code, and do the conversion on the text alone. I would imagine the Yoruba fonts are bilingual fonts, as there aren't that many special characters that need to be accommodated, and most of the ASCII characters are needed anyway (maybe all, if English names appear in their English form).
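
For the "bilingual" case, the converter can be as simple as a byte-to-string table. The Python sketch below is illustrative only: the byte values in the table are invented, since the real layout of any given Yoruba font has to be worked out by inspecting its code points.

```python
import unicodedata

# HYPOTHETICAL mapping for an imaginary "bilingual" 8-bit Yoruba font:
# ASCII (0x00-0x7F) is untouched, a few upper-range bytes carry Yoruba letters.
# Real fonts need their own table, built by inspecting each code point.
HYPOTHETICAL_FONT_MAP = {
    0xE8: "\u1EB9",        # e with dot below
    0xF2: "\u1ECD",        # o with dot below
    0xA7: "\u1E63",        # s with dot below
    0xE9: "\u1EB9\u0301",  # dotted e with acute: one byte -> two code points
}

def legacy_to_unicode(raw):
    """Convert bytes in the imaginary font encoding to NFC Unicode text."""
    out = []
    for byte in raw:
        if byte < 0x80:
            out.append(chr(byte))                    # ASCII passes through
        elif byte in HYPOTHETICAL_FONT_MAP:
            out.append(HYPOTHETICAL_FONT_MAP[byte])
        else:
            out.append("\uFFFD")                     # flag unmapped bytes for review
    return unicodedata.normalize("NFC", "".join(out))

# Because ASCII is left alone, the HTML markup survives the conversion.
sample = b"<p>\xa7e \xe9</p>"
converted = legacy_to_unicode(sample)
print(converted)
```

For a "monolingual" font the same table approach applies, but only to the text extracted from the markup, since tag names would otherwise be remapped too.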

> At least they are some of the issues. If you could let me know
> what you want to know, I could be more precise.

What I would need is simply to know what glyphs are at each code point. I wouldn't have to have the actual font, if there is a problem with that (like you have to pay to get the font); a gif of the 256 (less control characters) code points would suffice.

Well, I guess it would also be nice to know which fonts are actually used commonly on the web. It looks to me like a lot of the Yoruba text on the web (not that there is a whole lot) leaves off the tone accents and underdots entirely, but I haven't really had a chance to look. Given that there are likely to be multiple encodings, my usual search technique (google for common words) probably won't work, and I'll have to resort to portals etc.
--
 Mike Maxwell
 Linguistic Data Consortium
 maxwell@ldc.upenn.edu
Andrew  190
07-21-2005 09:16 PM ET (US)
Hi Mike,

I'll go through my collection of fonts and see what I have. Although from memory I don't have many 8-bit ones. Been concentrating on Unicode support.

Encoding issues really depend on the purpose you are putting the fonts to. The first issue is that none of the 8-bit fonts in use for Yoruba follow any documented international standard. Which essentially means that all existing encoding conversion tools will not work with Yoruba text. It would be necessary to develop custom mapping tables and use tools like TECKit to convert to and from these encodings. I've done this for a number of Southern Sudanese languages, where we've needed to convert old Word documents and RTF files before digitising them.

The other key issue is the availability of the fonts and the copyright/licensing agreements (i.e. can the fonts be freely downloaded and accessed, or redistributed as part of a project).

When using XML documents, it is not possible to identify the character encoding unambiguously (since you're referring to a character encoding that all existing XML software would not recognise).

The killer, at least for information on the internet, is the behaviour of web browsers. Web pages in Yoruba 8-bit fonts would need to be incorrectly labeled as a western european character encoding or identified as a user-defined character encoding.

If web pages are parsed and altered by some intermediary web service, then a page identified as western european may get mangled.

Additionally, windows-1252 codepoints in the C1 range (0x80-0x9F) are converted to the appropriate Unicode codepoints before the browser displays the page. If the Yoruba font has characters at the codepoints windows-1252 uses in that range, these characters will be incorrectly rendered (unless of course the font has been hacked to accommodate this).
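
The remapping is easy to see outside a browser. In this Python sketch the choice of byte 0x8C is hypothetical (no particular Yoruba font is assumed to use it); it just shows the same byte landing on two different Unicode code points depending on the declared encoding.

```python
# HYPOTHETICAL: suppose a legacy Yoruba font draws one of its letters at byte 0x8C.
raw = bytes([0x8C])

as_cp1252 = raw.decode("cp1252")    # page treated as windows-1252
as_latin1 = raw.decode("latin-1")   # byte value carried through unchanged

print(f"cp1252 : U+{ord(as_cp1252):04X}")   # remapped to a Latin Extended-A letter
print(f"latin-1: U+{ord(as_latin1):04X}")   # stays in the C1 control range
```

Whether the custom glyph still shows up after the remapping depends on how the font's own character map was built, which is why some of these fonts had to be hacked to match.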

If the web page uses the user-defined slot, then similar problems may occur depending on the browser. In theory, the user-defined encoding slot in web browsers shouldn't be remapped, although some testing indicates that IE (for instance) actually uses a windows-1252 mapping for the user-defined slot. I haven't verified this.

It's easier to convert 8-bit fonts to Unicode and work with Unicode these days.

At least they are some of the issues. If you could let me know what you want to know, I could be more precise. Sorry for the vagueness of the response.
Isaac Jadesimi  189
07-21-2005 04:08 PM ET (US)
I thank you very much for your e-mail message.

Could you kindly let me know the relevant Websites, in order to locate courses on "Beginning/Learning Yoruba", as well as the relevant textbooks, relative to those beginning to learn the Yoruba language.

Thanking you in advance,
Sincerely,
Isaac Jadesimi.
jades@tamcotec.com
Mike  188
07-21-2005 02:27 PM ET (US)
Andrew--

At the Linguistic Data Consortium (LDC) of the University of Pennsylvania, we are just embarking on a corpus-building effort for Yoruba. We also have a researcher here (Dr. Yiwola Awoyale) who is making a Yoruba-English dictionary.

In the past, we've discovered how important it is to understand the encoding systems that are actually in use. I would be very interested in learning what you have in the way of non-Unicode fonts for Yoruba, especially the encoding issues. Thanks!
Dr. Samuel Olamijulo  187
07-07-2005 05:32 AM ET (US)
SUBJECT: Yoruba Reference Bible Online : Bibeli Yoruba Atoka - Announcement by Olamijulo S.K. Family
 
At http://www.hopeafricaepublisher.com/yrb062505.html
 
Access by ALL to the Holy Bible in eventually many African Languages provided FREE by African Bibles Online.
 
We begin with Bibeli Yoruba Atoka for reasons of logistics.
 
Dr. Samuel Kayode Olamijulo Family
Andrew  186
05-08-2005 08:19 PM ET (US)
Thanks for your comments Don,

There are versions of the perl script for both Igbo and Dinka as well as Yoruba, and I should have the Nuer version finished later today.

The form works best on either Mozilla, Firefox or IE on Windows XP SP2. Fonts with the appropriate OpenType tables include Code2000, Doulos SIL, AfRomanSerif and AfSans.

I haven't tested it on Linux or MacOS yet.

I was planning on writing a small document on "Unicode and African languages" focusing on the needs of African languages that use the Latin script but require complex processing. That will also address what normalization means.

Now that I have a spare week, I can concentrate on getting my site up to date.

I was thinking of adding accesskey attributes to all the buttons. I just need to determine the best keyboard shortcuts to achieve that. Using the access key approach should be relatively simple.

Although I was thinking that another approach would be to design a Mozilla Firefox toolbar. I've been looking at a Mozilla/Firefox toolbar for Indic languages, which allows you to type in Indian languages within HTML forms in Firefox or Mozilla.

A similar approach could be useful for African languages. Something I'll experiment with when my current projects are complete.

I'm also in the process of compiling some mapping tables for legacy African fonts, which will allow a quick and easy conversion from the legacy encoding to Unicode. I've been using TECkit and CC. The new version of Encoding Converters by SIL is very useful. It allows you to copy old 8-bit data to the clipboard, convert it to Unicode and paste it into an application as Unicode. Alternatively you can open an old Word document, convert the document to Unicode and change the fonts at the same time. It has proved quite useful.

When I have time, I'll prepare some mapping tables for some of the 8-bit Yoruba fonts I have access to.
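
To give an idea of what such a mapping table does, here is a minimal Python sketch. The byte values are invented for illustration and do not describe any real Yoruba font; a real table has to be read off the font itself, and TECkit or CC would express it in their own formats.

```python
# Hypothetical mapping for an imaginary 8-bit Yoruba font.
# The byte assignments below are assumptions made up for this example.
LEGACY_TO_UNICODE = {
    0xE8: "\u1eb9",  # e with dot below
    0xF2: "\u1ecd",  # o with dot below
    0xA7: "\u1e63",  # s with dot below
}

def legacy_to_unicode(data: bytes) -> str:
    """Convert legacy 8-bit text to Unicode, falling back to Latin-1."""
    return "".join(
        LEGACY_TO_UNICODE.get(b, bytes([b]).decode("latin-1")) for b in data
    )

print(legacy_to_unicode(b"\xf2m\xf2"))  # prints "ọmọ"
```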

Andrew
BisharatNetPerson was signed in when posted  185
05-08-2005 05:35 AM ET (US)
Andrew, Re the page with virtual keys, composition window and converters you mention in /m181, this is quite useful. I note, among other things, correct placement of the combining-diacritic tone marks on capital and lower case vowels.

My system does not display the dot-under s's (lack of appropriate font) but other than that all works well.

It occurs to me that some small explanation of "Unicode Normalization Forms C and D" might be useful.
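
For readers who want to see the difference concretely, a short Python sketch using the standard unicodedata module follows. The input string is just an example: an "e" followed by a combining acute and a combining dot below, typed in that order.

```python
import unicodedata

# e + combining acute + combining dot below, in typed order
typed = "e\u0301\u0323"

nfc = unicodedata.normalize("NFC", typed)  # composed where possible
nfd = unicodedata.normalize("NFD", typed)  # fully decomposed, marks reordered

def codepoints(s: str) -> str:
    """Show a string as a list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)

print("NFC:", codepoints(nfc))  # U+1EB9 U+0301
print("NFD:", codepoints(nfd))  # U+0065 U+0323 U+0301
```

Form C keeps the precomposed dot-below letter and leaves the tone mark combining, since Unicode has no single code point for that combination; Form D splits everything into base letter plus marks.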

Another feature that might be helpful, though harder to implement, might be a key combo input option like what one finds at the http://french.typeit.org page. There Ctrl+e equals é ; strike it twice and get è ; three times, ê ; and four, ë. Yoruba is different in that all (?) vowels can have tone markings, so the key combos would be more complicated. But if one started with say a Ctrl+e, o or s equalling ẹ, ọ, or ṣ, then that might be a start. Ctrl+' or ` could make the combining rising or falling tones.
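
A minimal Python sketch of that idea, just to show the logic: the key bindings and the `compose` helper are hypothetical, not something the existing page implements.

```python
import unicodedata

# Hypothetical bindings: Ctrl + letter gives the dot-below form,
# and ' or ` adds a combining acute or grave tone mark.
DOT_BELOW = {"e": "\u1eb9", "o": "\u1ecd", "s": "\u1e63",
             "E": "\u1eb8", "O": "\u1ecc", "S": "\u1e62"}
TONES = {"'": "\u0301", "`": "\u0300"}

def compose(base: str, ctrl: bool = False, tone: str = "") -> str:
    """Return the character for a base key, optional Ctrl and tone key."""
    letter = DOT_BELOW.get(base, base) if ctrl else base
    return unicodedata.normalize("NFC", letter + TONES.get(tone, ""))

print(compose("e", ctrl=True))            # ẹ
print(compose("e", ctrl=True, tone="'"))  # ẹ plus combining acute
print(compose("o", tone="`"))             # ò
```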

Just some thoughts.

Don Osborn
Bisharat.net
FAKINLEDE TOMIIWO  184
05-07-2005 05:49 AM ET (US)
DADDY
     MY NAME IS TOMI THE SON OF YOUR BROTHER I JUST WANT TO SAY HELLO TO YOU


                                  THANKS
Ade G. Oyegbola  183
03-14-2005 05:11 PM ET (US)
Edited by author 03-15-2005 08:47 PM
Response to Message #182 by Dr. Ọlamijulọ

We at KỌNYIN were indeed very glad to contribute to this discussion group.

I, Ade G Oyegbọla, wish to state that I'm not an Engineer as stated by Dr. Ọlamijulọ in message #182. I can understand the assumption after having provided such detailed technical information to the group, but the information I provided is the result of over six years of work by many people in our parent company Lagos Analysis Corporation (LANCOR). I think credit for my technical education and the information I have provided belongs to the following individuals: Engr. O Walter Oluwọle, Chief Technology Officer; Kelvin Oluwọle, Software Engr.; Sergey Torpokov, Software Engr.; George C.K Van-Lare, Research Director; Funmi Oyekusibẹ, Research staff; Chinyere Offor, Research staff; including Dr. Victor Manfredi and Kọle Ade-Odutọla, linguists.

Our product KỌNYIN keyboard with 106 multi-function enabled physical keyboard includes 14 Nigeria language specific alphabets and 13 diacritical signs, clearly labeled on the keyboard. The keyboard layout is patented under Nigeria Law and represents the "First Truly Complete PC Keyboard for Nigeria Official Languages" combined. (English, Ẹdo, Ẹfik, Fulani, Hausa, Igbo, Kanuri and Yoruba)

The keyboard truly does not change how you type today, and does not use the dreaded "dead key" typing process.

Please visit us at http://www.konyin.com for more information regarding the availability of various models of the keyboard.(PS/2, USB, Multimedia, Wireless, etc)

Mr. Ade G. Oyegbọla
President & CEO
www.konyin.com
Dr. Samuel Olamijulo  182
03-14-2005 03:17 PM ET (US)
 SUBJECT: Yoruba Today on Computers and Internet- 03.14.05 Endorsements

The Yoruba language has made very gratifying progress on computers in recent times. Real practical challenges currently however remain, for easy, effective communication by ordinary users at various Internet and e-mail servers/portals.

We are very grateful to all who have participated openly and productively in Yoruba Language Development for Yoruba People Development Worldwide.

Contributions by Engineer Ade Oyegbola, KONYIN President and CEO and Dr. Kayode Fakinlede author of Yoruba – English, English- Yoruba Dictionary, quoted below are endorsed for serious consideration and constructive action by all in everyway we can whenever we have the opportunity. We are all enjoined to play our own parts until together we succeed in persuading Microsoft authorities to create the apparently much needed codepage that includes Yoruba language specific alphabets and others in the Niger-Congo group of African Languages.

The only direction we must look from here is up.

Olodumare Oba A le wi le se a fun gbogbo wa se o - Amin.

Olamijulo S.K.
----------------------------------------------------------------------------

QUOTE 1. From Engineer Ade. G. Oyegbola:

Here is a recap of where we think we are today when it comes to Yoruba language and computing environment.
1. Input Devices: Keyboard (No Problem)
There are a lot of input devices in the market today that can be used to type Yoruba in a computing environment. There are numerous virtual keyboard layout programs, and there is the physical keyboard from KONYIN. There is no shortage of input devices, just the question of convenience from having to understand multiple combination keystrokes and shortcuts; this problem will be eliminated with the introduction of the first 106 double function physical keyboard from KONYIN.

2. Display: Fonts (No Problem)
To translate the digital code from the input device (keyboard), the operating system needs a font that has the glyphs representing the letter. There are lots of fonts in the market today that will correctly display Yoruba alphabets in any application. Fonts like: Gentium, Ariya, Arial Unicode MS, and more. All Yoruba alphabets are correctly represented in the Unicode Charset, so any font developer can use the Unicode code point in creating fonts for use by Yoruba people.

3. Applications: (generally no problem)
Most applications that are Unicode compliant can handle input and display of Yoruba alphabets. While some have problems saving files with a name that includes Yoruba alphabets, this problem is limited to a few old applications that still rely on the system default codepage as controlling. All Microsoft Applications can handle Yoruba alphabets. Corel and Adobe applications also can handle Yoruba alphabets. If you have any application that is a 2000 version and above, you should be able to input and display Yoruba alphabets. Sending what you type to another person over the internet or e-mail portals/servers is another issue.

4. Operating Systems: (Interoperability text transmission problem)
Most Operating systems, like Microsoft, Apple, Linux, IBM, Solaris and others used to depend on a text translation process that uses codepages to identify languages and then determine the correct codepage to translate a digital code into text. More recently, these Operating systems have migrated to a text translation process that uses a more global approach called the Unicode charset. The introduction of the Unicode standard is expected to make the codepage obsolete, and it is working but very slowly. To the best of our knowledge, most Operating systems are 90% Unicode compliant. Unfortunately the remaining 10% represents a very critical pathway for interoperability (i.e. Internet and e-mail servers). Because some of these operating systems still use codepages, when there is an issue with text translation using the Unicode charset, the system tries to use any available codepage in the system to resolve the issue. Yoruba language specific alphabets are not represented in any codepage, so translation of our text across platforms is inconsistent. If by some miracle all major operating systems become 100% Unicode compliant, this problem will go away. This problem does not affect documents sent as attachments.

5. In General:
We do not have much expertise on the issues relating to display of Yoruba alphabets on website, however in creating our own website, we used the web embedding software from Microsoft, and this software allows website creators to embed their specific font into the website, so that all visitors will see the site in this font. This solution takes care of the issue of incorrect display of alphabets because the Computer in use does not have the required font to view the site correctly. http://www.microsoft.com/typography/web/embedding

6. Finally:
With the exception of cross-platform interoperability for most Internet and e-mail servers/portals, the Yoruba language is computer ready. This is not to say that we should stand by and wait for all operating systems to be fully Unicode compliant; this will take a very, very long time. We need to find a way to provide backward compatibility with old applications that still rely on codepages. It is worth remembering that a lot of computer users in Nigeria and most third world countries still use old operating systems and applications.

I hope this recap is informative and will help to further focus our efforts in making Yoruba language a force in global computing.

Ade G Oyegbola
President & CEO
www.konyin.com

END OF QUOTE
----------------------------------------------------------------------------------------------

QUOTE 2 - From Dr. Kayode Fakinlede:

The efforts of all of those who work to make our language, Yoruba, easily accessible by computer and the Internet are commendable.
I do not profess to be computer savvy, however, I cannot but be impressed by the continued persistence of people like Dr. Olamijulo, Mr. Oyegbola, etc. to make things easy for the estimated ‘100 million’ people who, in a way, have some Yoruba blood in them.

I believe, however, that a fundamental issue is being overlooked here. The supposition is that once the problems of non-English letters are completely solved, Yoruba people will begin to use the alphabets to communicate. This would probably not happen, or it would not happen to any appreciable extent.
The main issue, as I see it, is that if we follow all the nuances of written Yoruba as it is today, the language is very difficult and time wasting to write accurately, and to type. For example, it takes me an average of 4 hours work presently to accurately produce (type) a manuscript that will take a writer in English 1 hour.

Secondly, accuracy, to me, in the Yoruba language, is not accuracy to someone else, even someone from Southwest Nigeria. Written Yoruba in other nations takes different dimensions. Where there is some complexity in anything as important as a language, what normally would happen is that some rules will be put in place to make things easy and universal for all. There have to be some rules put in place in writing Yoruba, so that we can easily communicate with one another and across borders and all those who are eager to learn the language will benefit.

For our children, even in Nigeria, to appreciate the beauty in our language and our culture, we need to simplify the writing and reading of the language ……………..
Respectfully Yours,
Fakinlede

END OF QUOTE
Andrew Cunningham  181
03-10-2005 07:58 PM ET (US)
Hi everyone,

For those interested, I've thrown together a CGI script that generates an HTML form allowing you to type in Yoruba Unicode and normalize it to NFC and NFD. It will also convert extended Latin characters to HTML numeric character references.
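
The NCR conversion can be sketched in a few lines of Python. This is only an illustration of the technique, not the Perl code behind the form; the helper name `to_ncr` is made up for the example.

```python
import unicodedata

def to_ncr(text: str) -> str:
    """Normalize to NFC, then replace non-ASCII characters with decimal NCRs."""
    nfc = unicodedata.normalize("NFC", text)
    return nfc.encode("ascii", "xmlcharrefreplace").decode("ascii")

print(to_ncr("Yor\u00f9b\u00e1"))  # prints Yor&#249;b&#225;
```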

The form also has a virtual keyboard.

Have a play at http://www.openroad.net.au/cgi-bin/tools/yoruba.cgi

Andj.
Andrew Cunningham  180
03-01-2005 09:13 PM ET (US)
Dear Ade Oyegbola,

Thank you for some interesting emails. It's given me something to think about.

QT - Ade G Oyegbola (Oyegbọla) wrote:
>
> Dear Andrew Cunningham,
>
> All your points are well taken, but they do not address the core
> issue raised by Dr. Olamijulo's (Ọlamijulọ's)
> conference to wit: Yoruba Internet Display Issues
>

True, they're things I've discussed in the past, but I tend to discuss these issues so often, in so many different forums, that I forget where I've commented and where I haven't.

> The issue of correct text display for messages sent via email
> engines, internet portals and internet application platforms is
> different from generic applications support for character
> display in the application itself.
>

The crux is that you are referring to web services that haven't been internationalized.

If I send and receive email from a Unicode-aware email client using a POP3 or IMAP account, I have no problems whatsoever sending and receiving Yoruba emails.

I'm not an expert in MIME, but from my understanding a few different things need to happen, and a lot of web-based email services get some of this wrong.

Yahoo (English language versions) and Hotmail (English language versions) are notorious for this.

The US and Australian versions of Yahoo, and I assume other English language versions of Yahoo, will identify the character set of an email as being US-ASCII. That tends to screw up display for email clients that correctly handle character encodings, forcing the user to select an appropriate encoding. Some email services do not send out character encoding info at all.
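
For comparison, this is roughly what a correctly labelled message looks like when built with Python's standard email package. It is a sketch of the labelling only; the body text is an arbitrary example.

```python
from email.message import EmailMessage

body = "Yor\u00f9b\u00e1: \u1eccl\u00e1m\u00edj\u00fal\u1ecd"  # arbitrary sample text

msg = EmailMessage()
msg["Subject"] = "Test"
# Declare the charset explicitly so the receiving client does not guess.
msg.set_content(body, charset="utf-8")

print(msg["Content-Type"])
print(msg["Content-Transfer-Encoding"])
```

A client that honours the Content-Type header will decode the body as UTF-8; a service that rewrites the label to US-ASCII, or drops it, leaves the client guessing.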

Depending on the browser's default character encoding, Yoruba text may either be inserted as a Unicode character or as an HTML decimal NCR. NCRs aren't helpful for text-based email clients.

The best workaround I've found for Yoruba and other languages when using Yahoo or Yahoo Groups is to use Firefox, set the default encoding to UTF-8 and add the URIid extension. It is then possible to write CSS instructions in the userContent.css file specifically for Yahoo mail and Yahoo groups to set an appropriate font for Yoruba (or other languages). This technique is at best an awkward hack.

The reality is the English language interface for Yahoo mail or Yahoo Groups wasn't written for languages other than English.

What is really needed is a Unicode based web mail client that can be tailored (via preferences) for each African language and that respects the email protocols (including transfer encodings).

As an example, we're building a portal for Australian Public Libraries which will contain multilingual web directories. It is all Unicode based. It is intended to be flexible enough to handle any language we may need (including African languages).

The portal will support language specific CSS, and other language specific features, including support for virtual keyboards and in some case custom collation routines.

Most web designers are very bad at internationalization, and aren't familiar with the W3C character model or the W3C's internationalization authoring techniques.

There is a wealth of information on how to develop a well internationalized web service. Many web developers don't bother or are unaware of how to do this.

If American and European based web services are inadequate, maybe it's time to start pulling together a group of organizations that will develop web services and portals suitable for African languages, and hopefully not reproduce the mistakes and errors of existing services.
Personally I feel that a lot of work needs to be done, and rather than relying on US and European solutions that are unsuitable for African languages, it may be better to advocate for and champion projects that have been designed to accommodate African languages.

> A very good example is this QuickTopic platform and
> application.(I can easily type my messages using all unicode
> characters, but if I use any character that is not in the
> default codepage of the hosting server/portal for my posting
> name, it will not display correctly.)
>

LOL, yep, but then QuickTopic wasn't designed to do that. The point is most web services and portals were designed to meet the needs of the designers. Often internationalization isn't considered.

It's worth remembering the sender header in an email is supposed to be 7-bit; some email servers will reject messages that don't have 7-bit clean headers. So in your case, the sender field needs to be re-encoded. This should be done by the email client or by the web-based mail server.
But as I said, many web-based solutions aren't written properly and according to the specs.
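
As a small illustration of that re-encoding, here is a Python sketch using the standard email.header module, which produces MIME "encoded-word" headers. It only shows the header step, under the assumption that the name is sent as UTF-8.

```python
from email.header import Header, decode_header, make_header

name = "Oyegb\u1ecdla"  # contains a character outside 7-bit ASCII

# Encode the name so that the header itself is 7-bit clean.
encoded = Header(name, "utf-8").encode()
print(encoded)  # an ASCII-only string of the form =?utf-8?...?=

# A conforming client reverses the step before display.
decoded = str(make_header(decode_header(encoded)))
print(decoded)
```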

> You have to believe me, we have spent over 6 years on the issue of
> input and display devices for Africa (Niger-Congo) latin
> scripting. The only reasonable answer is a codepage for this
> region. (By the way, Arabic is the scripting for most of the
> North Africa Languages)

Granted.

Although it's worth noting that the Arabic codepage really only supports the Arabic language itself; other languages like Persian, Dari, Sindhi, Pashto, Urdu, etc. which use the Arabic script cannot be displayed using the Arabic codepage. For these languages all support is Unicode based or uses hacked legacy codepages.

I started working with Vietnamese web pages in 1994. There were lots of different organisation or company specific codepages, and there were also national standards for Vietnamese. Eventually Microsoft created a Vietnamese codepage, but very few people actually used it, and it has been superseded by Unicode.

And only some open source applications provided support for the Windows Vietnamese codepage; others avoided implementing it.

>
> Yes, 24 tonal marks, represented in the Unicode Chart and used
> by most languages in the Africa (Niger-Congo) area of Africa.
> Our focus at KỌNYIN (KONYIN) is not a single language.

Ahh, OK, then it's not just a codepage issue; it may also be a font rendering issue. Having looked at your codepage proposal, I suspect that even if Microsoft did implement it, they would implement it in a similar manner to that for Vietnamese. Ultimately all data in the codepage would be converted internally to Unicode for processing and display. It would require OpenType fonts that would handle diacritic positioning and stacking, i.e. you would have gone around in a circle. To implement the codepage, you'd already have to have the pieces in place to handle Yoruba in Unicode.

The Vietnamese codepage is implemented via Unicode, using Vietnamese language-specific tables in the OpenType font to handle tone marks. Although all Vietnamese characters exist as precomposed characters in Unicode, Microsoft's approach to keyboard design and language implementation led to a codepage that has single codepoints for discrete vowels and combining diacritics for tones. The OpenType fonts and Uniscribe shipped with Windows 2000 added the required support.
If they follow this pattern, which seems to be what would have to happen in order to support the 24 diacritics in your proposal, then you would need the necessary pieces to be in place to support Yoruba in Unicode in order to support your Yoruba 8-bit codepage.
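
The vowel-plus-combining-tone design can be seen with Python's built-in cp1258 codec, which implements the Windows Vietnamese codepage. This is only a sketch of the encoding behaviour; it says nothing about how Uniscribe or the fonts render the result.

```python
import unicodedata

precomposed = "\u1ebf"            # e with circumflex and acute, one code point
vowel_plus_tone = "\u00ea\u0301"  # e with circumflex, then combining acute

# The precomposed letter has no slot in the 8-bit codepage.
try:
    precomposed.encode("cp1258")
    fits = True
except UnicodeEncodeError:
    fits = False
print("precomposed fits in cp1258:", fits)

# The vowel and the tone mark each take one byte.
encoded = vowel_plus_tone.encode("cp1258")
print(len(encoded), "bytes")

# Decoding goes through Unicode, where NFC gives the precomposed letter back.
roundtrip = unicodedata.normalize("NFC", encoded.decode("cp1258"))
print(roundtrip == precomposed)
```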

But then again I could be wrong.

Best of luck with your work.

Andrew
Ade G Oyegbola (Oyegbọla)  179
03-01-2005 07:45 PM ET (US)
Dear Andrew Cunningham,

All your points are well taken, but they do not address the core issue raised by Dr. Olamijulo's (Ọlamijulọ's) conference to wit: Yoruba Internet Display Issues

The issue of correct text display for messages sent via email engines, internet portals and internet application platforms is different from generic applications support for character display in the application itself.

A very good example is this QuickTopic platform and application. (I can easily type my messages using all Unicode characters, but if I use any character that is not in the default codepage of the hosting server/portal for my posting name, it will not display correctly.)
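
A minimal Python sketch of what happens when a name passes through a codepage that lacks one of its characters. It assumes, purely for illustration, that the server's default codepage is windows-1252; the actual QuickTopic configuration is not known.

```python
name = "Oyegb\u1ecdla"

# The dot-below o has no slot in windows-1252, so a strict conversion fails.
try:
    name.encode("cp1252")
except UnicodeEncodeError as err:
    print("not representable in cp1252:", err.object[err.start:err.end])

# A lossy conversion substitutes a placeholder instead.
lossy = name.encode("cp1252", errors="replace").decode("cp1252")
print(lossy)  # prints "Oyegb?la"
```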

You have to believe me, we have spent over 6 years on the issue of input and display devices for Africa (Niger-Congo) Latin scripting. The only reasonable answer is a codepage for this region. (By the way, Arabic is the script for most of the North African languages.)

I don't want to go off issue here.

All the items you have listed as solutions are perfectly OK going forward, but unless you can make all the internet servers and email portals around the world update their software to the Unicode standard, and stop relying on users' system default codepages, we will continue to have the display problems.

If Microsoft did not have any problem with updating all their existing codepages in 2000 to include the Euro currency symbol, I think Microsoft should not have a problem creating a new codepage to account for Africa (Niger-Congo) languages spoken and written by over 500 million people.

Yes, 24 tonal marks, represented in the Unicode Chart and used by most languages in the Niger-Congo area of Africa. Our focus at KỌNYIN (KONYIN) is not a single language.

I don't really agree with you that "it is bad software design." For the life of computing, all developers have used the codepage as the basis for text scripting, and the fact that we now have a better global standard for the process does not mean that we should immediately discard all legacy applications.

I can assure you that it will take a very, very long time before all the internet portals and email servers around the world upgrade fully to the Unicode standard.

Ade G Oyegbọla (Oyegbola)
www.konyin.com
Andrew Cunningham  178
03-01-2005 06:30 PM ET (US)
Dear Ade Oyegbola,

QT - Ade G Oyegbola wrote:

> However, this issue of cross-platform
> interoperability does not start and stop with Microsoft alone.
>

The first question is, have you tried the ISO path to get a new ISO-8859 character set?

I suspect that Microsoft will not budge on this one, and even if it did, it would not help most users who are not on the latest Microsoft Windows or the latest Microsoft applications.

> For example, major applications like Adobe, Corel, Winzip and
> even Microsoft's Outlook still use the operating system default
> codepage for some text computing, and where the character is not
> included as a code point in the default codepage, you cannot use
> it. Even the Office 2003 & Office XP dictionary is completely
> codepage dependent; you cannot add a word containing a Yoruba
> character into the dictionary. There are still pockets of
> computing in Windows XP that rely on the system default
> codepage.

Understood. But that's why I feel it's important to push for better Unicode support for Yoruba. Otherwise these things will not improve, and Yoruba will continue to be in limbo.

I suppose my situation is different to your situation. You require adequate support for Yoruba.

I require adequate support for multiple languages using multiple scripts. The practical approach for me is to lobby for better Unicode support.

I've tested lots and lots of software over the years that claims to be Unicode compatible. Most are built using the Windows 95 internationalization model, i.e. their support for Unicode is actually based on codepages rather than native Unicode support. Very frustrating.
Umm .. are you sure about the dictionaries in Office 2003 and Office XP?
From memory Gujarati, Hindi, Kannada, Marathi, Punjabi, Tamil and Telugu have spell checkers in Office 2003 (via the proofing tools package) and I was under the impression that Microsoft did not support codepages for these languages. I thought that support for these languages was Unicode based.

> We have had many a long conference calls with some Microsoft
> employees about the need to provide additional support for all
> the legacy applications out there today that still depends on
> the system codepage for text translation. While, most of them
> agree the bridge needs to be built, no one wants to commit to
> building it.

a problem.

> We have suggested creating a new codepage using the CP_1252
> template and replacing about 76 code points that mostly contains
> diacritical marks, with unique characters and then adding
> combining diacritical marks as independent code points. (52
> characters and 24 tonal marks)
>

24 tonal marks?

> If you are interested in the template, please send me a separate
> mail (Oyegbola@Konyin.com)
>
> Finally, Microsoft Longhorn will be great for Microsoft
> Application environment, but it will take a long time for other
> application developers to fully adopt the Unicode approach.
>

Catch-22. The same issue will probably arise for a Niger-Congo codepage.
> Microsoft can and should create this one last codepage for
> Africa (Niger-Congo) area and release it as critical patch for
> all the currently supported operating systems. (Windows 2000, XP
> and 2003).
>

Unfortunately, Microsoft isn't likely to add additional language support to Windows 2000. The trend is to add language support in a new version of the operating system, or maybe to the latest operating system via a service pack, as was the case with WinXP SP2.

If they were to add a new codepage, it would also have to be based on Unicode support, i.e. the way they implemented Vietnamese. From my understanding this would limit Yoruba to versions of Windows that had an appropriate version of Uniscribe installed, i.e. Windows XP SP2.

Then there are the additional issues. If Microsoft creates a codepage for Niger-Congo, they'll also be under pressure to create codepages for every other language group or region in Africa, South East Asia and Central Asia.

I understand your concerns, and am sympathetic to them. I just think that a better way forward would be to push Microsoft to improve their Unicode support. And push other software developers to internationalize their software properly.

To my mind, the key problem is bad software design practices.


Andrew
Ade G Oyegbola  177
03-01-2005 10:31 AM ET (US)
Edited by author 03-01-2005 02:32 PM
Reply to Message 176

Dear Andrew Cunningham,

I will agree with you that Microsoft is firm in moving in the direction of Unicode scripting. Your points on this issue are quite correct. However, this issue of cross-platform interoperability does not start and stop with Microsoft alone.

For example, major applications like Adobe, Corel, Winzip, LotusNotes, Eudora, other e-mail & internet platforms and even Microsoft's Outlook still use the operating system default codepage for some text computing, and where the character is not included as a code point in the default codepage, you cannot use it. Even the Office 2003 & Office XP dictionary is completely codepage dependent; you cannot add a word containing a Yoruba character into the dictionary. There are still pockets of computing in Windows XP that rely on the system default codepage.

We have had many long conference calls with some Microsoft employees about the need to provide additional support for all the legacy applications out there today that still depend on the system codepage for text translation. While most of them agree the bridge needs to be built, no one wants to commit to building it.

(It is OK that some people know how to swim and will not need a bridge to cross the river, but should we force everybody else to learn to swim just to cross the river?)

We have suggested creating a new codepage using the CP_1252 template and replacing about 76 code points that mostly contain diacritical marks with unique characters, and then adding combining diacritical marks as independent code points. (52 characters and 24 tonal marks)
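
To make the idea concrete, here is a rough Python sketch of a codepage derived from the windows-1252 table. The four reassigned slots are hypothetical and chosen only for illustration; the actual layout of the proposed template is not given in this thread.

```python
# Start from the windows-1252 table (undefined bytes become U+FFFD).
table = [bytes([b]).decode("cp1252", errors="replace") for b in range(256)]

# Hypothetical reassignment of a few upper-half slots.
table[0xC0] = "\u1eb9"  # e with dot below
table[0xC1] = "\u1ecd"  # o with dot below
table[0xC2] = "\u0301"  # combining acute
table[0xC3] = "\u0300"  # combining grave

encode_map = {ch: i for i, ch in enumerate(table) if ch != "\ufffd"}

def encode(text: str) -> bytes:
    """Encode text with the sketch codepage."""
    return bytes(encode_map[ch] for ch in text)

def decode(data: bytes) -> str:
    """Decode bytes with the sketch codepage."""
    return "".join(table[b] for b in data)

raw = encode("\u1ecdm\u1ecd")
print(decode(raw))           # round-trips where the new codepage is known
print(raw.decode("cp1252"))  # a system without it shows "ÁmÁ"
```

The last line is the interoperability catch: the same bytes mean something else on any machine that still assumes the original windows-1252 table.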

If you are interested in the template, please send me a separate mail (Oyegbola@Konyin.com)

Finally, Microsoft Longhorn will be great for Microsoft Application environment, but it will take a long time for other application developers to fully adopt the Unicode approach.

Microsoft can and should create this one last codepage for Africa (Niger-Congo) area and release it as critical patch for all the currently supported operating systems. (Windows 2000, XP and 2003).

Just my thought.

Ade G Oyegbola (Oyegbọla)
www.konyin.com
Andrew Cunningham  176
02-28-2005 10:44 PM ET (US)
Hi all, a few quick comments:

QT - Ade G Oyegbola wrote:

>
> First, Yorùbás (Yorubas) need to get Microsoft to create our
> own National Language Support (NLS) (also called Language
> Locale) page in Windows.
>

I doubt this will happen. Microsoft have clearly identified their position that new language support will be Unicode based.

And they would also argue that as of Office 2003 and Windows XP SP2, Windows correctly handles Yoruba diacritics. They would also tell you that new fonts will be available in Longhorn.

Using fonts like Code2000 or Doulos SIL on Office 2003 or in any Unicode app that supports Uniscribe on WinXP SP2, you already have Yoruba support.
If Microsoft did the unlikely thing and added a codepage for Yoruba, it's likely that it would only be added to future operating systems like Longhorn or after, and not made available to older operating systems.
These days Microsoft doesn't tend to retrofit. Take Internet Explorer 7 as an example: it will only be available for WinXP SP2, not on older versions of Windows.


> Second, each NLS is tied to a codepage that contains all the
> characters needed to translate the text in any operating
> systems.
>

Only if everyone adopts the new NLS and adds support for it. Even if a new NLS is taken up, it may be years before you see sufficient follow-through.
>
> FYI, we at KỌNYIN (KONYIN) have submitted a suggested
> codepage template request to Microsoft for creation, but
> Microsoft keeps telling us that it will not create a new
> codepage and we should concentrate our effort in promoting
> Unicode text scripting to application developers instead. We can
> use some help from the Yoruba community to pressure Microsoft
> into action.
>

Even if Microsoft adds a new codepage, this would only affect Longhorn or maybe the operating system version after Longhorn. It will do nothing for older operating systems such as Win95/98/ME/NT/2000/XP and so on.
Also, a Microsoft codepage is unlikely to get picked up in some open source software, where the preference is to support ISO-8859 series encodings rather than Microsoft codepages.

Yoruba can be done in Unicode on WinXP SP2 or in Office 2003 now. That's better than Myanmar and other languages that aren't supported yet and may or may not be supported in Longhorn.

An 8-bit locale that isn't compatible with older software, older fonts and older operating systems, and that would take years to propagate, provides little benefit and is a problematic move. Also there is the political issue of which of the numerous legacy 8-bit fonts would form the basis for a new 8-bit encoding.

The XML and HTML tools I use now support Yoruba in Unicode. Why would I want to go backwards?

There are problems with some web services and tools. Best to fix the problems for the future, rather than leaving the problems there and creating a new set of incompatibilities.

How many codepoints would be needed to add support for an Africa (Niger-Congo) Language Group codepage?

Just my 2 cents worth, please feel free to disregard.

Andrew.
Ade G Oyegbola  175
02-28-2005 08:51 PM ET (US)
New Codepage as solution for Yoruba Internet Display Issues

What is a Codepage?
A codepage is an ordered set of characters in which a numeric index (code point values) is associated with each character. The first 128 characters of each codepage are functionally the same and include all characters needed to type English text. The upper 128 characters of OEM and ANSI codepages contain characters used in a language or group of languages. Each defined codepage contains a maximum of 256 code points.
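
As a rough illustration of the definition above (a sketch in Python, not part of the original post), a codepage can be treated as a 256-entry lookup table from byte values to characters. The choice of codepages 1252 and 1251 and the helper names are assumptions made for the example only. The lower 128 entries agree across codepages, while the upper 128 differ:

```python
# Sketch: a codepage maps each byte value 0-255 to at most one character.
# cp1252 (Western European) and cp1251 (Cyrillic) are example codepages
# chosen for illustration only.

def lower_half_matches_ascii(codepage: str) -> bool:
    """True if bytes 0-127 decode to the same characters as ASCII."""
    return all(bytes([b]).decode(codepage) == chr(b) for b in range(128))

def char_at(codepage: str, byte_value: int) -> str:
    """Look up the character a codepage assigns to one byte value."""
    return bytes([byte_value]).decode(codepage)

for cp in ("cp1252", "cp1251"):
    ch = char_at(cp, 0xE9)
    print(f"{cp}: ASCII half identical = {lower_half_matches_ascii(cp)}, "
          f"byte 0xE9 -> U+{ord(ch):04X}")
```

The same byte value therefore means a different character depending on which codepage the receiving system assumes.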

The system locale (sometimes referred to as the system default locale), determines which codepages and associated bitmap font files are used as defaults for the system. These codepages and fonts enable non-Unicode applications to run as they would on a system localized to the language of the system locale. These codepages and fonts are used by non-Unicode applications to emulate operation on a system localized to the language selected as the system locale.

In a cross-platform transmission of texts (Internet):

The characters are converted to digital codes using the code points of the input language codepage. When the digital codes arrive at the receiving system/platform, the platform checks whether it is using Unicode for translation. If yes, the incoming codes are translated using the Unicode code point table; if no, the platform looks for the same input language codepage on the receiving side to translate the codes into displayable characters.

The problem we are having with Yorùbá (Yoruba) characters and tonal signs is that these characters are not contained in any of the 20-150 codepages in most operating systems. Most Yorùbá (Yoruba) keyboard layouts use the 1252 codepage, which does not contain our unique characters. When the digital codes are translated using the 1252 codepage, we get unintelligible or weird output because the code points for those characters cannot be found in the codepage.
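
The gap described above can be checked directly. The following is a minimal Python sketch; the helper name `missing_from_codepage` and the sample string are illustrative assumptions, not anything from the post. It lists which characters of a text have no slot in codepage 1252:

```python
# Sketch: find which characters of a text have no slot in a given codepage.
# Sample: the six dot-below letters, then a-grave and a-acute.
YORUBA_SAMPLE = "\u1EB8\u1EB9 \u1ECC\u1ECD \u1E62\u1E63 \u00E0\u00E1"

def missing_from_codepage(text: str, codepage: str = "cp1252") -> list:
    """Return the characters of text that the codepage cannot encode."""
    missing = []
    for ch in text:
        try:
            ch.encode(codepage)
        except UnicodeEncodeError:
            missing.append(ch)
    return missing

for ch in missing_from_codepage(YORUBA_SAMPLE):
    print(f"U+{ord(ch):04X} is not in cp1252")
```

Only the dot-below letters are reported; the vowels with grave and acute accents do have positions in codepage 1252.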

I hope this explanation is not too technical and can be understood by all.

We really need Microsoft to create a new Codepage for Africa (Niger-Congo) Language Group, which includes Yorùbá (Yoruba).

Thanks

Ade G Oyegbọla (Oyegbola)
President & CEO
http://www.konyin.com

--------------------------------------------------------------------------------
Ade Oyegbola <Oyegbola@konyin.com> wrote:

Dear Dr. Ọlamijulọ (Olamijulo)

The problem is not fonts but "codepage"

Operating systems developers like Microsoft, Apple, Linux, IBM and others have to create a new "codepage" that contains all the characters in question. Without this new codepage, text translation in a non-Unicode compliant platform/application will be incorrect, which is what is happening.

An application that is fully Unicode compliant does not need a codepage to translate texts. While a lot of application developers are migrating towards Unicode text scripting in general, there are going to be lots of legacy applications out there that will still be dependent on codepages for text translation, so we really need to push, especially Microsoft, to create a new codepage for Africa. After all, all major language regions of the world have their own codepages.

First, Yorùbás (Yorubas) need to get Microsoft to create our own National Language Support (NLS) (also called Language Locale) page in Windows.

Second, each NLS is tied to a codepage that contains all the characters needed to translate the text in any operating systems.

Third, the new codepage has to be released as systems upgrade for all operating systems in the world to recognize and translate these texts correctly at all times.

At this time most Yorùbá (Yoruba) keyboard layouts use the 1252 (Latin 1) codepage. This codepage does not include our Ẹẹ, Ọọ, Ṣṣ (Ee, Oo, Ss, with dot below); only the acute and grave tonal signs are in this codepage.

FYI, we at KỌNYIN (KONYIN) have submitted a suggested codepage template request to Microsoft for creation, but Microsoft keeps telling us that it will not create a new codepage and we should concentrate our effort in promoting Unicode text scripting to application developers instead. We can use some help from the Yoruba community to pressure Microsoft into action.

Ade G Oyegbọla (Oyegbola)
Andrew Cunningham  174
02-27-2005 06:45 PM ET (US)
Dear Dr. Samuel Olamijulo,


QT - Dr.Samuel Olamijulo wrote:
> --QT-------------------------------------------------------------
> Reply by email or visit
> http://www.quicktopic.com/15/H/KKgbRqJUAR8/m173
> -----------------------------------------------------------------
>
> SUBJECT: Yoruba Internet Display Issues 02.25.05

> At my own user end, typing ALL Yoruba Letters, under -marks and
> tonal signs included, is now easy with practice, using free
> Arial Unicode MS font and ABD Yoruba Keyboard .

Arial Unicode MS isn't an ideal font, since

1) it is only available to people who own an appropriate Microsoft product like MS Office 2000, XP or 2003.

2) it doesn't have the OpenType features that handle combining diacritics.

Have a look at

http://www.openroad.net.au/languages/afric...uba/sample_nfd.html
http://www.openroad.net.au/languages/african/yoruba/sample.html

you'll need the appropriate fonts loaded.

> One important persisting user headache is in communication
> accross e-mail , yahoogroups, other web forums, websites and
> other Internet applications even when Arial Unicode MS or other
> Unicode Compatible fonts are used in the creation of the
> message.
>

E-mail will work well if you use a Unicode email client (using a POP3 or IMAP email account) and an appropriate font rendering technology.
Most web-based email services are flawed (from the point of view of multilingual capabilities).
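
To make the point about Unicode-capable mail concrete, here is a minimal sketch using Python's standard `email` package. The addresses, the subject and the helper name `build_message` are placeholders invented for the example; the sample text is one line of the passage quoted further down in this message. The message body is declared as UTF-8, so a receiving client that honours the declared charset gets the same characters back:

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# One line of the Yoruba passage, written with explicit code points.
SAMPLE = "If\u1EB9 mim\u1ECD \u1ECCl\u1ECDrun"

def build_message(text: str) -> EmailMessage:
    """Build a plain-text message whose body is declared as UTF-8."""
    msg = EmailMessage()
    msg["Subject"] = "Yoruba display test"
    msg["From"] = "sender@example.com"      # placeholder address
    msg["To"] = "receiver@example.com"      # placeholder address
    msg.set_content(text, charset="utf-8")  # charset goes into Content-Type
    return msg

wire = build_message(SAMPLE).as_bytes()          # what travels over the network
received = message_from_bytes(wire, policy=policy.default)
print(received.get_content_charset())
print(received.get_content().strip() == SAMPLE)
```

This only shows that the transport can carry the text intact; whether it displays correctly still depends on the receiving client and its fonts.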

Hotmail and Yahoo email services (the US, UK, AU and NZ ones at least) are only really suitable for English and a handful of other languages. They're not designed to handle a language such as Yoruba.

The same goes for Yahoogroups and many web forums.

For Yoruba, it is necessary to develop and implement Unicode based solutions that are appropriate for Yoruba, rather than relying on generic US-centric services that are not really suitable.

> The display at the author end may be good but at the receiver
> end letters Ee; Oo ;Ss with under-marks and Aa; Ee; I i ; Oo
> ; Uu with tonal signs with or without under-marks are often
> variously compromised.
>

... for a number of reasons ... if you want to use Yoruba on the internet ... use web tools suitable for Yoruba. I suspect that appropriate tools will need to be developed.



> For example in the display of the Yoruba passage below :
>

If you had included tones in your example, you would have had a more interesting and more rigorous test.

Andrew

--
Andrew Cunningham
e-Diversity and Content Infrastructure Solutions
Public Libraries Unit, Vicnet
State Library of Victoria
328 Swanston Street
Melbourne VIC 3000
Australia

andrewc@vicnet.net.au

Ph. 3-8664-7430
Fax: 3-9639-2175

http://www.openroad.net.au/
http://www.libraries.vic.gov.au/
http://www.vicnet.net.au/
Dr. Samuel Olamijulo  173
02-25-2005 06:00 AM ET (US)
SUBJECT: Yoruba Internet Display Issues 02.25.05

Dear Mr. Edward Cherlin, Greetings.

I appreciate your mail below and your very helpful interest in Yoruba Language Development for Yoruba People Development.

At my own user end, typing ALL Yoruba Letters, under-marks and tonal signs included, is now easy with practice, using free Arial Unicode MS font and ABD Yoruba Keyboard.

Yoruba Desktop Publishing is making tremendous progress with inputs from many good contributors from all over the world.

One important persisting user headache is in communication across e-mail, Yahoo groups, other web forums, websites and other Internet applications, even when Arial Unicode MS or other Unicode-compatible fonts are used in the creation of the message.

The display at the author end may be good, but at the receiver end the letters Ee, Oo, Ss with under-marks and Aa, Ee, Ii, Oo, Uu with tonal signs, with or without under-marks, are often variously compromised.

For example, in the display of the Yoruba passage below:

The Grace In Yoruba:

Ki õre ọfẹ Jesu Kristi Oluwa wa
Ifẹ mimọ Ọlọrun
Idapọ awọn enia mimọ
Ki o mã ba wa gbe
Lati isinyi lọ
Ati titi lailai
AMIN

Typed with Arial Unicode MS font in MS Word 2003 and ABD Yoruba Keyboard, available for free from:

http://www.africanportal.net/Publications/ABD/mktut1.htm

The initiating display at my end is perfect. I shall deliberately post at QuickTopic, A12n-forum and some Yahoo e-groups for display comparisons at the other end.
  

Thank you once again for your very constructive interest.

Dr. Samuel Kayode Olamijulo

------------------------------------------------------------------------------
Edward Cherlin <edward.cherlin@etssg.com> wrote:
On Saturday 12 February 2005 00:16, you wrote:

> Dear Mr. Edward Cherlin,
>
> 1. Please go ahead and report back to the Unicode list we are
> agreed that Yoruba Ss Oo Ee with undermarks be single
> letters that can be written with either a dot or a vertical
> line below in different fonts.

Done.

> 2. My Previous Comment:
> >This apparently must be coupled with representations
> > by ALL who have the access to Microsoft, Yahoo, Hotmail and
> > several other Major world ICT players to use on their
> > servers fonts that have the glyphs to present ALL Yoruba
> > letters correctly.
>
> Your Comment:
> >All of these, plus Apple, IBM, Sun, the Linux and BSD
> >communities, and many more will support whatever characters
> > are in Unicode. Some font vendors and developers of Free
> > fonts will include glyphs for every character in Unicode in
> > their font offerings. There is no problem there.
>
> My Updated Comment:
>
> Your observation sounds very encouraging. We look forward to
> actual speedy implementation by all to overcome current Yoruba
> display, storage and compatibility issues across different
> products and services.


There is no difficulty in displaying and transmitting Yoruba in Unicode. All that is needed is a suitable font, such as Arial Unicode MS or the free Code2000 or ClearlyU fonts. I find that I have more than 20 fonts in a variety of styles with these letters in my Debian Linux distribution. I can display Yoruba correctly in dozens of applications.


Anybody with experience in creating Linux keyboard layouts could add these Yoruba letters to a standard Latin alphabet keyboard in an hour. (I personally have been entering them throughout this discussion using a Unicode character table utility. I just keep the appropriate page (30 decimal, 1E hexadecimal) of the Unicode space open on my desktop while I type.) The Macintosh would take only a little longer, and Windows requires special tools, but I have been told of a Yoruba keyboard for Windows coming out next month. Let's see if I can find the message...Ah, yes.

============
> Finally, in any case I will like to inform all that
> KỌNYIN Keyboard with fourteen (14) additional Nigeria
> languages specific characters and twelve (12) tonal signs,
> which will allow for unlimited tonal sign combination on any
> vowel character without changing how users type today, should
> be available for sale before the end of March 2005. The new
> KỌNYIN keyboard is totally Windows 2000, 2003 and XP
> based and completely application and font independent.

Excellent.

> Ade G Oyegbọla
> LANCOR Technologies
> www.konyin.com
>
==========
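
The Unicode "page" mentioned earlier in this reply (30 decimal, 1E hexadecimal) can be verified with a short sketch. This is a Python illustration with an assumed helper name, `unicode_page`; here a page is taken to mean a block of 256 consecutive code points, so the page number is the code point divided by 256:

```python
# Sketch: which 256-code-point "page" of Unicode holds the Yoruba
# dot-below letters? The page number is the code point shifted right 8 bits.
DOT_BELOW_LETTERS = "\u1EB8\u1EB9\u1ECC\u1ECD\u1E62\u1E63"

def unicode_page(ch: str) -> int:
    """Return the index of the 256-code-point page containing ch."""
    return ord(ch) >> 8

for ch in DOT_BELOW_LETTERS:
    page = unicode_page(ch)
    print(f"U+{ord(ch):04X} -> page {page} decimal, {page:02X} hexadecimal")
```

All six letters fall on the same page, which is why keeping that one page of a character table open is enough for typing them.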

> Thank you for your very helpful interest.
>
> Dr. Samuel Olamijulo

--
Edward Cherlin, Simputer Evangelist
Encore Technologies (S) Pte. Ltd.
The Village Information Society
http://cherlin.blogspot.com
Dr. Samuel Olamijulo  172
02-15-2005 04:48 PM ET (US)
SUBJECT: Yoruba Decimal Counting System Re-Endorsement 02.15.05

Dear Compatriot Oladokun,
 
Thank you for your latest e-mail message to me. I appreciate our united worthy interest in the Development of Yoruba Language for the Development of ALL Yoruba People who have a variety of political and religious persuasions in our different current locations in many countries of today's world.
 
I think that the Yoruba Decimal Counting System proposed by Dr. Kayode Fakinlede is excellent. So convinced was I that last year I bought some copies of his dictionary containing the proposal in print to keep in my library and share with some very appreciative family members and friends.
 
I have also used every appropriate opportunity to bring this historically significant proposal to the attention of people.
 
For example at the Hope Africa E-Publisher Website in April 2004
 
http://www.hopeafricaepublisher.com/yoruba...contribution21.html
 
 
CONTRIBUTION 21
FAKINLEDE KAYODE
 
In the USA. Author of
"Modern Practical Dictionary : Yoruba-English, English-Yoruba"
  Which can be purchased from

E-Mail Contact : "Dr. Kayode Fakinlede" jfakinlede@aol.com
   
Proponent of historically significant Yoruba Decimal Counting System, very adequate for current Information and Communications Technology Age.

" Fakinlede J." In LagosForum

http://www.lagosforum.com/comment.php?NR=976

QUOTE

Reevaluation Of Yoruba's Complex Numerical System
If Yoruba would be counted as one of the major languages of the world, its complex numerical system would have to be simplified. Science and technology, the engines that drive the modern world depend largely on number manipulations.

REEVALUATION OF YORUBA'S COMPLEX NUMERICAL SYSTEM
by Dr. Kayode J. Fakinlede - Author of Modern Practical Dictionary;
Yoruba-English, English-Yoruba

"Four scores and seven years ago, our forefathers brought forth .."
The event being referenced happened eighty seven years before or thereabouts. If the event had taken place in Yoruba land, and Lincoln were a Yoruba leader, he would have expressed the year as "three less ten less five times twenty years ago.." This is cumbersome indeed!

But then, using this form of numerical system in the Gettysburg Address was Lincoln's way of injecting some sense of history and emotion into a situation that needed it. Today, a young American schoolboy, unencumbered with the rigors of a civil war, would be better served if Lincoln had just said "Eighty seven years ago .."

If Yoruba would be counted as one of the major languages of the world, its complex numerical system would have to be simplified. Science and technology, the engines that drive the modern world depend largely on number manipulations. This means that the system of performing rigorous mathematical mechanics before arriving at a given quantity has to give way. The duodecimal numerical system, with which our forebears counted their heads of cattle, chickens in the barn and cocoa bags would have to be replaced with a decimal system.

It is the decimal numeric system that we can use to count the number of people that occupy Yoruba land, the distance of the earth to the moon, the number of molecules in a substance, etc. The duodecimal system would just not move the Yoruba language into the modern era.

Another reason to change to the use of the decimal system of counting is that it stops making counting itself the objective. It takes a Herculean task to express a number like 59,993 in Yoruba, even for learned speakers of the language. A speaker of English would have expressed this number as fifty nine thousand, nine hundred and ninety three, and proceeded to other endeavors, while the Yoruba person would still be busy with his own system of successive divisions, multiplications, subtractions and additions, just to arrive at the said number.

The third reason why this change is necessary is that it immediately includes more people in the national debate. How many Yoruba people, besides the educated ones, actually understand the implications of the national budget? How does one explain to an uneducated person that the government will spend more than one trillion naira in a year? If an average Yoruba farmer were to grasp the full implication that the government's debt is forty billion dollars and that it intends to borrow another three billion more, he probably would be more critical of the government's priorities.

In the decimal system, the numbers are called as they are written. In the Yoruba system, the written number does not always have a corresponding voice pattern. For example, a simple number like 97, is called, etadinlogoorun - three less five times twenty. None of the numbers used in the pronunciation can be deciphered from the written symbol.

The Yoruba system already starts on the good foot. All that needs to be done is a little fine-tuning to get to the destination. A unit in Yoruba is called eyo. Thus, three is eyo meta. A bundle comprising ten units is taught in the primary schools as idi. Thus 40 can be called idi merin.
In Modern Practical Dictionary: Yoruba-English, English-Yoruba, the following words have been used for higher denominations: Apo (bag) represents 1 hundred; Oke (a big bag) represents 1 thousand; odu (a collection of bags) represents 1 million; eeru represents 1 billion.
The logic behind these delineations is this: we tie eyo to make an idi; we collect many idi into an apo. A collection of apo makes an oke. A load of oke form an odu. The word eeru (billion) derives from the fact that when things multiply exceedingly, Yorubas say o gbeeru. Therefore, a collection of odu is referred to as eeru.

Therefore, a number like 59,993 can now be called as it is written, idi marun le esan oke, apo mesan ati idi mesan le eta.

All in all, a logical step in making the Yoruba language user friendly and relevant to the modern age begins by revamping the numerical system to meet new, world developments.

Dr. Kayode J. Fakinlede (2004-02-03)

END OF QUOTE
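
The place-value reading in the quoted proposal can be checked arithmetically. The sketch below is an illustration in Python, not part of Dr. Fakinlede's text. It uses only the four unit names given in the quote (oke, apo, idi, eyo), assumes numbers below one million, and does not attempt to generate the Yoruba number words themselves. The helper name `decompose` is hypothetical.

```python
# Place values named in the quoted proposal (largest first).
PLACES = [("oke", 1_000), ("apo", 100), ("idi", 10), ("eyo", 1)]

def decompose(n: int) -> dict:
    """Split n into counts of oke (1000s), apo (100s), idi (10s), eyo (1s)."""
    counts = {}
    for name, value in PLACES:
        counts[name], n = divmod(n, value)
    return counts

print(decompose(59_993))

# The traditional readings cited in the quote, written as arithmetic:
TRADITIONAL_97 = 5 * 20 - 3        # "three less five times twenty"
TRADITIONAL_87 = 5 * 20 - 10 - 3   # "three less ten less five times twenty"
print(TRADITIONAL_97, TRADITIONAL_87)
```

For 59,993 this gives 59 oke, 9 apo, 9 idi and 3 eyo, matching the quoted reading in which the 59 thousands are themselves spoken as five idi plus nine. The traditional forms, by contrast, are built from multiples of twenty with subtractions.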
 
I heartily re-endorse this Yoruba Decimal Counting System to all Yorubas worldwide, for us all to endorse repeatedly, as individuals and in groups, to as many other people as possible.
 
It will be helpful when Dr. Fakinlede concludes and implements his arrangements to publish at least this Yoruba Decimal Counting proposal complete online, for easy, more effective global access and common usage as soon as possible.
 
Thank you
 
Dr. Samuel Kayode Olamijulo
Dr. Samuel Olamijulo  171
02-09-2005 11:02 AM ET (US)
SUBJECT: Yoruba Proverbs Online
Dear All, Greetings.

The more than 100 million people in the Yoruba World Community (Iran Yoruba) must be deeply appreciative of the very large number of Yoruba Proverbs published in Yoruba online

at http://libr.unl.edu:2000/yoruba/

by Professor Oyekan Owomoyela at the University of Nebraska - Lincoln. USA.

We are also very grateful to Compatriot Irenikatche Akponikpe for bringing this to our attention.

At my own end, access to the website is easy. Viewed in Internet Explorer with the default Western European(Windows) encoding, the Yoruba is beautifully complete with under marks, tonal signs and all.
The content is high quality.

Please endorse to family members, friends and others who may or should be interested.

Thank you.
Olamijulo S.K.

-------------------------------------------------
QUOTE
From: akponikpe [1]
 Sent: 01 February 2005 15:59
 To: yorubaworld@yahoogroups.com
 Subject: [YorubaWorld] Yoruba proverbs

 Mo ki gbogbo yin, at'ewe at'agba,

 For those who don't know, this website below may be of great interest. It is a set of thousands of Yoruba proverbs compiled and translated by Dr. Oyekan Owomoyela from the University of Nebraska - Lincoln.

 "Owe leshin oro, bi oro ba sonu, owe la fi nwa a" (Proverb is the horse of speech, when speech is lost, proverb is the means we use to hunt for it).

 http://libr.unl.edu:2000/yoruba/

 Ire o
 Irenikatche Akponikpe


END OF QUOTE
   170
02-05-2005 06:13 PM ET (US)
Deleted by topic administrator 02-05-2005 07:05 PM
Ade G Oyegbọla  169
02-01-2005 04:38 PM ET (US)
Subject: Individual characters and combining diacritical marks, cross platform translation of characters, and dot or vertical.

First, there seems to be a misunderstanding here as to what needs to be done relating to Yoruba characters like Ẹẹ, Ọọ, and Ṣṣ. (Note that the S with dot did not show correctly; this is because the font in use on the server does not have the glyphs to present it correctly.) These three characters are not combining diacritics in the Yoruba language; they are independent characters and have their own code points in the Unicode charset. The Ẹẹ and Ọọ are also vowels to which tonal signs apply; therefore using these two characters with tonal signs does involve combining diacritical marks.
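
The distinction drawn above can be seen with Python's standard `unicodedata` module. This is only a sketch and the variable names are illustrative. The dot-below letter has a code point of its own, but Unicode defines no single code point for that letter plus a tone mark, so the tone stays a separate combining mark even after normalization:

```python
import unicodedata

E_DOT = "\u1EB9"  # LATIN SMALL LETTER E WITH DOT BELOW (one code point)

# Canonical decomposition: base letter followed by COMBINING DOT BELOW.
decomposed = unicodedata.normalize("NFD", E_DOT)
print([f"U+{ord(c):04X}" for c in decomposed])

# The same letter with an acute tone mark, entered in two different ways.
typed_a = "e\u0323\u0301"   # e + dot below + acute
typed_b = "\u00E9\u0323"    # precomposed e-acute + dot below
nfc_a = unicodedata.normalize("NFC", typed_a)
nfc_b = unicodedata.normalize("NFC", typed_b)
print([f"U+{ord(c):04X}" for c in nfc_a])
print(nfc_a == nfc_b)
```

Both inputs normalize to the same two code points: the precomposed dot-below letter followed by the combining acute accent.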

Second, with regard to Dr. Ọlamijulọ's comment on the display of fonts from one platform to another, here is the issue: most of the multilingual keyboard layouts are created using Language IDs that are based on the 1252 codepage. But the 1252 codepage does not include all Yorùbá characters. If a receiving platform is Unicode compliant, it will not need the codepage to translate the text correctly; it will use the more expansive Unicode commands. OS platforms like Linux, and email servers used by Yahoo, AOL and others that are not fully Unicode compliant, use the system codepage to translate the text, and where a specific character is not included in the codepage, the translation is lost and incorrect.
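
A small sketch of the failure described above, in Python: text that was sent as UTF-8 is read by a receiver that assumes codepage 1252. The helper name `view_as_cp1252` is made up for the example, and real mail servers may fail in other ways, so this shows only the general mechanism:

```python
NAME = "Oyegb\u1ECDla"   # surname containing U+1ECD (o with dot below)

def view_as_cp1252(text: str) -> str:
    """Encode as UTF-8, then decode as if the bytes were codepage 1252."""
    return text.encode("utf-8").decode("cp1252", errors="replace")

garbled = view_as_cp1252(NAME)
print([f"U+{ord(c):04X}" for c in garbled])

# A receiver that decodes with the encoding actually used gets the text back.
intact = NAME.encode("utf-8").decode("utf-8")
print(intact == NAME)
```

The single letter turns into three unrelated symbols: an a-acute, a closing guillemet and a replacement character, because its three UTF-8 bytes are each looked up separately in the codepage.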

Third, dot or vertical line? It is very interesting that Yorubas are still trying to fit everything into a single straitjacket. The dot or vertical line, and for that matter horizontal line or tear drop, should be considered as writing styles and not emphatic or de-facto character representation. The Unicode Organization told us some time ago that these characters are already represented by a codepoint in their chart and they do not make any representation as to the official glyphs. Let's think about this again: if a single glyph shape were the only acceptable style, font creators would be out of business.

Finally, in any case I would like to inform all that the KỌNYIN Keyboard with fourteen (14) additional Nigerian-language-specific characters and twelve (12) tonal signs, which will allow for unlimited tonal sign combination on any vowel character without changing how users type today, should be available for sale before the end of March 2005. The new KỌNYIN keyboard is totally Windows 2000, 2003 and XP based and completely application and font independent.

We are now almost done with the Africa (Niger-Congo) version of the keyboard, which will include twenty-six (26) African-language-specific characters and nineteen (19) tonal signs.

Ade G Oyegbọla
LANCOR Technologies
www.konyin.com
Ade G Oyegbọla  168
02-01-2005 04:24 PM ET (US)
Deleted by author 02-01-2005 04:31 PM
Omo Ewi  167
01-28-2005 11:46 AM ET (US)
Yoruba nikan la nso lori Egbe Yahoo! ti a npe ni Tiwantiwa.

E le ye wa wo lori:

http://uk.groups.yahoo.com/group/Tiwantiwa

A oo maa reti yin lohun!
Dr. Samuel Olamijulo  166
01-28-2005 03:46 AM ET (US)
Subject: Yoruba Letters Undermark Discussion Update- Jan 05

Dear Dr Don Osborn and James Fox,

1. Yes the display of especially the Yoruba component of my posting on QuickTopic M162 typed with ABD Yoruba Keyboard using Arial Unicode MS font was perfect at my end. This was unfortunately not the same at Yahoo e-groups or even A12n-forum.

2. In April 2004, there was a useful discussion with inputs from experts and Yoruba language stakeholders from all over the world on the Yoruba Vowels and S undermarks issue. Links to the contributions are at:


http://www.hopeafricaepublisher.com/yoruba...tributionlinks.html

THE CONSENSUS

QUOTE
It is useful for Yoruba Language Products Developers in particular and modern Yoruba Speakers, Writers and Students all over the world in general to be at least aware of some of the many Yoruba fonts and keyboards currently available. This should equip all stakeholders to participate better in the worthy pursuit of useful harmony in the face of different potential choices. Informed contributors observed that variety in Yoruba font styles is neither bad nor unique among font styles available for many other languages generally worldwide. It appears reasonable and practicable in contemporary Yoruba to accept the suggestions for a "standard" Yoruba using the dot under "E,e"; "O,o"; "S,s" and a "classical" Yoruba using the vertical line under "E,e" ; "O,o"; "S,s" .
The drive is to get the same code assigned to these two styles of the same letter leaving the installed font to render it one way or the other as per the underlying chosen style of the font. This should facilitate the development and better distribution of many more user friendly UNICODE COMPATIBLE YORUBA PRODUCTS AND SERVICES for the good of all Yoruba Language stakeholders worldwide.

END OF QUOTE

It is useful to know what various experts and Yoruba stakeholders think now.

Olamijulo S.K.
-----------------------------------------------------

QuickTopic daily digest <qtopic+15-KKgbRqJUAR8@quicktopic.com> wrote:
Date: 28 Jan 2005 05:07:11 -0000
Subject: Yoruba language & ICT (fonts, keyboards & applications)
From: QuickTopic daily digest
To: samola43@yahoo.com

--QT-------------------------------------------------------------
Messages for the topic "Yoruba language & ICT (fonts, keyboards & applications)" for 01-27-2005.
Reply by email or visit
http://www.quicktopic.com/15/H/KKgbRqJUAR8

-----------------------------------------------------------------

From: BisharatNet Time: 01:26 PM
Samuel, Your text in /m162 displays only one dot-under, but tone
marks are there. Does it appear on the QuickTopic screen as you
intended it to?

Don Osborn
Bisharat.net
------------------------------------------------------------
BisharatNetPerson was signed in when posted  165
01-27-2005 01:48 PM ET (US)
Here is my reply to the letter in the previous message below, /m164.

James, Thanks for bringing this up. The possibility of choosing one form and letting font designers modify the glyph in Yoruba Unicode fonts for reasons of style and aesthetics has seemed to me to be a reasonable avenue to resolve this issue but there are complications.

Using the combining vertical line below U+0329, which seems to be in the charts uniquely for Yoruba (?), and using it effectively as "combining mark below" would resolve the problem. But I had the impression that this kind of solution is frowned upon (I am not familiar with the Greek example you mention).

On another level, it might handicap efforts at unified/standardized Nigerian usage of dot-under (mainly precomposed) characters. Some other languages, notably Igbo, have similar diacritics, but appearing uniquely (?) as dots. I may be the only one bringing this kind of issue up, but if you look at current pan-Nigerian software efforts that seek to serve a multilingual market (Konyin, Paradigm, to name two small commercial ones), they utilize dot-under diacritics.

Are there ways that software can handle different characters and combinations of characters as equivalents in certain language settings?
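
One possible way to approach that question is sketched below in Python. It is an assumption-laden illustration, not an established practice: the function `fold_under_mark` is hypothetical and simply maps the combining vertical line below onto the combining dot below before comparing, leaving the stored text untouched.

```python
import unicodedata

DOT_BELOW = "\u0323"             # COMBINING DOT BELOW
VERTICAL_LINE_BELOW = "\u0329"   # COMBINING VERTICAL LINE BELOW

def fold_under_mark(text: str) -> str:
    """Hypothetical comparison key: treat both under-marks as dot below."""
    decomposed = unicodedata.normalize("NFD", text)
    unified = decomposed.replace(VERTICAL_LINE_BELOW, DOT_BELOW)
    return unicodedata.normalize("NFC", unified)

with_dot = "\u1ECDm\u1ECD"        # precomposed o-with-dot-below, m, same vowel
with_line = "o\u0329mo\u0329"     # o + combining vertical line below, m, same

print(with_dot == with_line)                                    # plain comparison
print(fold_under_mark(with_dot) == fold_under_mark(with_line))  # folded comparison
```

The plain comparison reports the two spellings as different, while the folded keys match. A search or sort routine could use such a key for Yoruba text only, since other languages may use the two marks distinctly.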


Don Osborn
Bisharat.net
BisharatNetPerson was signed in when posted  164
01-27-2005 01:31 PM ET (US)
The following suggestion re standardizing the choice of diacritic for the dot/line under characters was posted to the Unicode list today. U+0329 is the codepoint for the combining vertical line below (combining meaning it can be added to letters to make the diacritic character):

You probably know how Yoruba uses a mark below certain vowels, as well as the letter s, to represent certain phonetic distinctions. There seems to be a variety of shapes for this symbol: a dot below, a short vertical line below, the same as before, only attached to the letter above it, a teardrop below, and even (so I've read), a small Greek cross below.

I've read on this list and elsewhere, that there is no real preference between these shapes. However, since there are different Unicode characters for at least the combining dot below and the combining short vertical line below, there is a lot of confusion over what to use. I believe that it has even been said earlier on this list, that until a particular diacritic is picked, a proper representation of Yoruba in computers will be stymied.

However, after thinking about it, I think there could be an alternative. There is an analogous situation in Greek, where a particular diacritic (the perispomeni) can be shown as a tilde, an inverted breve, or a macron, depending on the style. All of these already have their own codepoints, but the perispomeni has its own position, U+0342. This avoids having to change codepoints every time you change the typeface, and treats a single logical character as such, rather than a set of 3 different characters.

Perhaps something like this could be done for the Yoruba under-mark? After all, the different forms used are apparently merely a matter of style, but Unicode disunifies them. Would it be possible to pick, say, U+0329, as the Yoruba under-mark *exclusively*, and note that it has possible alternative glyphs in the code charts? Or, if that would conflict with possible alternate uses for U+0329, could a new combining diacritic be encoded just for Yoruba?

Does anyone have any comments?

James Fox
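
For readers who want to check the codepoints named in the letter, here is a small Python sketch using the standard `unicodedata` module. It only looks up what the Unicode character database says about the three marks, and shows that the dot-below and line-below forms of "s" stay distinct even after normalization.

```python
import unicodedata

# The three combining marks mentioned in the letter above.
for cp in (0x0323, 0x0329, 0x0342):
    ch = chr(cp)
    print(f"U+{cp:04X}", unicodedata.name(ch), "class", unicodedata.combining(ch))

# "s" with each under-mark: normalization does not unify them.
s_dot = unicodedata.normalize("NFC", "s\u0323")
s_line = unicodedata.normalize("NFC", "s\u0329")
print(s_dot == s_line)  # False
```

The dot-below form composes to a single precomposed letter in NFC, while the line-below form stays as two code points, which is one concrete way the two marks behave differently in software.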
BisharatNetPerson was signed in when posted  163
01-27-2005 01:26 PM ET (US)
Samuel, Your text in /m162 displays only one dot-under, but tone marks are there. Does it appear on the QuickTopic screen as you intended it to?

Don Osborn
Bisharat.net
Dr. Samuel Olamijulo  162
01-19-2005 04:19 PM ET (US)
Yoruba Classical, Net Message by Olamijulo S.K. -01.20.05

 

ABD Yoruba Keyboard downloaded with free tutorial from:

 

http://www.africanportal.net/Publications/ABD/mktut1.htm

 

 Windows XP Professional,

MS Word 2003 Web Layout View, Arial Unicode MS Font

Saved as Web Page,

Sent with I.E. Western European (Windows) Encoding

 

Ẹ kú dẽde iwòyí gbogbo iran Yorùbá ni àgbáiyé ,

Èdùmàrè á fi ibùkún si ãyan ati õgun gbogbo wa o - Amin

 

QUESTION: All of the above is beautifully clear on my desktop.

Can you read with under dots and tonal signs at your end?
Thank You for your helpful attention.

Olamijulo S.K.
Maliha Syeda  161
01-12-2005 12:30 PM ET (US)
I desperately need someone who can type up less than 150 words of hand written yoruba on the computer in yoruba font. I am willing to pay for this service.

Please call me urgently on 01582517448

Maliha
Paradigm International  160
01-04-2005 12:08 PM ET (US)
Edited by author 01-04-2005 12:12 PM
Greetings and Happy New Year.

Kindly see our modest contributions towards the promotion of Nigerian languages since 2001. Our award winning word processor and translator - Paradigm Lingua® makes it easy to produce documents in several African languages on the PC.
http://www.paradigmint.net/lingua.htm
For details, please see our News Page.
BisharatNetPerson was signed in when posted  159
12-30-2004 07:53 PM ET (US)
Thank you Samuel for your thanks and for the list of people contributing in one way or another to the use of Yoruba in ICT (although personally I can't claim to have done much). The list itself is a very helpful summary, though if anyone can think of others, please let us know.

I'd also like to take the opportunity to recognize people working for computer & internet access in the field who are taking account of the language dimension in their work (like Pam McLean and the OOCD organization).

Best wishes to all for a happy and productive new year 2005!

Don Osborn
Bisharat.net
Dr. Samuel Olamijulo  158
12-29-2004 04:12 PM ET (US)
Subject: Yoruba Language Development- December 2004 Thank You All.

The Renewed Gratitude of Yoruba World Community-Iran Yoruba- goes to but is not limited to:

 1. Bomi Olamijulo Oki for ABD YORUBA KEYBOARD.

This is still the most user friendly YORUBA Keyboard I have so far come
across to use for YORUBA language desktop work at my end. I can type both Yoruba and
English on the same keyboard with no need for switches. Underdots, tonal signs,
and combinations of both I now easily do with this keyboard. Above all it is
available for free downloads and free tutorials to everyone from the weblink
below. It can be used with Arial Unicode MS or other Unicode compatible fonts also
available for free to all on the Internet.

 http://www.africanportal.net/Publications/ABD/mktut1.htm

 2. Fakinlede Kayode in USA.
Proponent of historically significant Yoruba Decimal Counting System and Author of Modern Practical Dictionary: Yoruba-English, English-Yoruba.

E-mail Contact : Dr. Kayode Fakinlede jfakinlede@aol.com

3. Olamijulo S.K of Hope Africa E-Publisher

For various free products at

     Website : http://www.hopeafricaepublisher.com

 
4. Schleicher Yetunde Antonia in USA of the University of Wisconsin-Madison Yoruba Program, Producers and Distributors of Yoruba Products including Yoruba Fonts
available for download at:

 http://african.lss.wisc.edu/yoruba/font/index.html

5.Aloba Babatola in Vienna Austria of Aloba Publishers


at http://www.aloba.at/eng/index-e.html

6. Onayemi A.O. in Ontario Canada of Learn Yoruba Website .
Distributors of Free YorubaOK Font and Keyboard Program
    at: http://www.learnyoruba.com

7. Olamijulo S.K. of Hope Africa E-Publisher
on "Yoruba African Computer Keyboard - Do It Yourself Technique", First Published July 2003, Updated November 2004 at

http://www.hopeafricaepublisher.com/yoruba-keyboard.html

8. Lawal Olalekan in the United Kingdom of YorubaWorld at


     http://groups.yahoo.com/group/yorubaworld

9. Oyegbola Ade G. in USA of KONYIN KEYBOARD


    at http://www.konyin.com

10. Olawole Oladele in Norway of D-Net Communications


   at http://www.dnetcom.com/products/fonts.html


  and http://www.africaservice.com

11. Awoyale Yinka in USA of the University of Pennsylvania Linguistic Data Consortium, working on an Electronic Yoruba Dictionary using a Yoruba Font:
at http://ccat.sas.upenn.edu/afl/yoruba.html

12.Arasanyin Olaoba F. in USA of Georgia Southern University Edeyede Yoruba Digital Dictionary Project. Producers of Yoruba Font, which can be downloaded from

http://www.yoruba.gasou.edu/template.php?a=13


13.Osborn Don in Niger of Bisharat.net


      at http://www.bisharat.net

14. Cunningham Andrew in Australia of Openroad
    at http://www.openroad.net.au/languages/files/yo41.html

15.Adegbola Tunde in Ibadan Nigeria
of African Languages Technology Initiative (ALT-I)
  at http://www.alt-i.org/projects.htm

16. Ajayi G. O. in Abuja Nigeria
of the National Information Technology Development Agency
      at http://www.nitda.org/projects/kbd/index.php

17. Akindana Martin in USA
of AfriK Network Groups
at http://www.ChatAfriK.com
  Moderator E-Groups including Yorubas-Community at
http://groups.yahoo.com/group/yorubas-community


18. Kehinde L.O. in Ile-Ife, Nigeria.
      Deputy Vice-Chancellor ( Administration )
      At the Obafemi Awolowo University, Ile- Ife, Oshun State, Nigeria.
  Website: http://www.oauife.edu.ng

19. Omidire Felix Ayoh in Salvador-Bahia, Brazil.
       Exchange Professor ( OAUIfe/UFBa),
       Centro de Estudos Afro-Orientais da UFBA,
       Salvador-Bahia ,
        Brazil.
         Website: http://www.ufba.br.
        E-mail Contact: fomidire@yahoo.fr

 

20. Several other Yorubas and Friends in the International community like Chris Harvey, James Kass, Trond Trosterud and others who have worked hard on Yoruba Language products and services. We appreciate you all. May Olodumare abundantly bless you all and crown your labors with breakthrough successes in year 2005 and beyond.

 

Thank you.

Olamijulo S.K.
BisharatNetPerson was signed in when posted  157
12-25-2004 12:28 AM ET (US)
There certainly seems to be a lot of activity re Yoruba and ICT, particularly with regard to keyboards. I was particularly interested to note the open-source effort mentioned by Remi-Niyi Alaran in /m155. May I ask who exactly is working on it?

It would also be interesting to know how the OSS and MS efforts will deal with some of the character combination issues that have been discussed on this board and elsewhere like on A12n-forum & A12n-collaboration.

Don Osborn
Bisharat.net
BisharatNetPerson was signed in when posted  156
12-25-2004 12:23 AM ET (US)
Edited by author 12-25-2004 12:29 AM
This item may be of interest...

Don Osborn
Bisharat.net

"Microsoft Endorses Due Process in IT" This Day (Lagos)
December 15, 2004, Posted to the web December 16, 2004
http://allafrica.com/stories/200412160060.html

" ... Microsoft Nigeria has also begun a process of software localisation in the three major Nigerian languages as part of its commitment towards reinvesting into the Nigerian society. By this, Microsoft intends to come up with a software programme that would enable functions to be performed in the three local languages of Igbo, Yoruba and Hausa. Ilukwe disclosed this in Lagos last week, saying that such a programme has already been produced in South African for Swahili language."
Remi-Niyi Alaran  155
11-23-2004 06:41 PM ET (US)
Edited by author 11-23-2004 06:52 PM
E sheun gàn ò fun "keyboard".
Àwà ti n gbimò lati sò "applications" bi "Mozilla web browser, Open office productivity suite" ati "Linux operating system" si édé wa. Ishe yi nlò lòwò. Opò rè jè itùnmo (translations). Àwòn Eleyedé ti shé ishéè yi ri. Sugbon itùnmo kàn si èyò kàn loshè shè pèlu Eleyedé. Óma ràn wà lòwò tì Eleyedé bà ran wà lòwò.

Àpà kéji ni itùnmo tuntun fun òro ti a nlo nì "Science, ICT, Engineering, Medicine...".

// Thanks for the keyboard.
// We have started work on translating applications like Mozilla, OpenOffice and Linux into our language. This work is ongoing. Much of (the basic dictionary) work has been done by Eleyede. But Eleyede only allows word by word searches. It will be very helpful to collaborate with Eleyede developers.

// The second part of our work will comprise translations for the newer terms used in S.T.E.M ... Examples of this work can be seen at the link provided below.

Link=
http://p221.ezboard.com/fnigeriadiscussion...pic&start=1&stop=20

***ata***
Bomi Olamijulo-Oki  154
11-15-2004 04:51 PM ET (US)

Hearty congratulations to Mr. Adegbola and the team at Alt-I. Makes me feel really good to know that more and more people, many of whom I am not even aware of I'm sure, are working on Yoruba Language Development all over the world.

I look forward to a few years(?) from now, when 100% functioning, universally compatible, and universally accessible Yoruba keyboards will be taken as a "normal" thing.... we will get there!

Once again, thanks to everyone who has contributed / is contributing/will contribute to Yoruba development. Ire o!

Regards,

Bomi Olamijulo-Oki
http://www.africanportal.net
Dr. Samuel Olamijulo  153
11-15-2004 01:48 AM ET (US)

SUBJECT: Yoruba Song with ABD Yoruba Keyboard-Nov15,2004
 
Free user tutorial and download at
http://www.africanportal.net/Publications/ABD/mktut1.htm

Ọmọ ẹgbọn ọmọn àbúrò

Ẹ fãra yín mọnra

 
Olamijulo S.K.
Dr. Samuel Olamijulo  152
11-15-2004 01:14 AM ET (US)


SUBJECT: ABD Yoruba Keyboard User Experience-Nov.9,2004

I accessed for free to read all of a short, simple and helpful guide at the ABD Yoruba Keyboard link below updated on November 7.

After that, I downloaded and installed the Unicode Compatible Keyboard to my computer for free.

With MS Word, Selecting Arial Unicode MS font, Using this ABD Yoruba Keyboard;

 With the same keyboard, and without any switches, I can now write English and also:

1. Type ALL Yoruba Alphabets.

2. Undermark ALL Yoruba vowels and “S”.

3. Apply tonal signs on ALL vowels and “n”.

4. Apply combined undermark and tonal signs to “E e” and “O o”.

5. Write Naira, USD, BPS , EURO and Yen symbols.

6. Send and Read Emails with Yoruba passages.

In conclusion, I have personally found the Free ABD Yoruba Keyboard excellent for Unicode Compatible Yoruba writing on my computer and for doing e-mails.
Yoruba postings on web forums still present a challenge. Improvements are currently being explored. In any case, the only direction Yoruba on computers and Internet must head is UP.

As a privileged African, Nigerian, Yoruba elder, I am very happy to be alive in this era of Yoruba Language and People Development in the broader context of African People Development Worldwide. I pray and sincerely believe that Olodumare will bless and prosper many eminent persons from different country and language nationalities, known and unknown; who have contributed, are contributing and will in future contribute to Yoruba Language and People Development.

We love and appreciate them all.

Learn more, Download and Share with others for free from
 
ABD Yoruba Keyboard at:
 
http://www.africanportal.net/Publications/ABD/mktut1.htm
 
Thank you.
Olamijulo S.K.
BisharatNetPerson was signed in when posted  151
11-14-2004 11:32 AM ET (US)
The influential American newspaper The New York Times had an article in its Nov. 12, 2004 edition by Marc Lacey entitled "Using a New Language in Africa to Save Dying Ones."*

Below are excerpts mentioning Yoruba and Alt-I (re the latter, see also on this message board, /m72 and /m104).**

Don Osborn
Bisharat.net

"Technology can overrun these languages and entrench Anglophone imperialism," said Tunde Adegbola, a Nigerian computer scientist and linguist who is working to preserve Yoruba, a West African language spoken by millions of people in western Nigeria as well as in Cameroon and Niger. "But if we act, we can use technology to preserve these so-called minority languages."

...

Mr. Adegbola, executive director of the African Languages Technology Initiative, has developed a keyboard able to deal with the complexities of Yoruba, a tonal language. Different Yoruba words are written the same way using the Latin alphabet - the tones that differentiate them are indicated by extra punctuation. It can take many different keystrokes to complete a Yoruba word.

To accomplish the same result with fewer, more comfortable keystrokes, Mr. Adegbola made a keyboard without the letters Q, Z, X, C and V, which Yoruba does not use. He repositioned the vowels, which are high-frequency, to more prominent spots and added accent marks and other symbols, creating what he calls Africa's first indigenous language keyboard. Now, Mr. Adegbola is at work on voice recognition software that can convert spoken Yoruba into text.

...

Mr. Adegbola ... is distributing his keyboard free to influential Yoruba speakers, hoping to attract some deep-pocketed entrepreneur who could turn it into a business venture.

...


* The article is on the web at http://www.nytimes.com/2004/11/12/internat...frica/12africa.html & http://news.com.com/Using+the+kompyuta+to+...7337_3-5449598.html
** The ALT-I website is http://www.alt-i.org
   150
11-06-2004 10:59 PM ET (US)
Deleted by topic administrator 11-07-2004 08:44 AM
Bomi Olamijulo-Oki  149
11-01-2004 01:20 PM ET (US)

"ABD YORUBA" KEYBOARD

* Free, easy download
* Extremely easy to use, with reduced keystrokes
* Types Yoruba and English very well, on the same keyboard
    - No keyboard switching required
* Send pure Yoruba Email with Hotmail, Yahoo, MS Outlook
* Create/Transfer Yoruba MS Word documents
* Publish your Yoruba work online


Click on the link below to learn more, download and view examples. Please share with family members, friends and others who may wish to type in Yoruba.

URL:
http://www.africanportal.net/Publications/ABD/mktut1.htm

Thank you,

Bomi Olamijulo-Oki
Bomi Olamijulo-Oki  148
11-01-2004 01:09 PM ET (US)
Re: Meaning of "Bologun"

Hello Sam (/m141)

I do not know the meaning of "Bologun".. Did you mean "Balogun"? Balogun means "General" i.e. Army officer in Yoruba.

Hope this helps!

Cheers,

Bomi
   147
10-28-2004 10:59 PM ET (US)
Deleted by topic administrator 10-29-2004 07:15 AM
BisharatNetPerson was signed in when posted  146
10-22-2004 07:37 AM ET (US)
The Yoruba text displays properly in /m143 with my MSIE6 browser set to UTF-8. In another encoding it is unreadable. Your text did not include any tone markings (as combining diacritics) over the dot-under vowels. I copy below a text from another of your letters with such tone markings. (Without a proper font, these may appear as empty boxes.)

Òkè t’álájàpá kò le gùn

à tí gùn á ti sọ̀
 
Ìran Yorùbá pegedé

Gbogbo ọmọ Yorùbá ni àgbáyé nilati fi ọwọ́
sowọ́pọ̀ pẹ̀lú ifẹ́ fún ìtẹ̀siwájú.
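
The "unreadable in another encoding" effect described above can be reproduced in a few lines. This is a minimal Python sketch, not a description of what the message board or the browser actually did: it assumes the text was stored as UTF-8 and then read back as a single-byte Western encoding (Latin-1 is used here purely for illustration).

```python
# "omo" with a dot below each o, written with precomposed letters (U+1ECD).
text = "\u1ecdm\u1ecd"

raw = text.encode("utf-8")       # each dotted o becomes three bytes
misread = raw.decode("latin-1")  # wrong decoder: one character per byte

print(len(text), len(raw), len(misread))  # 3 7 7
print(ascii(misread))

# The damage is reversible as long as no byte was dropped along the way.
repaired = misread.encode("latin-1").decode("utf-8")
print(repaired == text)  # True
```

Latin-1 maps every byte to some character, so the round trip above is lossless. Windows code page 1252 leaves a few byte values unassigned, so a pipeline that uses it may silently lose bytes and make the text harder to recover.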

Don Osborn
Bisharat.net
BisharatNetPerson was signed in when posted  145
10-22-2004 07:29 AM ET (US)
Re /m141 requesting help with the meaning of a word, see the link to an online Yoruba dictionary at /m89.

Re /m142, this is beyond the scope of this group's topic.

Don Osborn
Bisharat.net
gggg  144
10-20-2004 01:10 PM ET (US)
gfjli;;
Olamijulo S.K.  143
10-19-2004 05:53 PM ET (US)

SUBJECT: Yoruba Reading and Writing on Computers and Internet-Olamijulo S.K.-Oct.20, 2004

Over 100 million Yorubas, Ìran Yorùbá, currently living in many countries around the world, and others who have the need or the interest, should be enabled to read and write mutually intelligible Yoruba easily, often and efficiently on computers and the Internet.
We must and shall all cooperate constructively to make this happen very soon.
Please indicate if you can read the Yoruba lines below on your own computer.

A kí gbogbo Ìran Yorùbá ní igun mẹrẹrin àgbáyé
Okè la ó má a lọ
Nigbà tá ò sèbàjẹ ọmọ ẹnikejì
Okè la ó má a lọ
Gbogbo wa níláti fi ọwọ sówọpọ pẹlu ifẹ fún ìtẹsiwájú.

This piece was written using Nigerian Keyboard and Arial Unicode MS font.

For further information please access
“ FREE Nigerian Keyboard by NITDA- Aug.2004 Installation and Yoruba Usage Tutorial by Bomi Olamijulo-Oki

At http://www.africanportal.net/Publications/ykbdtut.htm

Also access related publications for free at

http://www.hopeafricaepublisher.com

Thank you for your interest.
Olamijulo S.K.
howells  142
10-19-2004 11:36 AM ET (US)
I would like a detailed answer on the following topic: an analysis of the Yoruba language, its structure, and its contributions to societal development.
Sam  141
10-07-2004 02:44 PM ET (US)
Hello, I would like help with the meaning of a word which I think is in the Yoruba language. The word is 'bologun'. Can anybody help?
Thank You
Bomi Olamijulo-Oki  140
08-26-2004 12:41 PM ET (US)
Hello all,

Just to let you know that I have created an online tutorial to support the Free Nigerian keyboard driver by NITDA. This tutorial is based on my experience with the keyboard, as well as on some questions I have answered from other users. It explains how to install and also how to use the keyboard. The tutorial is provided as a free service to whoever may require it, and gives several examples of (Yoruba) usage. If you want to learn how to use the keyboard to type in PURE Yoruba, you may want to take a look at the tutorial, published at:

http://www.africanportal.net/Publications/ykbdtut.htm
(An MSWord version is also available for downloading/printing).

The NITDA keyboard driver download is available at:
http://www.nitda.gov.ng/projects/kbd/index.php
(Yoruba, Igbo and Hausa).


Regards,

Bomi Olamijulo-Oki.
Dr. Samuel Olamijulo  139
08-05-2004 01:46 PM ET (US)

Subject: Nigerian Keyboard Free Download

Respected Nigerian Compatriots and other well wishers,
I suggest that you download a Federal Government Provided, Nigerian Keyboard from:

http://www.nitda.gov.ng/projects/kbd/index.php

I came across the website just fortuitously whilst searching the net for things
of interest to me.

Explore the site from the homepage for background information and test out the keyboard. Endorse it to others who might, or should be interested, all free.

Share with others freely what you discover are the good things about this keyboard when used for any of the Yoruba, Igbo or Hausa languages that you are fluent in. Feel free to make suggestions for improvements, if you have any. This way you can make useful contributions to the development of African people. Plus, just imagine how much richer humanity will be if many of the sayings, wisdom and skills of African elders can be effectively recorded and better shared with the rest of mankind once significantly large populations of Africans become computer literate and Internet enabled in their own languages. This can happen through the use of free Unicode-compatible keyboards like this one, which can rapidly become easily accessible in African communities worldwide.

In addition to posting anywhere you wish on the net, please send a copy of your response to me at the e-mail address below for collation soonest possible.

Also access and download for free

 "People Cooperation for African Development"

 at http://www.hopeafricaepublisher.com/

 Thank you.

Olamijulo S.K.

Contact: publisher@hopeafricaepublisher.com
Dr. Samuel Olamijulo  138
08-05-2004 01:32 PM ET (US)


Subject: Yoruba World Population – July 2004 Estimate by Olamijulo S.K.

Dear All,

Yoruba World Population – July 2004 Estimate:
   TOTAL ---------Above 100 MILLION ( One Hundred Million)

 PERSONS INCLUDED:

 1) All Persons who have one or more Yoruba parents, grandparents,great-grandparents, ancestors.

2) All Yorubas by marriage.

3) All who have become Yorubas, all over the world, through adoption or naturalization.

 Please Access Details Published by Hope Africa E-Publisher

at http://www.hopeafricaepublisher.com/ywp0704.html

Thank you.

Olamijulo S.K.
BisharatNetPerson was signed in when posted  137
08-02-2004 12:16 PM ET (US)
Edited by author 08-02-2004 12:17 PM
Re /m136 (learning Yoruba) and /m135 (info on the meaning of a name), you may find it helpful to check the index at /m100 (message 100) on this board - there are links to messages which give URLs for online learning sites (like LearnYoruba.com) and an online dictionary.

BTW, the recent dialogue between Andrew and Bolaji has been quite interesting ...

Don Osborn
Bisharat.net
Jana  136
07-31-2004 01:11 PM ET (US)
Hi,

I am from the Czech Republic. I would like to learn the Yoruba language, but we have neither courses nor books here.
Could you advise me whether there is a website where I can learn this language?

Thanks
Jeanette  135
07-25-2004 09:04 AM ET (US)
Hello

I have just found out that I am from the Yoruba ethnic group. I want to know more about my heritage, as I was adopted. I also want to know more about my last name, which was "Adeyinka", and I would like help finding my long-lost birth father. If anyone can help me, please e-mail me at felicita453@hotmail.com. Thanks
Bolaji Aluko  134
07-20-2004 08:37 PM ET (US)


Andrew:

I now have to reflect on ALL the "gems" that you have written below, and will get back to you!

Thanks again.


Bolaji Aluko

PS: Just one thought: If "all caps" is the only major problem with Arial Unicode MS, my preference would be to stick with it.
Andrew  133
07-20-2004 07:23 PM ET (US)
re /m130

Maybe it helps if I illustrate with Vietnamese. Take a word like "chào", where there is an a-grave.

Most Vietnamese typing/input software enters Vietnamese characters using NFC (precomposed characters), so that "à" would be a single character, a-grave.

The Vietnamese keyboard in Windows 2000 and Windows XP, on the other hand, doesn't input Vietnamese in NFC or NFD but uses its own order for grouping characters ... so if a web page is created using this keyboard, the "à" would actually be a + combining grave.

In practice, this means that when I'm searching with search engines like Google I have to search twice. Google does not normalize data. Therefore I have to search once using Microsoft's Vietnamese keyboard and search again using another third-party Vietnamese keyboard.

The search results for each search will be completely different.

This is the danger that Yoruba faces in the move to Unicode. The keyboard layout isn't too important, but the Unicode characters output will be.

In your example:

odó (mortar)
odò (river)
ọ̀dọ́ (young)

If they were input using NFC, then in order to find them the keyboard would need to use NFC. If it used NFD or something else, you wouldn't find them.

Alternatively, if the data contained the dot-below and your keyboard inputs using the vertical-line-below, then regardless of whether you used an NFC or NFD or other keyboard you will not find these terms, because they use different characters than you are searching with.

The only way around this is to be consistent in usage between input of data and search,

or to build a custom Yoruba search tool which will search all possible combinations at the same time, regardless of what is actually typed. Hope this makes sense. I'll try to illustrate the point in my notes.
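
The search problem can be sketched in Python as follows. This is only an illustration under stated assumptions: the "page" is a made-up string holding the three example words in NFC, and `normalized_contains` is a hypothetical helper, not part of any search engine.

```python
import unicodedata

# A made-up page holding the three example words, stored in NFC.
page = "od\u00f3 (mortar), od\u00f2 (river), \u1ecd\u0300d\u1ecd\u0301 (young)"

# The same word "river" as typed by a keyboard that outputs NFD.
query_nfd = "odo\u0300"


def normalized_contains(haystack: str, needle: str, form: str = "NFC") -> bool:
    """Hypothetical helper: bring both sides to one form before searching."""
    return unicodedata.normalize(form, needle) in unicodedata.normalize(form, haystack)


print(query_nfd in page)                     # False: code points differ
print(normalized_contains(page, query_nfd))  # True: same form on both sides

# "young" typed with the vertical line below instead of the dot below.
young_line = "o\u0329\u0300do\u0329\u0301"
print(normalized_contains(page, young_line))  # False: normalization cannot help
```

Normalizing both sides fixes the NFC/NFD mismatch, but the last line shows that it does not bridge the dot-below and vertical-line-below choice. That difference needs either consistent usage or a custom search tool of the kind described above.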

Andrew
Andrew  132
07-20-2004 07:06 PM ET (US)
re /m129

There is no specific relationship between keystrokes entered and the number of characters input into the data.

I could type a single key and get a-acute
I could type a single key and get a + combining acute
I could type two keys and get a-acute
I could type two keys and get a + combining acute.

The design of the keyboard should be independent of the characters produced. You design the preferred keyboard layout and then create the rules that output the characters you want from the keystrokes you have identified.

It's a bit more complicated than that (with different input software placing different requirements and constraints on how you design the layout), but essentially that's what happens.

I can take the same keyboard layout and write a keyboard file that can produce NFC or NFD or non-normalized data.
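
As a rough Python sketch of that last point: one layout table can be turned into either an NFC or an NFD keyboard. The key sequences below (";e", "/a" and so on) are invented for illustration and do not describe any real keyboard driver.

```python
import unicodedata

# Hypothetical layout: key sequence -> intended letter (written decomposed).
LAYOUT = {
    ";e": "e\u0323",   # e with dot below
    ";o": "o\u0323",   # o with dot below
    ";s": "s\u0323",   # s with dot below
    "/a": "a\u0301",   # a with acute
    "\\a": "a\u0300",  # a with grave
}


def make_keyboard(form: str) -> dict:
    """Build the output rules for one normalization form from the same layout."""
    return {keys: unicodedata.normalize(form, letter) for keys, letter in LAYOUT.items()}


nfc_keyboard = make_keyboard("NFC")
nfd_keyboard = make_keyboard("NFD")

# Same two keystrokes, different number of characters in the data.
print(len(nfc_keyboard["/a"]), len(nfd_keyboard["/a"]))  # 1 2
```

The keystrokes the user learns are identical in both cases; only the rule table that maps them to output differs.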
Andrew  131
07-20-2004 06:58 PM ET (US)
re /m128

Part of the purpose of the display was to accompany the notes I'm writing.

Briefly, NFC = all characters that can be precomposed will be precomposed. Precomposed means using a single Unicode character for each Yoruba letter (i.e. a-acute is precomposed, a + combining acute isn't, but both are canonically equivalent in Unicode).

Not all Yoruba letters are precomposed in Unicode, so some Yoruba letters have to use combining diacritics in NFC.

NFD = decomposed. ie no precomposed characters. All diacritics are combining diacritics (ie separate unicode characters).
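
To make the two forms concrete, this short Python sketch lists the code points that each form produces for one letter that is fully precomposed (a-acute) and one that is not (e with dot below and acute). The `describe` helper is just an illustrative name.

```python
import unicodedata


def describe(text: str, form: str) -> list:
    """List the code points of text after normalizing to the given form."""
    normalized = unicodedata.normalize(form, text)
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in normalized]


samples = (("a-acute", "a\u0301"), ("e-dot-below-acute", "e\u0323\u0301"))

for label, letter in samples:
    for form in ("NFC", "NFD"):
        print(label, form, describe(letter, form))
```

For a-acute, NFC gives one code point and NFD gives two. For e with dot below and acute, NFC still needs two code points (the precomposed e with dot below, then a combining acute) and NFD needs three.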

So in NFC problem characters are Yoruba letters that have two diacritics (acute or grave and diacritic below). In Arial Unicode MS you'll see the problem most clearly when you are looking at the uppercase version of these characters, where the acute or grave character overstrikes the letter.

In NFD, the characters that also exist in Vietnamese (the acute and grave on A, a, E, e, I, i, O, o, U, u) should render correctly, but other characters are likely to be problematic. Arial Unicode MS does a relatively OK job with lowercase letters, but if you had to set text in NFC or NFD in all uppercase you'd run into problems.

You will get different results in Firefox (On Windows XP) and Internet Explorer. Internet Explorer will try to use the specified font. Firefox attempts to correct problems and does some font switching on my computer.

For me the ideal font display is with Doulos SIL ... there is a reason why the best font for you is different from what I see. Doulos SIL is an OpenType and Graphite font which has support built in for correct rendering of combining diacritics. This feature isn't available in Arial Unicode MS (except for characters in the Vietnamese character range).

But to get these combining diacritic features to work, it is necessary to have the most recent version of the Windows Unicode Script Processor (Uniscribe - usp10.dll). The version shipped with Office 2003 supports combining diacritics in the Latin script.

That said, if you don't want to use NFC or NFD, it's possible to get more accurate rendering of Yoruba by swapping the order of the diacritics in the NFC version. That will display better in Arial Unicode MS, but has the handicap of not being normalized. On the web and in a range of applications, data may be normalized on input, so things would be converted to NFD or NFC. Most of the new web services I'm working on will do this so that searching would be possible across a range of resources. Different keyboards may produce different results, i.e. NFC data, NFD data or something not normalized, so on input we convert everything to one form to work with. Have a look at the W3C draft document on normalization; there is a link from the sample page.
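
A minimal Python sketch of normalizing on input, assuming a service that simply passes every incoming string through one normalization form. The function name `normalize_on_input` is hypothetical and does not refer to any particular web service.

```python
import unicodedata

nfc_form = "\u1eb9\u0301"  # e-with-dot-below, then combining acute
nfd_form = "e\u0323\u0301" # e, combining dot below, combining acute
swapped = "\u00e9\u0323"   # e-with-acute, then combining dot below (not normalized)


def normalize_on_input(text: str, form: str = "NFC") -> str:
    """Hypothetical input filter: store everything in one normalization form."""
    return unicodedata.normalize(form, text)


stored = {normalize_on_input(s) for s in (nfc_form, nfd_form, swapped)}
print(len(stored))  # 1: all three inputs end up as the same string
```

The swapped order may look better in a given font, but once the data passes through a filter like this it comes out in the standard order anyway, so the display workaround does not survive normalization.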

Andrew
Bolaji Aluko  130
07-20-2004 11:27 AM ET (US)

Andrew:

Can we set up a "live" example in which we search differently for the following text (for example) from a paragraph:

     odó (mortar)
     odò (river)
     òdó (young) - each "o" here should have a dot under,
                   but I cannot represent that here, but
                   could do so under Ariya.

That search would be aided by "pre-composition", would it not?


Bolaji
Bolaji Aluko  129
07-20-2004 11:08 AM ET (US)
Edited by author 07-20-2004 11:09 AM
Andrew:

For example, knowing that for Yoruba vowels:


a e i o u - each with accent aigu (/) or accent grave (\)
A E I O U - each with accent aigu (/) or accent grave (\)

are already pre-composed on a standard keyboard,

then we need only the following 14 letters:


e o s - each with dot under; e o with dot under and
                               accent grave (\) and accent
                               aigu (/) over

E O S - each with dot under E O with dot under and
                               accent grave and accent
                               aigu over


Question: which of them have fonts that are ALL pre-composed WITHOUT composition of two letters? That is, should we not aim for NFC rather than NFD?

Is a two-key-stroke combination considered as "pre-composed?" If that is the case, then by combining the [FN] key ALONE with unused letters (for that key) on the English or US keyboard, can we not PRE-COMPOSE all the above additional 14 Yoruba letters?

Or is this what has been done already?
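
The question of which of the 14 letters exist as single precomposed characters can be checked mechanically. This is a small Python sketch that builds the 14 letters listed above from base letters plus combining marks, normalizes them to NFC, and counts how many end up as one code point.

```python
import unicodedata

DOT_BELOW, GRAVE, ACUTE = "\u0323", "\u0300", "\u0301"

letters = []
for base in "eosEOS":
    letters.append(base + DOT_BELOW)              # dot below only
    if base.lower() != "s":                       # s carries no tone mark
        letters.append(base + DOT_BELOW + GRAVE)  # dot below + grave
        letters.append(base + DOT_BELOW + ACUTE)  # dot below + acute

nfc = [unicodedata.normalize("NFC", letter) for letter in letters]
single = [letter for letter in nfc if len(letter) == 1]

print(len(letters), len(single), len(nfc) - len(single))  # 14 6 8
```

Only the six letters with a dot below and no tone mark come out as a single code point. The eight letters that combine a dot below with a tone mark need a combining character even in NFC, so no normalization form makes all 14 precomposed, however they are typed.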
Bolaji Aluko  128
07-20-2004 10:36 AM ET (US)
Edited by author 07-20-2004 10:39 AM
Andrew:

I was able to view ALL the font renderings in NFD, and only the Arial Unicode MS in NFC in

          Samples of Yoruba font renderings

But in BOTH NFC and NFD, only the Arial Unicode MS has the dot, bar or dot FIRMLY under the center of the bottom of the "S" letter: otherwise, they are displaced to the left of the bottom of the letter.

It appears to me that the Arial Unicode MS is IN GENERAL favored in Yoruba letter rendering - or is that a hasty generalization from these observations?


Bolaji
Andrew  127
07-20-2004 12:57 AM ET (US)
I've uploaded a new version of the sample and have included a link to a version that uses NFD, i.e. only uses combining characters rather than using precomposed characters where they exist.
Bolaji Aluko  126
07-20-2004 12:39 AM ET (US)

Andrew:

Thanks Andrew.

I hope that forum members don't think this is a useless private discussion between two people. I am of the firm belief that font and font rendering issues must be properly tackled first, followed by discussions on TERMINOLOGIES (as in "dictionary of technical terms") in ICT in the target language, before serious translation efforts can really begin.
Andrew  125
07-20-2004 12:28 AM ET (US)
Also worth noting that there are two different versions of Arial Unicode MS in use. Different Microsoft products shipped with different versions, Version 0.86 and version 1.0. You might get better mileage out of the newer version.
Andrew  124
07-20-2004 12:25 AM ET (US)
OK, no need to convert the whole UDHR.

In the sample page I've included Arial Unicode MS, Doulos SIL and Code2000, all of which support Yoruba. From memory, the larger the font size, the less satisfactory Arial Unicode MS is, i.e. the more apparent the font's problems with Yoruba become. Umm, maybe I should make a version of the sample page that allows the user to change the font size of the sample text.

The other fonts, the standard Windows fonts, do not support all the required characters.

I'm writing notes at the moment, hopefully finished later today or tomorrow, which will explain the differences between Arial Unicode MS, Doulos SIL and Code2000, between the different Unicode Normalization Forms, and between the Unicode codepoints used for Yoruba, and how all this impacts the end user, esp. since ideal display (esp. at large point sizes) will require updated Windows components.
Bolaji Aluko  123
07-20-2004 12:01 AM ET (US)
Edited by author 07-20-2004 12:04 AM
Thanks a bunch again, Andrew! You sure know your stuff. Some of the complex font and type and etc. discussions that you referred to do sound like Greek to me at times! :-)

But is there a reason why it is only the Arial Unicode MS choice that enables ALL of the glyphs (or is it fonts?) to be displayed right? Does that have to do with the fact that it is the only UNICODE font here - or am I wrong there?

I am also not sure why you need to translate the whole UDHR. The fragment that we are experimenting on sure makes the point. Of course, translating the whole UDHR is fine, but I believe that we are getting the point here...

By the way, if you send me your email, I might be able to convince the Ariya font developer to send you the software FOC (free of charge) for evaluation purposes. My email is alukome @ aol.com.
Andrew  122
07-19-2004 11:28 PM ET (US)
I've uploaded a sample HTML document at http://www.openroad.net.au/languages/african/yoruba/sample.html

It shows the same text in Unicode using each of the three orthographic conventions, and it allows you to select a font and display the page using different fonts. Please let me know if you want other Unicode fonts added to the list.

When I have time over the next few days I'll convert the full text of the UDHR into each of the orthographic conventions. I may need help correcting the text since there seem to be a number of errors in the document.

I'll also throw together a page describing issues relating to Yoruba and Unicode.

Andrew
Bolaji Aluko  121
07-19-2004 10:55 PM ET (US)
Edited by author 07-19-2004 10:57 PM
The "tí kò seé mú kúrò" means "that cannot be denied" or "that cannot be removed." The "tí" means "that", no doubt.
Andrew  120
07-19-2004 10:25 PM ET (US)
Do you mean tí or tí̟?
Bolaji Aluko  119
07-19-2004 10:19 PM ET (US)
Edited by author 07-19-2004 10:19 PM

Yes, it should simply be . Too many characters.
Andrew  118
07-19-2004 09:35 PM ET (US)
For instance the seventeenth word in the first paragraph is : t̟̟̟̟í i.e. t&#799;&#799;&#799;&#799;í

I assume it should be tí̟ (tí&#799;) instead?
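
A rough Python sketch of one way to clean up this kind of stacking, assuming the damage is always the same combining mark repeated in a row (as in the four copies of &#799; above). It only collapses the repeats. Whether the remaining mark belongs on that letter at all is still an editorial decision. The function name is hypothetical.

```python
import unicodedata


def collapse_repeated_marks(text):
    """Drop a combining mark when it immediately repeats the previous character."""
    out = []
    for ch in text:
        # unicodedata.combining() is non-zero for combining marks such as U+031F.
        if out and ch == out[-1] and unicodedata.combining(ch):
            continue
        out.append(ch)
    return "".join(out)


# The word quoted above: t, four copies of U+031F (decimal 799), then i-acute.
messy = "t" + "\u031f" * 4 + "\u00ed"
cleaned = collapse_repeated_marks(messy)
print([f"U+{ord(c):04X}" for c in cleaned])
```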

Andrew
Andrew  117
07-19-2004 09:29 PM ET (US)
I was intending to use the UDHR for the illustration, although when I went through it I had to clean up the file. It was a huge mess; in parts it had two or three combining diacritics superimposed over each other.

I've also written some quick and dirty mapping tables for CC to convert between the different Unicode representations.
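
For readers without CC, the same idea can be sketched in Python. This is not the CC table itself, only an illustration: the source mark U+031F (decimal 799, the one found in the UDHR file) comes from this thread, while the target U+0323 (COMBINING DOT BELOW) and the function name are assumptions chosen for the example.

```python
import unicodedata

PLUS_BELOW = "\u031f"  # the under-mark found in the UDHR file (&#799;)
DOT_BELOW = "\u0323"   # assumed target convention for this example

# One-entry mapping table; add more entries for other under-mark variants.
UNDER_MARK_MAP = str.maketrans({PLUS_BELOW: DOT_BELOW})


def remap_under_marks(text, table=UNDER_MARK_MAP):
    """Decompose, swap the under-mark code points, then recompose to NFC."""
    decomposed = unicodedata.normalize("NFD", text)
    return unicodedata.normalize("NFC", decomposed.translate(table))


sample = "j\u00e9\u031f"  # j, e-acute, plus sign below
converted = remap_under_marks(sample)
print([f"U+{ord(c):04X}" for c in converted])
```

Decomposing first puts the marks into canonical order, so the swapped mark can combine with its base letter when the text is recomposed.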

More later

Andrew
Bolaji Aluko  116
07-19-2004 08:56 PM ET (US)
Edited by author 07-19-2004 09:20 PM
Andrew:

Yes, we should probably wait for your illustrations, using certain files.

Maybe you can also use the first paragraph of the "Universal Declaration of Human Rights" in Yoruba to illustrate wherever possible?

http://www.unhchr.ch/udhr/lang/yor.htm


"Bí ó ti jé̟ pé s̟ís̟e àkíyèsí iyì tó jé̟ àbímó̟ fún è̟dá àti ìdó̟gba è̟tó̟ t̟̟̟̟í kò s̟eé mú kúrò tí è̟dá kò̟ò̟kan ní, ni òkúta ìpìlè̟ fún òmìnira, ìdájó̟ òdodo àti àlàáfíà lágbàáyé, ...
"

[Don't even know whether it will render right here.]

That has almost become like "Hello, world!" in Yoruba code/font demonstrations. :-)


Thanks.


Bolaji Aluko
Andrew  115
07-19-2004 07:09 PM ET (US)
re /m114

Hi Bolaji,

1) The only font that I know will work is Doulos SIL; Code2000 might work as well. Microsoft have indicated that they will ship new fonts that would be suitable in their next operating system (Longhorn).

The problem with Yoruba in Unicode is that not only do you require appropriate fonts, you also need an appropriate font rendering system, which on Windows means you need MS Office 2003. Windows XP Service Pack 2 should also have the necessary support.

At least that's what I've found works here.


I'll elaborate on the OpenType stuff very soon.

2) I don't use underlining, even in web pages. It's not necessary.

As to your final question, "what glyph representations therefore do these desirable features remove, or which ones are thereby highly recommended?"


I don't recommend any particular codepoint assignment. I just believe that the Yoruba language would benefit from using a consistent set of codepoints. Suppose you wanted to search a range of Yoruba texts (in Unicode) for particular words or phrases. Would you find all the occurrences of the word or phrase if different texts used different codepoints for particular Yoruba characters?

I just think that consistency in the representation of Yoruba characters in Unicode would generally benefit the Yoruba language in an electronic format.
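
A small Python sketch of the search problem, using only the standard library. The sample strings and the helper name find_all are illustrative. It shows a plain search missing a word because the text and the search term use different but equivalent codepoint sequences, and the same search succeeding once both are normalized to one form.

```python
import unicodedata


def find_all(needle, haystack, form="NFD"):
    """Return match positions after normalizing both strings to one form."""
    n = unicodedata.normalize(form, needle)
    h = unicodedata.normalize(form, haystack)
    positions = []
    start = h.find(n)
    while start != -1:
        positions.append(start)
        start = h.find(n, start + 1)
    return positions


# Text stored with precomposed characters where they exist (NFC).
text_nfc = "\u1eb9\u0300d\u00e1 k\u1ecd\u0300\u1ecd\u0300kan"
# The same first word typed with combining characters only (NFD).
term_nfd = "e\u0323\u0300da\u0301"

print("plain search finds it:", term_nfd in text_nfc)
print("normalized search positions:", find_all(term_nfd, text_nfc))
```

Normalization only reconciles canonically equivalent sequences. Texts that use a different under-mark altogether (for example U+031F instead of U+0323) would still need to be converted to one agreed set of codepoints before they match.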

If I have time later today, I'll try to throw together some files to illustrate what I mean.

Andrew
Bolaji Aluko  114
07-19-2004 01:09 PM ET (US)
Edited by author 07-19-2004 01:13 PM
re /m