NVDA Japanese enhancements

  • The Localization of NVDA for Japanese Language Users
  • Takuya Nishimoto, Director of NVDA Japanese Team
  • nishimotz @ gmail.com / Twitter @nishimotz
  • to appear in

Prehistory

  • 1980s: Japanese screen reader. character description.
  • 1996: Japanese screen reader for Windows released, known as '95Reader'.
  • 1997: IBM Home Page Reader. male voice for text, female voice for link.
  • 1998: 'PC-Talker' released. best selling in Japan.
  • 1999: '95Reader' with Japanese braille support.
  • 2001: JAWS for Windows from IBM Japan.
  • 2006: Japanese mobile phone which announces the input method.

History

  • 2006: NVDA released.
  • 2008: Internet Technology Research Committee (ITRC) Universal Access to the Internet (UAI) started discussions of NVDA Japanese support. (Prof. Takayuki Watanabe, Mitsue-Links Co.,Ltd)
  • 2010: Japanese speech engine, input method support. (Takuya Nishimoto, Masataka Shinke)
  • 2011: Japanese braille support. 64bit system support.
  • 2012: NVDA Japanese team restructured.

Internationalization

  • NVDA Translation team: 40 languages.
    • messages
    • documents
    • symbols, character descriptions
    • add-ons: OCR, Vocalizer
    • community web site
  • Dependencies (Japanese support unavailable)
    • eSpeak: multi-lingual open-source speech synthesizer
    • liblouis: multi-lingual open-source braille translator

Japanese team (2012)

  • Team: non-profit developer community
    • 24 persons
    • weekly Skype meeting
  • Japanese Users List
    • 179 persons
    • e-mail discussions
  • Friends
    • text book for support volunteers
  • Taiwan-Japan collaboration
    • bi-monthly Skype meeting

Works

  • Represent Japanese users
    • develop enhancement for Japanese
    • 2012.2.1jp: more than 2000 downloads
    • define requirement for Japanese support
  • Support international users of Japanese language
    • add-on speech engine
  • Participate in international team
    • support East-Asia enhancement work
    • migration from localization to internationalization

Japanese requirements

  • Why and how do Chinese support and Japanese support differ?
  • Domestic conventions?
    • screen readers since 1980s
  • Characteristics of Japanese language
  • Technical limitations
    • keyboard, speech engine

Input methods

  • Correct pronunciation, wrong characters
    • 'ha i ri so u de su': 'Yes, it is an ideal.' / 'It is likely to go inside.'
  • Same technology
    • Input Method Manager (IMM), Text Service Framework (TSF)
  • Difference of writing systems
    • Candidate is not single character
    • State transition is different

State transitions

  • Chinese
    • initial state
    • editing reading string (on the spot)
    • candidate selection (composition window)
  • Japanese
    • initial state
    • editing reading string (on the spot), before translation key pressed
    • composition string (on the spot), after translation key pressed
    • candidate selection (candidate window pop up), translation key pressed twice
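
The Japanese state transitions above can be sketched as a small state machine. This is only an illustration, not NVDA code: the event names "type", "convert" (the translation key) and "commit" are hypothetical, and the "commit" transition back to the initial state is an assumption that the slide does not spell out.

```python
from enum import Enum, auto


class ImeState(Enum):
    INITIAL = auto()
    EDITING_READING = auto()      # reading string, before the translation key
    COMPOSITION = auto()          # composition string, after the translation key
    CANDIDATE_SELECTION = auto()  # candidate window, translation key pressed twice


# (current state, event) -> next state. Event names are hypothetical.
TRANSITIONS = {
    (ImeState.INITIAL, "type"): ImeState.EDITING_READING,
    (ImeState.EDITING_READING, "type"): ImeState.EDITING_READING,
    (ImeState.EDITING_READING, "convert"): ImeState.COMPOSITION,
    (ImeState.COMPOSITION, "convert"): ImeState.CANDIDATE_SELECTION,
    (ImeState.CANDIDATE_SELECTION, "convert"): ImeState.CANDIDATE_SELECTION,
    # Assumption: committing the text returns to the initial state.
    (ImeState.EDITING_READING, "commit"): ImeState.INITIAL,
    (ImeState.COMPOSITION, "commit"): ImeState.INITIAL,
    (ImeState.CANDIDATE_SELECTION, "commit"): ImeState.INITIAL,
}


def next_state(state, event):
    """Return the next state; unknown events leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events):
    """Feed a sequence of events to the machine, starting from INITIAL."""
    state = ImeState.INITIAL
    for event in events:
        state = next_state(state, event)
    return state
```

A screen reader could use the current state to decide what to announce: the reading string while editing, the converted string after the first conversion, and the selected candidate once the candidate window is open.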

Half shape/full shape

  • History: single byte/double byte
  • Unicode: shape does not correspond to byte size
  • Input method
    • disabled: half shape only
    • enabled: half shape and full shape are selectable
  • Discrimination
    • not necessary for understanding the meaning
    • necessary for writing, typing URLs, filling web forms (especially user authentication), software development
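
As a minimal sketch of how half shape and full shape can be told apart in Unicode, the example below uses the East Asian Width property from Python's standard `unicodedata` module. The function name `shape_of` and the three labels are illustrative choices, not part of NVDA.

```python
import unicodedata


def shape_of(ch):
    """Classify one character as 'half', 'full', or 'other' shape."""
    width = unicodedata.east_asian_width(ch)
    if width in ("F", "W"):    # Fullwidth or Wide
        return "full"
    if width in ("H", "Na"):   # Halfwidth or Narrow
        return "half"
    return "other"             # Neutral or Ambiguous


for ch in "AＡｱア":
    # Shape does not follow byte size: both katakana forms take 3 bytes in UTF-8.
    print(ch, shape_of(ch), len(ch.encode("utf-8")))
```

The byte count printed for each character shows why the older single byte/double byte distinction no longer identifies the shape.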

Characters 1

  • Latin (alphabet)
    • half shape, full shape
  • Number
    • half shape Arabic digit, full shape Arabic digit, ideograph
  • Pronunciation syllable (Kana)
    • Hiragana: grammatical words, inflectional endings
    • Katakana: foreign words. half shape and full shape

Characters 2

  • Ideographs
    • Kanji in Japanese, Hanzi in Chinese, Hanja in Korean
    • Chinese traditional > Japanese > Chinese simplified
    • typically many readings for Japanese
  • Symbols, punctuations
    • 600 full shape symbols by Japanese standard.
    • some symbols available for half shape and full shape

Writing system 1

  • Sentence contains both syllables and ideographs
  • Words are not separated by spaces
  • Pronunciation of ideograph depends on the context
  • Input method
    • syllable-to-ideograph conversion using dictionary
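
To illustrate why dictionary-based conversion is ambiguous when words are not separated by spaces, here is a toy sketch that enumerates every way to split a reading string into dictionary entries. The dictionary contents and the function name are hypothetical; a real input method uses a far larger dictionary and ranks the candidates.

```python
# Hypothetical toy dictionary: reading (hiragana) -> possible written forms.
TOY_DICTIONARY = {
    "はい": ["はい"],    # "yes"
    "りそう": ["理想"],  # "ideal"
    "はいり": ["入り"],  # stem of "to go in"
    "そう": ["そう"],    # "seems likely"
    "です": ["です"],    # copula
}


def conversions(reading):
    """Return every conversion of the reading that the dictionary allows."""
    if not reading:
        return [""]
    results = []
    for end in range(1, len(reading) + 1):
        prefix = reading[:end]
        for surface in TOY_DICTIONARY.get(prefix, []):
            for rest in conversions(reading[end:]):
                results.append(surface + rest)
    return results


print(conversions("はいりそうです"))
```

With this toy dictionary the same reading yields two written forms with the same pronunciation, so the user has to check which characters were actually produced.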

Writing system 2

  • Braille
    • popular six-dot system
      • syllable based symbols are defined for Japanese braille.
      • special rules for word breaks.
    • ideographic braille: six-dot system and eight-dot system
  • Speech synthesis
    • pronunciation of single character (spelling reading) is undefined.
  • Morphological analysis: useful in many places
    • based on dictionary, and sometimes statistics (machine learning)

Character descriptions 1

  • Chinese: many examples and explanations
    • announce first or announce all
  • Japanese: one example or explanation
    • written in syllable character
  • Usage
    • input method candidates
    • character review
    • text edit: very frequently

Character descriptions 2

  • Character description
    • spelling reading (with dictionary)
    • ideographic character: necessary for discrimination
      • both for speech and braille display
    • phonetic reading: help listening of speech
      • latin character for English
      • syllable character for Japanese katakana and hiragana
  • Character attributes
    • announced in a way similar to capital letters
      • possible with speech, beep, pitch
    • half-shape/full-shape (Latin, katakana, number, symbol)
    • types of syllable character: katakana/hiragana
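
The character attributes listed above could be collected with a function like the following sketch. It is not NVDA's implementation: the function name, the attribute labels, and the choice to report only the less common shape (full shape for Latin, digits and symbols; half shape for katakana) are assumptions made for illustration.

```python
import unicodedata


def character_attributes(ch):
    """Return a list of attribute labels worth announcing for one character."""
    attrs = []
    if ch.isupper():
        attrs.append("capital")
    if "\u3041" <= ch <= "\u3096":
        attrs.append("hiragana")
    elif "\u30a1" <= ch <= "\u30fa" or "\uff66" <= ch <= "\uff9d":
        # Full shape katakana block, or half shape katakana forms.
        attrs.append("katakana")
    width = unicodedata.east_asian_width(ch)
    if width == "F":      # e.g. full shape Latin letters and digits
        attrs.append("full shape")
    elif width == "H":    # e.g. half shape katakana
        attrs.append("half shape")
    return attrs


for ch in "AＡあアｱ":
    print(ch, character_attributes(ch))
```

Each label could then be mapped to speech, a beep, or a pitch change, in the same way a capital letter is indicated.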

Syllable input

  • Transliteration: Latin-to-syllable (roman)
    • US key array
    • Japanese key array
    • braille input (Latin six-dot system)
  • Direct: (kana)
    • Japanese key array
    • braille input (Japanese six-dot system)
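
Latin-to-syllable transliteration can be sketched as a greedy longest-match lookup. The table below is a deliberately tiny, hypothetical subset; real input methods cover the full syllabary, including double consonants and the syllabic "n", which this sketch ignores.

```python
# Hypothetical, incomplete table: Latin key sequence -> hiragana.
ROMAN_TO_KANA = {
    "a": "あ", "i": "い", "u": "う", "e": "え", "o": "お",
    "ka": "か", "na": "な", "ha": "は", "ri": "り",
    "so": "そ", "de": "で", "su": "す",
}
MAX_KEY_LENGTH = max(len(key) for key in ROMAN_TO_KANA)


def roman_to_kana(text):
    """Convert Latin input to hiragana, trying the longest match first."""
    out = []
    pos = 0
    while pos < len(text):
        for length in range(MAX_KEY_LENGTH, 0, -1):
            chunk = text[pos:pos + length]
            if chunk in ROMAN_TO_KANA:
                out.append(ROMAN_TO_KANA[chunk])
                pos += len(chunk)
                break
        else:
            # No table entry: pass the character through unchanged.
            out.append(text[pos])
            pos += 1
    return "".join(out)


print(roman_to_kana("hairisoudesu"))
```

The resulting reading string is what the input method then converts into ideographs.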

Keyboard 1

  • Half-shape/Full-shape key
    • enable or disable input method
    • US key array: ALT-tilde is equivalent
  • Non-Conversion key
    • input mode change
    • do not convert reading string (Hiragana)
    • convert reading string from Hiragana to Katakana
  • Conversion key
    • space key is equivalent in most cases
  • Katakana/Hiragana/Roman key
    • with ALT modifier, syllable input mode change

Keyboard 2

  • Caps Lock key
    • in Japanese array, cannot be used as NVDA modifier key
  • Variations
    • Non-Conversion equivalents: Ctrl-U,I,O,P and F6-F9
  • Preferences
    • change mode before typing syllables
    • (non) convert after typing syllables

Applications

  • Input method editors (third parties)
    • ATOK from Just System, Google Japanese Input, Baidu IME
  • PC-Talker companions
    • NetReader may be associated with web pages.
  • OpwBE
  • Hidemaru editor
    • ControllerClient enhancement: speakSpelling

Devices

  • KGS braille display
    • serial port support requested

Braille translator design

  • (1) Pre-process
    • separation: Japanese, English, computer
  • (2) Morphological analysis (Japanese)
    • Mecab already used for JTalk speech engine
  • (3) Morphological unit processing (Japanese)
    • word break, long vowel marks
    • Latin, number, Japanese symbol, space
    • number translation
  • (4) Word break determination
  • (5) Post-process
    • merge Japanese and English/computer
    • foreign mark, cap mark, number mark, braille symbols
    • make dot patterns

Future

  • Separate interface language and content language
    • speech is already multilingual
    • multilingual input methods support
  • Costly development in Japanese community
    • is our language really special?
  • Character/symbol description
    • efficiency: for engineer
    • correctness: for school
nvdajp.1347352332.txt.gz · Last modified: 2012/09/11 17:32 by nishimotz