Debian Spelling Dictionaries and Tools Policy

Rafael Laboissière

The Debian Project

David Coe

The Debian Project

Agustín Martín Domingo

The Debian Project

René Engelhard

The Debian Project
Release 1.23.13 (2014-9-24)
Status: draft

This text is distributed according to the GNU General Public License.


Table of Contents
Introduction
Background
The dictionaries-common and dictionaries-common-dev Packages
General Requirements on the Packages (for maintainers)
Ispell Dictionary and wordlist selection Support via Debconf
Emacs, jed and mutt Support
Support for Other Packages
Registering aspell and myspell/hunspell dictionaries for use from emacs and squirrelmail.
A. Dictionaries common dependencies checklist
B. Dictionaries common internals
C. Enchant ispell mapping

Introduction

This document is intended for Debian maintainers whose packages relate in some manner to spell-checking programs, ispell/aspell/myspell or hunspell dictionaries and/or wordlists (application-independent dictionaries). It concerns therefore the ispell package itself, the language-specific ispell/aspell/myspell or hunspell dictionaries, the language-specific wordlists, some editors like (X)Emacs and jed, some of the mail and news user agents (MUAs, NUAs), and other tools that give the user a choice of dictionaries for spell checking.

The main goal of this Policy is to establish the basic requirements for the said packages in a Debian system, allowing a high degree of integration among them. This should yield a coherent behavior of all ispell/aspell/myspell or hunspell dictionary and wordlist-related packages, both at installation and usage time.

This document no longer affects aspell packages except for their use under emacs (see the Section called Registering aspell and myspell/hunspell dictionaries for use from emacs and squirrelmail.). For information on aspell dictionary policy, please look at the aspell package documentation and at the aspell manuals.

Warning

The emacs-snapshot package should not be affected by this document. It will contain the bleeding edge code from emacs cvs and we have decided to keep it as standalone as possible to minimize interference from external code.


Advantages For Debian Users

  • Correct selection of available dictionaries by applications.

  • Consistent, simpler management of multiple dictionaries, e.g. for multiple languages and/or multiple specialties.

  • Single question at ispell dictionary or wordlist installation time, via debconf.

  • Consistent ways for administrators and users to select from among the available spelling dictionaries and wordlists for system-wide default, user default and individual application session default.


Advantages For Debian Package Maintainers

  • Easier package configuration.

  • Fewer bug reports.

  • Better integration of wordlists and spelling dictionaries.


Quick checklist for the impatient maintainer

  • If you maintain an ispell dictionary or wordlist package you must install the package dictionaries-common. This package will conflict with all old-style ispell dictionary or wordlist packages. Since, from this policy on, ispell hash files should use 128 characters per string, it will also conflict with older ispell packages. If you use debhelper and want to use the debhelper-like scripts you must also install the package dictionaries-common-dev.

    There is no need to change the name of your package for ispell dictionary or wordlist packages. You however have to take care with one thing; since all such packages conforming to this policy will replace the old packages and pre-depend on the package dictionaries-common, you need to notify the dictionaries-common maintainer which one will be the last version of your package(s) not using the new policy. That versioned conflict will be added to the dictionaries-common package and after that your new policy compliant package can be used.

    For myspell dictionary packages, however, you have to rename your existing package if it followed the old "OpenOffice.org Spellcheck Packages Packaging Guide". You have to rename openoffice.org-spellcheck-<langcode> to myspell-<langcode>.

  • Set the correct dependency relationships as in the Section called Relationships.

  • Verify the architecture and priority level in debian/control (see the Section called Choice of Architecture and Priority Level).

  • Be sure that you are installing the correct files and symlinks in /usr/lib/ispell/, /usr/share/hunspell/ or /usr/share/dict/. (see the Section called Installation Directories and Symlinks).

  • Update the maintainer scripts.

  • If you are packaging an ispell dictionary, you no longer need to add an emacsen startup file. It will be automatically generated from the info file once all the dictionaries are installed.

  • Rebuild and test that everything is working. When adding a new ispell dictionary or wordlist package, debconf will query for the default selection. When removing or purging a package containing the default selection, that query will be repeated unless only one package of its class is left. Also check that the emacs menu(s) corresponding to your package display properly. If not, look at the info file and at your package entries in the autogenerated file at /var/cache/dictionaries-common/emacsen-ispell-dicts.el.

  • Upload.


Background

Wordlists and ispell dictionaries are interrelated but separate packages. This system provides a common background for them and is intended to handle a number of different possibilities. The simplest one is a package providing just one dictionary. There are also some packages that provide more than one dictionary. For instance, ifrench-gut provides both french-gut and french-gut-tex8b, and the Norwegian dictionary provides both nynorsk and bokmål. That will be handled properly by the system.

The miscfiles package also provides an English wordlist (web2, all the words from the 1934 Webster's Second International Dictionary), as well as many other different things. This can also be handled properly by the system.


The dictionaries-common and dictionaries-common-dev Packages

A package called dictionaries-common is created. It allows (ispell/aspell/myspell/hunspell)-dictionary and wordlist packages to be coherently integrated by providing necessary infrastructure, including configuration scripts, commands for selecting default dictionaries, and initialization routines for the different emacsen flavors and jed. It also provides support for registering ispell/aspell/myspell/hunspell dictionaries for use under emacs and squirrelmail. This is the basic package for the system to work.

A package dictionaries-common-dev, to which this Policy belongs, is also provided. This is a package for maintainers of spellchecking dictionary or wordlist packages. It contains the Policy document itself, as well as debhelper-like scripts to simplify the debianization of spellchecking dictionary or wordlist packages for maintainers using debhelper. It also (via this policy document) may provide suggestions or patches for other related packages.


System wide configuration

Besides the debconf configuration at installation time, a script /usr/sbin/select-default-(ispell|wordlist) will be available to the system administrator. It can call the debconf question at any time and is responsible for setting the appropriate links /usr/lib/(ispell|words)/default.*. These scripts are included in the dictionaries-common package.


Ispell wrapper

An ispell wrapper command will be made available by dictionaries-common. This command will accept all the ispell options plus -L <language>, where language must correspond to one of the languages installed in the system (Perl regular expressions will probably be available here, such that calling ispell-wrapper -L ".*brasil.*" will select "Português Brasileiro").

An interactive selection script (select-default-iwrap) will also be available. This is an interactive selection script for selecting the user-specific default ispell dictionary for ispell-wrapper. The result will be placed in ~/.default-ispell.

The system wide default value for ispell-wrapper will be the globally selected one at installation time or through select-default-ispell.

These are included in the dictionaries-common package.
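As a rough illustration of the regular-expression selection described above, the following shell sketch picks the first language name that matches a pattern. It is not the ispell-wrapper implementation: the function name pick_language and the sample list are assumptions made for this example, and grep -E stands in for Perl regular expressions.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# $1: extended regular expression; stdin: one language name per line.
# Prints the first name that matches, ignoring case.
pick_language() {
    grep -i -E -- "$1" | head -n 1
}

# Sample list; the names are made up for this illustration.
languages='castellano (Spanish)
portugues brasileiro (Brazilian Portuguese)
english (American English)'

printf '%s\n' "$languages" | pick_language '.*brasil.*'
```

A real selection would take the names from the Language fields of the installed info files rather than from a hard-coded list.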


Add-on support

Emacsen, jed and mutt add-on support will be fully auto-generated by the update-default-(ispell|wordlist) or update-dictcommon-(aspell|hunspell) script.

Do not add emacs or jed startup files to your package. That will surely interfere with the autogenerated system and be the major source for problems.


General Requirements on the Packages (for maintainers)

Naming of Language-specific ispell dictionary and wordlist Packages

Language-specific ispell dictionary packages and wordlists must be named the classical way, like "ifrench", "wfrench", "iswedish", etc. Use of non-English language names is discouraged; for example "ingerman" should not be named "indeutsch". (This is based on existing practice, and is for consistency and the convenience of Debian administrators in all languages.)

Dictionary sources may provide multiple dictionaries. Each of the binary packages can contain more than one dictionary. In this case, the maintainer must provide info entries for each dictionary in the package info file (see the Section called The info file).

Note: It is not possible for a system to have a mixture of new-style and old-style ispell dictionary or wordlist packages concurrently. The package dictionaries-common conflicts with all the old i<language> and w<language> packages prior to the first version using the new policy, because there is now a new way to manage their alternative symlinks.

For that reason you have to check the entry corresponding to your ispell dictionary or wordlist package in the versioned conflicts line of the dictionaries-common package and tell the maintainer of that package which was the last version of your package using the old system, so that the right versioned conflict can be included in the dictionaries-common package.


Naming of Language-specific myspell or hunspell dictionary Packages

myspell dictionary packages must be called myspell-<isocode> and hunspell dictionary packages must be called hunspell-<isocode> (<isocode> being the two-letter ISO code of the language). Use the myspell prefix for myspell dictionaries and the hunspell prefix for hunspell-only dictionaries, that is, dictionaries that use hunspell features.

If there are more dictionaries for a language (e.g. de_DE, de_CH, ..) then the country code can be added to the package name (e.g. myspell-<langcode>-<countrycode>).
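The naming rule above can be sketched as a small shell function. The function name dict_package_name is hypothetical, and the lowercasing of the country code is an assumption based on Debian package names being lowercase.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# $1: prefix ("myspell" or "hunspell"); $2: locale such as "de" or "de_CH".
# Prints the package name, lowercased and with "_" replaced by "-".
dict_package_name() {
    printf '%s-%s\n' "$1" "$(printf '%s' "$2" | tr '[:upper:]_' '[:lower:]-')"
}

dict_package_name myspell de       # prints myspell-de
dict_package_name myspell de_CH    # prints myspell-de-ch
```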


Relationships

Note: Please check Appendix A for information about versions where some features were introduced. This is needed to set correct versioned dependencies and build-dependencies.

The package relationships declared in the debian/control files should be as follows for ispell dictionary or wordlist packages:

  • Because wamerican will provide a /usr/share/dict/words->/usr/share/dict/american-english symlink, it needs to conflict with dictionaries-common (<< 0.98) where /usr/share/dict/words diversion was first introduced. No dependency on dictionaries-common is needed as long as wamerican maintainer scripts do not fail without it (see the Section called Maintainer Scripts for ispell dictionaries and wordlists).

  • All ispell dictionary or wordlist packages but wamerican have to depend on dictionaries-common in their debian/control files.

    Every ispell dictionary or wordlist package using the helpers from dictionaries-common-dev has also to declare

    
Build-Depends: debhelper, dictionaries-common-dev

    if your package contains architecture dependent ispell or aspell dicts (See ispell-autobuildhash or aspell-autobuildhash manual pages for info about how to make your ispell or aspell dict package 'arch: all' by building hashes from package postinst), or

    
Build-Depends: debhelper
    Build-Depends-Indep: dictionaries-common-dev

    if all ispell, aspell, myspell/hunspell dicts and wordlist packages using dictionaries-common-dev are architecture independent. This will make autobuilders, lintian and debuild happy.

    They must also provide the appropriate virtual package ("ispell-dictionary" or "wordlist").

  • Ispell dictionaries must depend on ispell. If your package uses ispell during the building process you must also set the appropriate build dependency.

  • Each ispell dictionary package should suggest the corresponding wordlist package. (This is because ispell can use wordlists in addition to ispell dictionaries, but doesn't actually require them.)

  • The dictionaries-common package suggests ispell. (A stronger relationship was considered and rejected, because users might want some wordlists and not want ispell.)

  • The ispell package depends on ispell-dictionary and recommends wordlist.

  • Packages containing tools that can use ispell (editors, MUA, etc.) may suggest or recommend ispell, but should not depend on ispell.

  • Packages that use ispell and allow users to select or specify (from within the running application) which dictionary to use, should depend on dictionaries-common and should invoke an appropriate dictionaries-common dictionary-selection interface as documented in the Section called Ispell Dictionary and wordlist selection Support via Debconf.
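A maintainer could sanity-check the ispell dictionary relationships listed above with a few greps. The sketch below is only a crude lint under stated assumptions: the function name check_ispell_control is hypothetical, each field is assumed to fit on a single line, and the file is assumed to hold the stanza of one ispell dictionary package. It does not apply to wamerican, which is exempt from the dependency.

```shell
#!/bin/sh
# Hypothetical lint helper, for illustration only.
# $1: file holding one binary-package stanza with single-line fields.
# Returns non-zero if a relationship required for ispell dictionaries is missing.
check_ispell_control() {
    rc=0
    for spec in 'Depends dictionaries-common' 'Depends ispell' \
                'Provides ispell-dictionary'; do
        field=${spec% *}   # text before the space: the field name
        pkg=${spec#* }     # text after the space: the package name
        # The package must appear as a whole word in the field.
        if ! grep -Eq "^${field}:.*[ ,]${pkg}([ ,(]|\$)" "$1"; then
            echo "missing ${field}: ${pkg}" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

Called as check_ispell_control followed by the path of the stanza file, it prints one line per missing relationship on standard error.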

For myspell/hunspell dictionary packages the relationships in debian/control should be as follows:

  • The myspell/hunspell dictionary packages must depend on at least dictionaries-common (>= 0.10) because in that revision the myspell support was added.

  • The myspell-<isocode> packages must provide the virtual packages myspell-dictionary and myspell-dictionary-<isocode>. The hunspell-<isocode> packages must provide the virtual packages hunspell-dictionary and hunspell-dictionary-<isocode>.

  • hunspell and myspell dictionary packages should declare a Suggests on LibreOffice or OpenOffice.org and on the Mozilla flavors in Debian that support the spellchecker. Something like

    
Suggests: hunspell,
     libreoffice-core | openoffice.org-hunspell | openoffice.org-core (>= 2.0.2),
     iceape-browser | iceweasel | icedove

  • myspell dictionary packages must Conflict: against openoffice.org (<= 1.0.3-2)

  • hunspell dictionary packages should conflict against old versions of Mozilla / OpenOffice.org not supporting hunspell

    
mozilla-browser (<< 1.8+1.1.1-2),
     iceape-browser (<< 1.1.1-2),
     firefox (<< 2.0.0.3-2),
     thunderbird (<< 2.0.0.1+0dfsg-0),
     iceweasel (<< 2.0.0.3-2),
     icedove (<< 2.0.0.0-4),
     libxul0d (= 1.8.0.11-3),
     openoffice.org (<= 1.0.3-2),
     openoffice.org-core (<< 2.1~m190-1)
    

  • The myspell/hunspell packages having an "old" version named openoffice.org-spellcheck-* (regardless of whether that was in Debian once or not) must declare the magic Conflicts: / Provides: / Replaces: combination "against" the old package.

  • If there are hunspell and myspell dictionary packages for a given language and the hunspell dictionary package installs files with the same name as the myspell dictionary package, the hunspell dictionary package must conflict against the myspell dictionary package.


The info file

All ispell dictionary or wordlist packages must install a file /var/lib/dictionaries-common/(ispell|wordlist)/<package-name>. Aspell dictionaries can install it at /var/lib/dictionaries-common/aspell/<package-name>. myspell/hunspell dictionaries can install it at /var/lib/dictionaries-common/hunspell/<package-name> (not to be confused with the now obsolete info file that was needed for OpenOffice.org < 3). General format of that file (reminiscent of the RFC 822 format) is, including all possible entries for ispell and wordlist packages:


Language: português brasileiro (Brazilian Portuguese)
Hash-Name: brazilian
Emacsen-Name: brasileiro
Elanguage: portugues brasileiro (Brazilian Portuguese)
Casechars: [a-zA-Z]
Not-Casechars: [^a-zA-Z]
Otherchars: [---']
Many-Otherchars: yes
Additionalchars: 
Ispell-Args:
Extended-Character-Mode:
Coding-System: iso-8859-1

adapted to the corresponding language and ispell dictionary or wordlist package.

Each field in this file must be contained in a single line. A field may also have its right side (i.e., the part after the character ":") empty, which is equivalent to omitting the field (the default value will be used in this case, see below). 8-bit chars in Casechars, Not-Casechars and Additionalchars must be represented in the same encoding declared for the dict in the info file, either as the char itself or as its octal \xxx representation. The latter is highly preferable if another string like Language contains utf-8 chars.

Several records like the one above may be present in each file and must be separated by a blank line. They can correspond to different dictionaries or to different ways of accessing the same dictionary from ispell wrapper or emacs (this will have no effect for use of plain ispell). They must have unique values for the Language field.

The records supplied by each dictionary package will be used by the core of dictionaries-common to provide site-wide configuration, including Debconf list of choices, ispell/wordlist default symlinks, automatic generation of add-on support (emacsen, jed and mutt). This file is therefore essential for the correct integration of the dictionary packages into the dictionaries-common scheme.

Note: It is very important that the Language name be unambiguous and informative to the system administrator, because at debconf dictionary-selection time only the list of package names, and not their descriptions, will be visible.

Here is an explanation of the fields shown above:

  • Language: (this field is mandatory)

    Comprehensive description of the language. It is advised to include both the description in the original language (in UTF-8 coding system) and a description in English between parentheses. This will help system administrators around the world who do not know how the dictionaries are spelled in their original language. Once set, do not change it for entries triggering a debconf question unless there is a really good reason for that, because any change may trigger a debconf prompt or, if the previous choice was the old value, even change the current settings without a prompt when debconf runs at low priority.

    If you think you really need to change text in Language field for a debconf used entry, please look first at Elanguage field, described below.

    You need to use UTF-8 for this string, because otherwise it will not be displayed (will display an empty value) in UTF-8 systems. The drawback is that 8bit characters will display strangely in a non UTF-8 terminal, but it will still be readable. You should consider using only 7 bit chars if possible when you create this field for the first time.

    This field will be used in the Debconf list of choices as well as a key for determining the language default for the ispell-wrapper utility. Hence, it has to be unique among all the installed dictionaries in the system. English description should preferably be unique too.

  • Hash-Name: (this field is mandatory)

    Base name of the files that will appear as symlinks in either /usr/lib/ispell or /usr/share/dict. Has to be unique among the installed dictionaries in the system.

  • Emacsen-Name: (optional, defaults to Hash-Name value)

    Entry name of the dictionary that appears in the list of choices of the emacsen ispell package. Maintainers should try to respect the tradition of that package, by keeping the name that they used to have in the past.

  • Elanguage: (optional, defaults to Language value)

    Alternative language name to be displayed by debconf. Not needed at all unless you use the debhelper like scripts and are changing the language name to be displayed, avoiding extra debconf prompts. Its format is the same as Language. See the Section called The templates file for more info about this.

  • Casechars: (optional, defaults to [a-zA-Z])

    Emacs-Lisp regexp of valid characters that comprise a word. It is typically enclosed between square brackets. Do not use ranges here for non 7bit chars.

  • Not-Casechars: (optional, defaults to [^a-zA-Z])

    Opposite regexp of Casechars.

  • Otherchars: (optional, defaults to ['])

    Regexp of characters in the Not-Casechars set but which can be used to construct words in some special way. (See the ispell.el documentation for details.)

  • Many-Otherchars: (optional, default value no)

    Boolean variable (taking either the value yes or no). If it is yes, multiple Otherchars characters are allowed in a word. Otherwise only a single Otherchars character is allowed to be part of any single word.

  • Additionalchars: (optional, defaults to an empty string)

    Characters other than ASCII that may be part of a word. For emacsen this is somewhat redundant with the Casechars field, but is necessary for the proper working of jed and the -w option to ispell.

  • Ispell-Args: (optional, defaults to -d <Hash-Name>)

    List of additional arguments passed to ispell.

  • Extended-Character-Mode: (optional, defaults to the empty string)

    Set when dictionaries are used which have been configured in an ispell affix file. (For example, umlauts can be encoded as \"a, a", "a, ...)

  • Coding-System: (optional, defaults to the empty string)

    Used for languages with multibyte characters. Any coding system will be accepted if the {x}emacs version being run accepts it. Maintainers, please check that the provided coding system works with the different emacsen flavors. If the coding system is not one of iso-8859-1, iso-8859-2, iso-8859-3 or koi8-r make your package depend on at least dictionaries-common (>=0.24), where the other encodings were allowed.

    Warning

    At the time of this writing there are some encoding unification problems in at least XEmacs between the iso-8859-1 and iso-8859-15 charsets, with the same character being represented differently in the emacs internal mule encoding. For this reason please do not blindly replace the old iso-8859-1 entry by iso-8859-15. If you require the iso-8859-15 encoding, it is better to add a new emacs-only iso-8859-15 entry (see debconf-display: no) as a temporary workaround. This way the iso-8859-1 entry will work with iso-8859-1 and UTF-8 texts and fail with iso-8859-15, while the new iso-8859-15 entry will work with iso-8859-15, but will fail with iso-8859-1 and UTF-8. The same might also apply to other charsets, so please double-check.

  • {debconf,emacs,jed}-display: (optional, defaults to yes)

    If emacs-display or jed-display are set to no, the corresponding entry will not be displayed by emacs or jed when building the cache files. Needs a versioned dependency on dictionaries-common.

    If debconf-display is set to no, this entry will not be added by installdeb-ispell to the debconf template with the possible values of choice. It will remain available to ispell-wrapper and emacs/jed unless {emacs,jed}-display are set to no. Needs a versioned build dependency on dictionaries-common-dev.

  • squirrelmail: (optional, defaults to Language value)

    If squirrelmail is set to no, this entry will not be added to the squirrelmail spellcheck list. Any other value will override the string derived from the Language value in the spellcheck list.

  • aspell-locales: (aspell only, optional, no default), hunspell-locales: (myspell/hunspell only, optional, no default)

    Comma-separated list of the locales associated with the aspell (or myspell/hunspell) dictionary. It is used to guess the default aspell (or myspell/hunspell) dictionary for emacs ispell.el from the contents of the LANG environment variable. When there is no possible confusion the two-letter ISO language code is enough, but you can add other locales to make the list more complete (e.g. Aspell-Locales: es, es_ES, es_ES@variant). The long form will be selected first if it matches the value of the LANG environment variable. Otherwise that value will be stripped of @... and compared, and then stripped of _... and compared, until a match is found. The same applies to hunspell-locales.

    When there are two variants of a language (e.g., for new and old German) use the prefix 1: for the non preferred variant, e.g. Aspell-Locales: de, de_DE for new German and Aspell-Locales: 1:de, 1:de_DE for old German. Same for hunspell-locales.
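The matching order described for aspell-locales and hunspell-locales can be sketched in shell as follows. This is an illustration under assumptions, not the code used by dictionaries-common: the function name match_locale is hypothetical, the 1: prefix for non-preferred variants is not handled, and a codeset suffix such as .UTF-8 gets no special treatment.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# $1: value of LANG; $2: comma-separated locale list from the info file.
# Tries the full value, then the value without "@...", then without "_...".
# Prints the first form found in the list; returns 1 if none matches.
match_locale() {
    for candidate in "$1" "${1%%@*}" "${1%%_*}"; do
        for entry in $(printf '%s' "$2" | tr ',' ' '); do
            if [ "$entry" = "$candidate" ]; then
                printf '%s\n' "$candidate"
                return 0
            fi
        done
    done
    return 1
}

match_locale 'es_ES@variant' 'es, es_ES'   # prints es_ES
match_locale 'es_MX' 'es, es_ES'           # prints es
```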

The values of the fields Otherchars, Many-Otherchars, and Additionalchars must have the same encoding as the dictionary encoding. Each character can be written in the \xxx format, where xxx is its octal value. For instance, é can be written as \351 in iso-8859-1.
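One way to obtain the octal form is to convert the text to the dictionary encoding and dump the resulting bytes. The sketch below assumes that iconv and od are available and that the input is given in UTF-8. The function name octal_escape is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# $1: text in UTF-8; $2: dictionary encoding, e.g. ISO-8859-1.
# Prints the text with every non-ASCII byte written as \xxx (octal).
octal_escape() {
    for o in $(printf '%s' "$1" | iconv -f UTF-8 -t "$2" | od -An -v -t o1); do
        if [ "$((0$o))" -ge 128 ]; then
            printf '\\%s' "$o"        # 8-bit byte: emit the escape itself
        else
            printf '%b' "\\0$o"       # ASCII byte: emit the character
        fi
    done
    printf '\n'
}

octal_escape 'é' ISO-8859-1    # prints \351
```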

Wordlist packages only need to set the Language and Hash-Name fields. Other fields will silently be ignored.
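Before building, an info file can be checked against the rules above: records are separated by blank lines, Language and Hash-Name are mandatory, and Language values must be unique. The following is a minimal sketch of such a check, not a tool shipped by dictionaries-common. The function name check_info_file is hypothetical, and field names are assumed to be capitalized exactly as in the sample above.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# $1: path to an info file. Returns non-zero if a record lacks a
# mandatory field or if two records share the same Language value.
check_info_file() {
    awk '
        # Validate the record collected so far, then reset the state.
        function finish() {
            if (!seen) return
            if (lang == "") { print "record " n ": missing Language"; bad = 1 }
            if (hash == "") { print "record " n ": missing Hash-Name"; bad = 1 }
            if (lang != "" && used[lang]++) {
                print "record " n ": duplicate Language"; bad = 1
            }
            lang = ""; hash = ""; seen = 0
        }
        /^[ \t]*$/ { finish(); next }      # a blank line ends a record
        !seen { n++; seen = 1 }            # first line of a new record
        { value = $0; sub(/^[^:]*:[ \t]*/, "", value) }
        /^Language:/  { lang = value }
        /^Hash-Name:/ { hash = value }
        END { finish(); exit bad }
    ' "$1"
}
```

An empty right side counts as a missing field here, in line with the rule that an empty value is equivalent to leaving the field out.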

Note for debhelper users

If you use debhelper and the helpers installdeb-{ispell,wordlist} provided by the package dictionaries-common-dev, most of the required work will be done automatically from this info file. In this case you have to name this file with the extension .info-ispell or .info-wordlist, depending on the package class.

The full name rules are similar to those for other debhelper files like dirs or docs.

Note for cdbs users

If you use cdbs for building your package, just use:


include /usr/share/cdbs/1/rules/debhelper.mk
include /usr/share/doc/dictionaries-common/cdbs/dict-common.mk

in your debian/rules. This will take care of the inclusion of all the installdeb-* commands at the appropriate places. Notice that this is only guaranteed to work when used in conjunction with debhelper.

For this to work the package should build-depend on cdbs and debhelper, and also contain a versioned build-dependency on dictionaries-common-dev >= 0.70.

For an example of use of the cdbs support see the myspell.pt source package (version 20060602-2 or later).


Choice of Architecture and Priority Level

Because the hash files generated by the buildhash program are binary files subject to big/little-endian differences, all ispell dictionary packages directly installing the hash file should have "Architecture: any" in their debian/control files. All wordlist packages should have "Architecture: all" (unless other files in the package prevent that).

If your ispell hash file is built at package postinst it should have "Architecture: all" in the debian/control file. See ispell-autobuildhash or aspell-autobuildhash manual pages for info about how to make your ispell or aspell dict package "Architecture: all" by building hashes from package postinst.

The priority level for all ispell dictionary and wordlist packages should be set to "optional", as we believe that any production system should provide spelling support for at least one language.

wamerican has priority "standard", because the idea of "standard" is to define a minimal, standard Unix-like setup. A wordlist is certainly part of that, and since wamerican wordlist contains what has traditionally been in /usr/dict/words, it is made priority standard. (thanks to Charles Briscoe-Smith, previous wenglish maintainer)

myspell/hunspell dictionary packages are "Architecture: all" and have priority "optional".

dictionaries-common will be of optional priority. There will be a special fine-tuning between wamerican and dictionaries-common so wamerican can be installed standalone, but will use all the dictionaries-common capabilities if present.


Installation Directories and Symlinks

* ispell dictionaries and wordlists

Ispell dictionary hashed files (.hash) and affix table files (.aff) must be placed in the directory /usr/lib/ispell/. Wordlist dictionary files must be placed in the directory /usr/share/dict/. wamerican will also set a /usr/share/dict/words->/usr/share/dict/american-english symlink.

When, as in the Polish ispell dictionary, the hash table is built at install time and can later be rebuilt with different options, things need to be done in a different way. In that case, the hashed table must be installed in /var/lib/ispell and a link from /usr/lib/ispell must be installed pointing to the hash file. The aff file will be installed as for any other dictionary.

If desired, symlinks can be created in this same directory with other language names, like italiano.hash -> italian.hash. This may be of some help to local users who do not know the language names in English.

enchant is a spell-checking portable library similar to what pspell was, but with more possible backends, developed by the Abiword people. Currently Debian ispell dictionaries are compatible with enchant and can be made available to it under the names and encodings expected by enchant (See Appendix C). If your ispell dictionary is one of those listed there and the default encoding is the right one you only need to set a symlink, e.g.


/usr/share/enchant/ispell/espanol.hash -> /usr/lib/ispell/espa~nol.hash
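A symlink like the one above can be created in the package staging tree during the build. The sketch below is one hedged way to do it: the function name link_enchant_hash and the staging directory in the usage comment are assumptions, and a package using debhelper may prefer dh_link for the same purpose.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# $1: staging root of the package being built
# $2: file name expected by enchant
# $3: file name of the hash installed in /usr/lib/ispell
link_enchant_hash() {
    mkdir -p "$1/usr/share/enchant/ispell"
    # The link target is absolute, as in the example above.
    ln -sf "/usr/lib/ispell/$3" "$1/usr/share/enchant/ispell/$2"
}

# Possible use from a build script (the staging path is an assumption):
#   link_enchant_hash debian/ispanish espanol.hash 'espa~nol.hash'
```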

* myspell/hunspell dictionary files

myspell/hunspell dictionary files (*.dic and *.aff) must be installed in /usr/share/hunspell.

Note: Temporary symlinks at /usr/share/myspell/dicts location are obsolete. Do not set them!

For better Mozilla integration, xx-XX.{dic,aff}->xx_XX.{dic,aff} symlinks were previously needed. Mozilla* now understands both variants, so this is no longer the case. For that reason, dictionaries must always be installed in the xx_XX.{dic,aff} variant.

When used, the script installdeb-myspell, from the dictionaries-common-dev package, will take care of setting the appropriate symlinks if required, based on the names found in the myspell info file if given (See installdeb-myspell manual page). With the --srcdir option it can, in some systems, also try to install the {.dic,.aff} files; see its documentation. Use of installdeb-myspell needs a versioned build dependency on dictionaries-common-dev (See Appendix A).


Maintainer Scripts for ispell dictionaries and wordlists

Notice that the debhelper scripts installdeb-ispell and installdeb-wordlist, provided by dictionaries-common-dev will handle most of the following automatically after the info file (See the Section called The info file). These are "debhelper-like" commands but are not officially part of the debhelper package. Debhelper may one day contain something with a completely different design and usage, that accomplishes about the same thing.

  • wamerican: For the special case of wamerican, the scripts should be similar to those below, but modified so they do not fail in case dictionaries-common is not installed. Note that this needs to be done manually; installdeb-wordlist will only create the standard maintainer scripts.

  • postinst: should source the script update-default-ispell or update-default-wordlist (provided by the dictionaries-common package) when called with argument "configure". Here is a template for inclusion in the postinst script of ispell dictionary packages,

    . /usr/share/debconf/confmodule
    SCRIPT="update-default-ispell"
    
    if [ "$1" = "configure" ] ; then
        if which $SCRIPT > /dev/null 2>&1; then
            $SCRIPT  --rebuild
        else
            echo "Error: $SCRIPT not present or executable. Missing dependency on dictionaries-common?" >&2
            exit 1
        fi
    fi

    And here is a similar template for the wordlist packages,

    . /usr/share/debconf/confmodule
    SCRIPT="update-default-wordlist"
    
    if [ "$1" = "configure" ] ; then
        if which $SCRIPT > /dev/null 2>&1; then
            $SCRIPT  --rebuild
        else
            echo "Error: $SCRIPT not present or executable. Missing dependency on dictionaries-common?" >&2
            exit 1
        fi
    fi

    installdeb-ispell and installdeb-wordlist will automatically include the above code in the final postinst scripts.

  • postrm: should source the script remove-default-ispell or remove-default-wordlist when invoked with argument "remove" or "abort-install". The goal here is to prompt the administrator for a new selection if the package being removed is the current default. Here is a template for the code to be inserted into the postrm script of ispell dictionary packages (do not forget to edit #PACKAGE#),

    rmscript="remove-default-ispell"
    
    case "$1" in abort-install|remove)
        if which $rmscript > /dev/null 2>&1; then
            $rmscript #PACKAGE#
        else
            echo "Warning: $rmscript not present or executable." >&2
        fi

        # Remove shared question stuff on package removal, not only on purge
        if [ -e /usr/share/debconf/confmodule ]; then
            . /usr/share/debconf/confmodule
            db_purge
        fi
    esac

    The template for the wordlist packages is pretty similar:

    rmscript="remove-default-wordlist"
    
    case "$1" in abort-install|remove)
        if which $rmscript > /dev/null 2>&1; then
            $rmscript #PACKAGE#
        else
            echo "Warning: $rmscript not present or executable." >&2
        fi

        # Remove shared question stuff on package removal, not only on purge
        if [ -e /usr/share/debconf/confmodule ]; then
            . /usr/share/debconf/confmodule
            db_purge
        fi
    esac

In both templates, #PACKAGE# must be replaced by the package name. installdeb-ispell and installdeb-wordlist will automatically include the above code in the final postrm scripts, with the right substitution for #PACKAGE#.

The scripts update-default-ispell and update-default-wordlist are responsible for the manipulation of the appropriate symlinks in /usr/lib/ispell/ and in /usr/share/dict/, respectively, taking into account the selections made via debconf (see the Section called Ispell Dictionary and wordlist selection Support via Debconf).


Ispell Dictionary and wordlist selection Support via Debconf

debconf makes it possible to have the selection of a default dictionary happen only once, just before the dpkg install phase, even if several dictionary packages are being installed at the same time. This is in contrast with the old system, where the user was prompted for each new package being installed or upgraded, or was just advised to manually update the wordlist dictionary symlink in /etc/alternatives.

Individual users may, of course, override the administrator's chosen default ispell dictionary, either by setting the DICTIONARY environment variable or by explicitly giving the dictionary name on the command line (e.g. with the -d option of ispell).
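
The precedence among these settings can be sketched in shell as follows. This is only an illustration, not code from ispell or dictionaries-common: the function name effective_dictionary and the dictionary and file names are hypothetical, and it assumes, as the ispell manual page describes, that the -d option takes priority over DICTIONARY, which in turn takes priority over the system-wide default.

```shell
#!/bin/sh
# Illustrative sketch only: effective_dictionary is a hypothetical helper,
# not part of ispell or dictionaries-common.
#   $1 = dictionary given with "ispell -d" (empty if none)
#   $2 = system-wide default chosen by the administrator
effective_dictionary() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"               # an explicit -d wins
    elif [ -n "${DICTIONARY:-}" ]; then
        printf '%s\n' "$DICTIONARY"      # then the user's environment
    else
        printf '%s\n' "$2"               # finally the system default
    fi
}

# Typical per-user overrides (dictionary and file names are examples):
#   DICTIONARY=british ispell letter.txt
#   ispell -d british letter.txt
```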

Some applications also allow the user to select a dictionary from within that application -- such changes affect only that application and (usually) only the current application session. (See the Section called Support for Other Packages).

In order to accomplish these things, some discipline is required.


The templates file

Every ispell dictionary or wordlist package must have a templates file defining a shared debconf multiple-choice question as well as another template defining the languages the package provides. This is an example for an ispell dictionary:

Template: shared/packages-ispell
Type: text
Description:

Template: #PACKAGE#/languages
Type: text
Default: #LANGUAGES#
Description:

The example for the wordlist is this one:

Template: shared/packages-wordlist
Type: text
Description:

Template: #PACKAGE#/languages
Type: text
Default: #LANGUAGES#
Description:

The token #PACKAGE# must be replaced by *exactly* the name of the package, and #LANGUAGES# must be replaced by a comma-separated list of the languages provided by the package. Unless you really know what you are doing, the entries in the #LANGUAGES# substitution must be *exactly* the same as those that appear in the Language field of the file /var/lib/dictionaries-common/(ispell|wordlist)/<package-name>; in particular, they must not be localized. Any missing entry will not be displayed by debconf. While this may sometimes be what you want, the suggested way to achieve it is the "debconf-display" entry in your info file together with the debhelper-like command installdeb-ispell.
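
As an illustration of the substitution, the ispell skeleton above could be filled in by hand with a small shell function like the one below. This is a sketch under stated assumptions, not what installdeb-ispell actually runs: the function name fill_ispell_templates, the package name ibrazilian and the language entry are hypothetical, and the values are assumed not to contain the characters |, & or backslash, which are special in the sed replacement.

```shell
#!/bin/sh
# Illustrative sketch only: fill_ispell_templates is a hypothetical helper.
#   $1 = exact package name
#   $2 = comma-separated list of languages (not localized)
# Assumes neither value contains '|', '&' or '\', which sed would interpret.
fill_ispell_templates() {
    pkg=$1
    langs=$2
    sed -e "s|#PACKAGE#|$pkg|g" -e "s|#LANGUAGES#|$langs|g" <<'EOF'
Template: shared/packages-ispell
Type: text
Description:

Template: #PACKAGE#/languages
Type: text
Default: #LANGUAGES#
Description:
EOF
}

# Example call with hypothetical values; prints the filled-in templates file.
fill_ispell_templates ibrazilian "portugues brasileiro (Brazilian Portuguese)"
```

When the installdeb-* helpers are used, this step is not needed, since they generate the templates file themselves.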

A #PACKAGE#/elanguages template entry, similar to #PACKAGE#/languages, can be used to override the debconf prompt text of the latter. Its format is the same as for languages. If the package provides more than one language, both languages and elanguages must have entries for the same languages, in exactly the same order. Please try to make the elanguages base text as portable as possible (that is, try hard to make it 7-bit clean, using an appropriate transliteration if possible). Unlike #PACKAGE#/languages, #PACKAGE#/elanguages can be localized. Although this field is localizable, in most cases the poor man's internationalization already provided by the Language field is enough, and translators should have other priorities. When localizing, please change only the non-parenthesized part, for consistency with other entries. A #PACKAGE#/elanguages entry will be created by the installdeb-* scripts from the Elanguage value, if given in the info file, or from the Language value otherwise. This can be disabled when calling the script.

Additional templates may be added, if needed.

Note: If you are using debhelper and the debhelper-like scripts provided by the dictionaries-common system, the above templates file will be automatically generated from the information in the info file. If you do not need additional templates, you do not have to worry about this.

Otherwise, if you need additional templates, do not put them in a file named debian/<package-name>.templates, since it will be overwritten by the installdeb-* scripts. The exact way to add them depends on whether or not you use po-debconf to maintain localized versions of the templates.

  • If you use po-debconf, your master templates file is expected to be named debian/<package-name>.po-master.templates. You do not need to merge the translations yourself, since installdeb-* will do that for you. See the po-debconf manual page for details about how to create the master templates file and the po files. Remember that the templates file is now named debian/<package-name>.po-master.templates to avoid conflicts with the autogenerated one. Remember also to add the appropriate po-debconf dependencies, as described in the po-debconf manual, since dictionaries-common does not depend on it.

  • If you do not use po-debconf, put them in a file named debian/<package-name>.templates.in. installdeb-* will merge the templates and install the merged templates file in the right way. This system can coexist with localized templates such as debian/<package-name>.templates.ru that contain localizations of your extra templates. dh_installdebconf, called internally from the installdeb-* scripts, will merge them with the templates file that is autogenerated at debian/<package-name>.templates. Note that this mechanism is being deprecated by debconf.


The config file

There must also exist a config file, which will be responsible for getting the user selection during the configuration run of dpkg. The debian/config script for an ispell dictionary must contain (after the #!/usr/bin/perl line) exactly the Perl code below:

use Debconf::Client::ConfModule q(:all);

version ('2.0');

my $class  = "ispell";
my $script = "/usr/share/dictionaries-common/dc-debconf-select.pl";

if ( -e $script ){
    require $script;
    dc_debconf_select($class);
}

The config script for wordlist packages other than wamerican must contain (after the #!/usr/bin/perl line) exactly the following Perl code:

use Debconf::Client::ConfModule q(:all);

version ('2.0');

my $class  = "wordlist";
my $script = "/usr/share/dictionaries-common/dc-debconf-select.pl";

if ( -e $script ){
    require $script;
    dc_debconf_select($class);
}

The config file for wamerican is similar to the above, but it also checks for the existence of /usr/share/dictionaries-common/elanguages before loading the common code. This is intended to avoid problems if dictionaries-common is made an optional package in the future, as planned. (See http://lists.alioth.debian.org/pipermail/dict-common-dev/2008-June/000739.html for more details).

If some other debconf actions need to be added and they are in Perl, just add them to the script above. If you prefer to write in shell, wrap the script above in the following shell code:


tmp=$(mktemp)
# The delimiter is quoted so that the shell does not expand $class, $script, ...
cat > "$tmp" <<'EOF'
** (the script above) **
EOF
perl "$tmp"
rm -f "$tmp"

Both the templates and the config files are package-independent, so the maintainer should just copy the files above to the package control area at build time, typically the debian/tmp/DEBIAN directory.

Note: If the maintainer is using debhelper and the package dictionaries-common-dev is installed, the installdeb-ispell and installdeb-wordlist scripts, when called in debian/rules, will create the above files automatically if no other actions are required in the config file. Do not call dh_installdebconf in your rules file, since the scripts above already call it internally.

If other actions are to be added to the debian/config file and you are using debhelper and the helper scripts from dictionaries-common-dev, proceed as follows. Add your actions to a debian/config.in (or debian/<package-name>.config.in) file.

If your actions are written in Perl, the config.in file should look something like this:


#!/usr/bin/perl

#DEBHELPER#

# Now the package local stuff with your code

and if they are written as a sh script, it should look like this:


#!/bin/sh

#DEBHELPER#

# Now the package local stuff with your code

installdeb-ispell or installdeb-wordlist will take care of including the code required by this policy document into the config scripts.


Emacs, jed and mutt Support

Startup files for emacs and jed are automatically generated after installation of the ispell dictionaries (see the Section called Add-on support). Mutt support is also provided.

As regards the user interface, a new jed command is available: M-x ispell_change_dictionary, which prompts the user for the ispell dictionary to be used for spell checking in the current editing session.


Support for Other Packages

Many Debian packages (and applications not packaged for Debian) don't provide a way for the user to select from multiple spelling dictionaries -- they just use ispell, which uses its currently selected default dictionary. This is fine, and the user doesn't have to learn anything new to switch from one dictionary to another (see the Section called Ispell Dictionary and wordlist selection Support via Debconf).


Registering aspell and myspell/hunspell dictionaries for use from emacs and squirrelmail.

The dictionaries-common system will only use the contents of the emacs ispell-dictionary-alist variable in ispell.el if the corresponding entry is not redefined according to the dictionaries that are actually installed. A registration system similar to that provided for ispell dictionaries is available for aspell and myspell/hunspell dictionaries. To use this system, please follow the guidelines below.


Add an info file

An info file similar to that described in the Section called The info file must be installed as /var/lib/dictionaries-common/aspell/<package-name> for aspell dictionaries, or as /var/lib/dictionaries-common/hunspell/<package-name> for myspell/hunspell dictionaries, containing one entry for each aspell (or myspell/hunspell) dictionary it provides.

If there is an equivalent ispell dictionary, Emacsen-Name must be the same as for that dictionary. Otherwise, things like ;; ispell-local-dictionary: "brasileiro" in the spell-checked file will not work in the same way under ispell, aspell and hunspell.


Modify the maintainer scripts

For aspell, add a call to update-dictcommon-aspell to your postinst:

SCRIPT="update-dictcommon-aspell"

if [ "$1" = "configure" ] ; then
    if which $SCRIPT > /dev/null 2>&1; then
        $SCRIPT
    else
        echo "Error: $SCRIPT not present or executable. Missing dependency on dictionaries-common?" >&2
        exit 1
    fi
fi

and to your postrm:

rmscript="update-dictcommon-aspell"

case "$1" in abort-install|remove)
    if which $rmscript > /dev/null 2>&1; then
        $rmscript
    else
        echo "Warning: $rmscript not present or executable." >&2
    fi
esac

For myspell/hunspell, add a call to update-dictcommon-hunspell to your postinst:

SCRIPT="update-dictcommon-hunspell"

if [ "$1" = "configure" ] ; then
    if which $SCRIPT > /dev/null 2>&1; then
        $SCRIPT
    else
        echo "Error: $SCRIPT not present or executable. Missing dependency on dictionaries-common?" >&2
        exit 1
    fi
fi

and to your postrm:

rmscript="update-dictcommon-hunspell"

case "$1" in abort-install|remove)
    if which $rmscript > /dev/null 2>&1; then
        $rmscript
    else
        echo "Warning: $rmscript not present or executable." >&2
    fi
esac


Set the right relationships

If you want to use this system for your aspell dictionary you must make it depend on dictionaries-common (>= 0.9.1).

If you want to use this system for your myspell/hunspell dictionary you must make it depend on dictionaries-common (>= 1.0).


installdeb-aspell and installdeb-hunspell: debhelper like helpers for aspell and myspell/hunspell dictionaries.

Debhelper-like scripts are provided to make the steps above even easier. They are installdeb-aspell and installdeb-hunspell, and they rely on the existence of an info-{aspell,hunspell} file that conforms to what is specified in the Section called The info file and is named like other debhelper files (.docs, .manpages, ...). Calling the appropriate script in debian/rules will install the aspell/hunspell info file and create postinst and postrm debhelper snippets to be installed by debhelper. Note that, unlike installdeb-{ispell,wordlist}, these scripts do not know about debconf, so you should install your debconf stuff, if any, in the usual way.

If you use the installdeb-aspell script, you must make your package build-depend on dictionaries-common-dev (>= 0.9.1). If you use the installdeb-hunspell script, you must make your package build-depend on dictionaries-common-dev (>= 1.0).
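
To check on a build machine whether an installed version satisfies one of the thresholds above, dpkg's own version comparison can be used. The following is a minimal sketch: the function name version_at_least is hypothetical, while dpkg --compare-versions and dpkg-query -W are standard dpkg commands. It does not replace declaring the relationship in the package control file.

```shell
#!/bin/sh
# Illustrative sketch only: version_at_least is a hypothetical helper.
# Succeeds if version $1 is greater than or equal to version $2,
# according to dpkg's version ordering.
version_at_least() {
    dpkg --compare-versions "$1" ge "$2"
}

# Example use on a Debian system (not run here):
#   installed=$(dpkg-query -W -f='${Version}' dictionaries-common-dev)
#   if version_at_least "$installed" 1.0; then
#       echo "installdeb-hunspell is available"
#   fi
```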


A. Dictionaries common dependencies checklist

This is a summary of the dictionaries-common(-dev) versions that may imply a dependency or build-dependency in your package.

dictionaries-common:
====================

1.23.0:
  * Automatically create /usr/lib -> /var/lib symlinks from
    autobuildhash script. Create a .remove file that will contain a
    list of auto-generated symlinks and try to remove stale .remove
    files and contents.

1.12.0:
  * No longer install update-openoffice-dicts script.

1.9.0:
  * Add triggers support for 'ispell-autobuildhash' and 'aspell-autobuildhash'

1.1.0:
  * No need to explicitly add "-d $hashname" in $ispellargs for emacsen.

1.0.0:
  * Support for hunspell dicts registration for use under emacsen.

0.98.14:
  * Deal with OOo>=3 obsoleted dictionary.lst, removing it if appropriate.

0.98.0:
  * If a previous non dictionaries-common /usr/share/dict/words
    exists it is diverted. This (with more changes) is the base
    for allowing standalone wamerican be installed without
    dictionaries-common.

0.81.0:
  * New support for squirrelmail through sqspell.php file

0.60.0:
  * Added cp1251 as a coding system alias to windows-1251
    for xemacs.

0.50.0:
  * aspell-autobuildhash is now considered stable and working.

0.25.0:
  * Put a list of installed ispell dicts in file
    /var/cache/dictionaries-common/ispell-dicts-list.txt.

0.24.0:
  * policy/dsdt-policy.xml.in,debian/patches/800_ispell.el.dpatch:
    - Allow any charset supported by {x}emacs as a possible value of
      Coding-System. Thanks to Joao Cachopo for the patch.
    - Warn about possible problems about that in the policy document,
      and explain when is a new dependency needed.

0.22.30:
  * ispell-autobuildhash is working since some versions ago, but
    here are some (mostly cosmetic) changes.

0.10.0:
  * Added update-openoffice-dicts, including myspell support

dictionaries-common-dev:
========================

1.23.3:
  * postinst-compatfile.in:
    - Make sure that /var/lib/{aspell,ispell} is *always* available.

1.23.1:
  * postrm-varlibrm.in:
    - Support directories to remove in rmfile.
    - Remove /var/lib/{aspell,ispell} if empty.

1.23.0:
  * No longer create automatically /usr/lib -> /var/lib symlinks.

1.22.0:
  * Initial support for dh_aspell-simple debhelper snippet.
  * Initial support for $compat.remove in debhelper snippets.

1.21.0:
  * installdeb.in: Populate substvars $class:Depends.

1.11.2:
  * Make changes in 1.10.9 more robust.

1.11.0:
  * installdeb.in: Automatically enable elanguages if provided.

1.10.9:
  * Make autobuildhash more robust by resetting compat in preinst
    (and in postinst for reconfigure). **Buggy**, go to 1.11.2!!

1.10.5:
  * installdeb.in: No longer install myspell postinst/postrm snippets.

1.10.3:
  * When using auto-compat and no previous stuff present,
    automatically create
      /usr/share/{i,a}spell/$hash->/var/lib/{a,i}spell/$hash
    symlink from installdeb-{a,i}spell.

1.10.2:
  * For {a,i}spell-autobuildhash, installdeb-{a,i}spell can now
    handle compat files creation/reset along with compat and hashes
    removal via 'auto-compat' field in info-{a,i}spell file. No
    longer need to modify files shipped by the package. Works
    better with triggers (dictionaries-common >1.9)

1.5.3:
  * o2compat is no longer the default for installdeb-myspell.

1.4.0:
  * Mozilla symlinks must no longer be set in hunspell destdir.
  * dicts in hyphen form will be installed in lowbar one.

1.3.0:
  * New location for hunspell dicts and OpenOffice hyphenation and
    thesaurus files. installdeb-myspell modified for this. Backwards
    compatibility symlinks will be installed until all apps look in
    the new location.

1.2.0:
  * postrm scripts with dictionaries-common-dev snippets can sometimes
    be run without dictionaries-common even installed
    (e.g., dpkg --unpack dict && dpkg --remove dict). Deal gracefully
    with this instead of signalling an error.

1.0.0:
  * installdeb-hunspell helper and associated debhelper snippets.

0.98.13:
  * Do not add path for maintainer scripts

0.96.0:
  * New installdeb-{ispell,wordlist} options for elanguages and future
    dictionaries-common migration to optional:
    --write-elanguages    Create the elanguages stuff.
    --no-installdebconf   Do not run dh_installdebconf nor remove templates
                          and config file.
    --no-pre-post         Do not install {pre,post}{inst,rm} snippets.


0.70.0:
  * Added support for CDBS

0.63.7:
  * scripts/debhelper/installdeb-myspell:
    - Do not try to automatically set mozilla symlinks if
      debian/$myspell_dict_package.links exists. Those
      symlinks were previously set unconditionally.

0.50.1:
  * scripts/debhelper/installdeb-myspell:
    - Make --srcdir option work also with thesaurus files.

0.30.0:
  * scripts/debhelper/installdeb-myspell:
    - Will also install openoffice hyphenation files if
      called from the hyphenation package

0.16.0:
  * Included installdeb-myspell script and associated debhelper
    snippets. --srcdir option is available in this branch from
    this version on.


B. Dictionaries common internals


C. Enchant ispell mapping

This mapping shows the locale name, expected hash name and expected encoding for the enchant ispell interface. It was extracted from the file src/ispell/ispell_checker.cpp in the enchant sources. This appendix might be out of date; please refer to that file for the most recent values.

    {"ca"    ,"catala.hash"         ,"iso-8859-1" },
    {"ca_ES" ,"catala.hash"         ,"iso-8859-1" },
    {"cs"    ,"czech.hash"          ,"iso-8859-2" },
    {"cs_CZ" ,"czech.hash"          ,"iso-8859-2" },
    {"da"    ,"dansk.hash"          ,"iso-8859-1" },
    {"da_DK" ,"dansk.hash"          ,"iso-8859-1" },
    {"de"    ,"deutsch.hash"        ,"iso-8859-1" },
    {"de_CH" ,"swiss.hash"          ,"iso-8859-1" },
    {"de_AT" ,"deutsch.hash"        ,"iso-8859-1" },
    {"de_DE" ,"deutsch.hash"        ,"iso-8859-1" },
    {"el"    ,"ellhnika.hash"       ,"iso-8859-7" },
    {"el_GR" ,"ellhnika.hash"       ,"iso-8859-7" },
    {"en"    ,"british.hash"        ,"iso-8859-1" },
    {"en_AU" ,"british.hash"        ,"iso-8859-1" },
    {"en_BZ" ,"british.hash"        ,"iso-8859-1" },
    {"en_CA" ,"british.hash"        ,"iso-8859-1" },
    {"en_GB" ,"british.hash"        ,"iso-8859-1" },
    {"en_IE" ,"british.hash"        ,"iso-8859-1" },
    {"en_JM" ,"british.hash"        ,"iso-8859-1" },
    {"en_NZ" ,"british.hash"        ,"iso-8859-1" },
    {"en_TT" ,"british.hash"        ,"iso-8859-1" },
    {"en_ZA" ,"british.hash"        ,"iso-8859-1" },
    {"en_ZW" ,"british.hash"        ,"iso-8859-1" },
    {"en_PH" ,"american.hash"       ,"iso-8859-1" },
    {"en_US" ,"american.hash"       ,"iso-8859-1" },
    {"eo"    ,"esperanto.hash"      ,"iso-8859-3" },
    {"es"    ,"espanol.hash"        ,"iso-8859-1" },
    {"es_AR" ,"espanol.hash"        ,"iso-8859-1" },
    {"es_BO" ,"espanol.hash"        ,"iso-8859-1" },
    {"es_CL" ,"espanol.hash"        ,"iso-8859-1" },
    {"es_CO" ,"espanol.hash"        ,"iso-8859-1" },
    {"es_CR" ,"espanol.hash"        ,"iso-8859-1" },
    {"es_DO" ,"espanol.hash"        ,"iso-8859-1" },
    {"es_EC" ,"espanol.hash"        ,"iso-8859-1" },
    {"es_ES" ,"espanol.hash"        ,"iso-8859-1" },
    {"es_GT" ,"espanol.hash"        ,"iso-8859-1" },
    {"es_HN" ,"espanol.hash"        ,"iso-8859-1" },
    {"es_MX" ,"espanol.hash"        ,"iso-8859-1" },
    {"es_NI" ,"espanol.hash"        ,"iso-8859-1" },
    {"es_PA" ,"espanol.hash"        ,"iso-8859-1" },
    {"es_PE" ,"espanol.hash"        ,"iso-8859-1" },
    {"es_PR" ,"espanol.hash"        ,"iso-8859-1" },
    {"es_PY" ,"espanol.hash"        ,"iso-8859-1" },
    {"es_SV" ,"espanol.hash"        ,"iso-8859-1" },
    {"es_UY" ,"espanol.hash"        ,"iso-8859-1" },
    {"es_VE" ,"espanol.hash"        ,"iso-8859-1" },
    {"fi"    ,"finnish.hash"        ,"iso-8859-1" },
    {"fi_FI" ,"finnish.hash"        ,"iso-8859-1" },
    {"fr"    ,"francais.hash"       ,"iso-8859-1" },
    {"fr_BE" ,"francais.hash"       ,"iso-8859-1" },
    {"fr_CA" ,"francais.hash"       ,"iso-8859-1" },
    {"fr_CH" ,"francais.hash"       ,"iso-8859-1" },
    {"fr_FR" ,"francais.hash"       ,"iso-8859-1" },
    {"fr_LU" ,"francais.hash"       ,"iso-8859-1" },
    {"fr_MC" ,"francais.hash"       ,"iso-8859-1" },
    {"hu"    ,"hungarian.hash"      ,"iso-8859-2" },
    {"hu_HU" ,"hungarian.hash"      ,"iso-8859-2" },
    {"ga"    ,"irish.hash"          ,"iso-8859-1" },
    {"ga_IE" ,"irish.hash"          ,"iso-8859-1" },
    {"gl"    ,"galician.hash"       ,"iso-8859-1" },
    {"gl_ES" ,"galician.hash"       ,"iso-8859-1" },
    {"ia"    ,"interlingua.hash"    ,"iso-8859-1" },
    {"it"    ,"italian.hash"        ,"iso-8859-1" },
    {"it_IT" ,"italian.hash"        ,"iso-8859-1" },
    {"it_CH" ,"italian.hash"        ,"iso-8859-1" },
    {"la"    ,"mlatin.hash"         ,"iso-8859-1" },
    {"la_IT" ,"mlatin.hash"         ,"iso-8859-1" },
    {"lt"    ,"lietuviu.hash"       ,"iso-8859-13" },
    {"lt_LT" ,"lietuviu.hash"       ,"iso-8859-13" },
    {"nl"    ,"nederlands.hash"     ,"iso-8859-1" },
    {"nl_NL" ,"nederlands.hash"     ,"iso-8859-1" },
    {"nl_BE" ,"nederlands.hash"     ,"iso-8859-1" },
    {"nb"    ,"norsk.hash"          ,"iso-8859-1" },
    {"nb_NO" ,"norsk.hash"          ,"iso-8859-1" },
    {"nn"    ,"nynorsk.hash"        ,"iso-8859-1" },
    {"nn_NO" ,"nynorsk.hash"        ,"iso-8859-1" },
    {"pl"    ,"polish.hash"         ,"iso-8859-2" },
    {"pl_PL" ,"polish.hash"         ,"iso-8859-2" },
    {"pt"    ,"brazilian.hash"      ,"iso-8859-1" },
    {"pt_BR" ,"brazilian.hash"      ,"iso-8859-1" },
    {"pt_PT" ,"portugues.hash"      ,"iso-8859-1" },
    {"ru"    ,"russian.hash"        ,"koi8-r" },
    {"ru_MD" ,"russian.hash"        ,"koi8-r" },
    {"ru_RU" ,"russian.hash"        ,"koi8-r" },
    {"sc"    ,"sardinian.hash"      ,"iso-8859-1" },
    {"sc_IT" ,"sardinian.hash"      ,"iso-8859-1" },
    {"sk"    ,"slovak.hash"         ,"iso-8859-2" },
    {"sk_SK" ,"slovak.hash"         ,"iso-8859-2" },
    {"sl"    ,"slovensko.hash"      ,"iso-8859-2" },
    {"sl_SI" ,"slovensko.hash"      ,"iso-8859-2" },
    {"sv"    ,"svenska.hash"        ,"iso-8859-1" },
    {"sv_SE" ,"svenska.hash"        ,"iso-8859-1" },
    {"uk"    ,"ukrainian.hash"      ,"koi8-u" },
    {"uk_UA" ,"ukrainian.hash"      ,"koi8-u" },
    {"yi"    ,"yiddish-yivo.hash"   ,"utf-8" }