Variant Conversion Info (VARCON)

Revision 3

January 3, 2003

Copyright 2000-2003 by Kevin Atkinson (kevina@gnu.org)

This package contains tables to convert between American, British, and
Canadian spellings and vocabulary, as well as a table listing the
equivalent forms of other variants.

The latest version can be found at http://aspell.sourceforge.net/wl/.

The file abc.tab contains mappings between American, British, and
Canadian spellings.  The "ise" spelling is used for the British column.
(If you are interested in the "ize" spellings let me know and I will
see what I can do.)  The fields are separated by a tab character and
lines end with the Unix EOL character.  The first column is the
American spelling, the second the British, and the third the Canadian.
The last column marks words where the spelling of one region is also
used in another region, but only when the word has a certain meaning.
American words that are also used in British / Canadian spellings are
marked with an "A", while British words that are also used in American
/ Canadian spellings are marked with a "B".
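
As an illustration of the layout described above, here is a minimal
Python sketch that splits one abc.tab line into its columns.  The names
AbcEntry and parse_abc_line are hypothetical, the sample line is
illustrative rather than quoted from the file, and the handling of a
missing marker column is an assumption.

```python
from typing import NamedTuple


class AbcEntry(NamedTuple):
    american: str
    british: str
    canadian: str
    marker: str  # "A", "B", or "" when the last column is empty


def parse_abc_line(line: str) -> AbcEntry:
    """Split one tab-separated abc.tab line into its columns."""
    fields = line.rstrip("\n").split("\t")
    # Assumption: an unmarked line may omit the trailing marker column,
    # so pad the list to four fields before unpacking.
    fields += [""] * (4 - len(fields))
    return AbcEntry(*fields[:4])


# Illustrative line, not copied from abc.tab.
entry = parse_abc_line("color\tcolour\tcolour\n")
print(entry.british)  # prints "colour"
```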

The file voc.tab is like abc.tab except that it converts between
vocabulary instead of spelling.  If more than one word is often used to
describe the same thing, the words are separated with commas.  The last
column contains additional notes on when the word is used.  Unlike
abc.tab it is generally not suitable for automatic conversion.
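
A cell in voc.tab may therefore hold several words.  The sketch below
shows one way to split such a cell in Python.  The function name
split_alternatives is hypothetical, the sample cell is illustrative,
and whether the file puts a space after each comma is not stated here,
so the code tolerates both.

```python
from typing import List


def split_alternatives(cell: str) -> List[str]:
    """Return the comma-separated words of one voc.tab cell."""
    # Strip surrounding blanks and drop empty pieces so that both
    # "a,b" and "a, b" yield the same result.
    return [word.strip() for word in cell.split(",") if word.strip()]


# Illustrative cell, not copied from voc.tab.
print(split_alternatives("pavement, footpath"))  # prints ['pavement', 'footpath']
```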

The file variant-also.tab contains additional mappings between various
spellings of a word which are not necessarily region dependent.  Only
mappings that are not listed in abc.tab are included in this mapping.
No attempt is made to distinguish the primary form of a word.  The
file variant-infl.tab is like variant-also.tab except that it is
created automatically from the AGID inflection database.  The file
variant-wroot.tab is like variant-infl.tab except that it also
includes the root form of the word.

Both the translation array and the variant tables include many strange
affixations of words, which I have no intention of removing as some
people might find them useful.

The "make-variant" Perl script will combine abc.tab, variant-also.tab,
and variant-infl.tab into one huge mapping and will output the result
to "variant.tab".  If the "no-infl" option is given then
variant-infl.tab will not be included.
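
The idea of combining several tables into one mapping can be sketched
in Python as follows.  This is not the make-variant script itself: the
function name combine_groups is hypothetical, each table is assumed to
be already parsed into groups of equivalent spellings, and the sample
groups are illustrative.  Leaving a table out of the input plays the
role of the "no-infl" option.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def combine_groups(tables: Iterable[Iterable[List[str]]]) -> Dict[str, Set[str]]:
    """Merge groups of equivalent spellings into one word -> variants map."""
    variants: Dict[str, Set[str]] = defaultdict(set)
    for table in tables:
        for group in table:
            for word in group:
                # Every other member of the group is a variant of this word.
                variants[word].update(w for w in group if w != word)
    return dict(variants)


# Illustrative groups, standing in for parsed tables.
abc_groups = [["gray", "grey"]]
also_groups = [["ax", "axe"]]
print(sorted(combine_groups([abc_groups, also_groups])))
# prints ['ax', 'axe', 'gray', 'grey']
```

Note that this sketch does not chain groups transitively; whether the
real script does so is not stated here.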

The "split" script will create 4 word lists from abc.tab:
american.lst, british.lst, canadian.lst, and common.lst.  The
common.lst file contains words that were marked in the last column as
explained above, and the other three contain all the possible words
that might be used by that particular country, including some of the
words marked in the last column.
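
One possible reading of this behaviour is sketched below in Python.  It
is an interpretation, not the "split" script: the function name
split_word_lists is hypothetical, rows are assumed to be already parsed
into four columns, and the rule that a marked word is added to the
lists of the other regions is an assumption based on the marker
description for abc.tab.  The sample rows are illustrative.

```python
from typing import Dict, Iterable, Sequence, Set


def split_word_lists(rows: Iterable[Sequence[str]]) -> Dict[str, Set[str]]:
    """Build the four word lists from (american, british, canadian, marker) rows."""
    lists: Dict[str, Set[str]] = {
        "american": set(), "british": set(), "canadian": set(), "common": set(),
    }
    for american, british, canadian, marker in rows:
        lists["american"].add(american)
        lists["british"].add(british)
        lists["canadian"].add(canadian)
        if marker == "A":
            # Assumption: a marked American word is also valid elsewhere.
            lists["common"].add(american)
            lists["british"].add(american)
            lists["canadian"].add(american)
        elif marker == "B":
            # Assumption: a marked British word is also valid elsewhere.
            lists["common"].add(british)
            lists["american"].add(british)
            lists["canadian"].add(british)
    return lists


# Illustrative rows, not copied from abc.tab.
rows = [("color", "colour", "colour", ""), ("check", "cheque", "cheque", "A")]
print(sorted(split_word_lists(rows)["british"]))
# prints ['check', 'cheque', 'colour']
```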

The "translate" Perl script will translate a text file from one
spelling to another. Its usage is:

translate <options> [<translation array>] <from> <to>
<options> is any of
  -?,-h,--help  this screen
  -m,--mark     mark words where the translation is questionable
  -i,--include  include words where the translation is questionable
<translation array> is the file name of the translation array,
                    defaults to "abc.tab".
<from> and <to> are one of: american, british, or canadian

Text is read from standard input and written to standard output.
When the -m option is enabled, each questionable word is marked with
a '?' before and after it.
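
The translation step can be pictured with the following Python sketch.
It is not the translate script: the names build_table and
translate_text are hypothetical, the sample lines are illustrative, and
several behaviours are assumptions, namely that questionable entries
are left untranslated by default, that the '?' marks wrap the
translated word, and that only exact lower-case matches are replaced.

```python
import re
from typing import Dict, Iterable, Tuple

COLUMNS = {"american": 0, "british": 1, "canadian": 2}


def build_table(lines: Iterable[str], source: str, target: str) -> Dict[str, Tuple[str, bool]]:
    """Map each source spelling to (target spelling, is_questionable)."""
    table: Dict[str, Tuple[str, bool]] = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip malformed lines
        # Assumption: a non-empty fourth column means "questionable".
        questionable = len(fields) > 3 and fields[3] != ""
        src, dst = fields[COLUMNS[source]], fields[COLUMNS[target]]
        if src != dst:
            table[src] = (dst, questionable)
    return table


def translate_text(text: str, table: Dict[str, Tuple[str, bool]],
                   mark: bool = False, include: bool = False) -> str:
    """Replace known words; treat questionable ones according to the flags."""
    def replace(match: "re.Match[str]") -> str:
        word = match.group(0)
        hit = table.get(word)
        if hit is None:
            return word
        new, questionable = hit
        if not questionable:
            return new
        if mark:
            return "?" + new + "?"
        return new if include else word
    return re.sub(r"[A-Za-z']+", replace, text)


# Illustrative lines, not copied from abc.tab.
lines = ["color\tcolour\tcolour\n", "check\tcheque\tcheque\tA\n"]
table = build_table(lines, "american", "british")
print(translate_text("The color of the check", table, mark=True))
# prints "The colour of the ?cheque?"
```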

If you discover any errors in these mappings, besides the strange
affixations, or have suggestions for additions, please let me know
at kevina@gnu.org.

SOURCE:

These mappings were compiled from numerous sources.

The file abc.tab was originally created from the American and British
word lists found in the Ispell distribution and the Canadian word list
created by Garst R. Reese <reese@isn.net>:

  What I have discovered is that Canadian is a modification of British.
  Canadians use ize ization, izing izable like Americans, and gram instead
  of gramme. The one exception I found was practise. It does not go to
  practize.  Otherwise they use British spelling. So, what I am currently
  checking books with is an edited version of British, where I have
  changed all occurrences of ise to ize, isab to izab, isation to ization,
  ising to izing, and gramme to gram except I allow programme, which is
  sometimes proper unless you are talking about a computer program. I did
  bunches of greps to be sure these substitutions would work as expected.

Many other words have been added to abc.tab which were not in the
original Ispell word lists.

Many different web sources were consulted when creating the tables.
They include:

  The American-British British-American Dictionary
    http://www.peak.org/~jeremy/dictionary/dictionary.html
    American and British Spelling Differences
      http://www.peak.org/~jeremy/dictionary/spellcat.html
  Dave (VE7CNV)'s Truly Canadian Dictionary of Canadian Spelling
    http://www.luther.bc.ca/~dave7cnv/cdnspelling/cdnspelling.html
  Canadian Spelling Convention
    http://imej.wfu.edu/articles/1999/1/02/demo/tutorial/canas.html
  Cornerstone's Canadian English Page
    http://www.web.net/cornerstone/cdneng.htm
  Inter-Play Translation: British/Canadian/American Spelling
    http://www.inter-play.com/translation/spel-ukus.htm
  Inter-Play Translation: British/Canadian/American Vocabulary
    http://www.inter-play.com/translation/voc-ukus.htm

As well as several online dictionaries:

  Merriam-Webster: http://www.m-w.com/
  American Heritage: http://www.bartleby.com/61/
  Cambridge (ESL): http://dictionary.cambridge.org/

CHANGELOG:

From Revision 2 to Revision 3 (January 2, 2003)

  - Added an option for not including variant-infl.tab for the
    make-variant perl script
  - Added the file variant-wroot.tab
  - Added a few entries given to me by Francis Bond and Edward Betts

From Revision 1 to Revision 2 (January 27, 2001)

  - Removed all "B" markers because I could not find any evidence for
    them
  - Corrected a few Canadian entries, especially those with the "B"
    markers
  - Added some more entries by trying fixed changes (such as ize to
    ise) to words in SCOWL and hand-checking the ones with semi-common
    words in them.
  - Added variant-infl.tab

COPYRIGHT:

Copyright 2000-2003 by Kevin Atkinson

Permission to use, copy, modify, distribute and sell this array, the
associated software, and its documentation for any purpose is hereby
granted without fee, provided that the above copyright notice appears
in all copies and that both that copyright notice and this permission
notice appear in supporting documentation. Kevin Atkinson makes no
representations about the suitability of this array for any
purpose. It is provided "as is" without express or implied warranty.

Since the original word lists come from the Ispell distribution, the
following notice also applies:

Copyright 1993, Geoff Kuenning, Granada Hills, CA
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

1. Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in the
   documentation and/or other materials provided with the distribution.
3. All modifications to the source code must be clearly marked as
   such.  Binary redistributions based on modified source code
   must be clearly marked as modified versions in the documentation
   and/or other materials provided with the distribution.
4. All advertising materials mentioning features or use of this software
   must display the following acknowledgment:
     This product includes software developed by Geoff Kuenning and
     other unpaid contributors.
5. The name of Geoff Kuenning may not be used to endorse or promote
   products derived from this software without specific prior
   written permission.

THIS SOFTWARE IS PROVIDED BY GEOFF KUENNING AND CONTRIBUTORS ``AS IS'' AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED.  IN NO EVENT SHALL GEOFF KUENNING OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
SUCH DAMAGE.
