FindinSite-CD: Search engine for CD/DVD

FindinSite-CD Language support


Locales
Startup and Usage
Adding languages
Changing languages
Languages to use
Languages button
String definitions

Language codes
Country codes

Overview

FindinSite-CD is supplied fully internationalised for many languages:
  • FindinSite-CD detects the user's preferred language and uses the most appropriate language for its display prompts, etc. See screen shots of FindinSite-CD running in different languages.
  • If you specify an index for the user's preferred language, then this will be chosen by FindinSite-CD at startup.
  • By default, FindinSite-CD will show a "Languages" button that lets you switch between the available languages.
You can configure the FindinSite-CD language handling in many ways. You can support whole new languages or alter existing languages. In addition, you can configure what languages are made available, and whether to show the Languages button.
Languages supported
  • English
  • български (Bulgarian)
  • Česky (Czech)
  • Dansk (Danish)
  • Deutsch (German)
  • Eesti keel (Estonian)
  • Español (Spanish)
  • ελληνικά (Greek)
  • Français (French)
  • Hrvatski (Croatian)
  • Italiano (Italian)
  • Magyar (Hungarian)
  • Latviešu valoda (Latvian)
  • Lietuvių kalba (Lithuanian)
  • Malti (Maltese)
  • Nederlands (Dutch)
  • Norsk (Norwegian)
  • Polski (Polish)
  • Português (Portuguese)
  • Română (Romanian)
  • Slovenčina (Slovak)
  • Slovenščina (Slovenian)
  • Suomi (Finnish)
  • Svenska (Swedish)
  • العربية (Arabic)
  • 日本語 (Japanese)
  • ภาษาไทย (Thai)
  • 简体中文 (Simplified Chinese)
  • 繁体中文 (Traditional Chinese)
FindinSite-CD-Wizard and Findex
The indexing tools, FindinSite-CD-Wizard and Findex, read files in many languages. For most non-English languages, you need to write your web pages in the appropriate character set. FindinSite-CD-Wizard and Findex support most common character sets used in the world. Make sure that you specify the correct META Content-Type tag. See the HTML Character sets, PDF Support and File types pages for more details.
Characters with accents, etc.
If your files have characters with accents, umlauts, etc., then FindinSite-CD-Wizard and Findex will find all these characters correctly. In addition, the FindinSite-CD search page lets you search for words that contain these characters.

Note carefully that searching for donnee, for example, will find words with accents, such as donnée and données, and vice versa. While this is useful most of the time, a search for thé will also find "the", which may be slightly confusing. In general, we believe it is better to find too many words than too few.
You can restrict the search by putting the word in single quotes, eg searching for 'thé' will not find "the".
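
The accent handling described above can be pictured with a short Python sketch. This is only an illustration under stated assumptions, not the FindinSite-CD implementation: the function names are hypothetical, the comparison is whole-word only (so it does not cover the donnee/données case), and accents are removed using Unicode decomposition.

```python
import unicodedata


def fold_accents(word: str) -> str:
    """Remove combining marks, so that 'donnée' becomes 'donnee'."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matches(query: str, word: str) -> bool:
    """Hypothetical whole-word comparison following the rules above."""
    if len(query) >= 2 and query[0] == "'" and query[-1] == "'":
        # Single quotes: the word must match exactly, accents included.
        return query[1:-1] == word
    # No quotes: accents and letter case are ignored on both sides.
    return fold_accents(query).lower() == fold_accents(word).lower()
```

With this sketch, matches("thé", "the") is true, while matches("'thé'", "the") is false.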

Word highlighting
Word highlighting in Navigator does not usually work for characters with accents, because the word highlighter does not recognise characters such as é.


Locales

Example locale strings
en English
enGB English, United Kingdom
fr French
frFR French, France
frCA French, Canada
de German
it Italian
ja Japanese
zh Chinese
zhTW Chinese (Taiwan)
A language is defined by its locale string.  In fact, a locale specifies a language and optionally a country. For example, the locale "en" refers to English while "enGB" refers to English as used in the United Kingdom.

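The locale string is used in various places, including the defaultLocale and UseLanguages parameters and the names of the Lang_ parameters described below.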

The first two characters of the locale string give the ISO Language Code. These codes are the lower-case two-letter codes as defined by ISO-639. Here is a full list of language codes taken from http://www.ics.uci.edu/pub/ietf/http/related/iso639.txt

If supplied, the next two characters of the locale string give the ISO Country Code. These codes are the upper-case two-letter codes as defined by ISO-3166. Here is a full list of country codes taken from http://www.chemie.fu-berlin.de/diverse/doc/ISO_3166.html
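
As an illustration of this layout, the following Python sketch splits a locale string into its language and country parts. The Locale type and parse_locale function are hypothetical names chosen for this example; they are not part of FindinSite-CD.

```python
from typing import NamedTuple


class Locale(NamedTuple):
    language: str  # lower-case two-letter ISO-639 code, eg "en"
    country: str   # upper-case two-letter ISO-3166 code, or "" if absent


def parse_locale(text: str) -> Locale:
    """Split a locale string such as 'enGB' into ('en', 'GB')."""
    if len(text) not in (2, 4) or not (text.isascii() and text.isalpha()):
        raise ValueError(f"not a locale string: {text!r}")
    # The first two characters give the language; the next two, if any, the country.
    return Locale(text[:2].lower(), text[2:].upper())
```

For example, parse_locale("frCA") gives language "fr" and country "CA", while parse_locale("de") gives language "de" and an empty country.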

DefaultLocale

When run in most recent browsers, FindinSite-CD can determine the user's preferred locale. However, older browsers such as Navigator 3 or Internet Explorer 3 assume that the user's preferred language is English.

You can change the default locale for these older browsers by setting the defaultLocale applet parameter. For example, to set the default locale to French, add the following parameter to your search page:

<PARAM NAME=defaultLocale VALUE="fr">


Startup and Usage

At startup, FindinSite-CD gets the user's preferred locale (from the operating system, not the browser) and sets the language and index:
  • The best matching available language is chosen.
  • If more than one index parameter is specified with locales, then the most appropriate index is chosen - see the Indexes page for full details.
After startup:
  • The user can switch language using the "Languages" button (if it is available).
  • The user can change index using the "Indexes" button, if present.
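
This page does not spell out how the "best matching" language is found at startup. One plausible reading, shown here purely as an assumption, is to try the full locale first, then the language alone, and finally fall back to English. The choose_language function below is a hypothetical Python sketch of that idea.

```python
from typing import Sequence


def choose_language(preferred: str, available: Sequence[str],
                    fallback: str = "en") -> str:
    """Pick a language for the user's preferred locale (assumed rules)."""
    # 1. Exact locale match, eg "zhTW" when "zhTW" is available.
    if preferred in available:
        return preferred
    # 2. Language-only match, eg "frCA" is served by "fr".
    language = preferred[:2]
    if language in available:
        return language
    # 3. Nothing suitable: use the fallback language.
    return fallback
```

For example, choose_language("frCA", ["en", "fr", "de"]) returns "fr", and a Japanese user offered only English and French gets "en".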


Adding languages

Thanks to the European Investment Bank for several European Union language translations.
Arabic (العربية) translation provided by Lubna Sorour.
Chinese (简体中文) translations provided by Nan Chem and Mary Rack.
Croatian (Hrvatski) translation provided by Zvonimir Bulaja at www.bulaja.com.
Czech (Česky) translation provided by Milan Hampl.
Dutch (Nederlands) translation provided by Hans Schipper.
French (Français) translation done in-house.
German (Deutsch) translation provided by Renate Heath, and Julian Calvert of Software AG.
Italian (Italiano) translation provided by Carmelo Cutuli of Global Communication, and Dr Stefania Goffredo of Reggiani S.p.A.
Japanese (日本語) translation provided by Yuichi Tokunaga of Cybernet System Co Ltd.
Norwegian (Norsk) translation provided by Anderson F. R. dos Santos, Norway.
Portuguese (Português) translation provided by Fernando Nunes, Macau.
Slovenian (Slovenščina) translation provided by Luka Malenšek, Slovenia.
Spanish (Español) translation mainly provided by Eduardo Zamora of the Instituto Latinoamericano de la Comunicación Educativa (ILCE), Mexico.
To add support for a new locale, you must complete two steps:
  • Write a language file.
  • Tell FindinSite-CD to use it with the Languages parameter.
Once your language is recognised by FindinSite-CD, double-check that all the strings display correctly. Note that not all browsers can display all strings at the same time, eg Asian text may not display correctly on a Western system.
Writing a language file
The display button labels, text and messages for a language are defined in a language file. FindinSite-CD has built-in support for English and many other languages. The language files are in the COM/phdcc/lang subdirectory of the FindinSite-CD runtime. By convention, a language file usually has a file extension of .hil.

There are three different language file formats. The recommended format is a plain text file, starting with a line containing the number 3. Then specify more lines with name=value string definitions. See the string name definitions section below for details of all strings.

A language file must include Language, Country, Localname and Englishname definitions, as can be seen in this excerpt from the German language file:

3
Language=de
Country=
Localname=Deutsch
Englishname=German
 
L_AND=UND
L_OR=ODER
L_NOT=NICHT
 
L_PAGES_AND=\ Seiten und
L_WORDS=\ W\u00F6rter.

The last two definitions illustrate these points:

  • If you want a space at the beginning or end of a string value, then you must write it as "\ ", ie a backslash followed by a space.
  • Any non-USASCII characters should be expressed in Unicode format, ie \uHHHH where HHHH is the hexadecimal for the Unicode character. For example:
    • \u00F6 for the small letter o with a diaeresis: ö, ie U+00F6
    • \u20AC for the Euro symbol: €, ie U+20AC.
  • The backslash character must be represented as \\.
  • Other characters may also be preceded by \, as in the space example above.
Alternatively, you can store the language file in UTF-8 format, with UTF-8 prefix bytes 0xEF 0xBB 0xBF. Windows Notepad normally stores these prefix bytes automatically if you save using the UTF-8 encoding. Note that you must still use "\ " for a space at the end of a string.

If you omit the definition for a string, then the English version of that string will be used.
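
To check a language file before putting it on a CD, it can help to decode it the way the rules above describe. The Python sketch below is a hypothetical reader for the recommended format 3 only; the function names are illustrative, and the trimming of unescaped spaces around values is an assumption drawn from the escaping rule, not confirmed behaviour.

```python
import re

# A backslash followed by uHHHH, or by any single character.
_ESCAPE = re.compile(r"\\(u[0-9A-Fa-f]{4}|.)", re.DOTALL)


def unescape(value: str) -> str:
    """Decode \\uHHHH, '\\ ' and '\\\\' escape sequences."""
    def replace(match):
        token = match.group(1)
        if len(token) == 5:                # uHHHH: a Unicode character
            return chr(int(token[1:], 16))
        return token                       # any other escaped character
    return _ESCAPE.sub(replace, value)


def _trim(raw: str) -> str:
    """Drop unescaped outer whitespace but keep a trailing escaped space."""
    raw = raw.lstrip()
    trimmed = raw.rstrip()
    backslashes = len(trimmed) - len(trimmed.rstrip("\\"))
    if backslashes % 2 == 1 and len(trimmed) < len(raw):
        trimmed += raw[len(trimmed)]       # restore the escaped character
    return trimmed


def parse_language_file(text: str) -> dict:
    """Return the name=value definitions of a format 3 language file."""
    lines = text.lstrip("\ufeff").splitlines()   # skip a UTF-8 prefix, if any
    if not lines or lines[0].strip() != "3":
        raise ValueError("expected a format 3 language file")
    strings = {}
    for line in lines[1:]:
        if "=" not in line:
            continue                              # blank or unrecognised line
        name, _, raw = line.partition("=")
        strings[name.strip()] = unescape(_trim(raw))
    return strings


def lookup(strings: dict, english: dict, name: str) -> str:
    """Use the English string when a definition is missing."""
    return strings.get(name, english[name])
```

Applied to the German excerpt above, this gives " Seiten und" for L_PAGES_AND and " Wörter." for L_WORDS, each with its leading space.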

Languages parameter
Once you have written a language file, you must tell FindinSite-CD to use it. Put the language file in your FindinSite-CD directory in your CD image and add a Languages parameter.

The optional Languages parameter should contain a comma-separated list of language file URLs. If you supply a language file for a language that is built-in to FindinSite-CD, then your language definition will take precedence.

For example, to add Greek language support and replace the English language file, you could specify the following, where FindEl3.hil has the Greek locale language code of el, and FindMyEn3.hil has the English locale language code of en.

<PARAM NAME=Languages VALUE="FindEl3.hil,FindMyEn3.hil">
For older language file formats, you need to specify language and country codes in addition to the language file URL. Put these after semi-colons after the URL, eg:
<PARAM NAME=Languages VALUE="FindEl3.hil,FindEnUK.hil;en;gb">
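
The structure of the Languages value can be made explicit with a small Python sketch. The parse_languages_param function is a hypothetical helper written for this page; it only splits the value as described above and does not check that the files exist.

```python
def parse_languages_param(value: str) -> list:
    """Split a Languages value into (url, language, country) tuples."""
    entries = []
    for item in value.split(","):
        # Older formats add ";language;country" after the file URL.
        url, *codes = [part.strip() for part in item.split(";")]
        language = codes[0] if len(codes) > 0 else None
        country = codes[1] if len(codes) > 1 else None
        entries.append((url, language, country))
    return entries
```

For the second example above, this yields ("FindEl3.hil", None, None) followed by ("FindEnUK.hil", "en", "gb").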


Changing languages

You can change individual language strings using applet parameters. This may be easier to use than providing a whole new language file. It also lets you change the default English strings without having to provide a whole English language file.

This method will change any language, including the ones you supply using the Languages parameter.

You can provide one or more parameters for each built-in language. The parameter name must be in this format:

Lang_<language code><country code>_<index>
where
<language code> is the language code
<country code> is the optional country code
<index> is a number incrementing from 1 for each language

The parameter value is in this format:

<string name>,<new string>
where
<string name> is the name of the string that you want to replace in upper case
<new string> is the new string

The parameter name is case insensitive, but the <string name> in the parameter value must be in upper case. The <new string> must use the backslash escape sequences described above if necessary.

The string name definitions are given below. Note that you cannot change the Language, Country, Localname and Englishname definitions.

This example replaces two English strings and one string each for French and Traditional Chinese:

<PARAM NAME="Lang_en_1" VALUE="L_SUBSETS,Sets">
<PARAM NAME="Lang_en_2" VALUE="L_SELECT_SUBSETS, Choose which data set to use\:">
<PARAM NAME="Lang_fr_1" VALUE="L_DATABASES,bases de donn\u00E9es">
<PARAM NAME="Lang_zhTW_1" VALUE="L_SUBSETS,\u7528\u6237\u7535\u8BDD\u673A">


Languages to use

If - for some reason - you do not want FindinSite-CD to make all its languages available, then you can specify the list of available languages in the UseLanguages parameter. Specify a comma-separated list of locale strings, ie language and optional country codes. For example, to only make English, French and Traditional Chinese available:
<PARAM NAME=UseLanguages VALUE="en,fr,zhTW">
If you only ever want to use English, then use the following:
<PARAM NAME=UseLanguages VALUE="en">
If only one language is available, then the "Languages" button will not be shown.

Do not specify a UseLanguages parameter at all if you want all the supplied FindinSite-CD languages available.


Languages button

If there is more than one available language, then FindinSite-CD shows the "Languages" button by default. If you never want to show the "Languages" button, then set the ShowLanguages parameter to no, eg:
<PARAM NAME=ShowLanguages VALUE="no">
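
The combined effect of the UseLanguages and ShowLanguages parameters can be summarised in a short Python sketch. Both functions are hypothetical and simply restate the rules on this page; treating the value "no" as case insensitive is an assumption.

```python
def available_languages(supplied, use_languages=None):
    """Filter the supplied locales by an optional UseLanguages value."""
    if use_languages is None:
        return list(supplied)          # no parameter: everything is available
    wanted = [code.strip() for code in use_languages.split(",")]
    return [locale for locale in supplied if locale in wanted]


def show_languages_button(available, show_languages=None):
    """Decide whether the "Languages" button is shown."""
    if show_languages is not None and show_languages.strip().lower() == "no":
        return False
    # The button is only useful when there is a choice to make.
    return len(available) > 1
```

For example, with UseLanguages set to "en" only one language remains available, so the button is hidden even without a ShowLanguages parameter.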

Language file string name definitions

The following table lists all the localisable strings supported by FindinSite-CD and FindinSite-JS, together with the English default value. The Language, Country, Localname and Englishname strings must be provided in a language file, but cannot be changed using Lang_XXX parameters.

In most cases the string name is self explanatory. However, a special description is added where necessary. Note carefully that some strings require spaces at the beginning and end. As can be seen, you can use basic HTML in some strings, as described in the Screen layout - Results layout section.

String Name | Special Description | English
Header
Language | The language code: lower-case two-letter | en
Country | The country code: upper-case two-letter |
Localname | Language name in the language | English
Englishname | Language name in English | English
Logical operators
L_AND | Must be a single word | AND
L_OR | Must be a single word | OR
L_NOT | Must be a single word | NOT
L_NEAR | Must be a single word. (Not used yet.) | NEAR
Button labels
L_SEARCH | Pad with spaces at either end if L_STOP is longer | Search
L_STOP | | Stop
L_HELP | | Help
L_SUBSETS | | Subsets
L_INDEXES | | Indexes
L_LANGUAGES | | Languages
Status messages
L_LOADING | Displayed as search databases loaded at startup | Please wait...
L_READING | Followed by search database URL | Reading
L_HELP_UNREADABLE | | The search database files could not be read.
L_SHOWING | Followed by page filename; shown in browser status bar | Showing
L_POST_SHOWING | Preceded by page filename; shown in browser status bar |
Main help text
L_ENTER_TEXT | | Enter your search text in the box above and click on Search.
L_HELP_SEL_PAGE | | To view a page in the results list, click on its title.
L_HELP_TOP | | <B>FindinSite-CD</B> finds pages that contain all the words in your search text anywhere on the page.
L_HELP_MATCH | | Use single quotes <B>' '</B> to find matching capital letters.
L_HELP_ADJACENT | | Use double quotes <B>" "</B> to find adjacent words.
L_HELP_WILD | | Use <B>?</B> to match exactly one character and <B>*</B> to match any number of characters.
L_HELP_LOGICAL_OPERATORS1 | "AND, OR, NOT" put after this phrase and before L_HELP_LOGICAL_OPERATORS2 | Use <B>
L_HELP_LOGICAL_OPERATORS2 | | </B> and parentheses <B>(</B> <B>)</B> to do logical searches.
Help text: information display
L_INDEX | | Index:
L_DESCRIPTION | | Description:
L_CONTAINS | | Contains:
L_PAGES_AND | | \ pages and
L_WORDS | | \ words.
L_CREATED | | Created:
L_FILE | | File:
L_SITE | | Site:
L_LANGUAGE | | Language:
L_USER_LOCALE | | User locale:
L_RULESET | | Rules:
Search text error reporting
L_NO_SEARCH | | Nothing to search for
L_FOUND_UNORDERED | | These words were found, but not in this order:&nbsp;
L_WORDS_NOT_FOUND | | These words were not found in any pages:
L_ABORTED | | Search aborted
L_NO_CONTIG | | Sorry, adjacent word searches are not supported by this search database
L_2DQ_NEEDED | | Sorry, your search text has mismatched double-quotes
L_NO_EXACT | | Sorry, this search database does not store different letter cases
L_MISMATCHED_PARENTHESES | | Sorry, your search text has mismatched parentheses ( and )
L_BAD_BRACKETS | | Parentheses not allowed within "double quotes"
L_INCORRECT_PLACE | Preceded by AND, OR or NOT | \ in incorrect place
L_BAD_WILD | | * and ? not allowed within 'single quotes'
L_BAD_ASTERISK_DQ | | * not allowed within "double quotes"
Results reporting
L_PAGE | Used when reporting "1 page found" | \ page
L_FOUND | Used when reporting number of pages found, except if overridden by L_FOUND_ZERO or L_FOUND_PLURAL. | \ found
L_PAGES | | \ pages
L_PRE_FOUND | Used in languages where found appears before number when reporting "10 pages found" |
L_FOUND_PLURAL | If specified, then plural of "found" |
L_FOUND_ZERO | If specified, then use to report "0 pages found" |
L_PAGES_ZERO | If specified, then use to report "0 pages found" |
L_STOP_WORDS | | Ignored common words:\ 
Subsets messages
L_SELECT_SUBSETS | | Choose subsets to include in search:
L_DATABASES | | databases
L_NO_SUBSETS_SELECTED | | No subsets are selected.
L_SUBSET_SELECTALL | | Select all
L_SUBSET_DIVIDE | | \ /\ 
L_SUBSET_SELECTNONE | | Select none
Indexes messages
L_SELECT_INDEX | | Select index:
Languages messages
L_SELECT_LANG | | Select language:
SearchAndGetResults() messages
L_SEARCH_WAIT | The message returned if the database is still being loaded | Search database loading... please try again soon.
L_SEARCH_TIMEDOUT | The message returned if a search has timed out | Search timed out.
FindinSite-JS and FindinSite-MS
L_APPNAME | | FindinSite-JS or FindinSite-MS
L_APPNAME_HTML | | findinsite-js or findinsite-ms
L_SEARCH_ENGINE | | search engine
L_HELP_TOP_SERVER | | <B>FindinSite-JS</B> finds pages that contain all the words in your search text anywhere on the page.
L_SEARCH_FOR | | Search for:
L_SEARCH_RESULTS_FOR | | Search results for:
L_PREVIOUS | | Previous
L_NEXT | | Next
L_FIRST | | First
L_LAST | | Last
L_OF | | of
L_RESULTS | | Results
L_SECONDS | | seconds
L_HTML_TAG | Set to DIR=rtl for languages that read from Right To Left |
L_BODY_TAG | Set to DIR=rtl for languages that read from Right To Left |
L_ALIGN_TAG | Set to ALIGN=right for languages that read from Right To Left |
All site Copyright © 1996-2011 PHD Computer Consultants Ltd, PHDCC

Last modified: 22 March 2010.
