#LyX 1.3 created this file. For more info see http://www.lyx.org/ \lyxformat 221 \textclass article \begin_preamble \usepackage{url} \end_preamble \language english \inputencoding latin1 \fontscheme default \graphics default \paperfontsize default \spacing single \papersize a4paper \paperpackage a4 \use_geometry 0 \use_amsmath 0 \use_natbib 0 \use_numerical_citations 0 \paperorientation portrait \secnumdepth 3 \tocdepth 3 \paragraph_separation indent \defskip medskip \quotes_language english \quotes_times 2 \papercolumns 1 \papersides 1 \paperpagestyle default \layout Title The forindex utilities \newline Version 0.1 \layout Author Guido Milanese \newline \begin_inset ERT status Collapsed \layout Standard \backslash url{guido.milanese@unicatt.it} \end_inset \layout Standard \begin_inset ERT status Collapsed \layout Standard \backslash thispagestyle{empty} \end_inset \layout Date January 2005 \layout Abstract \noindent Making a good index is an important part of writing a document, particularly a book or a manual. Entering index data by hand can take a very long time; although a certain amount of data must be entered manually, some tasks, such as building an index of geographical names, can be performed automatically. The program \family sans doindex \family default does this: it prepares a file to be processed by \family sans makeindex \family default . A second program, \family sans cleanindex \family default , removes all the \family typewriter \backslash index \family default entries from a LaTeX file, producing a clean file with no indexing. The programs are written in Snobol4; the only requirement is to install the interpreter. For Windows, standalone executables compiled with Spitbol are also provided, so the programs can be run without an external interpreter. 
\layout Standard \begin_inset LatexCommand \tableofcontents{} \end_inset \layout Section The programs \layout Subsection The program \family sans doindex \layout Standard The program \family sans doindex \family default reads a LaTeX file and a list file, and adds index entries to the LaTeX text according to this list. Previously entered index entries are left unchanged, so further indexing can be added to an already indexed file. \layout Standard The input LaTeX file may have the extension \family typewriter tex \family default or \family typewriter latex \family default , either all uppercase or all lowercase (not mixed case, as in \family typewriter Tex \family default ). \layout Standard The list file is meant to contain all the words to be indexed. It must have exactly the extension \family typewriter wls \family default . Sub-entries are identified with the separator character '/'. See \family sans test.wls \family default as an example: \layout Quote \begin_inset ERT status Collapsed \layout Standard \backslash begin{verbatim} \newline animals \newline dogs/animals \newline cats \newline house/nouns/english/languages \newline sleeping@sleep \newline évita/Italian/foreign words \newline ça/French/foreign words \newline drücken/German/foreign words \newline \backslash end{verbatim} \end_inset \layout Standard No particular order is required in this file. Some users will prefer alphabetical order, others a different order, so the program imposes no requirements on how the file is sorted. Entries such as \family typewriter sleeping@sleep \family default use the standard \family sans makeindex \family default syntax and are left unchanged. The \begin_inset Quotes eld \end_inset logic \begin_inset Quotes erd \end_inset of the '/' syntax is the opposite of the internal logic of \family sans makeindex \family default , which is -- I think -- very clever at the stage of typesetting an index, but not at the stage of designing an index. 
\begin_inset Quotes eld \end_inset A dog is an animal \begin_inset Quotes erd \end_inset ( \family typewriter dog/animal \family default in my syntax) seems to me more natural than \begin_inset Quotes eld \end_inset Among animals there are dogs \begin_inset Quotes erd \end_inset ( \family typewriter animals!dogs \family default in the \family sans makeindex \family default syntax). The LaTeX file produced by \family sans doindex \family default follows, of course, the \family sans makeindex \family default conventions. \layout Standard The original LaTeX file is left unchanged. A new file is written, identified by the suffix \family sans -ind \family default . For example, from \family sans file.tex \family default you will get \family sans file-ind.tex \family default . Of course, you still have to run \family sans makeindex \family default as usual. \layout Standard The purpose of the program is similar to that of the program \begin_inset ERT status Collapsed \layout Standard \backslash mbox{ \end_inset \family sans ixgen \family default \begin_inset ERT status Collapsed \layout Standard } \end_inset ( \begin_inset ERT status Collapsed \layout Standard \backslash url{http://www.iit.upco.es/~oscar/ixgen/} \end_inset ) written by \shape smallcaps Oscar Lopez \shape default ( \begin_inset ERT status Collapsed \layout Standard \backslash url{oscar@iit.upco.es} \end_inset ), but \family sans forindex \family default was designed to be a bit more flexible. \layout Subsection The program \family sans cleanindex \layout Standard The program \family sans cleanindex \family default removes \family sans \backslash index \family default sequences from a LaTeX file. It is useful, for example, when a user is not happy with the indexing of a file and wants to start over. \layout Standard The input file may have the extension \family typewriter tex \family default or \family typewriter latex \family default , either uppercase or lowercase. 
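\layout Standard The removal step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Snobol4 code of \family sans cleanindex \family default : the function name \family typewriter strip_index \family default is hypothetical, the sketch assumes that each \family typewriter \backslash index \family default argument is a brace-balanced group, and it only deletes the commands without doing anything else to the file.

```python
def strip_index(text: str) -> str:
    """Remove every \\index{...} command, honouring nested braces.

    Hypothetical helper for illustration; not taken from cleanindex itself.
    """
    marker = "\\index{"
    out = []
    pos = 0
    while True:
        start = text.find(marker, pos)
        if start == -1:
            # No further \index commands: keep the remaining text.
            out.append(text[pos:])
            break
        out.append(text[pos:start])
        depth = 1
        i = start + len(marker)
        while i < len(text) and depth > 0:
            ch = text[i]
            if ch == "\\" and i + 1 < len(text):
                # Skip an escaped character such as \{ or \}.
                i += 2
                continue
            if ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
            i += 1
        if depth != 0:
            # Unbalanced braces: leave the rest of the text untouched.
            out.append(text[start:])
            break
        pos = i
    return "".join(out)
```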
\layout Standard The original file is left unchanged. A new file is written, identified by the suffix \family sans -noind \family default . For example, from \family sans file.tex \family default you will get \family sans file-noind.tex \family default . In this file, lines concerning \family sans makeindex \family default are kept but commented out, in order to avoid an empty \emph on Contents \emph default section in the output. You can uncomment these lines when you want to index the file again. \layout Section Installation \layout Subsection GNU/Linux and other *nix systems \layout Enumerate Install \family sans snobol4 \family default from \begin_inset ERT status Collapsed \layout Standard \backslash url{http://www.snobol4.org} \end_inset . This is Philip Budne's CSNOBOL implementation. You need a \family typewriter C \family default compiler to build the interpreter; this is normally a very quick and easy process. \layout Enumerate Make sure \family sans snobol4 \family default is in your PATH, or make a symbolic link to it. \layout Enumerate Copy all the files from the source directory into a suitable directory (you do not need the \family typewriter bat \family default files, which are provided for Windows, and can safely remove them). \layout Enumerate Make the scripts ( \family sans doindex \family default and \family sans cleanindex \family default , with no extension) executable, e.g. \family typewriter chmod +x doindex \layout Enumerate Run the scripts as follows: \begin_deeper \layout Enumerate -- to index a text: \family typewriter ./doindex file.tex \layout Enumerate If you want to exclude words with accents: \family typewriter ./doindex file.tex --noacc \layout Enumerate -- to remove \family sans \backslash index \family default sequences: \family typewriter ./cleanindex file.tex \layout Enumerate If the current directory is in your PATH, you do not need \family typewriter ./ \family default before the script name. 
\newline \end_deeper \layout Subsection Windows \layout Standard The package offers exe files compiled with Spitbol (see \begin_inset ERT status Collapsed \layout Standard \backslash url{http://www.snobol4.com} \end_inset ). Make a directory and copy into it all the files from the bin/windows directory. There should be two \family typewriter *.exe \family default files and the two \family typewriter test.* \family default files. \layout Standard Run the programs as follows: \layout Standard -- to index a text: \family typewriter doindex file.tex \layout Standard If you want to exclude words with accents: \family typewriter doindex file.tex -noacc \family default . Accents must be encoded using the \family sans latin1 \family default encoding (see the TODO list). \layout Standard -- to remove \family typewriter \backslash index \family default sequences: \family typewriter cleanindex file.tex \layout Subsection Windows from source \layout Standard Basically, follow the directions given for GNU/Linux, but make sure to use the \family typewriter bat \family default files and to install the Windows version of the interpreter. The sources are in Unix format, so before using them, run a script that converts them from Unix to DOS/Windows format. If you do not have such a script, open the files with a text editor and save them in DOS/Windows format. This can be done by reading and saving each file with the DOS \family sans edit \family default program, with \family sans vim \family default , or with any other editor able to deal with different file formats. Do not alter the files if you are not sure of what you are doing. Please (1) do not use a word processor (such as Word) but a simple text editor, and (2) make sure to keep the encoding of the file \family sans acc.inc \family default as \family sans ISO-8859-1 \family default or \family sans 8859-15 \family default , not plain DOS or Unicode. 
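\layout Standard If no conversion script is at hand, the Unix to DOS/Windows translation mentioned above can be sketched in Python. This is a minimal sketch and not part of the package: the function names are hypothetical, and the conversion works on raw bytes on the assumption that this is the simplest way to leave the ISO-8859-1 encoding of the sources untouched.

```python
from pathlib import Path


def to_dos_line_endings(data: bytes) -> bytes:
    """Convert LF line endings to CRLF without doubling existing CRLF pairs."""
    # Normalise first, so a file that is already in DOS format is not changed.
    normalized = data.replace(b"\r\n", b"\n")
    return normalized.replace(b"\n", b"\r\n")


def convert_file(path: str) -> None:
    """Rewrite one file in place with DOS/Windows line endings.

    The file is read and written as bytes, so accented characters keep
    whatever single-byte encoding they already have.
    """
    p = Path(path)
    p.write_bytes(to_dos_line_endings(p.read_bytes()))


# Example call (file name taken from the text above):
# convert_file("acc.inc")
```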
\layout Subsection Cygwin \layout Standard I suggest following the directions given for GNU/Linux, but the EXE files provided for native Windows can be used instead if preferred. \layout Subsection Macintosh \layout Standard Not yet tested (I do not have a Mac right now). It's in the TODO list. \layout Section Test files \layout Standard Please test the programs on \family sans test.tex \family default and \family sans test.wls \family default . The output file will be called \family sans test-ind.tex \family default if you use \family sans doindex \family default , \family sans test-noind.tex \family default if you use \family sans cleanindex \family default . \layout Section Bugs and TODO \layout Standard The programs do not support Unicode files. At the moment, most LaTeX users are still using latin1, but the situation is rapidly changing. \layout Standard List of features that I would like to add: \layout Enumerate Index included files as well. \layout Enumerate Add typographical styles, such as italics for the most important locations of a word. \layout Enumerate Add support for several indexes (particularly with class \family sans memoir \family default ) \layout Enumerate Add an option to generate a rough index for all the words. \layout Enumerate Add support for indexing words listed with regular expressions. E.g. \family typewriter read* \family default should index \emph on read \emph default , \emph on reads \emph default , \emph on reading \emph default , \emph on readings \emph default , all under the same heading \emph on read \emph default . \layout Enumerate Make it possible to use another separator in the list file, e.g. a simple blank or another character preferred by the user. \layout Enumerate Test the programs on a Mac. \layout Enumerate Add Unicode support. \layout Section Acknowledgements \layout Standard The program \family sans ixgen \family default gave me the idea of \family sans forindex \family default . 
Many thanks to \shape smallcaps Oscar Lopez \shape default for this very good program. \layout Standard Some questions sent by \shape smallcaps Carlo Pellegrino \shape default (Modena University, Italy) gave me the idea of transforming a very rudimentary script into a general purpose utility. \shape smallcaps Maurizio Loreti \shape default (Padua University, Italy) sent me very useful remarks on the problems of the automatic generation of indexes, which I made use of in the introduction to this text. \layout Standard My warmest thanks to \shape smallcaps Phil Budne \shape default ( \begin_inset ERT status Collapsed \layout Standard \backslash url{phil@ultimate.com} \end_inset ) for making his excellent CSNOBOL available. Many thanks to the community of Snobol users, particularly to the members of the list \begin_inset ERT status Collapsed \layout Standard \backslash url{snobol4@mercury.dsu.edu} \end_inset , and, among them, to \shape smallcaps Gordon Peterson \shape default ( \begin_inset ERT status Collapsed \layout Standard \backslash url{http://personal.terabites.com/} \end_inset ), \shape smallcaps Michael Radow \shape default ( \begin_inset ERT status Collapsed \layout Standard \backslash url{mikeradow@yahoo.com} \end_inset ), \shape smallcaps Gregory L. White \shape default ( \begin_inset ERT status Collapsed \layout Standard \backslash url{glwhite@netconnect.com.au} \end_inset ) and to \shape smallcaps Rafal M. Sulejman \shape default ( \begin_inset ERT status Collapsed \layout Standard \backslash url{rafal@engelsinfo.de} \end_inset ) whose \family sans vim \family default syntax files are a daily blessing. \layout Standard Thanks to Jim Hefferon < \begin_inset ERT status Collapsed \layout Standard \backslash url{ftpmaint@alan.smcvt.edu} \end_inset > who pointed out that the original name of the package, \family sans 4index \family default , was not acceptable due to XML syntax rules. 
\layout Section Author, copyright, license, disclaimer \layout Standard This program is Copyright \begin_inset ERT status Collapsed \layout Standard \backslash copyright{} \end_inset 2005 \newline Guido Milanese < \begin_inset ERT status Collapsed \layout Standard \backslash url{guido.milanese@unicatt.it} \end_inset > \newline under the terms of the GNU General Public License. \newline \layout Standard \size footnotesize \begin_inset ERT status Collapsed \layout Standard \backslash begin{verbatim} \newline This program is free software; you can redistribute it and/or modify it under \newline the terms of the GNU General Public License as published by the Free Software \newline Foundation; either version 2 of the License, or (at your option) any later \newline version. \newline \newline This program is distributed in the hope that it will be useful, but WITHOUT ANY \newline WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A \newline PARTICULAR PURPOSE. See the GNU General Public License for more details. \newline \newline If you do not have a copy of the GNU General Public License write to the Free \newline Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. \newline \newline If the author of this software was too lazy to include the full GPL text along \newline with the code, you can find it at: http://www.gnu.org/copyleft/gpl.html. \newline \backslash end{verbatim} \end_inset \the_end