Currently only tabs and blanks are used for tokenizing the description,
which breaks when a term is at the end of a line or has () appended to
it.
1. Also split on other whitespace characters such as newlines and
carriage returns.
2. Remove some common non-word characters from the token before lookup.
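
The two steps above can be sketched roughly as follows. This is only an
illustration, not the actual apibuild.py code: the helper name tokenize()
and the exact set of stripped characters are assumptions.

```python
# Hypothetical set of "common non-word characters"; the real list in
# apibuild.py may differ.
STRIP_CHARS = "()[]{}.,;:!?\"'"


def tokenize(description):
    """Return lookup tokens from a description string.

    str.split() with no argument splits on any run of whitespace
    (blanks, tabs, newlines, carriage returns), which covers step 1.
    Step 2 strips the non-word characters from both ends of each token.
    """
    tokens = []
    for raw in description.split():
        token = raw.strip(STRIP_CHARS)
        if token:  # drop tokens that consisted only of punctuation
            tokens.append(token)
    return tokens
```

With this sketch a term followed by "()" or placed at the end of a line
yields the same token as the bare term, so the lookup no longer misses it.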
Signed-off-by: Philipp Hahn <hahn@univention.de>
This fixes a number of issues, most of them raised by Eric Blake on the
generated documentation output:
- parsing of "long long int" and similar
- add parsing of unions within a struct
- remove spurious " * " from comments on structure fields and enums
- fix concatenation of base type and name in arrays
- extend XSLT to cope with union in structs
* docs/apibuild.py: fix and extend API extraction tool
* docs/newapi.xsl: extend the stylesheets to cope with union in
public structures
* docs/ChangeLog.xsl docs/newapi.xsl docs/site.xsl: change all
stylesheets to output UTF-8 HTML instead of ISO Latin 1 which was
breaking on some people's names.