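indexer detects the character set of a document by checking the following sources in order and using the first one that specifies a charset:
1. The HTTP response header "Content-type: text/html; charset=xxx"
2. The META tag <META NAME="Content" CONTENT="text/html; charset=xxx"> in the document
3. The default from the "Local Charset" field in Common Parameters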
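
The detection order described above can be illustrated with a minimal Python sketch. This is not the indexer's actual code: the function name detect_charset, its parameters, and the regular expressions are hypothetical and chosen only for illustration. As a simplification, the sketch accepts a charset found in any META tag.

```python
import re

# Hypothetical patterns for illustration; the real indexer's parsing may differ.
_CHARSET_RE = re.compile(r'charset\s*=\s*["\']?([A-Za-z0-9._:-]+)', re.IGNORECASE)
_META_RE = re.compile(r'<meta\b[^>]*>', re.IGNORECASE)


def detect_charset(content_type_header, html, local_charset):
    """Return the charset from the first source that specifies one."""
    # 1. HTTP header, e.g. "Content-type: text/html; charset=xxx"
    if content_type_header:
        match = _CHARSET_RE.search(content_type_header)
        if match:
            return match.group(1).lower()

    # 2. META tag in the document body
    for tag in _META_RE.findall(html or ""):
        match = _CHARSET_RE.search(tag)
        if match:
            return match.group(1).lower()

    # 3. Fall back to the configured default ("Local Charset")
    return local_charset
```

For example, calling detect_charset with a header that carries no charset and a page containing the META tag shown above would return the charset from the META tag; with neither present, the configured default is returned.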