1 Generating HTML Articles


This routine creates HTML articles in htdocs/ and updates index.html
and thread.html, so that articles can be read through a WWW server
without CGI, SSI and so on. WWW server administrators often do not
provide such functions, for security reasons and to avoid high load.
Under such circumstances you can still publish ML articles as static
HTML. CGI is more flexible but may be unsafe. The FML HTML function
lacks the features that CGI can provide, but it is not affected by
the WWW server operation policy (security and high load average). It
is a trade-off, but I believe it is effective in some environments.

1.1	Automatically generating HTML articles from ML spool ($DIR/spool)


When $AUTO_HTML_GEN = 1; is set, FML generates both plain articles in
$DIR/spool and HTML articles in $DIR/$HTML_DIR ($DIR/htdocs/).


Internally, FML runs the HTML generator functions as a hook. Running
the external program SyncHtmlfiles.pl is the old design and is now
obsolete.


FML creates HTML articles in sub-directories under htdocs/. A
sub-directory is created per day, per week, per month, or per range of
article sequence numbers. The unit is set by $HTML_INDEX_UNIT.

Example:

	htdocs/
	htdocs/19980120/
	htdocs/19980120/1.html
	htdocs/19980120/2.html
	.....
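
As a minimal sketch, the settings described above could be written in
the ML's configuration file (config.ph in a typical installation) as
follows. The values are examples only; 'day' is the documented default
unit and htdocs is the directory name used throughout this chapter.

```perl
# Configuration fragment (illustrative values only)
$AUTO_HTML_GEN   = 1;         # generate HTML articles as well as spool articles
$HTML_DIR        = 'htdocs';  # HTML articles go to $DIR/htdocs/
$HTML_INDEX_UNIT = 'day';     # one sub-directory per day, e.g. htdocs/19980120/
```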

1.2	&SyncHtml($HTML_DIR, $ID, *Envelope);

SYNOPSIS:

	&SyncHtml($HTML_DIR, $ID, *Envelope);

	$HTML_DIR 	Location where HTML articles are stored.
	$ID		Current article ID. The global variable $ID is
			defined in &Distribute.
	*Envelope	%Envelope

Example: 

	$FML_EXIT_HOOK .= q#;
		&use('synchtml');
		$HTML_DIR = 'htdocs';
		&SyncHtml($HTML_DIR, $ID, *Envelope);
	#;


Running &SyncHtml creates

	htdocs/index.html 
	htdocs/ID.html 


index.html has the following list structure.

	<UL>
	<LI>
	<A HREF=1310.html>
	96/09/16 18:12:33 Subject
	</A>


If $HTML_INDEX_REVERSE_ORDER is set (default), FML generates <LI>
entries in reverse order, so the latest article is at the top of
index.html.
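
The ordering can be illustrated with a small stand-alone sketch. It is
not FML's own code: index_entries and the sample articles are
hypothetical, and the $reverse argument merely mimics the effect of
$HTML_INDEX_REVERSE_ORDER on the list shown above.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of FML: build the <LI> lines of an index.
# Each entry is [ID, date string, subject]; entries arrive oldest first.
sub index_entries {
    my ($reverse, @entries) = @_;
    @entries = reverse @entries if $reverse;   # latest article first
    return map {
        my ($id, $date, $subject) = @$_;
        "<LI>\n<A HREF=$id.html>\n$date $subject\n</A>\n";
    } @entries;
}

my @articles = (
    [1309, '96/09/16 17:40:02', 'first subject'],
    [1310, '96/09/16 18:12:33', 'second subject'],
);
print "<UL>\n", index_entries(1, @articles), "</UL>\n";
```

With $reverse set, the entry for 1310.html is printed before the one
for 1309.html.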


The cache file is $HTML_DATA_CACHE (default is .indexcache). 
Each directory has its own cache file. 


The cache file for threading is $HTML_DATA_THREAD.


spool2html.pl is the command line interface.

1.3	Unit of HTML directory

	$HTML_INDEX_UNIT (default is 'day')


	value:	

		"day"
		"week"
		"month"
		"infinite"
		number (e.g. 100, 1000, ...)


$HTML_INDEX_UNIT is the unit of sub-directories under htdocs/. The
default unit is 'day': FML creates one sub-directory per day and
stores the HTML articles of that day in it.


Example: Creation of HTML articles on 1996/09/16

	htdocs/19960916/index.html
	htdocs/19960916/100.html
	htdocs/19960916/101.html
	...


If $HTML_INDEX_UNIT is a number (e.g. 100), each sub-directory holds
that many (here 100) HTML articles.

Example:

	htdocs/100/
	htdocs/100/101.html
	htdocs/100/102.html
	...
	htdocs/200/
	htdocs/200/201.html
	htdocs/200/202.html
	...
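
The mapping from an article to its sub-directory can be sketched as
below. This is only an illustration that reproduces the two listings
above; html_subdir is a hypothetical helper, the rounding rule for
numeric units is inferred from the example (101.html under
htdocs/100/), and the real code in FML may differ. 'week' and 'month'
are left out because their directory names are not shown here.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper, not part of FML: guess the sub-directory name
# from the unit, the article ID and the article time (epoch seconds).
sub html_subdir {
    my ($unit, $id, $time) = @_;
    return '' if $unit eq 'infinite';            # directly under $HTML_DIR
    # gmtime keeps the example reproducible; local time is more likely
    # what a real installation wants.
    return strftime('%Y%m%d', gmtime($time)) if $unit eq 'day';
    # Assumed rule for numeric units: round the ID down to a multiple.
    return int($id / $unit) * $unit if $unit =~ /^\d+$/ && $unit > 0;
    die "unit '$unit' is not covered by this sketch\n";
}

print html_subdir('day', 100, 842832000), "\n";  # prints 19960916
print html_subdir(100, 101, 0), "\n";            # prints 100
```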

1.4	$HTML_INDEX_UNIT == "infinite"


FML generates all HTML files in a single directory, directly under
$HTML_DIR. Do not use this setting for an ML with heavy traffic, since
the consistency check routine alone takes a very long time to run.

1.5	thread.html; Threading


The index file for threaded HTML articles (thread.html) is created when

	$HTML_THREAD = 1; 


FML threading depends on the In-Reply-To: and References: fields.
Hence FML cannot thread mails sent from some MUAs such as Eudora,
since they ignore the In-Reply-To: and References: fields.

	thread.html
	index.html
		SUB-DIRECTORY/thread.html
		SUB-DIRECTORY/index.html
	...


$HTML_THREAD_REF_TYPE defines which references between articles are
used for threading. By default FML uses all message-ids in References:
and In-Reply-To:, even if this shows several (duplicated) links in
thread.html. When

	$HTML_THREAD_REF_TYPE = "prefer-in-reply-to"; (3.0 default)


FML selects a single message-id from References: and In-Reply-To: in
the following order of preference.

	1.	In-Reply-To: 
	2.	If no In-Reply-To: is given,
		the last message-id in References:
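
The selection rule for "prefer-in-reply-to" can be sketched as follows.
This is a stand-alone illustration, not FML's code: message_ids and
thread_parent are hypothetical names, and taking the first message-id
when In-Reply-To: contains several is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: extract every <message-id> from a header field body.
sub message_ids {
    my ($field) = @_;
    return () unless defined $field;
    my @ids = $field =~ /(<[^<>\s]+>)/g;
    return @ids;
}

# Hypothetical helper: choose the single parent article of a reply.
sub thread_parent {
    my ($in_reply_to, $references) = @_;
    my @irt = message_ids($in_reply_to);
    return $irt[0] if @irt;               # 1. In-Reply-To:
    my @refs = message_ids($references);
    return @refs ? $refs[-1] : undef;     # 2. last message-id in References:
}

print thread_parent('<b@example.org>', '<a@example.org> <b@example.org>'), "\n";
print thread_parent(undef, '<a@example.org> <c@example.org>'), "\n";
```

The first call prints <b@example.org> (taken from In-Reply-To:), the
second prints <c@example.org> (the last entry of References:).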

1.6	Variables to Customize HTML File 

	$HTML_FORMAT_PREAMBLE
		From the beginning of the file to <HEAD>

	$HTML_DOCUMENT_SEPARATOR
		</HEAD><BODY> (default)

		(main body)

	$HTML_FORMAT_TRAILER
		</BODY></HTML> (default) 


The counterparts for index.html are

	$INDEX_HTML_DOCUMENT_SEPARATOR
	$INDEX_HTML_FORMAT_PREAMBLE
	$INDEX_HTML_FORMAT_TRAILER
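
For instance, a footer could be added to every article by overriding
the trailer. This is only a sketch: the footer text is an invented
example, and the closing tags are copied from the default value listed
above.

```perl
# Configuration fragment (illustrative; the footer text is made up)
$HTML_FORMAT_TRAILER = "<HR>\nArchive of our mailing list\n</BODY></HTML>\n";
```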


The header fields listed in @HtmlHdrFieldsOrder are written to HTML
files. If $HTML_HEADER_TEMPLATE is defined, only its content is
written and @HtmlHdrFieldsOrder is ignored.

1.7	HTML 4.0


HTML 4.0 support comes from a patch by

	From: TAMURA Kent <kent@hauN.org>
	fml-support: 04153


Thank you for the contribution.


HTML 4.0 uses CSS as a standard. $HTML_STYLESHEET_BASENAME is the
style sheet file. Mind the relative path from the directory that holds
the articles.

	$HTML_STYLESHEET_BASENAME = "../fml.css";

An example style sheet is installed automatically when automatic HTML
generation is used.

1.8	Expiration over HTML articles

	$HTML_EXPIRE_LIMIT	(default 0)


$HTML_EXPIRE_LIMIT is the expiration limit for HTML articles. If it is
<= 0, expiration does not run, so HTML articles only accumulate :).


The current expiration algorithm is as follows.
First, the HTML directory has the following structure.

	thread.html
	index.html
		SUB-DIRECTORY/thread.html
		SUB-DIRECTORY/index.html
		SUB-DIRECTORY/100.html


If FML removed articles one by one, it would need consistency checks
in each sub-directory to re-create index.html and so on, and keeping
everything consistent is difficult. Hence our algorithm is "remove the
whole sub-directory once all articles in the sub-directory are
expired". After removing it, FML re-creates the top index.html and
thread.html.


The expiration code has some overhead. It is of little use to run it
each time FML runs, since the algorithm works on whole directories and
an expiration actually happens only occasionally. By default FML runs
the expiration code once every $HTML_EXPIRE_LIMIT * 5 runs.
$HTML_EXPIRE_SCAN_UNIT controls this value.
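
The "whole sub-directory" rule can be sketched as follows. This is not
FML's implementation: expired_subdirs is a hypothetical helper, and it
assumes that the limit is counted in days and that an article's age is
judged by a timestamp in epoch seconds.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of FML. $dirs maps a sub-directory name
# to the list of article timestamps (epoch seconds) found in it.
# Returns the sub-directories whose articles are ALL older than the limit.
sub expired_subdirs {
    my ($dirs, $limit_days, $now) = @_;
    return () if $limit_days <= 0;            # expiration is disabled
    my $threshold = $now - $limit_days * 24 * 60 * 60;
    my @expired;
    for my $dir (sort keys %$dirs) {
        my @times = @{ $dirs->{$dir} };
        next unless @times;                   # this sketch skips empty directories
        my @alive = grep { $_ >= $threshold } @times;
        push @expired, $dir unless @alive;    # remove only if nothing is alive
    }
    return @expired;
}

my $day = 24 * 60 * 60;
my $now = 100 * $day;
my %dirs = (
    '19960901' => [ 10 * $day, 11 * $day ],   # every article is old
    '19960916' => [ 12 * $day, 99 * $day ],   # one recent article remains
);
print join(' ', expired_subdirs(\%dirs, 14, $now)), "\n";   # prints 19960901
```

A directory that still holds one recent article is kept as a whole, so
no per-article consistency check is needed.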

1.9	BASE64 Decoding


If $BASE64_DECODE is defined, FML tries to decode BASE64 parts
contained in the mail when an HTML article is created.

Example:
	$BASE64_DECODE = "/usr/local/bin/mewdecode";


FML uses bin/base64decode.pl by default when $BASE64_DECODE is not
defined (2.2A#11 and later).

1.10	$HTML_OUTPUT_FILTER


When creating HTML articles, FML applies $HTML_OUTPUT_FILTER as a filter.

1.11	$HTML_TITLE_HOOK


$HTML_TITLE_HOOK is evaluated just before saving HTML files.


Example: changing the subject (title) of an HTML article.

	$HTML_TITLE_HOOK = q#$HtmlTitle = sprintf("%s %s", $Now, $HtmlTitle);#;


1.12	Classification by keywords (obsolete?)

* please ignore ;D

1.13	&TalkWithHttpServer

SYNOPSIS:
	&TalkWithHttpServer($host, $port, $request, $tp, *e); 

	$host		host	(e.g. www.iij.ad.jp)
	$port		port	(70, 80, 21, ...)
	$request	request (e.g. http://www.fml.org/)
	$tp		TYPE OF PROTOCOL (http, gopher, ftp, ...)
	*e		stab (typeglob) that holds the result


If the request does not begin with http://host, the WWW server to
prepend is

	$DEFAULT_HTTP_SERVER


and the default port number is

	$DEFAULT_HTTP_PORT 


For gopher, the corresponding variables are

	$DEFAULT_GOPHER_SERVER
	$DEFAULT_GOPHER_PORT 


Example:

    if ($tp =~ /http/i) {
	$host = $host || $DEFAULT_HTTP_SERVER;
	$port = $port || $DEFAULT_HTTP_PORT || 80;

	# connect http server
	&Mesg(*e, ">>> HREF=$tp://$host$request\n");
	&TalkWithHttpServer($host, $port, $request, $tp, *e); 
	&Mesg(*e, ">>> ENDS HREF=$tp://$host$request\n");
    } 

1.14	Server that fetches the requested URL and sends it back


	$START_HOOK = q#
	    require 'libhref.pl';
	    &HRef($Envelope{'Body'}, *Envelope);
	    $DO_NOTHING = 1;
	#;


It sends back the URL contents as an "fml status report".

1.15	Download URL's content

SYNOPSIS:
    &HRef($request, *Envelope);


Downloads the content of the URL given in $request and stores it in
$Envelope{'message'}. The following request types are available.

	http://
	gopher://
	ftp://


An ftp:// request is automatically handled either as a local request
or through ftpmail. 

1.16	NOTE: Special Character (libsynchtml.pl)


	&lt;		<
	&gt;		>
	&amp;		&
	&quot;		"

	&ouml;		ö
	&ntilde;	ñ
	&Egrave;	È
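
As an illustration of why these entities matter, the sketch below
replaces the four markup-significant characters in a piece of text.
escape_html is a hypothetical helper written for this note; it is not
the routine used in libsynchtml.pl.

```perl
use strict;
use warnings;

# Hypothetical helper, not the code in libsynchtml.pl.
my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');

sub escape_html {
    my ($text) = @_;
    # One pass over the text, so an inserted "&" is never escaped twice.
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

print escape_html('Re: <a@b> & "c"'), "\n";
# prints: Re: &lt;a@b&gt; &amp; &quot;c&quot;
```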