overview.html

   1 <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
   2 <HTML>
   3 <HEAD>
   4 <TITLE>EXIM OVERVIEW</TITLE>
   5 <META NAME="generator" CONTENT="txt2html v1.21">
   6 </HEAD>
   7 <BODY bgcolor="#ccccff">
   8 <H1>EXIM OVERVIEW</H1>
   9
  10
  11 <P>
  12 Date: 29 November 1996
  13
  14 <P>
  15 Exim is a mail transport agent (MTA) developed at the University of Cambridge
  16 for use on Unix systems connected to the Internet. It is freely available
  17 under the terms of the GNU General Public Licence. In style it is similar to
  18 Smail 3, but its facilities are more extensive, and in particular it has some
  19 defences against mail bombs and unsolicited junk mail, in the form of options
  20 for refusing messages from particular hosts, networks, or senders.
  21
  22 <P>
  23 Exim is in production use on a number of sites that move tens of thousands of
  24 messages per day. This document contains an overview description of the way
  25 Exim works, with a certain amount of simplification to keep it fairly short.
  26 Please address any enquiries about Exim to Philip Hazel:
  27
  28 <P>
  29 Email:   &lt;ph10@cus.cam.ac.uk&gt;<BR>
  30 Phone:   +44 1223 334714<BR>
  31 Fax:     +44 1223 334679
  32
  33 <P>
  34 University of Cambridge<BR>
  35 Computer Laboratory<BR>
  36 Pembroke Street<BR>
  37 Cambridge CB2 3QG<BR>
  38 United Kingdom
  39
  40 <P>
  41 This document is copyright (c) University of Cambridge 1996, but copying
  42 permission is granted to all.
  43 <HR>
  44
  45
  46 <P>
  47     "If I have seen further it is by standing on the shoulders of giants."
  48 <BR>
  49                                                                 (Isaac Newton)
  50 <HR>
  51
  52
  53 <H2>1. Background</H2>
  54
  55 <P>
  56 Exim owes a great deal to Smail 3 and its author, Ron Karr. Without the
  57 experience of running and working on the Smail 3 code, I could never have
  58 contemplated starting to write a new mailer. Many of the ideas and user
  59 interfaces are taken from Smail 3, though the actual code of Exim is entirely
  60 new.
  61
  62 <P>
  63 My intention was to write a mailer that had more functionality than Smail 3,
  64 but which retained the simple lightweight approach, as this seemed to me to be
  65 all that was needed for systems directly connected to the Internet, where most
  66 messages are delivered almost immediately.
  67
  68
  69 <A NAME="0"><H2>2. Availability</H2></A>
  70
  71 <P>
  72 The current distribution of Exim is available from
  73
  74 <P>
  75 ftp://ftp.cus.cam.ac.uk/pub/software/programs/exim/exim-n.nn.tar.gz
  76
  77 <P>
  78 where n.nn is the version number. The distribution contains an ASCII copy of
  79 the documentation; other formats are available from
  80
  81 <P>
  82 ftp://ftp.cus.cam.ac.uk/pub/software/programs/exim/exim-postscript-n.nn.tar.gz<BR>
  83 ftp://ftp.cus.cam.ac.uk/pub/software/programs/exim/exim-texinfo-n.nn.tar.gz
  84
  85 <P>
  86 The following operating systems are currently supported: AIX, BSDI, FreeBSD,
  87 HP-UX, IRIX, Linux, NetBSD, DEC OSF1 (aka Digital UNIX), SCO, SunOS4, SunOS5,
  88 and Ultrix.
  89
  90
  91 <A NAME="1"><H2>3. Limitations</H2></A>
  92
  93 <P>
  94 For the benefit of those reading this overview to see whether Exim is of
  95 interest to them, its limitations are listed first.
  96
  97 <UL>
  98   <LI> Exim is written in ANSI C. This should not be much of a limitation these
  99      days. However, to help with systems that lack a true ANSI C library, Exim
 100      avoids making any use of the value returned by the sprintf() function,
 101      which is one of the main incompatibilities. It has its own version of
 102      strerror() for use with SunOS4 and any other system that lacks this
 103      function, and a macro can be defined to turn memmove() into bcopy() if
 104      necessary.
 105
 106   <LI> Exim uses file names that are longer than 14 characters.
 107
 108   <LI> Exim is intended for use as an Internet mailer, and therefore handles
 109      addresses in RFC 822 domain format only. It cannot handle 'bang paths',
 110      though simple two-component bang paths can be converted by a straightforward
 111      rewriting configuration.
 112
 113   <LI> Exim insists that every address it handles has a domain attached. For
 114      incoming local messages, domainless addresses are automatically qualified
 115      with a configured domain value. Configuration options specify from which
 116      remote systems unqualified addresses are acceptable.
 117
 118   <LI> The only external transport currently implemented is an SMTP transport
 119      over a TCP/IP network (using sockets), suitable for machines on the
 120      Internet. However, a pipe transport is available, and there are facilities
 121      for writing messages to files in 'batched SMTP' format; this can be
 122      used to send messages to some other transport mechanism. Batched SMTP
 123      input is also catered for.
 124 </UL>
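<P>
As an illustration only, the conversion of a simple two-component bang path
mentioned in the list above can be sketched in Python. This is not Exim's
rewriting syntax; the pattern and the function name rewrite_bang_path are
assumptions made for the example, and longer bang paths are deliberately left
untouched.

<PRE>
```python
import re

# Hypothetical pattern: exactly one "!" between a host name and a user name,
# with no "@" anywhere in the address.
TWO_PART_BANG = re.compile(r"^([^!@\s]+)!([^!@\s]+)$")


def rewrite_bang_path(address):
    """Rewrite 'host!user' as 'user@host'; return other addresses unchanged."""
    match = TWO_PART_BANG.match(address)
    if match is None:
        return address
    host, user = match.groups()
    return f"{user}@{host}"
```
</PRE>

<P>
With this sketch, rewrite_bang_path("hostb!alice") gives "alice@hostb", while
an address such as "a!b!c" or "alice@example.com" is returned as it was.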
 125
 126
 127 <A NAME="2"><H2>4. Main features</H2></A>
 128
 129 <P>
 130 Exim follows the same general approach of decentralised control that Smail 3
 131 does. There is no central process doing overall management of mail delivery.
 132 However, unlike Smail, the independent delivery processes share data in the
 133 form of 'hints', which makes delivery more efficient in some cases. The hints
 134 are kept in a number of DBM files. If any of these files are lost, the only
 135 effect is to change the pattern of delivery attempts and retries.
 136
 137 <P>
 138 Here is a summary of Exim's main features. More details are given in the
 139 sections which follow.
 140
 141 <UL>
 142   <LI> Many configuration options can be given as expansion strings, and as
 143      these can include file lookups, much of Exim's operation can be made
 144      table-driven if desired. For example, it is possible to do local delivery
 145      on a machine on which the users do not have accounts.
 146
 147   <LI> Regular expressions are available in a number of configuration
 148      parameters.
 149
 150   <LI> Domain lists can include file lookups, making it possible to support a
 151      large number of local domains.
 152
 153   <LI> Exim has flexible retry algorithms, applicable to mail routing as well as
 154      to delivery.
 155
 156   <LI> Exim contains header and envelope rewriting facilities.
 157
 158   <LI> Unqualified addresses are accepted only from specified hosts or networks.
 159
 160   <LI> Exim can perform multiple deliveries down the same SMTP channel after
 161      deliveries to a host have been delayed.
 162
 163   <LI> Exim can be configured to do local deliveries immediately but to leave
 164      remote deliveries until the message is picked up by a queue-runner
 165      process. This increases the likelihood of multiple messages being sent
 166      down a single SMTP connection.
 167
 168   <LI> When copies of a message have to be delivered to more than one remote
 169      host, up to a configured maximum number of remote deliveries can be done
 170      in parallel.
 171
 172   <LI> Exim supports optional checking of incoming return path (sender) and
 173      receiver addresses as they are received by SMTP.
 174
 175   <LI> SMTP calls from specific machines, optionally from specific idents, can
 176      be locked out, and incoming SMTP messages from specific senders can also
 177      be locked out.
 178
 179   <LI> It is possible to control which hosts may use the Exim host as a relay
 180      for onward transmission of mail; the control can be made to depend on the
 181      address domain.
 182
 183   <LI> Messages on the queue can be 'frozen' and 'thawed' by the administrator.
 184
 185   <LI> The maximum size of a message can be specified.
 186
 187   <LI> Exim can handle a number of independent local domains on the same
 188      machine; each domain can have its own alias files, etc. These are
 189      commonly called "virtual domains".
 190
 191   <LI> Exim stats a user's home directory before looking for a .forward file, in
 192      order to detect the case of a missing NFS mount.
 193
 194   <LI> Exim contains an optional built-in mail filtering facility. This enables
 195      users to set up their own mail filtering in a straightforward manner
 196      without the need to run an external program. There can also be a system
 197      filter file that applies to all messages.
 198
 199   <LI> There is support for multiple user mailboxes controlled by prefixes or
 200      suffixes on the user name, either via the filter mechanism or through
 201      multiple .forward files.
 202
 203   <LI> Periodic warnings are automatically sent to messages' senders when
 204      delivery is delayed - the time between warnings is configurable.
 205
 206   <LI> A queue run can be manually started to deliver just a particular portion
 207      of the queue, or those messages with a recipient whose address contains a
 208      given string.
 209
 210   <LI> Exim can be configured to run as root all the time, except when
 211      performing local deliveries, which it always does in a separate process
 212      under an appropriate uid and gid. Alternatively, it can be configured to
 213      run as root only when needed; in particular, it need not run as root when
 214      receiving incoming messages or when sending out messages over SMTP.
 215
 216   <LI> I have tried to make the wording of delivery failure messages clearer and
 217      simpler, for the benefit of those less-experienced people who are now
 218      using email.
 219
 220   <LI> The Exim Monitor is an optional extra; it displays information about
 221      Exim's processing in an X window, and an administrator can perform a
 222      number of control actions from the window interface.
 223 </UL>
 224
 225
 226 <A NAME="3"><H2>5. Performance</H2></A>
 227
 228 <P>
 229 Although I did not specifically set out to write a high-performance MTA, Exim
 230 does seem to be fairly efficient. The busiest site I know of is an ISP that
 231 handles over 40,000 messages a day on a Sun Ultra box. Our central mail
 232 service machine in Cambridge (a SPARCstation-20) handles over 30,000 messages
 233 on a typical day, the volume being around 130 megabytes on the day I looked.
 234 The largest number of messages delivered in any one hour was 2753.
 235
 236 <P>
 237 A system of a different character is sunsite.doc.ic.ac.uk, a SPARCserver 1000
 238 system with 8 cpus, which is unusual in that virtually all mail deliveries are
 239 remote and relatively large, because it is a data archive that can deliver
 240 copies of its holdings via an email interface. On a fairly busy day 14,014
 241 messages were received from 231 different hosts and 12,534 deliveries were
 242 made to 468 different hosts. The total amount of outgoing mail was 431
 243 megabytes. The largest number of deliveries in any one hour was 787.
 244
 245
 246 <A NAME="4"><H2>6. Interface</H2></A>
 247
 248 <P>
 249 Like many MTAs, Exim has adopted the Sendmail interface so that it can be a
 250 straight replacement for /usr/lib/sendmail. All the relevant Sendmail options
 251 are implemented. There are also some additional options that are compatible
 252 with Smail 3, and some further options that are new to Exim.
 253
 254 <P>
 255 The runtime configuration interface is a single file which is divided into a
 256 number of sections. The entries in this file consist of keywords and values,
 257 in the style of Smail 3 configuration files.
 258
 259 <P>
 260 Control of messages on the queue can be done via certain privileged command
 261 line options. There is also an optional monitor program called eximon, which
 262 displays current information in an X window and contains interfaces to the
 263 command line options.
 264
 265
 266 <A NAME="5"><H2>7. Method of operation</H2></A>
 267
 268 <P>
 269 When Exim receives a message, it writes two files in its spool directory. The
 270 first contains the envelope information, the current status of the message,
 271 and the headers, while the second contains the body of the message. The status
 272 of the message includes a complete list of recipients and a list of those that
 273 have already received the message. The header file gets updated during the
 274 course of delivery if necessary.
 275
 276 <P>
 277 A message remains in the spool directory until it is completely delivered to
 278 its recipients or to an error address, or until it is deleted by an
 279 administrator or by the user who originally created it. In cases when delivery
 280 cannot proceed - for example, when a message can neither be delivered to its
 281 recipients nor returned to its sender - the message is marked 'frozen' on the
 282 spool, and no more deliveries are attempted. The administrator can thaw such
 283 messages when the problem has been corrected, and can also freeze individual
 284 messages by hand if necessary.
 285
 286 <P>
 287 As delivery proceeds, Exim writes timestamped information about each address
 288 to a per-message log file; this includes any delivery error messages. This log
 289 is solely for the benefit of the administrator. All the information Exim
 290 itself needs for delivery is kept in the header spool file. The message log
 291 file is deleted with the spool files. If a message is delayed for more than a
 292 configured time, a warning message is sent to the sender. This is repeated
 293 whenever the same time elapses again without delivery being complete.
 294
 295 <P>
 296 The main delivery processing elements of Exim are called directors, routers,
 297 and transports. Code for a number of these is provided, and compile-time
 298 options specify which ones are actually included in the binary. Directors
 299 handle addresses that include one of the local domains, routers handle remote
 300 addresses, and transports do actual deliveries.
 301
 302 <P>
 303 When a message is to be delivered, the sequence of events is roughly as
 304 follows:
 305
 306 <UL>
 307   <LI> If there is a system filter file, it is obeyed. This can check on the
 308      contents of the message and its headers, and cause delivery to be
 309      abandoned or directed to alternative or additional addresses.
 310
 311   <LI> Each address is parsed and a check is made to see if it is local or not,
 312      by comparing the domain with the list of local domains, which can be
 313      wildcarded, or even held in a file if there are a large number of them.
 314
 315   <LI> If an address is local, it is passed to each configured director in turn
 316      until one is able to handle it. If none can, the address is failed.
 317      Directors can be targeted at particular local domains, so several local
 318      domains can be processed independently of each other.
 319
 320   <LI> A director that accepts an address may set up a local or a remote
 321      transport for it, or it may generate one or more new addresses (typically
 322      from alias or forward files). New addresses are fed back into this
 323      process from the top, but in order to avoid loops, a director will ignore
 324      any address which has an identically-named ancestor that was processed by
 325      itself.
 326
 327   <LI> If an address is not local, it is passed to each router in turn until one
 328      is able to handle it. If none can, the address is failed.
 329
 330   <LI> A router that accepts an address may set up a transport for it, or may
 331      pass an altered address to subsequent routers, or it may discover that
 332      the address is a local address after all. This typically happens when a
 333      partial domain name is used and (for example) the DNS lookup is
 334      configured to try to extend such names. In this case, the address is
 335      passed back to the directors.
 336
 337   <LI> Routers normally set up remote transports for messages that are to be
 338      delivered to other machines. However, a router can pass a message to a
 339      local transport, and by this means messages can be routed to other
 340      transport mechanisms.
 341
 342   <LI> When all the directing and routing is done, addresses that have been
 343      successfully handled are passed to their assigned transports. Local
 344      transports handle only one address at a time, but remote ones can handle
 345      more than one. Each local transport runs in a separate process under a
 346      non-privileged uid.
 347
 348   <LI> If there were any errors, a message is returned to an appropriate address
 349      (the sender in the common case).
 350
 351   <LI> If one or more addresses suffered a temporary failure, the message is
 352      left on the queue, to be tried again later. Otherwise the spool files and
 353      message log are deleted.
 354 </UL>
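<P>
The core of the sequence above, choosing between directors and routers and
trying each in turn until one accepts the address, can be sketched in Python
as follows. This is a simplified model and not Exim's code: the handlers are
plain functions that return a transport name or None, the toy handlers and
transport names are invented for the example, and filtering, wildcarded local
domains, generated addresses and retries are all left out.

<PRE>
```python
def split_address(address):
    """Split 'local-part@domain' at the last '@'."""
    local_part, _, domain = address.rpartition("@")
    return local_part, domain.lower()


def assign_transport(address, local_domains, directors, routers):
    """Return the transport name chosen for an address, or None if it fails."""
    _, domain = split_address(address)
    # Local addresses go to the directors, everything else to the routers.
    handlers = directors if domain in local_domains else routers
    for handler in handlers:
        transport = handler(address)
        if transport is not None:      # this handler accepted the address
            return transport
    return None                        # nobody could handle it: address fails


# Toy handlers, invented for the example.
def toy_localuser(address):
    local_part, _ = split_address(address)
    return "local_delivery" if local_part in {"alice", "bob"} else None


def toy_remote(address):
    return "remote_smtp"
```
</PRE>

<P>
Calling assign_transport("alice@example.org", {"example.org"},
[toy_localuser], [toy_remote]) selects the toy local transport, whereas an
unknown local part in the same domain yields None, which corresponds to the
address being failed.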
 355
 356
 357 <A NAME="6"><H2>8. Mail filtering</H2></A>
 358
 359 <P>
 360 Exim can be configured to allow users to set up filter files as an alternative
 361 to the traditional .forward files. A filter file can test various characteristics
 362 of a message, including the contents of the headers and the start of
 363 the body, and direct delivery to specified addresses, files, or pipes
 364 according to what it finds. The system-wide filter file uses the same control
 365 syntax.
 366
 367
 368 <A NAME="7"><H2>9. Directors</H2></A>
 369
 370 <P>
 371 The existing directors are listed below. I use the RFC 822 term local-part to
 372 mean that portion of an address that comes before the @ character.
 373
 374 <UL>
 375   <LI> aliasfile: This director handles local-part expansion via a traditional
 376      alias file. The name of the file is obtained by string expansion, and may
 377      therefore depend on the local-part or the domain. Generated pipe and file
 378      addresses can be (independently) locked out.
 379
 380 </UL>
 381 <P>
 382      The aliasfile director can also be used to test a list of local parts and
 383      direct any messages for them to a specific transport. In this case the
 384      data associated with the local part in the file is not used for address
 385      expansion, but is available for other purposes. For example, files
 386      containing records of the form
 387
 388 <P>
 389        foo: uid=1234 gid=5678 mailbox=/home_1/foo/inbox
 390
 391 <P>
 392      could be used on a system that did local deliveries without consulting
 393      its passwd file. The aliasfile director could use the file to verify that
 394      the local part was valid, and then the appendfile transport could use it
 395      to get a uid, gid, and mailbox for the delivery.
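<P>
To make the idea concrete, here is a small Python sketch of how a record of
the form shown above might be split into its fields and then used to obtain
the delivery parameters. It only illustrates the data flow; the function names
and the parsing rules (whitespace-separated name=value pairs) are assumptions,
not a description of how Exim itself reads the file.

<PRE>
```python
def parse_record(line):
    """Split 'foo: uid=1234 gid=5678 ...' into ('foo', {'uid': '1234', ...})."""
    local_part, _, data = line.partition(":")
    fields = dict(item.split("=", 1) for item in data.split())
    return local_part.strip(), fields


def delivery_parameters(records, local_part):
    """Return (uid, gid, mailbox) for a known local part, else None."""
    fields = records.get(local_part)
    if fields is None:
        return None                    # local part is not in the file
    return int(fields["uid"]), int(fields["gid"]), fields["mailbox"]
```
</PRE>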
 396
 397 <UL>
 398   <LI> forwardfile: This director handles local-part expansion via a traditional
 399      forward file or, if so configured, by a user's filter file. The name of
 400      the file is obtained by string expansion, and may therefore depend on the
 401      local-part or the domain, though if it is not an absolute path it is
 402      automatically assumed to be in the home directory of the user whose login
 403      name is the local-part. Mailing lists can be handled by file names of the
 404      form
 405
 406 </UL>
 407 <P>
 408        /some/list/directory/${local_part}
 409
 410 <P>
 411      and it is possible to specify an error address for each list that depends
 412      on the list name. Generated pipe and file addresses can be (independently)
 413      locked out.
 414
 415 <UL>
 416   <LI> localuser: This director matches the local-part of an address to a user
 417      of the machine. It can also be configured to do a pattern match on the
 418      user's home directory name. This makes it possible to partition the set
 419      of local users according to their home directories.
 420
 421   <LI> smartuser: This director matches any local-part. It can be used to pass
 422      messages for unknown users to a script that generates a helpful error
 423      message, or it can be used to send such messages to another host,
 424      optionally changing the envelope address in the process.
 425
 426 </UL>
 427 <P>
 428 The configuration file determines which directors are actually used, and in
 429 which order. It is possible to use the same director more than once, with
 430 different options.
 431
 432 <P>
 433 The addresses a director handles can be constrained in the following ways:
 434
 435 <UL>
 436   <LI> A specific set of local domains may be specified, in which case the
 437      director is called only for addresses that contain one of those domains.
 438
 439   <LI> A specific set of local parts may be specified, in which case the
 440      director is called only for addresses that contain one of those local
 441      parts. This could be used, for example, to handle 'postmaster' independently
 442      of the particular local domain.
 443
 444   <LI> A director may be configured to handle local-parts that start with a
 445      certain prefix and/or end with a certain suffix. For example, a director
 446      can be set up to handle local-parts of the form xxxx-request only.
 447
 448   <LI> A flag controls whether a director is called when an address is being
 449      verified, as opposed to being directed for delivery.
 450
 451 </UL>
 452 <P>
 453 In addition, certain files can be required to exist or not exist for a given
 454 director to be run.
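<P>
The domain, local-part, prefix and suffix constraints listed above amount to a
simple test applied before a director is called. A hedged Python sketch, in
which the function director_applies and its parameter names are invented for
the example and the verification flag and file-existence checks are omitted:

<PRE>
```python
def director_applies(address, domains=None, local_parts=None,
                     prefix="", suffix=""):
    """Return True if a director with these constraints should be called."""
    local_part, _, domain = address.rpartition("@")
    if domains is not None and domain.lower() not in domains:
        return False
    if local_parts is not None and local_part not in local_parts:
        return False
    # An empty prefix or suffix matches every local part.
    return local_part.startswith(prefix) and local_part.endswith(suffix)
```
</PRE>

<P>
For instance, director_applies("exim-users-request@lists.example",
suffix="-request") is true, while the same call for
"exim-users@lists.example" is false.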
 455
 456
 457 <A NAME="8"><H2>10. Routers</H2></A>
 458
 459 <P>
 460 The existing routers are:
 461
 462 <UL>
 463   <LI> domainlist: This router searches a list of domains for the one it is
 464      trying to route. The list may either be a string in the configuration
 465      file, possibly including wild cards or regular expressions, or it may be
 466      in a file, or both may be provided. In the case of a file, keys of the
 467      form *.foo.bar.com can be used for simple wildcarding.
 468
 469 </UL>
 470 <P>
 471      If the domain is found, its entry can either specify a single replacement
 472      domain name that is passed on to subsequent routers, or it can specify a
 473      list of domain names that are looked up by this router. The lookup can be
 474      done by the gethostbyname function, or by DNS lookup, and in the latter
 475      case it is configurable whether MX or A records or both are used. As well
 476      as providing explicit routing for certain domains, the domainlist router
 477      can be used to set up gateways for partial domains (e.g. for *.uucp) and
 478      it can also be used as a 'smarthost' router by using the all-inclusive
 479      wild card.
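<P>
As a rough illustration of wildcarded keys of the form *.foo.bar.com, the
following Python sketch looks a domain up in a table, first exactly and then
under successively shorter wildcard keys. The search order, and the choice
that *.foo.bar.com does not match foo.bar.com itself, are assumptions made for
the example rather than a statement of Exim's behaviour.

<PRE>
```python
def lookup_route(table, domain):
    """Find the entry for a domain, allowing '*.parent' wildcard keys."""
    domain = domain.lower()
    if domain in table:
        return table[domain]           # exact match wins
    labels = domain.split(".")
    # Try *.b.c for a.b.c, then *.c, dropping one leading label each time.
    for i in range(1, len(labels)):
        key = "*." + ".".join(labels[i:])
        if key in table:
            return table[key]
    return None                        # the router declines this domain
```
</PRE>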
 480
 481 <UL>
 482   <LI> lookuphost: This router looks up domain names either by calling the
 483      gethostbyname function, or by using the DNS. In the latter case, it can
 484      be configured to use the DNS resolver options for qualifying single-component
 485      names and for searching parent domains. It is also possible to
 486      specify explicit text strings for widening domains that are not found
 487      initially. It is possible to insist on the presence of MX records for
 488      certain sets of domains. A configuration option controls whether the
 489      message's headers are rewritten when a domain name is changed.
 490
 491   <LI> queryprogram: This router passes the address to a script that runs in a
 492      separate process under an unprivileged uid and gid. The script returns a
 493      line of text specifying whether it matched the domain or not. If it did
 494      match, it may specify a transport name, or it may specify that the
 495      transport specified for the router is used. The script may also send back
 496      a new domain name to replace the current one, and specify a method of
 497      looking this name up (gethostbyname, DNS, or pass to next router).
 498
 499 </UL>
 500 <P>
 501 The configuration file determines which routers are actually used, and in
 502 which order. It is possible to use the same router more than once, with
 503 different options.
 504
 505 <P>
 506 Like directors, routers can be constrained to handle only certain domains or
 507 certain local parts (though I haven't seen a good use for that yet). If a
 508 router times out, either the delivery can be deferred, or the address can be
 509 passed on to the next router.
 510
 511 <P>
 512 A flag controls whether a router is called when an address is being verified,
 513 as opposed to being routed for delivery.
 514
 515
 516 <A NAME="9"><H2>11. Transports</H2></A>
 517
 518 <P>
 519 Local and remote transports are handled differently. A local transport is
 520 always run in a separate process with an appropriate real uid and gid. Their
 521 values can be specified in the transport's configuration, or passed over from
 522 the director that handled the address. The existing transports are:
 523
 524 <UL>
 525   <LI> appendfile: This local transport appends the message to a file whose name
 526      is specified as a string containing variable expansions. The current
 527      local-part can be inserted via the expansion mechanism, and file names
 528      such as
 529
 530 </UL>
 531 <P>
 532      /home/${local_part}/inbox<BR>
 533      /var/mail/${local_part}
 534
 535 <P>
 536      are typical examples. However, it is possible to look up each individual
 537      user's inbox name in a file, should that be required.
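<P>
The effect of inserting the local-part into such a file name can be imitated
with Python's string.Template class, which happens to use the same ${name}
notation. This mimics only simple variable substitution; Exim's expansion
strings can do much more (file lookups, for example), and the helper name
expand is invented for the example.

<PRE>
```python
from string import Template


def expand(pattern, **variables):
    """Substitute ${name} items in pattern; raises KeyError if one is unknown."""
    return Template(pattern).substitute(variables)
```
</PRE>

<P>
For example, expand("/home/${local_part}/inbox", local_part="foo") produces
"/home/foo/inbox".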
 538
 539 <P>
 540      Exclusive access to the file is ensured by using the traditional mailbox
 541      locking strategy of creating a lock file. The lock creation process uses
 542      a 'hitching post' algorithm (similar to that used by Pine) which is
 543      robust when the mailbox file is NFS-mounted. The file is also locked
 544      using the lockf function.
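<P>
The text above names the 'hitching post' technique without spelling it out. As
it is commonly described, a uniquely named file is created in the mailbox
directory, a hard link to it is made under the lock file name, and success is
judged from the link count of the unique file rather than from the result of
the link call, which cannot be relied upon over NFS. The Python sketch below
follows that general recipe; the ".lock" suffix, the form of the unique name
and the retry parameters are assumptions, and this is not Exim's actual code.

<PRE>
```python
import os
import socket
import time


def acquire_lock(mailbox, retries=5, delay=1.0):
    """Try to create mailbox + '.lock'; return its name, or None on failure."""
    lock_name = mailbox + ".lock"
    # The hitching post: a file whose name no other process should choose.
    hitch_name = "%s.%s.%d.%d" % (lock_name, socket.gethostname(),
                                  os.getpid(), int(time.time()))
    fd = os.open(hitch_name, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    os.close(fd)
    try:
        for _ in range(retries):
            try:
                os.link(hitch_name, lock_name)
            except OSError:
                pass                   # do not trust the result; check below
            if os.stat(hitch_name).st_nlink == 2:
                return lock_name       # the link really was made
            time.sleep(delay)
        return None
    finally:
        os.unlink(hitch_name)          # the lock file, if made, remains


def release_lock(lock_name):
    """Remove the lock file created by acquire_lock."""
    os.unlink(lock_name)
```
</PRE>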
 545
 546 <P>
 547      Options on this transport allow for the insertion of a prefix line (e.g.
 548      'From xxx...') and suffix line, special processing of message lines
 549      starting with 'From', and the addition of Return-path, Delivery-date, and
 550      Envelope-to headers. If the mailbox file is not a regular file, or does
 551      not have the correct owner, group, or permissions, no delivery takes
 552      place; the address is deferred and the postmaster is informed, except
 553      that, if the file's permissions are greater than those required, Exim
 554      reduces the permissions and carries on. There are additional checks to
 555      reduce the possibility of security exposures caused by race conditions.
 556
 557 <UL>
 558   <LI> pipe: This local transport passes the message via a pipe to a specified
 559      command (program or script) which is run in a separate process under a
 560      given uid and gid. Various parameters of the message are passed as
 561      environment variables, and there are the same options as for appendfile
 562      for controlling the form of the message.
 563
 564 </UL>
 565 <P>
 566      The returned status of the command may be used to determine success or
 567      failure, or it can be ignored. A configuration option specifies whether
 568      any standard output generated by the transport is to be returned to the
 569      sender. If this is set and output is actually generated, the delivery is
 570      deemed to have failed, whatever the returned status of the command. The
 571      maximum amount of output generated by the command can be controlled, and
 572      a timeout may be set for it.
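<P>
The rules just described for judging a pipe delivery can be sketched in Python
using the standard subprocess module. The function and parameter names are
invented for the example, the environment variable names passed in are up to
the caller, and treating a timeout as a failure is an assumption. Limiting the
amount of output and changing uid and gid are left out.

<PRE>
```python
import os
import subprocess


def pipe_deliver(command, message, extra_env, timeout=60,
                 return_output=False, ignore_status=False):
    """Run command with message on stdin; return (delivered, output)."""
    env = dict(os.environ)
    env.update(extra_env)              # message parameters for the command
    try:
        result = subprocess.run(command, input=message, env=env,
                                stdout=subprocess.PIPE, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False, b""              # assumption: a timeout counts as failure
    if return_output and result.stdout:
        # Output that is to be returned to the sender means failure,
        # whatever the exit status was.
        return False, result.stdout
    delivered = ignore_status or result.returncode == 0
    return delivered, result.stdout
```
</PRE>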
 573
 574 <UL>
 575   <LI> smtp: This remote transport delivers a message using SMTP over TCP/IP.
 576      All addresses in the message that route to the same set of hosts, and
 577      have the same errors address (return path), are normally sent in a single
 578      transaction. An explicit list of hosts can be set for the transport, or a
 579      host list may be attached to an address by one of the routers. If all the
 580      hosts are temporarily unable to accept the message, it is delivered to
 581      one of a list of fallback hosts, if configured.
 582 </UL>
 583
 584
 585 <A NAME="10"><H2>12. Exim logs</H2></A>
 586
 587 <P>
 588 Exim writes four different log files:
 589
 590 <UL>
 591   <LI> The main log records the arrival of each message and the result of each
 592      delivery attempt in a single line in each case. The format is as compact
 593      as possible, in an attempt to keep down the size of log files. A number
 594      of other events are also recorded on the main log.
 595
 596   <LI> The reject log records information from messages that are rejected
 597      because their return paths are invalid (a configurable option). The
 598      headers are written to this log, following a copy of the one-line message
 599      that is also written to the main log. Other types of message rejection
 600      also cause writing to this log.
 601
 602   <LI> The panic log is written when Exim suffers a disaster and has to bomb
 603      out.
 604
 605   <LI> On systems that support signal handlers that restart a system call on
 606      exit, Exim reacts to a USR1 signal by writing a line describing its
 607      current activity to the process log. This makes it possible to find out
 608      what each exim process on a machine is currently doing.
 609
 610 </UL>
 611 <P>
 612 A utility script for renaming and compressing the main and reject logs each
 613 night is provided. There are also scripts for extracting statistics from log
 614 files and for searching log files for the entries for messages that match a
 615 given pattern. For example, one can pull out all entries relating to messages
 616 for a given local part.
 617
 618
 619 <A NAME="11"><H2>13. Exim databases</H2></A>
 620
 621 <P>
 622 Exim maintains a number of databases in DBM files to help it perform efficient
 623 mail delivery. In effect, the files contain hints, and if they are lost it is
 624 not a disaster - Exim's performance just suffers a bit. The three databases
 625 currently used are:
 626
 627 <UL>
 628   <LI> retry: This contains information about each failing remote host and
 629      temporarily failing local delivery - when the first failure was detected,
 630      when the delivery (or directing or routing) was last tried, and when it
 631      should next be tried. More details about retry algorithms are given
 632      below.
 633
 634   <LI> wait-smtp: This contains information about messages that are waiting for
 635      particular hosts after an SMTP delivery failure (see the next section).
 636
 637   <LI> reject: This contains information about SMTP message rejections (see
 638      below).
 639
 640 </UL>
 641 <P>
 642 There is a utility program that lists the contents of one of these databases,
 643 and another that allows manual modifications to be applied in some cases.
 644 Database records are timestamped, and there is a utility that removes records
 645 that are older than a given period, and also cleans up wait-smtp records
 646 containing references to messages that no longer exist. Running this daily or
 647 weekly should be sufficient to keep the files reasonably tidy.
 648
 649
 650 <A NAME="12"><H2>14. SMTP batching</H2></A>
 651
 652 <P>
 653 When an SMTP delivery attempt fails, causing the message to be deferred till
 654 later, Exim updates a DBM database that contains records keyed by host name
 655 plus IP address. Each record holds a list of messages that are waiting for
 656 that host and address.
 657
 658 <P>
 659 When an SMTP delivery succeeds, Exim consults the database to see if there are
 660 any other messages waiting for the same host and address. If it finds any, it
 661 creates a new Exim process and passes it the open SMTP channel and a message
 662 identification. The new process then delivers the waiting message down the
 663 existing channel and may in turn cause the creation of yet another process.
 664 Any other waiting addresses in the message are skipped. The maximum number of
 665 messages sent down one connection is configurable.
 666
 667 <P>
 668 This scheme achieves some SMTP efficiency when a number of messages have been
 669 queued up for a given host, without the overhead of a heavyweight queueing
 670 apparatus.
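
<P>
A minimal sketch of the bookkeeping described in this section, assuming an
in-memory dictionary in place of the DBM database. The names waiting,
record_deferral and next_waiting are hypothetical and are not taken from Exim
itself; the creation of the new process is not modelled.

```python
from collections import defaultdict

# Hypothetical stand-in for the wait-smtp database. Keys combine the host
# name and IP address; each value lists the ids of waiting messages.
waiting = defaultdict(list)


def record_deferral(host, ip, message_id):
    """Note that message_id is waiting for (host, ip) after a failure."""
    ids = waiting[(host, ip)]
    if message_id not in ids:
        ids.append(message_id)


def next_waiting(host, ip, sent_so_far, max_per_connection):
    """Return the next message id to send down an open channel, or None.

    sent_so_far counts the messages already sent on this connection and
    max_per_connection models the configurable limit.
    """
    if sent_so_far >= max_per_connection:
        return None
    ids = waiting.get((host, ip))
    if not ids:
        return None
    return ids.pop(0)
```

<P>
After a successful delivery, the delivering process would call next_waiting
and, if it returns an id, hand the open channel and that id to a new process.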
 671
 672
 673 <A NAME="13"><H2>15. Retries</H2></A>
 674
 675 <P>
 676 When a message cannot immediately be directed, routed, or delivered, it
 677 remains on the queue and another delivery attempt occurs at a later time.
 678 While failures to deliver to remote hosts are the most common cause of this,
 679 it is also possible for a message to be deferred as a result of temporary
 680 local delivery failure, or following directing or routing. A local delivery
 681 can fail if the user is over quota, while directing can be delayed if a user's
 682 home directory is not available (e.g. missing NFS mount), and therefore the
 683 existence of a .forward file cannot be tested. Routing can be delayed by DNS
 684 timeouts.
 685
 686 <P>
 687 Exim can be given a set of rules which specify how often to retry deferred
 688 addresses, and when to give up. These rules apply to directing and routing as
 689 well as to transporting, and are keyed by (wildcarded) domain name or, for
 690 local users, by local-part and domain name, either of which can be wildcarded.
 691
 692 <P>
 693 Each rule is actually a sequential list of subrules, which are applied
 694 successively as time passes. At present there are two kinds of subrule: fixed
 695 interval, and geometrically increasing interval. For example, it is possible
 696 to specify a rule such as 'retry every 15 minutes for 2 hours; then increase
 697 the interval between retries by a factor of 1.5 each time until 8 hours have
 698 passed; then retry every 8 hours until 4 days have passed; then give up'. The
 699 times are measured from when the address first failed, so, for example, if a
 700 host has been down for 2 days, new messages will immediately go on to the
 701 8-hour retry schedule.
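
<P>
The example rule above can be modelled roughly as follows. This is a sketch
under assumptions, not Exim's configuration syntax or its actual algorithm:
the Subrule class, the RULE list and next_interval are hypothetical names, and
the geometric subrule is assumed to start from a 15-minute interval.

```python
from dataclasses import dataclass

MINUTE = 60
HOUR = 60 * MINUTE
DAY = 24 * HOUR


@dataclass
class Subrule:
    kind: str            # "fixed" or "geometric"
    until: int           # cutoff in seconds since the address first failed
    interval: int        # fixed interval, or starting interval if geometric
    factor: float = 1.0  # growth factor, used by geometric subrules only


# 'Every 15 minutes for 2 hours; then grow by 1.5 until 8 hours;
# then every 8 hours until 4 days; then give up.'
RULE = [
    Subrule("fixed", 2 * HOUR, 15 * MINUTE),
    Subrule("geometric", 8 * HOUR, 15 * MINUTE, 1.5),
    Subrule("fixed", 4 * DAY, 8 * HOUR),
]


def next_interval(rule, elapsed, previous_interval):
    """Return the seconds to wait before the next try, or None to give up.

    elapsed is measured from the first failure of the address, so the
    subrule is chosen by how long the address has been failing.
    """
    for sub in rule:
        if elapsed < sub.until:
            if sub.kind == "fixed":
                return sub.interval
            grown = int(previous_interval * sub.factor)
            return max(sub.interval, grown)
    return None
```

<P>
Because the subrule is selected by the time since the first failure, a host
that has been down for 2 days lands directly in the 8-hour subrule, which is
the behaviour described above.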
 702
 703 <P>
 704 Exim does not have an elaborate series of alarm clocks to cause retries to
 705 happen exactly on schedule. A queue-runner process is started periodically, to
 706 attempt delivery, one by one, of messages containing addresses that have
 707 passed their next retry time. If such an address fails again, a new retry time
 708 is computed, and so subsequent messages queued for the same address get
 709 skipped. The queue is not processed sequentially, but in a 'random' order, to
 710 prevent one rogue message that causes a problem from blocking other messages
 711 to the same destination for ever.
 712
 713 <P>
 714 When the maximum time for retrying has passed, pending addresses are failed.
 715 However, a next try time is still computed from the final subrule. Until that
 716 time is reached, any new messages for the address are immediately failed. When
 717 the next try time is passed, one further delivery attempt is made; if this
 718 fails, a new next try time is computed, and so on.
 719
 720 <P>
 721 The increasing number of small computers on the Internet has caused there to
 722 be a lot of messages addressed to hosts that are never going to listen. The
 723 retry logic described above should reduce the amount of wasted time spent on
 724 trying to deliver such messages. However, some administrators are unhappy
 725 about this rather draconian approach, which can cause an address to be failed
 726 without any deliveries being attempted. Exim can alternatively be configured
 727 always to try at least once those hosts whose last failure was before the
 728 arrival of the message. This option increases the number of attempts to
 729 deliver to dead hosts.
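
<P>
The decision described in the last two paragraphs can be summarised in a small
function. This is an interpretation of the text rather than Exim's code: the
function name, its parameters and the three result strings are hypothetical.

```python
def action_for_address(now, next_try, timed_out, last_failure=None,
                       arrival=None, always_try_once=False):
    """Return "attempt", "defer" or "fail" for an address with a retry record.

    timed_out is True once the maximum retry time for the address has
    passed. always_try_once models the alternative configuration in which
    a host is tried at least once if its last failure was before the
    arrival of the message.
    """
    if now >= next_try:
        return "attempt"
    if (always_try_once and last_failure is not None
            and arrival is not None and last_failure < arrival):
        return "attempt"
    # Before the next try time: fail at once if retrying has timed out,
    # otherwise leave the address on the queue.
    return "fail" if timed_out else "defer"
```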
 730
 731 <P>
 732 Retry rules can be predicated on particular errors as well as on domain names,
 733 and for domains that are looked up in the DNS, further discrimination on
 734 whether MX records were used or not is also possible. Thus it is possible to
 735 treat 'connection refused' and 'connection timed out' differently, or to
 736 distinguish between 'connection refused and there was only an A record' and
 737 'connection refused from a host pointed to by an MX record'.
 738
 739 <P>
 740 When a local delivery fails because a user is over quota, the retry rule can
 741 be predicated on the length of time since the mailbox was last read. For
 742 example, if the mailbox has been recently read, the delivery can be retried
 743 for a while; otherwise it can be failed quickly.
 744
 745
 746 <A NAME="14"><H2>16. Header rewriting</H2></A>
 747
 748 <P>
 749 There are those who argue that header rewriting is a totally Bad Thing; there
 750 are others who swear they cannot live without it. Exim provides the facility -
 751 you do not have to use it!
 752
 753 <P>
 754 Exim can be configured to rewrite the address portions of headers when a
 755 message is received. For debugging purposes, the original headers are retained
 756 in the spool file, but are not, of course, transported with the message.
 757 Rewriting rules can be targeted at individual headers and the envelope fields;
 758 it is possible, for example, just to rewrite the 'From' header and no others.
 759
 760 <P>
 761 Rewriting rules are keyed by local-part and domain, either of which can be
 762 wildcarded, and the replacement text is a general expansion string which can
 763 contain file lookups. This makes it possible to replace login names by
 764 'friendly' names in outgoing addresses via a DBM lookup, for example. The
 765 other most common rewriting requirement of replacing *.foo.bar with foo.bar is
 766 also easily handled.
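
<P>
As an illustration of the two rewriting requirements just mentioned, here is a
sketch in which a Python dictionary stands in for the DBM lookup. It does not
use Exim's rule syntax; rewrite_address and its parameters are hypothetical,
and the login name and 'friendly' name in the usage note are invented.

```python
def rewrite_address(address, friendly_names, parent_domain):
    """Rewrite one address (hypothetical helper, not Exim's rule engine).

    Any domain of the form *.parent_domain is replaced by parent_domain,
    and a login name at that domain is replaced by its 'friendly' name if
    the lookup table has one.
    """
    local, sep, domain = address.rpartition("@")
    if not sep:
        return address  # unqualified address: left alone in this sketch
    if domain.endswith("." + parent_domain):
        domain = parent_domain
    if domain == parent_domain:
        local = friendly_names.get(local, local)
    return local + "@" + domain
```

<P>
For example, with the table {'jbloggs': 'Joe.Bloggs'} and the parent domain
foo.bar, the address jbloggs@host1.foo.bar becomes Joe.Bloggs@foo.bar, while
addresses in other domains are left unchanged.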
 767
 768 <P>
 769 Headers are also automatically rewritten by Exim in two cases:
 770
 771 <UL>
 772   <LI> If a locally-generated message contains addresses without domains, a
 773      configured qualifying domain is added to each of them. It is also
 774      possible to specify which remote systems are permitted to send messages
 775      containing unqualified addresses. These too get qualified on reception.
 776
 777   <LI> Routing of a domain may reveal that it was only a partial domain, in
 778      which case the headers are rewritten to contain the full domain. For
 779      example, as a result of routing, an address such as xxx@foo may turn into
 780      xxx@foo.bar.ac.uk.
 781 </UL>
 782
 783 <A NAME="15"><H2>17. Host verification</H2></A>
 784
 785
 786 <P>
 787 Exim can be configured to accept incoming SMTP calls from certain hosts only,
 788 or it can be configured to reject calls from certain hosts. In both cases, the
 789 test may include an RFC 1413 identification check. A system that gets all its
 790 mail via a central hub might want to lock out the rest of the world, while a
 791 number of systems under one management might want to exchange mail only via
 792 the standard mailer, and hence reject mail from all but certain specified ids
 793 within the group.
 794
 795 <P>
 796 When a host fails the acceptance test, Exim can either give an error code
 797 immediately on connection, or allow the connection to proceed and then give
 798 error codes to all the message's recipients. The latter approach is useful
 799 when using the mechanism to reject unsolicited junk mail and mail bombs,
 800 because it normally prevents the sender from trying again with the same
 801 message.
 802
 803
 804 <A NAME="16"><H2>18. SMTP port reservation</H2></A>
 805
 806 <P>
 807 The maximum number of simultaneous incoming SMTP calls can be set, and in
 808 addition, a number of them can be reserved for particular hosts or particular
 809 IP networks. It is also possible to specify a system load value above which
 810 only calls from the reserved hosts are accepted.
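
<P>
One way to picture the reservation logic is the following sketch. The
function and parameter names are hypothetical, and the exact interaction
between the limits is an assumption made for the example.

```python
import ipaddress


def accept_call(peer_ip, active_calls, max_calls, reserved_slots,
                reserved_networks, load, reserve_load):
    """Decide whether to accept an incoming SMTP call (illustrative only)."""
    addr = ipaddress.ip_address(peer_ip)
    is_reserved = any(addr in ipaddress.ip_network(net)
                      for net in reserved_networks)
    if active_calls >= max_calls:
        return False  # the overall limit applies to every host
    if is_reserved:
        return True
    if load > reserve_load:
        return False  # under high load only reserved hosts are accepted
    # Other hosts may not use the slots held back for reserved hosts.
    return active_calls < max_calls - reserved_slots
```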
 811
 812
 813 <A NAME="17"><H2>19. Control of relaying</H2></A>
 814
 815 <P>
 816 A host is said to act as a relay if it accepts an incoming message from an
 817 external host and delivers it to an external host. Unscrupulous persons have
 818 been known to use unsuspecting hosts as relays in an attempt to disguise the
 819 origin of messages. An Exim host can be configured to accept mail from any
 820 host for onward transmission to a specified set of domains only, and to accept
 821 mail only from a specified list of hosts or networks for onward transmission
 822 to any domain.
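
<P>
The two relay conditions can be combined as in this sketch. It is not Exim's
implementation; relay_permitted and its parameters are hypothetical names, and
domains are compared as exact lower-case strings for simplicity.

```python
import ipaddress


def relay_permitted(sender_ip, recipient_domain, local_domains,
                    relay_domains, relay_networks):
    """Return True if a message from sender_ip may go to recipient_domain."""
    domain = recipient_domain.lower()
    if domain in local_domains:
        return True  # delivered locally, so this is not relaying
    if domain in relay_domains:
        return True  # any host may send to the specified set of domains
    # Otherwise only the listed hosts or networks may relay to any domain.
    addr = ipaddress.ip_address(sender_ip)
    return any(addr in ipaddress.ip_network(net) for net in relay_networks)
```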
 823
 824
 825 <A NAME="18"><H2>20. Sender verification</H2></A>
 826
 827 <P>
 828 The return path of a message (also known as the 'envelope sender') is used
 829 when Exim has to return an error message. If this is a bad address, the error
 830 message cannot be delivered, and the postmaster has to sort things out.
 831
 832 <P>
 833 Sender verification (a configurable option that applies to SMTP input) is
 834 intended to pass this work to a foreign postmaster, by refusing to accept the
 835 message in the first place. There is an exception list which can specify
 836 certain hosts (with optional RFC 1413 identifications) that are allowed to
 837 bypass the check.
 838
 839 <P>
 840 There are two main causes of bad return paths: misconfigured mailers (gateways
 841 in particular), and users fooling around with mail. Sadly, the latter are
 842 rather common in educational institutions. Sender verification catches both of
 843 them. It operates by passing the sender address through the directors and
 844 routers in verification mode; if this fails, the message is not accepted.
 845
 846 <P>
 847 The first thing foreign postmasters ask when they learn about a rejected
 848 message is 'What were the headers?'. For this reason, and also to collect
 849 evidence in cases of mail forgery, Exim does not initially reject a message
 850 after the MAIL FROM command in the SMTP session. It reads the message, so as
 851 to be able to write the headers to the rejection log, and then gives a hard
 852 error response to the sending host.
 853
 854 <P>
 855 Unfortunately, several mailers believe that any error response after the data
 856 for a message has been sent indicates a temporary error. Consequently, such
 857 mailers will continue to try to send a message that has been rejected as
 858 described above. To prevent this, whenever a message is rejected, Exim records
 859 the time, bad address, and host in a DBM database. If the same host sends the
 860 same bad address within 24 hours, it is rejected immediately at the MAIL FROM
 861 command.
 862
 863 <P>
 864 Sadly, even this doesn't stop some mailers from repeatedly trying to send the
 865 message. As a last resort, if the same host sends the same bad address for a
 866 third time in 24 hours, the MAIL FROM command is accepted, but all subsequent
 867 RCPT TO commands are rejected. If this does not stop a remote mailer then it
 868 is badly broken.
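
<P>
The escalation described in the last three paragraphs can be sketched as a
small state machine. A dictionary stands in for the DBM database, the names
are hypothetical, and the 24-hour window is assumed to be measured from the
most recent rejection, which the text does not specify.

```python
DAY = 24 * 60 * 60


def reject_stage(db, host, sender, now):
    """Record a rejection and return the point at which to reject.

    db maps (host, sender) to (time of last rejection, count).
    Returns "after_data" for a first rejection (the message is read so
    that its headers can be logged), "mail_from" for a second one within
    24 hours, and "rcpt_to" for a third or later one.
    """
    key = (host, sender)
    last, count = db.get(key, (None, 0))
    if last is None or now - last >= DAY:
        count = 0  # no recent record: start again
    count += 1
    db[key] = (now, count)
    if count == 1:
        return "after_data"
    if count == 2:
        return "mail_from"
    return "rcpt_to"
```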
 869
 870 <P>
 871 If the attempt to verify the sender address cannot be completed (typically
 872 because of a DNS timeout) Exim gives a temporary error code to the MAIL FROM
 873 command, which should cause the remote mailer to try again later. However, it
 874 is possible to configure Exim to accept the message in these circumstances.
 875
 876 <P>
 877 Many messages with bad return paths in fact contain perfectly valid 'From' or
 878 'Reply-to' headers. For administrators who want a quieter life, there is a
 879 configuration option which causes Exim to check these headers if the return
 880 path is bad, and if a good address is found, to use it to replace the return
 881 path. The old value is retained in an X- header.
 882
 883
 884 <A NAME="19"><H2>21. Sender lock out</H2></A>
 885
 886 <P>
 887 More and more unsolicited junk mail is being seen on the Internet. It is
 888 sometimes useful to be able to reject messages (from any host) with particular
 889 sender addresses in the envelope. Exim can be configured to reject messages
 890 whose sender addresses match certain patterns, either by failing the MAIL FROM
 891 command, or (because some mailers take no notice of that) by failing all RCPT
 892 TO commands.
 893
 894
 895 <A NAME="20"><H2>22. Receiver verification</H2></A>
 896
 897 <P>
 898 Exim can be configured so that it checks the addresses given in incoming SMTP
 899 RCPT TO commands as they are received. A failing address can be immediately
 900 rejected, or it can be logged and accepted. If verification cannot be
 901 completed (typically because of a DNS timeout) either a temporary error code
 902 can be given, or the address can be logged and accepted.
 903
 904
 905 <A NAME="21"><H2>23. The 'percent hack'</H2></A>
 906
 907 <P>
 908 The so-called 'percent hack' is the feature of mailers whereby a local-part
 909 containing a percent sign gets interpreted as an entire new address, with the
 910 percent replaced by @. This is used for explicit mail routing and sometimes
 911 for testing. In Exim, it is possible to configure which local domains, if any,
 912 allow the 'percent hack'.
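
<P>
A minimal sketch of the transformation, assuming that the rightmost percent
sign is the one replaced. The function name and the set of permitted domains
are hypothetical.

```python
def apply_percent_hack(address, percent_hack_domains):
    """Reinterpret local%domain@here as local@domain where permitted."""
    local, sep, domain = address.rpartition("@")
    if not sep or domain.lower() not in percent_hack_domains:
        return address  # not a local domain that allows the hack
    new_local, pct, new_domain = local.rpartition("%")
    if not pct or not new_local or not new_domain:
        return address  # no usable percent sign in the local-part
    return new_local + "@" + new_domain
```

<P>
With hub.example in the permitted set, user%other.example@hub.example is
treated as user@other.example; for any other domain the address is left as it
is.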
 913
 914
 915 <A NAME="22"><H2>24. Security</H2></A>
 916
 917 <P>
 918 Exim is written as a single binary that has to run setuid to root. I did start
 919 off trying to write it as a number of different modules, but soon came to the
 920 conclusion that, for this type of mailer, it was not worth it, because the
 921 functions don't decompose cleanly. For example, if you want to verify
 922 addresses while receiving mail you need all the directing and routing
 923 apparatus to be available.
 924
 925 <P>
 926 Exim runs each local delivery in a separate process which is setuid to the
 927 relevant local user. In addition, it can be configured to run under a given
 928 non-root uid (and gid) for much of the rest of the time. In particular, it
 929 need not be root while sending or receiving SMTP mail. On systems that do not
 930 have the seteuid function, it uses setuid to give up root, which requires it
 931 to re-invoke itself in order to regain the privilege when it needs to deliver
 932 a message. On systems that do have seteuid, it can be configured to use that
 933 function instead, thereby saving some resources.
 934
 935 <P>
 936 Exim can be configured to use seteuid (on systems that have it) when reading a
 937 .forward file in a user's home directory. This is necessary when home
 938 directories are NFS mounted without root privilege, unless .forward files are
 939 required to be world readable.
 940
 941 <P>
 942 Exim checks the permissions and owners of files to which messages are to be
 943 appended, and refuses to proceed with the delivery if things are not right.
 944
 945 <P>
 946 Delivery of messages to pipes or files is supported only as a result of
 947 expanding an address via an alias or a forward file, provided this is
 948 permitted by the configuration. Externally generated local addresses cannot
 949 specify files or pipes - no special action is taken for addresses starting
 950 with the file or pipe characters, so they will usually fail.
 951
 952 <P>
 953 Use of the VRFY function in SMTP connections is controlled by a configuration
 954 option. The EXPN and DEBUG functions are not supported at all.
 955
 956
 957 <A NAME="23"><H2>25. The Exim Monitor</H2></A>
 958
 959 <P>
 960 A program for monitoring Exim and displaying information in an X window is
 961 provided. This can be configured to show stripcharts of incoming and outgoing
 962 mail in various categories. It also shows a 'tail' of the main log file, and
 963 information about messages on the queue.
 964
 965 <P>
 966 There is a menu of operations that can be performed by suitably privileged
 967 users. Messages can be frozen, thawed, deleted, caused to be delivered,
 968 modified, or returned to their senders from this interface.
 969
 970
 971 </BODY>
 972 </HTML>