#ident	"@(#)smail:ToDo,v 1.65 2001/07/30 18:45:46 woods Exp"

Things that should be done before the next minor release (patches are,
of course, gratefully accepted!):

Important Bugs:
--------------------

- DO NOT RUN SMAIL AS ROOT!!!!

  + the daemon will setuid(nobody:nogroup) and re-exec itself after it
    has a file descriptor already bound to port 25 (it will have to be
    started by root on most systems, of course)

  + the main smail/sendmail binary will be setgid-smail (not "mail") so
    that it can write to the queues.

  + local delivery to spool files will be done by a separate setgid-mail
    agent ala *BSD mail.local, with kernel locking where possible,
    *.lock files only where absolutely necessary.

  + initial spool file creation, if necessary, will be done by a tiny
    setuid-root helper on systems that have a root-only chown(2) [one
    exists somewhere already -- search the net]; they will be owned by
    the user, group 'mail', and mode 660.  The spool directory will be
    mode 555 if kernel locking is possible, else 575 & group 'mail' if
    necessary for *.lock files (and owned by root, of course).

  + all mail readers will be expected to either use a small helper to
    safely copy the spool file to a private place (ala movemail, which
    will use kernel file locking when possible or be setgid-mail and use
    *.lock files if necessary), or to use kernel file locking to access
    the spool file (and hopefully copy it away to a private place while
    the user does his/her thing); mail readers (including movemail) will
    be "encouraged" to keep the spool file after emptying it; and setgid
    mail readers will be very Very VERY strongly warned against

  + .forward files will have to be world (or at least group 'smail')
    readable, *or* "Forward to" support can be used.

  + delivery to files and pipes done only as nobody:mail (courtesy the
    setgid-mail local delivery agent) regardless of where they're
    expanded from

  + access to locked files will only be attempted for a limited amount
    of time and messages will be left in the queue if delivery is
    unsuccessful because of such a lock
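
  The last point above (only wait a limited time for a locked file,
  then leave the message queued) might look roughly like the sketch
  below.  It assumes fcntl(2) record locks; lock_with_timeout() and its
  return codes are hypothetical names, not existing smail functions.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to take an exclusive fcntl(2) lock on fd,
 * retrying once a second for at most max_wait seconds.
 * Returns 0 when locked, 1 if the file is still locked by someone else
 * after the time limit (the caller would leave the message in the
 * queue), and -1 on any other error.
 */
int lock_with_timeout(int fd, int max_wait)
{
    struct flock fl;
    time_t deadline;

    assert(fd >= 0 && max_wait >= 0);
    deadline = time(NULL) + max_wait;

    for (;;) {
        memset(&fl, 0, sizeof(fl));
        fl.l_type = F_WRLCK;        /* exclusive lock ... */
        fl.l_whence = SEEK_SET;
        fl.l_start = 0;
        fl.l_len = 0;               /* ... covering the whole file */
        if (fcntl(fd, F_SETLK, &fl) == 0)
            return 0;
        /* POSIX allows either errno when another process holds the lock */
        if (errno != EAGAIN && errno != EACCES)
            return -1;
        if (time(NULL) >= deadline)
            return 1;
        sleep(1);
    }
}
```

  A delivery agent would call this right after opening the spool file
  and treat a return of 1 as "defer", not as a delivery failure.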

- make sure TXT records returned from RBL lookups, as well as strings
  included in smtp_hello_reject_hosts, cannot violate SMTP message
  response rules.

- similarly always run raw network data through strvis(3) before sending
  it back out to the network in the likes of an SMTP response message,
  or before logging it.  Borrow strvis(3) from 4.4BSD for other types of
  systems.
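
  As an illustration of the idea only: a CR or LF hidden in a TXT
  record or HELO string could end an SMTP reply line early, so such
  bytes need to be made visible before they are echoed or logged.  The
  sketch below is a simplified stand-in, NOT the 4.4BSD strvis(3)
  code; visible_copy() is a hypothetical name and it only knows the
  \ooo octal style.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Simplified stand-in for strvis(3): copy src to dst, replacing every
 * byte outside printable ASCII, and the backslash itself, with a
 * four-character \ooo octal escape.
 * dst must have room for 4 * strlen(src) + 1 bytes.
 * Returns the number of characters written, not counting the NUL.
 */
size_t visible_copy(char *dst, const char *src)
{
    char *d = dst;
    const unsigned char *s = (const unsigned char *) src;

    assert(dst != NULL && src != NULL);
    for (; *s != '\0'; s++) {
        if (*s >= 0x20 && *s < 0x7f && *s != '\\') {
            *d++ = (char) *s;               /* safe to pass through */
        } else {
            d += sprintf(d, "\\%03o", (unsigned int) *s);
        }
    }
    *d = '\0';
    return (size_t) (d - dst);
}
```

  With this, the input "ok<CR><LF>250 x" comes out as the single line
  "ok\015\012250 x" and can no longer fake a second reply line.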

- do strip quotes from quoted local parts....  maybe.

- think about ways to avoid doing DNS lookups on all Xsucceed and Xfail
  addresses when a mailing list or other multiple-recipient message is
  retried from the queue [this is fixed now?].

- fix bug where qualify_domain() isn't called for addresses specified on
  the command line (i.e. it only seems to be called when '-t' is used)

- use ftruncate() to remove partially written messages in appendfile.c
  if ERR_135 [if possible].
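
  A rough sketch of the ftruncate() idea: remember where the file
  ended before appending, and cut it back to that length if the write
  does not complete.  append_message() is a hypothetical helper, not
  the code in appendfile.c, and it assumes the caller already holds
  whatever mailbox lock is in use.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: append len bytes from buf to the file at path.
 * If the write fails part way through, truncate the file back to the
 * length it had before we started, so no partial message is left.
 * Returns 0 on success, -1 on failure.
 */
int append_message(const char *path, const char *buf, size_t len)
{
    int fd;
    off_t start;
    size_t done = 0;
    int ok = 1;

    assert(path != NULL && buf != NULL);
    fd = open(path, O_WRONLY | O_CREAT, 0600);
    if (fd < 0)
        return -1;
    start = lseek(fd, 0, SEEK_END);     /* remember the old end of file */
    if (start == (off_t) -1) {
        close(fd);
        return -1;
    }
    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);
        if (n < 0 && errno == EINTR)
            continue;                   /* interrupted, just retry */
        if (n <= 0) {
            ok = 0;
            break;
        }
        done += (size_t) n;
    }
    if (!ok && ftruncate(fd, start) != 0) {
        /* could not roll back; the caller still gets -1 and must report it */
    }
    if (close(fd) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```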

- investigate extra <>'s in received for bounces (and other places):

  Received: from most.weird.com (4544 bytes) by most.weird.com
	via sendmail with P:bsmtp/D:user/T:local
	(sender: <MAILER-DAEMON>) (ident <MAILER-DAEMON> using unix)
	id <m0x2LJo-00076wC@most.weird.com>
	for <<woods>>; Sat, 23 Aug 1997 14:53:52 -0400 (EDT)
	(Smail-3.2.0.98-Pre 1997-Aug-19 #7 built 1997-Aug-20)

  Extra <>'s appear in message log entries too... most of the time.

- deal with un-qualified local hostnames when there's no qualify file in
  some sane way....  (the qualify.c stuff is perhaps overloaded and
  shouldn't be used to qualify both local names in outgoing headers at
  the same time as being used to qualify destination hostnames).

- fix "from_field" to never allow "From:" to go missing and if it's nil
  do something appropriate....

- fix db lookup parser to allow '#' in left-hand side (if quoted?) [aliasfile]

- investigate the Apparently-To: being set, while input_addr not being
  set: (no To/Resent-To/Cc/Bcc, etc., header in data, just envelope
  "MAIL FROM:")
	
	Received: from [204.92.254.3] by most.weird.com
		via sendmail with smtp (ident woods using rfc1413)
		id <m0udnxb-00076qC@most.weird.com>
		for <unknown>; Tue, 9 Jul 1996 21:20:59 -0400 (EDT)
		(Smail-3.2 1996-Jul-4 #1 built 1996-Jul-4)
	Apparently-To: foo@anet

  perhaps $input_addr should be set from envelope (always?).

- investigate smail vs. MH via SMTP and BCC.  Seems the BCC line can end
  up in the initial Received header.  The exact address that appears in
  the first Received header will vary if there are multiple destination
  addresses.

    Received: from woffi.planix.com([204.29.161.34]) (1436 bytes) by whome.planix.com
	via sendmail with P:esmtp/D:aliases/R:inet_hosts/T:smtp
	(sender: <andreas@planix.com>) 
	id <m0x3NUM-0008NDC@whome.planix.com>
	for <partners@planix.com>; Tue, 26 Aug 1997 11:25:02 -0400 (EDT)
	(Smail-3.2.0.97 1997-Aug-19 #2 built 1997-Aug-25)
    Received: from localhost.planix.com(localhost[127.0.0.1]) (1104 bytes) by woffi.planix.com
	via sendmail with P:esmtp/R:inet_hosts/T:smtp
	(sender: <andreas@planix.com>) 
	id <m0x3NUL-000EExC@woffi.planix.com>
	for <customers_hidden@planix.com>; Tue, 26 Aug 1997 11:25:01 -0400 (EDT)
	(Smail-3.2.0.97 1997-Aug-19 #2 built 1997-Aug-19)
    To: customers@planix.com (to /dev/null), partners@planix.com (an alias)
 >> Dcc: customers_hidden@planix.com (a private alias to everyone)

  Note also that MH uses 'Dcc' instead of 'Bcc' for normal (direct)
  blind carbon and that this header may not be stripped either!

- do something about the premature lower-casing of user names.  Users
  with upper case characters may not be able to receive mail (or at
  least read the stuff they've received....)  The correct solution is
  probably to provide another field in struct addr in which the
  un-adulterated user-id can be stored for use in the "local"
  transport's filename expansion.  I.e. the "user" director, with the
  'ignore-case' attribute set, will do a caseless match of the user-id
  against the mailbox portion of the address, and then the actual
  user-id with case preserved can be used in generating the mailbox
  spool filename.  [PR#295 notes that getpwbyname() in pwcache.c
  explicitly lowercases the user name passed to it before doing a
  getpwnam() search.  The PR suggests removing this lowercasing (so
  that the case is preserved in the cache) while still doing a
  case-insensitive search through the password file.  However it
  doesn't pay heed to the ignore-case attribute, nor does it provide for
  storing the case-preserved user-id in struct addr.]
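
  The matching step described above, reduced to its core: compare
  without regard to case, but hand back the name exactly as stored so
  that it can be used for the mailbox filename.  This is only a sketch
  over an in-memory list; match_user_caseless() is a hypothetical
  name, and a real version would walk the password file and honour
  the ignore-case attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>

/*
 * Hypothetical sketch: find `wanted` in a list of login names using a
 * caseless comparison.  Returns the stored, case-preserved name, or
 * NULL if there is no such user.
 */
const char *match_user_caseless(const char *const *names, size_t n,
                                const char *wanted)
{
    size_t i;

    assert(names != NULL && wanted != NULL);
    for (i = 0; i < n; i++) {
        if (strcasecmp(names[i], wanted) == 0)
            return names[i];    /* original spelling, not the query's */
    }
    return NULL;
}
```

  So mail for "jsmith" or "JSMITH" would both find the account stored
  as "JSmith", and "JSmith" is what goes into the spool filename.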

- fix aliasfile parser to allow case sensitive aliases (ala above?)
  [keep in mind the lists director uses "lists/${lc:user}"]

- turn down the verbose logging of failed locks if another smail
  process is known to hold the lock....  eg:

	02/28/96 12:07:36: open_spool: /local/var/spool/smail/input/0trpIB-00076nC: lock failed: Permission denied

  Unfortunately this will probably require re-writing the spool locking
  functions to use pid-in-a-lock-file mechanisms.  [effectively fixed in
  3.2.1 for systems that return EAGAIN if lock_fd() meets another lock?]
  [it has been noted that there may be real race conditions in here!]
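
  One possible way to tell "another process holds the lock" apart from
  a real failure without pid-in-a-lock-file tricks, assuming the spool
  locks are fcntl(2) record locks (that is an assumption about
  lock_fd(), it may well use something else on some systems):
  F_GETLK reports the pid of the process in the way.  lock_holder()
  is a hypothetical name.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel who, if anyone, holds an
 * fcntl(2) lock that would stop us locking the whole of fd.
 * Returns the holder's pid, 0 if nothing is in the way, -1 on error.
 * Locks held by the calling process itself never count as conflicts.
 */
pid_t lock_holder(int fd)
{
    struct flock fl;

    assert(fd >= 0);
    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;        /* "could I take an exclusive lock?" */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* whole file */
    if (fcntl(fd, F_GETLK, &fl) == -1)
        return (pid_t) -1;
    if (fl.l_type == F_UNLCK)
        return 0;               /* nobody is in the way */
    return fl.l_pid;            /* a process holding a conflicting lock */
}
```

  A non-zero pid could then be logged at debug level only, keeping the
  loud log entry for the cases where no holder can be found.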

- check out re-writing From: if from '-f'

- Make sure "From:" and "To:" are always generated correctly for all
  locally originating mail and never for anything else.

- check out what's going on with Apparently-From being added multiple
  times [Apparently-From should be gone from 3.2.1].

- stop smail from generating those horrible Apparently-* headers now
  that the envelope is completely available in the default received
  header [Apparently-From should be gone from 3.2.1].


Incomplete Features:
--------------------

- add a command-line flag for "mailq" (et al) to print the error queue!

- don't allow bogus A RR's to "match" (0, 255.255.255.255, 127/8,
  RFC-1918 addresses etc.).  Probably need to provide a config variable
  that contains a list of "bogus" addresses.  Perhaps names given in
  HELO would skip this check if the client address matches in
  smtp_remote_allow or something.  What about names for internal MX
  hosts though?  How do we know if they're "internal"?

- When command-line recipients are given, but there's no "To:" header,
  add one like "To: undisclosed-recipients:;", just as Postfix does.

- add a new flag "permit_mx_backup" to do what Postfix does.  Perhaps
  instead add a variable "authorised_backup_mxs" in which the domain
  names of authorised MXs could be listed.  Maybe even the hosts and/or
  networks where authorised primary MXs live could be listed in
  something like "authorised_backup_for_mxers".

- think (but not too hard) about adding a (per-RBL with above) recipient
  exception list to allow individual recipients to negate rejection of
  snmp_session_denied connections.  Note that this unfortunately means
  inventing a mechanism to delay rejection to RCPT-TO time if this
  feature is used.

- if a reject delay mechanism is implemented then add
  smtp_host_reject_recipients instead of changing the way any of the
  non-per-recipient rejects (eg. RBL, paranoid, hosts, etc.) work.  This
  would list the IP#s of hosts that explicitly should not be rejected
  until RCPT-TO time.  This list would be used by a postmaster who's
  getting spammed by a stupid broken mailer that does not honour 5xy
  rejects at HELO time, such as L-soft's LSMTP (instead of adding them
  to a firewall rule, for example).  This would include an optional
  text message after a ';' (such as with smtp_reject_hosts) so that the
  postmaster can tell the remote user how stupid and broken their damn
  spamming mailer is!  ;-)

- try to include the "ORIG-TO:" field in the logs when a message bounces
  (i.e. in the "Failed" log entry) -- otherwise it's almost impossible
  to see what the input address was.

- think about adding a config variable to allow a much more restricted
  number of recipients (eg. 1! :-) to be specified when the sender is
  "<>" (i.e. don't allow fake bounces).

- this happens when multi-homed and connecting to a sibling in the same
  alternate network....

	07/07/1999 17:47:12: [4931] remote EHLO: questionable operand: 'becoming.weird.com': from root@becoming.weird.com source [204.29.161.180]: Remote address PTR lookup failed (Unknown host).

  This would probably be fixed by always greeting with the name
  matching our actual source address, but getting that is "hard", and
  would re-invent/remove the meaning of 'primary_name'.

- fully support $max_message_size [perhaps also add a new option
  $truncate_oversize_bounce or similar with default ON].

- think about allowing $listen_name to be set on command line too [if
  this is used for more than one domain then you'll need separate config
  files anyway, so just use -C; but if you are using this to avoid having
  SMTP on some interfaces then this info may be easier to manage in one
  place in the /etc/rc* files or whatever].

- do something to make aliasfile parsing identical across lookup protos.
  (related to 'db lookup parser' bug above?)

- implement 'mailq' to follow through on '-t' option (i.e. read header)

- have 'mailq' print "Mail queue is empty" when it is (isatty()?) ala sendmail

- Put the following in default.c for SVR4's local, pipe, & file transports:

	remove_header="Content-Length",
	append_header="${if !header:Content-Type :Content-Type: text}",
	append_header="Content-Length: $body_size",

- think about how to integrate checkerr and savelog so that security
  violations can be snarfed from logfile just after it is cycled.
  Perhaps a new over-all maintenance script (smailmaint?) could do the
  work and there would only be one crontab entry necessary.  Note that
  there's no need to use the antiquated savelog on systems that have a
  newsyslog(1) capable of not compressing the .0 file (eg. my version!).
  [syslog logging would also change all of this since then security
  violations will get higher priority from syslog if the admin so
  desires...]

- add an "always" attribute to the directors drivers, esp. aliasfile.

- add 'senders' and 'senders_except' attributes to directors and routers
  to implement restricted aliases, transports, etc.

- fix match_ip() to do a last-match-wins algorithm and to support
  pattern negation (eg. "192.168.*:!192.168.1.*" would match any Class-C
  subnet in 192.168/16, except 192.168.1/24).

  maybe steal the code from BIND (v8?) for address_match_list:

     Elements can be negated with a leading exclamation mark ("!"), and the
     match list names "any", "none", "localhost" and "localnets" are
     predefined.
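
  To make "last match wins" concrete, here is a small sketch using
  glob-style patterns via fnmatch(3).  It is not the existing
  match_ip() and not BIND's address_match_list code;
  match_ip_last_wins() is a hypothetical name and it knows nothing
  about CIDR notation or the predefined list names.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical sketch: patterns is a colon-separated list of glob
 * patterns, each optionally prefixed with "!" to negate it.  Every
 * element is tried in order and the LAST one whose pattern matches
 * addr decides the result.
 * Returns 1 for a match, 0 for no match, -1 if out of memory.
 */
int match_ip_last_wins(const char *addr, const char *patterns)
{
    char *copy, *elem, *next;
    int result = 0;             /* nothing matched yet */

    assert(addr != NULL && patterns != NULL);
    copy = malloc(strlen(patterns) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, patterns);

    for (elem = copy; elem != NULL; elem = next) {
        int negate = 0;

        next = strchr(elem, ':');
        if (next != NULL)
            *next++ = '\0';     /* terminate this element */
        if (*elem == '!') {
            negate = 1;
            elem++;
        }
        if (*elem != '\0' && fnmatch(elem, addr, 0) == 0)
            result = !negate;   /* a later match overrides an earlier one */
    }
    free(copy);
    return result;
}
```

  With "192.168.*:!192.168.1.*" the address 192.168.2.5 matches and
  192.168.1.5 does not; swap the two elements and 192.168.1.5 matches
  again, which is exactly why the order has to be documented.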

- think about allowing hostnames in match_ip() by doing a reverse lookup
  on the address and matching the resulting PTR(s) with any hostname
  patterns [regex's too, or just glob(3), or just domain suffixes?].
  Remember to always do the safe thing when no PTR is found --
  i.e. return a code saying that a test was not possible (either
  temporary error indicator if DNS times out, or permanent if
  authoritative NXDOMAIN) and let the calling code do the "safe"
  thing (eg. reject a relay attempt).

- think about making smtp_remote_allow and other users of match_ip()
  capable of specifying a file lookup mechanism in a list element:

       smtp_remote_allow="localnet:10/8:192.168/16:\
		${lookup:sender_host_addr:ipsearch:{
				/etc/smail/remote.allow}:$value}"

  where "ipsearch" iterates the [new] match_ip() function over all the
  values in the file.  (does this mean keeping the double compare?)
  (the file should probably be cached in-core and treated as a list if
  it's not too big).  See next item too about how to specify the
  variable containing the value being searched for instead of magically
  knowing what it is as in the above example.

- think about fixing parsing of all list-style variables so that they
  can optionally include an element that is run through expand_string(),
  in particular so that ${lookup can be used.  The trick here is in
  making sure there's some way to always specify the value being
  searched for in the list.  Perhaps a common pseudo-variable
  (eg. $value) should be made available by all the routines that search
  lists like this....

- Think about splitting lsearch and USE_LSEARCH_REGEXCMP into a plain
  old lsearch and a new "research" (is this a bad name? ;-) [JPR
  suggests "grep", so how about "grepsearch"?] for straight RE linear
  searches.  Think about not using double quotes to trigger the RE match
  in "grepsearch", but rather doing it for every key value.  Think about
  a combined lsearch+grepsearch that would do what lsearch+REGEXCMP does
  now with the double-quote trigger, but of course also not require the
  double-quote trigger.

- adjust the error messages in config file parsing to include at least
  the line number, and anything else helpful, not just:

	05/07/1997 15:40:59: /local/etc/smail/config: parse error: unexpected end of attribute

- think about adding eqic{, ltic{, gtic{ operators that unify the case
  of their arguments before testing.

- think about changing the "var" portion of the eq{ et al operators to
  be a fully expanded value, not just a variable name (which would make
  the eqic{ et al operators suggested above effectively redundant).

- add support for Kiem-Phong Vo <kpv@research.att.com> Vmalloc library,
  particularly debugging support [partly done].  Also add hooks to build
  with sfio (i.e. without the stdio layer).

- document ${eval: if it turns out to be useful.

- re-write aliasfile.c in the style of the fwdfile.c with a finish_*()
  function, etc.

- add someone's regex library to pd/regex (Ozan's?, PCRE?) and use that
  if the code's not ported to the current system's equivalent (or always
  use it?)

- figure out how to do the configuration for per-transport (or
  even per-target?) relaying control.

- pass a flag to fill_attributes() so that it can print a more
  meaningful error message that indicates if an unknown attribute is
  expected to be either a generic attribute, or a driver-specific
  attribute (possibly either the word "generic" or the driver name).

- the error message string returned by parse_header() doesn't indicate
  which header line the problem was with, never mind which specific
  address in the case of address parsing problems.

- implement optional $max_mailbox_size [optionally as a colon separated
  list of "user=size" tokens with something like '*' as the default user
  and "nolimit" to unset per user].  Or use ${lookup}?
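
  A sketch of how the "user=size" list form could be parsed, assuming
  sizes are plain byte counts and that an exact user entry beats the
  '*' default.  mailbox_limit() and NO_LIMIT are hypothetical names;
  none of this exists yet.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NO_LIMIT (-1L)          /* hypothetical "unlimited" marker */

/*
 * Hypothetical parser for a spec such as
 *     "*=1000000:woods=nolimit:guest=50000"
 * Returns the limit in bytes for user, or NO_LIMIT if neither a
 * matching entry nor a '*' default sets one.
 */
long mailbox_limit(const char *spec, const char *user)
{
    const char *p = spec;
    size_t ulen;
    long star = NO_LIMIT;       /* value of the '*' entry, if seen */

    assert(spec != NULL && user != NULL);
    ulen = strlen(user);
    while (*p != '\0') {
        const char *end = strchr(p, ':');
        size_t toklen = end ? (size_t) (end - p) : strlen(p);
        const char *eq = memchr(p, '=', toklen);

        if (eq != NULL) {
            size_t namelen = (size_t) (eq - p);
            size_t vallen = toklen - namelen - 1;
            long val;

            if (vallen == 7 && strncmp(eq + 1, "nolimit", 7) == 0)
                val = NO_LIMIT;
            else
                val = strtol(eq + 1, NULL, 10);   /* stops at ':' or NUL */

            if (namelen == ulen && strncmp(p, user, ulen) == 0)
                return val;     /* exact user entry wins */
            if (namelen == 1 && *p == '*')
                star = val;
        }
        p = end ? end + 1 : p + toklen;
    }
    return star;
}
```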

- add configurable reserve space for spooldirs ($min_spooldir_free?)

- be careful about never filling the logfile too (can we instantly defer
  connections if we're out of resources like this?)

- try to ensure all variables are run through expand_string().

- add a way to suppress warnings for smtp_helo_broken_allow.

- Microsoft IDIOTS:

  220 exchange1.ACC.WORKFORCE.COM Microsoft ESMTP MAIL Service, Version: 5.0.2195.1600 ready at  Tue, 6 Feb 2001 13:05:01 -0800 
  EHLO proven.weird.com
  250-exchange1.ACC.WORKFORCE.COM Hello [204.92.254.15]
  250-TURN
  250-ATRN
  250-SIZE
  250-ETRN
  250-PIPELINING
  250-DSN
  250-ENHANCEDSTATUSCODES
  250-8bitmime
  250-BINARYMIME			# "not supported"
  250-CHUNKING
  250-VRFY				# "not supported"
  250-X-EXPS GSSAPI NTLM LOGIN
  250-X-EXPS=LOGIN
  250-AUTH GSSAPI NTLM LOGIN
  250-AUTH=LOGIN
  250-XEXCH50
  250-X-LINK2STATE
  250 OK				# "not supported"
  quit
  221 2.0.0 exchange1.ACC.WORKFORCE.COM Service closing transmission channel

- checkerr should maybe try to find the original message-id for double
  bounces and look for related log entries for it too, then we could see
  right in its report the original source of the failing message.


New Features:
-------------

These are primarily things that should wait for the next major release.

- Think about a config variable (maybe $log_events?) that could
  control which items are logged and which are not [or wait for
  syslog support?]

- make the startup log message more verbose (version, build, build date,
  release date, etc.) [use $smtp_banner ???]

- write a minimal mailstats replacement (new log file format only)
  [real stats, not just what logsumm does]

- implement 'mailq' option to read the "error" queue (mailq -e?)

- implement '-R'

     -Rstring	    Go through the  queue  of  pending	mail  and
		    attempt  to	 deliver any message with a reci-
		    pient containing the specified string.   This
		    is useful for clearing out mail directed to a
		    machine which has been down for awhile.

- implement ETRN from RFC 1985 ala the above (patch already available,
  but needs some performance enhancements and support for '-R').

- implement other standards-track SMTP extensions....

- possibly make the daemon children change their ps command line text to
  show what they are currently doing (on systems where this is possible)

- teach substitute() to recognize the variable names listed in
  conf_attributes, etc.(?)


Miscellaneous:
--------------------

- add #ifdef HAVE_UNISTD_H #include <unistd.h> where appropriate [or
  wait for autoconf?].

- remove nested includes from "jump.h" [and everywhere!].

- think about getting <string.h> out of defs.h [or wait for autoconf?]

- investigate this weird log message fragment:

	ORIG-ID:<199604230758.AA13625@post.tandem.com\POS,$ZNET^U5>

  (possibly related: what'll happen if a message-ID header has other
  crap, and even continued lines, in it too?)

- think about doing something to allow an alias to be used to force a
  "no-such-user" bounce.

- install ".so" (soelim) manual pages with their full longer names on
  systems with longnames [need to fix up xrefs too?]

- should we add IsValid*() checking?  from:
  <URL:ftp://ftp.cert.org/pub/cert_advisories/CA-96.04.corrupt_info_from_servers>

- read draft-ietf-drums-smtpupd-04.txt [or newer] more carefully.

- think about not stripping comments from aliases, etc., and providing
  GCOS info; esp. for EXPN and VRFY, perhaps re-using smtp_info to
  control.

- Should the "real_user" director set ignore_alias_match?

- consider allowing multiple whitespace characters to act as one when
  separating words in a string parsed by expand_string().

- think about the possible benefits of having separate DBG_DRIVER types
  for each of the different kinds of drivers (router, director,
  transport).

- clean up the duplication between COPY_STRING() and copy().

- an interval of '1y' prints as '52w1d5h45m36s'.
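
  For what it's worth, the arithmetic checks out if '1y' is taken as
  365.24 days: 365.24 * 86400 = 31556736 seconds, which is 52 weeks
  (31449600) + 1 day (86400) + 5 hours (18000) + 45 minutes (2700) +
  36 seconds.  That smail really defines a year this way is an
  assumption drawn from the numbers, not from the source.  The sketch
  below reproduces the breakdown; format_interval() is a hypothetical
  name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical formatter: write secs into buf as weeks, days, hours,
 * minutes and seconds, leaving out units whose count is zero.
 * Output is truncated to fit size.  Returns buf.
 */
char *format_interval(unsigned long secs, char *buf, size_t size)
{
    static const struct {
        unsigned long len;      /* unit length in seconds */
        char tag;
    } units[] = {
        { 604800UL, 'w' }, { 86400UL, 'd' }, { 3600UL, 'h' },
        { 60UL, 'm' }, { 1UL, 's' }
    };
    size_t i;

    assert(buf != NULL && size > 0);
    buf[0] = '\0';
    for (i = 0; i < sizeof(units) / sizeof(units[0]); i++) {
        unsigned long count = secs / units[i].len;

        secs %= units[i].len;
        if (count > 0) {
            size_t used = strlen(buf);  /* always < size */

            snprintf(buf + used, size - used, "%lu%c", count, units[i].tag);
        }
    }
    if (buf[0] == '\0')
        snprintf(buf, size, "0s");
    return buf;
}
```

  Whether the fix is to print such a value back as '1y', or to stop
  accepting 'y' at all, is a separate question.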
