

		T R O U B L E S H O O T I N G



If you run into any problem during installation or when using this
package, please first read the following text and all other relevant
documentation. In particular, consult your server's documentation if
you have problems setting up your server. Also refer to your network
card's user manual or to the documentation for the operating systems
of the diskless clients as appropriate. If you still can't solve the
problem on your own, you can send me an email at

		gero@gkminix.han.de

Users who speak German can write to me in German. Otherwise please
write in English. I have already received some emails in English so
poor that I wasn't even able to understand the problem, and I can't
help you in that case. Please also excuse me for not answering
questions sent by standard mail or by telephone. I just don't have
the time to deal with them.
If you decide to send me an email, please describe your problem as
exactly as possible. It usually helps to send me the relevant portions
of your configuration files (I have to pay for my internet access
myself, so please keep quotations as short as possible). With bootrom
problems in particular, it usually helps to write down the screen
output _exactly_, including but not limited to any error messages.
Also state as exactly as possible how you produced the problem so that
I can try to reproduce it on my own hardware.
Additionally, please note that I can't help you with every problem
with your server, as there are so many different systems on the
market. The same is true for problems with network cards. I just
don't have the financial means to buy every card on the market for
testing. Personally I'm using NE2000 and WD8013 cards, so I can
probably help you with those.
If you find a problem which looks like a bug in the code, I would
really appreciate a short note from you. And if you have a fix for
the bug, I would appreciate your message even more.
Besides contacting me directly, there is also a mailing list about
network booting which you can subscribe to. Send a mail with the
message 'subscribe netboot' in its body to majordomo@baghira.han.de
(the subject of the mail doesn't matter). The readers of the mailing
list should also be able to help you with any problem you might have
while setting up a diskless client. I will also announce any new
version of this netboot package on the mailing list.




Problem: My operating system OS/XY is not supported by netboot

	I would gladly provide support for every operating system on the
	market, but I don't have the resources to do this. However, if
	you want a particular operating system to be supported, you
	should get in contact with me. In any case you will have to provide
	me with a valid and licensed copy of that operating system. You are
	also invited to write your own boot loader and send it to me for
	inclusion in netboot under the terms of the GNU GPL.



Problem: While trying to build a bootrom I get a compiler error

	The installation scripts have to compile a couple of utility
	programs which are only needed while building the bootrom.
	They should compile on any Unix-type system, so if you get an
	error please report it to me, even if you are able to fix it
	yourself, so that I can include a patch in future releases.



Problem: I get an error from make saying something like "missing delimiter"

	Some of the Makefiles use ifdef's, which older make programs don't
	understand. Even some more "modern" systems like SCO Open-Server 5
	have this problem. In that case you will have to get and install GNU
	make on your system (which is the better choice anyway).



Problem: The bootrom doesn't start up at all

	Either you have a floppy in your diskette drive, or you have
	a hard disk installed with a partition marked as active and the
	bootrom has been built so that it lets the BIOS look for active
	partitions first. Both conditions make the system boot from the
	bootable medium instead of using the bootrom. Just remove the
	floppy or use fdisk to mark all partitions as unbootable (i.e.
	inactive). Alternatively you can build the bootrom so that it
	does not allow the BIOS to look for bootable partitions. The
	program which actually creates the bootrom ('makerom', which gets
	called when you run 'make bootrom') will ask you about this right
	after you select the bootrom kernel image.



Problem: The bootrom behaves strangely during startup, and may even hang
         the whole system

	If you compiled the mknbi programs on a system with big endian
	byte order (like Motorola or PPC systems), this might indicate
	that the configuration program couldn't determine the correct
	byte order. It might also be that there is a bug in the byte
	ordering code. Some systems like SPARCs also do not allow data
	accesses at misaligned addresses. 'configure' should usually find
	out about these conditions. In any case, if 'configure' is not
	able to properly detect what kind of system you are using, edit
	the file config.h by hand and try again. Please report this
	condition, and also note which system you used for installation.
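
	To compare what 'configure' detected with what the build machine
	really does, a small independent check can help. The following
	Python sketch is only an illustration and is not part of netboot;
	the function name is made up for this example. It reports the
	native byte order and shows how the same 16 bit value is laid out
	in both orders.

```python
import struct
import sys


def host_is_big_endian():
    """True if this machine stores the most significant byte first."""
    # Pack the value 1 as a 16 bit integer in native byte order and
    # look at the first byte: 0 means the high byte comes first.
    return struct.pack("=H", 1)[0] == 0


print(sys.byteorder, host_is_big_endian())

# The same 16 bit value laid out in both byte orders:
print(struct.pack(">H", 0x1234).hex())  # big endian layout
print(struct.pack("<H", 0x1234).hex())  # little endian layout
```

	If the result disagrees with the settings in config.h, that is a
	strong hint that the byte order was detected incorrectly.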



Problem: The packet driver is not able to start properly

	First check what error message the packet driver prints. Usually
	this problem is the result of an incorrect setup of the network
	card, so check that it uses an I/O address, interrupt line and DMA
	channel (if applicable) of its own, and that the packet driver
	uses the correct values. Another common problem with ethernet
	cards which use shared memory (like WD80?3 cards) is an overlap
	of this shared memory with the ROM area used by the bootrom.
	Select a different shared memory address in that case. If that's
	ok, next check that you configured the packet driver correctly
	with the bootrom configuration program. Usually the packet
	driver prints out what it expects the hardware to look like, so
	you can use this information to check your setup.



Problem: The bootrom tells me that there is not enough memory but I have
         xx megabytes installed

	This problem is a result of the fact that the BIOS starts the
	bootrom in the processor's real mode. The bootrom is therefore
	only able to access the lower 1 megabyte of memory, regardless
	of how much you installed. Of this, 384kB is reserved for ROMs
	and the video memory, so there is only 640kB left. Unfortunately
	some systems even reserve memory from these lower 640kB for
	internal BIOS data. This is called the extended BIOS data area,
	and it is known to be used on most PS/2 systems. Some other
	BIOSes also use such an extended BIOS data area, which can
	usually be switched off in the system's setup. You should
	therefore try to disable this feature. If that's not possible
	you are out of luck - sorry.
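
	The arithmetic behind these numbers can be written down as a short
	Python sketch. The function name and the 1kB size used for the
	extended BIOS data area are assumptions made for this example
	only; the real size depends on your BIOS.

```python
TOTAL_REAL_MODE_KB = 1024  # real mode can address only the first megabyte
RESERVED_UPPER_KB = 384    # ROMs and video memory


def conventional_memory_kb(ebda_kb=0):
    """Memory left for the bootrom when the BIOS takes ebda_kb for itself."""
    return TOTAL_REAL_MODE_KB - RESERVED_UPPER_KB - ebda_kb


print(conventional_memory_kb())   # without an extended BIOS data area
print(conventional_memory_kb(1))  # with a hypothetical 1kB data area
```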



Problem: The bootrom doesn't receive a bootp answer and just hangs printing
         dots

	First you should check that bootpd runs on your server or is started
	properly from inetd. Then check that the server's /etc/bootptab is
	set up correctly. In particular, the hardware address and the
	client's IP address and name have to be correct.
	Most bootp servers have the ability to write debugging information
	into a log file. Use that feature to verify that your server really
	receives bootp requests from the client's bootrom and sends out a
	valid answer. Also check for error messages in the log file. Even
	if your bootpd doesn't write into a separate log file it might use
	syslog on your system, so find the log file name from your syslogd
	configuration file and check for errors.
	If you are able to use a network tracing program like tcpdump you
	can check whether the bootrom sends out correct requests and
	whether the server answers correctly. If both look right, the
	problem is more likely to be in the bootrom, so you should create
	a new bootrom image with the packet driver debugging module
	included. You should then see the bootrom's request packets going
	out, and the server's answers coming in. If there are no packets
	coming in although you verified that the server is sending out
	correct replies, there might be a problem with your network card.
	Did you set it up correctly, and is a cable connected (no kidding,
	those things really happen)?
	If everything else fails, try to boot the diskless client with the
	intended operating system and try to access the network card
	using that operating system's tools.
	If the server is not sending out answer packets, but the bootpd
	log files indicate correct answers, it might be a problem with
	the arp setup on your server. Normally arp shouldn't be a concern
	for you. However, some older versions of bootpd for Linux had
	problems here, which could be solved by setting the kernel arp
	table manually.
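
	As a quick sanity check of a bootptab entry, the following Python
	sketch looks for the hardware address (ha) and IP address (ip)
	tags mentioned above. It is a rough illustration only: it handles
	a single-line entry, and ignores continuation lines, templates
	and quoting. The function names and the example entry (host name,
	addresses, file name) are made up.

```python
def parse_bootptab_entry(entry):
    """Split one single-line bootptab entry into host name and tags."""
    fields = [f.strip() for f in entry.strip().split(":") if f.strip()]
    name, tags = fields[0], {}
    for field in fields[1:]:
        # Each field looks like 'tag=value'; boolean tags have no value.
        key, _, value = field.partition("=")
        tags[key] = value
    return name, tags


def missing_tags(entry, required=("ha", "ip")):
    """Return the required tags that are absent or empty in the entry."""
    _, tags = parse_bootptab_entry(entry)
    return [tag for tag in required if not tags.get(tag)]


# Hypothetical entry, for illustration only.
example = "client1:ht=ethernet:ha=0000C0A1B2C3:ip=192.168.1.10:bf=bootimage:"
print(parse_bootptab_entry(example)[0], missing_tags(example))
```

	An empty list means both tags are present. It does not mean the
	values are right, so still compare the hardware address with the
	one the bootrom prints or the one printed on the network card.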



Problem: The bootrom did get a bootp answer but is not able to load the
         bootimage file

	This is likely to be a problem with the tftpd setup on the server.
	Does tftpd run when you start up the bootrom code? If not, check
	that inetd is configured correctly. There might also be a TCP/IP
	wrapper running on your server which prohibits access to the
	tftp service (which is known to be very insecure and is therefore
	a candidate for being started through an internet security wrapper
	like tcpd). Check any access configuration files for tcpd.
	Furthermore, tftpd has to be able to access the bootimage file. For
	security reasons it usually runs as a user with very low privileges
	and might not be allowed to read the bootimage file, so you should
	check and set the bootimage file's permissions correctly.
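
	One simple way to check the permissions is to test whether the
	file is readable by "others", which is what matters if tftpd runs
	as an unprivileged user that neither owns the file nor belongs to
	its group. That is an assumption about your setup, and the
	function name below is made up for this sketch.

```python
import os
import stat


def readable_by_others(path):
    """True if the 'read by others' permission bit is set on path."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IROTH)


# Example use (the path is hypothetical, substitute your own):
#   readable_by_others("/tftpboot/bootimage")
```

	Keep in mind that every directory leading to the file also has to
	be searchable by that user, which this sketch does not check.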



Problem: The boot image loader reports an error

	Congratulations! You just discovered a bug in the boot loader.
	Please report it to me.



Problem: When I'm using the bootrom menu to load a Unix system off the local
         hard disk, it reports some weird error messages to me (especially,
         SCO Unix says that it's not able to open boot device). However,
         booting without the bootrom works without a problem.

	Some operating systems, especially Unix-like systems, read the
	partition table after booting and try to find their own boot
	partition. When using the bootrom, it's not necessary to mark the
	Unix partition as bootable, but if it isn't marked the Unix
	startup loader fails. To solve this problem, mark the Unix
	partition active with some fdisk program. To prevent it from
	booting instead of the bootrom, create the bootrom so that it
	does not allow the BIOS to search for boot partitions on the
	installed hard disks (the 'makerom' program, which gets run when
	you do a 'make bootrom', will ask you about this right after you
	select a kernel image).



Problem: I'm loading Linux onto my diskless client and the kernel tells
         me to insert a root floppy and press enter

	First you should check that you built your kernel correctly. It
	should have support for the root filesystem built in. If you want
	to use an NFS mounted directory as root, the kernel should have
	TCP/IP support built in. It also has to have a driver for your
	network card built in, and both NFS and NFSROOT have to be
	selected. When using a ramdisk, its support has to be compiled in
	as well as support for the filesystem with which you formatted
	the ramdisk image. Please note that the loaded kernel is not
	able to use modules at bootup time (only _after_ the root
	filesystem has been mounted, but not before), so everything has
	to be compiled in.

	If the kernel is not able to mount its root via NFS, this might
	have many different reasons. All addresses in the /etc/bootptab
	file have to be correct, and the access rights on the server have
	to be set correctly - not only in /etc/exports but also the
	permissions of the directory to be mounted. If that's correct,
	check that a portmapper is running on the server, and that the
	mountd and nfsd services are registered with it correctly. You
	can usually do this by running the command

			rpcinfo -p

	Note that services are only listed here if their associated server
	process is really running. The rpcinfo output should then look
	something like this:

		   program vers proto   port
		    100000    2   tcp    111  portmapper
		    100000    2   udp    111  portmapper
		    100003    2   udp   2049  nfs
		    100003    2   tcp   2049  nfs
		    100005    1   udp    663  mountd
		    100005    1   tcp    665  mountd

	However, the port numbers might be different.
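
	If you want to check such a listing automatically, the following
	Python sketch looks for the three program numbers shown above in
	text captured from 'rpcinfo -p'. The function name and the way the
	output is passed in as a string are choices made for this example.

```python
# Program numbers taken from the sample listing above.
REQUIRED = {100000: "portmapper", 100003: "nfs", 100005: "mountd"}


def missing_services(rpcinfo_output):
    """Return names of required services not listed by 'rpcinfo -p'."""
    seen = set()
    for line in rpcinfo_output.splitlines():
        fields = line.split()
        # Skip the header line and anything that is not a table row.
        if len(fields) >= 4 and fields[0].isdigit():
            seen.add(int(fields[0]))
    return sorted(name for prog, name in REQUIRED.items() if prog not in seen)


sample = """\
   program vers proto   port
    100000    2   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100003    2   udp   2049  nfs
    100005    1   udp    663  mountd
"""
print(missing_services(sample))
```

	An empty list means that all three services are registered. The
	check deliberately ignores port numbers, since those may differ.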

	When the kernel starts mounting the NFS root directory it prints
	out the name of that directory on the server. It should be the
	same as the one configured in /etc/bootptab. Check that it's
	correct. If not, you can try to use the -d option with mknbi-linux
	to specify the name explicitly.

	If the kernel gets an error from the server's nfsd, it prints
	a number which is defined according to the NFS protocol. The
	most commonly occurring numbers are:

		 1  -  permission denied to access directory
		 2  -  directory doesn't exist
		 5  -  I/O error on server filesystem
		13  -  nfsd is unable to access directory
		20  -  path name is not a directory
		63  -  path name is too long

	Note that some nfsd and mountd programs only read /etc/exports
	on startup. If you changed this file afterwards, you will have
	to restart both daemons. Additionally, with nfsd versions for
	Linux earlier than 2.1 you will have problems with special files
	like UNIX domain sockets or block/character special files on
	your NFS partitions. You should therefore use the latest
	available versions.



Problem: The Linux kernel mounts its root correctly but doesn't give me
         a login prompt.

1.)	This might be the result of an incorrect setup of the root
	filesystem (see No. 2 below). However, it's also possible that your
	server reported the wrong major/minor numbers for the console device
	even though you specified them correctly in the NFS mounted root
	directory. I know of this problem with AIX and HP-UX servers,
	but there might be others as well which don't transfer special
	devices via NFS the way Linux requires. One way to solve this
	problem is to boot the diskless client with a ramdisk image as
	its root, and then mount the should-be-root directory on the
	server using NFS. Then you can create the special files in the
	dev directory using Linux's mknod program, and use the NFS root
	mounting bootimage again.
	Another way is to find out how the server operating system
	encodes major/minor numbers on its own filesystem. For example,
	HP-UX uses a 32 bit device number, with the 8 highest bits being
	the major number, and the lower 24 bits being the minor device
	number:

		major << 24 | minor   ==>   aaaaaaaabbbbbbbbbbbbbbbbbbbbbbbb

	In this representation (a) means a bit of the major number, and
	(b) means a bit of the minor number. Linux uses the following
	scheme instead:

		major << 8 | minor    ==>   0000000000000000aaaaaaaabbbbbbbb

	The NFS protocol transfers these 32 bits just as they are,
	without any further interpretation regarding major/minor numbers.
	That means that all relevant bits in the Linux representation
	fit into the minor number on HP-UX. Therefore, if you create a
	device on the HP-UX server, you always have to give it a major
	number of zero and compute the minor number the way mentioned
	above for Linux. For example, to let Linux see a device 5/2 in
	its NFS-mounted /dev directory, you can compute the minor device
	number on HP-UX as

		5 << 8 | 2    ==>  1282

	So the device to create on the HP-UX server is 0/1282. This will
	let Linux see 5/2 after the filesystem is mounted with NFS.
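
	The two bit layouts and the 5/2 example can be reproduced with a
	few lines of Python. This is only a sketch of the arithmetic
	described above; the function names are made up for this example
	and do not belong to netboot or to any HP-UX tool.

```python
def linux_dev(major, minor):
    """Pack a device number the Linux way: major << 8 | minor."""
    return (major << 8) | minor


def hpux_split(dev):
    """Split 32 bits the HP-UX way: 8 bit major, 24 bit minor."""
    return (dev >> 24) & 0xFF, dev & 0xFFFFFF


def linux_split(dev):
    """Split a device number the Linux way: 8 bit major, 8 bit minor."""
    return (dev >> 8) & 0xFF, dev & 0xFF


# The 32 bits that have to travel over NFS for Linux device 5/2.
wire = linux_dev(5, 2)
print(hpux_split(wire))   # major/minor to create on the HP-UX server
print(linux_split(wire))  # major/minor Linux sees after mounting
```

	The first line printed is the major/minor pair to use on the
	HP-UX side, the second is what the Linux client decodes from it.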

2.)	Another reason for this problem might be that the init process
	doesn't get started at all. This can be a result of incorrect
	shared libraries, which the client might see but without a proper
	ld.so.cache file. Or the shared libraries are not reachable by
	the client at all. Bruce Janson and Markus Gutschke collected a
	good list of possibilities, which you should check out:

		- you do not have a private copy of the /, /etc, /var, ...
		  directories

		- your /dev directory is missing entries for /dev/zero and/or
		  /dev/null or is sharing device entries from a server that uses
		  different major and minor numbers (i.e. a server that is not
		  running Linux - see above).

		- your /lib directory is missing libraries (most notably libc*
		  and/or libm*) or does not have the loader files ld*.so*

		- you neglected to run ldconfig to update /etc/ld.so.cache
		  or you do not have a configuration file for ldconfig.

		- your /etc/inittab and/or /etc/rc.d/* files have not been
		  customized for the clients.

		- your kernel is missing some crucial compile-time feature
		  (such as NFS filesystem support, booting from the net,
		  transname (optional), ELF file support, networking support,
		  driver for your ethernet card).

		- missing init executable (in one of the directories
		  known by the kernel: /etc, /sbin, ?)

		- missing /etc/inittab

		- missing /dev/tty?

		- missing /bin/sh

		- system programs that insist on creating/writing to files
		  outside of /var (mount and /etc/mtab* is the canonical
		  example)



Problem: Can't compile the bootrom

	Please get in touch with me if you encounter any problems
	while recompiling the bootrom.

