Three tough cases

John Saxon gets down and dirty

I’ve always enjoyed reading "The Serviceman" articles in the electronics magazines, because they follow a comfortably familiar pattern. First the serviceman is presented with a TV or video suffering from a very obscure fault, and after innumerable false starts and logical twists, right triumphs over evil and the nasty fault is fixed – much to the delight of all concerned. Well – I’m here to tell you that fixing computer problems is not often like that! No doubt I’m preaching to the converted, but it seems that even though hardware (on the non-Quantum level) has got simpler with lower chip and connection counts, the software, particularly operating systems, has got a lot more complicated. The problem is that there is an almost infinite number of hardware and software permutations and combinations, and ever more complex sets of settings and options add further layers of complexity. Perhaps Mac lovers do have something to be smug about – but PCs are so much more fun (in a masochistic sort of way)! So while most major problems (eventually) get fixed, there are always many more tweaks and tunes that could be attempted if there were enough hours in the day – so one often leaves a system feeling vaguely dissatisfied.

Readers of my previous articles may have presumed that I only attempted to fix problems on my own system – often created by my ignorant actions. But over the last few years since retirement, I have been trying to live up to our motto by helping people with problems, occasionally with some success. These systems belong to friends of friends, people in the coffee and chat group, senior citizens inspired by our help with the Internet at the libraries, and more recently as notified by Nhan Tran’s "Help with Internet Team (HIT)" group. Most help is given in people’s homes on their own systems. Oh the tales I could tell! Here are a few of the more difficult ones – the owners will recognise their own symptoms, but hopefully will not object to this exposé.

Case 1 – The mystery of dropped connections:

System: Late-ish model Pentium, plenty of RAM, running W95 OSR2

Problem: Had been using TIP with no major problems – but started getting trouble "logging on"

Resolution: This one took two visits and about 4-5 hours in total. As usual (with hindsight) it should have been solved much more quickly. The owner had been using Outlook Express to log on. I went through the main options in both Internet Explorer and the mail and news sections to stop auto dial-up attempts and get back to a straight TIP logon sequence. Having set all that up, including the TIP script and DUNCE, getting on was a perfect one-click exercise. But TIP apparently threw us off on about 5 occasions - some very soon after the logon was accepted. I went back to a manual logon sequence and recorded Modemlog.txt files, which suggested that the disconnects were happening at the TIP end. I thought the problem might have been something in all those MSIE settings, even though "nothing had been changed" since the system had last worked well. Various modem init strings were tried with no success.

On the next visit, I started to think that perhaps the modem had developed an obscure fault (ran some minor diagnostics and all appeared O.K.). I even started trying to hook up an old 9.6Kb model! In the process we discovered that the serial cable to the external modem was very slightly loose at the computer end. Tightened that up and no further problems! Rock solid connections – virtually and physically!

"Obvious" you say. "I would have thought of that first"! I should have as well, but what threw me was that all the connection protocols were exchanged 100% - conducting the logon manually went 100% right, through to and including the PPP entry – it was only between 2 and 50 secs later that the connection was dropped. Also the modem passed the W95 diagnostic! I can only assume that perhaps the receive side had a good connection but the transmit side was intermittent, and that logons and protocol negotiations are more tolerant than other TCP/IP activity – weird!

Case 2 – No happy ending:

System: 486, 32Mb RAM, running Windows 3.1

Problem: Kit installed and working, problems downloading web pages, e-mail O.K.

Resolution: I wish I could say that this one is resolved – but after 3 visits totalling 6 or more hours, the problem is still essentially as stated. Here’s the story.

Installed Winzip 6.3 SR1 and Netscape 3.04 as the owner wanted a later version than came with the TIP kit. The system had the kit installed and Eudora was working, but Netscape 2 would just appear to hang with internal or external URLs (proxies were O.K.). After installing NS 3.04 the system connected O.K. and Eudora, Agent, and Netscape appeared to work O.K. (if a little sluggishly), but I only tried a couple of URLs.

But the sluggishness was not surprising as a check of the Winsock window showed continuous messages being generated as follows:

NetMessage(3FB0,<CCC4 10 0000 01>)

event detected 01

eventmask = 003F eventenabled = 002D

NetMessage(3FB0,<CCC4 9 0000 01>)

rearm 0001

eventmask = 003F eventenabled = 000D

doevent 0020

eventmask = 003F eventenabled = 000D

Comm errors [OVERRUN] = 93

That’s just a small sample from the full log of course.

The messages didn’t mean a lot to me. So I retired in confusion :-)) and checked the Net for Winsock tweaks, new Winsock versions etc. Eventually a friend suggested that some of Winsock's debug logging might have been turned on, and that the system was just displaying normal messages. This turned out to be the case. The messages were easily turned off, and after that only the occasional comm overrun messages were displayed (due to the 8250 UARTs). But the system still felt "sluggish" and Netscape was really intermittent - some sites came up normally (generally PCUG ones and some overseas). Strangely, Australian ones seemed to fail more often. I got the impression that what we were seeing was some type of resource problem rather than a geographical one. The sites that failed just showed continuous download activity (meteorites & hour glasses), with no error messages. Sometimes they half loaded - sometimes nothing. I suspected that high activity sites with graphics and perhaps Java were the worst problems.
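Once the debug chatter is turned off, the comm error lines are the only ones worth tracking, and a saved copy of the Winsock window text can be scanned for them rather than read by eye. The following is a minimal sketch in Python, illustrative only: it assumes the window text has been saved as plain text and that error lines look exactly like the single "Comm errors [OVERRUN] = 93" line in the sample above. The function name and the pattern are hypothetical and not part of any Winsock tool.

```python
import re

# Matches lines such as: Comm errors [OVERRUN] = 93
# (pattern inferred from the one sample line; other formats are not handled)
COMM_ERROR = re.compile(r"Comm errors \[(\w+)\]\s*=\s*(\d+)")

def comm_error_counts(log_text: str) -> list[tuple[str, int]]:
    """Return (error kind, reported count) for every comm error line, in order."""
    return [(m.group(1), int(m.group(2))) for m in COMM_ERROR.finditer(log_text)]

# The sample quoted in the article, used here as input.
SAMPLE = """\
NetMessage(3FB0,<CCC4 10 0000 01>)
event detected 01
eventmask = 003F eventenabled = 002D
NetMessage(3FB0,<CCC4 9 0000 01>)
rearm 0001
eventmask = 003F eventenabled = 000D
doevent 0020
eventmask = 003F eventenabled = 000D
Comm errors [OVERRUN] = 93
"""

print(comm_error_counts(SAMPLE))  # prints [('OVERRUN', 93)]
```

Run against logs captured at different times, a list like this would show whether the overrun count keeps climbing during a session.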

So on to investigations of the system itself. Older than I first thought - bought in 1993. A 40 MHz 486 SX. Upgraded at various times with 32Mb RAM, a 1.2 GB HDD, a 33.6 Kb external modem, etc. Stacks of "free" software installed by the dealer (no disks or docs!) including Norton's for DOS (and Windows?) version 7 etc. Old BIOS, so the new HDD was set up as a single partition via Norton's (I think) - judging by lots of rather unintelligible stuff in the Autoexec.bat file. Norton's checks for viruses at each boot, and McAfee was also installed - but they were both old (never updated) versions. A new version of McAfee was available on floppy, so this was run from a DOS window.

Clue Nbr one - message from McAfee "Traces of MKC_BOOT virus found in memory - boot from a clean floppy and scan again in DOS".

Clue Nbr two - The user's boot disk (and one I made on my system) could not find a C: drive in DOS! Microsoft Diagnostics "MSD" could not find a C: drive - nor could McAfee! So no cleaning possible from DOS, a nice Catch-22 situation. Only "Checkit" diagnostic software was able to find the drive at the hardware level, and extensive testing showed no HDD hardware problems. All this indicated to me that the C: drive Master Boot Record was corrupted - in fact McAfee (from pure DOS) hinted that this might be the problem. But how was the system getting into W3.1? My guess is via Norton's, which is handling the Logical Block Addressing (LBA), etc. In fact there is just a chance that there is no virus or MBR corruption and that McAfee is being fooled by Norton's. I have been unsuccessful in finding info on this virus on the Web (so far). But I felt reasonably confident that all of the above was probably causing some type of resource problem, which is not obvious in Eudora and Agent as they are relatively "light" system users. Also the combination of the non-buffered UART and the fast modem was probably causing constant TCP/IP re-transmit requests.
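The UART point can be made concrete with some rough arithmetic. The sketch below (Python, illustrative only) estimates how long the CPU has to collect each received byte before the next one overwrites it. The figures are assumptions, not measurements from this machine: 8N1 framing (10 bits on the wire per byte), example port speeds of 19200, 38400 and 57600 bps, and a simplified model in which a non-buffered UART gives one byte-time of grace while a UART with a 16-byte FIFO and its interrupt trigger set at 8 bytes gives about nine.

```python
BITS_PER_BYTE_8N1 = 10  # assumption: 1 start + 8 data + 1 stop bit

def byte_time_us(port_bps: int) -> float:
    """Microseconds one byte takes to arrive at the given port speed."""
    return BITS_PER_BYTE_8N1 / port_bps * 1_000_000

def latency_budget_us(port_bps: int, fifo_depth: int = 1, trigger_level: int = 1) -> float:
    """Simplified time allowed to service the receive interrupt before an overrun.

    After the interrupt fires there are (fifo_depth - trigger_level) free slots,
    plus the byte still being assembled in the shift register.
    """
    slack_bytes = (fifo_depth - trigger_level) + 1
    return slack_bytes * byte_time_us(port_bps)

for bps in (19200, 38400, 57600):  # example port speeds, not taken from the article
    unbuffered = latency_budget_us(bps)
    fifo = latency_budget_us(bps, fifo_depth=16, trigger_level=8)
    print(f"{bps:>6} bps: unbuffered {unbuffered:7.1f} us, 16-byte FIFO {fifo:7.1f} us")
```

On this model, halving the port speed doubles the time available per byte, which is why a slower connect speed can reduce overruns on a busy Windows 3.1 system.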

So how to get this system to browse the web successfully? I really feel on a personal level that I have some responsibility for it - but where does the HIT team responsibility end? Essentially the same initially reported fault remains - and I thought it might be a simple proxy problem!

In the end I have advised the user as follows - in order from the least intrusive.

  1. Try disabling JavaScript and turning off image downloads. Possibly return to Netscape 2.x.
  2. Slow down the modem connect speed – install improved comm drivers.
  3. Rebuild the MBR - I think the command is FDISK /MBR. Then when pure DOS recognises the C: drive - clean for viruses with the latest McAfee. Obviously back up vital data files first (and quarantine any floppies made this way :-)). But I am reluctant to do this, not only because I think it is outside our HIT contract, but also (and more importantly) due to the risk of data loss.
  4. Format C! Start over with available software. Without a CPU upgrade there does not seem to be much point in installing W95.
  5. Upgrade or buy a better system! Actually this is the plan - but not for 6 months or more.

So what has been learned? Plenty! Firstly an apparently simple Web access problem can be much more complex. There is often a grey area between software configuration or install problems, "training" problems, and more serious underlying hardware problems. There is also the concern that one is putting several hundred dollars' worth of effort into a system worth $500 or less. As with many systems, the saga continues!

Case 3 – Multiple problems!

System: Pentium 100, 24Mb RAM, W95A

Problem: Failure to boot, clock runs slow (loses hours/day) – needs TIP access

Resolution: I know this system! The motherboard, CPU and many other components used to be mine! I suspected that the CMOS battery might be low – so this was replaced via phone instructions (nice Lithium $4.50 at Dick Smith). When I got there the first task was to set up the CMOS with the HDD data, time, etc. But the system would not boot at all! It went through the POST memory check and then just sat there. No error beeps – it looked like the BIOS was not completing. So – as the problem now seemed to be hardware (or BIOS firmware) it was back to troubleshooting that. Removed all cards except video and floppy disk (this motherboard does not have peripherals on board) and tried to boot to DOS from a floppy disk. Very occasionally the boot would get as far as an A: prompt. But I was running out of time so I eventually took the system home.

I felt that despite checks and re-checks of every page of the CMOS – the problem was probably in there somewhere. Sure enough – a careful read of the motherboard manual indicated that the HDD IDE PCI card was using an "ISA legacy connector" – a little ribbon connector from the card to an 8 pin plug on the M/B. This allows the PCI device to use an IRQ (don’t ask me why), but I suppose the design spanned the transition period between ISA and PCI. The M/B manual mentioned that this capability had to be enabled with a CMOS setting on the PCI page, and was not a default setting! So set that, and bingo – faultless boots to W95!

So now to get the TIP setup working. Going through Mike Gellard’s instructions indicated that although Dial-up Networking was installed, the TCP/IP protocol was not. Damn! I had forgotten to bring the W95 CDROM home with the system! As the machine was supposed to be picked up within an hour or so, I decided to take a chance and install that from my own OSR2 CDROM – somewhat to my surprise it seemed to read and install the necessary files O.K. but when a re-boot of W95 was attempted, disaster struck! The machine did not complete the boot. Instead it would get as far as "starting W95" then drop back to the memory check, then back up to "starting W95" then back to memory check – ad infinitum! Surely Bill Gates could not be so vindictive as to organise W95 not to boot if a minor component was replaced from a different CDROM?! But when the disappointed owners came to collect their system, they confirmed that this was the same fault that the system had before the CMOS battery was replaced. The fact that it decided to re-appear after some W95 driver changes had been made was coincidental!

So back to the drawing board. Something was intermittently marginal. One more thing to try. The CPU was actually a Pentium 75MHz, but had been overclocked to 100MHz for 3 years or so, with no apparent problems. So I returned it to 75MHz and the system booted normally and has been rock solid ever since. So perhaps extended periods at higher than normal temperatures can cause CPU degradation without total failure?

TIP access was set up and checked out O.K. but when I tried to set up scripting I ran into problems. The rnaplus.inf file was found on the original CDROM but it could not be installed (nothing happened when it was right clicked in Windows Explorer). I suspect that this is due to some of the dial-up networking software being installed from the W95A CDROM and some from W95B. One day I will have to delete all DUN and load again from the W95A CDROM. But the system is currently accessing TIP quite well using manual log-ins, despite the old internal 14.4Kbps modem. Postscript – the system clock is still losing many hours per day. That needs further investigation too.

This article has focussed on some pretty horrible problems. Luckily not all visits and systems are that bad. Many are relatively simple, involve mainly basic Internet and Windows training, and can be a real delight when people are happy that even some of their problems are solved. One common thread is that the TIP Internet kits need a major overhaul. The W3.1x kit could be improved, but I doubt there is much incentive these days. The W95 instructions handed to new TIP subscribers are difficult for beginners, and do not indicate W95A and B differences. I often meet people who do not even know what operating system has been installed on their system ("word processing" is a common answer) – but these people are expecting to do all the Internet things that they see in the libraries and on TV. The PCUG is supposed to include some PC experts, but we lag behind most ISPs in the provision of "painless" set-up disks. Hopefully W98 will make things easier. I’m not holding my breath!
