============================================================
           Software-Quality Discussion List

      S O F T W A R E - Q U A L I T Y   D I G E S T

      "Cost-Effective Quality Techniques that Work"
============================================================
List Moderator:                      Supported by:
Terry Colligan                       Tenberry Software, Inc.
moderator@tenberry.com               http://www.tenberry.com
============================================================
June 22, 1998                        Digest # 019
============================================================

====IN THIS DIGEST=====

    ==== MODERATOR'S MESSAGE ====

    Where's the Quality?


    ==== CONTINUING ====

    Answers to questions
      John Cameron 

    RE: Software-Quality Digest # 018
      David Bennett 



==== MODERATOR'S MESSAGE ====

  Where's the Quality?  (Or, A Quality Fable)

  I was extremely frustrated by my attempts to use computers and email
  during my recent vacation to Arizona.  What follows is a problem
  description; see if you can spot the quality failure.  The net result
  was that I was essentially out of contact with the world and my
  office for the whole time I was on vacation.

  When I was planning for my trip, I knew that communication would be
  an issue.  I had been sick, and needed the time off, but this list
  and my business needed more of my time than they would get if I just
  called in once a day.  So, we carefully planned a solution and tested
  it to ensure that it worked before deploying it.

  As a bit of background, I have traveled a lot in the past, but the
  last few years I have not been traveling at all, so all my previous
  technical solutions seemed incompatible.  (Only one of our four
  notebooks would even boot, and it was a 286!)

  So, I decided that the proper solution was to get a new notebook.
  Besides, buying a new machine is always fun -- sort of a pre-start to
  the vacation.

  In addition, all of our communication with the outside world goes
  through our dedicated Internet access lines -- no dial-up, no ISP,
  so I needed to get a dial-up ISP.

  Normally I use Windows NT (and a bit of Linux) in my day to day work.
  I couldn't find a portable computer which claimed to support NT, so
  we determined to use Windows/95.  The plan was to get a new notebook
  computer with both a network card and a modem card, and to get a
  "name-brand" ISP with local dial ports in Arizona.

  We made sure the hotel in Arizona had modem jacks and allowed for
  computer usage of the phones.

  The plan was to download my email files and work in progress using
  the network card, and then to use the modem to access the Internet.

  After a bit of shopping and asking lots of questions, we selected:

    - A new 233MHz Pentium notebook with 64MB RAM and a 3GB hard
      disk.  (Laptop Superstore)

    - Windows/95

    - A 3Com network card

    - A name-brand modem (Megahertz)

    - A name-brand ISP (AT&T)

    - A name-brand email package (Eudora Pro), which I used at the
      office.

  I should note that *all* of these products loudly and proudly
  feature quality in their marketing literature.  The purchase was delayed
  a bit, due to my illness, but the machine was to arrive a week before
  we left.  While waiting, we signed up with AT&T, and tested out their
  Internet access.  We even dialed the Arizona phone numbers to make
  sure everything worked.


  Reality Sets In...

  The notebook was delivered on-time and we set to work.  The problems:

    1. The modem, although Plug-N-Play, had an IRQ conflict which
       caused it to fail to connect to anything.  Solved by the store,
       on the second attempt.

    2. The network card failed either to connect to anything or even
       to admit that it existed.  I bought another card from a
       different store -- exact same problems.

       (At this point, I would have gotten a different machine, except
       that there was no time to do so before the trip.)

    3. I installed the e-mail software, and spent a frantic night
       before the departure day madly copying files to floppies and
       loading them on the notebook.

    4. We set up Dial-Up Networking on Win95 to call Arizona, and
       verified that we could make a connection and send and receive
       email.

    5. We arrived in Arizona, checked into the hotel, set up the
       computer, dialed up AT&T, started email, and everything worked
       okay --

       for about 20 minutes.

  After that, I could never maintain a connection long enough to send
  or receive email or do any web browsing.  No error messages, no
  warnings, no indication of any problems at all -- data simply
  stopped flowing after a certain number of bytes, typically 3KB to
  30KB.  It was much worse when connecting at 56Kbps.

    Windows/95 never indicated that there was any problem.

    The dial-up software never indicated there was any problem.

    AT&T continued to say there were no problems on their end, and
    that they had "100% uptime" in Phoenix.

    The modem software never indicated any problem, but its little
    diagnostic tool simply stopped incrementing the number of bytes
    read or written.

    The problem didn't appear to be heat-related -- there was no
    correlation between temperature or length of power-on and how
    long the modem would transmit.

  I spent several hours each day trying to nurse the system to stay
  alive long enough to deal with my email, largely unsuccessfully.
  I was extremely frustrated because nothing would tell me what was
  going wrong.

  THE QUESTIONS...

  Where is (are) the quality failure(s) here?

    - The notebook?

    - Win95?

    - The modem?

    - AT&T?

    - Email software?

    - Me, for not having sufficient diagnostics?

    - Me, for not having a backup available?

    - Other...


  Should a high-quality system provide diagnostics?

    One of the biggest frustrations to me was that there was
    apparently no way to diagnose the problem.  All of the software
    was designed to hide complexity (and making a dial-up TCP/IP
    connection involves a *lot* of complex moving parts!)

    If everything had gone well, hiding the complexity would have
    been fine.  Unfortunately, it didn't.


  What errors did I make?


  (My opinions next time...)



==== CONTINUING ====

++++ New Post, New Topic ++++

From: John Cameron 
Subject: Answers to questions

>  A question for you, John:  Do the people who think you are nuts value
>  the results you are producing?
>
I know they value what I do, but I don't think they quite get the
extent of the quality things I do.  I have done a lot of things well
in a relatively short time in a field that is new to me.  My
supervisor probably took quite a chance on me, but he has
been rewarded for it.  I am a contractor, so I suppose he could
have just given me the heave ho if it didn't work out.  On the
other hand he did need things done yesterday.

We just released a board flashed with all the firmware to QA
and for the first time in the company's history, everything
worked.  I can hardly take all the credit, but my stuff worked
well and by wandering through the code, I did find and repair
several problems, so I think I had a hand in our success.  It
will be interesting to see how this all turns out.  Will they be
interested in ensuring that things work before they get to QA?

Stay tuned, John



++++ New Post, New Topic ++++

From: David Bennett 
Subject: RE: Software-Quality Digest # 018

Hi Terry

Welcome back!

*   In any case, I apologize for the delay.  I hope that this long delay
*   can be recovered from.

I also!  Your absence has shown me that moderated mailing lists are
like small children, pets and companies.  They require regular
attention and they make vacations difficult!

Children and companies can at least grow into self-sufficiency.  I
wonder if lists can too?  Anyway, welcome back.

++++ Moderator Comment ++++

  Judging from the number of posts I'm getting, it's pretty clear
  that the 2-3 weeks between issues is too long to keep discussions
  going...

++++ End Moderator Comment ++++


* One trick I have learned (I hope) is that when I feel like preaching
* (again!), I sometimes will stop myself and instead ask the audience a
* question and see what happens next.

I may have started this line.  My original comment was intended as an
oh-so-gentle hint, but this is a good suggestion.  I really like to
hear what you have to say Terry, but I also like to hear others.  Not
all of them are as vigorous about butting in as yours truly!

Say your piece, but maybe you should attempt some controversial
statements or ask some penetrating questions too?

* My push, as you put it, is for the programmers to work on the very
*   same *code* as the customers and the QA staff.  I don't consider
*   #if-#endif's to be the same code.  Although I occasionally add
*   debugging traces/printouts to my code, I generally think that these
*   are not the most effective way to go.
*
*   I'm arguing for making the code be present all the time, and just
*   run-time disabling/enabling, rather than going the #if-route.
*
*   I still don't see why Eiffel precludes doing this.  It seems exactly
*   analogous to a screen designer which lets you pull down menus.
*   Hopefully the programmers spend some time with the real code, and not
*   just simulating it in the design tool.
*
*   I would rephrase your 2. to use "Disable" rather than "Trim".

Perhaps we have been arguing at cross-purposes.  Re-reading the above,
I see that you are really concerned about the potential for errors
in the actual construction of #if/#endif code.  I agree wholeheartedly.
We have one or two modules of what I call "ifdef soup".  For
portability they really have to be that way (or at least I don't know
any other way to do what they do), but they are very error-prone.

++++ Moderator Comment ++++

  For these kinds of situations, I much prefer either:

    - Different modules (i.e., vary what's in the build, particularly
      since you are much more likely to have differing build scripts/
      methods on different systems.)  For example, a SYSWIN.C and
      a SYSUNIX.C which define the same functions and data.

    - Run-time checks to disable/select between alternatives.

  I've never felt that #if/#endif's are worth the trouble they always
  seem to cause. It seems to me that programming and debugging are hard
  enough without having to guess what code might be generated from the
  source you are looking at.
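
  To make the separate-modules idea concrete, here is a tiny sketch
  (the beep function is invented for the example):

    /* sys.h -- one interface, declared once */
    #ifndef SYS_H
    #define SYS_H
    void sys_beep(void);    /* identical declaration on every platform */
    #endif

    /* syswin.c -- only the Windows build compiles and links this file */
    #include "sys.h"
    #include <windows.h>
    void sys_beep(void) { MessageBeep(MB_OK); }

    /* sysunix.c -- only the Unix build compiles and links this file */
    #include "sys.h"
    #include <stdio.h>
    void sys_beep(void) { fputc('\a', stderr); }

  Each build script simply names the one file that matches the system,
  and the source you read is exactly the source that runs.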

  (Of course, my comments above could just be the demented ravings of
  an old code-generator guy! ;-)

++++ End Moderator Comment ++++


What you need to make you happy and me happy (I think) is build-time,
automated removal of a proportion of the debugging/tracing code, in a
controlled and testable way, to ensure that

a) the programmers are *never* limited by space/speed concerns in
devising automated consistency checks and debug aids

b) the built system does not exceed its space/speed budget for
tracing/logging/debugging code

c) the in/out boundary line can be moved up or down from time to time
without having to actually edit the source code (more left in betas,
less in final production, even less if the budget is revised, etc).

We already do that.  We don't write any #ifdefs in our debug-only
software.  None.  Not a one.  Never.  [well, hardly ever]

Instead, all of our debugging/tracing code is embedded in a
particular macro structure.  Here is an example:

ZD Trace (TRLEVM, "BringWindowToTop hw=%x", hWnd);

The ZD in column 1 makes the line highly distinctive, so that simple
automated build tools can remove these lines.  In debug builds ZD is
defined to expand to nothing, so the Trace call compiles as written.
Some compilers also allow us to remove the lines in production builds
using macros; some do not.  It makes little difference.

We know with certainty that removing a line like this cannot change the
logic of the program (although just occasionally we get a compiler
optimisation fault, but these are mercifully very rare.)  [BTW when we
use external tools, we replace the line by a blank line, so all line
numbers are unchanged]

TRLEVM is the runtime value which determines whether this line does
anything or not.  If the line is included, it generates output only if
the current debug flags enable level TRLEVM in this module.

We have various different flags (not just ZD) to allow selective
removal of different levels of debug tracing depth, and we can remove
or enable tracing on a per-module basis (module is typically 1000-5000
lines for us).
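
In case the mechanics are not obvious, here is a minimal sketch of
how such a scheme can be assembled in C.  The names follow the
example above, but the bodies are simplified illustrations, not our
actual source:

    /* Sketch only: a marker macro plus a run-time-levelled trace call. */
    #include <stdarg.h>
    #include <stdio.h>

    #define ZD                 /* debug builds: ZD expands to nothing  */
    #define TRLEVM 3           /* trace level assigned to this module  */

    static int module_trace_level = 0; /* set at run time from debug flags */

    static void Trace(int level, const char *fmt, ...)
    {
        va_list ap;
        if (level > module_trace_level)
            return;                     /* this trace depth is disabled */
        va_start(ap, fmt);
        vfprintf(stderr, fmt, ap);
        va_end(ap);
        fputc('\n', stderr);
    }

    /* In the real source, the call sits with ZD in column 1:
       ZD Trace (TRLEVM, "BringWindowToTop hw=%x", hWnd);
       and production builds blank out every such line.              */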

It's a really powerful technique, and I can highly recommend it.  As it
happens, it's really close to what Eiffel does.

++++ Moderator Comment ++++

  David, I'm curious.  How large would your system be if you just
  enabled all the tracing at compile time?

  I find your approach cleaner than "#ifdef-soup", and it appears to
  work for you.  However, to my mind, it still has the same problem
  that I can't look at source code and tell whether or not it is
  present in the module.

  Do you have problems with people putting side-effects into your
  trace code?

++++ End Moderator Comment ++++

* David Bennett spoke of:
* "If you design or write code, you MUST devise a way to test it."
*
* I agree that the code that you put out should be bug-free as far as you
* know. However, there have been numerous times that I've said (on the
* day of installation) "I didn't think about that".
*
* So my statement is: You can't test for conditions that you didn't know
* you were supposed to program for in the first place.

Hmmmm.  I don't think this really goes to what I am saying at all.  Let
me try again.

I may, as part of a program, write the following lines of code:

    if (myindicator == 0)
        result = handle_zero_case();
    else
        result = handle_non_zero_case();

During the process of testing, I must devise a way for this fragment to
be exercised with myindicator zero and myindicator non-zero in order to
exercise both branches of the code.  If I don't achieve that, I have no
certainty that the code is adequately tested.  There could be a totally
disastrous system crash lurking down the non-tested branch, and it
would come out for the first time at a customer site.

Now, let me change the code:

    char* mypointer;
    mypointer = malloc(20);
    if (mypointer == NULL)
        result = ERROR_OUT_OF_MEMORY;
    else
        result = call_some_function (mypointer, other_arguments);
    free(mypointer);

Now the same logic applies.  This code contains a very bad bug.  If
this memory allocation fails, the code will attempt to free a NULL
pointer, which (in some implementations) breaks the heap and causes a
system crash.

++++ Moderator Comment ++++

  A small nit: the C standard, whether ISO or ANSI, requires that
  free() ignore calls with a null pointer, so the above should never
  crash.
++++ End Moderator Comment ++++


You have to find a way to make sure that both branches of the if test
get tested, or you won't pick up the bug in QA testing.  If you later
find and fix the bug (i.e., a user finds it!), you still need to be
sure that the bug doesn't come back in the future when someone else
works on this code.

[incidentally, this is very hard to do.  It means you have to find a
way to cause each and every memory allocation in the entire system to
fail, individually and to order.  Not at all easy to achieve]
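
One approach is to route every allocation through a wrapper that the
test harness can order to fail.  A sketch (the names are invented for
illustration):

    /* Fault-injection wrapper: the harness re-runs the scenario with
       fail_on_call = 1, 2, 3, ... so that every allocation site gets
       its failure branch exercised exactly once. */
    #include <stdlib.h>

    static long alloc_calls  = 0;
    static long fail_on_call = -1;      /* -1 = never fail */

    void *test_malloc(size_t size)
    {
        if (++alloc_calls == fail_on_call)
            return NULL;                /* simulated out-of-memory */
        return malloc(size);
    }

In test builds, calls to malloc are redirected to test_malloc (for
instance with a macro); production builds call malloc directly.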

++++ Moderator Comment ++++

  We do this kind of testing during development by having the
  programmer step through all the code in the debugger, where s/he can
  manually set the value of 'mypointer' to null to force execution of
  the unusual case.

  Actually, we try to design around these kinds of problems by defining
  resource allocators which never fail, thus greatly simplifying the
  writing, testing, and debugging of this kind of code.  For example,
  instead of calling malloc directly, we would define a 'mustmalloc()',
  which takes care of signalling errors, and is guaranteed never to
  return a null value.

  The best way to reduce bugs is to make it impossible for them to
  happen!
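
  For illustration, a minimal mustmalloc might look like the sketch
  below.  The print-and-exit error policy is just the simplest way of
  "signalling errors" -- a real system might instead release caches
  and retry, or jump to a recovery point:

    #include <stdio.h>
    #include <stdlib.h>

    /* Never returns NULL: either the memory is there, or we stop. */
    void *mustmalloc(size_t size)
    {
        void *p = malloc(size);
        if (p == NULL) {
            fprintf(stderr, "fatal: out of memory (%lu bytes)\n",
                    (unsigned long)size);
            exit(EXIT_FAILURE);
        }
        return p;
    }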

++++ End Moderator Comment ++++

This proposition is not directly concerned with whether this code
actually does what the user asked.  It is concerned with whether on
delivery you can say with any confidence that you know what the code
does.

If you deliver on time what the user asked for but not what they really
wanted, you have a discussion centred on the analysis, requirements and
communications processes.  Your software engineering is probably in OK
shape.  You're probably going to get paid (eventually), but you might
have to do some extra work first.

If you deliver something which is not what the user asked for, not what
was specified, not what the programmer thought they wrote, not what the
project leader thought he signed off on, which nobody can get to work
properly and which crashes randomly, you probably have a pretty
significant software engineering problem.  You may have great rapport
with the user but you probably aren't going to get paid and you may
even finish up in court.

What I'm about is primarily making sure that you deliver what is
specified, and that you get the quality right, not so much how you
devise the spec.

++++ Moderator Comment ++++

  We are in complete agreement here!

++++ End Moderator Comment ++++

* To clarify, I don't expect to have every possible feature someone
*   might ask for, but I do expect that, no matter what input the user
*   supplies, a program A) doesn't crash and either B) does what was
*   requested or C) says that it can't.

I agree.  Read up on "programming by contract".  Bertrand Meyer is a
good place to start.
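
Eiffel builds contracts into the language as require/ensure clauses;
in C you can approximate the idea with assertions.  A toy
illustration (not from any real system):

    #include <assert.h>
    #include <math.h>

    double checked_sqrt(double x)
    {
        double r;
        assert(x >= 0.0);       /* precondition: Eiffel's "require" */
        r = sqrt(x);
        assert(fabs(r * r - x) <= 1e-9 * (x + 1.0));  /* "ensure" */
        return r;
    }

If the caller breaks the contract, the failure shows up at the point
of the broken promise, not as a mystery crash three modules away.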

* 1. Given unlimited computing power, along with unlimited amount of time
* to test, what would you do?
* A) Make the person requesting the program provide test data
* B) Make the programmer provide test data
* C) Have a separate testing department provide its own test data
* D) Send it out as 'beta', when it really should be labelled 'Alpha'

Obviously A, B and C, but reworded.  (A) never works, but part of the
requirements process is to jointly construct test scenarios which will
satisfy the customer that the program does what they want.  (B) is
vital, because only the programmer knows what to test, where the
weaknesses and the boundaries are.  (C) works when the requirements
spec already has the required detail (how to test) and QA actually
construct the test suites to implement the spec.

Test data is not enough.  You need a validation/QA process which
encompasses test data, test procedures and success/failure criteria.

* 2. Who should be the one with the authority to delay the ship-date?
* A) The owner of the company
* B) The project manager
* C) The programmer
* D) The testing department

Pardon?  Anyone can delay the ship-date.  They just have to be on the
critical path and fail to deliver.

Seriously though, the answer is "whoever was given that authority when
the project was set up."  Call him the "project owner".

Implicitly, I think you're asking: if the project looks like missing
its ship date because it has too many bugs, who decides how many bugs
can be left in and still ship the product?  The
programmer can say "I can't fix it by then", QA can say "there are
still XXX unresolved defects", the project manager can say "we aren't
going to achieve that level of quality by that date".  Only the project
owner can make the decision.

There are four constraints in building software: resources, functions,
quality and time.  Only the project owner (owner, department head,
financier, customer) can ultimately decide which constraint gets
altered: whether to add resources (but see Fred Brooks first), take
functions out, lower the quality or extend the time.

What you should *never* do is lower quality without knowing that's what
you're doing and the downstream consequences of doing it.

Regards
David Bennett
[ POWERflex Corporation     Developers of PFXplus ]
[ Tel:  +61-3-9888-5833     Fax:  +61-3-9888-5451 ]
[ E-mail: sales@pfxcorp.com   support@pfxcorp.com ]
[ Web home: www.pfxcorp.com   me: dmb@pfxcorp.com ]


++++ Moderator Comment ++++

  Agree 100%!  You said it better than I have, though.

=============================================================
The Software-Quality Digest is edited by:
Terry Colligan, Moderator.      mailto:moderator@tenberry.com

And published by:
Tenberry Software, Inc.               http://www.tenberry.com

Information about this list is at our web site,
maintained by Tenberry Software:
    http://www.tenberry.com/softqual

To post a message ==> mailto:software-quality@tenberry.com

To subscribe ==> mailto:software-quality-join@tenberry.com

To unsubscribe ==> mailto:software-quality-leave@tenberry.com

Suggestions and comments ==> mailto:moderator@tenberry.com

=============  End of Software-Quality Digest ===============
