Apple Assembly Line - V3N6

Volume 3 -- Issue 6          March 1983

In This Issue...

All About PTRGET and GETARYPT
Macro can Build Macros
Epson MX-80 Text Screen Dump
Optional Patch for TEXT/ Command
Division: A Tutorial
Short Note about Prime Benchmarks
Garbage-Collection Indicator for Applesoft
S-C Macro Assembler Version 1.1
More on the //e
The Visible Computer: A Review

S-C Macro Assembler Version 1.1

That's right, Version 1.1! I've added all the most-requested new features, corrected those few lingering problems, and it's almost ready. Look inside for more details.

A New Screen-Oriented Editor

Several people have asked about a screen-oriented editor for the S-C Macro Assembler. Well, Mike Laumer has come up with one for you. It runs with the Language Card version of the Macro Assembler, in the unused bank. I still prefer a line editor, but Bill is rapidly falling in love with the new screen editor. Now everyone has a choice! See Mike's ad inside.

65C02

Many of you have expressed an interest in the new Rockwell R65C02 microprocessor. Well, I still haven't heard any more than I mentioned a couple of months ago. We're as eager as you are to get a sample. We'll have a detailed report as soon as we know more.

All About PTRGET & GETARYPT          Bob Sander-Cederlof

Both Leo Reich and E. Melioli have asked for some clarification on how to pass array variables between Applesoft programs and assembly language programs. I hope this little article will be of some help to them.

The Variable Tables:

We need to start with a look at the structure of the Applesoft variable tables. There are two variable tables: one for simple variables, and the other for arrays. (You might turn to page 137 of the Applesoft Reference Manual now.) Entries in these tables include the variable names; some codes to distinguish real, integer, and string variables; and the value if numeric. String variables include the length of the string and the address of the string, but not the string itself.

The address of the start of the simple variable table is kept in $69,$6A. The next pair, $6B and $6C, hold the address of the end of the simple variable table plus one. This happens to also be the address of the beginning of the array variable table. The address of the end of the arrays plus one is kept in $6D,$6E. The actual string values may be inside the program itself, in the case of quoted string constants; or in the space between the top of the array variable table and HIMEM.

Here is a picture, with a few more pointers thrown in for good measure:

     (73.74) -->    HIMEM

                    <string values>

     (6F.70) -->    String Bottom

                    <free space>

     (6D.6E) -->    Free Memory Bottom

                    <arrays>

     (6B.6C) -->    Array Variable Bottom

                    <variables>

     (69.6A) -->    Simple Variable Bottom

                    <program>

     (67.68) -->    Program Bottom
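
For readers who want to poke at a memory dump away from the Apple, here is a minimal Python sketch that reads the pointer pairs shown in the picture. The names `POINTERS`, `read_pointer`, and `memory_map` are made up for this illustration, and the sample addresses are hypothetical; the only facts taken from the article are the zero-page locations, read low byte first in the usual 6502 manner.

```python
# Zero-page pointer pairs from the memory map above (low byte first).
POINTERS = {
    "program_bottom": 0x67,
    "simple_variable_bottom": 0x69,
    "array_variable_bottom": 0x6B,
    "free_memory_bottom": 0x6D,
    "string_bottom": 0x6F,
    "himem": 0x73,
}

def read_pointer(mem, zp):
    """Return the 16-bit address held at zp (low byte) and zp+1 (high byte)."""
    return mem[zp] | (mem[zp + 1] << 8)

def memory_map(mem):
    """Return {region name: address} for every pointer in the picture."""
    return {name: read_pointer(mem, zp) for name, zp in POINTERS.items()}

# Hypothetical 64K image: a small program and no variables defined yet.
mem = bytearray(65536)
for zp, addr in ((0x67, 0x0801), (0x69, 0x0900), (0x6B, 0x0900),
                 (0x6D, 0x0900), (0x6F, 0x9600), (0x73, 0x9600)):
    mem[zp] = addr & 0xFF
    mem[zp + 1] = addr >> 8

for name, addr in memory_map(mem).items():
    print(f"{name:24s} ${addr:04X}")
```

With an image captured from a real machine, the same function would show how much room is left between the arrays and the strings.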

Inside an Array:

Let's look a little closer at the array variable space. Each array in there consists of a header part and a data part. The header part contains the name, flags to indicate real, integer, or string, the offset to the next array, the number of dimensions, and each dimension. The data part contains all the numeric values for real or integer arrays, and all the string length-address pairs for string arrays.

Here is a picture of the header part:

     Bytes    Contents
     -----    -----------------
      0,1     Name of Array
      2,3     Offset
       4      # of dimensions
      5,6     last dimension
      ...
      x,y     first dimension

The sign bits in each byte of the name combine to tell what type of array variable this is. If both bytes are positive, it is a real array; if both are negative, it is integer. Contrary to what it says on page 137 of the Applesoft manual, if the 1st byte is positive and the 2nd byte negative it is a string array. The manual has it backwards.

The value in the offset can be added to the address of the first byte of the header to give the address of the first byte of the header of the next array (or the end of arrays if there are no more).
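
The offset chaining can be sketched in Python as a walk over a memory image. This is only a model under stated assumptions: `read16` and `walk_arrays` are hypothetical names, the sample image is invented, and the offset is read low byte first (which is how the assembly listing later in this article reads it).

```python
def read16(mem, addr):
    """Read a 16-bit value stored low byte first."""
    return mem[addr] | (mem[addr + 1] << 8)

def walk_arrays(mem):
    """Yield (header address, name, offset) for every array in the table."""
    addr = read16(mem, 0x6B)   # start of the array table
    end = read16(mem, 0x6D)    # end of arrays plus one
    while addr < end:
        # Strip the sign bits, which carry the type rather than the name.
        name = "".join(chr(b & 0x7F) for b in mem[addr:addr + 2] if b & 0x7F)
        offset = read16(mem, addr + 2)
        if offset == 0:
            raise ValueError("zero offset: table is corrupt")
        yield addr, name, offset
        addr += offset

# Hypothetical image holding two array headers back to back at $0900.
mem = bytearray(65536)
mem[0x6B:0x6D] = bytes([0x00, 0x09])             # table starts at $0900
mem[0x6D:0x6F] = bytes([0x19, 0x09])             # and ends just before $0919
mem[0x0900:0x0904] = bytes([0x41, 0x80, 13, 0])  # string array A$, offset 13
mem[0x090D:0x0911] = bytes([0x58, 0x59, 12, 0])  # real array XY, offset 12

for addr, name, offset in walk_arrays(mem):
    print(f"${addr:04X} {name:2s} next header at ${addr + offset:04X}")
```

The walk stops when the running address reaches the value in $6D,$6E, which is the "end of arrays" case mentioned above.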

The number of dimensions is one byte, which obviously means no more than 255 dimensions per array. Oh well! In my sample below I assume that no more than 120 dimensions have been declared. If you try to declare more than that, you will see how hard it is.

The dimensions are stored in backward order, last dimension first. Why? Why not? It has to do with the order they are used in calculating position for an individual element. Each dimension is also one larger than you declare in the DIM statement, because subscripts start at 0.

The data part of an array consists of the elements ordered so that the first subscript changes fastest. That is, element X(2,10) directly follows element X(1,10) in memory. Integer array elements are two bytes each, with the high byte first. Note: this is just about the only place in all the 6502 kingdom where you will find high bytes first on 16-bit values!

Real array elements take five bytes each: one byte for exponent, and four for mantissa. String array elements take three bytes each: one for length of the string, and two for the address of the string. Note: the string array elements DO NOT hold the string data, but only the address and length of that data!
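
To make the layout concrete, here is a small Python sketch that decodes the type from the two name bytes and computes where an element sits relative to the start of the header. The function names are invented for this example; the header length of 5 bytes plus 2 per dimension and the element sizes come from the description above.

```python
ELEMENT_SIZE = {"real": 5, "integer": 2, "string": 3}

def array_type(name0, name1):
    """Decode the array type from the sign bits of the two name bytes."""
    neg0, neg1 = bool(name0 & 0x80), bool(name1 & 0x80)
    if not neg0 and not neg1:
        return "real"
    if neg0 and neg1:
        return "integer"
    if not neg0 and neg1:
        return "string"
    raise ValueError("sign-bit combination not used for arrays")

def element_offset(kind, declared, subscripts):
    """Byte offset of an element from the first byte of the array header.

    declared   -- the values given in DIM, e.g. (7, 9) for DIM A$(7,9)
    subscripts -- the element wanted, e.g. (3, 5) for A$(3,5)
    """
    if len(declared) != len(subscripts):
        raise ValueError("wrong number of subscripts")
    index, stride = 0, 1
    for sub, dim in zip(subscripts, declared):
        size = dim + 1              # subscripts start at 0
        if not 0 <= sub < size:
            raise IndexError("bad subscript")
        index += sub * stride       # first subscript changes fastest
        stride *= size
    header = 5 + 2 * len(declared)  # name, offset, count, then each dimension
    return header + index * ELEMENT_SIZE[kind]

print(array_type(0x41, 0x80))                    # name bytes of A$
print(element_offset("string", (7, 9), (3, 5)))  # A$(3,5) in DIM A$(7,9)
```

For DIM A$(7,9) the header is 9 bytes long, so A$(0,0) sits at offset 9 and each later element is 3 bytes further on.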

Getting to the Point:

There is a powerful and much-used subroutine in the Applesoft ROMs which will find a particular variable in the tables. It is called PTRGET, and starts at $DFE3. It is too complicated to fully explain here, but here is what it does:

Reads the variable name from the program text.
Determines whether the variable is a simple one or an array.
Searches the appropriate table for the name.
If the name is not found, creates a variable of the appropriate type (simple or array; integer, real, or string).
Returns with the address of the variable in Y,A (high-byte in the Y-register, low-byte in the A-register) and also in $83,84.

That is usually what happens. Actually there are several different entry points and two control bytes which modify PTRGET's behavior depending on the caller's whims. DIMFLG ($XX) is set non-zero when called by the DIM-statement processor, and is otherwise cleared to zero. SUBFLG ($YY) has four different states:

     $00 -- normal value
     $40 -- when called by GETARYPT
     $80 -- when called to process "DEF FN"
     $C1-$DA -- when called to process "FN"

We are concerned with the two cases SUBFLG = 0 and SUBFLG = $40, with DIMFLG = 0. Since the point of this whole article is to clarify access to array variables, I will concentrate on the main entry at $DFE3 and the GETARYPT subroutine at $F7D9. $DFE3 sets SUBFLG = 0, while GETARYPT sets SUBFLG = $40.

When we want to find an individual element inside an array, we call PTRGET at $DFE3. When we want to find the whole array, we call GETARYPT at $F7D9. GETARYPT is used by the STORE and RECALL Applesoft statements (which you might not realize even exist, since their function is only of interest to cassette tape users!)

The "& X" calls in the following program use PTRGET to find an array element.

On the other hand, if we want to sort the array, or if we want to save it all on disk, or some other feat which requires seeing the whole thing at once, we need to call GETARYPT. Then we can even find out how many subscripts were used in the DIM statement, and what the value of each dimension is. GETARYPT returns with the starting address of the whole array in $9B and $9C (called LOWTR).

The "& Y" call in the program prints out the starting address and length of each string of a string array.

I hope that as you work through the descriptions and examples above they are of some help.

    100  DIM A$(7,9)
    110 A$(3,5) = "ABCDEFG":A$(2,3) = "MNOPQRST"
    120  & X,A$(3,5)
    140  & X,A$(2,3)
    200  REM
    210  FOR J = 0 TO 7: FOR K = 0 TO 9
    215 A$ = ""
    220  FOR I = 1 TO  RND (1) * 5 + 5
    230 A$ = A$ + CHR$ ( RND (1) * 26 + 65)
    240  NEXT I
    245  PRINT J"  "K"  "A$
    250 A$(J,K) = A$: NEXT K: NEXT J
    260  & Y,A$

  1000 *  S.ARRAYS
  1010 *--------------------------------
  1020 CHRGET   .EQ $B1
  1030 CHKCOM   .EQ $DEBE
  1040 SYNCHR   .EQ $DEC0
  1050 PTRGET   .EQ $DFE3
  1060 GETARYPT .EQ $F7D9
  1070 PRNTAX   .EQ $F941
  1080 CROUT    .EQ $FD8E
  1090 PRHEX    .EQ $FDDA
  1100 COUT     .EQ $FDED
  1110 *--------------------------------
  1120 LENGTH       .EQ 0
  1130 STRING.ADDR  .EQ 1,2
  1140 ELEMENT.PNTR .EQ 3,4
  1150 ARRAY.END    .EQ 5,6
  1160 *--------------------------------
  1170        .OR $300
  1180 
  1190 START  LDA #X
  1200        STA $3F6
  1210        LDA /X
  1220        STA $3F7
  1230        RTS
  1240 *--------------------------------
  1250 *  GET ONE ARRAY ELEMENT
  1260 *--------------------------------
  1270 X      CMP #'X
  1280        BNE Y
  1290        JSR CHRGET
  1300        JSR CHKCOM      BE SURE COMMA IS NEXT
  1310        JSR PTRGET
  1320 *--------------------------------
  1330 *   NOW $83,84 POINTS AT A$(3,5)
  1340 *--------------------------------
  1350        LDY #0          FIRST BYTE IS STRING LENGTH
  1360        LDA ($83),Y     GET LENGTH
  1370        STA LENGTH
  1380        INY             NEXT TWO BYTES POINT
  1390        LDA ($83),Y     AT STRING VALUE
  1400        STA STRING.ADDR
  1410        INY
  1420        LDA ($83),Y
  1430        STA STRING.ADDR+1
  1440 *--------------------------------
  1450 *   NOW LET'S PRINT THE STRING, JUST FOR FUN
  1460 *--------------------------------
  1470        LDY #0
  1480 .1     CPY LENGTH
  1490        BCS .2          FINISHED
  1500        LDA (STRING.ADDR),Y
  1510        ORA #$80
  1520        JSR COUT
  1530        INY
  1540        BNE .1          ...ALWAYS
  1550 .2     JMP CROUT
  1560 *--------------------------------
  1570 *  GET ENTIRE ARRAY
  1580 *--------------------------------
  1590 Y      LDA #'Y
  1600        JSR SYNCHR
  1610        JSR CHKCOM
  1620        JSR GETARYPT
  1630 *--------------------------------
  1640 *   NOW $9B,9C HAVE ADDRESS OF START OF ARRAY
  1650 *   NEED TO MOVE POINTER UP TO FIRST ELEMENT
  1660 *--------------------------------
  1670        LDY #4       POINT AT LSB OF # DIMENSIONS
  1680        LDA ($9B),Y
  1690        ASL          DOUBLE IT (IGNORE MSB, #<120)
  1700        ADC #5       POINT AT FIRST ELEMENT
  1710        STA $9D
  1720        LDY #2       POINT AT LSB OF OFFSET
  1730        CLC          COMPUTE ADDRESS JUST PAST END OF ARRAY
  1740        LDA $9B
  1750        ADC ($9B),Y
  1760        STA ARRAY.END
  1770        LDA $9C      MSB
  1780        INY
  1790        ADC ($9B),Y
  1800        STA ARRAY.END+1
  1810 *--------------------------------
  1820 *      NOW COMPUTE FULL ADDRESS OF FIRST ELEMENT
  1830 *--------------------------------
  1840        CLC
  1850        LDA $9D
  1860        ADC $9B
  1870        STA ELEMENT.PNTR
  1880        LDA $9C
  1890        ADC #0
  1900        STA ELEMENT.PNTR+1
  1910 *--------------------------------
  1920 *      NOW WALK THROUGH STRINGS
  1930 *--------------------------------
  1940 .1     LDY #0                POINT AT FIRST ELEMENT
  1950        LDA (ELEMENT.PNTR),Y  GET LENGTH
  1960        STA LENGTH
  1970        INY
  1980        LDA (ELEMENT.PNTR),Y  GET ADDRESS
  1990        TAX
  2000        INY
  2010        LDA (ELEMENT.PNTR),Y
  2020        JSR PRNTAX
  2030        LDA #':+$80
  2040        JSR $FDED
  2050        LDA #' +$80
  2060        JSR $FDED
  2070        JSR $FDED
  2080        LDA LENGTH
  2090        JSR PRHEX
  2100        JSR CROUT
  2110 *--------------------------------
  2120        CLC
  2130        LDA #3
  2140        ADC ELEMENT.PNTR
  2150        STA ELEMENT.PNTR
  2160        LDA ELEMENT.PNTR+1
  2170        ADC #0
  2180        STA ELEMENT.PNTR+1
  2190 *--------------------------------
  2200        LDA ELEMENT.PNTR
  2210        CMP ARRAY.END
  2220        LDA ELEMENT.PNTR+1
  2230        SBC ARRAY.END+1
  2240        BCC .1
  2250        RTS

Macro Can Build Macros          Mike Laumer

The S-C Macro Assembler can do a lot of things even its designer never dreamed of. The macro capability may be limited compared to mainframe systems, but it still has a lot of power.

A few days ago I got a bright idea that maybe you could even define macros inside macros, or write a macro that builds new macros. Lo and behold, it works! Here is what I tried:

     1000        .MA BLD
     1010        ]1
     1020        ]2
     1030        ]3
     1040        ]4
     1050        .EM

Notice that every line from the opcode field on is defined by a macro parameter. I called it with lines like this:

     1060        >BLD ".MA ATOB","LDA A","STA B",".EM"
     1070        >BLD ".MA BTOA","LDA B","STA A",".EM"

  1000 *SAVE S.MACRO.MACROS
  1010        .MA BLD
  1020        ]1
  1030        ]2
  1040        ]3
  1050        ]4
  1060        .EM
  1070        >BLD ".MA ATOB","LDA A","STA B",".EM"
  1080        >BLD ".MA BTOA","LDA B","STA A",".EM"
  1090 *--------------------------------
  1100 A      .BS 1
  1110 B      .BS 1
  1120 *--------------------------------
  1130        >ATOB
  1140        >BTOA

Here is how it all looks when you type ASM:

               1010        .MA BLD
               1020        ]1
               1030        ]2
               1040        ]3
               1050        ]4
               1060        .EM
0800-          1070        >BLD ".MA ATOB","LDA A","STA B",".EM"
               0000>        .MA ATOB
               0000>        LDA A
               0000>        STA B
               0000>        .EM
0800-          1080        >BLD ".MA BTOA","LDA B","STA A",".EM"
               0000>        .MA BTOA
               0000>        LDA B
               0000>        STA A
               0000>        .EM
               1090 *--------------------------------
0800-          1100 A      .BS 1
0801-          1110 B      .BS 1
               1120 *--------------------------------
0802-          1130        >ATOB
0802- AD 00 08 0000>        LDA A
0805- 8D 01 08 0000>        STA B
0808-          1140        >BTOA
0808- AD 01 08 0000>        LDA B
080B- 8D 00 08 0000>        STA A

I don't know whether this is really useful or not.... If you think of a way to use it that is significant, I'd like to hear from you!

Epson MX-80 Text Screen Dump          Ulf Schlichtmann
West Germany

Here is a short machine language program I wrote some time ago when I was working on a data-base program. It permits you to make a hard copy of the Apple text screen. It was written for an Epson MX-80 with Epson's Apple II Interface kit type 2, but with just one slight modification it should work with any other printer or interface as well.

I thought readers of AAL might have a use for this, especially after seeing a similar program in NIBBLE (Vol. 3 No. 3 pages 147-148) that was over three times longer to produce exactly the same result! The authors of that program required 149 bytes, and even used self-modifying code. My routine is only 40 bytes long.

There is one difference: in the NIBBLE program KSWL,H is changed so that the routine will be invoked every time control-P is pressed; also the ampersand vector is set up to re-install the KSWL,H vector whenever needed. I don't need these features, but even when they are added my program is still only about 78 bytes long (and WITHOUT any self-modifying code!).

Lines 1180-1200 direct all following output to the printer, and are equivalent to the Applesoft statements:

    PR#1 : PRINT

Next I store $8D (left over from MON.CROUT) as the number of columns for the printer, since any number greater than 40 will disable output to the screen. If you have a different printer interface card, you may need to use a different location than $678+SLOT. It should be stated somewhere in the printer interface manual. This is the slight modification I mentioned earlier.

Then I use the Applesoft VTAB routine to calculate the base address for each line. The entry point I chose requires the X-register to be loaded with the number of the desired line (starting with zero for the top-most line). The base address will then be stored in BASL,H. [ Note that using AS.VTAB means that this program will only work if Applesoft is switched on. If you call this when the other memory bank is on, no telling what might happen! ]

Next I let Y run from 0 to 39 to pick up all the characters in that particular line via indirect addressing. Each character is immediately fed to the printer. Upon completing a line, I call MON.CROUT to cause the printer to print the line. When I have sent all 24 lines, I then redirect output to the CRT and rehook DOS (lines 1340-1350).
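
The row-by-row scan can be modelled in Python for anyone who wants to check it against a memory dump. This is a sketch, not the ROM routine: `text_base` uses the standard Apple II layout of text page 1, and `dump_screen` assumes the screen holds only normal (high-bit-set) characters, stripping that bit so the result prints as plain text. Both function names are invented here.

```python
def text_base(row):
    """Base address of text line `row` (0 = top line) on text page 1."""
    if not 0 <= row < 24:
        raise ValueError("row must be 0..23")
    return 0x400 + (row & 7) * 0x80 + (row >> 3) * 0x28

def dump_screen(mem):
    """Return the 24 screen lines as strings of 40 characters each."""
    lines = []
    for row in range(24):
        base = text_base(row)
        lines.append("".join(chr(mem[base + col] & 0x7F) for col in range(40)))
    return lines

# Hypothetical image: a blank screen with "HI" at the start of the second line.
mem = bytearray(65536)
for row in range(24):
    base = text_base(row)
    mem[base:base + 40] = bytes([0xA0]) * 40
mem[text_base(1)] = ord("H") | 0x80
mem[text_base(1) + 1] = ord("I") | 0x80

for line in dump_screen(mem):
    print(line.rstrip())
```

Note how the lines are not stored in top-to-bottom order in memory, which is why the program asks a VTAB routine for each base address instead of simply adding 40.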

Of course, there are a lot of possibilities for adding features to my basic screen dumper. The next version below does not rely on the Applesoft version of VTAB, so it can be called even when the Applesoft image is switched out. I also draw a border around the screen image: a line of dashes above and below, and vertical lines down both sides.

Instead of using $8D as a line length to turn off the screen output, I masked out the flag bit in $7F8+SLOT. This works in the Grappler and Grappler Plus interfaces, whereas the former method did not. (It is equivalent to printing control-I and letter-N.)

Further, I now restore the value of BASL,H at line 1490. Otherwise the value in CV ($25) and the address in BASL,H do not agree after printing the screen.

The last enhancement is at lines 1340-1370. Here I now convert characters from flashing and inverse modes to normal mode, or to blanks in some cases. You might want to arrange for a different mapping here, according to your own taste.
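
The mapping applied by those lines is small enough to model in a few lines of Python, which may help when designing a different one. The function name `printable` is made up for this sketch; the logic follows the ORA/CMP/BCS/LDA sequence in the second listing.

```python
def printable(screen_byte):
    """Map a text-screen byte to the byte sent to the printer."""
    value = screen_byte | 0x80   # force normal mode (ORA #$80)
    if value < 0xA0:             # would land in the control range
        value = 0xA0             # so send a space instead
    return value

# A normal "A", a flashing "A", an inverse "A", and an inverse "0".
for byte in (0xC1, 0x41, 0x01, 0x30):
    print(f"${byte:02X} -> ${printable(byte):02X}")
```

Inverse letters ($00-$1F) come out as blanks under this mapping, while inverse digits and all flashing characters keep their shape.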

Even with all these enhancements, the program is still only 86 bytes long. The first version could be loaded anywhere without reassembly, because there are no internal references. The second version does have an internal JSR, so it would have to be reassembled to run at other locations, or modified to be made run-anywhere.

  1000 *SAVE S.SCREEN PRINTER
  1010 *--------------------------------
  1020 *      INSTANT HARDCOPY PROGRAM
  1030 *      BY ULF SCHLICHTMANN
  1040 *--------------------------------
  1050 SLOT   .EQ 1
  1060 BASL   .EQ $28
  1070 BASH   .EQ $29
  1080 *--------------------------------
  1090 COLUMNS    .EQ $678
  1100 DOS.REHOOK .EQ $03EA
  1110 AS.VTAB    .EQ $F25A
  1120 MON.PR     .EQ $FE95
  1130 MON.CROUT  .EQ $FD8E
  1140 MON.COUT   .EQ $FDED
  1150 MON.SETVID .EQ $FE93
  1160 *--------------------------------
  1170        .OR $300
  1180 HCOPY  LDA #SLOT    SET UP OUTPUT VECTOR
  1190        JSR MON.PR   TO POINT AT PRINTER
  1200        JSR MON.CROUT     START A NEW LINE
  1210        STA COLUMNS+SLOT  DISABLE SCREEN
  1220        LDX #0       START AT TOP OF SCREEN
  1230 .1     JSR AS.VTAB  COMPUTE BASE ADDRESS
  1240        LDY #0       START IN COLUMN 1
  1250 .2     LDA (BASL),Y NEXT CHARACTER FROM THIS LINE
  1260        JSR MON.COUT
  1270        INY
  1280        CPY #40      END OF LINE YET?
  1290        BNE .2       NO
  1300        JSR MON.CROUT
  1310        INX          NEXT LINE
  1320        CPX #24      END OF SCREEN YET?
  1330        BNE .1       NO
  1340        JSR MON.SETVID
  1350        JMP DOS.REHOOK

  1000 *SAVE S.SCREEN PRINTER.PLUS
  1010 *--------------------------------
  1020 *      INSTANT HARDCOPY PROGRAM
  1030 *      BY ULF SCHLICHTMANN
  1040 *--------------------------------
  1050 SLOT   .EQ 1
  1060 BASL   .EQ $28
  1070 VLINE  .EQ $FC
  1080 *--------------------------------
  1090 FLAGS      .EQ $7F8
  1100 DOS.REHOOK .EQ $03EA
  1110 MON.VTAB   .EQ $FC22
  1120 MON.VTABZ  .EQ $FC24
  1130 MON.PR     .EQ $FE95
  1140 MON.CROUT  .EQ $FD8E
  1150 MON.COUT   .EQ $FDED
  1160 MON.SETVID .EQ $FE93
  1170 MON.DASH   .EQ $FD9E
  1180 *--------------------------------
  1190        .OR $300
  1200 HCOPY  LDA #SLOT    SET UP OUTPUT VECTOR
  1210        JSR MON.PR   TO POINT AT PRINTER
  1220        JSR MON.CROUT     START A NEW LINE
  1230        LDA FLAGS+SLOT
  1240        AND #$BF
  1250        STA FLAGS+SLOT
  1260        JSR DASH.LINE
  1270        LDX #0       START AT TOP OF SCREEN
  1280 .1     TXA
  1290        JSR MON.VTABZ   COMPUTE BASE ADDRESS
  1300        LDA #VLINE
  1310        JSR MON.COUT
  1320        LDY #0       START IN COLUMN 1
  1330 .2     LDA (BASL),Y NEXT CHARACTER FROM THIS LINE
  1340        ORA #$80     BE SURE IN RANGE FOR PRINTING
  1350        CMP #$A0
  1360        BCS .3
  1370        LDA #$A0     PRINT SPACE IN PLACE OF ILLEGALS
  1380 .3     JSR MON.COUT
  1390        INY
  1400        CPY #40      END OF LINE YET?
  1410        BNE .2       NO
  1420        LDA #VLINE
  1430        JSR MON.COUT
  1440        JSR MON.CROUT
  1450        INX          NEXT LINE
  1460        CPX #24      END OF SCREEN YET?
  1470        BNE .1       NO
  1480        JSR DASH.LINE
  1490        JSR MON.VTAB RE-ESTABLISH CURSOR POSITION
  1500        JSR MON.SETVID
  1510        JMP DOS.REHOOK
  1520 *--------------------------------
  1530 DASH.LINE
  1540        LDY #42
  1550 .1     JSR MON.DASH
  1560        DEY
  1570        BNE .1
  1580        JMP MON.CROUT
  1590 *--------------------------------

Optional Patch for TEXT/ Command          Bob Sander-Cederlof

Several have asked how to patch the character output at the beginning of each line by the TEXT/ command. TEXT/ normally writes your source code as a text file with control-I in place of each line number.

At $1AAD in the mother-board version, or $DAAD in the language card version, you will find $88. This is control-I minus one. Put whatever character you wish there, less one. For example, if you want a leading space on each line, put $1F in $1AAD and/or $DAAD.

Division: A Tutorial          Bob Sander-Cederlof

Remembering long division in decimal can be hard enough, but visualizing it in binary and implementing it in 6502 assembly language is awesome! Study the following example, in which I divide an 8-bit value by a 4-bit value:

                 00110            6
             ----------         ---
       1101 ) 01010101      13 ) 85
   step A:   -0000              -78
              ----               --
               1010               7
   step B:    -0000
               ----
               10101
   step C:    - 1101
               -----
                10000
   step D:     - 1101
                -----
                  0111
   step E:       -0000
                  ----
                  0111   Remainder

In the binary version, I have not made any leaps ahead like we do in decimal. That is, I wrote out the steps even when the quotient digit = 0. Now let's see a program which divides an 8-bit value by a 4-bit value, just like the example above.

  1000 *SAVE S.DIV.8.BY.4
  1010 *--------------------------------
  1020 *      DIVIDE 8-BIT VALUE
  1030 *          BY 4-BIT VALUE
  1040 *--------------------------------
  1050 DIVIDEND   .EQ 0
  1060 DIVISOR    .EQ 1
  1070 QUOTIENT   .EQ 2
  1080 *--------------------------------
  1090 S.DIV.8.BY.4
  1100        LDY #5       COUNT OFF 5 STEPS
  1110        LDA #0
  1120        STA QUOTIENT
  1130        LDA DIVISOR       SEE IF DIVISOR IN RANGE
  1140        BEQ .3            DIVIDE BY ZERO IS ILLEGAL
  1150        ASL          SHIFT DIVISOR TO LEFT NYBBLE
  1160        ASL 
  1170        ASL 
  1180        ASL 
  1190        STA DIVISOR
  1200 .1     LDA DIVIDEND      COMPARE DIVIDEND TO DIVISOR
  1210        SEC
  1220        SBC DIVISOR
  1230        BCC .2            DIVIDEND IS SMALLER
  1240        CMP DIVISOR       SEE IF STILL LARGER
  1250        BCS .3            YES, OVERFLOW
  1260        SEC               SET QUOTIENT BIT = 1
  1270        STA DIVIDEND
  1280 .2     ROL QUOTIENT      SHIFT QUOTIENT BIT IN
  1290        LSR DIVISOR       SHIFT DIVISOR OVER
  1300        DEY
  1310        BNE .1            DO NEXT STEP
  1320        ROL DIVISOR  RESTORE DIVISOR
  1330        RTS
  1340 .3     BRK          DIVIDE FAULT

If you think this is a clumsy program, you may be right. Note that the loop runs five times, not four. This is because there are five steps, as you can see in the sample division above.

The first thing the program does is to clear the quotient value. A true 4-bit machine dividing an 8-bit value by a 4-bit value would produce a 4-bit quotient, but here the quotient lives in a full byte, so the top bits must be cleared. The rest of the bits will be shifted in as the division progresses.

Next the divisor is shifted up to the high nybble position, to align with the left nybble of the dividend. This is equivalent to step A in the example above. The loop running from line 1200 through line 1310 performs the five partial divisions.

If the divisor is zero, or if the first partial division proves that the quotient will not fit in the five bits the loop produces, the program branches to ".3". I put a BRK opcode there, but you would put an error message printer, or whatever.

To run the program above, I typed:

     :$0:55 0D N 800G 0.2

and Apple responded with:

     0000- 07 0D 06

which means the remainder is 7, and the quotient is 6.
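
The five-step loop can be checked off the Apple with a short Python model. It is a sketch that mirrors the listing step for step rather than a replacement for it; `div_8_by_4` is a name invented here, and Python exceptions stand in for the BRK at ".3".

```python
def div_8_by_4(dividend, divisor):
    """Model of the 8-bit by 4-bit divide: returns (quotient, remainder)."""
    if not 0 <= dividend <= 0xFF or not 0 <= divisor <= 0x0F:
        raise ValueError("operands out of range")
    if divisor == 0:
        raise ZeroDivisionError("divide by zero is illegal")
    shifted = divisor << 4           # divisor moved to the left nybble
    quotient = 0
    for _ in range(5):               # count off 5 steps
        diff = dividend - shifted
        bit = 0
        if diff >= 0:                # subtraction did not borrow
            if diff >= shifted:      # still larger: quotient cannot fit
                raise OverflowError("divide fault")
            dividend = diff
            bit = 1
        quotient = (quotient << 1) | bit
        shifted >>= 1                # shift divisor over
    return quotient, dividend

print(div_8_by_4(0x55, 0x0D))
```

Called with $55 and $0D it returns a quotient of 6 and a remainder of 7, matching the monitor display above.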

Dividing Bigger Values:

The following program will divide one two-byte value by another. The program assumes that both the dividend and the divisor are unsigned values between 0 and 65535. This program was in the original Apple II monitor ROM at $FB84, but is not present in the Apple II Plus and Apple //e ROMs.

  1000 *SAVE S.DIV.16/16
  1010 *--------------------------------
  1020 *      DIVIDE 16 BY 16
  1030 *--------------------------------
  1040 ACL    .EQ $50
  1050 ACH    .EQ $51
  1060 XTNDL  .EQ $52
  1070 XTNDH  .EQ $53
  1080 AUXL   .EQ $54
  1090 AUXH   .EQ $55
  1100 *--------------------------------
  1110 DIVMON LDY #16      INDEX FOR 16 BITS
  1120 .1     ASL ACL      DIVIDEND*2, CLEAR QUOTIENT BIT
  1130        ROL ACH
  1140        ROL XTNDL
  1150        ROL XTNDH
  1160        SEC
  1170        LDA XTNDL    TRY SUBTRACTING DIVISOR
  1180        SBC AUXL
  1190        TAX
  1200        LDA XTNDH
  1210        SBC AUXH
  1220        BCC .2       TOO SMALL, QBIT=0
  1230        STX XTNDL    OKAY, STORE REMAINDER
  1240        STA XTNDH
  1250        INC ACL      SET QUOTIENT BIT = 1
  1260 .2     DEY          NEXT STEP
  1270        BNE .1
  1280        RTS

As written, this program expects the XTNDL and XTNDH bytes to be zero initially. If they are not, a 32-bit by 16-bit division is performed; however, there is no error checking for overflow or divide fault conditions.

This program builds the quotient in the same memory locations used for the dividend. As the dividend is shifted left to align with the divisor (opposite but equivalent to the shifting done in the previous program), empty bits appear on the right end of the dividend register. These bit positions can be filled with the quotient as it develops.
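
The shared-register trick is easier to see in a Python model, where the dividend and its extension are treated as one 32-bit value. This is only a sketch of the technique; the name `divmon` and the `extension` parameter are chosen here to mirror the listing's labels, and like the original it does no checking for divide by zero.

```python
def divmon(dividend, divisor, extension=0):
    """Model of the 16-step divide: returns (quotient, remainder)."""
    ac = dividend & 0xFFFF       # ACL,ACH: dividend in, quotient out
    xtnd = extension & 0xFFFF    # XTNDL,XTNDH: remainder builds up here
    aux = divisor & 0xFFFF       # AUXL,AUXH: divisor
    for _ in range(16):
        # Shift the 32-bit pair left; bit 0 of AC is now an empty slot.
        wide = ((xtnd << 16) | ac) << 1
        ac = wide & 0xFFFF
        xtnd = (wide >> 16) & 0xFFFF
        if xtnd >= aux:          # trial subtraction succeeds
            xtnd -= aux
            ac |= 1              # fill the empty slot with a quotient bit
    return ac, xtnd

print(divmon(65535, 10))
```

Each pass frees one bit at the right end of the dividend and immediately fills it with the next quotient bit, so after 16 passes the dividend has been completely replaced by the quotient.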

Signed Division

With a few steps of preparation, we can divide signed values using an unsigned division subroutine. All we need to remember is the rule learned in high school: If numerator and denominator have the same sign, the quotient is positive; if not, the quotient is negative.

  1290 *--------------------------------
  1300 *      SIGNED DIVISION 16/16
  1310 *--------------------------------
  1320 SIGN   .EQ $2F
  1330 *--------------------------------
  1340 SIGNED.DIV.MON
  1350        LDY #0
  1360        STY XTNDL    CLEAR ACC EXTENSION
  1370        STY XTNDH
  1380        STY SIGN
  1390        LDX #ACL
  1400        JSR ABS
  1410        LDX #AUXL
  1420        JSR ABS
  1430        JSR DIVMON
  1440        LDA SIGN
  1450        BPL .1       RESULT POSITIVE
  1455        LDX #ACL
  1460        JSR COMPLEMENT
  1470 .1     RTS
  1480 *--------------------------------
  1490 ABS    LDA 1,X      LOOK AT SIGN
  1500        BPL ABSRET   POSITIVE
  1510        EOR SIGN     COMPLEMENT RESULT SIGN
  1520        STA SIGN
  1530 COMPLEMENT
  1540        SEC
  1550        TYA          =0
  1560        SBC 0,X
  1570        STA 0,X
  1580        TYA          =0
  1590        SBC 1,X
  1600        STA 1,X
  1610 ABSRET RTS
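
The same sign bookkeeping can be sketched in Python. This is a model under stated assumptions: `signed_div` is an invented name, Python's built-in `divmod` stands in for the unsigned DIVMON routine, and, like the listing, only the quotient is negated while the remainder is left as a magnitude.

```python
def signed_div(dividend, divisor):
    """Divide signed values using an unsigned divide on their magnitudes."""
    negative = False
    if dividend < 0:                 # ABS on the dividend
        negative = not negative      # complement the result sign
        dividend = -dividend
    if divisor < 0:                  # ABS on the divisor
        negative = not negative
        divisor = -divisor
    quotient, remainder = divmod(dividend, divisor)  # unsigned divide
    if negative:                     # signs differed
        quotient = -quotient
    return quotient, remainder

for n, d in ((85, 13), (-85, 13), (85, -13), (-85, -13)):
    print(n, d, signed_div(n, d))
```

Notice that this rounds the quotient toward zero, which is not what Python's own `//` operator does with negative operands.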

Double Precision, Almost:

What if I want to divide a full 32-bit value by a full 16-bit value? Both values are unsigned. The 32-bit dividend may have a value from 0 to 4294967295, and the divisor from 0 to 65535. All of the published programs I could find assume the leading bit of the dividend is zero, limiting the range to half of the above.

If the leading bit of the dividend is significant, a one bit extension is needed in the division loop. The following program implements a full 32/16 division.

  1000 *SAVE S.DIVIDE 32/16
  1010 *--------------------------------
  1020 DIVIDE LDX #17           16-BIT DIVISOR
  1040        CLC               START WITH NO OVERFLOW
  1050 .1     ROR OVERFLOW
  1060        SEC
  1070        LDA DIVIDEND+1    NEXT-TO-HIGHEST BYTE
  1080        SBC DIVISOR+1     LEAST SIGNIFICANT BYTE
  1090        TAY               SAVE RESULT
  1100        LDA DIVIDEND      HIGHEST BYTE
  1110        SBC DIVISOR
  1120        BCS .2            QUOTIENT BIT = 1
  1130        ASL OVERFLOW      TRUE QUOTIENT BIT
  1140        BCC .3
  1150 .2     STY DIVIDEND+1    QUOTIENT BIT = 1
  1160        STA DIVIDEND
  1170 .3     ROL DIVIDEND+3    SHIFT QUOTIENT BIT INTO END
  1180        ROL DIVIDEND+2    AND MOVE TO NEXT POSITION
  1190        ROL DIVIDEND+1
  1200        ROL DIVIDEND
  1210        DEX
  1220        BNE .1
  1230        ROR DIVIDEND      SHIFT REMAINDER BACK IN PLACE
  1240        ROR DIVIDEND+1
  1250        ROR OVERFLOW      SET SIGN BIT IF OVERFLOW
  1260        RTS
  1270 *--------------------------------
  1280 DIVIDEND   .BS 4
  1290 REMAINDER  .EQ DIVIDEND
  1300 QUOTIENT   .EQ DIVIDEND+2
  1310 DIVISOR    .BS 2
  1320 OVERFLOW   .BS 1
  1330 *--------------------------------
  1340        .LIF

Line 1020 sets up a 17-step loop, because the 16-bit divisor can be shifted to 17 different positions under the 32-bit dividend. To make it easier to understand the layout of bytes in memory, I departed from the usual low-byte-first format in this program. I assume this time that the most significant bytes are first:

     Dividend:   $83A $83B $83C $83D
                 msb . . . . . . lsb

     Divisor:    $83E $83F
                 msb...lsb

I also have written this program to feed the quotient bits into the least significant end of the dividend register, as the dividend shifts left. The remainder will be found in the left two bytes of the dividend register, and the quotient in the right two bytes.

Watching It All Work:

Not being quite clairvoyant, I wanted to see what was really happening inside the 32/16 division program. So I added some trace printouts by inserting "JSR TRACE" right after lines 1050 and 1250. I also moved the variables into page zero, to show how much memory that can save. (All memory references are changed from 3-byte instructions to 2-byte instructions.)

  1000 *SAVE S.DIVIDE 32/16 WITH TRACE
  1010 *--------------------------------
  1020 OVERFLOW   .EQ $00
  1030 DIVIDEND   .EQ $01 THRU $04
  1040 REMAINDER  .EQ DIVIDEND
  1050 QUOTIENT   .EQ DIVIDEND+2
  1060 DIVISOR    .EQ $05 AND $06
  1070 *--------------------------------
  1080 MON.CROUT  .EQ $FD8E
  1090 MON.PRHEX  .EQ $FDDA
  1100 MON.COUT   .EQ $FDED
  1110 *--------------------------------
  1120 DIVIDE LDX #17           16-BIT DIVISOR
  1130        CLC               START WITH NO OVERFLOW
  1140 .1     ROR OVERFLOW
  1150        JSR TRACE
  1160        SEC
  1170        LDA DIVIDEND+1    NEXT-TO-HIGHEST BYTE
  1180        SBC DIVISOR+1     LEAST SIGNIFICANT BYTE
  1190        TAY               SAVE RESULT
  1200        LDA DIVIDEND      HIGHEST BYTE
  1210        SBC DIVISOR
  1220        BCS .2            QUOTIENT BIT = 1
  1230        ASL OVERFLOW      TRUE QUOTIENT BIT
  1240        BCC .3
  1250 .2     STY DIVIDEND+1    QUOTIENT BIT = 1
  1260        STA DIVIDEND
  1270 .3     ROL DIVIDEND+3    SHIFT QUOTIENT BIT INTO END
  1280        ROL DIVIDEND+2    AND MOVE TO NEXT POSITION
  1290        ROL DIVIDEND+1
  1300        ROL DIVIDEND
  1310        DEX
  1320        BNE .1
  1330        ROR DIVIDEND      SHIFT REMAINDER BACK IN PLACE
  1340        ROR DIVIDEND+1
  1350        ROR OVERFLOW      SET SIGN BIT IF OVERFLOW
  1360 *--------------------------------
  1370 TRACE  LDA #$B0
  1380        BIT OVERFLOW
  1390        BPL .1
  1400        LDA #$B1
  1410 .1     JSR MON.COUT
  1420        LDY #0
  1430 .2     LDA #$A0
  1440        JSR MON.COUT
  1450        LDA DIVIDEND,Y
  1460        JSR MON.PRHEX
  1470        INY
  1480        CPY #4
  1490        BCC .2
  1500        JSR MON.CROUT
  1510        RTS
  1520 *--------------------------------
  1530        .LIF

The trace program first prints the overflow extension bit. If this is "1" on the last line, the quotient is too large to fit in 16 bits. TRACE then prints the four bytes of the dividend register in hex: the two remainder bytes first, followed by the two quotient bytes. A line is printed before each step, and one more at the end (the divide routine falls into TRACE after the last step) to show the final results.

Now here are the printouts for a few values of dividend and divisor.

    *1:00 00 FF FF  00 0A   doing $FFFF / $0A  (65535/10)
    *800G
    0 00 00 FF FF
    0 00 01 FF FE
    0 00 03 FF FC
    0 00 07 FF F8
    0 00 0F FF F0
    0 00 0B FF E1
    0 00 03 FF C3
    0 00 07 FF 86
    0 00 0F FF 0C
    0 00 0B FE 19
    0 00 03 FC 33
    0 00 07 F8 66
    0 00 0F F0 CC
    0 00 0B E1 99
    0 00 03 C3 33
    0 00 07 86 66
    0 00 0F 0C CC
    0 00 05 19 99  so $FFFF / $0A = $1999 rem 5  (65535/10 = 6553 rem 5)
    
    *1:00 00 19 99  00 0A
    *800G
    0 00 00 19 99
    0 00 00 33 32
    0 00 00 66 64
    0 00 00 CC C8
    0 00 01 99 90
    0 00 03 33 20
    0 00 06 66 40
    0 00 0C CC 80
    0 00 05 99 01
    0 00 0B 32 02
    0 00 02 64 05
    0 00 04 C8 0A
    0 00 09 90 14
    0 00 13 20 28
    0 00 12 40 51
    0 00 10 80 A3
    0 00 0D 01 47
    0 00 03 02 8F  so $1999/$0A = $28F rem 3  (6553/10 = 655 rem 3)
      
    *1:FF F8 00 00  FF FF
    *800G
    0 FF F8 00 00
    1 FF F0 00 00
    1 FF E2 00 01
    1 FF C6 00 03
    1 FF 8E 00 07
    1 FF 1E 00 0F
    1 FE 3E 00 1F
    1 FC 7E 00 3F
    1 F8 FE 00 7F
    1 F1 FE 00 FF
    1 E3 FE 01 FF
    1 C7 FE 03 FF
    1 8F FE 07 FF
    1 1F FE 0F FF
    0 3F FE 1F FF
    0 7F FC 3F FE
    0 FF F8 7F FC
    0 FF F8 FF F8  so $FFF80000 / $FFFF = $FFF8 rem $FFF8
    
    *1:FF FE 00 01  FF FF
    *800G
    0 FF FE 00 01
    1 FF FC 00 02
    1 FF FA 00 05
    1 FF F6 00 0B
    1 FF EE 00 17
    1 FF DE 00 2F
    1 FF BE 00 5F
    1 FF 7E 00 BF
    1 FE FE 01 7F
    1 FD FE 02 FF
    1 FB FE 05 FF
    1 F7 FE 0B FF
    1 EF FE 17 FF
    1 DF FE 2F FF
    1 BF FE 5F FF
    1 7F FE BF FF
    0 FF FF 7F FF
    0 00 00 FF FF  so $FFFE0001 / $FFFF = $FFFF rem 0
    
    *1:FF FF FF FF  FF FF
    *800G
    0 FF FF FF FF
    0 00 01 FF FF
    0 00 03 FF FE
    0 00 07 FF FC
    0 00 0F FF F8
    0 00 1F FF F0
    0 00 3F FF E0
    0 00 7F FF C0
    0 00 FF FF 80
    0 01 FF FF 00
    0 03 FF FE 00
    0 07 FF FC 00
    0 0F FF F8 00
    0 1F FF F0 00
    0 3F FF E0 00
    0 7F FF C0 00
    0 FF FF 80 00
    1 00 00 00 01  so $FFFFFFFF / $FFFF = $0001 overflow
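
For readers who want to check a trace without an Apple at hand, here is a rough Python sketch that mirrors the register shuffling of the routine above. It is only a cross-check written for this purpose, not part of the original program; the names divide_32_by_16 and trace_line are made up for the illustration, and the 6502 carry flag is modeled as an ordinary variable.

```python
def trace_line(overflow, reg):
    """Format one TRACE line: overflow bit, then the four register bytes."""
    data = reg.to_bytes(4, "big")
    return f"{overflow} " + " ".join(f"{b:02X}" for b in data)


def divide_32_by_16(dividend, divisor):
    """Return (quotient, remainder, overflow, trace_lines)."""
    reg = dividend & 0xFFFFFFFF        # DIVIDEND..DIVIDEND+3, msb first
    carry = 0                          # CLC before the loop
    lines = []
    for _ in range(17):                # LDX #17
        overflow = carry               # ROR OVERFLOW: carry becomes bit 7
        lines.append(trace_line(overflow, reg))
        diff = (reg >> 16) - divisor   # trial subtraction from the left two bytes
        bit = 1 if (diff >= 0 or overflow) else 0
        if bit:                        # .2: keep the difference
            reg = ((diff & 0xFFFF) << 16) | (reg & 0xFFFF)
        carry = reg >> 31              # bit pushed out by ROL DIVIDEND
        reg = ((reg << 1) | bit) & 0xFFFFFFFF
    high = reg >> 16
    overflow = high & 1                # 17th quotient bit, moved by the final RORs
    high = (carry << 15) | (high >> 1) # shift remainder back in place
    reg = (high << 16) | (reg & 0xFFFF)
    lines.append(trace_line(overflow, reg))
    return reg & 0xFFFF, high, overflow, lines


quotient, remainder, overflow, lines = divide_32_by_16(0x0000FFFF, 0x000A)
print("\n".join(lines))
```

Each call yields 18 lines: one before each of the 17 steps, plus the final result.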

Short Note About Prime Benchmarks
Frank Hirai
West Lebanon, NH

About your faster primes articles (Vol 2 #1, Vol 2 #5, and Vol 3 #2).... If you go back to Jim Gilbreath's original BYTE article you will find that the times he lists are for TEN iterations. As such they are not unreasonable for Integer BASIC and Applesoft. When comparing times for your 6502 assembly language versions, remember to multiply by ten!

Even so, 1.83 seconds for 10 iterations using Anthony Brightwell's program in the Apple compares quite well against 1.12 seconds for 10 iterations in an 8 MHz Motorola 68000.

[ ...and wait till we try it on a Number Nine 6502 card at 3.6 MHz! Or with a 65C02! ]

Patching Applesoft for Garbage-Collection Indicator
Lee Meador

I wanted to know when (how often and for how long) Applesoft was doing garbage collection. The following patch will cause an inverse "!" to be placed in the lower right-hand corner of the screen whenever garbage collection takes place.

It is a little tricky to patch Applesoft, since it is in ROM! The first step is to copy the ROMs into the language card RAM space (any slot 0 RAM card will do). If you have an old Apple II with Integer BASIC on the mother board, you can do this by booting the DOS 3.3 Master. Otherwise, here are the steps:

    ]CALL-151
    *C081 C081
    *D000<D000.FFFFM

Next you need to place some code inside the Applesoft image in the RAM card. I chose to place the new code on top of the HFIND subroutine at $F5CB. (The code from $F5CB through $F5FF is never used by Applesoft.) Here is the routine I put there:

    PATCH  PHA
           LDA #$21        INVERSE "!"
           STA $7F7        BOTTOM RIGHT CORNER
           PLA
           JSR GARBAG
           PHA
           LDA #$A0        BLANK BACK ON SCREEN CORNER
           STA $7F7
           PLA
           RTS
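
The address $7F7 in the routine is the last character cell of the bottom text line. As a small sketch of where that number comes from, the Python below applies the usual 40-column text page 1 row formula; the function name text_row_base is invented for the illustration.

```python
def text_row_base(row):
    """Base address of 40-column text row 0-23 on text page 1."""
    return 0x400 + 0x80 * (row % 8) + 0x28 * (row // 8)


# Bottom line is row 23, rightmost column is 39.
corner = text_row_base(23) + 39
print(f"${corner:03X}")
```

The same formula gives the address of any other cell, should you prefer the indicator in a different spot.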

You also need to patch the existing "JSR GARBAG" inside Applesoft to jump to this new code. Here are the patches in hex:

    *C083 C083         write enable RAM card
    *E47B:CB F5
    *F5CB:48 A9 21 8D F7
    *F5D0:07 68 20 84 E4 48 A9 A0
    *F5D8:8D F7 07 68 60
    *C080              write protect RAM card
    *control-C
    ]run your program
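
If you want to double-check hex like this before typing it in, a rough Python sketch such as the following lays the routine out byte by byte from $F5CB. The opcodes are the standard 6502 encodings, and GARBAG is taken to be $E484, as implied by the bytes 20 84 E4; the names PATCH_ORG, PATCH, and listing are just labels chosen for the sketch.

```python
PATCH_ORG = 0xF5CB
PATCH = [
    ("PHA",       [0x48]),
    ("LDA #$21",  [0xA9, 0x21]),
    ("STA $07F7", [0x8D, 0xF7, 0x07]),
    ("PLA",       [0x68]),
    ("JSR $E484", [0x20, 0x84, 0xE4]),
    ("PHA",       [0x48]),
    ("LDA #$A0",  [0xA9, 0xA0]),
    ("STA $07F7", [0x8D, 0xF7, 0x07]),
    ("PLA",       [0x68]),
    ("RTS",       [0x60]),
]


def listing(org, code):
    """Return (rows, last_address) for instructions stored from org upward."""
    addr = org
    rows = []
    for text, data in code:
        hexpart = " ".join(f"{b:02X}" for b in data)
        rows.append(f"{addr:04X}: {hexpart:<9}{text}")
        addr += len(data)
    return rows, addr - 1


rows, last = listing(PATCH_ORG, PATCH)
print("\n".join(rows))
print(f"last byte at ${last:04X}")
```

The 18 bytes must run without gaps, since the 6502 executes straight through them.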

Here is a little Applesoft program which generates a lot of garbage strings so you can see the patch in action:

    100 DIM A$(100)
    110 FOR I = 1 TO 100
    120 FOR J = 1 TO 200 : A$(I) = A$(I) + "B" : NEXT
    130 PRINT I, : NEXT

Try running the program with different HIMEM values, to see the different effects.

S-C Macro Assembler Version 1.1
Bob Sander-Cederlof

A new version of the S-C Macro Assembler is just about ready, and it's going to be great!

I have added many new features, corrected a few problems, and created a special version to take advantage of the extra features of the new Apple //e computer. Here's a summary of the new items, so far:

New or Extended Features:

1. The .HS directive now allows optional "." characters before and after each pair of hex digits. (e.g., .HS ..12..34..AB) This makes for easier counting of bytes, and allows you to put meaningful comments above or below the .HS lines.

2. .DO--.FIN can now be nested to 63 levels, rather than just 8 levels.

3. In the EDIT command, the insert mode is now invoked by ^A (ADD), rather than ^I. The TAB or ^I keys now perform a clear-to-tab function. Skip-to-tab is still invoked by ^T.

4. Comment lines may now begin with either "*" or ";".

5. Added .SE directive, which allows re-definable symbols.

6. Binary constants are now supported. The syntax is "%11000011101" (up to 16 bits).

7. ASCII literals with the high-bit set are now allowed, and are signified with the quotation mark: LDA #"X generates A9 D8. Note that a trailing "-mark is optional, just as is a trailing apostrophe with previous ASCII literals.

8. Blanks are now compressed inside macro skeletons when they are added to the symbol table. This saves about 30% of the space used by the skeletons.

9. The TEXT/ <filename> command now outputs the current TAB character (default ctrl-I). It used to put out control-I no matter what the current TAB character was.

10. During assembly, the assembler now protects $001F-$02FF and $03D0-$07FF, as well as MACLBL thru EOT and MACSTK thru $FFFF.

11. Now allow USER parameters to override memory protection. $101C-101D contains lower bound, and $101E-101F contains the upper bound of an area the user wants to UN-PROTECT. (The parameter for the starting page of the symbol table has moved from $101D to $1021, or $D01D to $D021.)

12. Added .PH and .EP directives, to start and end a phase. With these directives you can assemble a section of code that is intended to be moved and run somewhere else, without having to create a separate Target File.

13. Added .DUMMY and .ED to start and end a dummy section.

14. The TAB character may now be set to any character, including non-control characters, if you so desire.
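
To make items 1, 6, and 7 a little more concrete, here is a loose Python sketch of the values those notations stand for. It only illustrates the arithmetic and is not how the assembler itself parses anything; the helper names are invented, and hs_bytes is more permissive than the directive because it ignores a "." anywhere in the operand.

```python
def hs_bytes(operand):
    """Bytes for an .HS operand, skipping the optional "." separators."""
    return bytes.fromhex(operand.replace(".", ""))


def binary_constant(text):
    """Value of a %-prefixed binary constant of up to 16 bits."""
    digits = text[1:]
    if not text.startswith("%") or not 1 <= len(digits) <= 16:
        raise ValueError(text)
    return int(digits, 2)


def high_ascii(ch):
    """ASCII code of ch with the high bit set, as for a "-quoted literal."""
    return ord(ch) | 0x80


print(hs_bytes("..12..34..AB").hex().upper())
print(f"${binary_constant('%11000011101'):04X}")
print(f"${high_ascii('X'):02X}")
```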

Fixes to Known Problems:

1. Eliminated endless loop which occurred when a character > "Z" was typed in column 1 as a command.

2. .TI now properly spaces at top of each page, and at beginning of symbol table.

3. .AS and .AT now assemble lower case properly.

4. Changed the way the relative branches are assembled, so that "*" is equal to the location of the opcode byte. It used to be the location of the offset byte, which was non-standard.

5. Pass-two errors now emit the proper number of object bytes, so that false range errors are not indicated.

6. HIDE now performs MERGE prior to HIDE, in case you forgot to do so.

Features added in support of Apple //e:

1. The Apple //e version allows you to change between 80- and 40-column screens at will, using PR#3 to go to 80 columns, or ESC-^Q to go to 40 columns.

2. In both normal input and edit modes, the DELETE key acts like a backspace key. It is interpreted the same as a left arrow (^H).

And there's more! The release disk will now include 80-column versions of the assembler for the Videx, STB, and ... 80-column cards.

I haven't made up my mind yet about a new price, how we'll handle the upgrades, or how much the charge will be. We'll have the final details in AAL next month.