65C02 Notes
We now have a sample from Rockwell, and it shares the problem of not working in an older Apple. It's running just fine in the //e, but it doesn't work in the ][+. Rockwell's distributor says that regular delivery is now scheduled for November. Sigh....
There's a bug in the 65C02 chips! Among the new features are several new addressing modes for the BIT instruction, including BIT #immediate.
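The BIT instruction actually does two operations:
     1. It ANDs the A-register with the operand and sets the Zero flag according to the result (the A-register itself is not changed).
     2. It copies bits 7 and 6 of the operand into the Negative and oVerflow flags.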
Well, the BIT #immediate instruction does not do step two; it only modifies the Zero flag. The other new address modes for BIT behave correctly. BIT #$40 sure would have come in handy for a SEV (SEt oVerflow flag) instruction.
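To make the difference concrete, here is a small Python model of the flag behavior described above. It is only a sketch: the function name bit_flags and the dictionary of flags are illustrative inventions, and the model covers nothing but the N, V, and Z flags.

```python
def bit_flags(a, operand, flags, immediate=False):
    """Return the N, V, Z flags after a BIT instruction.

    a and operand are 8-bit values; flags is a dict with keys "N", "V", "Z".
    immediate=True models BIT #immediate on the 65C02 as described in the text.
    """
    result = dict(flags)
    # Step one: Z is set when (A AND operand) is zero.
    result["Z"] = 1 if (a & operand) == 0 else 0
    if not immediate:
        # Step two: bits 7 and 6 of the operand are copied into N and V.
        result["N"] = (operand >> 7) & 1
        result["V"] = (operand >> 6) & 1
    return result


before = {"N": 0, "V": 0, "Z": 0}
memory_form = bit_flags(0xFF, 0x40, before)                      # BIT on a memory byte holding $40
immediate_form = bit_flags(0xFF, 0x40, before, immediate=True)   # BIT #$40
```

In this model memory_form comes back with V set, while immediate_form leaves V alone, which is why BIT #$40 cannot serve as a SEV.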
As always, we'll keep you posted.
Applesoft has a statement which allows branching according to a computed index:
ON X GO TO 100,200,300,400
Integer BASIC has a different method, simply allowing the line number after a GOTO, THEN, or GOSUB to be a computed value:
GO TO X*100
Most other languages have some technique for vectoring to one of a series of places based on the value of a variable. Modern languages like Pascal have a CASE statement, which combines the comparison and the branch in one construct:

     case PIECE of
        Pawn   : ...;
        Knight : ...;
        Bishop : ...;
        Rook   : ...;
        Queen  : ...;
        King   : ...;
     end
I frequently find myself building various schemes to handle the CASE statement in assembly language. For example, I might accept a character from the keyboard and then compare it to a series of legal inputs, and branch accordingly to process the input.
One common way involves a series of CMP BEQ pairs, like this:
       JSR GETCHAR
       CMP #$81     control-A?
       BEQ ...      yes
       CMP #$84     control-D?
       BEQ ...      yes
       CMP #$8D     return?
       BEQ ...      yes
       et cetera
If there are not too many cases, and if the processing routines are not too far away for the BEQs to reach, this is a good way to do the job. If the routines are bigger, and therefore tend to be too far away (causing RANGE ERRORS at assembly time), I might string together CMP BNE pairs instead:
       JSR GETCHAR
       CMP #$81     control-A?
       BNE TRY.D    no, try ctrl-D
       <code to process ctrl-A here>
TRY.D  CMP #$84     control-D?
       BNE TRY.M    no, try return
       <code for ctrl-D here>
TRY.M  CMP #$8D     return?
       BNE ...      et cetera
       <code for ctrl-M here>
The trouble with the latter way is that programs get strung all over the place, and become very difficult to follow. Unstructured, some would say. The structure is really there, because we are just implementing a CASE statement; however, assembly language code over a sheet of paper long LOOKS unstructured, no matter what it is implementing. And once a programmer gets his CASE statement spread over several sheets of paper, the temptation to begin making a "rat's nest" out of it can be overwhelming.
I prefer to put things into nice neat data tables. Back in the August 1982 issue of AAL I presented a "Search and Perform" subroutine to handle a table like this:
       .DA #$81,CTRL.A-1
       .DA #$84,CTRL.D-1
       .DA #$8D,RETURN-1
       etc.
The table consists of three bytes per line, the first byte being the CASE value, and the other two being the address of the processing routine.
Another method is handy when the variable has a nice numeric range. For example, what if I have processing routines for every possible control character from ctrl-A through ESC? That is ASCII codes $81 through $9B. If I subtract $81, I get a value from 0 through 26 (decimal). If I then multiply the value by three, and add it to a base address, and store the result into another variable, and JMP indirect, I can access a series of JMPs to each processing routine:
       JSR GETCHAR
CASE   SEC
       SBC #$81
       CMP #27
       BCS ...ERROR, NOT IN RANGE
       STA ADDR     TIMES THREE
       ASL
       ADC ADDR
       ADC #TABLE   PLUS TABLE BASE ADDRESS
       STA ADDR
       LDA #0
       ADC /TABLE
       STA ADDR+1
       JMP (ADDR)
ADDR   .BS 2
TABLE  JMP CTRL.A
       JMP CTRL.B
        .
        .
       JMP ESCAPE
Note that if we got to the CASE program by doing a JSR CASE, then each processing routine can do an RTS to return to the main line program. This makes our CASE look like it is doing a series of JSR's instead of JMP's.
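The address arithmetic in the JMP-table version can be checked with a few lines of Python. This is only a sketch of the calculation, not of the 6502 code itself; the table base $0900 and the function name are hypothetical values chosen for illustration.

```python
FIRST_CODE = 0x81    # ctrl-A, with the high bit set as the Apple keyboard delivers it
ENTRY_COUNT = 27     # ctrl-A through ESC ($81 through $9B)
TABLE = 0x0900       # hypothetical address of the first JMP in the table


def jmp_entry_address(char_code, table_base=TABLE):
    """Return the address of the 3-byte JMP entry for char_code."""
    index = char_code - FIRST_CODE           # SEC / SBC #$81
    if not 0 <= index < ENTRY_COUNT:         # CMP #27 / BCS error
        raise ValueError("character not in range")
    return (table_base + 3 * index) & 0xFFFF  # times three, plus table base


ctrl_a_entry = jmp_entry_address(0x81)
escape_entry = jmp_entry_address(0x9B)
```

With the base assumed here, ctrl-A lands on the first JMP and ESC on the one 78 bytes later, which is the last entry in a 27-entry table.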
We can shave bytes off the above technique by only keeping the address in TABLE, without all the JMP opcodes. Then the variable only needs to be multiplied by two instead of three. We will have to use the doubled variable for an index to pick up the address in the table and put it into ADDR:
       JSR GETCHAR
CASE   SEC
       SBC #$81
       CMP #27
       BCS ...ERROR, NOT IN RANGE
       ASL          DOUBLE THE INDEX
       TAX
       LDA TABLE,X
       STA ADDR
       LDA TABLE+1,X
       STA ADDR+1
       JMP (ADDR)
ADDR   .BS 2
TABLE  .DA CTRL.A
       .DA CTRL.B
        .
        .
       .DA ESCAPE
I don't recommend self-modifying code, but I still use it sometimes. If you want to save two more bytes above, then you can store the jump address directly into the second and third bytes of a direct JMP instruction:

       LDA TABLE,X
       STA ADDR+1
       LDA TABLE+1,X
       STA ADDR+2
ADDR   JMP 0
A much better way involves pushing the processing routine address onto the stack, and using an RTS to branch to the pushed address. Since RTS adds 1 to the address on the stack before branching, we have to push the address-1:
       LDA TABLE+1,X
       PHA          HIGH BYTE FIRST
       LDA TABLE,X
       PHA
       RTS
TABLE  .DA CTRL.A-1
       .DA CTRL.B-1
        .
        .
       .DA ESCAPE-1
Note that this method not only is not self-modifying, it also is a few bytes shorter and a tad faster.
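The address-1 bookkeeping is easy to get backwards, so here is a rough Python model of the push-and-RTS dispatch. The target addresses are made up for the example, and the model captures only the stack order and the plus-one that RTS performs.

```python
def build_table(targets):
    """Store each routine address minus one, as the .DA lines do."""
    return [(t - 1) & 0xFFFF for t in targets]


def dispatch_via_rts(table, index):
    """Push table[index] high byte first, then let RTS pull it and add one."""
    stack = []
    entry = table[index]
    stack.append(entry >> 8)       # LDA TABLE+1,X / PHA  (high byte first)
    stack.append(entry & 0xFF)     # LDA TABLE,X   / PHA  (then low byte)
    low = stack.pop()              # RTS pulls the low byte first...
    high = stack.pop()             # ...then the high byte
    return (((high << 8) | low) + 1) & 0xFFFF


TARGETS = [0x0A00, 0x0A37, 0x0B12]   # hypothetical routine addresses
TABLE = build_table(TARGETS)
landed = [dispatch_via_rts(TABLE, i) for i in range(len(TARGETS))]
```

The first target was chosen to sit on a page boundary: its table entry is $09FF, and the plus-one still brings control to $0A00.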
All this is only necessary because the designers of the 6502 did not give us a JMP (addr,X) instruction. If they had, we could do it like this:
       JSR GETCHAR
CASE   SEC
       SBC #$81
       CMP #27
       BCS ...ERROR
       ASL          DOUBLE FOR INDEX
       TAX
       JMP (TABLE,X)
TABLE  .DA CTRL.A, CTRL.B,...,ESCAPE
Then the hardware would add the doubled character offset (0,2,4,...52 for ctrl-A thru ESC) to the base address of the table, pick up the address from the table, and jump to the corresponding processing routine.
Since that would be so nice, and the designers agreed, the new 65C02 chip has it! So if you know you are writing for a 65C02, and don't EVER intend to run in a plain 6502, you can use the JMP (TABLE,X).
It would also be nice to have JSR (TABLE,X), but you can simulate that by calling CASE with a JSR. Or in other situations, you might merely do it this way:
       JSR CALL
        .
        .
CALL   JMP (TABLE,X)
Sometimes it so happens that your program can be arranged so that all the processing routines are in the same memory page. Then there is no need to store the high byte of the address in the table, right? Steve Wozniak thought this way, and you can see the result in the Apple monitor at $FFBE and following:
TOSUB  LDA #$FE       HIGH BYTE OF ALL ADDRESSES
       PHA
       LDA SUBTBL,Y
       PHA
ZMODE  LDY #0
       STY MODE
       RTS
        .
        .
SUBTBL .DA #BASCONT-1 CTRL-C
       .DA #USR-1     CTRL-Y
       .DA #BEGZ-1    CTRL-E
        .
        .
       .DA #BLANK-1   BLANK
Steve also used this technique inside the SWEET-16 interpreter. You can see the code at $F69E through $F6C6 in the Integer BASIC ROM or RAM image.
If the routines are not necessarily all in one page, but are all within one 256-byte range, you can add an offset from the table to a known starting address.
Here is a method I would NEVER use, but it is cute, and short:
       LDA TABLE,X    X IS CALCULATED INDEX
       STA BRANCH+1   INTO BCC INSTRUCTION
       CLC            make branch always...
BRANCH BCC BRANCH     2ND BYTE GETS FILLED IN
BASE   .EQ *
       ...
       ...all the routines here
       ...
TABLE  .DA #CTRL.A-BASE
       .DA #CTRL.B-BASE
       etc.
The table has pre-computed relative offsets from BASE, so that the values can be plugged directly into the BCC instruction. This is a fast and short technique, but somehow it scares me to think about self-modifying code. If you need it, go ahead and use it!
I wanted to use QUICKTRACE in conjunction with the S-C Assembler without having QUICKTRACE interfere with either my source file or any object code generated. Since I always use the LC version of the assembler, I modified the HELLO program on the S-C assembler disk as follows:
     10 HOME:PRINT "LOADING QUICKTRACE..."
     20 POKE 40192,211:POKE40193,142:CALL42964
     30 PRINT CHR$(4)"BLOAD QUICKTRACE,A$8F00"
     40 PRINT:PRINT "LOADING S-C ASSEMBLER..."
     50 VTAB24:POKE34,23:PRINTCHR$(4)"EXEC LOAD LCASM"
     60 END
Line 20 in the HELLO program modifies the location of the DOS buffers by $E00 bytes to make room for the QUICKTRACE program. After running the HELLO program, when the S-C prompt appears and BEFORE loading any S-C source files, enter:
:$8F00G <return>
This initializes QUICKTRACE.
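For anyone who wants to check the numbers in line 20 of the HELLO program, here is the arithmetic as a short Python sketch. It assumes the usual 48K DOS 3.3 value of $9CD3 in the pointer at $9D00-$9D01; that starting value is my assumption, not something stated in the letter.

```python
POINTER_ADDRESS = 40192     # first POKE address in line 20
OLD_POINTER = 0x9CD3        # assumed standard 48K DOS 3.3 contents of that pointer

low_byte, high_byte = 211, 142              # values POKEd by line 20
new_pointer = low_byte + 256 * high_byte    # little-endian 16-bit value
shift = OLD_POINTER - new_pointer           # how far the buffers move down

QUICKTRACE_LOAD = 0x8F00                    # from the BLOAD in line 30
room_top = QUICKTRACE_LOAD + shift          # end of the space that was freed

print(hex(POINTER_ADDRESS), hex(new_pointer), hex(shift), hex(room_top))
```

Under that assumption the pointer drops from $9CD3 to $8ED3, a move of $E00 bytes, which is exactly the space from $8F00 up to $9D00.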
I also changed the address at MON$ (from within QUICKTRACE) to MON$=D003 so when I press M from single-step mode, I return to the S-C Assembler with my source file intact.
Apparently nobody picked up my challenge at the end of the article about Charlie Putney's faster spiral screen clear program (August 1983 AAL, page 16). I suggested someone write a program in Applesoft which would in turn construct a machine language screen clear.
Nobody else did it, so I did. And whether you are interested in fancy ways to clear the screen or not, the techniques I used may be put to other uses.
The task of building a screen clear program can be divided into two parts. First, generate the memory addresses of the 960 cells on the screen, in the order (or path) that the spiral shift will follow. Second, using that table of addresses, generate the 959 pairs of LDA and STA instructions necessary to move the screen one position along the spiral path. There is really a third part: to generate the necessary prologue and postlogue instructions to make those 959 LDA-STA pairs be executed 960 times, and to clear the vacated byte at the tail end of the spiral path.
After trying various ways to understand the spiral path, I arrived at a table-driven approach. I put the table into data statements (lines 3000-3110 below), and made a simple loop to generate the 960 addresses (lines 100-150).
You might notice that the twelve lines of data correspond very closely to the parameters on Charlie Putney's macro calls. After I typed in the twelve lines, I noticed a definite pattern. I could have used only the first line of data, and computed the others by a simple algorithm: increment each value smaller than 13, and decrement each value 13 or larger. Well, no program is ever finished....
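The increment/decrement rule is easy to try out. The Python sketch below starts from the first DATA line only, derives the other eleven by that rule, and then walks the four legs of each ring the same way the subroutines at 1000, 1200, 1100, and 1300 do. The function names are illustrative, and it works with (column, row) pairs rather than screen addresses.

```python
FIRST_ROW = [0, 23, 0,  0, 1, 39,  39, 1, 23,  23, 38, 1]   # DATA line 3000


def next_row(row):
    """Increment each value smaller than 13, decrement each value 13 or larger."""
    return [v + 1 if v < 13 else v - 1 for v in row]


def spiral_rows(count=12):
    rows = [FIRST_ROW]
    while len(rows) < count:
        rows.append(next_row(rows[-1]))
    return rows


def spiral_cells():
    """Return the (x, y) cells in the order the address table is built."""
    cells = []
    for x1, yb, yt, y1, xl, xr, x2, yt2, yb2, y2, xr2, xl2 in spiral_rows():
        cells += [(x1, y) for y in range(yb, yt - 1, -1)]     # GOSUB 1000
        cells += [(x, y1) for x in range(xl, xr + 1)]         # GOSUB 1200
        cells += [(x2, y) for y in range(yt2, yb2 + 1)]       # GOSUB 1100
        cells += [(x, y2) for x in range(xr2, xl2 - 1, -1)]   # GOSUB 1300
    return cells


ROWS = spiral_rows()
CELLS = spiral_cells()
```

The twelfth derived row matches DATA line 3110, and the walk visits each of the 960 screen cells exactly once, starting at the bottom left corner and ending at column 12 of row 12.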
Once the 960 addresses are stored in array A%(0) through A%(959), I proceed to generate machine language code. Line 180 does it all, with the help of four simple subroutines. Then line 190 rings the bell, and line 200 calls the machine language program just generated for a fast two-and-a-half second demonstration.
During the address array building process, I fill up the screen with the letters U, D, L, and R. These show the direction (up, down, left, and right) which a given character will be shifted along the spiral path. The directions are just the opposite from the order in which the letters are displayed, because I generate the address list backwards (from head to tail).
During the generation of the machine language program, which takes about two minutes, I toggle the tail end character between normal and inverse video. This gives you something to watch for those lloooonnggg two minutes.
The generation process is broken into four parts, represented by four subroutines at 5000, 5100, 5200, and 5300.
GOSUB 5000 generates a four byte prologue, starting at memory address $2710, or 10000. The code looks like this:
       LDX /-960
       LDY #-960
Actually, not -960, but -960/S. S gives a step size. Sidestepping a little from the main discussion, let me tell you about S.
Don Lancaster called last week to talk about a few things with Bill, and passed on the results of his experiments with Charlie's program. He noted that the video refresh rate is 60 times per second, and that a 7.5 second screen clear moves a little more than two steps for each frame time. Therefore you don't really SEE each step. Therefore the screen clear routine could move each character two steps ahead at a time with the same smooth effect on the screen, but clearing the screen in half the time. Or three steps, clearing in one third the time. The variable S in my program lets you experiment with the number of steps each character moves during each pass. As listed, S=3, so the screen clears in 2.5 seconds.
GOSUB 5100 generates the requisite number of LDA-STA pairs to move the screen one step of size S along the spiral path.
GOSUB 5200 generates the instructions to clear the bytes at the tail end of the spiral. If S=3, you will get:
       LDA #$A0     BLANK
       STA $636
       STA $635
       STA $634
GOSUB 5300 generates the end-of-loop code:
       INY
       BNE LP
       INX
       BNE LP
       RTS
LP     JMP 10004
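The prologue and the end-of-loop code work together as a 16-bit counter that counts up to zero. Here is a hedged Python model of just that counter, useful for confirming the number of passes for a given S. It uses integer division, so it only matches the BASIC program when S divides 960 evenly; the function names are my own.

```python
def prologue_values(s):
    """Return (X, Y) as loaded by the prologue: the two bytes of -960/S."""
    t = 65536 - 960 // s
    return t >> 8, t & 0xFF


def count_passes(x, y):
    """Count trips through the loop body before the RTS is reached."""
    passes = 0
    while True:
        passes += 1              # one pass over all the LDA-STA pairs
        y = (y + 1) & 0xFF       # INY
        if y != 0:               # BNE LP
            continue
        x = (x + 1) & 0xFF       # INX
        if x != 0:               # BNE LP
            continue
        return passes            # RTS


PASSES = {s: count_passes(*prologue_values(s)) for s in (1, 2, 3)}
```

For S=3 the prologue loads X=$FE and Y=$C0, and the loop body runs 320 times; for S=1 it runs the full 960 times.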
The screen need not necessarily be cleared to all blanks. By changing the value POKEd in the second part of line 5210 you can fill with all stars, or all white, or whatever.
Another interesting option occurs to me. Given a table in the A% array of all the screen addresses, in any arrangement that suits my fancy, I can clear the screen in 2.5 to 7.5 seconds by shifting the screen along that particular path. It could be random, spiral, kaleidoscopic, or whatever.
There are so many other things I could explain about this little program, I hardly know where to stop. I think I'll stop here, and leave the rest for your own rewarding investigation and analysis.
     100  TEXT : HOME : DIM A%(1000)
     105 N = 0
     110  READ X,YB,YT: GOSUB 1000
     120  READ Y,XL,XR: GOSUB 1200
     130  READ X,YT,YB: GOSUB 1100
     140  READ Y,XR,XL: GOSUB 1300
     150  IF N < 960 THEN 110
     160  REM BUILD MACHINE LANGUAGE SPIRAL SHIFT
     170 S = 3
     180  GOSUB 5000: GOSUB 5100: GOSUB 5200: GOSUB 5300
     190  CALL  - 1054: REM RING BELL
     200  CALL 10000: END
     500  REM POKE ADDRESS
     510 AH =  INT (A / 256):AL = A - AH * 256: POKE L + 1,AL: POKE L + 2,AH:L = L + 3: POKE 1588,256 -  PEEK (1588): RETURN
     1000  REM MOVE DOWN COLUMN X FROM YB TO YT
     1010 C$ = "D": FOR Y = YB TO YT STEP  - 1
     1020  VTAB Y + 1: GOSUB 2000: NEXT : RETURN
     1100  REM MOVE UP COLUMN X FROM YT TO YB
     1110 C$ = "U": FOR Y = YT TO YB: VTAB Y + 1: GOSUB 2000: NEXT : RETURN
     1200  REM MOVE LEFT ROW Y FROM XL TO XR
     1210 C$ = "L": VTAB Y + 1: FOR X = XL TO XR: GOSUB 2000: NEXT : RETURN
     1300  REM MOVE RIGHT ROW Y FROM XR TO XL
     1310 C$ = "R": VTAB Y + 1: FOR X = XR TO XL STEP  - 1: GOSUB 2000: NEXT : RETURN
     2000  REM POST ADDRESS
     2010 A =  PEEK (40) +  PEEK (41) * 256 + X:A%(N) = A:N = N + 1: POKE A, ASC (C$) + 128
     2020  RETURN
     3000  DATA  0,23,0,   0,1,39,   39,1,23,   23,38,1
     3010  DATA  1,22,1,   1,2,38,   38,2,22,   22,37,2
     3020  DATA  2,21,2,   2,3,37,   37,3,21,   21,36,3
     3030  DATA  3,20,3,   3,4,36,   36,4,20,   20,35,4
     3040  DATA  4,19,4,   4,5,35,   35,5,19,   19,34,5
     3050  DATA  5,18,5,   5,6,34,   34,6,18,   18,33,6
     3060  DATA  6,17,6,   6,7,33,   33,7,17,   17,32,7
     3070  DATA  7,16,7,   7,8,32,   32,8,16,   16,31,8
     3080  DATA  8,15,8,   8,9,31,   31,9,15,   15,30,9
     3090  DATA  9,14,9,   9,10,30,  30,10,14,  14,29,10
     3100  DATA  10,13,10, 10,11,29, 29,11,13,  13,28,11
     3110  DATA  11,12,11, 11,12,28, 28,12,12,  12,27,12
     5000  REM COMPILE PROLOGUE
     5010 T = 65536 - 960 / S:TH =  INT (T / 256):TL = T - TH * 256
     5020  POKE 10000,162: POKE 10001,TH
     5030  POKE 10002,160: POKE 10003,TL
     5040  RETURN
     5100  REM COMPILE LDA-STA PAIRS
     5110 L = 10004: FOR I = 0 TO 959 - S: POKE L,173:A = A%(I + S): GOSUB 500
     5120  POKE L,141:A = A%(I): GOSUB 500: NEXT
     5130  RETURN
     5200  REM COMPILE CLEAR S BYTES
     5210  POKE L,169: POKE L + 1,160:L = L + 2
     5220  FOR I = 1 TO S: POKE L,141:A = A%(960 - I): GOSUB 500: NEXT
     5230  RETURN
     5300  REM COMPILE POSTLOGUE
     5310  FOR I = 0 TO 9: READ A: POKE L + I,A: NEXT
     5320  RETURN
     5350  DATA 200,208,4,232,208,1,96,76,20,39
It would be nice to be able to use monitor commands from within Applesoft, both in direct commands and within running Applesoft programs. At least Kraig Arnett, from Homestead, Florida, thinks so.
I agree, and so I whipped out another handy-dandy &-subroutine for just that purpose. I call it Amper-Monitor. You can install it by BRUNning it from a binary file, or by adding some POKEs to your Applesoft program. My listing shows it residing at the ever popular $300 address, but it can be reassembled to run anywhere. Just remember to connect it properly to the Ampersand Vector.
Once Amper-Monitor is installed and hooked to the ampersand vector, you call it by typing an ampersand, a quotation mark, and a monitor command. Here is a sample program showing some uses of the Amper-Monitor.
     100 FOR I = 768 TO 855
     110 READ D : POKE I,D : NEXT
     120 CALL 768
     130 &"300.357
     140 &"380:12 34 56 78 9A BC DE F0
     150 &"FBE2G
     160 &"300L 380.387
     200 DATA 169,11,141,246,3,169,3,141,247,3,96
     210 DATA 201,34,208,70,32,177,0,160,0,177,184,201,0
     220 DATA 240,8,9,128,153,0,2,200,208,242,169,141
     230 DATA 153,0,2,152,24,101,184,133,184,144,2,230
     240 DATA 185,32,199,255,32,167,255,132,52,160,23
     250 DATA 136,48,23,217,204,255,208,248,192,21,240
     260 DATA 8,32,190,255,164,52,76,52,3,32,197,255
     270 DATA 76,0,254,76,201,222
Why did I choose to require the quotation mark after the ampersand? Because normally Applesoft would parse the line, eliminating blanks, changing DEF to a token instead of three hex digits, using ":" to end a line, and so on. Using the "-mark prevents all this, leaving the line in raw ASCII form. Here is a listing of the program in assembly language:
     1000 *SAVE S.AMPER.MONITOR
     1010 *--------------------------------
     1020 *      &-MONITOR COMMANDS
     1030 *--------------------------------
     1040 MON.MODE   .EQ $31
     1050 MON.YSAV   .EQ $34
     1060 TXTPTR     .EQ $B8  AND B9
     1070 MON.BUFFER .EQ $200
     1080 AMPERSAND.VECTOR .EQ $3F5
     1090 *--------------------------------
     1100 AS.CHRGET  .EQ $00B1
     1110 AS.SYNERR  .EQ $DEC9
     1120 MON.BL1    .EQ $FE00
     1130 MON.GETNUM .EQ $FFA7
     1140 MON.TOSUB  .EQ $FFBE
     1150 MON.ZMODE  .EQ $FFC7
     1160 MON.CHRTBL .EQ $FFCC
     1170 *--------------------------------
     1180        .OR $300
     1190 *--------------------------------
     1200 SETUP  LDA #AMPER.MONITOR
     1210        STA AMPERSAND.VECTOR+1
     1220        LDA /AMPER.MONITOR
     1230        STA AMPERSAND.VECTOR+2
     1240        RTS
     1250 *--------------------------------
     1260 AMPER.MONITOR
     1270        CMP #$22     MUST BE QUOTATION MARK HERE
     1280        BNE .6       SYNTAX ERROR
     1290        JSR AS.CHRGET
     1300        LDY #0
     1310 .1     LDA (TXTPTR),Y
     1320        BEQ .2
     1330        ORA #$80
     1340        STA MON.BUFFER,Y
     1350        INY
     1360        BNE .1
     1370 .2     LDA #$8D
     1380        STA MON.BUFFER,Y
     1390        TYA
     1400        CLC
     1410        ADC TXTPTR
     1420        STA TXTPTR
     1430        BCC .25
     1440        INC TXTPTR+1
     1450 .25    JSR MON.ZMODE
     1460 .3     JSR MON.GETNUM
     1470        STY MON.YSAV
     1480        LDY #23
     1490 .4     DEY
     1500        BMI .6       SYNTAX ERROR
     1510        CMP MON.CHRTBL,Y
     1520        BNE .4       NOT THIS ENTRY
     1530        CPY #21
     1540        BEQ .5       <RETURN> ALONE
     1550        JSR MON.TOSUB
     1560        LDY MON.YSAV
     1570        JMP .3
     1580 .5     JSR MON.ZMODE-2
     1590        JMP MON.BL1
     1600 .6     JMP AS.SYNERR
Lines 1200-1240 link in the ampersand vector. This is the only part that would have to be changed if you move the routine.
When Applesoft sees an "&", it will JSR to AMPER.MONITOR. The A-register will hold the character following the "&", which we hope is a quotation mark. Lines 1270 and 1280 do this hoping.
Lines 1290-1380 copy the characters following the quotation mark into the monitor buffer starting at $200. If you typed in the &"... as a direct command, it is already in the monitor buffer but starts at $202, so it gets shifted over two bytes. If the command is in a program, it will be copied out of program space into $200. Applesoft has stripped off the sign bit from every byte, so my loop adds the sign bit back in to satisfy the monitor's requirements. Applesoft ends the line with a $00 byte, and the monitor wants $8D, so I fix that up too. I don't let colon terminate the line, because colon is a valid character in a monitor command line. I use "LDA (TXTPTR),Y" rather than repeated calls to AS.CHRGET because AS.CHRGET would eliminate blanks.
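The effect of the copy loop in lines 1310-1380 can be modeled in a few lines of Python. This is only a sketch of the byte transformation (the function name is illustrative), not a replacement for the assembly code.

```python
def to_monitor_buffer(line_bytes):
    """Copy an Applesoft line into monitor-buffer form.

    Copies up to (not including) the $00 end-of-line byte, turns the high
    bit on in every character, and ends the result with $8D.
    """
    out = bytearray()
    for b in line_bytes:
        if b == 0x00:              # BEQ .2: Applesoft end of line
            break
        out.append(b | 0x80)       # ORA #$80: put the sign bit back
    out.append(0x8D)               # monitor wants a carriage return
    return bytes(out)


buffer_image = to_monitor_buffer(b"300.357\x00")
```

Note that blanks and colons pass straight through, since the loop looks only for the zero byte.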
Lines 1390-1440 adjust the Applesoft pointer to the end of the line, so upon returning we won't get false syntax errors and the Applesoft program can continue executing.
Lines 1450-1590 parse the command line one command at a time, call on the monitor to execute each command, and finally return to Applesoft after the last command on the line. (The idea for this code came originally from code Steve Wozniak wrote for the mini-assembler in the old Apple monitor ROM.) Note that an illegal monitor command will result in a syntax error.
I thought it would now be possible to use the Amper-Monitor to write hex dumps on text files...BUT: Unfortunately DOS uses some critical zero page locations which prevent using the Amper-Monitor while writing on a text file. Monitor commands use locations $3D through $42, and so does DOS. I tried using the &"300.357 to do a hex dump into a text file, but DOS went wild and clobbered itself. Sorry, but I see no solution without changing DOS or recoding the entire monitor.
In the July issue of AAL I outlined the changes Apple made to DOS 3.3 early this year. Today I received a new "Developer's System Master", with a cover letter claiming another correction to the APPEND routine. The letter binds developers to begin using the new version no later than November 1st.
If you like APPEND, or would like to like it, you might want to make these patches in your own system master. I am going to assume you already have the "early 1983" version, either because you bought a //e or a disk drive this year, or you copied one from a friend, or you made the patches from my July article. Here are the new changes:

     "early 1983"                      August, 1983
     ---------------------             -----------------------
     B683:4C 84 BA  JMP $BA84          B683:4C B3 B6  JMP $B6B3

     $B6B3-B6CE:ALL ZEROES             B6B3:AD BD B5  LDA $B5BD
                                       B6B6:8D E6 B5  STA $B5E6
                                       B6B9:8D EA B5  STA $B5EA
                                       B6BC:AD BE B5  LDA $B5BE
                                       B6BF:8D E7 B5  STA $B5E7
                                       B6C2:8D EB B5  STA $B5EB
                                       B6C5:8D E4 B5  STA $B5E4
                                       B6C8:BA        TSX
                                       B6C9:8E 9B B3  STX $B39B
                                       B6CC:4C 7F B3  JMP $B37F

     $BA84-BA93:PATCH                  BA84-BA93:ALL ZEROES
What Apple has done is move the patch they had put at $BA84 down to $B6B3 and added four extra lines to that patch. I HOPE IT IS NOW CORRECT!
I believe that Steve Wozniak was the first to use the tricks in a microcomputer, back in 1976 and 1977. All of the other designs I recall either used the more expensive static RAM, or used a complex circuit to refresh dynamic RAM arrays. Steve's design allowed the use of dynamic RAM without any separate circuitry for refresh.
Dynamic RAM needs refreshing because each bit cell is really only a capacitor, and the charge runs out after a few milliseconds. By reading each bit and re-writing it every few milliseconds, the data in memory is maintained as long as you like. Each 16384-bit RAM-chip is organized in 128 rows by 128 columns of bits, and the chips are designed so that merely addressing each row often enough will keep the bits fresh as a daisy. Steve hooked up the Apple so that the process of keeping data displayed on the screen also ran through all the row addresses.
His second trick was to keep the screen (and therefore the RAM) happy without stealing any time from the CPU. He did this by using alternate half cycles of the clock. The one-megahertz clock runs the 6502 every other half cycle, and the screen gets its whacks at memory in between.
What has all the above to do with an article titled "Base Address Calculation"? Well, I'm getting to that. In order to address each row often enough, Steve re-arranged the address bits in a rather complicated way. As the screen is refreshed, scan-line by scan-line, bytes are read from RAM in an order that assures every RAM row is accessed about every 2 milliseconds. [ For the exact details of this process, see Winston Gayler's "Apple II Circuit Description", pages 41-57. ]
All this boils down to a need to go through a complicated calculation to convert a display line number into a base address in RAM. The process is implemented for the text screen at $FBC1 in the monitor ROM; for the lo-res graphics screen at $F847 in the monitor ROM; for the hi-res graphics screen at $F417 in the Applesoft ROM.
If we represent the 8-bit value for the line number on the text screen as "000abcde", the base calculation computes the address in RAM for the first character on that line and stores the result in two bytes at $28 and $29 in the form "000001cd eabab000". The two bits "ab" may have values "00", "01", or "10" for lines 0-7, 8-15, and 16-23 respectively. The "abab000" part of the least significant byte of the base address represents "ab" times 40. Remember there are 40 characters on a line?
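The bit shuffle is easier to believe once you try a few lines. Here is a Python sketch that builds the text base address straight from the bit pattern described above; the function name is illustrative, and it is not a transcription of the monitor routine at $FBC1.

```python
def text_base(line):
    """Base address of a text line (0-23), built as 000001cd eabab000."""
    if not 0 <= line <= 23:
        raise ValueError("text line must be 0 through 23")
    ab = line >> 3              # which third of the screen: 0, 1, or 2
    cd = (line >> 1) & 0x03
    e = line & 0x01
    high = 0x04 | cd                        # 000001cd
    low = (e << 7) | (ab << 5) | (ab << 3)  # eabab000; abab000 is ab times 40
    return (high << 8) | low


BASES = [text_base(n) for n in range(24)]
```

Lines 0, 1, and 8 come out at $400, $480, and $428, and line 23 at $7D0, which agrees with the familiar text screen map.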
The hi-res base address calculation is more complicated, but it is really the same thing. If we think of a text line as being made up of 8 hi-res lines, both calculations ARE the same, except that the lo-res RAM starts at $400, and the hi-res starts at $2000. A hi-res line number runs from 0 through 191, or $00 - $BF. If we visualize it as "abcdefgh", the base address calculation merely re-arranges the bits to "001fghcd eabab000". Note that if we multiply the text line number by 8 and run it through the hi-res calculation we will get "001000cd eabab000" which is correct except for starting at $2000 rather than $400.
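The same kind of sketch works for the hi-res case, and it lets us check the multiply-by-8 remark directly. Both function names below are illustrative; the page base defaults to $2000 for hi-res page 1.

```python
def hires_base(line, page_base=0x2000):
    """Base address of hi-res line abcdefgh (0-191): 001fghcd eabab000."""
    if not 0 <= line <= 191:
        raise ValueError("hi-res line must be 0 through 191")
    ab = line >> 6
    cde = (line >> 3) & 0x07
    fgh = line & 0x07
    # fgh selects a $400 block, cde a $80 step, and ab adds 40 bytes each.
    return page_base + (fgh << 10) + (cde << 7) + 40 * ab


def text_base(line):
    """Text base address for line 0-23, for comparison."""
    return 0x400 + ((line & 7) << 7) + 40 * (line >> 3)


# Each text line corresponds to hi-res line 8*t, apart from the page base.
OFFSETS_MATCH = all(
    hires_base(8 * t) - 0x2000 == text_base(t) - 0x400 for t in range(24)
)
```

Hi-res line 1 lands $400 bytes above line 0, line 8 at $2080, and line 191 at $3FD0.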
The hi-res calculation inside Applesoft takes 33 bytes and 61 cycles. Harry Cheung, who lives in Onitsha, Nigeria, wrote a letter to Call APPLE (page 70, July, 1983) to present his shorter, faster version. Harry did it in 25 bytes and only 49 cycles (one more byte and 6 more cycles if you count the RTS, but I didn't count an RTS in the Applesoft version). Here is Harry's code, with my comments.
     *--------------------------------
     *   BASE ADDRESS CALCULATOR
     *      HARRY CHEUNG
     *      PMB 1601, ONITSHA, NIGERIA
     *      CALL APPLE, JULY 1983, PAGE 70
     *--------------------------------
     CALC   TAY          (TAY..TYA COULD BE PHA..PLA)
            AND #$C7     ABCDEFGH
            STA 0        AB000FGH
            ORA #$08     FOR BASE = $2000, $10 FOR $4000
            STA 1        AB001FGH
            TYA          ABCDEFGH
     *                   CARRY..A-REG......$00.......$01...
            ASL          A--BCDEFGH0   AB000FGH  AB001FGH
            ASL          B--CDEFGH00      "         "
            ROR 0        H--   "       BAB000FG     "
            ASL          C--DEFGH000      "         "
            ROL 1        A--   "          "      B001FGHC
            ROR 0        G--   "       ABAB000F     "
            ASL          D--EFGH0000      "         "
            ROL 1        B--   "          "      001FGHCD
            ASL          E--FGH00000      "         "
            ROR 0        F--   "       EABAB000  001FGHCD
            RTS
I need to point out several things here. Harry used page zero locations $00 and $01 for the resulting base address. If you want to use his program with Applesoft, change them to $26 and $27. Harry saved the line number temporarily in the Y-register. If the Y-register is already holding something important (it is in the Applesoft case), you can substitute PHA and PLA for the TAY and TYA above. Same number of bytes, but 3 cycles longer.
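If you would rather not trace the shifts and rotates by hand, here is a Python sketch that steps through Harry's instruction sequence one instruction at a time, carrying the carry bit along, and compares the result against the bit pattern for all 192 lines. The helper names are my own; only the sequence of operations comes from the listing.

```python
def asl(value):
    """ASL: shift left, return (result, carry out of bit 7)."""
    return (value << 1) & 0xFF, value >> 7


def rol(value, carry):
    """ROL: rotate left through carry."""
    return ((value << 1) | carry) & 0xFF, value >> 7


def ror(value, carry):
    """ROR: rotate right through carry."""
    return (value >> 1) | (carry << 7), value & 1


def cheung_calc(line):
    y = line                   # TAY
    a = line & 0xC7            # AND #$C7
    lo = a                     # STA 0
    hi = a | 0x08              # ORA #$08 / STA 1
    a = y                      # TYA
    a, c = asl(a)              # ASL
    a, c = asl(a)              # ASL
    lo, c = ror(lo, c)         # ROR 0
    a, c = asl(a)              # ASL
    hi, c = rol(hi, c)         # ROL 1
    lo, c = ror(lo, c)         # ROR 0
    a, c = asl(a)              # ASL
    hi, c = rol(hi, c)         # ROL 1
    a, c = asl(a)              # ASL
    lo, c = ror(lo, c)         # ROR 0
    return (hi << 8) | lo


def expected_base(line):
    """001fghcd eabab000 written out arithmetically."""
    return 0x2000 + ((line & 7) << 10) + (((line >> 3) & 7) << 7) + 40 * (line >> 6)


ALL_MATCH = all(cheung_calc(n) == expected_base(n) for n in range(192))
```

The model agrees with the bit pattern for every line from 0 through 191.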
If you want REAL speed, and can spare a few more bytes, you need to pre-compute all the base addresses and store them in a table. Then you can use the line number as an index into the table and do a base address TRANSLATION in just a few cycles. For example, assume you store all the low-order bytes in a 192-byte table called LO.BASE, and similarly the high-order bytes at HI.BASE. If you get the line number in the Y-register, then you can convert the line number to a base address like this:
       LDA LO.BASE,Y
       STA $26
       LDA HI.BASE,Y
       STA $27
That takes 10 bytes of program, 384 bytes of table, and only 14 to 16 cycles. I say 14 to 16, because it depends on whether either or both of the two tables cross page boundaries. If they each are entirely within a memory page, 14 cycles.
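The two tables have to come from somewhere. One way, sketched here in Python, is to compute them once and then paste the values into .HS or .DA lines. The table names follow the text; the helper function and the choice of hi-res page 1 are assumptions for the example.

```python
def hires_base(line, page_base=0x2000):
    """Base address of hi-res line 0-191 (001fghcd eabab000 for page 1)."""
    return page_base + ((line & 7) << 10) + (((line >> 3) & 7) << 7) + 40 * (line >> 6)


# One byte per line in each table: 192 + 192 = 384 bytes in all.
LO_BASE = bytes(hires_base(n) & 0xFF for n in range(192))
HI_BASE = bytes(hires_base(n) >> 8 for n in range(192))


def lookup(line):
    """What the four-instruction sequence leaves in $26 and $27."""
    return (HI_BASE[line] << 8) | LO_BASE[line]


for n in (0, 1, 8, 64, 191):
    print(n, hex(lookup(n)))
```

Splitting the low and high bytes into separate tables is what lets the line number serve directly as the index, with no doubling step.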
Now here is a little piece of code I wrote to test out Harry's calculator. It runs through each of the 192 lines and prints out the line number, an equal sign, the base address, and a space for each line (all in hex).
     *SAVE FAST & SHORT HBASCALC
     *--------------------------------
     *   DRIVER ROUTINE TO PRINT OUT
     *   CALCULATED BASE ADDRESSES
     *--------------------------------
     TEST   LDX #0       START WITH LINE 0
     .1     TXA
            JSR CALC     BASE ADDRESS INTO $00,$01
            TXA
            JSR $FDDA    PRINT LINE NUMBER
            LDA 1
            JSR $FDD3    PRINT "=" AND HIGH BYTE
            LDA 0
            JSR $FDDA    PRINT LOW BYTE
            LDA #$A0
            JSR $FDED    PRINT A SPACE
            INX
            CPX #192
            BCC .1       ...UNTIL ALL 192 LINES DONE
            RTS
The monitor address $FDD3 is not a labelled entry point, but I think it will probably stay consistent in future editions of the Apple ROMs. It saves whatever is in the A-register, prints "=", restores the A-register, and falls into $FDDA. The routine at $FDDA prints the contents of A in hex.
Just for fun I also wrote some new versions of the text base address calculator. One of them is shorter but takes more time, and the other is longer but takes less time. Oh well, can't win every race! Here are listings of them both, followed by a commented listing of the Applesoft hi-res calculator.
     *--------------------------------
     LRCALC.1
            PHA
            AND #$18     000DE000
            ASL          00DE0000
            STA 0
            ASL          0DE00000
            ASL          DE000000
            ORA 0        DEDE0000
            STA 0
            PLA          000DEFGH
            LSR          0000DEFG
            ROR 0        HDEDE000
            AND #$03     000000FG
            ORA #$04     000001FG (FOR PAGE 1)
            STA 1
            RTS
     *--------------------------------
     LRCALC.2
            PHA
            AND #$18     000DE000
            BEQ .1
            CMP #$10
            LDA #$A0
            BCS .1
            LSR
     .1     STA 0        DEDE0000
            PLA          000DEFGH
            LSR          0000DEFG
            ROR 0        HDEDE000
            AND #$03     000000FG
            ORA #$04     000001FG (FOR PAGE 1)
            STA 1
            RTS
     *--------------------------------
     *   FROM APPLESOFT ROM AT $F417-$F437
     *--------------------------------
     MON.GBASL  .EQ $26
     MON.GBASH  .EQ $27
     HGR.PAGE   .EQ $E6
     AS.HRCALC
            PHA           Y-POS ALSO ON STACK
            AND #$C0      CALCULATE BASE ADDRESS FOR Y-POS
            STA MON.GBASL FOR Y=ABCDEFGH
            LSR           GBASL=ABAB0000
            LSR
            ORA MON.GBASL
            STA MON.GBASL
            PLA           (C) (A)      (GBASH)  (GBASL)
            STA MON.GBASH ?-ABCDEFGH  ABCDEFGH  ABAB0000
            ASL           A-BCDEFGH0  ABCDEFGH  ABAB0000
            ASL           B-CDEFGH00  ABCDEFGH  ABAB0000
            ASL           C-DEFGH000  ABCDEFGH  ABAB0000
            ROL MON.GBASH A-DEFGH000  BCDEFGHC  ABAB0000
            ASL           D-EFGH0000  BCDEFGHC  ABAB0000
            ROL MON.GBASH B-EFGH0000  CDEFGHCD  ABAB0000
            ASL           E-FGH00000  CDEFGHCD  ABAB0000
            ROR MON.GBASL 0-FGH00000  CDEFGHCD  EABAB000
            LDA MON.GBASH 0-CDEFGHCD  CDEFGHCD  EABAB000
            AND #$1F      0-000FGHCD  CDEFGHCD  EABAB000
            ORA HGR.PAGE  0-PPPFGHCD  CDEFGHCD  EABAB000
            STA MON.GBASH 0-PPPFGHCD  PPPFGHCD  EABAB000
     *--------------------------------
            RTS
     *--------------------------------
By the way, if you want to see the WHOLE thing...a commented listing of the entire Applesoft ROM, we have it on disk in a format for the S-C Macro Assembler.
I have discovered a way to store source code, complete with comments, on disk files for the Apple mini-assembler (at $F666 in the Integer BASIC ROM or Language Card load). I use what I call "the world's best word processor", the one you get from S-C Software for $50. I create a text file that looks like this:
     FP
     CALL-151
     C080
     F666G
     300:LDX #C0     ;START WITH "A"-1
      INX            ;LOOP COMES HERE
      TXA            ;CHAR TO PRINT
      JSR FDED       ;PRINT IT
      CPX #DA        ;STOP AFTER "Z"
      BCC 302        ;NOT THERE YET
      RTS            ;FINISHED!
     FP
     CALL768
Assuming I have Integer BASIC in my RAM card, EXECing the above text file assembles the code very nicely and even runs the program once! Note that the Mini-Assembler does allow comments following a ";".
Some computer terminals have a special key on the keyboard which will dump whatever is on the screen to a printer. The following program will give the same function to an Apple, using the ctrl-P key.
Many different versions of screen dump programs have been written, and published hither and yon. Most of them work with the particular author's printer and interface combination, but not mine or yours. I found the one Bob S-C published in the July 81 issue of AAL to be like that, so I worked it over. Now I believe it can truly be called "generic", or at least general, because it runs on every combination of printers and interfaces I can find.
I tested it on systems using the following interfaces:
The screen dump should work with any interface which recognizes the Apple standard method for turning off video output. The standard is to "print" a control-I followed by an "N". Lines 2190 through 2250 perform the output of these two characters.
The only board I found which did not work with this convention was the SSM AIO board, so the program which follows has a special conditional assembly mode to make it assemble slightly different object code for that board. If you have that board, change line 1610 to say "VERSION .EQ AIO" and it will assemble your version. Instead of lines 2190 through 2250 being assembled, lines 2260 through 2310 will. They do not show up in the listing, so here they are:
2260 .DO VERSION=AIO 2270 LDA #$80 2280 JSR COUT 2290 LDX SLOT 2300 STA NOVID,X 2310 .FIN
If your assembler does not support conditional assembly, you can merely type in the lines 2270-2300 above in place of lines 2190-2310.
If your printer interface is not plugged into slot 1, change the slot number in line 2030, or at $0319.
Install the program by BRUNning the binary file of the object code, or by BLOADing it and doing a CALL768. Then whenever you type control-P, the screen will be printed. You can also call the screen dump from a running Applesoft program with CALL 794.
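The addresses quoted above ($0319 for the slot byte, CALL 794 for the dump entry) can be checked by adding up instruction lengths from the origin. This is a sketch only: the byte counts are my own tally of standard 6502 encodings for the instructions in the listing, and the variable names are illustrative.

```python
ORIGIN = 0x300                     # .OR $300, which is 768 decimal

# Instruction lengths in bytes, in listing order.
START_SIZES = [2, 2, 2, 2, 3]      # LDA #, STA zp, LDA /, STA zp, JMP abs
ENTRY_SIZES = [3, 2, 2, 3, 3, 1]   # JSR, CMP #, BNE, JSR, JMP, RTS
SLOT_SIZE = 1                      # .DA #1 stores a single byte

entry = ORIGIN + sum(START_SIZES)  # address of ENTRY
slot = entry + sum(ENTRY_SIZES)    # address of SLOT
dump = slot + SLOT_SIZE            # address of DUMP

print(f"ENTRY=${entry:04X} SLOT=${slot:04X} DUMP=${dump:04X} ({dump} decimal)")
```

If you add or remove instructions ahead of DUMP, both the $0319 patch address and the CALL 794 entry point move accordingly.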
1000 *SAVE GENERIC SCREEN DUMP
1010 *--------------------------------
1020 *
1030 *      GENERIC SCREEN DUMP
1040 *
1560 *--------------------------------
1570
1580 GENERIC  .EQ 1
1590 AIO      .EQ 2
1600
1610 VERSION  .EQ GENERIC
1620
1630 CH       .EQ $24
1640 BASL     .EQ $28
1650 CSWL     .EQ $36
1660 CSWH     .EQ CSWL+1
1670 KSWL     .EQ $38
1680 KSWH     .EQ KSWL+1
1690
1700 DOS.HOOK .EQ $3EA
1710
1720 BASCALC  .EQ $FBC1
1730 COUT     .EQ $FDED
1740 KEYIN    .EQ $FD1B
1750 RDKEY    .EQ $FD0C
1760 OUTPORT  .EQ $FE95
1770 VTAB     .EQ $FC22
1780
1790 CR       .EQ $8D        CARRIAGE RETURN
1800 NOVID    .EQ $578
1810 *--------------------------------
1820          .OR $300
1890 *--------------------------------
1900 START    LDA #ENTRY     HOOK ROUTINE INTO DOS
1910          STA KSWL
1920          LDA /ENTRY
1930          STA KSWH
1940          JMP DOS.HOOK
1950 *--------------------------------
1960 ENTRY    JSR KEYIN      WAIT FOR A KEYPRESS
1970          CMP #$90       ^P ?
1980          BNE .1         NO
1990          JSR DUMP       YES
2000          JMP RDKEY
2010 .1       RTS
2020 *--------------------------------
2030 SLOT     .DA #1
2040 *--------------------------------
2050 DUMP     PHA            SAVE A, X, Y
2060          TXA
2070          PHA
2080          TYA
2090          PHA
2100          LDA CH         SAVE CH
2110          PHA
2120          LDA CSWL       SAVE OUTPUT HOOKS
2130          PHA
2140          LDA CSWH
2150          PHA
2160 *
2170          LDA SLOT       COLD START BOARD
2180          JSR OUTPORT    IN SLOT 1
2190          .DO VERSION=GENERIC
2200          LDA #$89       KILL VIDEO ECHO
2210          JSR COUT
2220          LDA #"N"
2230          JSR COUT
2240          NOP            PAD TO STAY ALIGNED W/ AIO VERSION
2250          .FIN
2260          .DO VERSION=AIO
2270          LDA #$80       KILL VIDEO ECHO
2280          JSR COUT
2290          LDX SLOT
2300          STA NOVID,X
2310          .FIN
2320 *
2330          LDA #CR        START ON A NEW LINE
2340          JSR COUT
2350 *
2360          LDX #0         START W/ 1ST LINE (0TH)
2370          STX CH         SET CH TO 0 SO PRINTER WON'T INDENT
2380
2390 .1       TXA            LINE LOOP
2400          JSR BASCALC    GET ADDR OF LINE
2410          LDY #0         START W/ 1ST CHARACTER (0TH)
2420 .2       LDA (BASL),Y   GET A CHAR
2430 .3       CMP #$A0       CONVERT FLASH/INVERSE CHAR
2440          BCS .4         NON-FLASHING U.C.
2450          ADC #$40
2460          BNE .3         ..ALWAYS
2470 .4       AND #$7F       MASK OFF HI BIT TO AVOID
2480 *                       EPSON BLOCK GRAPHICS
2490          JSR COUT       PRINT IT
2500          INY            LOOP FOR ANOTHER CHAR
2510          CPY #40
2520          BCC .2
2530          LDA #CR        END OF LINE
2540          JSR COUT
2550          INX            LOOP FOR ANOTHER LINE
2560          CPX #24
2570          BCC .1
2580
2590          PLA            RESTORE OUTPUT HOOKS
2600          STA CSWH
2610          PLA
2620          STA CSWL
2630          PLA            RESTORE CH
2640          STA CH
2650          JSR VTAB       AND LINE
2660          PLA            RESTORE Y, X, A
2670          TAY
2680          PLA
2690          TAX
2700          PLA
2710          RTS            ..THAT'S ALL FOLKS
2720 *
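The character conversion in lines 2430 through 2470 is compact enough to be easy to misread, so here is a minimal Python model of it. The function names are illustrative assumptions, and the model covers only the byte arithmetic, not the printer output.

```python
def screen_to_printable(code):
    """Convert one byte of text-screen memory to 7-bit ASCII."""
    a = code & 0xFF
    while a < 0xA0:      # CMP #$A0 / BCS .4 (branch taken once A >= $A0)
        a += 0x40        # ADC #$40 with carry clear, then BNE .3
    return a & 0x7F      # AND #$7F strips the high bit


def dump_line(screen_bytes):
    """Convert a whole row of screen bytes to a printable string."""
    return "".join(chr(screen_to_printable(b)) for b in screen_bytes)


# Inverse "H" ($08), normal "I" ($C9), normal "!" ($A1)
print(dump_line([0x08, 0xC9, 0xA1]))
```

Inverse and flashing codes need one, two, or three passes through the loop before they land in the normal range at $A0 or above, which is why the listing branches back to the compare rather than adding a fixed amount.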
Most of the routines I've seen to terminate a CATALOG listing involve patching in a routine that checks for a particular key input and adding code to do different actions, like aborting or single-stepping the catalog list. Here is a modification I came up with that requires only a small change and no additional code.
This is the section of DOS that handles a new line in the CATALOG display:
                   1000          .OR $AE2C
                   1010
AE2C- 4C 7F B3     1020          JMP $B37F    leave File Manager
AE2F- A9 8D        1030 NEWLN    LDA #$8D     carriage return
AE31- 20 ED FD     1040          JSR $FDED    MON.COUT
AE34- CE 9D B3     1050          DEC $B39D    line count
AE37- D0 08        1060          BNE .1
AE39- 20 0C FD     1070          JSR $FD0C    MON.RDKEY
AE3C- A9 15        1080          LDA #$15     count 21 lines
AE3E- 8D 9D B3     1090          STA $B39D    reset line count
AE41- 60           1100 .1       RTS
Line 1020 is really the end of the previous routine, but we're going to be borrowing it, so I'll show it here. NEWLN is called every time the catalog list finishes a file name.
Notice that two bytes are wasted in lines 1030-1040. Why do LDA #$8D, JSR $FDED, when JSR $FD8E does the same thing? Two bytes may not sound like much, but in this case it's enough to work some magic! Try replacing the above piece of DOS with this:
                   1000          .OR $AE2C
                   1010
AE2C- 4C 7F B3     1020 EXIT     JMP $B37F    leave File Manager
AE2F- 20 8E FD     1030 NEWLN    JSR $FD8E    MON.CROUT
AE32- CE 9D B3     1040          DEC $B39D    line count
AE35- D0 0A        1050          BNE .1       return if not done
AE37- 20 0C FD     1060          JSR $FD0C    get a keypress
AE3A- 29 17        1070          AND #$17     the magic number
AE3C- F0 EE        1080          BEQ EXIT     abort CATALOG
AE3E- 8D 9D B3     1090          STA $B39D    new line count
AE41- 60           1100 .1       RTS
Slipping in that AND #$17, BEQ EXIT, has several effects:
1. Space Bar or Back Arrow will terminate the listing.
2. Forward Arrow will advance the listing one page (just like normal).
3. The "A" key will advance the listing one line.
And it all fits into the original space! The other keys will have different effects, depending on the value left in the accumulator after AND #$17. Most keys will advance the listing by somewhere between 1 and 23 lines.
Try substituting other values for the $17 in line 1070. Remember that the value of (Keypress AND Value) will be the new line count. The catalog display will scroll up by that number of lines. If the result is zero, the catalog display will end. The maximum result is the same as the mask value, that is, 23 lines for a $17 mask.
[ My favorite mask value is $4F. With that value SPACE still breaks the display, but now the numeral keys scroll up by the same number of lines, i.e., pressing the "1" key gives one more line, "2" shows two more names, and so on. Also, the "O" (oh, not zero) key scrolls up by 79 lines, which usually means all the way to the end of the catalog....Bill ]
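To see what a given mask will do before patching DOS, the AND can be tried out on paper or in a few lines of Python. This is only a sketch: the function name and the key table are illustrative, and the key codes are the usual Apple keyboard values with the high bit set.

```python
def new_line_count(key, mask=0x17):
    """Return (keypress AND mask); zero means the CATALOG is aborted."""
    return key & mask


KEYS = {
    "SPACE": 0xA0,
    "BACK ARROW": 0x88,     # control-H
    "FORWARD ARROW": 0x95,  # control-U
    "A": 0xC1,
    "1": 0xB1,
    "O": 0xCF,
}

for mask in (0x17, 0x4F):
    for name, code in KEYS.items():
        n = new_line_count(code, mask)
        action = "abort" if n == 0 else f"{n} more lines"
        print(f"mask ${mask:02X}  {name:<13} ${code:02X} -> {action}")
```

Note that a mask which makes one key more convenient can change what the others do, so it is worth running the whole table for any value you are considering.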
1000 *SAVE S.CATALOG INTERRUPT
1010 *--------------------------------
1020 FMEXIT   .EQ $B37F
1030 COUNT    .EQ $B39D
1040 RDKEY    .EQ $FD0C
1050 CROUT    .EQ $FD8E
1060 *--------------------------------
1070          .OR $AE2C
1080
1090 EXIT     JMP FMEXIT     leave File Manager
1100 NEWLN    JSR CROUT      send <CR>
1110          DEC COUNT      line count
1120          BNE .1         return if not done
1130          JSR RDKEY      get a keypress
1140          AND #$17       the magic number
1150          BEQ EXIT       abort CATALOG
1160          STA COUNT      new line count
1170 .1       RTS
I have been trying out the monitor patches in the July issue of AAL for adding an ASCII display to the memory dump, and I have two problems with them. Because the routines place the characters directly into the Apple's screen memory, they do not work with my 80 column card. The same problem also arises when I want to send a dump to a printer. As a solution to this problem I present still another monitor patch for an ASCII display. My version is slightly longer than the others, but it still fits in the cassette tape portion of the monitor (just barely, I might add).
In order to take advantage of the 80 column display I first made the following patches to the monitor:
FDA6:0F
FDB0:0F
These changes allow the dump routine to print 16 values on each line, rather than the usual eight.
Since the characters have to be printed after the current line of the dump is finished, I need a place to buffer up to 16 characters. $BCDF, an unused area in DOS, serves this purpose. My routine buffers each byte before calling PRBYTE to display the hex value. If a particular byte will be the last one on that line of the dump, the patch calls PRBYTE to print the byte, then tabs to column 60 and displays the contents of the buffer. Upper and lower case characters are printed as they are, and control characters are replaced with blanks. (That's my style. As Bob said in July, choose your own favorite!)
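The display rule described above (force the high bit on, then blank out anything that is still a control code) can be sketched in Python as follows. The function names and the string layout are illustrative assumptions; the real patch prints through COUT and sets CH to reach column 60.

```python
def ascii_column(buffer):
    """Model the display loop: set the high bit, blank out control codes."""
    out = []
    for b in buffer:
        a = b | 0x80               # ORA #$80
        if a < 0xA0:               # CMP #$A0 / BCS .2 not taken
            a = 0xA0               # substitute a blank
        out.append(chr(a & 0x7F))  # COUT, shown here as 7-bit ASCII
    return "".join(out)


def dump_line(addr, data):
    """Build one dump line: address, hex bytes, then text at column 60."""
    hex_part = "".join(f" {b:02X}" for b in data)
    return f"{addr:04X}-{hex_part}".ljust(60) + ascii_column(data)


print(dump_line(0x0300, b"Hello, Apple!\r\x00\x07"))
```

With 16 bytes per line the hex portion ends at column 53, which leaves room for the text to start at column 60 on an 80-column display.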
Of course the following patch needs to be made to the dump code, to call my routine (this is the same as shown in the July article):
FDBE:C9 FC
The patch can be used with a 40 column display by ignoring the above patches to $FDA6 and $FDB0, and by making the following changes to my patch routine:
1160 AND #7
1210 CPX #7
1320 LDA #30
1440 CPX #8
This patch was tested on a Microtek Magnum 80 card, but it should work on other brands as well.
[ It also works fine with the STB80 card, and the Apple //e...Bill ]
1000 *SAVE S.MON ASCII DISPLAY (DOBE)
1010 *--------------------------------
1020 CH       .EQ $24
1030 A1L      .EQ $3C
1040 A1H      .EQ $3D
1050 A2L      .EQ $3E
1060 A2H      .EQ $3F
1070 BUFFER   .EQ $BCDF
1080 PRBYTE   .EQ $FDDA
1090 COUT     .EQ $FDED
1100 *--------------------------------
1110          .OR $FCC9
1120          .TA $CC9
1130
1140 PATCH    PHA            save byte
1150          LDA A1L        low byte of dump address
1160          AND #$F        is transformed to
1170          TAX            offset in buffer
1180          PLA            get original byte back
1190          PHA            but keep it on the stack
1200          STA BUFFER,X   buffer the character
1210          CPX #$F        last byte of line?
1220          BEQ .0         if so, print the buffer
1230          LDA A2L
1240          CMP A1L        done with range?
1250          BNE .3         return to monitor if not
1260          LDA A2H
1270          CMP A1H        check high bytes
1280          BNE .3         return if more
1290
1300 .0       PLA
1310          JSR PRBYTE     print the last byte
1320          LDA #60        tab to column 60
1330          STA CH
1340          LDX #0
1350 .1       LDA BUFFER,X   display the buffer
1360          ORA #$80
1370          CMP #$A0       control character?
1380          BCS .2
1390          LDA #$A0       if so, substitute blank
1400 .2       JSR COUT       print the character
1410          LDA #$A0
1420          STA BUFFER,X   blank out buffer as we go
1430          INX
1440          CPX #$10       done?
1450          BCC .1         no, go on
1460          RTS
1470
1480 .3       PLA            restore original byte
1490          JMP PRBYTE     returns to caller
Apple Assembly Line is published monthly by S-C SOFTWARE CORPORATION, P.O. Box 280300, Dallas, Texas 75228. Phone (214) 324-2050. Subscription rate is $15 per year in the USA, sent Bulk Mail; add $3 for First Class postage in USA, Canada, and Mexico; add $13 postage for other countries. Back issues are available for $1.50 each (other countries add $1 per back issue for postage).
All material herein is copyrighted by S-C SOFTWARE, all rights reserved.
Unless otherwise indicated, all material herein is authored by Bob Sander-Cederlof.
(Apple is a registered trademark of Apple Computer, Inc.)