Skip to content

Lists of ROM entry points and zero page addresses for various CBM 8-bit machines (PET, VIC-20, C64) generated directly from assembler listing files

License

Notifications You must be signed in to change notification settings

ethandicks/cbmsymbols

Repository files navigation

cbmsymbols

There have been lists and lists of common ROM entry points and zero page locations for Commodore 8-bit computers going back to the first major PET ROM revision in 1979. These lists were first published in newsletters then books then much, much later, on the Web. In order to complete a different project of mine, I needed a fairly extensive list of BASIC and Kernal entry points spanning original PET BASIC through at least the Commodore 64. To build the list I needed, it seemed most expedient to use the listing files from the cbmsrc project.

Provided here is a Perl script to slurp up these listing files and extract all the symbols, and emit them in formatted columns of numbers that are easy to grep and could easily be imported into spreadsheets or tables in any markdown language. Also provided are a couple of examples of the output that I have found to be useful.

The input data is any listing file produced by cbm6502asm. The important formatting is that symbols are emitted on lines beginning with a space, and any line that doesn't start with a space will be discarded here. A line of acceptable symbols looks like this:

SYMBOL VALUE
 ABS      DB64    ADDEND   0025    ADDFRS   C2D0    ADDIND   D1FC
 ADDON    C803    ADDPR2   0002    ADDPR4   0004    ADDPR8   0008
 ADDPRC   0001    AFFRTS   C550    ANDMSK   0046    ANDOP    CECB

Symbols are expected to be in upper case, interleaved with the values, as upper case hex with no numeric base markers. Other symbol table formats could be added with minimal effort.

Usage:

./mkmmap HDR1,file1.txt HDR2,file2.txt [COMMENT,symbols.csv]...

Wrapper scripts:

For convenience, wrapper scripts can help filter undesired tags and comments, and combine multiple label files

./petmap.sh # Three versions of PET ROMs
./c64map.sh # Three versions of Commodore 64 ROMs
./allmap.sh # Three versions of PET ROMs, VIC-20 ROMs, and Commodore 64 ROMs

Detailed example:

Combine the BASIC and KERNAL (and EDIT ROM) listings for three versions of PETs (it is not required to remove the listing lines)

$ cat BASIC_PET_V1_REC.lst KERNAL_PET_1.0_REC.lst >PET1.txt
$ cat BASIC_PET_V2_REC.lst KERNAL_PET_2.0_REC.lst >PET2.txt
$ cat BASIC_PET_V4_REC.lst KERNAL_PET_4.0_REC.lst >PET4.txt

Then to produce a memory map of all three versions and to sort the symbols in preference of BASIC 4.0, then Original BASIC, then Upgrade BASIC:

./mkmmap 1.0,PET1.txt 2.0,PET2.txt 4.0,PET4.txt | (sed -u 1q; sort -k4,4 -k2,2 -k3,3)

(the sed command pulls out the header line prior to sorting)

The resulting output is:

LABEL     1.0     2.0     4.0     
MRCHR      ----   $D75A    ----
PKINC      ----   $D766    ----
LE082      ----   $E082    ----
.
.
.
ERRNF     $0000   $0000   $0000
USRPOK    $0000   $0000   $0000
BLF       $0001   $0001   $0001
ADDPRC    $0001   $0001   $0001
BUFPAG     ----   $0002   $0002
.
.
.
ISCNTC    $FFE1   $FFE1   $FFE1
CGETL     $FFE4   $FFE4   $FFE4
CCALL     $FFE7   $FFE7   $FFE7

Additional example:

Expanding on the original example, to add VIC-20 and Commodore 64 symbols and a comments field, start by generating the additional listing files. There are three major versions of C64 KERNAL ROMs so the entire range is:

$ cat BASIC_VIC.lst KERNAL_VIC_04.lst >VIC20.lst
$ cat BASIC_C64.lst KERNAL_C64_01.lst >C64-01.txt
$ cat BASIC_C64.lst KERNAL_C64_02.lst >C64-02.txt
$ cat BASIC_C64.lst KERNAL_C64_03.lst >C64-03.txt

On top of the extra two address columns, one can find or generate a CSV file with a label and a comment. Because existing comments files are not well formed, mkmmap splits the comments CSV on the first comma only and does not require the text field to be wrapped in quotes, even if it contains commas. Symbols are usually in upper-case, except when a symbol could be defined twice (as when a BASIC listing is joined to a KERNAL listing) where the second definition needs to have the symbol in lower-case for uniqueness.

Sample symbol CSV file

STXTPT, "txtptr"="txttab"-1
LIST, LIST instruction
GOLST, (Convert char. $ to # in 11-12)
LSTEND, +
TSTDUN, (done?)
TYPLIN, (Print the integer in A,X)
PRIT4, +
PLOOP, (Print character in A)
PLOOP1, +
GRODY, (Jump to "ready")
QPLOP, +

With a symbol file one just adds an additional input arg with the header name 'COMMENT' or 'DESC' and mkmmap treats those as a special case.

The command:

./mkmmap 1.0,PET1.txt 2.0,PET2.txt 4.0,PET4.txt VIC20,VIC20.txt C64,C64-03.txt COMMENT,all_symbols.csv | (sed -u 1q; sort -k4,4 -k3,3 -k2,2)

Will produce output similar to:

VALTYP    $005E   $0007   $0007   $000D   $000D   Flag for variable type: 00=numeric; ff=string
INTFLG    $005F   $0008   $0008   $000E   $000E   Flag for # type: 80=integer; 00=floating point
DORES     $0060   $0009   $0009   $000F   $000F   Flag whether can crunch reserved words
GARBFL    $0060   $0009   $0009   $000F   $000F   x
SUBFLG    $0061   $000A   $000A   $0010   $0010   Flag which allows subscripts in syntax (FNX flag)
INPFLG    $0062   $000B   $000B   $0011   $0011   Flags input or read (00=INPUT; 40=GET; 98=READ)
DOMASK    $0063   $000C   $000C   $0012   $0012   Mask used by relation operations (Comparison Evaluation Flag)
TANSGN    $0063   $000C   $000C   $0012   $0012   Flag sign of tangent
DSDESC     ----    ----   $000D    ----    ----   ds$ length
CHANL      ----   $000E   $0010    ----    ----   Active I/O Channel #
.
.
.
FOR       $C649   $C658   $B6DE   $C742   $A742   FOR instruction
NOTOL     $C65A   $C669   $B6EF   $C753   $A753   +
LDFONE    $C692   $C6A1   $B727   $C78B   $A78B   (Continue to build FOR vectors)
ONEON     $C6A6   $C6B5   $B73B   $C79F   $A79F   (Extract a FAC sign)
NEWSTT    $C6B5   $C6C4   $B74A   $C7AE   $A7AE   Read & execute next statement
DIRCON    $C6C4   $C6D4   $B759   $C7BE   $A7BE   +
DIRCN1     ----   $C6E4   $B769   $C7CE   $A7CE   +
GONE      $C6E9   $C6F7   $B77C   $C7E1   $A7E1   Dispatches next byte "chrget" returns
GONE3     $C6F2   $C700   $B785   $C7ED   $A7ED   Dispatches A if <>0 else loop to "newstt"
GONE2     $C6F5   $C702   $B787   $C7EF   $A7EF   (Entry from b8e7)
GONE4      ----    ----   $B795    ----    ----   +
GLET      $C6F9   $C717   $B7A2   $C804   $A804   (Jump to perform LET)
MORSTS    $C6C8   $C71A   $B7A5   $C807   $A807   (A ":"?)
SNERR1    $C6CC   $C71E   $B7A9   $C80B   $A80B   (Jump to print "syntax error")
GO         ----   $FECF   $B7AC    ----    ----   Handle GO token - find a TO
RESTOR    $C70D   $C730   $B7B7   $C81D   $A81D   RESTORE statement
RESFIN    $C717   $C73A   $B7C1   $C827   $A827   (Entry from bce2)
ISCRTS    $C71B   $C73E   $B7C5   $C82B   $A82B   (End of RESTORE)
STOP      $C71C   $C73F   $B7C6   $C82F   $A82F   STOP instruction if carry set - else...
BSTOP     $C71C   $C73F   $B7C6    ----    ----   O.S. Substitute: "stop" is also a label at $f343

The historic comments from 1981 are somewhat stylistically inconsistent. Some edits have been made, but in the future, a nice, clean, edited copy would be nice. In addition to meaningful text contents, the original comments used "+" as a continuation character from the previous comment in blocks of code where there aren't specific callable entry-points. On top of that, the use of "x" is something of a visible place-holder for symbols that never had a comment in the 1981 list, or ones that slipped through the cracks when processing the files for this repo. Some comments are new and are intended to be used to filter out lines that are largely uninformative such as constants defined in the original source code that do not correspond to items in the memory map, or local symbols that exist in the code but are not helpful to be highlighted.

One unimplemented feature in the 1981 list is multi-line comments. Presently, all output lines begin with a symbol or an address.

Also, as part of the disambiguation mechanism for symbols that appear twice (in the BASIC listing and in the KERNEL listing), some symbols are printed out in lower case. The second appearance must be in lower case in the symbol CSV file to line up the correct values, but an upcoming feature will be to output all symbols in upper case regardless of how they appear in the symbol CSV file.

About

Lists of ROM entry points and zero page addresses for various CBM 8-bit machines (PET, VIC-20, C64) generated directly from assembler listing files

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published