s502 assembler
A very simple assembler for the 6502 line of processors, written in C.
Source files consist of valid tokens and comments.
Comments start with a semicolon (;) and run to the end of the line. They are fully ignored by the assembler.
Tokens are strings of non-whitespace, non-semicolon characters within a line. They can be surrounded by any amount of whitespace and can be followed by a comment on the same line, but no line can contain two tokens.
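The comment and whitespace rules above can be sketched as a small C helper. This is only an illustration, not the assembler's actual code: the name `extract_token` and its return convention are hypothetical, and it assumes the first semicolon on a line always starts the comment.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the token of `line` (comment and surrounding
 * whitespace removed) into `out`. Returns the token length, 0 for a blank
 * or comment-only line, or -1 if the token does not fit in `out`. */
int extract_token(const char *line, char *out, size_t out_size)
{
    /* Assumption: the comment starts at the first semicolon, if any. */
    size_t end = strcspn(line, ";");
    size_t start = 0;

    /* Trim whitespace on both sides of the remaining text. */
    while (start < end && isspace((unsigned char)line[start]))
        start++;
    while (end > start && isspace((unsigned char)line[end - 1]))
        end--;

    size_t len = end - start;
    if (len + 1 > out_size)
        return -1;
    memcpy(out, line + start, len);
    out[len] = '\0';
    return (int)len;
}
```

A line that yields length 0 carries no token and can simply be skipped.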
Each token must be one of 3 types:
Labels are aliases for addresses in memory. They automatically mark the next address at the point where they are defined, but they do not take any space in the resulting binary. Label tokens must end with a colon (:).
Directives are commands for the assembler. They may or may not take space in the binary.
Directive tokens must start with a dot (.).
Valid directives:
define: Define a constant with a name and a value.
ifdef: Conditional compilation until the matching .endif, active if the constant is defined. Note: the constant name does not need @ in front of it!
printc: Log the name and value of a constant. Note: the constant name does not need @ in front of it!
print: Log the rest of the line.
include: Include another file in the source. Basically automated copy-paste. Searches for the file in the current working directory (CWD).
endif: Close a block of conditional compilation.
ifndef: Inverted ifdef.
ifbeq: Conditional compilation, active if first >= second. Both can be numbers!
org: Set the PC to a number.
data: Can have many entries of these 3 types:
<number> (raises an error if it can't fit in a single byte)
w:<number> (gets encoded in little-endian $LLHH format)
"<str>" (must not include whitespace)
The binary data from these entries will be concatenated and added to the result.
pad: Pad the remaining space up to a given point with a given byte.
The assembler is not case-sensitive with directive names.
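To make the three data entry types concrete, here is a minimal C sketch of how each one could be turned into bytes. The function names (`emit_byte`, `emit_word`, `emit_string`) are hypothetical and not taken from the assembler's source. The sketch assumes the numbers have already been parsed into values, and that a string entry is passed with its surrounding quotes.

```c
#include <stddef.h>
#include <string.h>

/* <number>: one byte; fails if the value does not fit in 8 bits. */
int emit_byte(long value, unsigned char *out, size_t cap)
{
    if (value < 0 || value > 0xFF || cap < 1)
        return -1;
    out[0] = (unsigned char)value;
    return 1;
}

/* w:<number>: two bytes, low byte first ($LLHH). */
int emit_word(long value, unsigned char *out, size_t cap)
{
    if (value < 0 || value > 0xFFFF || cap < 2)
        return -1;
    out[0] = (unsigned char)(value & 0xFF);
    out[1] = (unsigned char)((value >> 8) & 0xFF);
    return 2;
}

/* "<str>": the characters between the quotes, copied unchanged. */
int emit_string(const char *quoted, unsigned char *out, size_t cap)
{
    size_t len = strlen(quoted);

    if (len < 2 || quoted[0] != '"' || quoted[len - 1] != '"' || len - 2 > cap)
        return -1;
    memcpy(out, quoted + 1, len - 2);
    return (int)(len - 2);
}
```

Each function returns the number of bytes written, or -1 on error, so a caller can concatenate entries by advancing the output pointer by each return value.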
Instructions are actual 6502 instructions.
They must consist of a valid instruction mnemonic (3 letters), optionally followed by whitespace and an argument
(i.e. they must either end after the mnemonic, or continue with a space and an argument).
See this page for a list of valid mnemonics, arguments and their combinations.
Differences to that page:
The assembler is not case-sensitive with instruction mnemonics or operands.
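The three token types can be told apart by shape alone. The following C sketch shows one way to do that. The names `token_type` and `classify_token` are hypothetical, and the function only checks the form of an instruction (three letters, then the end or whitespace), not whether the mnemonic or its argument is actually valid.

```c
#include <ctype.h>
#include <string.h>

enum token_type { TOKEN_LABEL, TOKEN_DIRECTIVE, TOKEN_INSTRUCTION, TOKEN_INVALID };

/* Hypothetical classifier based only on the shape of the token text. */
enum token_type classify_token(const char *token)
{
    size_t len = strlen(token);

    if (len == 0)
        return TOKEN_INVALID;
    /* Directives start with a dot. Checked first, so that a directive
     * whose text happens to end in ':' is not mistaken for a label. */
    if (token[0] == '.')
        return TOKEN_DIRECTIVE;
    /* Labels end with a colon and need at least one name character. */
    if (len >= 2 && token[len - 1] == ':')
        return TOKEN_LABEL;
    /* Instructions: three letters, then either the end of the token
     * or whitespace followed by an argument. */
    if (len >= 3 &&
        isalpha((unsigned char)token[0]) &&
        isalpha((unsigned char)token[1]) &&
        isalpha((unsigned char)token[2]) &&
        (token[3] == '\0' || isspace((unsigned char)token[3])))
        return TOKEN_INSTRUCTION;
    return TOKEN_INVALID;
}
```

Because mnemonics are not case-sensitive, the check accepts letters in either case. A real implementation would follow it with a lookup in the mnemonic table.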
Numbers in directives and operands can take 4 main forms:
&<labelname>
@<constant>
$<number>
<number>
The assembler is not case-sensitive with hex digits (a-f, A-F), but IS case-sensitive with constant and label names.
Both constant and label names have an upper limit set in map.h.
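As a rough illustration of the four forms, the C sketch below recognises each one by its first character. Everything named here (`number_form`, `parse_digits`, `parse_number`) is hypothetical. It assumes that `$` introduces hexadecimal digits, that a bare number is decimal, and that values are capped at 16 bits. Resolving a label or constant name to its value is left to the caller.

```c
#include <ctype.h>
#include <stddef.h>

enum number_form { FORM_LABEL, FORM_CONSTANT, FORM_HEX, FORM_DECIMAL, FORM_INVALID };

/* Parses a run of digits in base 10 or 16 into *value. Returns 1 on success,
 * 0 on empty input, a bad digit, or a value above 16 bits (an assumption). */
static int parse_digits(const char *text, int base, long *value)
{
    long result = 0;

    if (*text == '\0')
        return 0;
    for (; *text != '\0'; text++) {
        int digit;
        if (isdigit((unsigned char)*text))
            digit = *text - '0';
        else if (base == 16 && isxdigit((unsigned char)*text))
            digit = tolower((unsigned char)*text) - 'a' + 10; /* a-f or A-F */
        else
            return 0;
        result = result * base + digit;
        if (result > 0xFFFF)
            return 0;
    }
    *value = result;
    return 1;
}

/* For literals, stores the value in *value. For labels and constants,
 * stores a pointer to the (case-sensitive) name in *name. */
enum number_form parse_number(const char *text, long *value, const char **name)
{
    switch (text[0]) {
    case '&':
    case '@':
        if (text[1] == '\0')
            return FORM_INVALID;
        *name = text + 1;
        return text[0] == '&' ? FORM_LABEL : FORM_CONSTANT;
    case '$':
        return parse_digits(text + 1, 16, value) ? FORM_HEX : FORM_INVALID;
    default:
        return parse_digits(text, 10, value) ? FORM_DECIMAL : FORM_INVALID;
    }
}
```

Names are returned untouched because they are case-sensitive, while hex digits are folded to one case before conversion.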
All types can have 2 additional modifiers:
< to extract the low byte (lower 8 bits)
> to extract the high byte (upper 8 bits)
When an operand or a directive takes a 16-bit value, it is stored in the 6502's format (little-endian, $LLHH).
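The two modifiers and the 16-bit storage order fit together, as this small C sketch shows. The function names `apply_modifier` and `store_word` are hypothetical. The sketch only covers the arithmetic, not where the modifier character appears in the source text.

```c
#include <stdint.h>

/* '<' keeps the lower 8 bits, '>' keeps the upper 8 bits of a 16-bit value.
 * Any other modifier character leaves the value unchanged. */
uint16_t apply_modifier(char modifier, uint16_t value)
{
    switch (modifier) {
    case '<': return (uint16_t)(value & 0xFF);
    case '>': return (uint16_t)(value >> 8);
    default:  return value;
    }
}

/* Stores a 16-bit value in $LLHH order: low byte first, then high byte. */
void store_word(uint8_t *dest, uint16_t value)
{
    dest[0] = (uint8_t)apply_modifier('<', value);
    dest[1] = (uint8_t)apply_modifier('>', value);
}
```

For the value $1234, `<` yields $34 and `>` yields $12, and the stored byte sequence is $34 followed by $12.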
Labels can also be forward-referenced in 2 cases: