CPU::Z80::Assembler - a Z80 assembler


CPU-Z80-Assembler documentation Contained in the CPU-Z80-Assembler distribution.

NAME

CPU::Z80::Assembler - a Z80 assembler

SYNOPSIS

  use CPU::Z80::Assembler;

  $CPU::Z80::Assembler::verbose = 1;
  $CPU::Z80::Assembler::fill_byte = 0xFF;
  $binary = z80asm(q{
      ORG 0x1000
      LD A, 1
      ...
  });
  $binary = z80asm_file($asm_file);
  $binary = z80asm(@asm_lines);
  $binary = z80asm('#include <file.asm>');
  open($fh, $file); $binary = z80asm(sub {<$fh>});

  $lines  = z80preprocessor(@asm_lines); $line  = $lines->get;
  $tokens = z80lexer(@asm_lines);        $token = $tokens->get;

DESCRIPTION

This module provides functions to assemble a set of Z80 assembly instructions given as a list or as an iterator, or a Z80 assembly source file.

EXPORTS

All functions are exported by default.

FUNCTIONS

z80asm

This function takes as parameter a list of either text lines to parse, or iterators that return text lines to parse.

The list is passed to z80lexer, which in turn calls z80preprocessor to handle file includes, and then splits the input into tokens.

The stream of tokens is passed on to CPU::Z80::Assembler::Parser, which parses the input and generates the object image in CPU::Z80::Assembler::Program. Assembly macro expansion is handled at this stage by CPU::Z80::Assembler::Macro.

The assembly program is composed of a list of CPU::Z80::Assembler::Segment objects, each representing one named section of code. Each segment is composed of a list of CPU::Z80::Assembler::Opcode objects, each representing one assembly instruction.

The output object code is returned as a string.

If the $CPU::Z80::Assembler::verbose variable is set, an output listing is generated by CPU::Z80::Assembler::List on standard output.

Assembly is done in five steps:

1. input is preprocessed, scanned and split into tokens

2. tokens are parsed and converted to lists of opcodes

3. addresses for each opcode are allocated

4. relative jumps are checked for out-of-range jumps and replaced by absolute jumps if needed

5. object code is generated for each opcode, computing all expressions used; the expressions are represented by CPU::Z80::Assembler::Expr.

z80asm_file

This function takes as argument a Z80 assembly source file name and returns the binary object code string.

z80preprocessor

This function takes as parameter a list of either text lines to parse, or iterators that return text lines to parse.

The list is passed to Asm::Preproc, which takes care of file includes and handles the %line and #line lines generated by external preprocessors such as cpp or nasm.

The result is an Asm::Preproc::Stream of Asm::Preproc::Line objects, one for each line of the input.

z80lexer

This function takes as parameter a list of either text lines to parse, or iterators that return text lines to parse.

It calls z80preprocessor to split the input into an Asm::Preproc::Stream of Asm::Preproc::Line objects representing the source lines of the Z80 assembly language program.

It returns a stream of Asm::Preproc::Token objects, one for each assembly token in the input.

Each token contains a type string, a value and an Asm::Preproc::Line object pointing at the input line where the token was found.

SCRIPTS

z80masm

  z80masm sourcefile [destfile]

The z80masm program, installed as part of this module, calls the z80asm_file() function to assemble an input source file, generates an output binary file, and produces an assembly listing on standard output.

SYNTAX

Input line format

Instructions are written in ASCII text. Opcodes are separated by new-line or colon : characters. Comments start with ;. Lines starting with # are ignored, to handle files generated by pre-processors.

    ; comment beginning with ;
    # comment beginning with # as first char on a line
    [LABEL [:]] INSTRUCTION [: INSTRUCTION ...] [; optional comments]
    LABEL [:]
    LABEL = EXPRESSION [; ...]

Preprocessing

See Asm::Preproc.

Tokens

The following tokens are returned by the stream:

reserved words

  Asm::Preproc::Token('word', 'word', $line)

All the reserved words and symbols are returned in lower case letters.

strings

  Asm::Preproc::Token(STRING => $string, $line)

Single- or double-quoted strings are accepted. The quote character cannot be used inside the string. The returned string has the quotes stripped.
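The quote-stripping rule can be sketched in plain Perl. This is only an illustration based on the string pattern in the module's lexer; the helper name `unquote` is hypothetical and is not part of CPU::Z80::Assembler.

```perl
use strict;
use warnings;

# Hypothetical helper: return the body of a quoted string without its
# quotes, or nothing if the text is not a well-formed quoted string.
sub unquote {
    my ($text) = @_;
    # The opening quote character may not appear inside the string.
    return unless $text =~ /\A(?:'[^']*'|"[^"]*")\z/;
    return substr($text, 1, length($text) - 2);
}

print unquote(q{'hello'}), "\n";    # prints: hello
print unquote(q{"it's"}), "\n";     # prints: it's
```

Under this pattern the other kind of quote may still appear inside a string, which is why the second call succeeds.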

identifiers

  Asm::Preproc::Token(NAME => $name, $line)

The program identifiers must start with a letter or underscore, and consist solely of letters, underscores and numbers. There is a special case $ identifier that represents the current location counter.

Identifiers are returned with case preserved, i.e. the assembler is case-sensitive for labels and case-insensitive for assembly reserved words.
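A minimal sketch of this rule follows, assuming a reduced keyword table for illustration. The helper name `classify` and the shortened `%KEYWORDS` list are hypothetical; the real module uses its full reserved-word list.

```perl
use strict;
use warnings;

# A small subset of the reserved words, for illustration only.
my %KEYWORDS = map { $_ => 1 } qw(ld jp jr djnz a b hl nz org);

# Hypothetical helper: classify one word the way the text describes.
# Returns [type, value]; reserved words get type == value in lower case,
# identifiers get type NAME with their case preserved.
sub classify {
    my ($word) = @_;
    return unless $word =~ /\A(?:[A-Za-z_]\w*|\$)\z/;
    my $k = lc $word;
    return $KEYWORDS{$k} ? [$k, $k] : ['NAME', $word];
}

for my $w ('LD', 'Loop', 'loop', '$') {
    my $t = classify($w);
    printf "%-4s => [%s, %s]\n", $w, @$t;
}
```

Here `LD` comes back as the reserved word `ld`, while `Loop` and `loop` remain two distinct identifiers.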

numbers

  Asm::Preproc::Token(NUMBER => $decimal_number, $line)

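Numbers are converted to decimal base from one of the following formats (letters are accepted in either case):

    decimal:      255
    hexadecimal:  0FFh (must start with a decimal digit), $FF, #FF, 0xFF
    binary:       1010b, %1010, 0b1010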
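The conversion can be sketched in plain Perl, following the number patterns that appear in the module's lexer. The helper name `parse_number` is hypothetical, and the sketch matches a whole token at a time rather than scanning a stream.

```perl
use strict;
use warnings;

# Hypothetical helper: convert one numeric token to a Perl number,
# trying the patterns in the same order as the lexer does.
sub parse_number {
    my ($text) = @_;
    return oct("0x$1")   if $text =~ /\A(\d[0-9a-f]+)h\z/i;       # 0FFh
    return oct("0x$1")   if $text =~ /\A[\$\#]([0-9a-f]+)\z/i;    # $FF, #FF
    return oct("0b$1")   if $text =~ /\A([01]+)b\z/i;             # 1010b
    return oct("0b$1")   if $text =~ /\A%([01]+)\z/;              # %1010
    return oct(lc $text) if $text =~ /\A(?:0x[0-9a-f]+|0b[01]+)\z/i;
    return 0 + $text     if $text =~ /\A\d+\z/;                   # 255
    return;                                                       # not a number
}

for my $n ('0FFh', '$FF', '#ff', '1010b', '%1010', '0xFF', '0b1010', '255') {
    printf "%-7s => %d\n", $n, parse_number($n);
}
```

The hexadecimal forms all yield 255 and the binary forms all yield 10.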

Z80 assembly

See Asm::Z80::Table for all allowed Z80 instructions, including the undocumented Z80 opcodes and composed instructions.

relative jumps

The DJNZ and JR instructions take an address as their destination, not an offset. If you need to use an offset, do sums on $. Note that $ is the address of the *current* instruction. The offset needs to be calculated from the address of the *next* instruction, which for these instructions is always $ + 2.

A relative jump instruction can always be used. The assembler automatically replaces it with an absolute jump if the distance is too far, or if the given flag is not available, e.g. jr po,NN. A djnz NN instruction is converted to dec b:jp nz,NN if the distance is too far.
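As a worked illustration of the offset rule, the following plain-Perl sketch computes the signed displacement for a two-byte relative jump and reports when the target is out of range. The helper name `jr_offset` is hypothetical; the module's own range check may be implemented differently.

```perl
use strict;
use warnings;

# Hypothetical helper: signed displacement of a JR/DJNZ located at $pc
# that jumps to $target, or undef if it does not fit in one signed byte.
sub jr_offset {
    my ($pc, $target) = @_;
    my $offset = $target - ($pc + 2);     # measured from the next instruction
    return ($offset >= -128 && $offset <= 127) ? $offset : undef;
}

# A jump to itself: the offset is -2, encoded as the byte 0xFE.
printf "%02X\n", jr_offset(0x1000, 0x1000) & 0xFF;    # prints: FE

# One byte past the forward limit: an absolute jump would be needed.
print defined(jr_offset(0x1000, 0x1082)) ? "relative\n" : "absolute\n";
```

The reachable targets for an instruction at address `$` are therefore `$ - 126` through `$ + 129`.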

stop

This extra instruction (which assembles to 0xDD 0xDD 0x00) is provided for the convenience of those using the CPU::Emulator::Z80 module.

Pseudo-instructions

defb

Accepts a list of expressions, and evaluates each as a byte to load to the object file.

defw

Accepts a list of expressions, and evaluates each as a 16-bit word to load to the object file, in little-endian order.
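Little-endian order means the low byte of each word is stored first. A minimal sketch using Perl's `pack` with the `v` template (unsigned 16-bit, little-endian) shows the resulting byte layout; the helper name `defw_bytes` is hypothetical and only models the output, not the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical helper: the bytes a list of 16-bit words occupies
# when stored in little-endian order.
sub defw_bytes {
    my (@words) = @_;
    return pack('v*', @words);
}

my $bytes = defw_bytes(0x1234, 0xABCD);
print join(' ', map { sprintf('%02X', $_) } unpack('C*', $bytes)), "\n";
# prints: 34 12 CD AB
```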

defm, deft

Accepts a list of literal strings, either single- or double-quoted. The quoted text cannot include the surrounding quote character or newlines. The characters are loaded to the object file.

defmz

Same as defm, but appends a zero byte as string terminator after each string.

defm7

Same as defm, but "inverts" (i.e. bit 7 set) the last character of the string, as string terminator.
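The two terminator styles can be compared with a short plain-Perl sketch. The helper names `defmz_bytes` and `defm7_bytes` are hypothetical, and `defm7_bytes` takes a single string so that no assumption is made about how a list of strings is treated.

```perl
use strict;
use warnings;

# Hypothetical helper: each string followed by a zero byte.
sub defmz_bytes {
    my (@strings) = @_;
    return join '', map { "$_\0" } @strings;
}

# Hypothetical helper: one string with bit 7 set on its last character.
sub defm7_bytes {
    my ($s) = @_;
    return $s unless length $s;
    my $last = chop $s;                   # remove and return the last character
    return $s . chr(ord($last) | 0x80);
}

sub hex_dump { join ' ', map { sprintf('%02X', $_) } unpack('C*', $_[0]) }

print hex_dump(defmz_bytes('HI')), "\n";  # prints: 48 49 00
print hex_dump(defm7_bytes('HI')), "\n";  # prints: 48 C9
```

The bit-7 form saves one byte per string, at the cost of restricting the text to 7-bit characters.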

equ, =

Labels are created having the value of the address they are created at.

Alternatively, labels may be assigned expressions by using equ or =. The expressions use the Perl operators and can refer to other labels by name, even if they are defined later in the file. The $ can be used in the expression to represent the current location counter.

  label      = $ + 8
  otherlabel = label / 2 + 3
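To make the arithmetic concrete, the same two expressions can be evaluated in plain Perl. The location counter value 0x1000 is an assumption chosen for illustration; it is not taken from the original text.

```perl
use strict;
use warnings;

my $pc = 0x1000;                     # assumed value of $ at the first line

my $label      = $pc + 8;            # label      = $ + 8
my $otherlabel = $label / 2 + 3;     # otherlabel = label / 2 + 3

printf "label = 0x%04X, otherlabel = 0x%04X\n", $label, $otherlabel;
# prints: label = 0x1008, otherlabel = 0x0807
```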

org

Tell the assembler to start building the code at this address. If it is not the first instruction of the assembly, the gap to the previous location counter is filled with $CPU::Z80::Assembler::fill_byte. If absent, defaults to 0x0000.
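The gap-filling behaviour can be modelled with a short plain-Perl sketch. The helper name `build_image` and the chunk layout are hypothetical; this is a simplified model of the described behaviour, not the module's code, and it assumes that each later org address is not below the current location counter.

```perl
use strict;
use warnings;

# Hypothetical model: each chunk is [org_address, bytes], in source order.
# Nothing is emitted before the first org; later gaps are padded.
sub build_image {
    my ($fill_byte, @chunks) = @_;
    my $image = '';
    my $pc;                               # undefined until the first org
    for my $chunk (@chunks) {
        my ($org, $bytes) = @$chunk;
        if (defined $pc) {
            die "org moves backwards\n" if $org < $pc;   # model's assumption
            $image .= chr($fill_byte) x ($org - $pc);
        }
        $image .= $bytes;
        $pc = $org + length($bytes);
    }
    return $image;
}

# LD A,1 at 0x1000 (3E 01), then RET at 0x1004 (C9), filled with 0xFF.
my $image = build_image(0xFF, [0x1000, "\x3E\x01"], [0x1004, "\xC9"]);
print join(' ', map { sprintf('%02X', $_) } unpack('C*', $image)), "\n";
# prints: 3E 01 FF FF C9
```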

include

Include another file at this point in the current source file. Included files may themselves include other files.

Macros

Macros are supported. See CPU::Z80::Assembler::Macro for details.

BUGS and FEEDBACK

We welcome feedback about our code, including constructive criticism. Bug reports should be made using http://rt.cpan.org/.

SEE ALSO

CPU::Z80::Assembler::Macro, CPU::Z80::Assembler::Parser, CPU::Emulator::Z80

AUTHORS, COPYRIGHT and LICENCE

CONSPIRACY

This software is also free-as-in-mason.


# $Id: Assembler.pm,v 1.56 2010/11/21 16:32:49 Paulo Exp $

package CPU::Z80::Assembler;

#------------------------------------------------------------------------------

#------------------------------------------------------------------------------

use strict;
use warnings;

use Asm::Preproc;
use Asm::Preproc::Lexer;
use CPU::Z80::Assembler::Program;
use CPU::Z80::Assembler::List;

use Text::Tabs; 						# imports expand(), unexpand()
use Regexp::Trie;

use vars qw(@EXPORT $verbose);

our $VERSION = '2.13';
our $verbose;
our $fill_byte = 0xFF;

use base qw(Exporter);

@EXPORT = qw(z80asm z80asm_file z80preprocessor z80lexer);

#------------------------------------------------------------------------------

#------------------------------------------------------------------------------
sub z80asm {
	my(@input) = @_;
	my $list_output = ($CPU::Z80::Assembler::verbose) ? 
					CPU::Z80::Assembler::List->new(
										input => \@input, 
										output => \*STDOUT) :
					undef;
	my $program = CPU::Z80::Assembler::Program->new();
	my $token_stream = z80lexer(@input);
	$program->parse($token_stream);
	my $bytes = $program->bytes($list_output);
	$list_output->flush() if $list_output;
	return $bytes;
}
#------------------------------------------------------------------------------

#------------------------------------------------------------------------------
sub z80asm_file {
	my($file) = @_;
	return z80asm("#include <$file>");
}
#------------------------------------------------------------------------------

#------------------------------------------------------------------------------

sub z80preprocessor { 
	my(@input) = @_;
	my $pp = Asm::Preproc->new;
	$pp->include_list(@input);
	
	# create a new stream to handle "INCLUDE" statement
	return Asm::Preproc::Stream->new( 
		sub {
			while (1) {
				my $line = $pp->getline
					or return undef;			# end of input
				
				# handle "INCLUDE"
				if ($line->text =~ /^\s*(include\s+.*)/i) {
					$pp->include_list("%$1");	# handle %include...
					next;						# get next line
				}
				else {
					return $line;
				}
			}
		}
	);	
}

#------------------------------------------------------------------------------

#------------------------------------------------------------------------------
# Keywords and composed symbols
my %KEYWORDS;
for (split(" ", "
								a adc add af af' and b bc bit c call ccf cp cpd cpdr cpi cpir 
								cpl d daa de dec di djnz e ei equ ex exx h halt hl im 
								in inc ind indr ini inir ix iy jp jr l ld ldd lddr ldi ldir m 
								nc neg nop nz or otdr otir out outd outi p pe po pop push 
								res ret reti retn rl rla rlc rlca rld rr rra rrc rrca rrd rst 
								sbc scf set sla sll sli sp sra srl sub xor z
								ixh ixl iyh iyl hx lx hy ly xh xl yh yl i r f
								org stop defb defw deft defm defmz defm7 macro endm
						")) {
	$KEYWORDS{$_}++;
}
my $SYMBOLS_RE = _regexp("
								<< >> == != >= <= 
						");

#------------------------------------------------------------------------------
# lexer
my $lexer = Asm::Preproc::Lexer->new(
	
	# ignore comments and blanks except newline
	COMMENT	=> qr/ ; .* /ix,			undef,		
	BLANKS	=> qr/ [\t\f\r ]+ /ix,		undef,

	# newline
	NEWLINE	=> qr/ \n /ix,				sub {["\n", "\n"]},

	# string - return without quotes
	# Sequence (?|...) not recognized in regex in Perl 5.8
	STRING	=> qr/ (?: ' [^']* '
					 					 | " [^"]* " ) /ix,	sub {[$_[0], 
											substr($_[1], 1, length($_[1])-2)]},
	
	# numbers
	NUMBER	=> qr/ ( \d [0-9a-f]+ ) h \b /ix,
										sub {[$_[0], oct("0x".$1)]},

	NUMBER	=> qr/ [\$\#] ( [0-9a-f]+ ) \b /ix,
										sub {[$_[0], oct("0x".$1)]},
										
	NUMBER	=> qr/ ( [01]+ ) b \b /ix,	sub {[$_[0], oct("0b".$1)]},
										
	NUMBER	=> qr/ % ( [01]+ ) \b /ix,	sub {[$_[0], oct("0b".$1)]},
										
	NUMBER	=> qr/ (?: 0x [0-9a-f]+ | 0b [01]+ ) \b /ix,
										sub {[$_[0], oct(lc($_[1]))]},
										
	NUMBER	=> qr/ \d+ \b /ix,			sub {[$_[0], 0+$_[1]]},
										
	# name or keyword, after numbers because of $FF syntax
	NAME	=> qr/ af' | [a-z_]\w* | \$ /ix,
										sub { my($t, $v) = @_;
											my $k = lc($v);
											$KEYWORDS{$k} ? [$k, $k] : [$t, $v];
										},
										
	# symbols
	SYMBOL	=> qr/ $SYMBOLS_RE | . /ix,	sub {[$_[1], $_[1]]},

);										
	
#------------------------------------------------------------------------------
# _lexer_stream(INPUT)
# 	INPUT is a Stream of $line = Asm::Preproc::Line,
#	as returned by z80preprocessor()
#	The result Stream contains Asm::Preproc::Token objects
#	with token type, value, and the line where found
#	Reserved words are returned with type = value in lower case.
sub _lexer_stream {
	my($input) = @_;
	my $this_lexer = $lexer->clone;		# compile $lexer only once
	$this_lexer->input( $input );		# define our input stream
	
	return $this_lexer->stream;
}

#------------------------------------------------------------------------------
# _regexp(LIST)
#	Return a regexp to match any of the strings included in LIST, as blank separated
#	tokens
sub _regexp { my(@strings) = @_;
	my $rt = Regexp::Trie->new;
	for (@strings) {
		for (split(" ", $_)) {
			$rt->add($_);
		}
	}
	return $rt->_regexp;				# case-insensitive
}

#------------------------------------------------------------------------------
sub z80lexer {
	my(@input) = @_;
	return _lexer_stream(z80preprocessor(@input));
}
#------------------------------------------------------------------------------

#------------------------------------------------------------------------------

#------------------------------------------------------------------------------

1;