System16 - 16 bit CPU Core

Introduction:

The CPU core design I am working on at home started off as a 6809 VHDL design, which was too big to
fit in a 200K gate FPGA. I am in the process of simplifying the design to a 16 bit architecture with 14 general
purpose registers, but using the 6809 ALU design extended to 16 bits. Although initially this looks like a larger
design, I am hoping to reduce the size of the core by making the CPU more orthogonal. All the registers
are 16 bit, including the Program Counter, so it can address 64K words. The biggest file in the 6809 design was
the state sequencer for the instructions, so I am hoping a more regular design will simplify that.

I am using the OpenCores.org mini Uart and MC6845 CRTC core around the CPU core to provide a complete
SOC (System On a Chip). I am looking to multiplex the 16 bit 15nsec SRAM between the CPU and the CRTC,
and use a 24MHz Clock (41 nsec cycle period).

Instruction Opcode Map

Instruction Bus Cycle Description

Address Unit Diagram

Arithmetic Logic Unit Diagram

Barrel Shifter Unit Diagram


Design Updates:

23 May 2002

I've modified the addressing modes so there is now no indirect indexed addressing mode, only absolute
and PC relative indirect addressing. I've defined a Constant addressing mode, which uses the effective address
register number as an immediate constant. Note that the constant addressing mode is valid for both absolute
and indexed addressing modes. Effective Address register 7 is used to signify absolute addressing but the
register number is used as a constant if the EA mode is set for constant addressing. I've also added indexed
(no offset) and indexed PC relative (no offset) instructions to supplement the addressing modes.

I've added static (#n) and dynamic (Rea) Shift instructions and re-instated the Bit operators. These instructions
only operate on the target register (Rt). The Target Register is assumed to be 16 bit and the Control Register
(Rea) always 8 bit.  Rotate Left and Rotate Right, (ROL and ROR) have been renamed for this purpose and
do not use the carry bit. The MSBs are wrapped around to the LSBs using a barrel shifter.

The single operand instructions are now called RCL and RCR (Rotate through Carry Left and Rotate
through Carry Right) which is a better description of what the instructions actually do.
The Shift Register Left and Shift Register Right (SHL and SHR) instructions shift the register up or down
by the specified number of bits without wrapping around, using the carry bit as the MSB for SHR or LSB for SHL.
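
As a rough illustration, the difference between the barrel shifter rotates (ROL and ROR) and the rotates through
carry (RCL and RCR) can be sketched in Python. This is only a sketch, not the VHDL: the function names are made
up here, and it assumes RCL and RCR move one bit per instruction. SHL and SHR are left out because the fill
behaviour for multi-bit shifts is not spelled out above.

```python
MASK16 = 0xFFFF

def rol16(value, count):
    """Rotate a 16 bit value left by count bits; the carry is not involved."""
    count %= 16
    value &= MASK16
    return ((value << count) | (value >> (16 - count))) & MASK16

def ror16(value, count):
    """Rotate a 16 bit value right by count bits; the carry is not involved."""
    count %= 16
    value &= MASK16
    return ((value >> count) | (value << (16 - count))) & MASK16

def rcl16(value, carry):
    """Rotate left one bit through carry (assumed one bit per instruction).

    Returns (result, new_carry): the old carry enters bit 0 and
    bit 15 becomes the new carry.
    """
    value &= MASK16
    new_carry = (value >> 15) & 1
    return ((value << 1) | (carry & 1)) & MASK16, new_carry

def rcr16(value, carry):
    """Rotate right one bit through carry (assumed one bit per instruction).

    Returns (result, new_carry): the old carry enters bit 15 and
    bit 0 becomes the new carry.
    """
    value &= MASK16
    new_carry = value & 1
    return (value >> 1) | ((carry & 1) << 15), new_carry
```

With ROL the bit leaving the top re-enters at the bottom, so rol16(0x8001, 1) gives 0x0003, whereas rcl16(0x8000, 0)
gives a result of 0x0000 with the carry set.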

I've also simplified the Push and Pull instructions so they only push and pull one register. It's a bit tedious saving
a batch of registers, but it means that the inherent instructions are all single opcodes, rather than some having a
second word. The Push and Pull instructions will always work on both higher and lower 8 bit register pairs.
The LSB of the register number defines big or little endian storage.

The software interrupt now uses the effective address register field as a 4 bit vector number. The top 16 words
of memory can be used for Interrupt Vectors. Note that software interrupts and hardware interrupts share the same
vectors. I'm not sure if this is a good idea. If they do, software interrupt will have to mask the interrupts in the
interrupt mask word just like hardware interrupts. This means that low priority software interrupts will be blocked
by high priority hardware or software interrupts, so there is a possibility that you can hang the CPU :-(

I'm not sure if I should have 15 interrupts + Reset or 7 interrupts + Reset.

The 68000 only prioritises hardware interrupts. Presumably the interrupt status is not cleared on a return from
interrupt, so if you return from a SWI or TRAP it won't affect the hardware interrupt status. hmmmmmm.....

I've removed the condition codes from Register 7 and made Register 7 the stack pointer rather than Register 6.
It seemed pointless being able to perform arithmetic operations on the Condition Codes and Interrupt Mask.
Additional instructions have been added to Load Condition Codes (LCC), Store Condition Codes (SCC),
And Condition Codes (ACC) and Or Condition Codes (OCC). I've also added Load Interrupt Mask (LIM),
Store Interrupt Mask (SIM), And Interrupt Mask (AIM) and Or Interrupt Mask (OIM).
All Condition Code and Interrupt Mask instructions use an 8 bit register as the argument.
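
A minimal Python sketch of how the four Condition Code instructions might behave is given here for illustration.
The bit positions follow the Condition Code Register layout in the Processor Model section (H in B4 down to C in B0).
The class and method names are hypothetical, and the sketch assumes LCC copies an 8 bit register into the
Condition Codes and SCC copies the Condition Codes out to an 8 bit register.

```python
# Flag bit masks, taken from the Condition Code Register layout (B4..B0).
CC_C = 0x01  # Carry
CC_V = 0x02  # Overflow
CC_Z = 0x04  # Zero
CC_N = 0x08  # Negative
CC_H = 0x10  # Half carry

class ConditionCodes:
    """Hypothetical model of the CC register and its four instructions."""

    def __init__(self):
        self.cc = 0

    def lcc(self, reg8):
        """Load Condition Codes from an 8 bit register value (assumed direction)."""
        self.cc = reg8 & 0xFF

    def scc(self):
        """Store Condition Codes: return the value for an 8 bit register."""
        return self.cc

    def acc(self, reg8):
        """And Condition Codes with an 8 bit register value (clears flags)."""
        self.cc &= reg8 & 0xFF

    def occ(self, reg8):
        """Or Condition Codes with an 8 bit register value (sets flags)."""
        self.cc = (self.cc | reg8) & 0xFF
```

Under these assumptions the carry would be set with occ(CC_C) and cleared with acc(0xFF ^ CC_C), which is how
the 6809 ORCC and ANDCC instructions are normally used.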

This does mean you cannot use the stack pointer as an effective address. This is probably a good idea, although
it does mean any stack based operations, such as manipulating local variables in a C function must be done
with a frame pointer. You have to load another register with the stack pointer and index the stack that way.
A LINK and UNLINK instruction would be handy to allocate stack space, but I have run out of opcode space.

7th March 2002

It was suggested to me some time back that the idea of making a purely 16 bit machine was fraught
with peril and that most designs were eventually modified for byte manipulation. To that end I have
made the register file 16 x 8 bit registers or 8 x 16 bit registers. The Size field is now only one bit,
selecting either byte (8 bit) or Word (16 bit) which gives scope to expand the addressing modes by
one bit.

The addressing range will be 64Kbytes and not 64K words as before. There will be two 8 bit ALUs
that can be concatenated for 16 bit operations. A byte reversal switch, on the left and right sides of
both ALUs, is controlled by the LSB of the register number. This will allow access to the
upper or lower byte of a register, to the lower ALU, and doubles as a big endian / little endian switch
on word accesses. The upper ALU will pass the opposite byte in 8 bit mode which means that all
memory accesses can be 16 bit. Note that 16 bit accesses must be word aligned; a bit messy admittedly,
but the 68000 survived doing that for many years.
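
The register file arrangement described above can be sketched in Python as follows. This is an illustration only.
It assumes a 4 bit register number, that an even number selects the high byte (Rn.H) and an odd number the
low byte (Rn.L), and that a set LSB swaps the two bytes on a word access. The class and method names are
hypothetical.

```python
class RegisterFile:
    """16 x 8 bit registers that can also be read as 8 x 16 bit registers."""

    def __init__(self):
        # Assumed ordering: index 2n holds Rn.H, index 2n + 1 holds Rn.L.
        self.bytes = [0] * 16

    def read8(self, regnum):
        """Byte access: the LSB of the register number picks the byte."""
        return self.bytes[regnum & 0xF]

    def write8(self, regnum, value):
        self.bytes[regnum & 0xF] = value & 0xFF

    def read16(self, regnum):
        """Word access: the LSB of the register number acts as the byte swap."""
        base = regnum & 0xE
        hi, lo = self.bytes[base], self.bytes[base + 1]
        if regnum & 1:
            hi, lo = lo, hi  # byte reversal switch engaged
        return (hi << 8) | lo
```

For example, after writing 0x12 to register number 2 and 0x34 to register number 3, a word read of register
number 2 gives 0x1234 and a word read of register number 3 gives 0x3412.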

I have not modified the assembler or simulator at this stage.

17th November 2001

I've been a bit slack on the design these past couple of months, being busy with a 6809 Flex computer
proposal for the Flex and UniFlex Users group. I am also interested in designing a board using the
68EZ328 DragonBall processor as used by the Open Hardware group, running uClinux with a
Spartan II FPGA for image processing.

I have gone through the instruction cycle timing, but it's not complete yet. I have also used the Cygwin tools
to compile an assembler and simulator. The assembler and simulator are not complete, but I've put them up
on the web in case anyone would like to look at them or work on them (as unlikely as that may seem).

20th August 2001

1. I have re-arranged the instructions because I needed to add an LEA (Load Effective Address) instruction
for position independent code as well as an EOR (Exclusive OR), BIT (Bit Test) and MUL (8 bit Multiply).

2. The static shift operators have been limited to one bit shifts, because that is what is used most in arithmetic
calculations. More than one shift can be done with multiple instructions or a loop. I put the Shift instructions
in the Single Operand line.

3. The static Bit operators have been removed, as I figured that can just as easily be done with other instructions:
"BIT" for "BTST", "AND" for "BCLR", "OR" for "BSET" and "EOR" for "BCHG". The only difference is that
bit operators work on a bit number whereas the logical operators work on a bit mask, which is probably more
useful.

4. Conditional Branches are now all PC Relative. An 8 bit signed offset is included in the instruction. This is more
consistent with the original 6809. I figured I could use a zero offset to indicate that the following byte was a
16 bit offset for Long Branches.
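
The short and long branch offset scheme in item 4 can be sketched in Python as below. This is a hedged
illustration, not the actual decoder: the function names are hypothetical, the 16 bit offset is modelled as a
single extension value following the instruction, and the sketch only returns the signed displacement without
saying what address it is added to.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)

def branch_displacement(offset8, extension=None):
    """Return (displacement, uses_extension) for a conditional branch.

    A non-zero 8 bit offset is a short branch. A zero offset means the
    displacement comes from the 16 bit extension that follows (assumed layout).
    """
    if offset8 & 0xFF:
        return sign_extend(offset8, 8), False
    if extension is None:
        raise ValueError("a zero offset needs a 16 bit extension")
    return sign_extend(extension, 16), True
```

One consequence of this encoding is that a short branch can never have a displacement of zero, which costs
nothing since such a branch would do no useful work.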

5th October 2001

1. Added "Word" (.W), "Low Byte" (.L), "High Byte" (.H) and "Double Word" (.D) to the opcode map.
Double word format is still being worked out, however it is proposed that it will be microcoded as two 16 bit
word operations with the appropriate condition codes carried over from the second operation.

2. The assembler mnemonics for branches will specifically refer to Long Branches (LBRA) as well as short
branches (BRA) even though it is the same opcode. The reason is to avoid phasing errors that result from
trying to guess the length of the instruction in forward references, i.e. you cannot guess in the first pass if a
forward reference offset is larger or smaller than 128 bytes.

3. Moved inherent operators under single operand instructions to save space in the opcode map. Inherent
operators do not have a size or effective addressing mode so those bits are used for sub instruction decoding.

4. Conditional Branches have been moved from 0010 to 0001.
Load Effective Address (LEA) has been moved from 1111 to 0010.
This leaves a spare opcode at 1111 (F line).

Tools:

Assembler:

The assembler is based on the Motorola microprocessor assembler suite. I have changed the output format
to a modified Intel Hex format to match the simulator. The assembler does not support byte addressing any
more. All byte accesses must be explicitly specified in the instruction as a .W word, .L low byte or .H high
byte instruction. Pseudo ops have also been modified to reflect word addressing rather than byte addressing.

Simulator:

I have re-written a simulator for the System 16 in C++. It is based on Ray Bellis's Usim0.91 for the 6809
but is so different as to be considered a complete re-write. I have included my FD1771 Floppy disk controller
simulator for what it's worth and I need to modify the mc6850 simulation code to match the MiniUART design
from opencores.org.

Neither the Assembler nor the Simulator is complete, but you can download them to take a look at what I have
done so far. I am using Cygwin to compile them.

UAsm16.tgz
USim16.tgz

Kbug16:

The monitor program for the simulator needs a lot of work. I started off with a 6809 monitor program, but the
stack manipulation, and major change in register allocation make a complete re-write necessary.

Processor Model:

Registers:

R0.H    R0.L
R1.H    R1.L
R2.H    R2.L
R3.H    R3.L
R4.H    R4.L
R5.H    R5.L
R6.H    R6.L
Stack pointer R7
Interrupt Masks IM    Condition Codes CC
Program Counter PC

Condition Code Register


B7   B6   B5   B4   B3   B2   B1   B0
-    -    -    H    N    Z    V    C

Interrupt Mask Register

There are 7 hardware interrupts (IRQ0 to IRQ6) and Reset (IRQ7). Interrupts are prioritised.
The Interrupt Mask register reflects the interrupt level that the CPU is servicing.
Interrupts at and below the current interrupt level are masked. The interrupt mask will be 1 more than the
interrupt number. ie. if Interrupt IRQ0 is generated the interrupt mask register will read 001, similarly
Interrupt IRQ6 will read 111. Interrupt 7 or Reset is not maskable.

The interrupt Mask bits are set on an interrupt *after* the interrupt service routine is called and the
mask register and condition codes are pushed onto the stack. The mask register can be cleared by
using the Load Interrupt Mask instruction (LIM) or it may be restored by popping the Interrupt Mask and
Condition Code Registers on a Return from Interrupt.

The Interrupt Mask bits are priority encoded so that Reset (The highest priority interrupt) will mask all
lower level interrupts. This is the only way that the Interrupt Mask register can be restored in an orderly
manner when returning from interrupts. Ie. The lowest priority interrupt Mask will be the last to be restored
in the case of nested interrupts.
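
The mask rules above can be sketched in Python to show how nested interrupts unwind. This is a simplified
model under stated assumptions: the names are hypothetical, only IRQ0 to IRQ6 are taken through the model
(Reset is just reported as never masked), and a plain list stands in for the mask saved on the stack.

```python
RESET_IRQ = 7

def mask_for(irq):
    """Mask value while servicing IRQ0..IRQ6: one more than the IRQ number."""
    if not 0 <= irq <= 6:
        raise ValueError("only IRQ0 to IRQ6 have a mask level in this sketch")
    return irq + 1

def is_masked(irq, mask):
    """Interrupts at and below the current level are masked; Reset never is."""
    if irq == RESET_IRQ:
        return False
    return irq + 1 <= (mask & 0x7)

class InterruptState:
    """Hypothetical model of the 3 bit Interrupt Mask with nesting."""

    def __init__(self):
        self.mask = 0
        self.saved = []  # stands in for the masks pushed on the stack

    def take(self, irq):
        """Try to take IRQ0..IRQ6; returns False if it is masked."""
        if is_masked(irq, self.mask):
            return False
        self.saved.append(self.mask)  # old mask is pushed first
        self.mask = mask_for(irq)     # then the new level is set
        return True

    def rti(self):
        """Return from interrupt: restore the previously saved mask."""
        self.mask = self.saved.pop()
```

In this model, taking IRQ2 sets the mask to 011, a later IRQ1 is refused, IRQ5 is accepted and sets the mask
to 110, and two returns from interrupt restore 011 and then 000 in that order.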

B7   B6   B5   B4   B3   B2   B1   B0
-    -    -    -    -    IM2  IM1  IM0