CDP1802 BASICs - How it works - from BMP802
- From time to time I ask GOOGLE what is available on the 1802; it was such a
clean processor. When I read all of these emails here I thought some of the
material I generated at the time (1980) and wrote up as BMP802 (Brussels Micro
Processor documentation 1980, No 2) might interest other readers.
It consists of 3 parts: some special features, let us build the 1802 from
scratch, and the pin functions it then ends up with.
Design Ideas Book BMP802
(This book was written mostly by J.Pintaske, apps eng at RCA Europe at the
time, and I still have a copy, no pdf, before this question is asked.)
Here are some of the pages; I have changed the contents slightly (corrections
after 20 years).
BMP802 page 3: Some 1802 specialities (mostly not possible with others):
- Small systems even without RAM; subroutines do not need RAM if the SEP
technique is used. On-chip registers might be enough RAM.
- In some cases even interrupts are possible without RAM
- Very fast interrupt, saves only what is needed
- Address modes can be optimized
- Arithmetic on program counters is possible
- Several stack pointers
- Multiple data pointers
- Many program counters possible for fast subroutine response
- Easy to implement interpreters
- Special Macro-Instructions are possible
- DMA transfer built into an 8bit device
- Easy software interrupts using MARK/SEP
- 16bit I/O with one instruction by using the register contents
- Power reduction by use of WAIT line
- Power reduction using frequency reduction, dynamic switching, even down to 0
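
The SEP technique from the list above can be sketched in a few lines of
Python. This is only a toy model, not an emulator: the 1802 has sixteen 16bit
registers R0..RF and a 4 bit P designator that selects which one is the
program counter, and SEP n simply loads n into P. The class name
RegisterFile, the helper call_without_ram and the addresses 0100 and 0200 are
made up for the illustration.

```python
class RegisterFile:
    """Toy model: sixteen 16-bit registers plus the 4-bit P designator."""

    def __init__(self):
        self.r = [0] * 16  # R0..RF
        self.p = 0         # which register is the program counter

    def fetch(self):
        """Return the address of the next byte and advance R(P) by one."""
        addr = self.r[self.p]
        self.r[self.p] = (addr + 1) & 0xFFFF
        return addr

    def sep(self, n):
        """SEP n: register n becomes the program counter."""
        self.p = n & 0xF


def call_without_ram():
    """Main program in R3 calls a routine in R4 and gets control back."""
    cpu = RegisterFile()
    cpu.r[3] = 0x0100  # main program (hypothetical address)
    cpu.r[4] = 0x0200  # preloaded entry point of the subroutine
    cpu.p = 3

    visited = [cpu.fetch()]      # main fetches its SEP 4 opcode at 0100
    cpu.sep(4)                   # "call": no return address is written anywhere
    visited.append(cpu.fetch())  # subroutine body at 0200
    visited.append(cpu.fetch())  # subroutine fetches its SEP 3 at 0201
    cpu.sep(3)                   # "return": R3 still holds 0101
    visited.append(cpu.fetch())  # main continues at 0101
    return visited, cpu
```

The return address is simply the value left in R3, so neither a stack nor RAM
is involved. R4 is left pointing behind the routine; a common trick is to
place the returning SEP just in front of the entry point, so that R4 is ready
for the next call.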
BMP802 page 4 Short Description of the CDP 1802 CMOS Microprocessor
One way to understand this microprocessor is to just set up a
wish list for a processor and then transfer it into CMOS:
1. A 16 bit PROGRAM COUNTER is needed to address 64k Bytes
2. The ACCUMULATOR together with the ALU register manipulate the data
3. Some instructions (e.g. ADD FF + FF) result in an overflow of the
ACCUMULATOR, so a Data Flag DF is needed
4. A STACK POINTER points to a free location in RAM to store and load under
5. One pointer is normally not enough, so with the technology at the time
sixteen 16bit registers is probably the maximum silicon one could afford.
6. To address one of these 16 registers, 4 bits are needed. We call these 4
bits the N designator. Instructions can use this designator to define which
register should be used as pointer into memory, e.g. use pointer 9
(loaded with 1234) to get the data from there into the ACCUMULATOR.
7. The instruction to be executed would be 8 bits long: 4 bits for instruction
type, 4 bits for the register involved. These are the XN instructions.
8. One disadvantage is that then 4 bits are always used. One could assign one
extra register for a certain amount of time which would leave the 8bits free
for the instructions. The place to store which of the 16 registers is the
actual X register is the 4 bit X-designation register X. It is once set and
then stays until changed. Many load and store instructions use the X.
9. Something like this X designator register could be used for something
different. By defining a P designator register one could define one of 16
program counters, where the P value defines the current 16 bit register being
used as the PROGRAM COUNTER. Subroutines can be called by just switching to
another (preloaded) register. Switching back to the calling program is the
same, just change the P register which defines who is PROGRAM COUNTER.
10. We need a clean start after RESET, then automatically the P designator
register is reset to 0, so R0 is the program counter after RESET.
11. What about the X register designator? This is as well reset to 0 and can
be changed by software later. Even with X=P=0 some tricks are possible.
12. A direct output flipflop would be nice to have as a Quick way to output
something. Let us call this bit Q. It can be used as output for a serial
communications link under software control, with the baud rate set by
software as well. Two instructions control it: Set Q, Reset Q.
13. Some direct inputs would be an effective way to get, for example, a switch
status into the processor. At least one, 4 is better: EFfective Inputs
EF1..4. Branch instructions will test for HIGH and LOW. One of these can be
used for the serial input, controlled by software.
14. The fastest way to change the normal program flow is to INTERRUPT it,
faster than polling. This input will automatically change the P of the
program counter designator to 1 and save X and P into the Temporary register
T.
15. The question with interrupts is always, how much of the CPU status do you
want to save by using a hardware implementation? The ACCU, DF, some
registers? It will be either too much or too little. So we save nothing
except X,P into T. This means we lose nothing, but the rest is under
software control and can be adapted to the program requirements. We might
want to do something very quickly, for example set Q, so a fast entry is
needed. If not, we have enough time anyway to save anything required later,
before we return to the interrupted program.
16. If one interrupt is not enough, the 4 EF lines could be ORed onto the INT
line, and software then decides which one it was: 1, 2, 3 or 4.
17. The interrupt might have to store data into RAM quickly. So in addition
to setting the program counter designator to 1, so that R1 is the interrupt
program counter, the X register is as well set to 2. This is why R2 is
normally the STACK POINTER. As X,P of the interrupted program have been saved
into T, nothing is lost.
18. Saving the T register with X,P to stack is normally the first instruction
of the interrupt routine.
19. We must have the possibility to allow for an interrupt or not. The IE
flipflop is implemented for this reason and can be set/reset by software.
20. The processor can communicate with 64k bytes of memory. But communication
with IO is important as well. One simple way to do this is to implement IO
like an address in memory, but with "wires" to the outside world. In such a
case the data has to be taken from a stack location or a memory location
into the ACCU and then stored to the relevant I/O RAM location. There must be
a better and faster way: let us implement 3 additional IO lines N0, N1, N2
which can address up to 7 input and 7 output ports (code 0 means no IO is
selected), and the data flows directly from memory to IO or vice versa.
21. The big advantage is direct transfer: from stack or RAM directly to the
IO location in one cycle without bothering the ACCU. As there is now a Direct
Memory Access from and to IO, we got one DMA channel for free. If we put the
processor in DMA mode, we use R0 as DMA pointer (so the program counter must
be any register other than R0) and can transfer data directly, as in this
mode the DMA pointer is automatically incremented after each transfer.
22. In one direction this is DMA IN, in the other direction it is DMA OUT. As
it uses the same pointer, one has to use this capability very carefully!
23. This can be used while the program is running, as an additional DMA cycle
is inserted after an execute cycle, before the next fetch (cycle stealing).
24. This DMA can be used for fast transfer, e.g. for video data using the 1861.
25. All of this together with the power supply pins had to fit (at the time)
into a 40 pin Dual-in-Line package. If we add them up, there are more than
40. So what can we do? We can save a few pins by sending out the 16 bit
address as a sequence of 2x 8 bit, and another line will define when the High
Byte has to be latched. This will change the pin requirements from 16 to
8+1=9, which saves 7.
26. This sequential addressing has a big advantage: ROMs need fewer pins as
well. And they can latch the high byte, decode it internally and know when
they have to be active, so no external chip select logic is necessary. The
address range is programmed at the same time as the ROM contents.
27. This is all we need for the processor. One block we forgot: the register
block includes an increment/decrement, so with one instruction or
automatically a 16 bit register can be changed by one.
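
Items 14 to 19 can be summarised as a small Python sketch. It models only the
designators, T, IE and a dictionary standing in for RAM; the class
InterruptModel, its method names and all addresses are assumptions for the
illustration. The behaviour of SAV (store T at the address in R(X)) and RET
(reload X,P from memory, increment R(X), set IE) follows the RCA instruction
set descriptions and should be checked against the data sheet before relying
on it.

```python
class InterruptModel:
    """Toy model of the 1802 interrupt entry and return."""

    def __init__(self):
        self.r = [0] * 16  # R0..RF
        self.p = 0         # program counter designator
        self.x = 0         # data pointer designator
        self.t = 0         # temporary register, holds X,P
        self.ie = True     # interrupt enable flipflop
        self.ram = {}      # address -> byte

    def interrupt(self):
        """Hardware response to INT: save X,P in T, then P=1, X=2, IE off."""
        if not self.ie:
            return False
        self.t = (self.x << 4) | self.p
        self.p, self.x = 1, 2
        self.ie = False
        return True

    def dec(self, n):
        """DEC n: decrement register n by one."""
        self.r[n] = (self.r[n] - 1) & 0xFFFF

    def sav(self):
        """SAV: store T at the address held in R(X)."""
        self.ram[self.r[self.x]] = self.t

    def ret(self):
        """RET: reload X,P from M(R(X)), advance R(X), enable interrupts."""
        addr = self.r[self.x]
        saved = self.ram[addr]
        self.r[self.x] = (addr + 1) & 0xFFFF
        self.x, self.p = saved >> 4, saved & 0xF
        self.ie = True


def demo_interrupt():
    cpu = InterruptModel()
    cpu.p, cpu.x = 3, 7   # interrupted program: R3 is PC, R7 is X (made up)
    cpu.r[1] = 0x0300     # interrupt routine (hypothetical address)
    cpu.r[2] = 0x7FFF     # stack pointer (hypothetical address)
    accepted = cpu.interrupt()
    cpu.dec(2)            # make room on the stack
    cpu.sav()             # T with the old X,P goes to the stack
    # ... the routine would do its real work here ...
    cpu.ret()             # back to the interrupted program
    return accepted, cpu
```

After demo_interrupt() the model is back at X=7, P=3 with R2 at its old value,
and the only thing the hardware saved by itself was the single byte in T.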
And this is basically all there is. Some features are not fully explained,
but like this the loved (hated) processor is easy to understand.
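
The cycle stealing of items 21 to 23 can also be sketched. The function below
is a simplification under stated assumptions: it grants at most one DMA IN
cycle after each instruction, it treats every instruction as one fetch plus
one execute cycle, and the names run_with_dma_in, dma_bytes and the start
address 0400 are invented for the example. The state names S0 (fetch), S1
(execute) and S2 (DMA) are the ones used in the RCA data sheet, and the
placement of S2 after an execute cycle is taken from there as well.

```python
def run_with_dma_in(instruction_count, dma_bytes, dma_start):
    """Interleave DMA IN cycles with normal instruction cycles.

    Returns (trace, ram, r0): the list of machine states, the bytes
    written to memory, and the final value of the DMA pointer R0.
    """
    ram = {}
    r0 = dma_start & 0xFFFF    # R0 is the DMA pointer
    pending = list(dma_bytes)  # bytes the IO device wants to deliver
    trace = []
    for _ in range(instruction_count):
        trace += ["S0", "S1"]  # normal fetch and execute
        if pending:            # request pending: steal one cycle
            ram[r0] = pending.pop(0)   # bus -> M(R0), ACCU not involved
            r0 = (r0 + 1) & 0xFFFF     # pointer increments automatically
            trace.append("S2")
    return trace, ram, r0
```

Calling run_with_dma_in(3, [0xAA, 0xBB], 0x0400) stores the two bytes at 0400
and 0401 while the three instructions still run to completion, only two
machine cycles later than without DMA.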
Now let us look at all the pins and add them up (active high or low):
- Clock generator in (1) and out (39).
- There are 4 modes: RUN, PAUSE, RESET, LOAD via DMA.
The 2 pins to select these modes are WAIT (2) and CLEAR (3).
- Q for the Quick output flipflop (4)
- The processor has to tell us what it is doing, done via the State Codes:
FETCH, EXECUTE, DMA, INTERRUPT. SC1(5) and SC0(6).
- MRD(7) to read from memory
- Data Bus 7..0 (8..15)
- Vcc(16) this is the IO voltage which might be different from the internal
processor core voltage
- N2, N1, N0 as the 3 I/O lines (17, 18, 19)
- Vss(20) Ground
- Our very Effective Flag inputs EF 4..1 (21,22,23,24)
- The multiplexed address bus MA 0..7 (25..32), which sends the high byte
first, then the low byte
- TPB(33) timing pulse B for IO transfers
- TPA(34) timing pulse A to latch the high order address byte
- MWR(35) to write to memory
- INT(36) interrupt input
- DMA OUT Input(37) to start a DMA out cycle
- DMA IN input(38) to start a DMA input cycle
- XTAL(39) as clock output (see pin 1)
- Vdd(40) is the internal processor voltage, which can be higher to allow for
a higher frequency (the higher the voltage, the higher the possible clock,
and the higher the power consumption)
All 40 pins are used. I hope these 40 pins make more sense now.
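
The pin list above can be checked mechanically. The table below is a sketch:
the pin numbers come from the list, while the bit order inside each group
(BUS7 on pin 8, N2 on pin 17, EF4 on pin 21, MA0 on pin 25) follows the RCA
data sheet pinout and should be treated as an assumption to verify there. The
names PINOUT, address_pins and split_address are made up for the example.

```python
# Pins that stand alone.
PINOUT = {
    1: "CLOCK", 2: "WAIT", 3: "CLEAR", 4: "Q", 5: "SC1", 6: "SC0", 7: "MRD",
    16: "VCC", 20: "VSS", 33: "TPB", 34: "TPA", 35: "MWR", 36: "INT",
    37: "DMA-OUT", 38: "DMA-IN", 39: "XTAL", 40: "VDD",
}
# Pin groups, generated to keep the table short.
PINOUT.update({8 + i: f"BUS{7 - i}" for i in range(8)})   # pins 8..15
PINOUT.update({17 + i: f"N{2 - i}" for i in range(3)})    # pins 17..19
PINOUT.update({21 + i: f"EF{4 - i}" for i in range(4)})   # pins 21..24
PINOUT.update({25 + i: f"MA{i}" for i in range(8)})       # pins 25..32


def address_pins(multiplexed):
    """Pins needed to present a 16-bit address to memory."""
    return 8 + 1 if multiplexed else 16  # 8 address lines plus TPA


def split_address(addr):
    """High byte is sent first (latched with TPA), then the low byte."""
    return (addr >> 8) & 0xFF, addr & 0xFF
```

With this table every pin from 1 to 40 has exactly one name, and
address_pins(False) - address_pins(True) gives the 7 pins saved by
multiplexing.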
I hope this helps others who would like to understand this "8bit 68000"
Juergen at EPLDFPGA@...