Undergraduate Research
May 2002 - December 2002
Advisor - Dr. Jafar Saniie
Partner - Christopher Middendorf
Abstract: The purpose of this research project was to become familiar with the ATMEL family of 8-bit RISC microcontrollers and to design a TV game console. We implemented the game Tic-Tac-Toe. The ATMega163 microcontroller was chosen for this purpose and was used to generate video signals, which were then fed to a TV monitor. The main program resided in the microcontroller's EEPROM.
Main Hardware Components
The main components required to generate video signals are:
1. The ATMega163 microcontroller mounted on the STK500 development kit.
2. An 8 MHz crystal, which is required for the microcontroller to run
at its full speed of 8 MHz.
3. A video DAC, which is a network of 3 resistors and a transistor and
is used to generate the three video levels, as shown in the Video DAC figure.
Two bits of a port are used to generate the three video levels:
• Sync level = 0 V
• Black level = 0.3 V
• White level = 1.0 V
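As a minimal sketch of the idea, the three levels can be modelled as a lookup from a 2-bit port value to an output voltage. The bit patterns used here (00 for sync, 01 for black, 11 for white) and the helper names are assumptions for illustration only; the real assignment depends on the resistor network, whose values are not given here.

```python
# Hypothetical 2-bit encoding of the three video levels; the real bit
# assignment depends on the resistor network and is assumed here.
VIDEO_LEVELS = {
    0b00: ("sync", 0.0),   # both port bits low
    0b01: ("black", 0.3),  # one bit high
    0b11: ("white", 1.0),  # both bits high
}


def dac_output(port_bits: int) -> float:
    """Return the output voltage for a 2-bit port value.

    The unused pattern 0b10 raises KeyError in this sketch.
    """
    name, volts = VIDEO_LEVELS[port_bits & 0b11]
    return volts


def checker_row(squares: int, start_white: bool = True) -> list[float]:
    """Voltages for one row of alternating white and black squares."""
    row = []
    for i in range(squares):
        white = (i % 2 == 0) == start_white
        row.append(dac_output(0b11 if white else 0b01))
    return row
```

Calling `checker_row(4)` gives the alternating white and black voltages for one row of a checkerboard test pattern.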
Figure: Video DAC
Problems Encountered
1. Video Signal Standard
There are three major TV standards: NTSC, SECAM and PAL. NTSC (short for "National Television System Committee") is the American TV standard; it has only 525 scan lines, but it has an update frequency of 30 Hz. SECAM (short for "Séquentiel couleur à mémoire", or "Sequential Color with Memory") is the French TV standard; it has improved color stability and higher intensity resolution, but less color resolution. In the beginning we did not realise that we were focusing on the wrong specification: SCART (Syndicat des Constructeurs d'Appareils Radiorécepteurs et Téléviseurs), which is a European audio/video connector standard rather than a TV signal standard. For this reason, we were unable to find the proper information and equipment to implement the project. After a lot of research on SCART, we had to switch to the American standard, NTSC, having wasted a lot of time running in the wrong direction.
2. Timing
The timing has to be very precise to generate video signals properly. Even though the ATMega163 is capable of running at 8 MHz, it was running at only 3.6 MHz, because an 8 MHz crystal has to be plugged into the STK500 board for the microcontroller to run at 8 MHz. Installing the crystal resolved a lot of timing issues.
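To see why the clock speed matters, the following sketch works out the time available per scan line from the NTSC figures quoted above (525 lines, 30 Hz) and converts it into CPU cycles. The function names are illustrative, and the calculation ignores blanking and sync intervals.

```python
# Line-time budget for the NTSC figures quoted above (525 lines, 30 Hz).
LINES_PER_FRAME = 525
FRAMES_PER_SECOND = 30


def line_period_us() -> float:
    """Duration of one scan line in microseconds."""
    lines_per_second = LINES_PER_FRAME * FRAMES_PER_SECOND  # 15,750
    return 1e6 / lines_per_second


def cycles_per_line(clock_hz: float) -> float:
    """CPU clock cycles available during one scan line."""
    return clock_hz * line_period_us() / 1e6


for clock in (3.6e6, 8e6):
    print(f"{clock / 1e6:.1f} MHz: {cycles_per_line(clock):.0f} cycles per line")
```

Each line lasts roughly 63.5 microseconds, which is about 508 cycles at 8 MHz but only about 229 cycles at 3.6 MHz, so the slower clock leaves less than half the instruction budget for drawing each line.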
3. TV Signal or Video Signal?
Chris brought in his TV so we could test the software. We spent a lot of time and effort analyzing the output signals from the ports, which looked fine. We could see the video and sync signals on the digital oscilloscope, but we still didn’t have anything on the TV. After a good number of days spent debugging the program and checking the hardware over and over again, we realised that we were using the wrong kind of TV. A regular TV requires TV signals, while a TV monitor requires video signals. Fortunately, we realised this in time and tested the software with a TV monitor. And voila! We had a bright checkerboard pattern on the screen.
4. Software Availability
A major hindrance we faced towards the end of the semester, when we started writing the game, was the availability of software. We were initially working with the Student Version of the CodeVisionAVR C Compiler, which has code size limitations. In the beginning, since we were writing only small programs, there wasn’t a need to buy the full version. Once the size of the code started increasing, we could no longer use the student version. We decided to switch to the GNU C Compiler, which is free. The problem with this compiler is that it is very hard to use, and learning all the tips and tricks was taking a lot of time. We presented our problem to the Department, which generously bought the full Standard Version of the CodeVisionAVR C Compiler.
The
complete report is available here.
Computer Society International Design Competition (CSIDC) 2003
Real-time Parking Analysis Solution System (RPASS)
Advisor - Dr. Donald Ucci
Team Members - Raghu Kutty, Nayana Samaranayake
Abstract: This project introduced a Real-time Parking Analysis Solution System (RPASS), which is a revolutionary development in parking space management. Over the past few years, parking space has become a major issue in large metropolitan areas around the globe. While most cities have constructed multistory parking lots at their heart, people still find it hard to locate a parking lot with free space without actually visiting the location. Even when they find such free spaces, they end up, more often than not, a few blocks from their target destination.
Our proposal was a Real-Time Parking Analysis Solution System (RPASS).
This technology would solve many problems associated with parking and
provide the customer with easy access to information in real time. Unlike
conventional systems where the customer has to search for parking lots
beforehand and reserve spaces, the proposed technology would allow the
user to gain access to real-time information on parking spaces within
a certain radius of his or her current location or intended destination.
System Overview
The system was broken down into four main components to facilitate modular development: Sensor Control Units, a Parking Box (P-Box), a Central Server and a Client handheld device. Figure 1.1 below shows how the four components interact with each other.
Sensor Control Units: Sensor Control Units are an essential part of the system. The sensors can be photo sensors or pressure sensors and provide the core information regarding the occupancy of a parking lot. Positioned at the entrances and exits, the units transmit a signal to the P-Box as a car enters or exits a lot.
P-Box: A single P-Box in a parking lot manages all of the Sensor Control Units in that parking lot. The P-Box collates all signals transmitted by the sensor units in the parking lot and is responsible for providing information on the lot's occupancy to the Central Server.
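The P-Box's role can be sketched as a simple occupancy counter driven by entry and exit signals. The class name, event strings and report fields below are hypothetical and chosen only for illustration; they are not taken from the project.

```python
from dataclasses import dataclass


@dataclass
class PBox:
    """Hypothetical occupancy counter for one parking lot."""
    lot_id: str
    capacity: int
    occupied: int = 0

    def on_sensor_event(self, event: str) -> None:
        """Update the count from an 'enter' or 'exit' sensor signal."""
        if event == "enter":
            self.occupied = min(self.capacity, self.occupied + 1)
        elif event == "exit":
            self.occupied = max(0, self.occupied - 1)
        else:
            raise ValueError(f"unknown sensor event: {event!r}")

    def status_report(self) -> dict:
        """Message the P-Box would send to the Central Server."""
        return {
            "lot_id": self.lot_id,
            "capacity": self.capacity,
            "free": self.capacity - self.occupied,
        }
```

Clamping the count between zero and the capacity keeps a missed or duplicated sensor signal from producing an impossible occupancy figure.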
Figure 1.1: Block Diagram of the system
Central Server: The Central Server is responsible for maintaining the status of all parking lots using the information sent out by the P-Boxes. The server would run a database engine to keep track of this information and would also host a web service to provide it to the Client. Depending on the size of the system, the database and web service could be on two different machines. The database contains information such as space availability, parking lot location and cost, which is made available to the public through the web service. To meet the needs of PDA, cellular phone and other handheld device users, the web service would use WML.
Client Device: Any device that supports Wireless Markup Language (WML), which is part of the Wireless Application Protocol (WAP), can be used as a Client Device. PDAs, cellular phones and many other handheld devices are currently WAP enabled. On the Client side, the web service can be accessed from a desktop or any WAP-enabled mobile platform. The user would be able to provide the zip code or the destination name (such as “Midway Airport”) through the web service and get information on parking lots around that location. Future versions of the system could also use GPS technology to transmit the GPS coordinates of the current location.
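A zip-code lookup that answers with a WML deck might look roughly like the sketch below. The lot records, field names and function names are invented for illustration and stand in for the server's database; only the general WML deck and card structure is standard.

```python
from xml.sax.saxutils import escape

# Hypothetical rows standing in for the server's database.
LOTS = [
    {"name": "Lot A", "zip": "60616", "free": 12, "cost": "8 USD/day"},
    {"name": "Lot B", "zip": "60616", "free": 0, "cost": "5 USD/day"},
    {"name": "Lot C", "zip": "60638", "free": 40, "cost": "10 USD/day"},
]


def lots_near(zip_code: str) -> list[dict]:
    """Return lots in the given zip code that still have free spaces."""
    return [lot for lot in LOTS if lot["zip"] == zip_code and lot["free"] > 0]


def render_wml(zip_code: str) -> str:
    """Render the matching lots as a single-card WML deck."""
    lines = [
        '<?xml version="1.0"?>',
        '<!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN" '
        '"http://www.wapforum.org/DTD/wml_1.1.xml">',
        "<wml>",
        f'  <card id="lots" title="Parking near {escape(zip_code)}">',
    ]
    for lot in lots_near(zip_code):
        text = f'{lot["name"]}: {lot["free"]} free, {lot["cost"]}'
        lines.append(f"    <p>{escape(text)}</p>")
    lines += ["  </card>", "</wml>"]
    return "\n".join(lines)
```

Costs are written without a dollar sign because WML reserves `$` for variable substitution.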
8-Bit POWERALU
In this project, the chip was required to implement the functions of a basic calculator, using Goldschmidt’s division (division by convergence) as an efficient method for approximating the reciprocal. The Goldschmidt algorithm can be applied to a variety of different systems and is also suitable for computing square roots. Some of the requirements are listed below:
Although
division is less frequent among the four basic arithmetic operations,
a recent study shows that in a typical numerical program, the time spent
performing divisions is approximately the same as the time spent performing
additions or multiplications. This is due to the fact that in most current
processors, division is significantly slower than the other operations.
Hence, faster implementations of division are desirable. There are two
principal classes of division algorithms. The digit-recurrence methods
produce one quotient digit per cycle using residual recurrence which involves
(i) redundant additions, (ii) multiplications with a single digit, and
(iii) a quotient-digit selection function. The latency and complexity
of implementation depend on the radix. The method produces both the quotient,
which can be easily rounded, and the remainder. The iterative, quadratically
convergent methods, such as the Newton-Raphson, Goldschmidt and series
expansion methods, use multiplications and take advantage of fast multipliers.
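The convergence idea behind Goldschmidt's method can be sketched in a few lines. This is a floating-point illustration under simplifying assumptions (positive divisor, fixed iteration count), not the 8-bit hardware implementation from the project: numerator and denominator are multiplied by the same factor each step, so the denominator converges to 1 and the numerator converges to the quotient.

```python
import math


def goldschmidt_divide(n: float, d: float, iterations: int = 5) -> float:
    """Approximate n / d for d > 0 using Goldschmidt's iteration."""
    if d <= 0:
        raise ValueError("this sketch only handles positive divisors")
    # Scale both operands by the same power of two so that d lies in [0.5, 1).
    mantissa, exponent = math.frexp(d)   # d == mantissa * 2**exponent
    n = math.ldexp(n, -exponent)
    d = mantissa
    for _ in range(iterations):
        f = 2.0 - d      # correction factor
        n *= f           # numerator converges to the quotient
        d *= f           # denominator converges to 1
    return n
```

If the scaled denominator is written as 1 - e, each step replaces e with e squared, which is the quadratic convergence mentioned above; every step costs two multiplications, which is why fast multipliers matter.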
DESIGN FLOW METHODOLOGY:
The design flow steps highlighted in the lab manual were carefully followed to ensure successful completion of the project.
Architecture Design
The different subsystems in the design were designed and their specifications were written. The state diagrams for the FSM were also specified at this stage.
Component Design
Verilog descriptions were written for each component in the architecture design according to the detailed specification. Each component in the architecture design was simulated and verified. The block schematics were then synthesized using Synopsys, and the simulation data obtained was also verified.
Assembled Chip Core
All subsystems were connected together and more simulation tests were carried out to verify the integrity of the integrated components. The entire chip was then simulated with IRSIM to check for correctness.
SYNTHESIS AND PLACE AND ROUTE (P & R) FLOW
A simple script (compile.scr) was modified to perform the synthesis of the cells designed using Verilog. The synthesis was performed using Silicon Ensemble; “dc_shell” was the command used to carry out the synthesis. After running dc_shell, some extra pads (corner and supply pads) were added to get exactly 40 pads for a MOSIS padframe. The script “iit06_fixpads” was run to perform this task. We ran gds2mag to carry out the place and route. The layout was thoroughly checked for DRC errors. After extracting the layout in Magic, the script “polyres_convert” was executed to deal with the resistors in the pad layout.
Custom Design Layout - 8-Bit Full Adder
Abstract - In this project, an 8-bit full adder/subtractor circuit was built in Sue, laid out in Magic and simulated extensively with IRSIM. Gemini was also used to do an LVS (Layout vs. Schematic) before simulation. Results of the analysis of the final design done using HSPICE include the propagation rise and fall times and the critical path delay.
Main Components
The design of an 8-bit full adder/subtractor circuit was done using the following components: transmission gates, a positive latch, a register, an XOR gate, a full adder and an AND gate. All these are described in the following paragraphs.
Transmission Gate (TG): The structure of a CMOS transmission gate consists of an nFET Mn in parallel with a pFET Mp. The TG is designed to act as a voltage-controlled switch. The gates are controlled by the complementary voltages Vg, applied to the nFET, and (Vdd-Vg), applied to the pFET. When Vg is high, both Mn and Mp are biased into conduction and the switch is closed. If Vg is low, then both MOSFETs are in cutoff and the switch is open. The TG is used to build the 2-1 multiplexor. Figure 1 shows a TG.
Latch: There are many approaches for constructing latches. One very common technique involves the use of transmission gate multiplexors. Figure 2 shows a multiplexor-based latch. In this latch, the D input is selected when the clock is high, and the output is held when the clock is low. When Clk is high, the bottom transmission gate is on and the D input is copied to the Q output. During this phase, the feedback loop is open since the top transmission gate is off.
Register: The master-slave configuration is commonly used to design an edge-triggered register. Figure 3 shows the register using this configuration. On the low phase of the clock, the master stage is transparent and the D input is passed to the master stage output. During this period, the slave stage is in the hold mode, keeping its previous value using feedback. On the rising edge of the clock, the master stage stops sampling the input and the slave stage starts sampling. During the high phase of the clock, the slave stage samples the output of the master stage, while the master stage remains in the hold mode. The value of Q is the value of D right before the rising edge of the clock, achieving the positive edge-triggered effect.
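The master-slave behaviour described above can be checked with a small behavioural model. This is a logic-level sketch only; the class and parameter names are illustrative, and it ignores transmission-gate delays and setup or hold times.

```python
class Latch:
    """Level-sensitive latch: transparent while clk equals `active_level`."""

    def __init__(self, active_level: int):
        self.active_level = active_level
        self.q = 0

    def update(self, clk: int, d: int) -> int:
        if clk == self.active_level:
            self.q = d          # transparent: output follows the input
        return self.q           # otherwise hold the stored value


class Register:
    """Positive edge-triggered register built from two latches."""

    def __init__(self):
        self.master = Latch(active_level=0)  # samples D while the clock is low
        self.slave = Latch(active_level=1)   # passes the master while high

    def update(self, clk: int, d: int) -> int:
        m = self.master.update(clk, d)
        return self.slave.update(clk, m)
```

Driving `update` with the clock low, then high, shows Q taking the value D had just before the rising edge, while changes to D during the high phase are ignored.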
Figure 1: Transmission Gate
Figure 2: Multiplexor based positive latch
Figure 3: Edge-triggered register using multiplexors
The top level block diagram of the circuit to be designed is shown in Figure 4.
The first step was to complete the design in SUE. The top level hierarchy was built in the following stages, listed in order of increasing level of hierarchy:
1. Multiplexor based positive latch
2. 1 bit register using the positive latch
3. Bitslice comprising an XOR-FA-Register chain
4. 8 bitslices stacked together one on top of the other to form the 8
bit datapath
5. AND gate to provide the gated clock logic
6. XOR gate to provide the overflow logic
All the above mentioned
components were simulated using IRSIM and corrected when required. The
layout was done in a similar fashion. After the completion of every stage
in Magic, an LVS comparison was done to validate the functionality of
the layout. Since the schematics had been simulated successfully, any
mismatches produced by Gemini indicated a problem in the layout.
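The datapath logic can be modelled at the bit level as a ripple-carry chain of XOR and full-adder stages. In this sketch, the assumptions are that the bInvert control also drives the carry-in (the usual way to form a two's-complement subtraction) and that overflow is the XOR of the carries into and out of the most significant bit; the registers and gated clock are left out.

```python
def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def add_sub8(a: int, b: int, b_invert: int) -> tuple[int, int]:
    """8-bit ripple-carry add (b_invert=0) or subtract (b_invert=1).

    Returns (result, overflow) for two's-complement operands.
    """
    carry = b_invert            # carry-in of 1 completes the two's complement
    result = 0
    carry_into_msb = 0
    for i in range(8):
        a_bit = (a >> i) & 1
        b_bit = ((b >> i) & 1) ^ b_invert   # XOR stage of the bitslice
        if i == 7:
            carry_into_msb = carry
        s, carry = full_adder(a_bit, b_bit, carry)
        result |= s << i
    overflow = carry_into_msb ^ carry       # XOR gate for overflow
    return result, overflow
```

For example, `add_sub8(100, 28, 0)` produces the bit pattern 0x80 with the overflow flag set, since 128 does not fit in a signed 8-bit value.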
Click
here for the final schematic. Click here for
the final layout.
Critical path delay obtained = (bInvert to sum7) + (register clock to R7) = 4.88 ns + 0.518 ns = 5.398 ns
Figure 4: Top Level Block Diagram
4-Bit Datapath - Calculator
Team partner: Nikhil Jain
Stack architecture is one of several microprocessor architectures, and it brings its own complexities and details to the design process. Unlike accumulator, two-register or three-register architectures, a stack architecture does not have any General Purpose Registers (GPR). This complicates the design process for a 4-bit central processing unit (CPU) or microprocessor. The size of the stack depends on the type of instructions used and the data width. For example, a 4-bit CPU requires 8 nibbles of 4 bits each to store a complete 32-bit word. The stack only allows the topmost element to be accessed in any given clock cycle. The Intel 4004, built in 1971, was the first single-chip 4-bit microprocessor.
CPU Specifications
- Datapath: 4 bits
- RAM width: 4 bits
- Stack width: 4 bits
- Address Space: 4096 locations
- Address Bus: 12 bits
- No Exceptions or interrupts
- Memory Mapped I/O
- Little Endian Memory Organization
- Reserved Memory Range: 0xFE2 – 0xFFF (for return address)
- Variable Instruction Formats
- Does not have any General Purpose Registers (GPR)
CPU Details
- 4-bit Datapath means that all the data buses are only 4 bits wide. Therefore, the CPU (central processing unit) can handle operations on 4-bit ‘words’ only. Thus, the memory and stack data widths are also 4 bits.
- However, the address buses are 12 bits wide, giving an address space of 4096 nibbles of 4 bits each. Thus a maximum of 512 thirty-two-bit words can be stored.
- Interrupt handling is not implemented, because it requires extra hardware that would make the CPU costly and does not serve the goal of this project.
- Memory Mapped I/O means that all the devices read data from and write data to a specific range of addresses.
- ‘Little Endian’ memory organization is used for this architecture. It does not have any specific benefit compared to the Big Endian memory model; it is just a matter of choice, and the compiler has to be programmed accordingly. In the Little Endian model, the least significant byte has the lowest address.
- As the CPU performs system calls, it has to save the return address of a subroutine in memory. To make sure that the address is not overwritten by data, the memory is partitioned and only return addresses can be written to the reserved partition. Address range 0xFE2 – 0xFFF is reserved for this purpose.
- The Program Counter (PC) and Memory Address Register (MAR) are 12 bits wide, matching the address bus, so they can address all 4096 locations.
- Since the data bus is 4 bits wide, all instruction lengths are multiples of 4 bits, which makes memory access easier and allows a variety of instruction lengths.
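The figures in this list can be cross-checked with a short calculation. Two assumptions are made here that the text does not state: each 12-bit return address is stored as three consecutive nibbles, and the little-endian ordering applies at nibble granularity because the memory is 4 bits wide.

```python
ADDRESS_BITS = 12
NIBBLE_BITS = 4
RESERVED_START, RESERVED_END = 0xFE2, 0xFFF

address_space = 2 ** ADDRESS_BITS                        # 4096 nibbles
words_32bit = address_space * NIBBLE_BITS // 32          # 512 words
reserved_nibbles = RESERVED_END - RESERVED_START + 1     # 30 nibbles
# Assumption: each 12-bit return address occupies three consecutive nibbles.
return_addresses = reserved_nibbles // (ADDRESS_BITS // NIBBLE_BITS)


def to_little_endian_nibbles(value: int, count: int) -> list[int]:
    """Split a value into nibbles, least significant nibble first."""
    return [(value >> (NIBBLE_BITS * i)) & 0xF for i in range(count)]
```

Under these assumptions the reserved range holds 30 nibbles, enough for ten return addresses, and an address such as 0xABC would be stored as the nibbles C, B, A at increasing memory addresses.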
Instruction Set
The instruction set used in this project was based on the 0-Address Architecture, or Stack Architecture. Instructions form a set of tools that perform the following types of operations:
- Data Movement Instructions
- Arithmetic Instructions
- Logical Instructions
- Program Control Instructions
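To illustrate how a zero-address machine uses these four instruction types, here is a toy interpreter with a 4-bit data path. The mnemonics (PUSH, ADD, AND, JZ) and the program format are hypothetical; the project's actual instruction set and encodings are not listed here.

```python
class StackMachine:
    """Toy zero-address machine with a 4-bit data path."""

    MASK = 0xF  # keep every value within 4 bits

    def __init__(self):
        self.stack = []

    def execute(self, program):
        """Run a list of (opcode, operand) pairs; mnemonics are hypothetical."""
        pc = 0
        while pc < len(program):
            op, arg = program[pc]
            pc += 1
            if op == "PUSH":                    # data movement
                self.stack.append(arg & self.MASK)
            elif op == "ADD":                   # arithmetic
                b, a = self.stack.pop(), self.stack.pop()
                self.stack.append((a + b) & self.MASK)
            elif op == "AND":                   # logical
                b, a = self.stack.pop(), self.stack.pop()
                self.stack.append(a & b)
            elif op == "JZ":                    # program control
                if self.stack.pop() == 0:
                    pc = arg
            else:
                raise ValueError(f"unknown opcode: {op}")
        return self.stack
```

Arithmetic and logical instructions name no operands: they always consume the top of the stack, which is why no general purpose registers are needed.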