CLOCK BUFFER AND MINIMUM PULSE WIDTH VIOLATION
Transition (slew): Slew is a rate of change. In static timing analysis (STA), rising and falling waveforms are characterized by whether their transition (slew) is fast or slow. Slew is typically measured as a transition time, i.e. the time a signal takes to move between two specified voltage levels (low to high or high to low). Transition time is the inverse of slew rate: the larger the transition time, the slower the slew, and vice versa.
In the .lib file these transition thresholds are defined as:
/* rising edge thresholds */
slew_lower_threshold_pct_rise : 20.0;
slew_upper_threshold_pct_rise : 80.0;
/* falling edge thresholds */
slew_upper_threshold_pct_fall : 80.0;
slew_lower_threshold_pct_fall : 20.0;
These values are specified as a percentage of VDD.
Rise time: The time required for a signal to transition from 20% to 80% of its maximum value (VDD).
Fall time: The time required for a signal to transition from 80% to 20% of its maximum value (VDD).
Propagation delay: The time between a change at the input of a cell and the resulting change (0 to 1 or 1 to 0) at its output, typically measured between the 50% points of the two waveforms.
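As a rough illustration of the 20%/80% definitions above, the sketch below measures a transition time from a sampled waveform by linear interpolation. The function names and the sample waveform are hypothetical and are not taken from any STA tool.

```python
def crossing_time(times, volts, level):
    """Return the first time the sampled waveform crosses `level` (linear interpolation)."""
    pairs = list(zip(times, volts))
    for (t0, v0), (t1, v1) in zip(pairs, pairs[1:]):
        if v0 != v1 and min(v0, v1) <= level <= max(v0, v1):
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("waveform never crosses the requested level")


def transition_time(times, volts, vdd, lower_pct=20.0, upper_pct=80.0):
    """Time spent between the lower and upper slew thresholds (rise or fall)."""
    t_lower = crossing_time(times, volts, vdd * lower_pct / 100.0)
    t_upper = crossing_time(times, volts, vdd * upper_pct / 100.0)
    return abs(t_upper - t_lower)


# Assumed example: an ideal linear ramp from 0 V to 1 V in 0.1 ns.
rise = transition_time([0.0, 0.1], [0.0, 1.0], vdd=1.0)
print(f"20%-80% rise time = {rise:.3f} ns")
```

For this ideal ramp, the 20% to 80% window covers 60% of the ramp, i.e. 0.06 ns.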
Clock buffer and normal buffer
The clock net is a high-fanout net and the most active signal in the design. Clock buffers are mainly used for clock distribution, i.e. to build the clock tree. The main goal of CTS is to meet the skew and insertion delay targets, and for this we insert buffers in the clock path. If a buffer has different rise and fall times, it changes the duty cycle of the clock passing through it. The tool can still optimize skew in that case, but the whole optimization process becomes more complicated, because the tool has to deal with a clock whose duty cycle differs from one flop path to another. If the rise and fall delays are the same, the only thing the tool has to do is balance the delay by inserting buffers.
Clock buffers are designed with some special properties: high drive strength, equal rise and fall times, low delay, and low delay variation across PVT and OCV. Because a clock buffer has equal rise and fall times, the duty cycle of the clock signal does not change when it passes through a chain of clock buffers.
A perfect clock tree is one that gives minimum insertion delay and a 50% duty cycle for the clock. The clock can maintain a 50% duty cycle only if the rise and fall delays and transitions of the tree cells are equal.
How do we decide whether to use buffers or inverters for building the clock tree in the clock tree synthesis stage? This decision depends entirely on the libraries we are using. The main factors we consider when choosing between an inverter and a buffer are the rise delay, fall delay, drive strength, and insertion delay (latency) of the cell. In most libraries a buffer is a combination of two inverters, so an inverter has less delay than a buffer of the same drive strength. Inverters also tend to have more driving capacity than a buffer, which is why inverters are often preferred over buffers for CTS.
Clock buffers sometimes have their input and output pins on higher metal layers, so far fewer vias are needed in the clock distribution route. A normal buffer has its pins on lower metal layers such as metal1. Some libraries also have clock buffers with input pins on higher metal layers and output pins on lower metal layers. Clock routing is normally done on higher metal layers than signal routing, so clock buffers may have pins on higher metal layers to give easier access from those layers.
Clock buffers are balanced, i.e. their rise and fall times are almost the same. If they are not equal, duty cycle distortion occurs in the clock tree, and because of this minimum pulse width violations come into the picture. In a clock buffer the PMOS is sized larger than the NMOS.
A normal buffer, on the other hand, does not have equal rise and fall times. In other words, it does not need a PMOS/NMOS size ratio of 2:1, i.e. the PMOS does not need to be bigger than the NMOS. Because of this a normal buffer is smaller than a clock buffer, and a clock buffer consumes more power.
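The advantage of an inverter-based tree is that it gives equal rise and fall transitions: each clock edge alternates between a rising and a falling transition from one inverter stage to the next, so the difference between rise and fall delay cancels out over pairs of stages. The duty cycle distortion is therefore canceled and we get symmetrical high and low pulse widths.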
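The cancellation in an inverter chain can be sketched with a small idealized model. It assumes every stage has the same rise and fall delay (here 0.055 ns and 0.048 ns, chosen only for illustration) and tracks the two edges of one high pulse through the chain; `high_phase_width` is a hypothetical helper, not a tool command.

```python
def high_phase_width(high_width, stages, inverting):
    """Width of the original high phase after a chain of cells.

    stages: list of (rise_delay, fall_delay), where rise_delay is the delay to a
    rising edge at that cell's output and fall_delay the delay to a falling edge.
    """
    lead, trail = 0.0, high_width   # arrival times of the two edges of the pulse
    lead_rising = True              # the leading edge starts as a rising edge
    for rise_delay, fall_delay in stages:
        if inverting:
            lead_rising = not lead_rising   # an inverter flips the edge polarity
        # Each edge is delayed according to its polarity at this cell's output.
        lead += rise_delay if lead_rising else fall_delay
        trail += fall_delay if lead_rising else rise_delay
    return trail - lead


stages = [(0.055, 0.048)] * 4   # four identical cells, delays in ns (assumed)
print(f"buffer chain:   {high_phase_width(0.5, stages, inverting=False):.3f} ns")
print(f"inverter chain: {high_phase_width(0.5, stages, inverting=True):.3f} ns")
```

With these assumed numbers the buffer chain shrinks the 0.5 ns pulse to 0.472 ns, while the inverter chain preserves it. In a real tree the stages see different loads, so the cancellation is only partial.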
A buffer contains two inverters of unequal area and unequal drive strength. The first inverter is small and has low drive strength, and the second inverter is large and has high drive strength. The two are connected back to back, as shown in the figure below.
So the loads of these two inverters are unequal. The net between the two back-to-back inverters is short, so its wire capacitance is small and can be neglected. For the next stage, however, the net is longer, so the load is larger: it consists of the wire capacitance plus the input pin capacitance of the next cell. As a result we get unequal rise and fall times, so duty cycle distortion is added to the clock tree, at the additional cost of more area than an inverter.
So inverter-based trees are mainly preferred over buffer-based trees.
Inverter-based tree: equal rise and fall times
Buffer-based tree: unequal rise and fall times
Why is the PMOS bigger than the NMOS?
The majority charge carriers in an NMOS are electrons, and the majority charge carriers in a PMOS are holes. Electrons have much higher mobility than holes.
Since electron mobility is greater than hole mobility, the PMOS width must be larger to compensate and make the pull-up network stronger. If the W/L of the PMOS were the same as that of the NMOS, the charging time of the output node would be longer than the discharging time, because charging goes through the pull-up (PMOS) network while discharging goes through the pull-down (NMOS) network.
So we make the PMOS bigger so that we get equal rise and fall times.
Normal buffers are designed with a W/L ratio such that the sum of rise and fall times is minimum.
Normally, for equal sizes:
R(PMOS) > R(NMOS)
R(PMOS) ≈ 3 × R(NMOS)
To make the resistance of both transistors equal, the PMOS is made bigger than the NMOS.
The duty cycle of the clock:
It is the fraction of one clock period during which the clock signal is in the high (active) state. A period is the time it takes for the clock signal to complete one full high-and-low cycle. Duty cycle (D) is expressed as a percentage: D = (high time / period) × 100%.
Minimum pulse width violation:
A sufficiently wide clock pulse is important for the proper functionality of sequential and combinational cells. The clock pulse must be wide enough for the internal operation of the cell, i.e. the minimum pulse width of the clock has to be maintained for a proper output. Otherwise the cell can go into a metastable state and we will not get the correct output.
In other words, the clock pulse into the flop/latch must be wide enough that it does not interfere with the correct functionality of the cell.
Minimum pulse width checks ensure that the width of the clock signal, for both the high and the low duration, is more than the required value.
Basically this violation depends on the frequency of operation and the technology we are working with. If the frequency of the design is 1 GHz, the clock period is 1 ns, so the high and low pulses are each 0.5 ns wide if the duty cycle is 50%.
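The arithmetic above generalizes to any frequency and duty cycle. The helper below is a minimal sketch; its name and its choice of units (GHz in, ns out) are assumptions made for illustration.

```python
def nominal_pulse_widths(freq_ghz, duty_cycle_pct=50.0):
    """Return (period, high width, low width) in ns for an ideal clock."""
    period_ns = 1.0 / freq_ghz                     # 1 GHz corresponds to a 1 ns period
    high_ns = period_ns * duty_cycle_pct / 100.0   # time spent at the high level
    low_ns = period_ns - high_ns                   # the rest of the period is low
    return period_ns, high_ns, low_ns


period, high, low = nominal_pulse_widths(1.0, 50.0)
print(f"period = {period} ns, high = {high} ns, low = {low} ns")
```

These ideal widths are the starting point; the delays of the clock tree cells then widen one phase and narrow the other.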
In most designs the duty cycle is kept at 50% for simplicity; otherwise the designer can face issues such as clock distortion and minimum pulse width violations. If the design uses half-cycle paths, where data is launched at the positive edge and captured at the negative edge, the pulse widths matter again, because the high and low levels will not have the same width. If a lot of inverters and buffers are chained together, it is even possible for the pulse to vanish completely.
Normally we use clock buffers in the clock path because they have equal rise and fall delays, whereas normal buffers have unequal delays. That is why we have to check the minimum pulse width.
Why the minimum pulse width violation occurs:
It occurs due to the unequal rise and fall delays of combinational cells. Let's take the example of a buffer, with a clock signal of 1 GHz frequency (1 ns period) entering the buffer. If the rise delay is more than the fall delay, the high level of the output clock pulse will be narrower than that of the input clock pulse.
The difference between rise and fall delay is: 0.006 ns
High pulse: 0.5 - 0.006 = 0.494 ns
Low pulse: 0.5 + 0.006 = 0.506 ns
We can understand it with an example:
Suppose a clock signal passes through a number of buffers that each have different rise and fall delays. We can calculate how this affects the low and high pulses of the clock signal. The high pulse gets narrower whenever the rise delay of a buffer is more than its fall delay.
Every buffer in this chain takes more time to charge than to discharge, so when the clock signal propagates through the long chain of buffers, the high pulse width is reduced as shown below.
We can understand it by calculation:
High pulse width = half period of clock signal - sum of (rise delay - fall delay)
= 0.5 - (0.055 - 0.048) - (0.039 - 0.032) - (0.025 - 0.022) - (0.048 - 0.043) - (0.058 - 0.054) = 0.474 ns
Low pulse width = half period of clock signal + sum of (rise delay - fall delay)
= 0.5 + (0.055 - 0.048) + (0.039 - 0.032) + (0.025 - 0.022) + (0.048 - 0.043) + (0.058 - 0.054) = 0.526 ns
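The same accumulation can be written as a short script. This is only a sketch of the hand calculation above, using the five rise/fall delay pairs from the example; the function name is hypothetical.

```python
def pulse_widths_after_chain(half_period, stage_delays):
    """Return (high width, low width) after a chain of non-inverting buffers.

    stage_delays: list of (rise_delay, fall_delay) per buffer, same unit as half_period.
    """
    # Each buffer narrows the high pulse by (rise - fall) and widens the low pulse equally.
    distortion = sum(rise - fall for rise, fall in stage_delays)
    return half_period - distortion, half_period + distortion


stages = [(0.055, 0.048), (0.039, 0.032), (0.025, 0.022),
          (0.048, 0.043), (0.058, 0.054)]   # delays in ns, from the example
high, low = pulse_widths_after_chain(0.5, stages)
print(f"high = {high:.3f} ns, low = {low:.3f} ns")   # high = 0.474 ns, low = 0.526 ns
```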
Suppose the required minimum pulse width is 0.410 ns and the uncertainty is 90 ps.
Then high pulse width = 0.474 - 0.090 = 0.384 ns
The slack is 0.384 - 0.410 = -0.026 ns
Here we get a minimum pulse width violation for the high pulse, because the total high pulse width is less than the required value.
If we did not consider uncertainty, the slack would be 0.474 - 0.410 = +0.064 ns and no violation would occur in this scenario.
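The slack check can be sketched the same way. The function below simply mirrors the arithmetic used in this example (actual width minus uncertainty minus required width); an STA tool may account for uncertainty differently, and the function name is hypothetical.

```python
def min_pulse_width_slack(actual_width, required_width, uncertainty=0.0):
    """Slack of a minimum pulse width check; a negative value means a violation."""
    return (actual_width - uncertainty) - required_width


with_unc = min_pulse_width_slack(0.474, 0.410, uncertainty=0.090)
without_unc = min_pulse_width_slack(0.474, 0.410)
print(f"slack with uncertainty:    {with_unc:+.3f} ns")     # -0.026 ns, violated
print(f"slack without uncertainty: {without_unc:+.3f} ns")  # +0.064 ns, met
```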
How to correct violations if they are present in the design:
We need to replace the clock tree cells with cells that have equal rise and fall delays, or use cells that have a smaller difference between rise and fall delays.
What problems occur if there is a pulse width violation:
- Sequential data might not be captured properly, and the flop can go into a metastable state.
- In some logic circuits the entire pulse could disappear, so no new data is captured.
So it is required to ensure that every circuit element always gets a clock pulse wider than the required minimum pulse width; only then will no violation occur in the design.
There are two types of minimum pulse width checks:
- Clock pulse width check at sequential devices
- Clock pulse width check at combinational circuits
How to report:
report_timing -check_type pulse_width
How to define pulse width:
In the liberty file (.lib):
By default, all the registers in the design have a minimum pulse width defined in the .lib file, as this is the format that conveys the standard cell requirements to the STA tool.
By convention, minimum pulse width is defined for clock pins and reset pins.
Keyword: min_pulse_width
In the SDC file (.sdc):
set_min_pulse_width -high 5 [get_clocks clk1]
set_min_pulse_width -low 4 [get_clocks clk1]
If neither -high nor -low is specified, the constraint applies to both high and low pulses.
NOTE:
Balanced buffers are buffers with equal rise and fall times.
Unbalanced buffers are buffers with unequal rise and fall times.
NEXT TOPICS
clock tree types
clock tree optimization process
difference between High fanout net synthesis and clock tree synthesis
-physicaldesign4u@gmail.com
- physicaldesign4you