RTL Patch ECO Flow

Overview

RTL Patch ECO avoids a full-scale synthesis run, which takes a long time on a large design. The flow has two modes. In one mode the patch is embedded in the netlist. In the other mode the patch is a standalone Verilog file. The user creates the RTL Patch manually. Compared with the turnaround of several days for a traditional ECO cycle, the RTL Patch method can reduce the time to several hours. Once the RTL Patch is extracted from the original design, the ECO result is guaranteed to be equivalent to the reference design.

Embedded RTL Patch

An embedded RTL Patch is added directly to the netlist. When the netlist is read in, the embedded RTL Patch is incrementally synthesized to gate level, and the new gates are saved directly into the ECOed netlist when write_verilog is used. This method suits small changes.

Figure 1: RTL Patch Embedded in Netlist

Discrete RTL Patch Overview

A discrete RTL Patch is a Verilog file with GOF keywords that indicate how to connect the interface ports. The patch describes the RTL changes needed to fix the logic. Since most net names are optimized away during synthesis, the patch should expand the fanin and fanout of the change until known boundaries are reached. Known boundaries can be equivalent nets, input ports, output ports, flop pins, and hierarchical instance pins.

RTL Patch Syntax and Ports Connections

The RTL Patch uses standard Verilog syntax. The module name is the same as the name of the module under ECO. Connection guidance is given in comments that carry the GOF keyword.

Register names are normally preserved through synthesis. For example, a state machine 'current_state' has register names current_state_reg_*_ or \current_state_reg[*] . These names are used in the RTL Patch for the port connections in ECO. There are several types of port connections:

Type 1: Input Direct Connections

An input port without GOF keyword guidance must have the same name as a net that exists in the module under ECO. For example, 'input clock;'

Type 2: Output Driving the Port Under ECO

An output port without GOF keyword guidance drives the output port of the same name in the module under ECO. For example, 'output proc_active;'

Type 3: Input Driven by Instance Out Pin

An input port driven by a flop's Q pin or a hierarchical instance's output pin uses the GOF keyword to guide the connection.

In the expression, "input [1:0] current_state_Q; //GOF current_state_reg_*_/Q", input net current_state_Q[0] is driven by current_state_reg_0_/Q and current_state_Q[1] is driven by current_state_reg_1_/Q.

When the instance name contains a backslash (a Verilog escaped identifier), remove the backslash and the space. For example, "\current_state_reg[0] /Q" should be written as "current_state_reg[0]/Q".

In the expression, "input pll_stable; //GOF u_pll/pll_stable", the input net pll_stable is driven by instance u_pll output pin pll_stable.
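
The naming rules above can be illustrated with a small Python sketch. It is not GOF's implementation; the helper names expand_gof_pattern and strip_escape are made up for illustration, and the sketch only assumes that '*' stands for the bit index, as in the current_state_Q example.

```python
def expand_gof_pattern(pattern, msb, lsb):
    """Map each bit of a bus port to the pin named by a //GOF pattern.

    Assumption for illustration: '*' in the pattern is replaced by the bit index.
    """
    lo, hi = sorted((msb, lsb))
    return {bit: pattern.replace("*", str(bit)) for bit in range(lo, hi + 1)}


def strip_escape(name):
    """Drop the backslash and the space of an escaped Verilog identifier."""
    return name.replace("\\", "").replace(" ", "")


# input [1:0] current_state_Q; //GOF current_state_reg_*_/Q
for bit, pin in expand_gof_pattern("current_state_reg_*_/Q", 1, 0).items():
    print(f"current_state_Q[{bit}] <- {pin}")

print(strip_escape("\\current_state_reg[0] /Q"))
```

The first loop lists the per-bit drivers described for current_state_Q, and the last line prints the pin name in the form expected after the GOF keyword.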

Type 4: Input Driven by the Driver of Instance In Pin

The input port is driven by the driver of an instance input pin. The instance's input pin itself is under ECO, so the input port takes the original driver of that pin.

In the expression, "input current_state_D; //GOF current_state_reg/D", the pin current_state_reg/D is driven by U456/Z before ECO. During ECO, GOF connects current_state_D to U456/Z.

Type 5: Input Driven by Output Port Driver

The input port is driven by the driver that the output port had before ECO. The output port itself is under ECO, so the input port takes the original driver of the output port.

In the expression, "input state_valid_ORI; //GOF state_valid", the output state_valid is driven by U123/Z before ECO. During ECO, GOF connects state_valid_ORI to U123/Z.

Type 6: Output Driving Instance In Pin

The output port drives a flop's D pin or a hierarchical instance's input pin. The flop or instance is under ECO.

In the expression, "output [1:0] current_state_D; //GOF current_state_reg_*_/D", after ECO, the net current_state_D[0] drives current_state_reg_0_/D and current_state_D[1] drives current_state_reg_1_/D.

In the expression, "output pll_start; //GOF u_pll/pll_start", after ECO, the output net pll_start drives instance u_pll input pin pll_start.

If the user knows the exact ECO spot on an arbitrary instance, the output port can be guided to that instance's input pin.

For example, "output eco_net_valid; //GOF U567/A" directs the ECO fix to pin U567/A and drives the pin with net eco_net_valid.

Type 7: New Input Port

A new input port is generated during ECO.

In the expression, "input new_enable; //GOF_NEW", the input port new_enable will be generated in the current module.

Type 8: New Output Port

A new output port is generated during ECO.

In the expression, "output new_enable; //GOF_NEW", the output port new_enable will be generated in the current module.
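
As a recap of the eight connection types, the following Python sketch sorts a port declaration by its direction and its GOF comment. It is an illustration only, not part of GOF: the function classify_port and its regular expression are hypothetical, it handles one port per line, and it cannot tell types 3, 4 and 5 apart because that depends on what the target is in the netlist.

```python
import re

# One port per line: direction, optional range, name, optional //GOF comment.
PORT_RE = re.compile(
    r"^\s*(input|output)\s+(?:\[[^\]]+\]\s*)?(\w+)\s*;"
    r"\s*(?://\s*(GOF_NEW|GOF\s+\S+))?\s*$"
)


def classify_port(line):
    """Return (port_name, connection_type) for a single port declaration."""
    match = PORT_RE.match(line)
    if match is None:
        raise ValueError(f"unsupported port declaration: {line!r}")
    direction, name, keyword = match.groups()
    if keyword is None:
        kind = "Type 1" if direction == "input" else "Type 2"
    elif keyword == "GOF_NEW":
        kind = "Type 7" if direction == "input" else "Type 8"
    elif direction == "output":
        kind = "Type 6"
    else:
        # Q/out pin (3), instance input pin (4) or output port (5):
        # telling them apart requires looking at the netlist.
        kind = "Type 3, 4 or 5"
    return name, kind


for decl in (
    "input clock;",
    "input [1:0] current_state_Q; //GOF current_state_reg_*_/Q",
    "output pll_start; //GOF u_pll/pll_start",
    "output new_enable; //GOF_NEW",
):
    print(classify_port(decl))
```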

RTL Patch Example

As shown in Figure 2, the design under ECO has a state machine to be updated in one state.

Figure 2: One-State Update for a State Machine

The RTL Patch on the state machine has the same name as the module under ECO, process_controller. The content is shown in Figure 3.

Figure 3: RTL Patch for a State Machine

In the GOF keywords guidance description:

"input [1:0] current_state_Q; //GOF current_state_reg_*_/Q" is the type 3 connection described in the Syntax and Ports Connections section.

"input [1:0] current_state_D; //GOF current_state_reg_*_/D" is the type 4 connection described above.

"output [1:0] next_state_new; //GOF current_state_reg_*_/D" is the type 6 connection described above.

The logic change that the RTL Patch implements is shown in Figure 4.

Figure 4: The RTL Patch Function

RTL Patch Generation

An RTL Patch can be generated by collecting the related combinational logic from the original design. The boundaries are ports or sequential logic.

See Figure 5 for an example of RTL Patch generation.

Figure 5: RTL Patch Generation Example
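
The collection step can be pictured as a graph walk. The Python sketch below covers the fanin direction only and works on a toy data structure, not on a real netlist or any GOF API: drivers maps a net to the nets that feed it, boundaries is the set of ports and flop pins, and all net names are hypothetical.

```python
from collections import deque


def fanin_cone(drivers, boundaries, start):
    """Collect the combinational nets feeding `start`, stopping at boundaries.

    Returns (cone, patch_inputs): the nets to copy into the patch and the
    boundary nets that become input ports of the patch.
    """
    cone, patch_inputs = set(), set()
    queue = deque([start])
    while queue:
        net = queue.popleft()
        if net in cone or net in patch_inputs:
            continue  # already visited
        if net in boundaries and net != start:
            patch_inputs.add(net)  # stop expanding at a port or flop pin
            continue
        cone.add(net)
        queue.extend(drivers.get(net, ()))
    return cone, patch_inputs


# Toy netlist: next_state is built from two internal nets n1 and n2.
drivers = {
    "next_state": ["n1", "n2"],
    "n1": ["current_state_Q", "pro_stop"],
    "n2": ["n1", "done_count"],
}
boundaries = {"current_state_Q", "pro_stop", "done_count"}

cone, patch_inputs = fanin_cone(drivers, boundaries, "next_state")
print(sorted(cone))
print(sorted(patch_inputs))
```

In this toy case the internal nets n1 and n2 end up in the patch body, while the flop output and the two ports become its inputs.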

Apply RTL Patch and Optimization

The RTL Patch is applied to the netlist with the read_rtlpatch API in an ECO script, and the script is run with "gof -run rtl_patch.pl". There are two ways to apply the RTL Patch. In the non-optimized way, the final gate-level patch contains all gates synthesized from the RTL Patch. In the optimized way, the final patch is optimized down from the original full-size patch.

Non-optimized Patch

The RTL Patch is applied directly to the Implementation Netlist:

read_library("stdlib.lib");
read_design("-imp", "netlist_under_eco.v");
# The RTL Patch directly applied to the netlist
read_rtlpatch("rtl_patch.v");
set_top("chip_top");
report_eco;
write_verilog("rtl_patch_eco_noopt.v");

Optimized Patch

To reduce the gate-level size of the final patch, the RTL Patch is first applied to the Reference Netlist, and the new Reference Netlist is then used to fix the Implementation Netlist. Note that in RTL Patch ECO, the initial Reference Netlist is the same as the Implementation Netlist. Applying the RTL Patch to the initial Reference Netlist creates the working Reference Netlist, in which the patch may contain hundreds of gates. When the working Reference Netlist is used to fix the original Implementation Netlist, GOF can find equivalent points in the Implementation Netlist to replace gates in the synthesized patch. The final optimized patch may contain only a few gates.

Figure 6: RTL Patch Optimization Flow

The optimization flow in one script:

read_library("stdlib.lib");
# Read the same netlist for Reference and Implementation
read_design("-ref", "netlist_under_eco.v");
read_design("-imp", "netlist_under_eco.v");
set_tree("ref"); # Set the work tree to Reference, apply rtl_patch to Reference
read_rtlpatch("rtl_patch.v");
set_tree("imp"); # Switch back the work tree to Implementation
set_top("chip_top");
set_cutpoint_ultra(9); # Set internal fix effort
fix_design; 
report_eco;
write_verilog("rtl_patch_eco_optimized.v");

Multiple RTL Patches and New Ports

Multiple RTL Patches can be read in one by one. New input/output ports can be added in the RTL Patch with the GOF_NEW keyword. New ECO ports can also be added by the ECO script.

For example, module parent_mod has two sub-modules, a_mod and b_mod, and both have RTL Patches. a_mod has an ECO output port new_valid that drives the ECO input port new_valid in b_mod, so new_valid must be connected through parent_mod. Other new ports are connected the same way.

Multiple RTL Patches ECO script:

read_library("stdlib.lib");
# Read the same netlist for Reference and Implementation
read_design("-ref", "netlist_under_eco.v");
read_design("-imp", "netlist_under_eco.v");
set_tree('ref'); # Set tree to Reference, and apply the original changes to Reference
# Apply the RTL Patches 
read_rtlpatch("a_mod_rtl_patch.v");
read_rtlpatch("b_mod_rtl_patch.v");
read_rtlpatch("parent_mod_rtl_patch.v");
set_tree('imp'); # Switch back to Implementation, use the updated Reference
set_top("chip_top");
set_cutpoint_ultra(9); # Set internal fix effort
fix_design;
report_eco;
write_verilog("rtl_patches_eco_optimized.v");

The RTL Patch of a_mod:

module a_mod(new_valid,iam_ok, clk, rst, in0d, in1d);
output [1:0] new_valid; //GOF_NEW
input 	     iam_ok; //GOF_NEW
input 	     clk;
input 	     rst;
input 	     in0d, in1d;
reg [1:0]    new_valid; 
wire 	     topnext0  = !(in1d & in0d);
wire 	     topnext1 = in1d | in0d;
always @(posedge clk or negedge rst) begin
   if(!rst) new_valid <= 2'b0;
   else if(iam_ok) new_valid <= 2'b11;
   else new_valid[1:0] <= {topnext1, topnext0};
end   
endmodule 

The RTL Patch of b_mod:

module b_mod(set_d, foranaout3, new_out, iam_ok, ana_in0, ana_in2, ana_in3, din0, ana_out1, clk, rst, ana_out1z,
dly_set, setting, new_valid
);
parameter SMAX = 4;
parameter AMAX = 4;
input ana_in0, ana_in2, ana_in3, din0, ana_out1;
input clk, rst;
input ana_out1z; //GOF u_ana_mod/ana_out2
input [SMAX-1:0] dly_set; //GOF settingdly_reg_*_/Q
input [SMAX-1:0] setting;
output [SMAX-1:0] set_d; //GOF settingdly_reg_*_/D
output foranaout3; //GOF u_ana_mod/ana_in3
output new_out; //GOF newout_reg_0_/D
input  [1:0] new_valid; //GOF_NEW
output 	     iam_ok; //GOF_NEW
reg 	     foranaout3;
reg 	     new_out;
reg [AMAX-2:0]    add3bits;
assign set_d = ana_in0? setting : dly_set;
assign iam_ok = (set_d==4'b1010);
always @(*) begin
   if(dly_set == 4'b1001) foranaout3 = !(din0 & ana_out1 | &new_valid);
   else foranaout3 = 1'b0;
end
always @(*) begin
   add3bits  = {ana_in0, ana_in2} + {ana_in3, din0};
   new_out = add3bits[2] ^ ana_out1z;
end
endmodule 

The RTL Patch of parent_mod:

module parent_mod(to_sig, iam_okdrv,in_sig, iam_ok);
input [1:0] in_sig; //GOF a_mod/new_valid[*]
output [1:0] to_sig; //GOF b_mod/new_valid[*]
input 	     iam_ok; //GOF b_mod/iam_ok
output 	     iam_okdrv; //GOF a_mod/iam_ok
assign to_sig = in_sig;
assign iam_okdrv = iam_ok;
endmodule   

Verify RTL Patch Extraction

The RTL Patch can be verified easily before any change is made. For example, the state machine patch above can be written so that it makes no change to the original logic.

The RTL Patch without logic change:

module process_controller(next_state_new, current_state_Q, current_state_D, pro_stop);
input [1:0] current_state_Q; //GOF current_state_reg_*_/Q
input [1:0] current_state_D; //GOF current_state_reg_*_/D
output [1:0] next_state_new; //GOF current_state_reg_*_/D
input 	     pro_stop;
parameter IDLE = 2'b00;
parameter RAMP = 2'b01;
parameter DATA = 2'b10;
parameter COMP = 2'b11;
reg [1:0]    next_state_new;
always @(*) begin
   if(current_state_Q == DATA) next_state_new = COMP; // No logic change to verify the patch itself
   else next_state_new = current_state_D;
end
endmodule 

After this patch is applied to the Implementation Netlist, the ECOed netlist should be equivalent to the original Implementation Netlist.

The RTL Patch Verification script:

read_library("stdlib.lib");
# Read the same netlist for Reference and Implementation
read_design("-ref", "netlist_under_eco.v");
read_design("-imp", "netlist_under_eco.v");
set_tree("ref"); # Set the work tree to Reference, apply rtl_patch to Reference
read_rtlpatch("rtl_patch.v"); # The RTL Patch is only extracted from original design, no change has been made
set_tree("imp"); # Switch back the work tree to Implementation
set_top("chip_top");
my $non_eq = run_lec; 
if($non_eq == 0){
  gprint("Good! The RTL Patch is extracted correctly\n");
}else{
  gprint("Error! The RTL Patch is not extracted correctly\n");
}
write_verilog("eco_chip_top.v");

In some corner cases, input ports are inverted during the P&R stage, which causes the verification to fail. The prelayout netlist can be read in to check for phase inversion of the input ports.

The input ports inversion checking script:

read_library("stdlib.lib");
read_design("-ref", "prelayout.v");
read_design("-imp", "netlist_under_eco.v");
set_top("process_controller");
# To check if "input [15:0] data_in" has bits inverted
for(my $i=0;$i<16;$i++){
   compare_nets("data_in"."[$i]", "data_in"."[$i]");
}

Tips in Creating RTL Patch

There are several ways to speed up the patch creation and minimize the RTL Patch size.

Use GOF Save Restore Session

Creating an RTL Patch may take many iterations of applying the patch, and loading the library and netlist files each time can take a long time. A session can be saved after the library and netlist files are loaded, and restored much faster the next time.

Save a session after loading library and netlist files:

read_library("stdlib.lib");
# Read the same netlist for Reference and Implementation
read_design("-ref", "netlist_under_eco.v");
read_design("-imp", "netlist_under_eco.v");
save_session("for_rtl_patch");

Restoring a database of 10M instances may take only 10 seconds, which makes the debug process much faster.

Restore the session in each RTL Patch iteration:

restore_session("for_rtl_patch");
read_rtlpatch("rtl_patch.v"); # The RTL Patch is only extracted from original design, no change has been made

Big Case Statement Handling

The basic way to extract an RTL Patch from the original RTL design is to start from the change point and expand the fanin and fanout from there. When the fanout reaches a big case statement, add only the cases that are affected. For example, in the state machine above, only the 'DATA' state is changed, by "if(current_state_Q == DATA)".

Flop Synchronous Set Reset

In a GOF keyword guided type 6 connection, the flop instance may have synchronous initialization, as in the code below. The initialization must be added to the patch.

The original state machine RTL has initialization:

always @(posedge clk) begin
   if(!rst_syn) current_state <= IDLE; // Synchronous reset
   else current_state <= next_state; 
end
always @(*) begin
   case(current_state)
     IDLE: next_state = startd? RAMP : IDLE;
     RAMP: next_state = pro_stop? IDLE : DATA;
     DATA: next_state = done_count? COMP : DATA;
     COMP: next_state = IDLE;
   endcase 
end

The RTL Patch should have the initialization condition added:

always @(*) begin
   if(!rst_syn) next_state_new = IDLE; // To handle synchronous reset in RTL Patch
   else if(current_state_Q == DATA) next_state_new = pro_stop?IDLE : COMP;
   else next_state_new = current_state_D;
end
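
Why the reset term matters can be checked with a small truth-table model in Python. This is a sketch under one assumption: synthesis folds the synchronous reset into the logic that drives the flop's D pin, so the patch output, which takes over the D pin, must reproduce it. The function names are made up for illustration.

```python
from itertools import product

IDLE, RAMP, DATA, COMP = 0, 1, 2, 3


def original_d(rst_syn, state, startd, pro_stop, done_count):
    """D-pin value before ECO (assumption: reset is folded into the D logic)."""
    if not rst_syn:
        return IDLE
    if state == IDLE:
        return RAMP if startd else IDLE
    if state == RAMP:
        return IDLE if pro_stop else DATA
    if state == DATA:
        return COMP if done_count else DATA
    return IDLE  # COMP


def patch_d(rst_syn, state, pro_stop, current_state_d, with_reset):
    """D-pin value after ECO, driven by the patch output next_state_new."""
    if with_reset and not rst_syn:
        return IDLE
    if state == DATA:
        return IDLE if pro_stop else COMP
    return current_state_d


def reset_failures(with_reset):
    """Input combinations where an active reset fails to load IDLE."""
    failures = []
    for state, startd, pro_stop, done_count in product(range(4), (0, 1), (0, 1), (0, 1)):
        d = original_d(0, state, startd, pro_stop, done_count)
        if patch_d(0, state, pro_stop, d, with_reset) != IDLE:
            failures.append((state, startd, pro_stop, done_count))
    return failures


print(len(reset_failures(with_reset=True)))   # 0
print(len(reset_failures(with_reset=False)))  # 4
```

Without the reset term, the model loses the reset whenever the machine is in DATA and pro_stop is low, because the patch then overrides the original D-pin driver with COMP.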

Optimized Away Flops Connections

When some flops of a bus register have been optimized away, the bus can't be used as a whole. It has to be broken into several segments.

The bus divided into segments:

// intr_point_reg_11_ has been merged into intr_point_reg_12_
input [15:12] intr_point_15_12; //GOF intr_point_reg_*_/Q
input [10:0] intr_point_10_0;   //GOF intr_point_reg_*_/Q
// Use {intr_point_15_12[15:12], intr_point_15_12[12], intr_point_10_0[10:0]} as the new bus in the RTL Patch
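
The concatenation in the last comment can be modeled on plain integers. The Python sketch below is only an illustration of the bit arithmetic; the function name rebuild_intr_point is hypothetical, and the segments are passed as unsigned values.

```python
def rebuild_intr_point(seg_15_12, seg_10_0):
    """Rebuild the 16-bit bus from its two segments.

    seg_15_12 holds bits [15:12] as a 4-bit value, seg_10_0 holds bits [10:0].
    Bit 11 was merged into bit 12, so it is copied from bit 12.
    """
    bit12 = seg_15_12 & 0x1
    return ((seg_15_12 & 0xF) << 12) | (bit12 << 11) | (seg_10_0 & 0x7FF)


print(hex(rebuild_intr_point(0b1011, 0x000)))  # 0xb800
print(hex(rebuild_intr_point(0b1010, 0x7FF)))  # 0xa7ff
```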

© 2023 NanDigits Design Automation. All rights reserved.