RGB_YUV Hardware Design
林鼎原 (Ding Yuan Lin)
Department of Electrical Engineering, National Cheng Kung University
Tainan, Taiwan, R.O.C.
112/04/20
NCKU EE CAD
Program code
void main(void)
{
    int a, b, c;
    …….
    RGB_2_Y(I_Frame, O_Frame);
    …….
}

void RGB_2_Y(I_Frame, O_Frame)
{
    int i, y;
    for (i = 0; i < 64; i++)
    {
        y = 0.257*a + 0.504*b + 0.098*c + 16;
        write(y) to O_Frame;
    }
}
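The loop above can be sketched as a small C reference model, for example to produce expected values when checking the hardware. The function names `rgb_to_y_ref` and `rgb_frame_to_y`, the packed 3-bytes-per-pixel frame layout, and the truncation of the result to an integer are assumptions made for illustration; the slides do not specify them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Floating-point reference for y = 0.257*a + 0.504*b + 0.098*c + 16,
 * with a, b, c taken as the R, G, B components.
 * The result is truncated toward zero when converted to an integer. */
static uint8_t rgb_to_y_ref(uint8_t r, uint8_t g, uint8_t b)
{
    double y = 0.257 * r + 0.504 * g + 0.098 * b + 16.0;
    return (uint8_t)y;
}

/* Convert n pixels. in_frame holds 3 bytes (R, G, B) per pixel;
 * this layout is an assumption, not taken from the slides. */
static void rgb_frame_to_y(const uint8_t *in_frame, uint8_t *out_frame, size_t n)
{
    assert(in_frame != NULL && out_frame != NULL);
    for (size_t i = 0; i < n; i++) {
        const uint8_t *px = &in_frame[3 * i];
        out_frame[i] = rgb_to_y_ref(px[0], px[1], px[2]);
    }
}
```

Calling `rgb_frame_to_y(frame, y_out, 64)` would process one 64-pixel frame, matching the loop count in the program code.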
Pipelining Scheduling for 6 Pipeline Latency
[Figure: scheduled data-flow graph of the loop body. Inputs a, b and c are multiplied by 0.257, 0.504 and 0.098, the products are summed with 16 to give y, and the loop counter is updated by 1 and compared (>=) with 64. Values are labeled V1 to V7, operations c1 to c8 and control steps s1 to s18; the status signal is marked on the graph.]
Each cycle: 1 adder, 2 multipliers
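Lifetimes of Values
[Figure: lifetime chart of values V1 to V7 over control steps 1 to 6, followed by the same chart reordered as V1, V5, V6, V2, V3, V4, V7 and packed into registers R1, R2, R3.]
Left edge algorithm to allocate values into registers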
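A minimal sketch of the left edge algorithm in C follows. The lifetimes in `example_lifetimes` are hypothetical, since the chart itself did not survive in this transcript, and the names `Lifetime` and `left_edge_allocate` are illustrative only. Values are sorted by the left edge (start) of their lifetime, and each register is then filled greedily with values whose lifetimes do not overlap.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define MAX_VALUES 32

typedef struct {
    int start;  /* first control step in which the value is live */
    int end;    /* first control step in which it is no longer needed */
} Lifetime;

/* Hypothetical lifetimes for V1..V7 (NOT read from the slide's chart). */
static const Lifetime example_lifetimes[7] = {
    {1, 3}, {1, 3}, {3, 4}, {4, 5}, {3, 5}, {5, 6}, {1, 7}
};

static const Lifetime *g_sort_key;  /* lifetimes seen by the comparator */

/* Order value indices by start time; ties are broken by index. */
static int by_left_edge(const void *pa, const void *pb)
{
    int a = *(const int *)pa, b = *(const int *)pb;
    if (g_sort_key[a].start != g_sort_key[b].start)
        return g_sort_key[a].start - g_sort_key[b].start;
    return a - b;
}

/* Writes the register index of each value into reg_of[] and
 * returns the number of registers used. */
static int left_edge_allocate(const Lifetime *lt, int n, int *reg_of)
{
    int order[MAX_VALUES];
    int assigned = 0, regs = 0;

    assert(n >= 0 && n <= MAX_VALUES);
    for (int i = 0; i < n; i++) {
        order[i] = i;
        reg_of[i] = -1;
    }
    g_sort_key = lt;
    qsort(order, (size_t)n, sizeof order[0], by_left_edge);

    while (assigned < n) {
        int last_end = INT_MIN;  /* end of the last value put in this register */
        for (int k = 0; k < n; k++) {
            int v = order[k];
            if (reg_of[v] == -1 && lt[v].start >= last_end) {
                reg_of[v] = regs;
                last_end = lt[v].end;
                assigned++;
            }
        }
        regs++;  /* open the next register for the remaining values */
    }
    return regs;
}
```

Running `left_edge_allocate(example_lifetimes, 7, reg_of)` packs the seven hypothetical values into as few registers as the overlaps allow; with real lifetimes taken from the schedule, the same routine would yield the register sharing used later in the post-allocation design.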
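Lifetimes of Operations
[Figure: lifetime chart of operations C1 to C8 over control steps 1 to 6, shown in the original order and reordered as C1, C2, C4, C8, C7, C3, C5, C6. Each operation is marked as a multiplication (*) or an addition (+) and is assigned to a multiplier or an adder.]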
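IP Data Path
R1 = {V1, V5}, R2 = {V2, V3, V4}; C1, C4 → multiplier 1; C2 → multiplier 2
[Figure: data path. Inputs a, b, c and the constants 0.257, 0.504, 0.098 feed multiplier 1 and multiplier 2 through multiplexers M1 and M2 (selects s1, s2). Multiplexers M3 and M4 (selects s3, s4) feed the adder/subtractor (+/-, controlled by AlU_op) from registers R1, R2, R3 and the constants 16, 64 and 1. The registers have enables R1.ena, R2.ena, R3.ena. The controller takes clk, rst and status and drives valid, busy and the control signals; the result leaves on out.]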
IP Controller
State    Next state   Next state   valid  busy  R1.ena  R2.ena  R3.ena  s1 (M1)  s2 (M2)  s3 (M3)  s4 (M4)  Alu_op (adder)
         (status=1)   (status=0)
State0   State1       State1       0      0     0       0       1       0        0        10       01       0
State1   State2       State2       0      1     1       1       1       0        0        10       10       1
State2   State3       State3       0      1     1       0       1       0        0        00       00       0
State3   State4       State4       0      1     1       0       0       0        0        00       00       0
State4   State5       State5       0      1     1       1       0       1        1        00       00       0
State5   State6       State6       0      1     1       1       0       1        1        01       00       0
State6   State0       State1       1      1     1       1       0       1        1        00       00       0
[State diagram: rst enters S0; S0 → S1 → S2 → S3 → S4 → S5 → S6; from S6, status = 0 returns to S1 and status = 1 returns to S0.]
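The controller table can be captured as a lookup table, which is one way to check the state sequence before writing the Verilog. The sketch below is only a software model: the type `ControlWord`, the array `control_table` and the function `next_state` are hypothetical names, and the column order follows the table header (next state for status=1, next state for status=0, valid, busy, the three register enables, the four multiplexer selects, Alu_op).

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint8_t next_if_status1;         /* next state when status = 1 */
    uint8_t next_if_status0;         /* next state when status = 0 */
    uint8_t valid, busy;
    uint8_t r1_ena, r2_ena, r3_ena;  /* register enables */
    uint8_t s1, s2;                  /* 1-bit selects for M1, M2 */
    uint8_t s3, s4;                  /* 2-bit selects for M3, M4 */
    uint8_t alu_op;                  /* 1 = subtract, 0 = add */
} ControlWord;

/* One row per state, copied from the controller table. */
static const ControlWord control_table[7] = {
    /* State0 */ {1, 1, 0, 0, 0, 0, 1, 0, 0, 2, 1, 0},
    /* State1 */ {2, 2, 0, 1, 1, 1, 1, 0, 0, 2, 2, 1},
    /* State2 */ {3, 3, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0},
    /* State3 */ {4, 4, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0},
    /* State4 */ {5, 5, 0, 1, 1, 1, 0, 1, 1, 0, 0, 0},
    /* State5 */ {6, 6, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0},
    /* State6 */ {0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0},
};

/* Returns the state that follows `state` for the given status bit. */
static uint8_t next_state(uint8_t state, int status)
{
    assert(state < 7);
    return status ? control_table[state].next_if_status1
                  : control_table[state].next_if_status0;
}
```

Stepping `next_state` from State0 with status held at 0 walks through State1 to State6 and then loops back to State1, while status = 1 in State6 sends the controller back to State0.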
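Pre-allocation: Design Method
The hardware is designed directly from the loop body. It uses 7 registers in total (R1~R6 and counter), 4 adders and 3 multipliers.
For the multiplications, each fractional coefficient is first multiplied by 256 (2 to the 8th power), i.e. shifted left by 8 bits, and then multiplied by the 8-bit input data. The product is 16 bits wide; discarding the lower 8 bits leaves the integer part.
The control unit has 7 states (s0~s6):
S0: reset.
S1: receive the input data R, G, B and check whether counter is greater than or equal to 0; if so, continue, otherwise exit.
S2: read R and G and start computing RGB_R*0.257 and RGB_G*0.504; decrement counter by 1.
S3: store RGB_R*0.257 in V1 and RGB_G*0.504 in V2.
S4: read input data c and start computing RGB_B*0.098; V3 = V1 + V2.
S5: compute V3 + 16 (V4); store RGB_B*0.098 in V5.
S6: Y = V4 + V5.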
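The scale-by-256 multiplication described above can be sketched in C as follows. The coefficients 66, 129 and 25 are the ones that appear in the Verilog listing (0.257, 0.504 and 0.098 times 256, rounded); the names `scale_q8` and `rgb_to_y_fixed` are hypothetical. Each product is truncated separately, as in the hardware, so the result can be slightly lower than the floating-point formula.

```c
#include <assert.h>
#include <stdint.h>

/* Coefficients pre-multiplied by 256 and rounded to integers. */
enum { COEF_R = 66, COEF_G = 129, COEF_B = 25 };

/* 8-bit x 8-bit multiply giving a 16-bit product; keeping only the
 * upper 8 bits discards the fractional part. */
static uint8_t scale_q8(uint8_t x, uint8_t coef)
{
    uint16_t product = (uint16_t)(x * coef);
    return (uint8_t)(product >> 8);
}

/* Y = V4 + V5, with V1, V2, V5 the truncated products,
 * V3 = V1 + V2 and V4 = V3 + 16. */
static uint8_t rgb_to_y_fixed(uint8_t r, uint8_t g, uint8_t b)
{
    uint8_t v1 = scale_q8(r, COEF_R);
    uint8_t v2 = scale_q8(g, COEF_G);
    uint8_t v5 = scale_q8(b, COEF_B);
    assert(v1 + v2 + 16 + v5 <= 255);  /* the 8-bit sum cannot overflow */
    return (uint8_t)(v1 + v2 + 16 + v5);
}
```

For example, `scale_q8(255, COEF_R)` computes 255 * 66 = 16830 and keeps the upper byte, 65, where the exact value 0.257 * 255 is about 65.5.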
Verilog Code for Pre-allocation Design (1/5)
`timescale 1ns / 1ps
module rgb_to_yuv(clk, reset, rgb_in, Y, busy, valid);
// Input and output port declarations
input clk, reset;
input [23:0] rgb_in;
output [7:0] Y;
output busy;   // while busy is high, rgb_in input pauses until busy goes low
output valid;  // the output value is valid only while valid is high

reg busy;
reg valid;
reg [6:0] counter;  // counts the iterations done and decides whether to finish
reg [7:0] RGB_R, RGB_G, RGB_B;
reg [2:0] present_state, next_state;
reg [7:0] R3_tmp, R4_tmp, R6_tmp;
wire [7:0] R1_tmp, R2_tmp, R5_tmp;
reg [15:0] m1, m2, m3;  // for 3 multipliers
reg [7:0] R1, R2, R3, R4, R5, R6;
// state parameters
parameter [2:0] s0=3'd0, s1=3'd1, s2=3'd2, s3=3'd3, s4=3'd4, s5=3'd5, s6=3'd6;
Verilog Code for Pre-allocation Design (2/5)
// counter: counts the iterations done and decides whether to finish
always @(posedge clk)
begin
    if (reset)
        counter <= 7'd0;
    else if (present_state == s6)
        counter <= counter + 7'd1;
    else
        counter <= counter;
end

// data or state registers
always @(posedge clk or posedge reset)
begin
    if (reset)
    begin  // initialization
        present_state <= s0;
        RGB_R <= 8'd0;
        RGB_G <= 8'd0;
        RGB_B <= 8'd0;
        R1 <= 8'd0;
        R2 <= 8'd0;
        R3 <= 8'd0;
        R4 <= 8'd0;
        R5 <= 8'd0;
        R6 <= 8'd0;
    end
    else
    begin
        present_state <= next_state;
        if (present_state == s1)  // state 1: read the input
        begin
            RGB_R <= rgb_in[23:16];
            RGB_G <= rgb_in[15:8];
            RGB_B <= rgb_in[7:0];
        end
        R1 <= R1_tmp;
        R2 <= R2_tmp;
        R3 <= R3_tmp;
        R4 <= R4_tmp;
        R5 <= R5_tmp;
        R6 <= R6_tmp;
    end
end
Verilog Code for Pre-allocation Design (3/5)
// next state logic
always @ (present_state)
begin
    case (present_state)
        s0: next_state = s1;
        s1: next_state = s2;
        s2: next_state = s3;
        s3: next_state = s4;
        s4: next_state = s5;
        s5: next_state = s6;
        default: next_state = s1;
    endcase
end

// control signals
always @ (present_state or busy or counter)
begin
    case (present_state)
        s0: begin valid = 1'b0; busy = 1'b0; end
        s1: begin valid = 1'b0; busy = 1'b0; end
        s2: begin valid = 1'b0; busy = 1'b1; end
        s3: begin valid = 1'b0; busy = 1'b1; end
        s4: begin valid = 1'b0; busy = 1'b1; end
        s5: begin valid = 1'b0; busy = 1'b1; end
        s6: begin valid = 1'b1; busy = 1'b1; end
        default:
            if (counter == 7'd0)
            begin valid = 1'b0; busy = 1'bx; end
            else
            begin valid = 1'b1; busy = 1'b0; end
    endcase
end

assign R1_tmp = m1[15:8];  // discard the lower 8 bits
assign R2_tmp = m2[15:8];
assign R5_tmp = m3[15:8];
assign Y = (present_state == s6) ? R6 : 8'd0;  // Y is output in state S6
Verilog Code for Pre-allocation Design (4/5)
// rgb to y execution
always @(*)
begin
    case (present_state)
        s0: begin
            m1 = 16'd0;
            m2 = 16'd0;
            m3 = 16'd0;
            R3_tmp = 8'd0;
            R4_tmp = 8'd0;
            R6_tmp = 8'd0;
        end
        s1: begin
            m1 = {R1, 8'd0};  // read data
            m2 = {R2, 8'd0};  // read data
            m3 = {R5, 8'd0};  // read data
            R3_tmp = R3;
            R4_tmp = R4;
            R6_tmp = R6;
        end
        s2: begin
            m1 = RGB_R * 8'd66;   // action 0.257
            m2 = RGB_G * 8'd129;  // action 0.504
            m3 = {R5, 8'd0};
            R3_tmp = R3;
            R4_tmp = R4;
            R6_tmp = R6;
        end
        s3: begin
            m1 = {R1, 8'd0};
            m2 = {R2, 8'd0};
            m3 = RGB_B * 8'd25;  // action 0.098
            R3_tmp = R1 + R2;    // action
            R4_tmp = R4;
            R6_tmp = R6;
        end
Verilog Code for Pre-allocation Design (5/5)
        s4: begin
            m1 = {R1, 8'd0};
            m2 = {R2, 8'd0};
            m3 = {R5, 8'd0};
            R3_tmp = R3;
            R4_tmp = R3 + 8'd16;  // action
            R6_tmp = R6;
        end
        s5: begin
            m1 = {R1, 8'd0};
            m2 = {R2, 8'd0};
            m3 = {R5, 8'd0};
            R3_tmp = R3;
            R4_tmp = R4;
            R6_tmp = R4 + R5;
        end
        s6: begin
            m1 = {R1, 8'd0};
            m2 = {R2, 8'd0};
            m3 = {R5, 8'd0};
            R3_tmp = R3;
            R4_tmp = R4;
            R6_tmp = R6;
        end
        default: begin
            m1 = 16'd0;
            m2 = 16'd0;
            m3 = 16'd0;
            R3_tmp = 8'd0;
            R4_tmp = 8'd0;
            R6_tmp = 8'd0;
        end
    endcase
end
endmodule
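Post-allocation: Design Method
The lifetime analysis reveals the following sharing opportunities:
Multipliers: only 2 are needed after sharing.
Adders: only 1 is needed after sharing.
Registers: R1 and R5 can be shared and are renamed R1; R2, R3 and R4 can be shared and are renamed R2; counter is renamed R3.
The control circuit generates the control signals for the four multiplexers, the add/subtract control signal of the adder, and the register write-enable signals reg_ena.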
Verilog Code for Post-allocation Design (1/6)
`timescale 1ns / 1ps
module rgb_to_yuv(clk, reset, rgb_in, Y, busy, valid);
// Input and output port declarations
input clk, reset;
input [23:0] rgb_in;
output [7:0] Y;
output busy;   // while busy is high, rgb_in input pauses until busy goes low
output valid;  // the output value is valid only while valid is high
reg busy;
reg valid;
reg [7:0] RGB_R, RGB_G, RGB_B;
reg [2:0] present_state, next_state;
reg [7:0] R1, R2, R3;  // shared registers
reg [15:0] mux1, mux2;
reg [7:0] mux3, mux4;
reg [7:0] mul_reg1, mul_reg2;
reg [15:0] mul1, mul2;  // two multipliers
reg [7:0] add;          // one adder
wire status;
// select lines
reg R1_ena, sel_12, R2_ena, R3_ena, alu_op;
reg [1:0] sel_3, sel_4;
// state parameters
parameter [2:0] s0=3'd0, s1=3'd1, s2=3'd2, s3=3'd3, s4=3'd4, s5=3'd5, s6=3'd6;
Verilog Code for Post-allocation Design (2/6)
// R3 is an unsigned reg, so R3>=0 is always true and status stays 0 as written
assign status = (R3 >= 0) ? 1'b0 : 1'b1;

// data or state registers
always @(posedge clk or posedge reset)
begin
    if (reset)
    begin  // initialization
        present_state <= s0;
        RGB_R <= 8'd0;
        RGB_G <= 8'd0;
        RGB_B <= 8'd0;
        mul_reg1 <= 8'd0;
        mul_reg2 <= 8'd0;
        R1 <= 8'd0;
        R2 <= 8'd0;
        R3 <= 8'd0;
    end
    else
    begin
        present_state <= next_state;
        if (present_state == s1 && status == 1'd0)  // state 1: read the input
        begin
            RGB_R <= rgb_in[23:16];
            RGB_G <= rgb_in[15:8];
            RGB_B <= rgb_in[7:0];
        end
        mul_reg1 <= mul1[15:8];
        mul_reg2 <= mul2[15:8];
        R1 <= mul_reg1;
        if (R2_ena == 1'b0)
            R2 <= mul_reg2;
        else if (R3_ena == 1'b1 && alu_op == 1'b1)
            R3 <= mux3 - mux4;
        else
            R2 <= add;
    end
end

assign Y = (present_state == s6) ? add : 8'd0;  // Y is output in state S6
Verilog Code for Post-allocation Design (3/6)
// next state logic
always @ (present_state)
begin
    case (present_state)
        s0: next_state = s1;
        s1: next_state = s2;
        s2: next_state = s3;
        s3: next_state = s4;
        s4: next_state = s5;
        s5: next_state = s6;
        default: next_state = s1;
    endcase
end

// control signals
always @ (present_state)
begin
    case (present_state)
        s0: begin
            valid = 1'b0;
            busy = 1'b0;
            R2_ena = 1'b0;
            R3_ena = 1'b1;
            sel_12 = 1'b0;
            sel_3 = 2'b10;  // R3
            sel_4 = 2'b01;  // 64
            alu_op = 1'b0;  // add
        end
        s1: begin
            valid = 1'b0;
            busy = 1'b0;
            R2_ena = 1'b1;
            R3_ena = 1'b1;
            sel_12 = 1'b0;
            sel_3 = 2'b10;  // R3
            sel_4 = 2'b10;  // 1
            alu_op = 1'b1;  // sub
        end
Verilog Code for Post-allocation Design (4/6)
        s2: begin
            valid = 1'b0;
            busy = 1'b1;
            R2_ena = 1'b0;
            R3_ena = 1'b1;
            sel_12 = 1'b0;
            sel_3 = 2'b00;
            sel_4 = 2'b00;
            alu_op = 1'b0;  // add
        end
        s3: begin
            valid = 1'b0;
            busy = 1'b1;
            R2_ena = 1'b0;
            R3_ena = 1'b0;
            sel_12 = 1'b0;
            sel_3 = 2'b00;
            sel_4 = 2'b00;
            alu_op = 1'b0;
        end
        s4: begin
            valid = 1'b0;
            busy = 1'b1;
            sel_12 = 1'b1;
            R2_ena = 1'b1;
            R3_ena = 1'b0;
            sel_3 = 2'b00;
            sel_4 = 2'b00;
            alu_op = 1'b0;
        end
        s5: begin
            valid = 1'b0;
            busy = 1'b1;
            R2_ena = 1'b1;
            R3_ena = 1'b0;
            sel_12 = 1'b1;
            sel_3 = 2'b01;
            sel_4 = 2'b00;
            alu_op = 1'b0;
        end
Verilog Code for Post-allocation Design (5/6)
        s6: begin
            valid = 1'b1;
            busy = 1'b1;
            sel_12 = 1'b0;
            sel_3 = 2'b00;
            sel_4 = 2'b00;
            R2_ena = 1'b1;
            R3_ena = 1'b0;
            alu_op = 1'b0;
        end
        default: begin
            valid = 1'b0;
            busy = 1'b0;
            sel_12 = 1'b0;
            sel_3 = 2'b00;
            sel_4 = 2'b00;
            R2_ena = 1'b0;
            R3_ena = 1'b0;
            alu_op = 1'b0;
        end
    endcase
end

// Mux1 and Mux2
always @ (sel_12 or RGB_R or RGB_B)
begin
    case (sel_12)
        1'b0: begin
            mux1 = RGB_R;
            mux2 = 16'd66;  // 0.257
        end
        default: begin
            mux1 = RGB_B;
            mux2 = 16'd25;  // 0.098
        end
    endcase
end
Verilog Code for Post-allocation Design (6/6)
// Mux3
always @ (sel_3 or R1 or R3)
begin
    case (sel_3)
        2'b00: mux3 = R1;
        2'b01: mux3 = 8'd16;
        2'b10: mux3 = R3;
        default: mux3 = 8'd0;
    endcase
end

// Mux4
always @ (sel_4 or R2)
begin
    case (sel_4)
        2'b00: mux4 = R2;
        2'b01: mux4 = 8'd64;
        2'b10: mux4 = 8'd1;
        default: mux4 = 8'd0;
    endcase
end

// ALU
always @ (mux1 or mux2 or mux3 or mux4 or RGB_R or RGB_G or alu_op or R1 or R2 or R3)
begin
    mul1 = mux1 * mux2;
    mul2 = RGB_G * 16'd129;  // 0.504
    if (alu_op == 1'b1)
        add = mux3 - mux4;
    else
        add = mux4 + mux3;
end
endmodule
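The shared adder path (Mux3, Mux4 and the add/subtract unit) can be modeled in a few lines of C to check the select encodings by hand. This is only a behavioral sketch; the function names `mux3`, `mux4` and `alu` mirror the Verilog signal names but are otherwise hypothetical, and 8-bit wrap-around is assumed to match the 8-bit `add` register.

```c
#include <assert.h>
#include <stdint.h>

/* Mux3: 00 -> R1, 01 -> constant 16, 10 -> R3, otherwise 0. */
static uint8_t mux3(uint8_t sel_3, uint8_t r1, uint8_t r3)
{
    assert(sel_3 <= 3);
    switch (sel_3) {
    case 0:  return r1;
    case 1:  return 16;
    case 2:  return r3;
    default: return 0;
    }
}

/* Mux4: 00 -> R2, 01 -> constant 64, 10 -> constant 1, otherwise 0. */
static uint8_t mux4(uint8_t sel_4, uint8_t r2)
{
    assert(sel_4 <= 3);
    switch (sel_4) {
    case 0:  return r2;
    case 1:  return 64;
    case 2:  return 1;
    default: return 0;
    }
}

/* Single adder/subtractor: alu_op = 1 subtracts, alu_op = 0 adds.
 * The result wraps to 8 bits. */
static uint8_t alu(uint8_t a, uint8_t b, uint8_t alu_op)
{
    return alu_op ? (uint8_t)(a - b) : (uint8_t)(a + b);
}
```

With the state s1 settings (sel_3 = 2, sel_4 = 2, alu_op = 1), `alu(mux3(2, r1, r3), mux4(2, r2), 1)` evaluates R3 - 1, the counter update; with sel_3 = 1 and sel_4 = 0 it evaluates 16 + R2.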
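Waveform
[Figure: simulation waveform; the RGB input is shown in hex, together with the control signals.]
When valid is high, the output is valid; when busy is high, data input pauses.
When alu_op is high, the adder performs subtraction; when status is high, no further data is accepted.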
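Pattern Verification Results
The computed results are compared against the expected results to check correctness. There are 64 data items in total (0~63).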
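A comparison of this kind can be sketched in C as below. The actual test patterns are not reproduced in this transcript, so the sketch uses hypothetical gray patterns (R = G = B = i for i = 0 to 63); `y_expected`, `y_hardware_model` and `max_abs_error` are illustrative names. Because the hardware truncates each product separately, a small difference from the floating-point formula is expected rather than an exact match.

```c
#include <assert.h>
#include <stdlib.h>

/* Expected value from the floating-point formula, truncated to an integer. */
static int y_expected(int r, int g, int b)
{
    return (int)(0.257 * r + 0.504 * g + 0.098 * b + 16.0);
}

/* Software model of the hardware: coefficients scaled by 256,
 * each product truncated to its upper 8 bits before the additions. */
static int y_hardware_model(int r, int g, int b)
{
    return ((r * 66) >> 8) + ((g * 129) >> 8) + ((b * 25) >> 8) + 16;
}

/* Runs `count` hypothetical gray patterns (R = G = B = i) and
 * returns the largest absolute difference between the two models. */
static int max_abs_error(int count)
{
    int worst = 0;

    assert(count >= 0 && count <= 256);
    for (int i = 0; i < count; i++) {
        int diff = abs(y_expected(i, i, i) - y_hardware_model(i, i, i));
        if (diff > worst)
            worst = diff;
    }
    return worst;
}
```

Calling `max_abs_error(64)` reports the worst-case deviation over the 64 patterns, which gives a tolerance to use when comparing simulation output against the reference.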
Quartus Parameter Settings
Data Analysis
Pre_allocation    Post_allocation
The results show that after register sharing, total logic elements drops from 125 to 91.
Pre_allocation Synthesis Analysis
The Xilinx synthesis result uses 3 multipliers and 4 adder/subtractors.
[Figure labels: adders, multipliers, State Machine]
Post_allocation Synthesis Analysis
The Xilinx synthesis result uses 2 multipliers and 1 adder/subtractor.
[Figure labels: multipliers, adder/subtractor, Mux1, Mux2, Mux3, Mux4, State Machine]
Post Sim
The Post_sim results also match expectations.