Notes on Developing a Floating-Point ALU

tech, 2025-08-13

For the last two months I have been working on a project for my lab, where I am responsible for designing a floating-point ALU module in Verilog. Because the work is confidential, I can't publish the original code right now (or maybe ever). But the bugs I fixed and the way I did the testing are worth sharing.

Important Mistakes and Lessons

1. Converting two's complement to true form

Suppose there is a w-bit number in true form (sign-magnitude). You can convert it to two's complement without any worry, because two's complement can represent one more value than true form with the same number of bits: the bit pattern that means -0 in true form is interpreted as -2^(w - 1) in two's complement.

But before you convert a two's complement number to true form, ***please do check whether it is the minimum number in two's complement!!!*** That value, -2^(w - 1), has no w-bit representation in true form.
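As a minimal sketch of that check, here is how the two conversions could look in Python on bit strings. The function names are my own for illustration and are not taken from the project code; the width w is assumed to be at least 2.

```python
def twos_complement_to_true_form(bits: str) -> str:
    """Convert a w-bit two's complement string to a w-bit true-form string."""
    w = len(bits)
    value = int(bits, 2)
    if bits[0] == '1':
        value -= 1 << w  # reinterpret the pattern as a signed value
    if value == -(1 << (w - 1)):
        # The minimum value has no w-bit true-form representation
        raise OverflowError(f"{bits} is the minimum value, it needs {w + 1} bits in true form")
    sign = '1' if value < 0 else '0'
    return sign + format(abs(value), f'0{w - 1}b')


def true_form_to_twos_complement(bits: str) -> str:
    """Convert a w-bit true-form string to two's complement (always safe)."""
    w = len(bits)
    magnitude = int(bits[1:], 2)
    value = -magnitude if bits[0] == '1' else magnitude
    # Masking with 2^w - 1 yields the two's complement bit pattern
    return format(value & ((1 << w) - 1), f'0{w}b')
```

In hardware the same guard means either widening the result by one bit or handling the minimum value as a special case before negating.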

2. Truncating in true form and two's complement

The ALU has only a limited number of bits for storing the intermediate result, and rounding and normalization are handled in another stage of the pipeline. So I need to truncate the intermediate result before rounding.

The difference between truncating in the two forms is quite obvious once you sit down and think about it, but at first I thought that storing two more bits than the final result needs would avoid all the errors. Well, it doesn't. At least not all of them.

| Different stages | True form | Two's complement |
| --- | --- | --- |
| 5-bit original number | 1.1001 | 1.0111 |
| Truncate to 4 bits | 1.100 | 1.011 |
| Decimal result | -0.5 | -0.625 |

Clearly, truncating a negative number in true form can make it greater (it moves toward zero), while truncating it in two's complement makes it smaller (it moves toward negative infinity). This can lead to a mistake when you convert the truncated two's complement value to true form and then do rounding or other operations. Storing more bits during truncation only reduces the chance of hitting this bug. You should always take care of this situation.
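The numbers in the table above can be reproduced with a few lines of Python. This is only an illustrative sketch: the "sign.fraction" string format and the helper names are assumptions of mine, not part of the ALU code.

```python
from fractions import Fraction


def true_form_value(bits: str) -> Fraction:
    """Value of a 'sign.fraction' string read as true form (sign-magnitude)."""
    sign, frac = bits.split('.')
    magnitude = Fraction(int(frac, 2), 1 << len(frac))
    return -magnitude if sign == '1' else magnitude


def twos_complement_value(bits: str) -> Fraction:
    """Value of a 'sign.fraction' string read as two's complement."""
    sign, frac = bits.split('.')
    # The sign bit has weight -1, the fraction bits have positive weights
    return Fraction(int(frac, 2), 1 << len(frac)) - int(sign)


def truncate(bits: str, total_bits: int) -> str:
    """Keep the sign bit and the top (total_bits - 1) fraction bits."""
    return bits[:total_bits + 1]  # +1 accounts for the '.' separator


# Both 5-bit patterns encode -0.5625 before truncation
print(float(true_form_value("1.1001")), float(twos_complement_value("1.0111")))
# After truncation to 4 bits they disagree
print(float(true_form_value(truncate("1.1001", 4))))
print(float(twos_complement_value(truncate("1.0111", 4))))
```

The two encodings start from the same value and end up one unit in the last place apart, which is exactly the mismatch that later shows up after converting back to true form.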

3. Other binding bugs in Verilog

You'd better make a clear table that tells what each bit in a vector represents. This can save you tons of time in debugging.
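One way to keep such a table honest is to write it down as data and let a script check it. The sketch below is hypothetical: the field names and the 16-bit layout (the IEEE 754 half-precision split of 1 sign, 5 exponent and 10 fraction bits) are chosen only for illustration and are not the layout of my ALU.

```python
# Hypothetical bit map of a 16-bit vector: (field name, high bit, low bit)
FIELDS = [
    ("sign", 15, 15),
    ("exp", 14, 10),
    ("fra", 9, 0),
]


def layout_is_complete(fields, total_bits: int) -> bool:
    """Check that every bit of the vector belongs to exactly one field."""
    covered = [0] * total_bits
    for _, hi, lo in fields:
        for bit in range(lo, hi + 1):
            covered[bit] += 1
    return all(count == 1 for count in covered)


def unpack(word: int, fields=FIELDS) -> dict:
    """Split an integer into named fields according to the bit map."""
    result = {}
    for name, hi, lo in fields:
        width = hi - lo + 1
        result[name] = (word >> lo) & ((1 << width) - 1)
    return result


print(layout_is_complete(FIELDS, 16))
print(unpack(0xC000))
```

The same list can be pasted as a comment next to the Verilog declaration, so the table and the code stay side by side.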

Testing (using shell and Python)

Testing should include both random-data tests and corner-case tests.

Corner case test

Corner cases should cover every typical case of input and output. It is a good idea to store all the test cases in a separate file and keep enriching it while you debug.
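As a starting point for such a file, extreme field values can be combined mechanically. This is a rough sketch under my own assumptions: operands are written as binary strings of sign, exponent and fraction, and each line holds two whitespace-separated operands. The real file format of the project may differ.

```python
from itertools import product


def corner_patterns(exp_bits: int, fra_bits: int) -> list:
    """Bit strings (sign + exponent + fraction) built from extreme field values."""
    exps = sorted({0, 1, (1 << exp_bits) - 2, (1 << exp_bits) - 1})
    fras = sorted({0, 1, (1 << fra_bits) - 1})
    patterns = []
    for sign, exp, fra in product("01", exps, fras):
        patterns.append(sign + format(exp, f"0{exp_bits}b") + format(fra, f"0{fra_bits}b"))
    return patterns


def corner_case_lines(exp_bits: int, fra_bits: int) -> list:
    """One line per operand pair: first operand, a space, second operand."""
    patterns = corner_patterns(exp_bits, fra_bits)
    return [f"{a} {b}" for a, b in product(patterns, repeat=2)]


lines = corner_case_lines(exp_bits=5, fra_bits=10)
print(len(lines))
print(lines[0])
```

Cases found by hand during debugging still need to be appended to the file; the generated pairs only cover the obvious extremes.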

Random data test

This part is also important. It is almost inevitable that you will forget some quirky cases that make your code behave differently from what you intended. I found the truncation bug at around the 500th round of random testing.
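Since a bug may only show up after hundreds of rounds, it helps if a failing round can be replayed. The sketch below is an assumption of mine rather than the project's generator: it draws operand pairs from a seeded random generator, so logging the seed of each round is enough to reproduce it. The output order (all first operands, then all second operands) mirrors how the test script builds the input file.

```python
import random


def random_operand_pairs(count: int, total_bits: int, seed: int) -> list:
    """Generate `count` pairs of random bit strings from a seeded generator."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(count):
        a = format(rng.getrandbits(total_bits), f"0{total_bits}b")
        b = format(rng.getrandbits(total_bits), f"0{total_bits}b")
        pairs.append((a, b))
    return pairs


def input_file_lines(pairs: list) -> list:
    """All first operands followed by all second operands, one per line."""
    return [a for a, _ in pairs] + [b for _, b in pairs]


pairs = random_operand_pairs(count=4, total_bits=16, seed=2025)
print("\n".join(input_file_lines(pairs)))
```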

Reference data

To generate the reference data for my ALU, I use Python to simulate the ALU. When the outputs differ, you should check both the Verilog code and the Python code, since either one may be wrong.

A glimpse of the auto test script and Python simulation code

Auto test:

```bash
#!/bin/bash

# Select the number of random test rounds here
times=300
# --------

rm -f DIFF_*.dat

RAWFILENAME1="RawTestData_f.dat"
NEWINPUTDATA1="Input_f.dat"
RAWFILENAME2="RawTestData_h.dat"
NEWINPUTDATA2="Input_h.dat"

# Print fields 1 to 12 (every result except the sum) of rows that have a second field
results() {
    awk -F'\t' '$2 != "" { for (i = 1; i <= 12; i++) printf "%s\t", $i; printf "\n" }' "$1"
}

# Append mismatches between the Verilog output ($1) and the reference data ($2) to the log ($3)
compare() {
    diff <(results "$1") <(results "$2") >> "$3"
    # The sum is field 13 of the last output line and field 1 of the last reference line
    diff <(awk -F'\t' 'END { print $13 }' "$1") <(awk -F'\t' 'END { print $1 }' "$2") >> "$3"
}

# Corner case test: write the first column, then the second column, as inputs
awk '{ print $1 }' corner_case_f.dat >  "$NEWINPUTDATA1"
awk '{ print $2 }' corner_case_f.dat >> "$NEWINPUTDATA1"
awk '{ print $1 }' corner_case_h.dat >  "$NEWINPUTDATA2"
awk '{ print $2 }' corner_case_h.dat >> "$NEWINPUTDATA2"

iverilog -o test_tb.out test_tb.v
./test_tb.out
python3 ./Reference_data.py -h
python3 ./Reference_data.py -f

compare Output_f.dat "$RAWFILENAME1" DIFF_f.dat
compare Output_h.dat "$RAWFILENAME2" DIFF_h.dat

# Random test cases
while (( times > 0 )); do
    rm -f "$RAWFILENAME1"
    python3 ./Generate_data.py -f
    # Strip the "name:" labels, then extract the two operand columns
    sed -E 's/[a-z_A-Z]+://g' "$RAWFILENAME1" | awk -F'\t' '$2 != "" { print $1 }' >  "$NEWINPUTDATA1"
    sed -E 's/[a-z_A-Z]+://g' "$RAWFILENAME1" | awk -F'\t' '$2 != "" { print $2 }' >> "$NEWINPUTDATA1"

    rm -f "$RAWFILENAME2"
    python3 ./Generate_data.py -h
    sed -E 's/[a-z_A-Z]+://g' "$RAWFILENAME2" | awk -F'\t' '$2 != "" { print $1 }' >  "$NEWINPUTDATA2"
    sed -E 's/[a-z_A-Z]+://g' "$RAWFILENAME2" | awk -F'\t' '$2 != "" { print $2 }' >> "$NEWINPUTDATA2"

    iverilog -o test_tb.out test_tb.v
    ./test_tb.out

    # Remember the log lengths so that new mismatches can be detected
    pre_len_1=$(awk 'END { print NR }' DIFF_f.dat)
    pre_len_2=$(awk 'END { print NR }' DIFF_h.dat)

    compare Output_f.dat "$RAWFILENAME1" DIFF_f.dat
    compare Output_h.dat "$RAWFILENAME2" DIFF_h.dat

    if (( $(awk 'END { print NR }' DIFF_f.dat) != pre_len_1 || $(awk 'END { print NR }' DIFF_h.dat) != pre_len_2 )); then
        echo "Error detected!!"
        break
    fi

    echo "${times} tests remaining"
    (( times = times - 1 ))
done
```

Python simulation:

```python
class FLOAT:
    # The constructor and the other methods are not shown in this excerpt.

    @staticmethod
    def SUM(O_L_A):
        # Extend to 20 bits or 40 bits to calculate
        ori_len = len(O_L_A[0].fra)
        width = 20 if ori_len == 12 else 40
        # Leading-one index below which the result must be renormalized (8 or 11)
        lead = 8 if ori_len == 12 else 11

        # Copy the operands so that the inputs are not modified
        L_A = []
        for src in O_L_A:
            A = FLOAT("".zfill(width))
            A.sign = src.sign
            A.exp = src.exp
            A.fra = src.fra
            L_A.append(A)

        max_exp = -1
        for A in L_A:
            if A.exp > max_exp:
                max_exp = A.exp

        res = FLOAT("".zfill(width))
        res.exp = max_exp

        # Align every operand to the largest exponent and drop the shifted-out bits
        for A in L_A:
            while A.exp < max_exp:
                A.exp += 1
                A.fra = "0" + A.fra
            A.fra = A.fra[0:ori_len].rjust(width, '0')

        # Add the operands as signed integers
        ans = 0
        for A in L_A:
            magnitude = int(A.fra, 2)
            ans += magnitude if A.sign == '0' else -magnitude

        res.sign = '0' if ans >= 0 else '1'
        if ans == 0:
            res.exp = 0
            res.fra = "".zfill(ori_len)

        # True form of the sum, padded to the working width
        ans_bin = bin(abs(ans))[2:].rjust(width, '0')
        # complement = FLOAT.T2C(res.sign + ans.bin)

        # Index of the leading 1 (stays at `width` when the sum is zero)
        first1 = width
        for i in range(0, width):
            if ans_bin[i] == '1':
                first1 = i
                break

        if first1 < lead:
            res.exp = res.exp + lead - first1
            # Simulate the error caused by truncating the two's complement
            # when the truncated part is not all zero
            if res.sign == '1' and first1 < lead - 1:
                ans_bin = bin(int(ans_bin, 2) + int(ans_bin[first1 + ori_len:].rjust(width, '0'), 2))[2:]
                if len(ans_bin) < width:
                    ans_bin = ans_bin.rjust(width, '0')
                if len(ans_bin) > width:
                    ans_bin = ans_bin[0:width]
                    res.exp += 1
            res.fra = ans_bin[first1:first1 + ori_len]
        else:
            res.fra = ans_bin[lead:]
        return res
```