unexpected VCD dump value


I'm at an early stage of learning SystemC, so please forgive me if this is a stupid question.

The code below is a fixed-point MAC for an FIR filter.

The MAC class is called from the sym_hbf_2x class.

The latter is called from tx_fir_0_1_4, which in turn is called from the tx_top CTHREAD process().

When I dump the values into a VCD file, the index "i" stays stuck at 4.

But the other internal signals (like *_out) show that the for loop is running.

How can I dump the accurate states of the internal signals into the VCD file?

template<int in_bw, int out_bw, int tap, int coeff_bw, int mul_trunc, int acc_trunc>
class sym_fir_mac{
public:
    static const int mul_bw = in_bw + coeff_bw - mul_trunc;
    static const int acc_bw = mul_bw + acc_trunc;
    //This line doesn't work
    //Cannot cast double to const int?
    //        const int acc_trunc = ceil(log2(tap));
    sc_uint<3>                                    i;
    sc_int<in_bw+1>                             data;
    sc_fixed<mul_bw, mul_bw, SC_TRN, SC_SAT>     mul_out;
    sc_fixed<acc_bw, acc_bw, SC_TRN, SC_SAT>     acc_out;
    sc_int<out_bw>                                 mac_out;

    sc_int<TX_BITS> inst(sc_int<in_bw> *reg, sc_int<coeff_bw> *coeff) {
        acc_out = 0; // clear the accumulator so sums don't carry over between calls
        for(i = 0; i<tap; i++) {
            data = reg[i] + reg[2*tap-1-i];
            mul_out = ((data * coeff[i]) >> mul_trunc);
            acc_out = acc_out + mul_out;
        }
        mac_out = (acc_out >> acc_trunc);
        return mac_out;
    }
};
  
  
template<int in_bw, int out_bw, int tap, int coeff_bw, int mul_trunc, int acc_trunc>
class sym_hbf_2x {
public:
	sc_bit   			state;
	sc_int<TX_BITS>		latched_data_out;
	sym_fir_mac<16,16,tap,coeff_bw,mul_trunc, acc_trunc> 		mac;
	sc_logic				busy_out;
	sc_int<TX_BITS>		data_out;

	void reset() {
		state = 0;
		data_out = 0;
		latched_data_out = 0;
		busy_out = 0;
	}

	sc_int<TX_BITS> inst(sc_int<16> *data_in, sc_int<coeff_bw> *coef, sc_logic busy_in) {
		if (busy_in == 0) {
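			switch(state) {
			case 0 :
				data_out = mac.inst(data_in, coef);
				break; // without this, case 0 falls through and case 1 overwrites the MAC result
			case 1 :
				data_out = data_in[2*tap]; // data_in must hold at least 2*tap+1 elements for this index
				break;
			}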
			latched_data_out = data_out;
		}
		else {
			data_out = latched_data_out; // restore the data
			busy_out = 1;
		}
		return data_out;
	}
};
  
class tx_fir_0_1_4 {
public:
	sym_hbf_2x<16,16,4,20,13,2>  fir0;

	sc_int<16>	fir0_data_in[8];
	sc_int<20>	fir0_coeff[4];	// used in reset() and inst(); declaration was missing from the snippet
	sc_int<16>	fir0_data_out;
	sc_logic	fir0_busy_out;

      public:
	void reset() {
        cout << "tx top:" << "start reset tx_top" <<endl;
		fir0.reset();

		fir0_coeff[0] = -234;
		fir0_coeff[1] = 1320;
		fir0_coeff[2] = -4706;
		fir0_coeff[3] = 199997;
	}

	sc_int<TX_BITS> inst(sc_int<16> *data_in, sc_logic busy_in) {
        cout << "tx top:" << "start inst tx_top" <<endl;
		if(fir0.busy_out == 0) {
			for (int i=7; i>0; i--) {
				fir0_data_in[i] = fir0_data_in[i-1];
			}
			fir0_data_in[0] = *data_in;
		}

		fir0_data_out = fir0.inst(fir0_data_in, fir0_coeff, busy_in);
		return fir0_data_out;
	}
};
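On the commented-out line in sym_fir_mac: the value feeds acc_bw, which is used as a template argument, so it has to be a compile-time constant, and ceil() and log2() are not guaranteed to be usable in constant expressions in the C++ standards commonly used with SystemC. The name also clashes with the template parameter acc_trunc. A minimal sketch of one way around it, assuming C++11 constexpr is available; ceil_log2 and mac_widths are hypothetical names, not part of the original code:

```cpp
#include <cassert>

// Hypothetical helper (not in the original code): smallest k such that
// (1 << k) >= n, computed with integers only so it can run at compile time.
// Returns 0 for n <= 1.
constexpr int ceil_log2(unsigned n) {
    return (n <= 1) ? 0 : 1 + ceil_log2((n + 1) / 2);
}

// Hypothetical stand-in for the width calculation in sym_fir_mac:
// the number of accumulator guard bits is derived from the tap count.
template <int tap>
struct mac_widths {
    static const int acc_trunc = ceil_log2(tap);
};

// Checked at compile time: 4 taps need 2 guard bits, 5 taps need 3.
static_assert(mac_widths<4>::acc_trunc == 2, "4 taps -> 2 guard bits");
static_assert(mac_widths<5>::acc_trunc == 3, "5 taps -> 3 guard bits");
```

With something like this, acc_trunc would no longer need to be passed in as a separate template parameter, at the cost of fixing the rule for how it is derived from tap.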

 

[Image: screenshot of the VCD trace]


Hello @zidane,

How is the inst method called?

41 minutes ago, zidane said:

sc_int<TX_BITS> inst(sc_int<in_bw> *reg, sc_int<coeff_bw> *coeff) {

What processes are you using (SC_[C]THREAD/SC_METHOD)?

In the "inst" method, the member variable i runs from 0 up to the value of tap within a single call, so the SystemC kernel only sees its final value of 4, as is evident in your VCD trace.

It would be better if you can share a minimal example of what you are trying to achieve.

Hope this helps.

Regards,

Ameya Vikram Singh


Hello @zidane,

Instead of editing the original post to add the details, it would have been better to post a new comment, so that the new information can be read on its own.

Currently it's difficult to judge what changed between your original post and the edited one.

But at a high level, I see that no SystemC processes (i.e. SC_THREAD/SC_METHOD) are involved in your model.

Also, I don't see any interface definition for your FIR module, i.e. input/output (I/O) port declarations.

I would recommend going through some of the example designs that are included with the SystemC sources.

It would also help to go through a reference book to understand the concepts of modeling with SystemC.

Regards,

Ameya Vikram Singh


Hi @AmeyaVS

I didn't paste all of the code (like the I/O and SC_THREAD) here, otherwise this might become a lengthy thread.

But I think the code is structurally OK, or the SystemC simulation wouldn't run.

More about the SC_THREAD:

The MAC class is called from the sym_hbf_2x class.

The latter is called from tx_fir_0_1_4, which is called from the tx_top CTHREAD process().


A running simulation is no indication that your model follows established SystemC coding practices. The suggestions by @AmeyaVS are all valid. The code snippets you provided currently use only SystemC data types; they don't define a module class with a port interface and processes. Without a minimal, self-contained, and executable example that exposes your issue, you are making it difficult for others to give you good feedback. Instead of pasting all the code, you can also attach a ZIP archive and limit the code snippets to the parts you think are relevant to your problem. I recommend reading a good introductory book on SystemC to get familiar with it and its associated modelling methodologies.


Hello @zidane,

I would recommend posting the smallest example you can come up with that reproduces the problem you are facing, so that we can give you feedback effectively.

You can use https://edaplayground.com/ to share the minimal example or, as suggested by Dr. @maehne, attach it as a ZIP archive to your post in the forum.

You can also look through older posts in the forum that discuss similar issues.

Regards,

Ameya Vikram Singh


Hi @maehne and @AmeyaVS

Thanks for the follow-up.

Here are the code snippets covering the I/O.

I have also attached the full code.

system.cc

    SC_CTOR(SYSTEM) :
        clk("clk", 5, SC_NS, 0.5, 0, SC_NS, true) {
        fir_filter1 = new tx_top("fir_filter");
        fir_filter1->clk(clk);
        fir_filter1->en(en);
        fir_filter1->reset_n(reset_n);
        fir_filter1->data_in(input_data);
        fir_filter1->data_out(output_data);
        ...
    }

tx_top.h

SC_MODULE(tx_top)
{

    sc_in <bool>                            clk;
    sc_in <bool>                            reset_n;
    sc_in <bool>                            en;
    sc_in  < sc_int<16> >    data_in;
    sc_out < sc_int<16> >    data_out;

    sc_int<16>                                fir_in;
    sc_bit                                    fir_en;
    sc_int<16>                              fir_out;
    tx_fir_0_1_4                             fir_top;

    void process();

    SC_CTOR(tx_top):
        clk("clk"),
        reset_n("reset_n"),
        en("en"),
        data_in("data_in"),
        data_out("data_out")
    {
        SC_CTHREAD(process, clk.pos());
        async_reset_signal_is(reset_n, false);

    }
};

tx_top.cpp

void tx_top:: process() {
    { 
        HLS_DEFINE_PROTOCOL("Reset");
        data_out.write(0);
        fir_top.reset();
        wait();
    }

    while(true) {
        {
        HLS_DEFINE_PROTOCOL("INPUT");
        HLS_DEFINE_PROTOCOL("HLS_ASSUME_STABLE");
        fir_in=data_in.read();
        fir_en=en.read();
        }

        {
        if(fir_en==1) {
            fir_out = fir_top.inst( &fir_in, (sc_logic) false );
            fout_fir0_out << fir_top.fir0.busy_out << endl;
        }
        else {
            fir_out = 0;
        }
        }

        {
        HLS_DEFINE_PROTOCOL("OUTPUT");
        data_out.write(fir_out);
        wait();
        }
    }
}


Your code snippet of tx_top::process() confirms that it gets activated once per rising clock edge and then waits until the next rising clock edge. All code that gets executed in your while loop (including the function calls) gets executed in the same delta cycle. It's important to be aware that tracing of signals and variables does not happen upon assignment to them, but as part of the simulation cycle, i.e., a new trace value only gets recorded once there are no more events to process for the current time (because all signals have stabilised). After that, the simulator advances time to the next moment when an event occurs. This explains why you observe in your VCD trace only the final values that your traced internal variables hold at the end of your function executions. So, @AmeyaVS's hypothesis in his first reply was right on point!
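To make the sampling behaviour concrete, here is a toy model in plain C++. It is not SystemC and does not reproduce the real kernel; ToyTracer, run_one_activation and simulate are hypothetical names used only for illustration. The tracer records the watched variable once per completed time step, after the process activation has returned:

```cpp
#include <cassert>
#include <vector>

// Toy model, not SystemC: a "tracer" that samples a variable only when the
// simulated time step ends, mimicking when a VCD trace value gets recorded.
struct ToyTracer {
    const int* watched;        // variable being traced
    std::vector<int> samples;  // one entry per completed time step
    void end_of_time_step() { samples.push_back(*watched); }
};

// One activation of a clocked process: the whole loop runs without
// yielding, so nothing can observe the intermediate values of i.
inline void run_one_activation(int& i, int tap) {
    for (i = 0; i < tap; i++) {
        // the MAC work would happen here
    }
}

// Simulate `cycles` clock edges and return what the trace would contain.
inline std::vector<int> simulate(int cycles, int tap) {
    assert(cycles >= 0 && tap >= 0);
    int i = 0;
    ToyTracer tracer{&i, {}};
    for (int c = 0; c < cycles; ++c) {
        run_one_activation(i, tap);
        tracer.end_of_time_step();  // sampled once, after the loop has finished
    }
    return tracer.samples;
}
```

Calling simulate(3, 4) yields three samples that are all 4, which matches the "stuck" value in the trace: the loop did run, but every sample was taken after it had finished.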

If you want to trace your algorithm execution within a delta cycle, you are on your own. One option is to set a breakpoint on the respective function and step through it while monitoring the evolution of the variable values. Another option is to output the values at strategic points in your code to some output stream.
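The second option could look roughly like the sketch below. It is plain C++ with built-in integer types standing in for the SystemC types, and mac_with_log, mac_log_as_string and the log format are assumptions made for illustration, not part of the original code. Every loop iteration writes its intermediate values to a stream passed in by the caller:

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Plain-C++ stand-in for the MAC loop (no truncation or saturation).
// `log` is a hypothetical debug stream, e.g. std::cout or a std::ofstream.
inline std::int64_t mac_with_log(const int* reg, const int* coeff, int tap,
                                 std::ostream& log) {
    assert(reg != nullptr && coeff != nullptr && tap > 0);
    std::int64_t acc = 0;
    for (int i = 0; i < tap; ++i) {
        // Symmetric FIR: add the mirrored samples before multiplying.
        const std::int64_t data =
            static_cast<std::int64_t>(reg[i]) + reg[2 * tap - 1 - i];
        acc += data * coeff[i];
        // One line per iteration, so each intermediate value is visible.
        log << "i=" << i << " data=" << data << " acc=" << acc << '\n';
    }
    return acc;
}

// Convenience wrapper: capture the log in a string, e.g. to compare runs.
inline std::string mac_log_as_string(const int* reg, const int* coeff, int tap) {
    std::ostringstream os;
    mac_with_log(reg, coeff, tap, os);
    return os.str();
}
```

In the actual SystemC model, the current simulation time from sc_time_stamp() could be written on the same line, so that the log can be lined up with the VCD trace.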


Thanks @maehne, I think I understand the issue now.

I expect the algorithm to run as a pipeline with II = 1.

I think I can add that setting in a tool like Stratus, but how can I achieve the same thing in simulation alone?

(Currently, I'm using Eclipse.)

Update:

I tried using wait() in the loop.

It seems to work, as "i" is now updated at each clock cycle.

Will this approach affect Stratus scheduling and optimization?


If you add the wait() to the inner for loop, the algorithm will be implemented using an FSMD architecture. Each sample will then take 4 clock cycles to get processed, so a new sample can only be accepted every 4 clock cycles as well. If you want your algorithm to be pipelined, your loop needs to be unrolled. Depending on your HLS tool and your coding style, the synthesizer might infer the pipeline automatically, require a hint in the form of a pragma, or you'll have to rewrite your model.

As with classic HDLs, it helps to first imagine the structure and behaviour of the hardware you want to implement and then try to express it in code following the recommended coding styles of your tool. A for loop with a fixed number of iterations can behave like a for generate statement in VHDL. Then, i can simply be of type int, because it is just an index and not a transient value kept in a register.
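As a rough illustration of that coding style, here is a plain C++ sketch with built-in integer types in place of the SystemC ones and without truncation or saturation. The name sym_mac and its signature are assumptions for illustration. The trip count is a template constant, the index is a plain int, and the accumulator is a local variable that starts from zero for every sample:

```cpp
#include <cassert>
#include <cstdint>

// TAP is a compile-time constant, so the loop has a fixed trip count that a
// tool can unroll. Whether and how it gets unrolled depends on the HLS tool.
template <int TAP>
std::int64_t sym_mac(const std::int32_t (&reg)[2 * TAP],
                     const std::int32_t (&coeff)[TAP]) {
    static_assert(TAP > 0, "at least one tap is required");
    std::int64_t acc = 0;  // local, so nothing carries over between samples
    for (int i = 0; i < TAP; ++i) {  // i is only an index, not a register
        // Symmetric FIR: add the mirrored samples before multiplying.
        const std::int64_t data =
            static_cast<std::int64_t>(reg[i]) + reg[2 * TAP - 1 - i];
        acc += data * coeff[i];
    }
    return acc;
}
```

For example, with reg holding 1 to 8 and coeff holding 1 to 4, each mirrored pair sums to 9 and sym_mac<4>(reg, coeff) returns 90.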
