Is there any better way to define sc_out port of arrays?


Posted

I am new to SystemC. I'm writing a regfile module, a vector register file that outputs a whole array at a time. The code below is how I do it now; it needs a for loop to copy the data on every update, which I think will slow down the simulation.
Is there a better way to implement this that avoids the for loop? I searched the web and found that an sc_module cannot output a pointer.

SC_MODULE(regfile)
{
    sc_in<sc_uint<5>> rsv1_addr;

    sc_out<sc_int<32>> rsv1_data[8];

    void read_vector();

    SC_CTOR(regfile)
    {
        SC_THREAD(read_vector);
    }

private:
    sc_int<32> v_regfile[32][8];    // suppose data stored in it
};

void regfile::read_vector()
{
    for (int i = 0; i < 8; i++)
        rsv1_data[i] = v_regfile[rsv1_addr.read()][i];
    while ((true))
    {
        wait(rsv1_addr.value_changed_event());
        for (int i = 0; i < num_thread; i++)
            rsv1_data[i] = v_regfile[rsv1_addr.read()][i];
    }
}

 

Posted

You should stick to modern C++. Beyond that, SystemC provides sc_vector, which handles sc_object-based instances better than C arrays do (e.g. when binding ports):

SC_MODULE(regfile)
{
    using reg_t = sc_int<32>;
    sc_in<sc_uint<5>> rsv1_addr{"rsv1_addr"};

    sc_vector<sc_out<reg_t>> rsv1_data{"rsv1_data", 8};

    void read_vector();

    SC_CTOR(regfile)
    {
        SC_THREAD(read_vector);
    }

private:
    using regfile_t = std::array<reg_t, 8>;    // needs #include <array>
    std::array<regfile_t, 32> v_regfile;    // suppose data stored in it
};

void regfile::read_vector()
{
    auto& elem1 = v_regfile[rsv1_addr.read()];
    for (int i = 0; i < 8; i++)
        rsv1_data[i] = elem1[i];
    while ((true))
    {
        wait(rsv1_addr.value_changed_event());
        auto& elem2 = v_regfile[rsv1_addr.read()];
        for (int i = 0; i < num_thread; i++)
            rsv1_data[i] = elem2[i];
    }
}

A few more notes:

  • If this piece of code is part of a header file: please do not use 'using namespace sc_core;' in it. This will lead to problems when the code is later used in larger projects.
  • Name your signals and ports. This greatly helps when debugging your design with waveform traces.
  • Use C++ standard library data structures (like std::array or std::vector), as they allow for range checking when built in debug mode.

Other than that the code is fine. You cannot avoid the loop-based assignment, because each assignment implies a type conversion (the v_regfile elements are sc_int while the rsv1_data elements are sc_port<...>). Even if there were a convenience function, it would still run a for loop internally. On the other hand, with a contiguous memory layout (which std::array guarantees), the caching and branch prediction of your processor kick in and largely compensate for the loop.

BTW, num_thread is not declared. Make sure it is never larger than 8; otherwise you might run into an out-of-bounds access.
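
To illustrate the range-checking note and the num_thread remark, here is a minimal plain C++ sketch. It assumes int as a stand-in for sc_int<32> so that it builds without SystemC; read_lane, read_lane_debug, vregs_t and the value chosen for num_thread are hypothetical and not part of SystemC or the original code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>

using reg_t = int;                           // stand-in for sc_int<32> (assumption)
using regfile_t = std::array<reg_t, 8>;      // one vector register, named as in the post
using vregs_t = std::array<regfile_t, 32>;   // the whole register file (hypothetical name)

// Always-checked read: at() throws std::out_of_range on a bad index.
bool read_lane(const vregs_t& rf, std::size_t reg, std::size_t lane, reg_t& out)
{
    try {
        out = rf.at(reg).at(lane);
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}

// Debug-only check: the assert is compiled out when NDEBUG is defined.
reg_t read_lane_debug(const vregs_t& rf, std::size_t reg, std::size_t lane)
{
    assert(reg < rf.size() && lane < rf[reg].size());
    return rf[reg][lane];
}

// Compile-time guard for the loop bound (the value 8 is an assumption).
constexpr std::size_t num_thread = 8;
static_assert(num_thread <= std::tuple_size<regfile_t>::value,
              "num_thread must not exceed the number of lanes");
```

With at(), a bad index becomes an exception in every build type, while the assert variant costs nothing in release builds. The static_assert turns a too-large loop bound into a compile error instead of a runtime out-of-bounds access.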

Posted
Great answer; it clarified a lot of syntax details for me. Thank you very much!

Posted
Hi,

I have another question. I'm currently trying to write a small processor, and I want to use a higher level of abstraction. To enable event communication between modules, I defined my own interface, as shown in the following code. Is this a good way to write it? Is there a similar facility in TLM?

Thanks a lot again.

Here's the code:

class event_if : virtual public sc_interface
{
public:
    virtual const sc_event &obtain_event() const = 0;
    virtual void notify() = 0;
};
class event : public sc_module, public event_if
{
public:
    event(sc_module_name _name) : sc_module(_name) {}
    const sc_event &obtain_event() const { return self_event; }
    void notify() { self_event.notify(); }
private:
    sc_event self_event;
};

So to declare a port that such an event channel can be bound to, just type:

sc_port<event_if> myevent;

To get the event and wait for a trigger:

wait(myevent->obtain_event());

To notify an event:

myevent->notify();

 

Posted

Basically this would work, but usually you want to transport data along with the event, and at that point you are effectively using a signal. If you communicate the values separately, keep in mind that the receiving side of the channel is invoked one delta cycle later, and the sending side may also be invoked in that same delta cycle due to some other notification. In this case it is left to the kernel which function is invoked first: you have a classic race condition.

In my experience a better approach is to write the processor model (instruction set simulator, ISS) in C++ and wrap it in a SystemC module that adds the SystemC-specific parts. This enables more versatile uses and a higher simulation speed: each event and its associated context switch costs time. One example can be found here: https://github.com/Minres/DBT-RISE-RISCV/ This contains an ISS for the RISC-V ISA (written in C++) with a single sc_module being the wrapper around the ISS (https://github.com/Minres/DBT-RISE-RISCV/tree/master/src/sysc).
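
As a rough sketch of that separation, the ISS below is plain C++ and knows nothing about SystemC. The toy ISA, the cycle costs and all names (ToyIss, Instr, Op) are made up for illustration and are not taken from DBT-RISE-RISCV. Each step() reports how many cycles the instruction took, so a wrapper decides how to turn that number into simulated time.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical toy ISA, only for illustration.
enum class Op { Addi, Mul, Halt };

struct Instr { Op op; int rd; int rs; std::int32_t imm; };

class ToyIss {
public:
    explicit ToyIss(std::vector<Instr> program) : prog_(std::move(program)) {}

    // Executes one instruction and returns its cycle cost (0 once halted).
    unsigned step()
    {
        if (halted_ || pc_ >= prog_.size()) return 0;
        const Instr& i = prog_[pc_++];
        assert(i.rd >= 0 && i.rd < 8 && i.rs >= 0 && i.rs < 8);
        switch (i.op) {
        case Op::Addi: regs_[i.rd] = regs_[i.rs] + i.imm; return 1;
        case Op::Mul:  regs_[i.rd] = regs_[i.rs] * i.imm; return 3;  // assumed multi-cycle
        case Op::Halt: halted_ = true; return 1;
        }
        return 0;
    }

    std::int32_t reg(int idx) const { return regs_.at(static_cast<std::size_t>(idx)); }
    bool halted() const { return halted_; }

private:
    std::vector<Instr> prog_;
    std::array<std::int32_t, 8> regs_{};
    std::size_t pc_ = 0;
    bool halted_ = false;
};

// A SystemC wrapper (not shown) could call step() from a single SC_THREAD and
// advance simulated time by the returned cycle count, e.g. with one wait()
// per instruction instead of one event per pipeline stage.
```

Because the timing lives in the return value of step(), the same ISS can run untimed in a unit test or be driven by one SystemC thread, which keeps the number of events and context switches low.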

Posted (edited)
In reply to Eyck's post of 12/20/2022 at 3:08 PM:

Hi,

Sorry, my last question was a bit unclear. I recently caught Omicron and haven't had much energy for a while.
I read your code and now have some understanding of its structure; it is indeed a good reference. As I understand it, arch_if is the most basic architecture interface, riscv_hart_msu_vp inherits from it as a configurable instruction architecture, and the result is finally encapsulated by core_wrapper and core_complex.
But I could not find where the actual CPU execution code is. I would like my simulator to model timing at the cycle level, where an instruction does not always complete in one cycle but takes several cycles depending on its characteristics. So may I ask what timing accuracy DBT-RISE provides, and in which file I can find the code for CPU execution?

Edited by MagicLantern
More detailed question
  • 1 month later...
Posted

The behavior can be found in the vm_* classes, e.g. in https://github.com/Minres/DBT-RISE-RISCV/blob/master/src/vm/interp/vm_rv32imac.cpp

The idea here is that the children of arch_if encapsulate the architectural state of the processor, while the classes derived from iss::interp::vm_base<> implement the behavior. The combination of the two forms a processor. This processor is then extended with functionality by using mixins (riscv_hart_msu_vp is a mixin providing the privileged ISA parts of RISC-V).
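
The state/behavior/mixin split can be sketched in a few lines of plain C++. This is only an illustration of the pattern under simplifying assumptions; ToyArch, WithTraps and ToyVm are hypothetical names and do not mirror the actual DBT-RISE class interfaces.

```cpp
#include <cassert>
#include <cstdint>

// Architectural state only (plays the role of an arch_if child).
struct ToyArch {
    std::uint32_t pc = 0;
    std::uint32_t regs[4] = {};
};

// Mixin: layers extra functionality (here, trap handling) on top of any state class.
template <typename Base>
struct WithTraps : Base {
    unsigned traps = 0;
    void enter_trap(std::uint32_t handler) { ++traps; this->pc = handler; }
};

// Behavior: executes instructions against whatever architecture it is given.
template <typename Arch>
class ToyVm {
public:
    explicit ToyVm(Arch& arch) : arch_(arch) {}

    // One made-up instruction: increment regs[0] and advance the pc.
    void step()
    {
        assert(arch_.pc % 4 == 0);  // the toy ISA assumes 4-byte instructions
        arch_.regs[0] += 1;
        arch_.pc += 4;
    }

private:
    Arch& arch_;
};
```

A processor is then the combination `ToyVm<WithTraps<ToyArch>>`: the behavior class never needs to know which mixins were stacked on the state.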
