
How to handle transactions which are longer than BUSWIDTH



Hi All,

We are using convenience sockets (simple initiator/target) in our project, with a few point-to-point buses and simple interconnects, and nb_transport() (using the 4-phase base protocol). What would be the best way of handling transactions that are longer than BUSWIDTH? The 4-phase base protocol does not provide 'intermediate' phases to let us break transactions into multiple beats.

I was unable to glean an answer from the SystemC or TLM examples. Could we modify the 4-phase protocol to add additional phases (for a write transaction) like BEGIN_REQ, BEGIN_NEW_DATA, ... (repeating this phase as many times as needed; data would be transferred in this phase), BEGIN_LAST_DATA (to indicate the last beat)?

Similarly, there would be new phases for a read transaction: BEGIN_RESP_NEW_DATA (repeated) and BEGIN_RESP_LAST_DATA.

Does this sound complicated?

Thanks.

 


Describing particular protocols means extending the base protocol (see also IEEE 1666-2011, section 14.2). There are several ways to do this:

  1. Ignorable phases (IEEE 1666-2011, sections 14.2.1 and 15.2.5):
    here you add intermediate timepoints in between the base-protocol phase timepoints. This is allowed because 'An ignorable phase may be ignored by its recipient.'
  2. Define new protocol traits (IEEE 1666-2011, section 14.2.2):
    you define new, non-ignorable phases, so the implementations are no longer base-protocol-compliant. This way you can only connect models that use the same protocol traits class.

The easiest way would be 1., as it allows reusing a lot of the tlm_utils for the implementation, but 2. is the recommended one. An example of the second approach for AXI/ACE can be found at https://github.com/Arteris-IP/tlm2-interfaces
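For approach 1, TLM-2.0 already ships a macro, DECLARE_EXTENDED_PHASE, for declaring such phases. A minimal sketch of both approaches, reusing the phase names from the question (BEGIN_NEW_DATA and BEGIN_LAST_DATA are hypothetical, not part of the standard):

    #include <tlm>

    // Approach 1: ignorable phases (IEEE 1666-2011, 14.2.1).
    // A base-protocol-compliant target is free to ignore these,
    // so interoperability with unmodified models is preserved.
    DECLARE_EXTENDED_PHASE(BEGIN_NEW_DATA);   // hypothetical intermediate beat
    DECLARE_EXTENDED_PHASE(BEGIN_LAST_DATA);  // hypothetical final beat

    // In the initiator, the extra beats would be signalled between the
    // base-protocol timepoints, e.g.:
    //   tlm::tlm_phase phase = BEGIN_NEW_DATA;
    //   sc_core::sc_time delay = sc_core::SC_ZERO_TIME;
    //   socket->nb_transport_fw(trans, phase, delay);

    // Approach 2: a new protocol traits class (IEEE 1666-2011, 14.2.2).
    // Sockets parameterized with it will only bind to sockets using the
    // same traits class, so mismatches are caught at elaboration.
    struct burst_protocol_types
    {
        typedef tlm::tlm_generic_payload tlm_payload_type;
        typedef tlm::tlm_phase           tlm_phase_type;
    };
    // tlm_utils::simple_initiator_socket<Initiator, 32, burst_protocol_types> isock;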


I think it is simpler than you think. Just set the data length to reflect the size of the burst. In the target you can set the latency to reflect the burst size, based on the socket width and byte enables if used. Sure, you won't see the timing of the burst internals, but unless your bus protocol allows intra-burst interleaving, this should not be a problem.

Remember that TLM is only cycle-approximate. This provides faster simulations that are reasonably accurate.

 

Example: socket width of 16 bits, transfer length of 32 bits. Obviously this takes two clocks.
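In the target this amounts to a couple of lines. A sketch under the assumption of one beat per clock (account_burst_latency and clk_period are illustrative names, not from the standard):

    #include <tlm>
    #include <systemc>

    // Charge the whole burst as a single latency in the target:
    // number of beats = data length rounded up to the socket width.
    static const unsigned BUSWIDTH_BITS = 16;
    static const sc_core::sc_time clk_period(10, sc_core::SC_NS); // assumed clock

    void account_burst_latency(const tlm::tlm_generic_payload& trans,
                               sc_core::sc_time& delay)
    {
        const unsigned bus_bytes = BUSWIDTH_BITS / 8;
        const unsigned beats =
            (trans.get_data_length() + bus_bytes - 1) / bus_bytes;
        delay += beats * clk_period; // 32 bits on a 16-bit socket -> 2 clocks
    }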


Hi Eyck and David,

Thanks for your responses and help.

David - we do have situations where the target's (T1) response beats (to a burst request from initiator I1) would be interleaved with response beats for a request from initiator I2, i.e. intra-burst interleaving is required in our model. Could there be ways of 'accounting for' the additional delays due to intra-burst interleaving on the target's side?

For example: target T1 has two ports (sockets, 16-bit BUSWIDTH), P1 and P2. On each port T1 gets a 64-bit read request (resulting in a 4-beat response on each port), and would likely send out data in this sequence: BEGIN_RESP_P1_Beat1, BEGIN_RESP_P2_Beat1, BEGIN_RESP_P1_Beat2, BEGIN_RESP_P2_Beat2, ...
In such a scenario, extending your idea of the target setting the latency, can we double the latency for each port? Interleaving mainly happens due to bank conflicts / arbitration.
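Concretely, what I have in mind is a simple scaling of the per-beat cost (interleaved_latency and n_active are just placeholder names for this sketch):

    #include <tlm>
    #include <systemc>

    // Rough model: with n_active ports interleaving beats at the target,
    // each port sees roughly n_active clocks per beat of its own burst.
    sc_core::sc_time interleaved_latency(const tlm::tlm_generic_payload& trans,
                                         unsigned buswidth_bits,
                                         unsigned n_active, // ports competing
                                         const sc_core::sc_time& clk_period)
    {
        const unsigned bus_bytes = buswidth_bits / 8;
        const unsigned beats =
            (trans.get_data_length() + bus_bytes - 1) / bus_bytes;
        return beats * n_active * clk_period; // n_active == 2 -> doubled
    }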

Thanks.



Yes, interleaving is possible, and in such cases the delays can be modeled, but you will likely want to add some type of protocol extension (e.g., to track the transaction ID when breaking up a response) and will likely need to create a custom protocol as defined by the standard. The four timing points of TLM-2.0 are meant to define a basic starting point for protocols. You can also update timing in the interconnect and even deal with arbitration issues by splitting up transactions, but this is not typically done. Of course, if you have a custom protocol, there are implications for interoperability, and you may need to design/implement protocol adapters.
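A transaction-ID extension of the kind mentioned can be built on the standard tlm_extension mechanism; a minimal sketch (burst_id_extension is a hypothetical name):

    #include <tlm>

    // Carries an id so interleaved response beats can be matched back
    // to their originating request.
    struct burst_id_extension : tlm::tlm_extension<burst_id_extension>
    {
        unsigned int id;
        burst_id_extension() : id(0) {}

        tlm::tlm_extension_base* clone() const
        {
            burst_id_extension* ext = new burst_id_extension;
            ext->id = id;
            return ext;
        }
        void copy_from(const tlm::tlm_extension_base& other)
        {
            id = static_cast<const burst_id_extension&>(other).id;
        }
    };

    // Initiator: tag the request           Target: read the tag back
    //   burst_id_extension* ext =            burst_id_extension* ext = 0;
    //       new burst_id_extension;          trans.get_extension(ext);
    //   ext->id = next_id++;                 if (ext) { /* use ext->id */ }
    //   trans.set_extension(ext);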

Keep in mind that TLM models are generally cycle-approximate, not cycle-by-cycle accurate. This is because SystemC is trying to provide timing information sufficient for architectural exploration, which generally only needs to be 60% to 80% accurate. RTL accuracy should be left to the RTL; otherwise, your SystemC model's complexity might slow down development and execution to the point where it becomes meaningless. Ideally, SystemC models are designed to execute many times faster than RTL simulations while still giving "sufficient" information to make architectural analysis possible.

 

