iamgame Posted November 5, 2020 Hi All, we are using convenience sockets (simple initiator/target) in our project, with a few point-to-point buses, simple interconnects, and nb_transport() (using the 4-phase base protocol). What would be the best way of handling transactions that are longer than BUSWIDTH? The 4-phase base protocol does not provide intermediate phases that would let us break a transaction into multiple beats, and I was unable to glean an answer from the SystemC or TLM examples. Could we modify the 4-phase protocol to add additional phases for a write transaction, e.g. BEGIN_REQ, BEGIN_NEW_DATA (repeated as many times as needed; data would be transferred in this phase), ..., BEGIN_LAST_DATA (to indicate the last beat)? Similarly there would be new phases for a read transaction: BEGIN_RESP_NEW_DATA (repeated) and BEGIN_RESP_LAST_DATA. Does this sound too complicated? Thanks.
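The phase sequence being proposed can be sketched as plain C++ (the phase names here are illustrative, taken from the question above; they are not part of the TLM-2.0 base protocol):

```cpp
#include <vector>

// Hypothetical extra phases for a multi-beat write burst, as proposed
// above. Only BEGIN_REQ/END_RESP exist in the base protocol; the
// *_DATA phases are the suggested additions.
enum class burst_phase {
    BEGIN_REQ,        // base-protocol request
    BEGIN_NEW_DATA,   // one intermediate data beat (proposed)
    BEGIN_LAST_DATA,  // final data beat (proposed)
    END_RESP          // base-protocol completion
};

// A write of `beats` beats would then walk through this sequence:
std::vector<burst_phase> write_sequence(int beats) {
    std::vector<burst_phase> seq{burst_phase::BEGIN_REQ};
    for (int i = 0; i < beats - 1; ++i)
        seq.push_back(burst_phase::BEGIN_NEW_DATA);
    seq.push_back(burst_phase::BEGIN_LAST_DATA);
    seq.push_back(burst_phase::END_RESP);
    return seq;
}
```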
Eyck Posted November 5, 2020 Describing a particular protocol means extending the base protocol (see also IEEE 1666-2011, section 14.2). There are several ways to do this: 1. Ignorable phases (IEEE 1666-2011, sections 14.2.1 and 15.2.5): you add intermediate timepoints between the base-protocol phase timepoints. This works because 'An ignorable phase may be ignored by its recipient.' 2. New protocol traits (IEEE 1666-2011, section 14.2.2): you define new, non-ignorable phases, so the implementation is no longer base-protocol-compliant; with this approach you can only connect models using the same traits class together. The easiest way is 1, as it allows you to reuse a lot of the tlm_utils for the implementation, but 2 is the recommended one. An example of the second approach for AXI/ACE can be found at https://github.com/Arteris-IP/tlm2-interfaces
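For readers unfamiliar with how option 1 works under the hood: tlm::tlm_phase reserves fixed ids for the four base phases and hands out new ids to extended phases at registration time (the TLM_DECLARE_EXTENDED_PHASE macro does this registration for you). The sketch below reimplements that mechanism standalone, without the SystemC headers, purely to illustrate the idea; in real code you would use tlm_phase and the macro directly.

```cpp
#include <string>
#include <vector>

// Standalone sketch of the tlm_phase extended-phase mechanism:
// base phases have fixed ids; each registered extended phase
// receives the next free id above them.
class phase {
public:
    enum base { UNINITIALIZED = 0, BEGIN_REQ, END_REQ, BEGIN_RESP, END_RESP };

    phase(base b) : id_(b) {}

    // What TLM_DECLARE_EXTENDED_PHASE effectively does:
    static phase register_extended(const std::string& name) {
        names().push_back(name);
        return phase(static_cast<int>(END_RESP)
                     + static_cast<int>(names().size()));
    }

    int id() const { return id_; }
    bool is_extended() const { return id_ > END_RESP; }

private:
    explicit phase(int id) : id_(id) {}
    static std::vector<std::string>& names() {
        static std::vector<std::string> n;
        return n;
    }
    int id_;
};

// An ignorable intermediate data-beat phase, as in option 1
// (hypothetical name):
static const phase BEGIN_DATA = phase::register_extended("BEGIN_DATA");
```

Because extended phases sit outside the base enumeration, a base-protocol-only target can simply ignore BEGIN_DATA, which is exactly what makes option 1 interoperable.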
David Black Posted November 5, 2020 I think it is simpler than you think. Just set the data length to reflect the size of the burst. In the target you can set the latency to reflect the burst size, based on the socket width and byte enables if used. Sure, you won't see the timing of the burst internals, but unless your bus protocol allows intra-burst interleaving, this should not be a problem. Remember that TLM is only cycle-approximate; this provides faster simulations that are still reasonably accurate. Example: socket width 16 bits, transfer length 32 bits, which obviously takes two clocks.
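The arithmetic above (16-bit socket, 32-bit transfer, two clocks) can be computed generically in the target. A minimal sketch, assuming a known clock period and ignoring byte enables (which would reduce the effective payload):

```cpp
// Number of beats needed to move `data_length_bytes` over a socket of
// `buswidth_bits`, rounding up so a partial last beat still costs a
// full cycle. Byte enables are ignored here for simplicity.
unsigned beats(unsigned data_length_bytes, unsigned buswidth_bits) {
    unsigned bus_bytes = buswidth_bits / 8;
    return (data_length_bytes + bus_bytes - 1) / bus_bytes;
}

// Resulting burst latency for a given clock period.
double burst_latency_ns(unsigned data_length_bytes, unsigned buswidth_bits,
                        double clk_period_ns) {
    return beats(data_length_bytes, buswidth_bits) * clk_period_ns;
}
```

In a 4-phase target this latency would typically be returned as the annotated delay on BEGIN_RESP, so the initiator sees the burst duration without modeling individual beats.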
iamgame Posted November 5, 2020 (Author) Hi Eyck and David, thanks for your responses and help. David: we do have situations where target T1's response beats (to a burst request from initiator I1) would be interleaved with response beats to initiator I2, i.e. intra-burst interleaving is required in our model. Could there be a way of accounting for the additional delay due to intra-burst interleaving on the target's side? For example: target T1 has two ports (sockets, 16-bit BUSWIDTH), P1 and P2. On each port T1 gets a 64-bit read request (resulting in a 4-beat response on each port), and would likely send out data in this sequence: BEGIN_RESP_P1_Beat1, BEGIN_RESP_P2_Beat1, BEGIN_RESP_P1_Beat2, BEGIN_RESP_P2_Beat2, ... In such a scenario, extending your idea of the target setting the latency, can we double the latency for each port? Interleaving mainly happens due to bank conflicts / arbitration. Thanks.
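The "double the latency" idea can be stated as a first-order model: when two ports take turns beat-by-beat, each port only gets every n-th cycle, so the per-port burst latency scales by the number of ports contending for the same internal bandwidth. A sketch (my own simplification, not an established formula; it ignores bank-conflict specifics):

```cpp
// First-order latency model for beat-interleaved responses: with
// `active_ports` ports sharing the target's internal bandwidth, each
// port's burst stretches by that factor. active_ports == 2 doubles
// the latency, matching the P1/P2 example above.
double interleaved_latency_ns(unsigned beats, double clk_period_ns,
                              unsigned active_ports) {
    return beats * clk_period_ns * active_ports;
}
```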
David Black Posted November 15, 2020 Yes, interleaving is possible. In such cases the delays can be modeled, but you will likely want to add some type of protocol extension (e.g., to track the transaction id when breaking up a response) and will likely need to create a custom protocol as defined by the standard. The four timing points of TLM 2.0 are meant to define a basic starting point for protocols. You can also update timing in the interconnect and even deal with arbitration issues by splitting up transactions, but this is not typically done. Of course, if you have a custom protocol, there are implications for interoperability and you may need to design/implement protocol adapters. Keep in mind that TLM models are generally cycle-approximate and not cycle-by-cycle accurate. This is because SystemC is trying to provide timing information sufficient for architectural exploration, which generally only needs to be 60% to 80% accurate. RTL accuracy should be left to the RTL; otherwise, your SystemC model's complexity might slow down development and execution to the point where the model is meaningless. Ideally, SystemC models execute many times faster than RTL simulations but still give us "sufficient" information to make architectural analysis possible.
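The transaction-id extension mentioned above would, in real code, derive from tlm::tlm_extension<T> and be attached with set_extension()/get_extension() on the generic payload. The sketch below reimplements that attach-by-type idea standalone (no SystemC headers) just to show the pattern; the names `id_extension` and `payload` are illustrative, not TLM API:

```cpp
#include <memory>
#include <typeindex>
#include <unordered_map>

// Illustrative extension carrying a transaction id, so interleaved
// response beats can be matched back to their originating request.
struct id_extension { unsigned id; };

// Minimal stand-in for tlm_generic_payload's extension mechanism:
// one extension slot per type, looked up by typeid.
struct payload {
    template <class T> void set_extension(T ext) {
        ext_[std::type_index(typeid(T))] = std::make_shared<T>(ext);
    }
    template <class T> T* get_extension() {
        auto it = ext_.find(std::type_index(typeid(T)));
        return it == ext_.end() ? nullptr
                                : static_cast<T*>(it->second.get());
    }
    std::unordered_map<std::type_index, std::shared_ptr<void>> ext_;
};
```

A target splitting a response would stamp each beat's payload with the id, and the initiator would read it back to reassemble the burst.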