
David Black

Members
  • Content Count

    374
  • Joined

  • Last visited

  • Days Won

    81

Reputation Activity

  1. Like
    David Black reacted to Eyck in How to trace std::list<int> with sc_trace?   
    Your sc_trace function is a member function of the TraceList class and cannot be called like the sc_trace functions coming with the SystemC reference implementation. Those are free functions in the sc_core namespace.
    Moreover, your sc_trace implementation is non-static, so it cannot be used without a TraceList object. You need to move the function out of the class scope.
    Basically this is a valid approach to set up complex types. But for performance reasons I would suggest using a different container; the best choices are std::vector or std::deque. And if you are using C++11, I would replace the while loop with a range-based loop, something like:
    for (auto& val : var.lst) {
        // qualify with the namespace, otherwise the compiler chooses the wrong function
        sc_core::sc_trace(tf, val, nm + std::to_string(pos++));
    }
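    Expanding on that advice, here is a minimal sketch of a complete free-function sc_trace overload. The struct layout (TraceList with a member lst) follows the thread; the exact types and the use of std::vector are assumptions.
    #include <systemc>
    #include <string>
    #include <vector>

    struct TraceList {
        std::vector<int> lst;   // assumed container, per the advice above
    };

    // Free function (not a member), so it can be found like the built-in overloads.
    void sc_trace(sc_core::sc_trace_file* tf, const TraceList& var, const std::string& nm) {
        int pos = 0;
        for (const auto& val : var.lst) {
            // qualify the inner call so the compiler picks sc_core's int overload
            sc_core::sc_trace(tf, val, nm + "." + std::to_string(pos++));
        }
    }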
  2. Like
    David Black got a reaction from ArjunMadhudi in serial transmission   
    [I assume that when you say "TLM", you mean SystemC TLM 2.0.]
    You need to understand the difference between modeling styles. TLM is precisely about not modeling at the level of RTL. SystemC TLM 2.0 also has two different modeling styles: Loosely Timed (LT) and Approximately Timed (AT). Let's look at each using a specific case. Suppose you are modeling two UARTs operating at 9600 baud (bits per second) with 8 data bits, no parity, and 1 stop bit to transfer the message "Hello World\n". This configuration results in 960 characters per second (1.042 ms/char), which is quite slow, so you would probably be transmitting/receiving characters slowly enough that most systems would either process them one at a time or provide a FIFO (e.g. 16 bytes) and only process empty/full events. There is one more question to answer, though. Consider the connections involved. The connections from sender to UART and from UART to receiver are clearly memory mapped for most systems, so there is no question of modeling. The connection UART to UART is not memory mapped, which means you need to create a custom protocol. Furthermore, for TLM, it actually requires two connections since communication can be invoked bi-directionally (for a full UART). You need to decide what is important to model. For a high-level model and efficiency, I would transfer as much data as I could per transaction. It might even make sense to use TLM 1.0 rather than TLM 2.0. Do you have a requirement to inject errors?

    For my example, you would configure the transmitter, then transfer a burst of 12 characters into the transmit FIFO on one end of the transfer and generate an empty-FIFO interrupt about 12.5 ms later. The receiver side would be similar. What about the UART-to-UART transaction? An efficient approach might be as follows:
    • Create a required extension that carries the transmit configuration information (baud rate, bits, parity, etc.); see the sketch after this list.
    • Use TLM_WRITE_COMMAND, because all transactions over this socket pair are initiated from the sender. The second pair in the opposite direction would do the same thing.
    • Check and insist that the address always be 0 and the streaming width is 1. Byte enables would be illegal.
    • Check that the configuration matches before accepting data.
    • Place all received data into an unbounded queue and then indicate the size allowed by the hardware model.
    • Send interrupts using an sc_signal when the receive queue goes non-empty.
    • Consider the error situation where the timing of characters indicates they would be lost due to a full FIFO.
    • You will have to decide how to deal with interrupts received in your thread process.
    Notice that I do not model at the bit level. If you wish to add bit-level error injection, then inject errors at the point of transmission.
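    Here is a minimal sketch of the configuration-carrying extension mentioned in the first item above. The class name and its fields are illustrative, not from the original post.
    #include <tlm>

    struct uart_config_extension : tlm::tlm_extension<uart_config_extension> {
        unsigned baud_rate = 9600;
        unsigned data_bits = 8;
        bool     parity    = false;
        unsigned stop_bits = 1;

        tlm::tlm_extension_base* clone() const override {
            return new uart_config_extension(*this);
        }
        void copy_from(const tlm::tlm_extension_base& ext) override {
            *this = static_cast<const uart_config_extension&>(ext);
        }
    };

    // Sender side (payload is a tlm::tlm_generic_payload):
    //   auto* cfg = new uart_config_extension;
    //   payload.set_extension(cfg);
    //   payload.set_command(tlm::TLM_WRITE_COMMAND);
    //   payload.set_address(0);
    //   payload.set_streaming_width(1);
    // Receiver side: payload.get_extension(cfg) and reject mismatched configuration.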
     
  3. Like
    David Black got a reaction from chandan in The difference between run_phase and main_phase in uvm_component   
    Actually, you can start a sequence in any phase. It is more important to understand the domain/scheduling relationships between the task-based (i.e. run-time) phases. UVM undergoes a number of pre-simulation phases (build, connect, end_of_elaboration, start_of_simulation) that are all implemented with functions. Once those are completed, the task-based phases begin. The standard includes two schedules. One is simply the run_phase, which starts executing at time zero and continues until all components have dropped their objections within the run_phase. The other schedule contains twelve phases that execute in parallel with the run_phase. They are: pre_reset, reset, post_reset, pre_configure, configure, post_configure, pre_main, main, post_main, pre_shutdown, shutdown, and post_shutdown. They execute in sequence. Every component has the opportunity to define or not define tasks to execute in these phases. A phase starts only when all components in the previous phase have dropped their objections. A phase continues to execute until all components have dropped their objections in the current phase.
     
    Many companies use the run_phase for everything because there are some interesting issues to consider when crossing phase boundaries. In some respects it may be easier to use uvm_barriers for synchronization. Drivers and monitors (things that touch the hardware) usually run exclusively in the run_phase, but there is nothing to prevent them from also having reset_phase, main_phase, etc.
  4. Like
    David Black got a reaction from asicengineer in The difference between run_phase and main_phase in uvm_component   
    Actually, you can start a sequence in any phase. It is more important to understand the domain/scheduling relationships between the task-based (i.e. run-time) phases. UVM undergoes a number of pre-simulation phases (build, connect, end_of_elaboration, start_of_simulation) that are all implemented with functions. Once those are completed, the task-based phases begin. The standard includes two schedules. One is simply the run_phase, which starts executing at time zero and continues until all components have dropped their objections within the run_phase. The other schedule contains twelve phases that execute in parallel with the run_phase. They are: pre_reset, reset, post_reset, pre_configure, configure, post_configure, pre_main, main, post_main, pre_shutdown, shutdown, and post_shutdown. They execute in sequence. Every component has the opportunity to define or not define tasks to execute in these phases. A phase starts only when all components in the previous phase have dropped their objections. A phase continues to execute until all components have dropped their objections in the current phase.
     
    Many companies use the run_phase for everything because there are some interesting issues to consider when crossing phase boundaries. In some respects it may be easier to use uvm_barriers for synchronization. Drivers and monitors (things that touch the hardware) usually run exclusively in the run_phase, but there is nothing to prevent them from also having reset_phase, main_phase, etc.
  5. Like
    David Black reacted to Eyck in bind multi ports to other port.   
    Another option would be to use a resolved signal and connect all output ports to it.
    But this is already about technical implementation options. The question to me is: what would you like to model? Is this the right way to model the intent?
    Best regards
  6. Like
    David Black reacted to Philipp A Hartmann in Static Sensitivity to "AND" of two events   
    Please be aware that an sc_event_and_list does not imply that the events in the list are triggered at the same time. I would suggest keeping only the clock sensitivity and acting on the triggers in the body of the method instead:
     
    SC_METHOD(func2);
      sensitive << clk.pos();
      dont_initialize();
    // ...
    void func2() {
      if( nreset.posedge() ) {
        // nreset went high in this clock cycle
        // ...
      }
    }
    Alternatively, you can be sensitive to nreset.pos() and check for clk.posedge() (as a consistency check), if you don't have anything else to do in the body of the method. With this approach, you might be able to avoid unnecessary triggers of the method.
    Side note to Eyck: There's a small typo in the example above, which should use "&=" to append to an sc_event_and_list.
    ev_list &= nreset;  
  7. Like
    David Black got a reaction from swami060 in Where to set the global quantum value   
    You should have a top level. If you don't, create one and instantiate everything there. Set the global quantum in your top-level module at end_of_elaboration and reset local quantums at start_of_simulation. Set a default and allow for override (a sketch follows the list below). You can obtain the override value to use at run time from any of:
    - Command-line argument using sc_argv()
    - A file or database, if it exists
    - Environment variable set prior to invocation
    - User input prompt
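    Here is a minimal sketch of the default-plus-override idea, assuming a top-level module named Top and an environment variable named GLOBAL_QUANTUM_NS as the override source (both names are illustrative):
    #include <systemc>
    #include <tlm>
    #include <cstdlib>

    struct Top : sc_core::sc_module {
        explicit Top(sc_core::sc_module_name nm) : sc_core::sc_module(nm) {
            // instantiate everything here...
        }
        void end_of_elaboration() override {
            sc_core::sc_time quantum{1, sc_core::SC_US};              // default
            if (const char* env = std::getenv("GLOBAL_QUANTUM_NS"))   // override
                quantum = sc_core::sc_time{std::atof(env), sc_core::SC_NS};
            tlm::tlm_global_quantum::instance().set(quantum);
        }
        // Local quantum keepers can then be reset at start_of_simulation to pick up this value.
    };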
     
  8. Thanks
    David Black got a reaction from andrewkrill in Issues with concurrency and running C++ code on top of SystemC   
    Caution: the SystemC kernel is not thread-safe without taking special precautions. If you call into SystemC from outside the SystemC OS thread, you may need to create a primitive channel utilizing async_request_update(). If, on the other hand, you are simply stalling SystemC from within, which is what I think is being stated, then you should be fine with a simple std::mutex (not sc_core::sc_mutex, which is only for use inside SystemC between SystemC internal "processes").
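    For the first case, here is a minimal sketch of a primitive channel built around async_request_update(). The class name and the int payload queue are illustrative choices, not from the post.
    #include <systemc>
    #include <deque>
    #include <mutex>

    class async_event_channel : public sc_core::sc_prim_channel {
    public:
        explicit async_event_channel(const char* nm) : sc_core::sc_prim_channel(nm) {}

        // Safe to call from a non-SystemC OS thread: only queues data and
        // requests an update from the kernel.
        void post(int value) {
            { std::lock_guard<std::mutex> lock(m_mutex); m_queue.push_back(value); }
            async_request_update();
        }

        const sc_core::sc_event& data_event() const { return m_event; }

    protected:
        // Runs in the SystemC kernel context during the update phase.
        void update() override {
            std::lock_guard<std::mutex> lock(m_mutex);
            if (!m_queue.empty())
                m_event.notify(sc_core::SC_ZERO_TIME);
            // A SystemC process waiting on data_event() would then drain m_queue.
        }

    private:
        std::mutex m_mutex;
        std::deque<int> m_queue;
        sc_core::sc_event m_event;
    };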
  9. Like
    David Black got a reaction from Mat in reading part of input port   
    Sorry, but this is simply not possible in the convenient manner of Verilog. Reason: SystemC is not about RTL. If you need a few bits, then read all of them and mask off the ones you want.
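    A minimal sketch of the read-and-mask approach (the port name and widths are illustrative):
    #include <systemc>

    SC_MODULE(Consumer) {
        sc_core::sc_in<sc_dt::sc_uint<32>> data_in;

        SC_CTOR(Consumer) {
            SC_METHOD(proc);
            sensitive << data_in;
        }

        void proc() {
            sc_dt::sc_uint<32> word  = data_in.read();     // read the whole port
            sc_dt::sc_uint<4>  field = word.range(11, 8);  // then take bits [11:8]
            // equivalently: (word >> 8) & 0xF
            // ... use field ...
        }
    };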
  10. Like
    David Black reacted to Roman Popov in Systemc performance   
    Real-life simulation performance usually depends a lot on modeling style. For high-level TLM-2.0 models, the share of simulation time consumed by SystemC primitives is usually much lower compared to the time consumed by the "business logic" of the models. Efficiency of the simulation kernel (things like context switches and channels) is much more important for low-level RTL simulations.
  11. Like
    David Black got a reaction from maehne in Systemc performance   
    Perhaps you would like to share your code for measurements via GitHub?
    Measuring performance can be tricky, to say the least. How you compile (compiler, version, SystemC version) and what you measure can really change the results. It probably helps to specify your computer's specifications (processor, RAM, cache, OS version) too:
    • Processor (vendor, version)
    • L1 cache size
    • L2 cache size
    • L3 cache size
    • RAM
    • OS (name, version)
    • Compiler (name, version)
    • Compiler switches (--std, -O)
    • SystemC version
    • SystemC installation switches
    • How time is measured and from what point (e.g. start_of_simulation to end_of_simulation)
    • Memory consumption information if possible
    This will help to make meaningful statements about the measurements and allow others to reproduce/verify your results. It is also important to understand how these results should be interpreted (taken advantage of) and compared.
    With respect to TLM, it gets a lot more challenging. For example, what style of coding: Loosely Timed or Approximately Timed? Are sc_clocks involved?
  12. Like
    David Black got a reaction from swami060 in Systemc performance   
    Perhaps you would like to share your code for measurements via GitHub?
    Measuring performance can be tricky, to say the least. How you compile (compiler, version, SystemC version) and what you measure can really change the results. It probably helps to specify your computer's specifications (processor, RAM, cache, OS version) too:
    • Processor (vendor, version)
    • L1 cache size
    • L2 cache size
    • L3 cache size
    • RAM
    • OS (name, version)
    • Compiler (name, version)
    • Compiler switches (--std, -O)
    • SystemC version
    • SystemC installation switches
    • How time is measured and from what point (e.g. start_of_simulation to end_of_simulation)
    • Memory consumption information if possible
    This will help to make meaningful statements about the measurements and allow others to reproduce/verify your results. It is also important to understand how these results should be interpreted (taken advantage of) and compared.
    With respect to TLM, it gets a lot more challenging. For example, what style of coding: Loosely Timed or Approximately Timed? Are sc_clocks involved?
  13. Like
    David Black reacted to Eyck in multiple initiator sockets and single target socket   
    From the snippets you provide, it looks OK.
    Assuming that bus_mutex is an sc_mutex, this is your problem. sc_mutex does not do arbitration. It selects randomly which waiting thread to grant the lock (actually the next activated process, based on an event notification). But what you expect is arbitration. So either you write an arbiter or you use an ordered semaphore with an initial counter value of 1. You may find an implementation here: https://git.minres.com/SystemC/SystemC-Components/src/branch/master/incl/scc/ordered_semaphore.h
    The semaphore grants access based on a queue, so you get first-come-first-served (FCFS) behavior, with all requestors having equal priority.
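    For illustration, here is a minimal sketch of a FIFO-ordered ("ticket") semaphore built from SystemC primitives. This is not the scc implementation linked above, just the idea, and it is intended for use from SC_THREAD processes.
    #include <systemc>

    // FIFO ("ticket") semaphore: requesters are granted in arrival order.
    class fifo_semaphore {
    public:
        explicit fifo_semaphore(int count) : m_count(count) {}

        void wait() {   // call only from an SC_THREAD
            const unsigned long my_ticket = m_next_ticket++;
            // Block until a unit is free AND our ticket is being served.
            while (m_count == 0 || m_serving != my_ticket)
                sc_core::wait(m_free);
            ++m_serving;
            --m_count;
            if (m_count > 0)
                m_free.notify(sc_core::SC_ZERO_TIME);  // let the next ticket through
        }

        void post() {
            ++m_count;
            m_free.notify(sc_core::SC_ZERO_TIME);
        }

    private:
        int m_count;
        unsigned long m_next_ticket = 0;  // next ticket to hand out
        unsigned long m_serving     = 0;  // ticket currently being served
        sc_core::sc_event m_free;
    };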
    Best regards
  14. Like
    David Black got a reaction from shubham_v in using clocks in tlm   
    Yes, you can supply clocks to TLM, but this is a very bad idea in general: clocks will slow down your simulation. There are many ways to insert clocks: ports, local instances, global references. The best b_transport implementations find ways to avoid calling wait. More precisely, we attempt to reduce context switching, and its associated overhead, to a minimum.
  15. Like
    David Black got a reaction from shubham_v in block_interface   
    TLM does have a response status, but that is intended to catch modeling failures rather than modeled failures. So if you consider your case a modeling error, which seems unreasonable to me on the surface, then you should either set the error response to something like a generic error or issue SC_REPORT_ERROR. In any event, do not set both. If this is a modeled error, then you are left to your own devices.
    See section 14.17 of IEEE-1666-2011 for details.
  16. Thanks
    David Black got a reaction from shubham_v in system c,tlm & system verilog interfaces   
    The term 'interface' (and for that matter 'virtual') is used in somewhat different ways in SystemVerilog than in SystemC. TLM is simply a library built on SystemC that has some well-understood standard SystemC interfaces.
    Fundamentally, the concept of direction as used in hardware (and hence Verilog) does not translate to SystemC particularly well. In fact, it is somewhat annoying that we have the sc_in<T>, sc_out<T> ports in SystemC because it confuses most folks. It is best in SystemC to think like a C++ programmer. The way that SystemC views "input" and "output" is by observing data flow semantics of function calls. If I have a function with the signature put(int value), then I expect I am moving data from the caller to the callee.
    SystemC views the concept of interface in the same manner as other object-oriented (OO) programming languages do. An OO interface class is simply an abstract class that exclusively contains pure virtual methods. SystemVerilog as of 2012 also has this concept in the form of the 'interface class', but this was added later.
    Thus SystemVerilog uses the keyword 'interface' in three completely different manners:
    Interface blocks provide a wrapper around signals as a method of bundling them, hence the syntax:
    interface Bus( input clock );
      logic [7:0] address, data;
      logic rw;
      modport cpu_mp( output address, rw, inout data );
      modport mem_mp( input address, rw, inout data );
      clocking cb @(posedge clock);
        input address, data, rw;
      endclocking
      modport verif_mp( clocking cb );
    endinterface
    Note: semantically an interface is somewhat of a super-module because it may contain initial, always, assign and hierarchy.
    SystemVerilog's virtual interface is simply references to instances of interface blocks to be used inside classes.
    SystemVerilog interface classes are more like C++:
    interface class Print_if;
      pure virtual function void print( string message );
    endclass

    class A implements Print_if;
      virtual function void print( string message );
        $info("%s", message );
      endfunction
    endclass
    By contrast C++ would use:
    #include <iostream>
    #include <string>

    class Print_if {
    public:
      virtual void print( std::string message ) = 0;
    };

    class A : public Print_if {
    public:
      void print( std::string message ) override {
        std::cout << message << std::endl;
      }
    };
  17. Like
    David Black got a reaction from TRANG in How to Method work with event?   
    Notify (either case) is non-blocking, so your call to notify followed by initialize will happen. Then after you return, the notified element(s) may execute.
    notify() implies execution will be in the same delta cycle, whereas notify(SC_ZERO_TIME) postpones execution to the next delta cycle and allows other processes in the current one to complete.
    Take a look at <https://github.com/dcblack/SystemC-Engine/blob/master/Engine_v2.4.pdf>.
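    A minimal sketch contrasting the two forms (the module and process names are illustrative):
    #include <systemc>

    SC_MODULE(NotifyDemo) {
        sc_core::sc_event ev;

        SC_CTOR(NotifyDemo) {
            SC_THREAD(producer);
            SC_METHOD(consumer);
            sensitive << ev;
            dont_initialize();
        }

        void producer() {
            ev.notify();                         // immediate: consumer can run in the
                                                 // same evaluation phase / delta cycle
            // ev.notify(sc_core::SC_ZERO_TIME); // delta: consumer runs in the next
                                                 // delta cycle, after current processes
        }

        void consumer() {
            SC_REPORT_INFO("NotifyDemo", "event received");
        }
    };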
  18. Thanks
    David Black got a reaction from Ahr in Connecting two Bi-directional ports   
    sc_signal is not really a good channel for bi-directional signaling, since by design it is intended for single-driver (writer), multiple-reader use.
    You should use sc_signal_rv<T> or sc_signal_resolved for multiple drivers so that contention can be properly modeled.
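    A minimal sketch of the multi-driver case with sc_signal_resolved (module and process names are illustrative):
    #include <systemc>
    using namespace sc_core;
    using namespace sc_dt;

    SC_MODULE(TwoDrivers) {
        sc_signal_resolved bus;                  // resolution allows multiple drivers

        SC_CTOR(TwoDrivers) : bus("bus") {
            SC_THREAD(driver_a);
            SC_THREAD(driver_b);
        }

        void driver_a() {
            bus.write(SC_LOGIC_1);               // contends with driver_b: resolves to 'X'
            wait(10, SC_NS);
            bus.write(SC_LOGIC_Z);               // release the bus
        }

        void driver_b() {
            bus.write(SC_LOGIC_0);
            wait(10, SC_NS);
            bus.write(SC_LOGIC_Z);
        }
    };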
     
  19. Like
    David Black got a reaction from shubham_v in block_interface between initiator and target   
    @shubham_v as a coarse top-level description, you have the correct general idea, but there are many subtleties:
    • b_transport() can block using a call to wait( args... ), although models that do not block are desirable from a performance perspective (a minimal sketch follows this list).
    • b_transport() executes in the simulation context of the caller. Anything requiring SystemC kernel management (e.g. wait) imposes the requirement that b_transport() was invoked in the context of a SystemC SC_THREAD.
    • Simulated time conceptually may be decoupled from the SystemC kernel's notion of time using a temporal offset (when the 2nd argument of the b_transport call is non-zero).
    • Sockets are not required; however, they solve a number of problems.
    • TLM 2.0 Base Protocol compliant target models must support both blocking and non-blocking interfaces. This is one reason the convenience simple target sockets are so useful (the automatic conversion they provide may be suitable for some models).
    • Extensions and custom protocols may have important consequences that need to be considered.
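    A minimal sketch of a non-blocking b_transport implementation that uses the annotated delay instead of wait() (names are illustrative):
    #include <systemc>
    #include <tlm>

    struct Target_impl {
        // Registered as the b_transport callback of a target socket.
        void b_transport(tlm::tlm_generic_payload& trans, sc_core::sc_time& delay) {
            // ... perform the read/write on the model's state ...
            delay += sc_core::sc_time(10, sc_core::SC_NS);  // annotate time, do not wait()
            trans.set_response_status(tlm::TLM_OK_RESPONSE);
        }
    };
    // A temporally decoupled initiator accumulates such delays (e.g. in a
    // tlm_utils::tlm_quantumkeeper) and only synchronizes with the kernel when
    // the local time offset exceeds the global quantum.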
  20. Like
    David Black got a reaction from TRANG in Timing in TLM   
    @Eyck I would point out that Timing Annotation is not limited to Loosely-Timed (LT) modeling, but can also be applied to Approximately-Timed (AT) models (see section 11.1 of IEEE-1666-2011); however, there is an important difference. LT timing annotation describes temporal decoupling as you explained. AT timing annotation is a way of indicating where a phase applies. This has some odd implications that are not immediately obvious.
    For instance, I can start an nb_transport_fw transaction with a non-zero annotated delay:
    tlm::tlm_phase phase { tlm::BEGIN_REQ };
    sc_core::sc_time delay { 50, sc_core::SC_NS };
    auto status = nb_transport_fw( payload, phase, delay );  // begin the transaction 50 ns in the future
    Note that nb_transport_fw may increase the time (same as b_transport); however, it may not decrease it. Section 11.1.3.1 describes this in detail.
    Why would this be done? Perhaps the initiator knows it wants to dispatch a transaction and doesn't want to wait around for its initiation.
    I cannot immediately think of why the returned value might change, but it is legal.
    The main rule about time in SystemC requires that time never goes backward. No playing Dr. Who.
  21. Like
    David Black got a reaction from TRANG in sensitivity list   
    You can only specify sensitivity on objects that have events or event finders directly accessible at the time of construction. Normally this means using either a suitable channel, port, or explicit event. If you wrap your ints in a channel such as sc_signal<T>, you can do it.
    Example - https://www.edaplayground.com/x/5vLP
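    A minimal sketch of the idea (names are illustrative; the full example is at the link above):
    #include <systemc>
    #include <iostream>

    SC_MODULE(Watcher) {
        sc_core::sc_signal<int> value{"value"};   // wrap the int in a channel

        SC_CTOR(Watcher) {
            SC_METHOD(on_change);
            sensitive << value;                   // sensitivity is now possible
            dont_initialize();
        }

        void on_change() {
            std::cout << "value changed to " << value.read() << std::endl;
        }
    };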
  22. Like
    David Black got a reaction from shubham_v in block_interface between initiator and target   
    At first glance, you can call b_transport from C++, but it must be C++ that is inside a SystemC process during the simulation phase if the b_transport call invokes sc_core::wait(). You can read the details in the IEEE-1666-2011 specification. Or perhaps you should sign up for the Doulos SystemC TLM-2.0 course and get expert hands-on training.
    Fundamentally, b_transport is a simple function call; however, it may use SystemC semantics to accomplish its work. Hence you should also be knowledgeable about SystemC SC_THREADs (blocking calls may not be made from SC_METHODs).
    Technically, you could write a b_transport method in the target that did not call sc_core::wait, which is desirable anyhow. If you did this, then you could call it from pretty much anywhere; however, SystemC is not thread-safe without special precautions. In any event, there is a lot to learn about the subtleties of the generic payload and extensions too.
     
  23. Like
    David Black got a reaction from maehne in sensitivity list   
    You can only specify sensitivity on objects that have events or event finders directly accessible at the time of construction. Normally this means using either a suitable channel, port, or explicit event. If you wrap your ints in a channel such as sc_signal<T>, you can do it.
    Example - https://www.edaplayground.com/x/5vLP
  24. Like
    David Black got a reaction from Teddy Minz in Synthetizable FFT function in SystemC   
    You paint a very bleak and incorrect picture of the HLS tool. I will suggest that you need to get some training on its use. Xilinx has many examples and their documentation is quite good. Document UG902 clearly documents the HLS math library, which supports all manner of synthesizable operations. For instance:
    Trigonometric Functions: acos, atan, cospi, acospi, atan2, sin, asin, atan2pi, sincos, asinpi, cos, sinpi, tan, tanpi
    Hyperbolic Functions: acosh, asinh, cosh, atanh, sinh, tanh
    Exponential Functions: exp, exp10, exp2, expm1, frexp, ldexp, modf
    Logarithmic Functions: ilogb, log, log10, log1p
    Power Functions: cbrt, hypot, pow, rsqrt, sqrt
    Error Functions: erf, erfc
    Gamma Functions: lgamma, lgamma_r, tgamma
    Rounding Functions: ceil, floor, llrint, llround, lrint, lround, nearbyint, rint, round, trunc
    and that's only a few. 
    Perhaps your grasp of C++ and what can or cannot be synthesized is limited. For instance, dynamically allocated memory is forbidden because it is not reasonable to expect silicon to grow new logic during operation.
    Please read the fine manual (RTFM).
     
  25. Like
    David Black got a reaction from dilawar in Heartbeat, clock and negedge   
    You can use it however you like. We didn't use it everywhere and I'm sure there are more areas where it might be applicable. The main point is that "Performance is a function of simulator CPU activity and how well it is used." In some cases, such as clocks, there is a lot of activity that goes unused. Many designs really only use the positive edge of the clock. In some designs, the activity can even be reduced significantly.
    Another instance is timers that often are only touched when they are set up and timeout after N clocks. The RTL approach to modeling a timer decrements the timer value on every clock. A behavioral approach would be:
    void set_timer( int N ) {
      assert( N > 0 );
      delay = N * clock.period();
      setup_time = sc_time_stamp();
      projected_time = setup_time + delay;
      timeout_event.notify( delay );
    }
    The current value of the timer can always be had with:
    int get_timer_value( void ) {
      return ( projected_time - sc_time_stamp() ) / clock.period();
    }
    So you really don't even need the clock in many instances. Instead, replace clock.period() with a simple constant.
    Fast and smart SystemC models don't use sc_clock at all.