MehdiT

  1. MehdiT

    SystemC threads stack overflow?

    Terminating the threads after they finish executing didn't help. I went back to the part of my code that spawns dynamic threads and rewrote it. No processes are created dynamically in my code any more, and the problem seems to be solved. From this I conclude that spawning dynamic threads may not be a good idea in large SystemC simulations. I also noticed that RAM usage grows exponentially when I use dynamic threads (with or without manual termination), although that could be because I didn't do it correctly (see the code above). Without dynamic threads, RAM usage is stable (it scales linearly with the number of static threads in the simulation). However, I still don't know what actually makes the code crash. I am posting this as a workaround for others who face a similar issue in the future.
  2. MehdiT

    SystemC threads stack overflow?

    Hi Alan, there are many reasons why I can't just migrate to sc_methods, among them the use of wait statements. Killing the spawned threads sounds reasonable, although I thought this would be taken care of automatically. Do you mean I should keep an sc_process_handle, wait for the thread to terminate, and then kill it with the handle's kill() method?
  3. MehdiT

    SystemC threads stack overflow?

    Thanks Alan. In fact, all my modules are instantiated with "new", and I allocate memory dynamically everywhere to avoid large arrays. I investigated the problem further with gdb and found that it may be related to the fact that I spawn a lot of dynamic threads. Here is the output of (gdb) backtrace:

        noc_exe: ../../../../src/sysc/kernel/sc_cor_qt.cpp:107: virtual void sc_core::sc_cor_qt::stack_protect(bool): Assertion `ret == 0' failed.
        [Thread debugging using libthread_db enabled]
        Program received signal SIGABRT, Aborted.
        0x00000038b5e328a5 in raise () from /lib64/libc.so.6
        Missing separate debuginfos, use: debuginfo-install boost-program-options-1.41.0-17.el6_4.x86_64 glibc-2.12-1.107.el6.x86_64 libgcc-4.4.7-3.el6.x86_64 libstdc++-4.4.7-3.el6.x86_64
        (gdb) backtrace
        #0 0x00000038b5e328a5 in raise () from /lib64/libc.so.6
        #1 0x00000038b5e34085 in abort () from /lib64/libc.so.6
        #2 0x00000038b5e2ba1e in __assert_fail_base () from /lib64/libc.so.6
        #3 0x00000038b5e2bae0 in __assert_fail () from /lib64/libc.so.6
        #4 0x00002aaaaadaa5b0 in sc_core::sc_cor_qt::stack_protect(bool) () from /home/########/systemc-2.3.1/lib-linux64/libsystemc-2.3.1.so
        #5 0x00002aaaaadbf9dc in sc_core::sc_simcontext::create_thread_process(char const*, bool, void (sc_core::sc_process_host::*)(), sc_core::sc_process_host*, sc_core::sc_spawn_options const*) () from /home/emehtaa/systemc-2.3.1/lib-linux64/libsystemc-2.3.1.so
        #6 0x0000000000428bb6 in sc_core::sc_spawn<sc_boost::_bi::bind_t<int, sc_boost::_mfi::mf2<int, credit_ctrl, unsigned int, noc_trans_t*>, sc_boost::_bi::list3<sc_boost::_bi::value<credit_ctrl*>, sc_boost::_bi::value<unsigned int>, sc_boost::_bi::value<noc_trans_t*> > > > (object=..., name_p=0x47dca6 "credit_spawned_f", opt_p=0x129ac77f0) at /home/emehtaa/systemc-2.3.1/include/sysc/kernel/sc_spawn.h:118
        #7 0x0000000000424c5d in credit_ctrl::credit_init_thread (this=0x29fe350) at ./src/credit_ctrl.cpp:240
        #8 0x00002aaaaadc5d36 in sc_core::sc_thread_cor_fn(void*) () from /home/########/systemc-2.3.1/lib-linux64/libsystemc-2.3.1.so
        #9 0x00002aaaaadcd2c1 in qt_blocki () from /home/#########/systemc-2.3.1/lib-linux64/libsystemc-2.3.1.so

    At each node (router), a credit_controller module spawns dynamic threads almost every cycle from a main thread, credit_init_thread():

        sc_spawn(sc_bind(&credit_ctrl::transaction_track_thread, this, current_phase_index, trans_ptr));

    After spawning this child thread, the main thread continues with the next transaction. The child thread, transaction_track_thread, is defined as a member function of the same module, credit_ctrl:

        void credit_ctrl::transaction_track_thread(unsigned phase_index, noc_trans_t* noc_trans)
        {
            // do some stuff
        }

    Although I would expect a child thread to terminate once it finishes executing, I suspect the threads are still there. My guess is that, after some simulation time, the huge number of unterminated threads causes the issue above. It is only a guess, though. I tried passing a name for each spawned thread as an argument to sc_spawn, only to get runtime warnings saying that an object with the same name already exists and that the declaration will be renamed. Not good. If the child thread terminates, its name should disappear as well. I don't know what I am missing here.
  4. My top level is a Network-on-Chip (NoC) with a grid of 15*15 nodes (router + PE). I have been trying to simulate it on different machines and configurations, but the simulation keeps stopping at different times. I can't see what causes the problem, so I am stuck.

    1) Running only the SystemC / C++ executable (the compiler output) on an LSF cluster. The simulation runs normally with the expected output, then stops at around cycle 55,000 (cycle-accurate model) with this error message:

        noc_exe: ../../../../src/sysc/kernel/sc_cor_qt.cpp:107: virtual void sc_core::sc_cor_qt::stack_protect(bool): Assertion `ret == 0' failed.
        /home/#######/.lsbatch/1438183854.772657: line 8: 15547 Aborted (core dumped) ./noc_exe

    2) Running with the Cadence irun command on an LSF cluster. The simulation was many times slower but managed to reach around 100,000 cycles before generating this error message:

        Simulation interrupted at 1025080 NS + 0
        ncsim> ncsim: *W,NCTERM: Simulation received SIGTERM signal from process 22268, user id 0 (/env/seki/app/lsf/8.0/etc/sbatchd).
        make: *** [run] Error 15

    I looked up the NCTERM error with nchelp and got:

        nchelp: 14.20-s010: (c) Copyright 1995-2015 Cadence Design Systems, Inc.
        ncsim/NCTERM = A SIGTERM signal was received by the running simulation. This signal may have been issued due to various reasons:
        * sent by the user using the kill command
        * machine on which the job was running went down
        * sent by LSF (Load Sharing Facility) to enforce certain user specified job control limits (memory, CPU, swap, etc.)

    I suspected that the stack size might not be enough for my threads, but the failures in 1) and 2) still occurred after I increased it. In 2), it is enough to add the -SC_THREAD_STACKSIZE 0x80000 switch to the irun command. In 1), I had to go to every thread registration in my constructors and add another line after it:

        SC_THREAD(controller_thread);
        set_stack_size(NOC_THREAD_STACK_SIZE); // 0x80000

    When I run the same test with a smaller number of nodes, the issue does not occur. I'd appreciate any prompt reply.
  5. The time at which a packet is generated is stored in the packet header, which allocates only 32 bits for this timestamp. For any variable of type sc_time T_start, T_start.to_default_time_unit() will return the number of clock cycles if the default time unit is configured as (CLK_PERIOD, SC_NS). The question is how to convert sc_time values to a 32-bit unsigned integer without a large loss of precision.

    At the producer side, this piece of code runs each time a packet is generated:

        sc_time t_start = sc_time_stamp();
        double magnitude = t_start.to_default_time_unit(); // number of clock cycles from the start of the simulation until now (is this correct?)
        unsigned int t_start_u_32 = (unsigned int) std::round(magnitude); // possible loss of precision if magnitude is too large to be represented in 32 bits
        // store t_start_u_32 in the corresponding packet header field

    At the consumer side (an arbiter that selects packets depending on their age):

        double t_start_ret = t_start_u_32 * CLK_PERIOD;
        sc_time t_start(t_start_ret, SC_NS);
        sc_time packet_age = sc_time_stamp() - t_start;

    Is there a good way to handle sc_time variables, the default time unit, and round/cast operations without losing precision under the 32-bit constraint?
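One possible approach to the 32-bit timestamp question, sketched in plain C++ so it can be compiled without SystemC: keep the timestamp as an integer cycle count, store only its low 32 bits in the header, and compute the age with unsigned subtraction. Because unsigned arithmetic wraps modulo 2^32, the age stays exact across a counter wrap as long as no packet is older than 2^32 cycles. The helper names (encode_cycle, age_in_cycles, demo_wraparound) and the tick values are hypothetical. In a real model the 64-bit tick counts could come from sc_time::value(), which returns the time as an integer number of resolution units, so no double rounding would be involved.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: convert an absolute time in resolution ticks to a
// cycle count and keep only the low 32 bits for the packet header.
inline std::uint32_t encode_cycle(std::uint64_t ticks, std::uint64_t ticks_per_cycle) {
    return static_cast<std::uint32_t>(ticks / ticks_per_cycle);
}

// Age in cycles. Unsigned subtraction wraps modulo 2^32, so the result is
// exact whenever the true age is below 2^32 cycles.
inline std::uint32_t age_in_cycles(std::uint32_t now_cycle, std::uint32_t start_cycle) {
    return static_cast<std::uint32_t>(now_cycle - start_cycle);
}

// Worked example: a packet created 6 cycles before the 32-bit counter wraps
// and observed 10 cycles after the wrap.
inline std::uint32_t demo_wraparound() {
    // Assumption for illustration: 1000 resolution ticks per clock cycle.
    const std::uint64_t ticks_per_cycle = 1000;
    const std::uint64_t start_ticks = (0x100000000ULL - 6) * ticks_per_cycle;
    const std::uint64_t now_ticks = (0x100000000ULL + 10) * ticks_per_cycle;

    const std::uint32_t start = encode_cycle(start_ticks, ticks_per_cycle);
    const std::uint32_t now = encode_cycle(now_ticks, ticks_per_cycle);

    // The stored values wrap, but their difference does not lose information.
    assert(start == 0xFFFFFFFAu);
    assert(now == 10u);
    return age_in_cycles(now, start);
}
```

With this scheme the arbiter would compare ages as plain 32-bit cycle counts instead of rebuilding an sc_time from the header field, so the only remaining limit is that packets must not live longer than 2^32 cycles.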