Jump to content

A few Valgrind issues


Paul Floyd

Recommended Posts

While I'm wearing my Valgrind hat, memcheck detects a few issues in the regression tests

 

  • tlm/bus_dmi
  • systemc/misc/user_guide/chpt4.4
  • systemc/datatypes/misc/concat/test07

I haven't looked at the first two. For the last one, here is a cut down version that reproduces the first error.

 

#include "systemc.h"
#include "valgrind/memcheck.h"

void checkout( const sc_unsigned* actual_p, const sc_unsigned* expected_p )
{
    cout << *actual_p <<" " << *expected_p << endl;
	cout <<"another line" << endl;
}

int sc_main(int argc, char* argv[])
{
    int high_i = 64;
    int low_i = 30;

    sc_int_base  high_int(high_i);
    sc_int_base  low_int(low_i);
    sc_signed    low_signed(low_i);

    low_int =       "0x0000000000000000";
    low_signed =    "0x0000000000000000";
    high_int =      "0xffffffffffffffff";

    cout << endl << dec << "[" << high_i << "," << low_i << "]:" << endl;

    cout << hex << "  int      " << (high_int,low_int) << " <- (  " << high_int  << " , " << low_int << " )" << endl;

    const sc_unsigned* actual_p = &(high_int,low_signed).value();
    
    {
      int i;
      std::vector<int> bits(actual_p->nbits/32+1);
      VALGRIND_GET_VBITS(actual_p->digit,bits.data(),actual_p->nbits/8+1);
      std::cout << "actual bits ";
      for (auto b : bits)
      {
         std::cout << b << " ";
      }
      std::cout << std::endl;
  }
    
    
    const sc_unsigned* expected_p = &(high_int,low_int).value();
    if ( *actual_p != *expected_p )
    {
        checkout(actual_p, expected_p);
        cout << "!!! ERROR (high_int,low_signed):" << endl;
        cout << "    expected:  " << hex << *expected_p  << endl;
        cout << "    actual:    " << hex << *actual_p << endl;
        cout << "    diff mask: " << hex << ((high_int,low_int) ^ (high_int,low_signed))  << endl;
    }

    cout << "Program completed" << endl;
    return 0;
}

A couple of things to note: I hacked the definition of sc_unsigned to make digit and nbits public so that I could access them directly. This won't compile without that change. I've added a print of Valgrind's vbits (which tells which bits have been initialized and which are uninitialized).

The output is as follows

 

64,30]:
  int      3fffffffffffffffc0000000 <- (  ffffffffffffffff , 00000000 )
actual bits c0000000 0 0 
==6726== Conditional jump or move depends on uninitialised value(s)
==6726==    at 0x4A1081B: sc_dt::vec_cmp(int, unsigned int const*, int, unsigned int const*) (src/sysc/datatypes/int/sc_nbutils.h:427)
==6726==    by 0x4A2B106: sc_dt::vec_skip_and_cmp(int, unsigned int const*, int, unsigned int const*) (src/sysc/datatypes/int/sc_nbutils.h:500)
==6726==    by 0x4A37DE0: sc_dt::compare_unsigned(int, int, int, unsigned int const*, int, int, int, unsigned int const*, int, int) (src/sysc/datatypes/int/sc_unsigned.cpp:2026)
==6726==    by 0x4A37D57: sc_dt::operator==(sc_dt::sc_unsigned const&, sc_dt::sc_unsigned const&) (src/sysc/datatypes/int/sc_unsigned.cpp:1719)
==6726==    by 0x4A3BEAC: sc_dt::operator!=(sc_dt::sc_unsigned const&, sc_dt::sc_unsigned const&) (src/sysc/datatypes/int/sc_nbcommon.inc:1930)
==6726==    by 0x2059E3: sc_main (small.cpp:97)
==6726==    by 0x4A4C579: sc_elab_and_sim (src/sysc/kernel/sc_main_main.cpp:89)
==6726==    by 0x4A4C3E1: main (src/sysc/kernel/sc_main.cpp:36)
==6726==  Uninitialised value was created by a heap allocation
==6726==    at 0x4853224: operator new[](unsigned long) (vg_replace_malloc.c:477)
==6726==    by 0x4A419DA: sc_core::sc_byte_heap::initialize(unsigned long) (src/sysc/utils/sc_temporary.h:96)
==6726==    by 0x4A41406: sc_core::sc_byte_heap::sc_byte_heap(unsigned long) (src/sysc/utils/sc_temporary.h:114)
==6726==    by 0x4A40C94: __cxx_global_var_init (src/sysc/datatypes/misc/sc_concatref.cpp:55)
==6726==    by 0x4A40D48: _GLOBAL__sub_I_sc_concatref.cpp (sc_concatref.cpp:0)
==6726==    by 0x400D2FC: ??? (in /libexec/ld-elf.so.1)
==6726==    by 0x400C03C: ??? (in /libexec/ld-elf.so.1)
==6726==    by 0x40098C8: ??? (in /libexec/ld-elf.so.1)

The first nibble of the vbits is 'c', meaning that the top two bits are uninitialized. That makes some sense as the value is '3fff..' which has the inverse pattern, 0011 binary.

I should also try MSAN to see if it finds the same error. ASAN doesn't detect uninit memory.

That is as far as I've gotten so far. No idea yet if the issue is in 'operator,'  or in 'value()'.

Edited by Paul Floyd
Mixed what asan and msan do.
Link to comment
Share on other sites

So, for the 1st item that I've looked at the most, I reproduced this on Fedora 33 amd64 with the default GCC [gcc (GCC) 10.2.1 20201125 (Red Hat 10.2.1-9)]. The SystemC library is the latest from github and regressions systemc-regressions-2.3.3 from Accellera. Valgrind is the latest release, 3.16.1. I also reproduced the same issue on FreeBSD 12.2 with the default clang (10 I believe). That was with my own build of Valgrind, a bit more recent than 3.16.1 but no significant differences on amd64.

 

The 2nd and 3rd items I also saw on FreeBSD and I'm rerunning the full set of tests on Fedora at the moment. Will update tomorrow.

 

Link to comment
Share on other sites

More details on the other two issues

systemc/misc/user_guide/chpt4.4
 

==165605== Invalid write of size 8
==165605==    at 0x49FACA6: sc_core::sc_object::orphan_child_objects() (sc_object.cpp:336)
==165605==    by 0x49F5D46: sc_core::sc_module::~sc_module() (sc_module.cpp:273)
==165605==    by 0x406CC3: stage1_2::~stage1_2() (stage1_2.h:43)
==165605==    by 0x406B95: pipeline::~pipeline() (pipeline.h:43)
==165605==    by 0x406C38: pipeline::~pipeline() (pipeline.h:43)
==165605==    by 0x49F5343: sc_core::sc_module_dynalloc_list::~sc_module_dynalloc_list() (sc_module.cpp:94)
==165605==    by 0x4EA4236: __run_exit_handlers (in /usr/lib64/libc-2.32.so)
==165605==    by 0x4EA43DF: exit (in /usr/lib64/libc-2.32.so)
==165605==    by 0x4E8C1E8: (below main) (in /usr/lib64/libc-2.32.so)
==165605==  Address 0x5377510 is 96 bytes inside a block of size 448 free'd
==165605==    at 0x483AEDD: operator delete(void*) (vg_replace_malloc.c:584)
==165605==    by 0x4071A6: stage1::~stage1() (stage1.h:43)
==165605==    by 0x49F5343: sc_core::sc_module_dynalloc_list::~sc_module_dynalloc_list() (sc_module.cpp:94)
==165605==    by 0x4EA4236: __run_exit_handlers (in /usr/lib64/libc-2.32.so)
==165605==    by 0x4EA43DF: exit (in /usr/lib64/libc-2.32.so)
==165605==    by 0x4E8C1E8: (below main) (in /usr/lib64/libc-2.32.so)
==165605==  Block was alloc'd at
==165605==    at 0x4839E7D: operator new(unsigned long) (vg_replace_malloc.c:342)
==165605==    by 0x406EAC: f_stage1(char const*, sc_core::sc_clock&, sc_core::sc_signal<double, (sc_core::sc_writer_policy)0> const&, sc_core::sc_signal<double, (sc_core::sc_writer_policy)0> const&, sc_core::sc_signal<double, (sc_core::sc_writer_policy)0>&, sc_core::sc_signal<double, (sc_core::sc_writer_policy)0>&) (stage1.cpp:66)
==165605==    by 0x406951: stage1_2::stage1_2(sc_core::sc_module_name, sc_core::sc_clock&, sc_core::sc_signal<double, (sc_core::sc_writer_policy)0> const&, sc_core::sc_signal<double, (sc_core::sc_writer_policy)0> const&, sc_core::sc_signal<double, (sc_core::sc_writer_policy)0>&, sc_core::sc_signal<double, (sc_core::sc_writer_policy)0>&) (stage1_2.h:56)
==165605==    by 0x406AA1: pipeline::pipeline(sc_core::sc_module_name, sc_core::sc_clock&, sc_core::sc_signal<double, (sc_core::sc_writer_policy)0> const&, sc_core::sc_signal<double, (sc_core::sc_writer_policy)0> const&, sc_core::sc_signal<double, (sc_core::sc_writer_policy)0>&) (pipeline.h:56)
==165605==    by 0x4067FB: f_pipeline(char const*, sc_core::sc_clock&, sc_core::sc_signal<double, (sc_core::sc_writer_policy)0> const&, sc_core::sc_signal<double, (sc_core::sc_writer_policy)0> const&, sc_core::sc_signal<double, (sc_core::sc_writer_policy)0>&) (pipeline.cpp:55)
==165605==    by 0x4048D9: sc_main (main.cpp:58)
==165605==    by 0x49F1E8C: sc_elab_and_sim (sc_main_main.cpp:89)
==165605==    by 0x49F1CCE: main (sc_main.cpp:36)

memcheck is saying that in this code [sc_object.cpp void sc_object::orphan_child_events()]

    for( ; it != end; ++it  )
    {
        (*it)->m_parent = NULL;
        simcontext()->add_child_object(*it);
    }

the child object has already been deleted, so it is writing to deleted memory.

I assume that stage1 should be destroyed after stage1_2. Should pipeline::stage1_2 be a pointer allocated with SC_NEW so that the destruction gets done in the right order? If so it's an issue with the testcase so not a very serious problem.

And the last problem

tlm/bus_dmi
 

==166842== Thread 5:
==166842== Conditional jump or move depends on uninitialised value(s)
==166842==    at 0x4278D8: tlm_utils::simple_target_socket_b<ExplicitATTarget, 32u, tlm::tlm_base_protocol_types, (sc_core::sc_port_policy)0>::bw_process::nb_transport_bw(tlm::tlm_generic_payload&, tlm::tlm_phase&, sc_core::sc_time&) (simple_target_socket.h:157)
==166842==    by 0x415358: ExplicitATTarget::beginResponse() (ExplicitATTarget.h:133)
==166842==    by 0x4A0B504: sc_core::sc_process_b::semantics() (sc_process.h:685)
==166842==    by 0x4A11C21: sc_core::sc_thread_cor_fn(void*) (sc_thread_process.cpp:117)
==166842==    by 0x49EABB0: sc_core::sc_cor_pthread::invoke_module_method(void*) (sc_cor_pthread.cpp:127)
==166842==    by 0x4E493F8: start_thread (in /usr/lib64/libpthread-2.32.so)
==166842==    by 0x4F65902: clone (in /usr/lib64/libc-2.32.so)
==166842==  Uninitialised value was created by a stack allocation
==166842==    at 0x40A8EB: sc_main (bus_dmi.cpp:37)
==166842==
==166842== Thread 1:
==166842== Conditional jump or move depends on uninitialised value(s)
==166842==    at 0x429158: tlm_utils::simple_target_socket_b<SimpleATTarget2, 32u, tlm::tlm_base_protocol_types, (sc_core::sc_port_policy)0>::bw_process::nb_transport_bw(tlm::tlm_generic_payload&, tlm::tlm_phase&, sc_core::sc_time&) (simple_target_socket.h:157)
==166842==    by 0x413A65: SimpleATTarget2::beginResponse() (SimpleATTarget2.h:145)
==166842==    by 0x4A0B504: sc_core::sc_process_b::semantics() (sc_process.h:685)
==166842==    by 0x4A0B788: sc_core::sc_method_process::run_process() (sc_method_process.h:305)
==166842==    by 0x4A0CFA3: sc_core::sc_simcontext::crunch(bool) (sc_simcontext.cpp:486)
==166842==    by 0x4A08D19: sc_core::sc_simcontext::simulate(sc_core::sc_time const&) (sc_simcontext.cpp:887)
==166842==    by 0x4A0AB61: sc_core::sc_start(sc_core::sc_time const&, sc_core::sc_starvation_policy) (sc_simcontext.cpp:1718)
==166842==    by 0x4A0AC8C: sc_core::sc_start() (sc_simcontext.cpp:1752)
==166842==    by 0x40AEBF: sc_main (bus_dmi.cpp:76)
==166842==    by 0x49F1E8C: sc_elab_and_sim (sc_main_main.cpp:89)
==166842==    by 0x49F1CCE: main (sc_main.cpp:36)
==166842==  Uninitialised value was created by a stack allocation
==166842==    at 0x40A8EB: sc_main (bus_dmi.cpp:37)
==166842==

This is a very straightforward missing initialization, fixed with the following change

 

diff --git a/src/tlm_utils/simple_target_socket.h b/src/tlm_utils/simple_target_
socket.h                                                                        
index 7e4c3a19..c3bd240f 100644       
--- a/src/tlm_utils/simple_target_socket.h
+++ b/src/tlm_utils/simple_target_socket.h
@@ -68,6 +68,7 @@ public:
    : base_type(n)
    , m_fw_process(this)
    , m_bw_process(this)
+    , m_current_transaction(NULL)
  {
    bind(m_fw_process);
  }


 

Link to comment
Share on other sites

Here is a full analysis of the first problem, and a proposed patch.

Here's what I saw running the executable under Valgrind via the GDB server (this is required to execute the mo(nitor) command).

 

  257                     result_p->digit = (sc_digit*)sc_core::sc_temp_heap.allocate(
  258                         sizeof(sc_digit)*result_p->ndigits );
  259         #if defined(_MSC_VER)
  260                     // workaround spurious initialisation issue on MS Visual C++
  261                     memset( result_p->digit, 0, sizeof(sc_digit)*result_p->ndigits );
  262         #else
  263                     result_p->digit[result_p->ndigits-1] = 0;
  264         #endif
 >265                     right_non_zero = m_right_p->concat_get_data( result_p->digit, 0 );

 

The above is copied from gdb. result_p->nbits is 95, result_p->ndigits is 4 and result_p->digit contains all zeroes

 

(gdb) x /4x result_p->digit  
0x5825bb0:      0x00000000      0x00000000      0x00000000      0x00000000

 

However, the 1st 3 words are uninitialized, only the 4th word is initialized, as seen on line 263

(gdb) mo xb 0x5825bb0 16
                 ff      ff      ff      ff      ff      ff      ff      ff
0x5825BB0:      0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
                 ff      ff      ff      ff      00      00      00      00
0x5825BB8:      0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00

 

Inside concat_get_data we execute

>337                     dst_p[dst_i] = dst_p[dst_i] & ~mask;
 

Where dst_i is 0, mask is 0x3fffffff, dst_p[dst_i] is all uninitialized

After this assignment, the lower 30 bits are initialized but the top 2 bits are still uninitialized:

(gdb) p &dst_p[dst_i]
$27 = (sc_dt::sc_digit *) 0x5825bb0
(gdb) mo xb 0x5825bb0 4
                 00      00      00      c0
0x5825BB0:      0x00    0x00    0x00    0x00

However, I don't think that the problem is here. The next call

  266                     left_non_zero = m_left_p->concat_get_data(result_p->digit, m_len_r);

is where I reckon those two bits ought to be written to, but aren't. Watch this space.

Update:

In the second call to concat_get_data, the first word of dst_p doesn't get written (since dst_i is 1 rather than 0). So in summary, the 1st call to concat_get_data doesn't initialize all of word 0 of dst_p but then the 2nd call assumes that it was all initialized.

For comparison, the concatenation with 29 bits behaves in a similar fashion to 30 bits for the 1st call, bit the second call has dst_i set to 0 so word 0 does get fully initialized. In the case of 31 bits, a different path is taken in in the first call to concat_get_data (sc_unsigned.cpp around line 352) which writes 2 whole words rather than just part of 1 word.

 

I suggest as a fix changing this block

 

        // ALL DATA TO BE MOVED IS IN A SINGLE WORD:

 
        if ( dst_i == end_i )
        {
            mask = ~(~0U << real_bits) << left_shift;
            dst_p[dst_i] = dst_p[dst_i] & ~mask;
        }

 

to be

 

        // ALL DATA TO BE MOVED IS IN A SINGLE WORD:

 
        if ( dst_i == end_i )
        {
            if ( left_shift == 0 )
            {
                dst_p[dst_i] = 0;
            }
            else
            {
                mask = ~(~0U << real_bits) << left_shift;
                dst_p[dst_i] = dst_p[dst_i] & ~mask;
            }
        }

(and the same thing for signed)

It is not easy to test this thoroughly. The problem is with the circular buffer, which reuses memory. The re-used memory never becomes uninitialized, so to check that I made this change

 

+#include "/usr/home/paulf/tools/valgrind/include/valgrind/memcheck.h"
+
template<class T> T* sc_vpool<T>::allocate()
{
       T* result_p;    // Entry to return.
 
       result_p = &m_pool_p[m_pool_i];
+
+   VALGRIND_MAKE_MEM_UNDEFINED(result_p, sizeof(T));
+
       m_pool_i = (m_pool_i + 1) & m_wrap;
       return result_p;
}

 

I've attached a patch that seems to solve the issue.

 

Lastly, this can also probably be changed

#if defined(_MSC_VER)
            // workaround spurious initialisation issue on MS Visual C++
            memset( result_p->digit, 0, sizeof(sc_digit)*result_p->ndigits );
#else
            result_p->digit[result_p->ndigits-1] = 0;
#endif

The comment is al least half wrong - it is a genuine workaround. With the attached patch the memset should not be necessary.

concat_init.patch

Link to comment
Share on other sites

  • 1 month later...

The believe the issue here is that the current proof of concept simulator uses the low order 30 bits of each 32-bit word to store sc_unsigned or sc_signed values. The masks generated to store bits into the sc_unsigned value returned from sc_concatref::value() should take that into account and at present they do not. This code in sc_conatref::value() handles the very highest order word, which may need some number of zero bits at the top of it, if that top concatenation word is not a full 30 bits:

 

#if defined(_MSC_VER)

            // workaround spurious initialisation issue on MS Visual C++

            memset( result_p->digit, 0, sizeof(sc_digit)*result_p->ndigits );

#else

            result_p->digit[result_p->ndigits-1] = 0;

#endif

            right_non_zero = m_right_p->concat_get_data( result_p->digit, 0 );

            left_non_zero = m_left_p->concat_get_data(result_p->digit, m_len_r);

What this does not take into account is the high order two bits in words below that highest word. Those bits need to be zeroed in the various concat_get_data() methods, and they are not. I believe that is why the MS Visual C++ work-around was necessary.  Thus, the solution to this is to modify the various concat_get_data() methods to include the masking of those bits. 

I expect the proof of concept simulator to change in the near future to use an implementation that uses all 32 bits in a 32-bit word to store the sc_signed and sc_unsigned values, at that point the zeroing of the high order digit before making the concat_get_data() calls will be sufficient using the existing masks.

Link to comment
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...