
Likely memory corruption by SystemC


Oscar_Huang


First of all, how do I report a bug to Accellera?

Recently I ported one of our old models to CentOS 7. The build was fine, but when running it I got this error during the initialization stage:

../../../../src/sysc/kernel/sc_main.cpp:55: int Check_mprotect(void*, size_t, int): Assertion `ret == 0' failed.
Aborted (core dumped)

After days of debugging, I managed to narrow the problem down to SystemC itself, and I can reproduce the failure with a very simple program:

#include <sys/mman.h>
#include <sys/types.h>
#include <errno.h>
#include <stdio.h>
#include <assert.h>

static int Check_mprotect(void *p, size_t sz, int line)
{
    int pagesize = 4096;
    caddr_t redzone = caddr_t( ( ( size_t( p ) + pagesize - 1 ) /
                          pagesize ) * pagesize );
    int ret;
    ret = mprotect( redzone, pagesize - 1, PROT_READ|PROT_WRITE);
    if(ret) ret = -3;
    if(!ret)
        ret = mprotect( redzone, pagesize - 1, PROT_NONE );
    if(!ret){
        ret = mprotect( redzone, pagesize - 1, PROT_READ|PROT_WRITE);
        if(ret) ret = -2;
    }

    printf("mprotect check done at %d: ret: %d, errno: %d, p: %p, redzone: %p\n", line, ret, errno, p, redzone);
    assert(ret == 0);
    return ret;

}

static int CHECK_MPROTECT(int line)
{
    char *p;
    size_t sz = 0x200000;
    p = new char[sz];
    int ret = Check_mprotect(p, sz, line);
    delete []p;
    if(!ret)
        printf("@@@@ mprotect check pass at Line %d\n", line);
    return ret;
}

extern "C" int sc_main(int sc_argc, char *sc_argv[])
{
        printf("Customer sc_main called\n");
        CHECK_MPROTECT(__LINE__);
        CHECK_MPROTECT(__LINE__);
        CHECK_MPROTECT(__LINE__);
        return 0;  // sc_main returns int; unlike main(), falling off the end is undefined
}

It always fails (assert(ret == 0)) from the 2nd CHECK_MPROTECT call onward, with ret being -2.

If I remove the SC_REPORT_FATAL line from the definition of the SC_API_PERFORM_CHECK macro in src/sysc/kernel/sc_ver.cpp, the program runs through. It seems some global or static variable initialization routines have corrupted memory, but this is just my guess. The same test program runs fine on Ubuntu 18.04.

Here is my makefile:

mini.x: my_main.o
        gcc ../../lib/libsystemc.a -lstdc++ -lm -lpthread  -o $@ $<
my_main.o: my_main.cpp
        gcc -I../../include -c -o $@ $<

.PHONY: clean
clean:
        rm -f my_main.o mini.x


My SystemC is v2.3.3 (but I observed the same issue with 2.3.1a too). I tried both gcc 4.8.5 and gcc 7.3 on CentOS 7.

I would much appreciate it if anybody could help me out.

 

 

Thanks

-Oscar

 


You can report a bug on this forum. Or on github https://github.com/accellera-official/systemc/issues

../../../../src/sysc/kernel/sc_main.cpp:55: int Check_mprotect(void*, size_t, int): Assertion `ret == 0' failed.

I don't see Check_mprotect in sc_main.cpp. Could it be that you have modified the SystemC kernel, or that it is a non-Accellera one? If so, you should probably contact your SystemC vendor's support.

Quote



extern "C" int sc_main(int sc_argc, char *sc_argv[])
{
        printf("Customer sc_main called\n");
        CHECK_MPROTECT(__LINE__);
        CHECK_MPROTECT(__LINE__);
        CHECK_MPROTECT(__LINE__);
}

it always fails (assert(ret==0)) from the 2nd CHECK_MPROTECT call with ret being -2.

If I remove the SC_REPORT_FATAL line from the macro definition SC_API_PERFORM_CHECK  in file src/sysc/kernel/sc_ver.cpp, the program can run through.

 

Why did you put extern "C" there? SystemC is C++, and C does not support exceptions. SC_REPORT_FATAL throws an exception, and once it reaches your sc_main, what happens next is UB (undefined behavior).

Quote

SC_API_PERFORM_CHECK  

SC_API_PERFORM_CHECK checks whether the SystemC library and the user application were compiled with the same options and the same C++ standard. It looks like the exception is thrown when SC_DEFAULT_WRITER_POLICY is configured differently.

Quote

Anybody used SystemC on Centos 7?

CentOS 7 is supposed to be equivalent to RHEL 7, and RHEL is often used for SystemC development.


Hello, Roman,

Sorry for the confusion. Yes, I had modified the kernel (sc_main.cpp) by adding two CHECK_MPROTECT calls to the main function, just to check whether the failure could happen even earlier. And of course, this is SystemC from Accellera.

The reason for using extern "C" on the sc_main function is that I didn't include the systemc header in my C++ source file. sc_main is actually declared as extern "C" in sysc/kernel/sc_externs.h.
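On the extern "C" question, a minimal sketch may help. It assumes GCC or Clang; demo_entry, call_demo and demo_checks are hypothetical names, not SystemC functions. It illustrates that extern "C" only selects C linkage for the symbol, while the function body is still compiled as C++, so an exception thrown inside it can be caught by a C++ caller:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// extern "C" disables C++ name mangling for this symbol; the body below is
// still ordinary C++ code. (Hypothetical function, for illustration only.)
extern "C" int demo_entry(int value)
{
    if (value < 0)
        throw std::runtime_error("negative value");
    return value;
}

// Call the extern "C" function from C++ and report what happened.
static std::string call_demo(int value)
{
    try {
        demo_entry(value);
        return "ok";
    } catch (const std::exception &e) {
        return e.what();
    }
}

// Small self-check of both paths.
static void demo_checks()
{
    assert(call_demo(1) == "ok");
    assert(call_demo(-1) == "negative value");
}
```

Calling demo_checks() from any C++ entry point exercises both paths. If the assertions hold on the toolchain in question, the linkage specification alone is unlikely to explain the mprotect failure.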

I reproduced the issue on Centos  8 as well.

If I copy the same binary to Ubuntu 18, it runs through without any problem.

Further debugging shows that as long as your app links in any function from sc_simcontext.cpp, for example by adding a call to sc_core::sc_get_curr_simcontext(), the mprotect check fails.

 

Thanks,

-Oscar


Thanks for trying my example, Roman.

The original issue was the assert failure at Line 109 in file sysc/kernel/sc_cor_qt.cpp:

    int ret;

    // Enable the red zone at the end of the stack so that references within
    // it will cause an interrupt.

    if( enable ) { 
        ret = mprotect( redzone, pagesize - 1, PROT_NONE );
    }   

    // Revert the red zone to normal memory usage. Try to make it read - write -
    // execute. If that does not work then settle for read - write

    else {
        ret = mprotect( redzone, pagesize - 1, PROT_READ|PROT_WRITE|PROT_EXEC);
        if ( ret != 0 ) 
            ret = mprotect( redzone, pagesize - 1, PROT_READ | PROT_WRITE );
    }   

    sc_assert( ret == 0 ); //<<--Failed here

My example only reproduces this issue; it does nothing meaningful beyond that.

I googled mprotect a bit. There is a thread saying mprotect may not be reliable on malloc'd memory. See this thread.

I tried with mmap, and my example ran without the issue. The POSIX manual page for mprotect states: "The behavior of this function is unspecified if the mapping was not established by a call to mmap()." (Run "man 3p mprotect" on CentOS.) If this applies to CentOS, using mprotect in SystemC might be problematic, because the stack memory is allocated with "new", not mmap().
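The mmap variant can be sketched roughly as follows. This is a minimal illustration, not the SystemC implementation: alloc_stack_mmap, set_redzone and check_mmap_redzone are hypothetical helpers, and the red zone is assumed to be the first page of the mapping.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: obtain the stack memory from an anonymous private
// mapping, so the region is guaranteed to have been established by mmap().
static void *alloc_stack_mmap(size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return (p == MAP_FAILED) ? NULL : p;
}

// Enable or disable a one-page red zone at the start of the mapping.
static int set_redzone(void *stack, bool enable)
{
    const size_t pagesize = static_cast<size_t>(sysconf(_SC_PAGESIZE));
    return mprotect(stack, pagesize,
                    enable ? PROT_NONE : (PROT_READ | PROT_WRITE));
}

// Returns 0 if mapping, protecting, unprotecting and unmapping all succeed.
static int check_mmap_redzone(size_t size)
{
    assert(size >= static_cast<size_t>(sysconf(_SC_PAGESIZE)));
    void *stack = alloc_stack_mmap(size);
    if (stack == NULL) { perror("mmap"); return -1; }

    int ret = set_redzone(stack, true);
    if (ret == 0) ret = set_redzone(stack, false);
    if (ret != 0) perror("mprotect");

    if (munmap(stack, size) != 0) { perror("munmap"); ret = -1; }
    return ret;
}
```

Calling check_mmap_redzone(0x200000) in place of the new[]-based CHECK_MPROTECT gives a like-for-like comparison with the failing test.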

 


Quote

There is a thread saying mprotect may not be reliable on malloc'd memory. See this thread.

Indeed, quote from that thread:

Quote

 Longer version: Small malloc()s tend to end up in BSS, and they might overlap with pages mapped from the ELF executable (as rw-p in /proc/self/maps), so you cannot reliably use mprotect() on malloc()'d memory. 

So it could be that the SystemC coroutine stack was allocated in BSS and setting mprotect on the red zone failed. The difference between CentOS and Ubuntu could be explained by a difference in the malloc implementation. So this could indeed be a bug in SystemC.

Can you comment out the assert in the stack_protect function in the SystemC kernel, rebuild it, rebuild your application, and check whether it works?


I did one more experiment. If you examine the p values:

Customer sc_main called
mprotect check done at 50: ret: 0, errno: 0, p: 0x7f404a099010, redzone: 0x7f404a09a000
@@@@ mprotect check pass at Line 50
mprotect check done at 51: ret: -2, errno: 13, p: 0x214cbe0, redzone: 0x214d000

It looks like the test passes when p is allocated at the upper end of the address space (p is 0x7f404a099010) and fails when p is allocated at the lower end (p is 0x214cbe0).

To confirm this, I changed sz to a smaller value, 4096*2. Then the first CHECK_MPROTECT failed as well:

Customer sc_main called
mprotect check done at 50: ret: -2, errno: 13, p: 0xf50be0, redzone: 0xf51000

I have no idea how CentOS handles memory allocation, but this shows that the location of the allocated buffer matters for mprotect.
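One possible explanation for the two address ranges, assuming glibc's malloc: requests at or above the mmap threshold (128 KiB by default) are served by mmap, and freeing such a block raises the threshold dynamically, so the next request of the same size can come from the brk heap instead. The sketch below is only a heuristic probe under that assumption; below_program_break, report_allocation and show_threshold_effect are hypothetical names.

```cpp
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

// Heuristic, assuming Linux: the brk heap ends at the current program break,
// while mmap'd blocks (and the process stack) live far above it.
static bool below_program_break(const void *p)
{
    return reinterpret_cast<uintptr_t>(p) <
           reinterpret_cast<uintptr_t>(sbrk(0));
}

// Allocate a block, report where it landed, and free it again.
// Returns 1 if it was below the program break, 0 if above, -1 on failure.
static int report_allocation(const char *label, size_t size)
{
    assert(size > 0);
    void *p = malloc(size);
    if (p == NULL) return -1;
    const int in_heap = below_program_break(p) ? 1 : 0;
    printf("%s: %lu bytes at %p, %s\n", label,
           static_cast<unsigned long>(size), p,
           in_heap ? "below the program break (brk heap)"
                   : "above the program break (likely mmap)");
    free(p);
    return in_heap;
}

// Mirrors the sizes used in the experiment above.
static void show_threshold_effect()
{
    report_allocation("first  2 MiB", 0x200000);
    report_allocation("second 2 MiB", 0x200000);
    report_allocation("8 KiB", 4096 * 2);
}
```

Running show_threshold_effect() on the affected machine would show whether the failing allocations are the ones that land in the brk heap, which is what the addresses in the log suggest.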

 


One correction to my first post: to toggle the failure, in addition to commenting out the SC_REPORT_FATAL call in the definition of the SC_API_PERFORM_CHECK_ macro in sysc/kernel/sc_ver.cpp, the exception handling code in the sc_elab_and_sim() function also needs to be commented out. I did it like this:

    try
    {
        pln();

        // Perform initialization here
        sc_in_action = true;

        // copy array of pointers to keep allocated pointers for later release
        std::vector<char*> argv_call = argv_copy;
        status = sc_main( argc, &argv_call[0] );

        // Perform cleanup here
        sc_in_action = false;
    }
    catch(...){}
#if 0
    catch( const sc_report& x )
    {
        sc_report_handler::get_handler()
            ( x, sc_report_handler::get_catch_actions() );
    }
    catch( ... )
    {
        // translate other escaping exceptions
        sc_report*  err_p = sc_handle_exception();
        if( err_p )
            sc_report_handler::get_handler()
                ( *err_p, sc_report_handler::get_catch_actions() );
        delete err_p;
    }

    for ( int i = 0; i < argc; ++i ) {
        delete[] argv_copy[i];
    }

    // IF DEPRECATION WARNINGS WERE ISSUED TELL THE USER HOW TO TURN THEM OFF

    if ( sc_report_handler::get_count( SC_ID_IEEE_1666_DEPRECATION_ ) > 0 )
    {
        std::stringstream ss;

        const char MSGNL[] = "\n             ";
        const char CODENL[] = "\n  ";

        ss << "You can turn off warnings about" << MSGNL
           << "IEEE 1666 deprecated features by placing this method call" << MSGNL
           << "as the first statement in your sc_main() function:\n" << CODENL
           << "sc_core::sc_report_handler::set_actions( "
           << "\"" << SC_ID_IEEE_1666_DEPRECATION_ << "\"," << CODENL
           << "                                         " /* indent param */
           << "sc_core::SC_DO_NOTHING );"
           << std::endl;

        SC_REPORT_INFO( SC_ID_IEEE_1666_DEPRECATION_, ss.str().c_str() );
    }
#endif

 


I checked the /proc/<pid>/maps file of my example program. Most of the memory segments have the "x" flag:

00400000-0045f000 r-xp 00000000 fd:02 557832807                          /home/ohuang/data/software/mpfail/systemc-2.3.3/SystemC/examples/mini/mini_fail.x
0065f000-00661000 r-xp 0005f000 fd:02 557832807                          /home/ohuang/data/software/mpfail/systemc-2.3.3/SystemC/examples/mini/mini_fail.x
00661000-00665000 rwxp 00061000 fd:02 557832807                          /home/ohuang/data/software/mpfail/systemc-2.3.3/SystemC/examples/mini/mini_fail.x
00665000-00666000 rwxp 00000000 00:00 0
01447000-01449000 rwxp 00000000 00:00 0                                  [heap]
01449000-0144a000 ---p 00000000 00:00 0                                  [heap]
0144a000-01669000 rwxp 00000000 00:00 0                                  [heap]
7f2f8ecbd000-7f2f8ee80000 r-xp 00000000 fd:00 33648244                   /usr/lib64/libc-2.17.so
7f2f8ee80000-7f2f8f080000 ---p 001c3000 fd:00 33648244                   /usr/lib64/libc-2.17.so
7f2f8f080000-7f2f8f084000 r-xp 001c3000 fd:00 33648244                   /usr/lib64/libc-2.17.so
7f2f8f084000-7f2f8f086000 rwxp 001c7000 fd:00 33648244                   /usr/lib64/libc-2.17.so
7f2f8f086000-7f2f8f08b000 rwxp 00000000 00:00 0
7f2f8f08b000-7f2f8f0a0000 r-xp 00000000 fd:00 33625636                   /usr/lib64/libgcc_s-4.8.5-20150702.so.1
7f2f8f0a0000-7f2f8f29f000 ---p 00015000 fd:00 33625636                   /usr/lib64/libgcc_s-4.8.5-20150702.so.1
7f2f8f29f000-7f2f8f2a0000 r-xp 00014000 fd:00 33625636                   /usr/lib64/libgcc_s-4.8.5-20150702.so.1
7f2f8f2a0000-7f2f8f2a1000 rwxp 00015000 fd:00 33625636                   /usr/lib64/libgcc_s-4.8.5-20150702.so.1
7f2f8f2a1000-7f2f8f2b8000 r-xp 00000000 fd:00 33743401                   /usr/lib64/libpthread-2.17.so
7f2f8f2b8000-7f2f8f4b7000 ---p 00017000 fd:00 33743401                   /usr/lib64/libpthread-2.17.so
7f2f8f4b7000-7f2f8f4b8000 r-xp 00016000 fd:00 33743401                   /usr/lib64/libpthread-2.17.so
7f2f8f4b8000-7f2f8f4b9000 rwxp 00017000 fd:00 33743401                   /usr/lib64/libpthread-2.17.so
7f2f8f4b9000-7f2f8f4bd000 rwxp 00000000 00:00 0
7f2f8f4bd000-7f2f8f5be000 r-xp 00000000 fd:00 33648254                   /usr/lib64/libm-2.17.so
7f2f8f5be000-7f2f8f7bd000 ---p 00101000 fd:00 33648254                   /usr/lib64/libm-2.17.so
7f2f8f7bd000-7f2f8f7be000 r-xp 00100000 fd:00 33648254                   /usr/lib64/libm-2.17.so
7f2f8f7be000-7f2f8f7bf000 rwxp 00101000 fd:00 33648254                   /usr/lib64/libm-2.17.so
7f2f8f7bf000-7f2f8f8a8000 r-xp 00000000 fd:00 33743422                   /usr/lib64/libstdc++.so.6.0.19
7f2f8f8a8000-7f2f8faa7000 ---p 000e9000 fd:00 33743422                   /usr/lib64/libstdc++.so.6.0.19
7f2f8faa7000-7f2f8faaf000 r-xp 000e8000 fd:00 33743422                   /usr/lib64/libstdc++.so.6.0.19
7f2f8faaf000-7f2f8fab1000 rwxp 000f0000 fd:00 33743422                   /usr/lib64/libstdc++.so.6.0.19
7f2f8fab1000-7f2f8fac6000 rwxp 00000000 00:00 0
7f2f8fac6000-7f2f8fae8000 r-xp 00000000 fd:00 33635267                   /usr/lib64/ld-2.17.so
7f2f8fcb5000-7f2f8fcba000 rwxp 00000000 00:00 0
7f2f8fce5000-7f2f8fce7000 rwxp 00000000 00:00 0
7f2f8fce7000-7f2f8fce8000 r-xp 00021000 fd:00 33635267                   /usr/lib64/ld-2.17.so
7f2f8fce8000-7f2f8fce9000 rwxp 00022000 fd:00 33635267                   /usr/lib64/ld-2.17.so
7f2f8fce9000-7f2f8fcea000 rwxp 00000000 00:00 0
7fff13e46000-7fff13e67000 rwxp 00000000 00:00 0                          [stack]
7fff13f87000-7fff13f8a000 r--p 00000000 00:00 0                          [vvar]
7fff13f8a000-7fff13f8c000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

If I modify the SystemC kernel as described in my previous post to make the test pass, only a few segments have the "x" flag, which looks reasonable to me:

00400000-00404000 r-xp 00000000 fd:02 557832828                          /home/ohuang/data/software/mpfail/systemc-2.3.3/SystemC/examples/mini/mini.x
00603000-00604000 r--p 00003000 fd:02 557832828                          /home/ohuang/data/software/mpfail/systemc-2.3.3/SystemC/examples/mini/mini.x
00604000-00605000 rw-p 00004000 fd:02 557832828                          /home/ohuang/data/software/mpfail/systemc-2.3.3/SystemC/examples/mini/mini.x
01c17000-01c18000 rw-p 00000000 00:00 0                                  [heap]
01c18000-01c19000 r--p 00000000 00:00 0                                  [heap]
01c19000-01e38000 rw-p 00000000 00:00 0                                  [heap]
7f3e14aa5000-7f3e14c68000 r-xp 00000000 fd:00 33648244                   /usr/lib64/libc-2.17.so
7f3e14c68000-7f3e14e68000 ---p 001c3000 fd:00 33648244                   /usr/lib64/libc-2.17.so
7f3e14e68000-7f3e14e6c000 r--p 001c3000 fd:00 33648244                   /usr/lib64/libc-2.17.so
7f3e14e6c000-7f3e14e6e000 rw-p 001c7000 fd:00 33648244                   /usr/lib64/libc-2.17.so
7f3e14e6e000-7f3e14e73000 rw-p 00000000 00:00 0
7f3e14e73000-7f3e14e88000 r-xp 00000000 fd:00 33625636                   /usr/lib64/libgcc_s-4.8.5-20150702.so.1
7f3e14e88000-7f3e15087000 ---p 00015000 fd:00 33625636                   /usr/lib64/libgcc_s-4.8.5-20150702.so.1
7f3e15087000-7f3e15088000 r--p 00014000 fd:00 33625636                   /usr/lib64/libgcc_s-4.8.5-20150702.so.1
7f3e15088000-7f3e15089000 rw-p 00015000 fd:00 33625636                   /usr/lib64/libgcc_s-4.8.5-20150702.so.1
7f3e15089000-7f3e150a0000 r-xp 00000000 fd:00 33743401                   /usr/lib64/libpthread-2.17.so
7f3e150a0000-7f3e1529f000 ---p 00017000 fd:00 33743401                   /usr/lib64/libpthread-2.17.so
7f3e1529f000-7f3e152a0000 r--p 00016000 fd:00 33743401                   /usr/lib64/libpthread-2.17.so
7f3e152a0000-7f3e152a1000 rw-p 00017000 fd:00 33743401                   /usr/lib64/libpthread-2.17.so
7f3e152a1000-7f3e152a5000 rw-p 00000000 00:00 0
7f3e152a5000-7f3e153a6000 r-xp 00000000 fd:00 33648254                   /usr/lib64/libm-2.17.so
7f3e153a6000-7f3e155a5000 ---p 00101000 fd:00 33648254                   /usr/lib64/libm-2.17.so
7f3e155a5000-7f3e155a6000 r--p 00100000 fd:00 33648254                   /usr/lib64/libm-2.17.so
7f3e155a6000-7f3e155a7000 rw-p 00101000 fd:00 33648254                   /usr/lib64/libm-2.17.so
7f3e155a7000-7f3e15690000 r-xp 00000000 fd:00 33743422                   /usr/lib64/libstdc++.so.6.0.19
7f3e15690000-7f3e1588f000 ---p 000e9000 fd:00 33743422                   /usr/lib64/libstdc++.so.6.0.19
7f3e1588f000-7f3e15897000 r--p 000e8000 fd:00 33743422                   /usr/lib64/libstdc++.so.6.0.19
7f3e15897000-7f3e15899000 rw-p 000f0000 fd:00 33743422                   /usr/lib64/libstdc++.so.6.0.19
7f3e15899000-7f3e158ae000 rw-p 00000000 00:00 0
7f3e158ae000-7f3e158d0000 r-xp 00000000 fd:00 33635267                   /usr/lib64/ld-2.17.so
7f3e15a9d000-7f3e15aa2000 rw-p 00000000 00:00 0
7f3e15acd000-7f3e15acf000 rw-p 00000000 00:00 0
7f3e15acf000-7f3e15ad0000 r--p 00021000 fd:00 33635267                   /usr/lib64/ld-2.17.so
7f3e15ad0000-7f3e15ad1000 rw-p 00022000 fd:00 33635267                   /usr/lib64/ld-2.17.so
7f3e15ad1000-7f3e15ad2000 rw-p 00000000 00:00 0
7ffecb358000-7ffecb379000 rw-p 00000000 00:00 0                          [stack]
7ffecb3d6000-7ffecb3d9000 r--p 00000000 00:00 0                          [vvar]
7ffecb3d9000-7ffecb3db000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

Why would linking in SystemC objects lead to this difference? I have no idea what it implies; I am just sharing my observation.
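The thread does not establish the reason. One hypothesis worth checking, stated here as an assumption rather than a finding, is that the statically linked binary is marked as needing an executable stack. On x86 Linux kernels of that era this sets the READ_IMPLIES_EXEC personality flag, which makes every readable mapping executable and would match the first listing. The sketch below queries that flag from inside the process; read_implies_exec_state, describe_state and print_persona_hint are hypothetical helper names.

```cpp
#include <assert.h>
#include <stdio.h>
#include <sys/personality.h>

// Returns 1 if READ_IMPLIES_EXEC is set for this process, 0 if it is not,
// and -1 if the query itself failed.
static int read_implies_exec_state()
{
    // Passing 0xffffffff retrieves the current persona without changing it.
    const int persona = personality(0xffffffff);
    if (persona == -1) return -1;
    return (persona & READ_IMPLIES_EXEC) ? 1 : 0;
}

// Map the state code to a readable message.
static const char *describe_state(int state)
{
    assert(state >= -1 && state <= 1);
    if (state == 1) return "READ_IMPLIES_EXEC is set";
    if (state == 0) return "READ_IMPLIES_EXEC is not set";
    return "personality query failed";
}

static void print_persona_hint()
{
    printf("%s\n", describe_state(read_implies_exec_state()));
}
```

Calling print_persona_hint() at the top of sc_main in both the failing and the passing binary would show whether the flag differs between them. The GNU_STACK line printed by "readelf -lW" on each executable is another place to look.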


5 minutes ago, Roman Popov said:

I have no explanation. Can you please write down the steps you use to build the SystemC library? I will try to install CentOS 7 in a VM and reproduce your steps.

Also, can you run make check after the build to see whether only your example fails, or whether some bundled examples fail too?


My steps to build the SystemC library:

1. Downloaded the systemc package

2. untar it to systemc-2.3.3

3. cd to systemc-2.3.3

4. mkdir objs

5. mkdir target_dir

6. cd to objs

7. run ../configure --enable-debug  --disable-optimize --with-unix-layout --prefix=../target_dir

8. make -j16

9. make install

I ran make check, and all tests passed.


The issue occurs only with the static library libsystemc.a. If you use the shared library, there is no problem. My example used libsystemc.a.

If you configure with the option --enable-shared=no and rebuild, then "make check" fails with exactly the same assert error as seen with my example. Here is my test log file:

=================================================
   SystemC 2.3.3: examples/sysc/test-suite.log
=================================================

# TOTAL: 22
# PASS:  21
# SKIP:  0
# XFAIL: 0
# FAIL:  1
# XPASS: 0
# ERROR: 0

.. contents:: :depth: 2

FAIL: 2.1/forkjoin/test.sh
==========================


        SystemC 2.3.3-Accellera --- Oct 25 2019 14:07:22
        Copyright (c) 1996-2018 by all Contributors,
        ALL RIGHTS RESERVED
test.sh: line 65: 30949 Aborted                 (core dumped) ./test > run.log
***ERROR:
30,32d29
< Fatal: (F4) assertion failed: ret == 0
< In file: ../../../src/sysc/kernel/sc_cor_qt.cpp:109
< Info: (I99) simulation aborted

 


On 10/24/2019 at 11:15 PM, Oscar_Huang said:

The issue occurs only with the static library libsystemc.a. If you use the shared library, there is no problem. My example used libsystemc.a.

If you configure with the option --enable-shared=no and rebuild, then "make check" fails with exactly the same assert error as seen with my example. Here is my test log file:

I've reproduced the issue on CentOS 7. Preliminarily, it looks like a misuse of mprotect on memory allocated with new. So as a workaround, commenting out the stack protection section should work.

Since I don't have enough Linux system programming expertise, I will bring it to the Accellera WG for discussion before submitting a fix to the SystemC repo.

Thanks a lot for reporting this and spending your time on investigation!


Thanks for the report.  I don't fully understand the summary, though.  Some questions:

Are the changes to the API check and/or the exception handling in sc_simcontext.cpp really needed? I would hope that removing/skipping the assert in sc_cor_qt.cpp is sufficient to work around the mprotect restrictions on CentOS 7+?

Your instructions do not include --enable-shared=no, but your description says that you only(?) see it on a static SystemC library. Can you please clarify?

To better understand the failing mprotect call in your environment, can you please provide the value of errno after the call?  This can be obtained by adding something like:

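#include <errno.h>
#include <stdio.h>   // printf
#include <string.h>  // strerror
// ...

ret = mprotect( ... );
if( ret != 0 ) {
  int mprotect_errno = errno;  // save errno before another call can overwrite it
  printf( "mprotect errno: %d, %s\n", mprotect_errno, strerror(mprotect_errno) );
}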

 

IIRC, Linux generally allows calling mprotect on allocated memory.  The memory needs to be properly aligned at a page boundary, of course.  One option might be to allocate the stack memory via posix_memalign (if available) instead of new.  We can also change the implementation to gracefully ignore a failing protection and only restore the permissions if the mprotect(...,PROT_NONE) call was successful earlier.
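The two options above can be sketched together as follows. This is only an illustration under stated assumptions, not the SystemC implementation: guarded_stack and its helper functions are hypothetical names, the red zone is assumed to be the first page of the block, and the stack size is assumed to be a multiple of the page size.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical stand-in for a coroutine stack.
struct guarded_stack {
    void  *base;             // page-aligned start of the stack memory
    size_t size;             // total size in bytes
    bool   red_zone_active;  // true only while the red zone is PROT_NONE
};

static size_t page_size()
{
    return static_cast<size_t>(sysconf(_SC_PAGESIZE));
}

// Allocate page-aligned memory with posix_memalign instead of new.
static bool stack_create(guarded_stack &s, size_t size)
{
    assert(size >= page_size() && size % page_size() == 0);
    s.base = NULL;
    s.size = size;
    s.red_zone_active = false;
    return posix_memalign(&s.base, page_size(), size) == 0;
}

// A failing mprotect is tolerated: the stack then simply runs unguarded.
// Permissions are restored only if the PROT_NONE call succeeded earlier.
static void stack_protect(guarded_stack &s, bool enable)
{
    if (enable) {
        s.red_zone_active = (mprotect(s.base, page_size(), PROT_NONE) == 0);
    } else if (s.red_zone_active) {
        if (mprotect(s.base, page_size(), PROT_READ | PROT_WRITE) == 0)
            s.red_zone_active = false;
    }
}

static void stack_destroy(guarded_stack &s)
{
    stack_protect(s, false);
    // If the red zone could not be restored, the block is leaked on purpose:
    // handing an inaccessible page back to the allocator could crash later.
    if (!s.red_zone_active)
        free(s.base);
    s.base = NULL;
}
```

The design choice here is to treat the red zone as best-effort: a failed mprotect never aborts the simulation, at the cost of losing stack-overflow detection for that coroutine.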

This brings me to my remaining question: Which one of the mprotect calls actually fails: Is it the one removing the protection (PROT_NONE) or the one trying to restore them?

Thanks,
  Philipp


@Philipp A Hartmann

7 hours ago, Philipp A Hartmann said:

Are the changes to the API check and/or the exception handling in sc_simcontext.cpp really needed? I would hope that removing/skipping the assert in sc_cor_qt.cpp is sufficient to work around the mprotect restrictions on CentOS 7+?

As a workaround, removing/skipping the assert in sc_cor_qt.cpp is sufficient. The changes made in sc_simcontext.cpp were for debugging the issue.

7 hours ago, Philipp A Hartmann said:

Your instructions do not include --enable-shared=no, but your description says that you only(?) see it on a static SystemC library. Can you please clarify?

In my simple example for reproducing the issue, the makefile explicitly uses the static library, so it doesn't matter whether --enable-shared=no is used or not. To reproduce the issue with the examples that come with the SystemC package, you have to use --enable-shared=no, because otherwise the shared library will be used.

7 hours ago, Philipp A Hartmann said:

To better understand the failing mprotect call in your environment, can you please provide the value of errno after the call?  This can be obtained by adding something like:

The errno is 13 (EACCES 13 Permission denied)

7 hours ago, Philipp A Hartmann said:

IIRC, Linux generally allows calling mprotect on allocated memory.  The memory needs to be properly aligned at a page boundary, of course.  One option might be to allocate the stack memory via posix_memalign (if available) instead of new.  We can also change the implementation to gracefully ignore a failing protection and only restore the permissions if the mprotect(...,PROT_NONE) call was successful earlier

I tried both posix_memalign and mmap; both work.

7 hours ago, Philipp A Hartmann said:

This brings me to my remaining question: Which one of the mprotect calls actually fails: Is it the one removing the protection (PROT_NONE) or the one trying to restore them?

The call that restores the protection flags is the one that fails.


  • 5 weeks later...
  • 1 year later...

Hello,

I recently stumbled across the same issue, but in a different manner: I configure, build and install SystemC on RHEL 7, then execute on CentOS 7.

Before applying the above fix, I encountered the same sc_assert() failure; after applying the fix, I got a segmentation fault.

I have found that I can fix this by using mmap()/munmap() to allocate/deallocate the memory for the QuickThreads stack.

A possible patch for this is attached.

Kind regards,
Guillaume

 

sc_cor_qt.patch


  • 2 weeks later...

Thanks @Guillaume Audeon for reporting the issue and proposing a possible fix! I have forwarded it to the SystemC LWG. Could you be a bit more specific about which SystemC version shows the segmentation fault on CentOS 7? Did you observe it with the latest official release tarball of SystemC 2.3.3 or with the HEAD of the official SystemC Git repository?
