
nVidia's Matchlib


ManikantaAllam


Hello seniors, 

I am trying to use NVIDIA's Matchlib components (cmod and LI channels) for the first time.

I'm wondering whether I can instantiate and run these directly with SystemC and C++ (Visual Studio on Windows), the same way as other SystemC modules.

If this can be done without involving HLS tools, how do I do it? I am not interested in synthesis for now.

The reason I am interested in this library is its generic nature and HLS-ability.  For now, I am just interested in writing a simulation framework and using it as part of my other work. 

 

@StuartSwan I hope you could give me some hints regarding this. 

 

 

Thank you & Best regards

Manikanta


Manikanta-

Here are some tips on using Matchlib and the examples:

- I recommend using the fully open-source self-contained kit available here:

https://forums.accellera.org/files/file/126-matchlib-examples-kit-for-accellera-systemc-evolution-day-2020-presentation/

- Follow the README steps in the top level dir.

- You don't need any HLS tools or any other installs to run the examples.

- The kit has been successfully downloaded and run on many flavors of Linux machines.

- It is probably easiest to use some version of g++.

- My personal preference on Windows is to use VirtualBox with an Ubuntu Linux virtual machine. If you do this, you can still download Microsoft VS Code from the Ubuntu repositories.

- Some people have reported successfully using this kit with Cygwin on Windows, though a few minor changes to the Makefiles may be needed for the locations of system libs.

- Recent versions of Windows include something similar to Cygwin (the Windows Subsystem for Linux, WSL), so you can try that too.

- The SystemC simulator includes scripts for building with Visual Studio, but I don't think the other code in the kit has been tested with Visual Studio.

Thanks

Stuart Swan


  • 2 months later...

@StuartSwan

Hi Swan, 

I am interested in the WHVCSourceRouter as part of creating a cycle-accurate simulation framework. The router itself is a loosely timed or untimed implementation, since the unrolling and pipelining are left to the HLS tool. Is there a way that I am missing to get fast, approximately cycle-accurate system-level simulation without depending on the RTL?

In one of Siemens' webinars I heard about back annotation. Unfortunately, I could not understand it, as I am a beginner, and I could not find any example.

Could you comment on this?

 

TIA & Best regards

Manikanta 

 

 

 


Manikanta-

The WHVCRouter model in Matchlib is a "wormhole virtual channel" router. It is used in network on chip models.

FYI there is a unit test for the router model in the matchlib dir in the kit at:

matchlib-main/cmod/unittests/WHVCRouterTop

The model itself is already very close to cycle accurate. Pretty much all of the loops are fully unrolled during HLS, so only the main loop of the main process remains, and that loop may be pipelined during HLS but the pipeline latency is probably just 1.

So it is basically very close to RTL, it is just in SystemC.

Thanks

Stuart Swan


@StuartSwan

Hi Swan, 

Thank you so much for replying.

Yes, I am creating a heterogeneous NoC simulation framework as part of my thesis. I will also modify this router so that each ingress port can have a different number of virtual channels, unlike the original, which has the same number of virtual channels on all input ports.

I have already seen and simulated the WHVCRouter unit test example. But I also saw that all the processes (the functions involved in routing) are put in a single clocked process. That is where my concern came from regarding a latency mismatch between the pre-HLS and post-HLS simulations. Nevertheless, I guess this level of detail is sufficient for system-level exploration.

I will look into it further and try to understand it. Thank you again.

 

Best regards

Manikanta

 


  • 1 year later...

Hi @StuartSwan

I am trying to run the HLS flow using the kit you provided. Unfortunately, I can only use gcc versions later than 10.0. There are some issues with the HLS flow, especially with the RTL simulations using VCS. Could you point out anything useful, or something I am missing? I will provide the log.

 

Thanks & regards

manikanta 

 

 

PS: 

Solved by pointing to the correct VCS/GNU package and setting the right SystemC version in 'run_hls_global_setup.tcl'.


Hi Manikanta, Stuart may be out on vacation for 1-2 weeks, so I'll add some advice for today.

The most common cause of simulations failing with HLS-generated RTL while the SystemC simulation passes is missing resets or missing initialization of variables.

  • In RTL waveforms, if you see X's on any RTL module outputs, this is definitely the problem.
  • Compile with all warnings (-Wall) and optimization (-O1) for g++ to warn about uninitialized variables; g++ finds many of these only when optimization is enabled.
  • For sc_module member variables, confirm that the reset section of your threads initializes all of them, including sc_signals and sc_out ports.
  • For locally scoped variables, confirm that all native types and "not auto-initialized" types like ac_int have initial values assigned.
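A minimal sketch of the last two bullets, in plain C++ with no SystemC or Matchlib dependency. The `AccumulatorState` struct is a hypothetical stand-in for the member variables of an sc_module, and `reset()` stands in for the reset section of a thread; none of these names come from the kit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the state of a clocked module.
// In real SystemC code these would be sc_module members, and the
// body of reset() would sit in the reset section of the thread.
struct AccumulatorState {
  std::uint32_t sum;    // no default initializer: reset() must assign it
  std::uint32_t count;  // same here
  bool valid;           // same here

  // Reset section: every member gets a defined value.
  void reset() {
    sum = 0;
    count = 0;
    valid = false;
  }

  // One iteration of the main loop.
  void step(std::uint32_t sample) {
    // Locally scoped native type with an explicit initial value.
    std::uint32_t scaled = 0;
    scaled = sample * 2u;
    sum += scaled;
    ++count;
    valid = true;
  }
};

// Applies reset, then feeds n samples of value v, and returns the sum.
inline std::uint32_t run_after_reset(std::uint32_t v, int n) {
  assert(n >= 0);
  AccumulatorState s;
  s.reset();  // without this call, sum/count/valid would be indeterminate
  for (int i = 0; i < n; ++i) s.step(v);
  return s.sum;
}
```

If the `s.reset()` call is removed, the members are read before being written, which is the kind of bug that tends to show up as X's in the RTL waveforms.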

If your system is made of multiple sc_modules communicating through channels, it is possible that the order-of-events is changed due to the latency and initiation interval of pipelined loops. A well-designed system should be agnostic to the latency changes, but complex interactions between blocks may introduce some type of ordering dependencies.

Finally, HLS tools have some interesting optimizations that can potentially parallelize non-blocking channel operations that are coded in sequence in the SystemC code. This is unlikely to be the issue, but may be possible if you do a PopNB() of one channel followed by PushNB() to another channel in one of your threads.
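To make the pattern concrete, here is a toy model in plain C++ of a pop from one channel followed by a push to another. `ToyChannel` and `forward_once` are hypothetical names written for this sketch; the methods only mimic the non-blocking "returns true on success" style and are not the Matchlib implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Toy bounded FIFO with non-blocking operations (illustrative only).
template <typename T>
class ToyChannel {
 public:
  explicit ToyChannel(std::size_t capacity) : capacity_(capacity) {}

  // Returns false and leaves the channel unchanged if it is full.
  bool PushNB(const T& v) {
    if (q_.size() >= capacity_) return false;
    q_.push_back(v);
    return true;
  }

  // Returns false and leaves `out` unchanged if the channel is empty.
  bool PopNB(T& out) {
    if (q_.empty()) return false;
    out = q_.front();
    q_.pop_front();
    return true;
  }

  std::size_t size() const { return q_.size(); }

 private:
  std::size_t capacity_;
  std::deque<T> q_;
};

// Sequential pop-then-push: the push is attempted only after the pop
// has succeeded. Returns true if a value moved from `in` to `out`.
// In this toy version a value is lost if the push fails; a real design
// would have to hold it and retry.
inline bool forward_once(ToyChannel<int>& in, ToyChannel<int>& out) {
  int v = 0;
  if (!in.PopNB(v)) return false;
  return out.PushNB(v);
}
```

The source code states the two operations in a fixed order. When comparing SystemC and RTL behavior, this is the kind of sequence to check for an unintended dependence on that order.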

Beyond checking for these common issues, you'll need to dig into debugging to understand further. The starting point for this is usually comparing waveforms between the RTL and SystemC sims. Identify the first sample in which the outputs differ, and trace back to the logic/code driving those outputs for hints on where functionality diverges.
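The "first differing sample" step can be sketched in plain C++. This assumes the output values of both simulations have already been exported as per-sample sequences; how they are exported is not covered here, and `first_mismatch` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the index of the first sample at which the two traces differ,
// or -1 if they have the same length and agree on every sample.
// If one trace is a strict prefix of the other, the mismatch index is
// the length of the shorter trace.
inline long first_mismatch(const std::vector<std::uint64_t>& systemc_trace,
                           const std::vector<std::uint64_t>& rtl_trace) {
  const std::size_t n = systemc_trace.size() < rtl_trace.size()
                            ? systemc_trace.size()
                            : rtl_trace.size();
  for (std::size_t i = 0; i < n; ++i) {
    if (systemc_trace[i] != rtl_trace[i]) return static_cast<long>(i);
  }
  if (systemc_trace.size() != rtl_trace.size()) return static_cast<long>(n);
  return -1;
}
```

Comparing by sample index rather than by simulation time keeps the check independent of latency differences between the two simulations.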
 


  • 2 weeks later...

Hi @Matt Bone & @StuartSwan

I am trying to run the matchlib HLS flow for WHVCRouter with 2 virtual channel configuration (cmod & hls unittests from the kit). No other changes at all. The RTL simulation shows some credit overflow warnings but still reports that the simulation passed. The OSCI simulation has no issues at all. Unfortunately, the send and receive counts of the traffic in the testbench are affected by this. I am unsure whether to consider this a bug, some other kind of issue, or no issue at all.

What is your view on this?

Thanks in advance & best regards

Manikanta  


Quote

I am trying to run the matchlib HLS flow for WHVCRouter with 2 virtual channel configuration (cmod & hls unittests from the kit). 

Hi @ManikantaAllam,

I am not familiar with this design, but I did try the example. For me, the simulation appears successful for both SystemC and RTL (no warnings observed).

The only recommendation I have is to check that your matchlib kit is up-to-date. That is, either something bundled with a toolset from 2024, or downloaded from here:
   https://github.com/Stuart-Swan/Matchlib-Examples-Kit-For-Accellera-Synthesis-WG
 


 "concat_sim_rtl.v", 4519: sc_main.my_testbench.router.ccs_rtl.dut_inst.router.WHVCSourceRouter_1_4_2_8_WHVCRouterTop_Flit_t_16_process_inst.WHVCRouterBase_1_4_2_8_WHVCRouterTop_Flit_t_receive_credit_for_6_WHVCSourceRouter_1_4_2_8_WHVCRouterTop_Flit_t_16_process_WHVCRouter_h_ln113_assert_credit_recvOS_i_CS_le_buffersize_and_TotalcreditsreceivedcannotbelargerthanBuffersize: started at 4895000ps failed at 4895000ps
# 	Offending 'pcredit_recv_i_buffersize_Total_credits_received_cannot_be_larger_than_Buffer_size_prb_5'

Hi @Matt Bone

I tried the latest repo you provided. Unfortunately, it shows the same failures. Above is the warning from the RTL simulation. Perhaps you ran the WHVCRouter example with the 1 virtual channel configuration.

To reproduce the example I am running:

1. Go to 'WHVCRouterTop.h' in /matchlib-main/cmod/unittests/WHVCRouterTop/ and edit line 31 to use 2 VCs (kNumVChannels = 2), and edit the flit type on line 45 to use a 2-bit packet ID (typedef Flit<64, 0, 0, 2, FlitId2bit, WormHole> Flit_t;)

2. Go to 'testbench.cpp' in /matchlib-main/cmod/unittests/WHVCRouterTop/ and uncomment line 347 (flit.packet_id = vc;)

3. Go to 'run_hls_global_setup.tcl' in /matchlib-main/hls/ and edit line 43 to set the SystemC version to 2.3.3 (solution options set /Flows/VCS/SYSC_VERSION 2.3.3)

4. Go to 'hls_Makefile' in /matchlib-main/hls/ and set the clock period to 5 ns (export CLK_PERIOD  ?= 5); the design failed to schedule at the default clock period

5. Make sure all the environment variables for VCS, Verdi, Catapult, etc. are set properly

6. Go to /matchlib-main/hls/unittests/WHVCRouterTop/ and run make

This will run the example and show the faults I am encountering.
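For readers unfamiliar with the failing assertion (total credits received cannot be larger than the buffer size), the invariant can be illustrated with a toy credit counter in plain C++. This is only a sketch of generic credit-based flow control for one virtual channel. `ToyCreditCounter` and its methods are hypothetical, the initial credit count is an assumption passed in by the caller, and nothing here reproduces the WHVCRouter code.

```cpp
#include <cassert>

// Toy credit counter for a single virtual channel (illustrative only).
class ToyCreditCounter {
 public:
  // initial_credits is an assumption of this sketch; it must not
  // exceed buffer_size.
  ToyCreditCounter(unsigned buffer_size, unsigned initial_credits)
      : buffer_size_(buffer_size), credits_(initial_credits) {
    assert(initial_credits <= buffer_size);
  }

  // Consumes one credit to send a flit; returns false if none is left.
  bool send_flit() {
    if (credits_ == 0) return false;
    --credits_;
    return true;
  }

  // Accepts one returned credit. Returns false on overflow, that is,
  // when accepting it would raise the count above the buffer size.
  bool receive_credit() {
    if (credits_ >= buffer_size_) return false;
    ++credits_;
    return true;
  }

  unsigned credits() const { return credits_; }

 private:
  unsigned buffer_size_;
  unsigned credits_;
};
```

In this toy model an overflow means a credit came back without a matching flit having been sent, so the send and receive counts no longer balance.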

 

Thank you & best regards

Manikanta
