Jump to content

Annoying and useless packer implementation in the UVM 2017 v1.0 Library Code release


Recommended Posts

Hi Guys,

Current implementation of the UVM packer store some additional contexts information at the beginning of the bitstream (m_bits). A copy of the m_pack_iter and m_unpack_iter members. Exactly 64 bits (8 bytes). During packing or unpacking (un/pack, un/pack_bits/bytes/ints/intlongs) using an UVM object these fields are exposed. It makes these operations useless from user perspective and experience. To pack bits (example from header into bytes of stream), user needs to allocate additional extra magic 8 bytes (reverse engineering) and store some magic values (2x64, again...) at the beginning. This is also true for unpack methods. It is a completely different behavior from old UVM 1.2 and UVM-SystemC (stream operators). And not very unintuitive.

To fix this irrelevant implementation, user is forced to inheritance from the default UVM packer base class and override all get_packed* and set_packed* methods to discards this context (state) information from packed/unpacked data stream and use the uvm_packer::set_default() method alongside with the uvm_factory::get().set_type_override_by_type().

ill-considered decision alongside with the UVM packer 4096 limitation.

Link to comment
Share on other sites

I have found another issue with the original UVM packer. It doesn't initialize internal m_pack_iter and m_unpack_iter iterators! They have undefined values. Only the .flush() method initialize them. How this even pass a code review? Do you have any unit tests for that?

Link to comment
Share on other sites

Also this code is not thread safe. When packer is null, it gets the uvm_packer::get_default() that is a singleton instance... And call methods that changes internal packer states... Only solution is to create a packer object for every pack/unpack calls.

Link to comment
Share on other sites

In my opinion, the internal m_pack() method from the UVM object should be changed to:

if (packer == null) begin
    packer = uvm_packer::get_default().clone();
end

To preserve the global configuration and be truly thread safe (separate state per method call). 

Link to comment
Share on other sites

Thanks for the spirited comments @Tymonx!

I'll try and respond to all of your messages in one comment, apologies in advance if it's overly long.

To the concerns re: the packing of the m_pack_iter and m_unpack_iter into the byte stream, that is a consequence of how the 1800.2 LRM defines the packer's functionality and compatibility.  Specifically sections 16.5.3.1 (State assignment) and 16.5.3.2 (State retrieval), which effectively allow one to dump the packer's state, and safely load it into another packer, regardless of what operations have been performed, and in what order, on the original packer.  In this way, the state assignment/retrieval methods can be compared to SystemVerilog's process::set/get_randstate methods.  They contain enough information to put you in _exactly_ the same state.  

As an additional aside, it's also important to acknowledge that while uvm_object does provide a pack/do_pack/do_unpack interface, there's zero restrictions on where a packer can actually be used.  Users can create/use packers anywhere in their code, not just in the context of a UVM object.

As to m_pack_iter and m_unpack_iter being magic numbers, you're absolutely right... but to that end, the entire bit stream dumped by the packer is a magic number.  The 1800.2 standard intentionally leaves the formatting of the bitstream to be not just UVM Library dependent, but uvm_packer implementation dependent.  This allows for alternative implementations based on performance or other requirements.  Additionally, as m_pack/unpack_iter are values of type int, they auto-initialize to 0.  

For some background on "Why did 1800.2 add the state methods?  Why the changes?":

Pre-1800.2 it was impossible to use a packer _without_ pushing it through a uvm_object of some kind first, unless you opened up the source code to see how it worked.  FWIW:  This is precisely what UVM-SystemC did.  The 1800.2 standard reversed the polarity though, saying that it's not important that everyone pack/unpack byte streams in the same exact way, but instead it's important that the standard allows for developers to define their own bit packing mechanisms, so long as they all present the same interface to the user.  This is also why the various control knobs which altered string and array packing were removed.  IOW: If library X wants to pack/unpack with UVM, they don't need to know how Accellera packs/unpacks, they just need to make "class X_packer extends uvm_packer", and follow the rules.  So long as they do that, they're safe to use whatever bitpacking mechanism is best for their needs.

Your final concerns re: thread safety are valid, but are not constrained to uvm_packer.  Unfortunately, the UVM library in general isn't uber-thread-safe, primarily because SystemVerilog isn't particularly thread safe in "zero time" (ie. functions).  The same basic singleton flaw surrounds all of the singletons in the library... the default printer, copier, report server... you name it, it's not thread safe.  Unfortunately, SystemVerilog doesn't provide any mechanism for "atomic" access to a variable without consuming time.   The 1800.2 standard and the Accellera reference implementation do their best to "thread the needle" between general functionality, performance, and strict thread safety (and generally bias in that order).  That said, the best place to attack this may be inside of the uvm_coreservice itself... an alternative "thread safer" version could be created that constructed clones of the policies when get_* was called.  You'd then have the option of perf. vs (relative) thread safety. 

I hope that helps to shed some light on the questions,

Justin R.

 

 

Link to comment
Share on other sites

Hi Justin,

Thanks for your detailed answer. I'm very grateful to read that. But still... I'm not happy with the UVM packer implementation, behavior and pack/unpack methods from UVM object.

In my modest opinion, the UVM packer and pack/unpack methods had enormous potential, but it was simply wasted by that.

The state retrieval should be moved to separate method and/or encapsulate within class (maybe a separate proxy class). Not exposed to user world with user packed bits. This is against proper software architecture design. Class methods and object itself should do one thing but very good. Like in an UNIX philosophy "Doing One Thing, Well". I don't see any value in that. It makes only using UVM API very confusing and not natural. Self-explained API is a great and valuable thing. Packer or pack/unpack methods should only pack data, not add extra content. Now I have ended with a custom UVM object (pack/unpack methods, not only do_pack/unpack) and UVM packer (re-implemented, without 4096 limitation) inheritance from original UVM classes to make them more natural for others. For basic stuff like packing/unpacking data.

Simple scenario. I have planned to use default pack/unpack methods to store header data with different bit width and a separate byte payload[] for layered encapsulation and decapsulation. This lead me to another issue with the UVM packer behavior. Packing extra bits when someone is using packer.pack_object(). I have read an internal comment/reason about that in UVM sources. And this have no sense at all! For serializing/deserializing data you have two options. Pack/unpack data continuously without any extra information and let code decide to properly pack/unpack these bits to fields. Like protocol buffers framework. Or encode EVERY fields with extra bits information like BSON or something similar. Not mix two approaches in one.

If someone is interested in more friendly UVM object pack/unpack methods and UVM packer implementation. I can open source that for everyone.

Link to comment
Share on other sites

To summarize my topic:

  • Default pack/unpack methods from UVM object don't pack only user data but also additional magic UVM packer state (8 bytes) at the beginning and extra bits in mid when using packer.pack_object() method
  • The packing API is not self-explained and it is very confusing. API user is expecting simple packing/unpacking solution
  • It is against the "Doing One Thing, Well" philosophy. It requires extra effort to simplify it (extend basic classes). Not opposite!
  • The UVM object pack/unpack methods are not thread-safe on default. They are using a single global packer instance (singleton). Packer state is shared and not immutable
  • Packed bits stream doesn't contain only pure user data bits. Packer is packing unnecessary extra bits during packer.pack_object() method call
  • The internal packer state is not initialized correctly during packer creation (missing initialization for internal packer field members m_pack_iter and m_unpack_iter). Correct values are assigned during a separate packer .flush() method call
  • Packer limitation to 4088 bytes (-8 extra bytes from 4096 bytes from packed packer state). This can be simple re-implement with performance byte m_bytes[] and extra allocation ahead
  • UVM packer set/get_packed_* methods are not properly aligned with the IEEE specification. Typo in method argument types. It leads to compilation errors
Link to comment
Share on other sites

I will show a real life scenario, not some academic use case that you can find in many UVM cookbooks or tutorial for dummies (like Candy Lovers) which are detached from reality.

Lets assume that we have 2 two different protocols and we want to create a new sequence item that will represent a new protocol with header + encapsulate these two protocols inside a payload.

Simple usage:

byte bytes[];

protocol_c.pack_bytes(bytes);

First try:

function void protocol_c::do_pack(uvm_packer packer);
    uvm_object protocols[] = {
        protocol_a,
        protocol_b
    };
	
    // Flush is required, because default UVM packer doesn't initialize internal fields properly!
    packer.flush();
	
    // Packing protocol_c header
    packer.pack_field_int(...);

    // Encapsulation
    foreach (protocols[i]) begin
        packer.pack_object(protocols[i]);
    end
endfunction

Of course is not working. The pack_object() method add additional bits. Next approach that should have similar effect but is written in different way. It only show how UVM packer methods are not symmetric. And this is an anti pattern! Next try:

function void protocol_c::do_pack(uvm_packer packer);
    uvm_object protocols[] = {
        protocol_a,
        protocol_b
    };
	
    byte unsigned bytes[];
	
    // Flush is required, because default UVM packer doesn't initialize internal fields properly!
    packer.flush();
	
    // Packing protocol_c header
    packer.pack_field_int(...);

    // Encapsulation
    foreach (protocols[i]) begin
        byte unsigned tmp[];
	
        protocols[i].pack_bytes(tmp);
		
        bytes = {bytes, tmp};
    end
	
    packer.pack_bytes(bytes);
endfunction

Works? Of course not! Because UVM object pack/unpack methods use a single packer instance and because of that we have a side affects. It will duplicate data from previous packed protocol! Because protocols are using the same global instance of the UVM packer. Shared state. Next anti pattern (not only this is not thread safe). We must fix it:

function void protocol_c::do_pack(uvm_packer packer);
    uvm_object protocols[] = {
        protocol_a,
        protocol_b
    };
	
    byte unsigned bytes[];
	
    // Flush is required, because default UVM packer doesn't initialize internal fields properly!
    packer.flush();
	
    // Packing protocol_c header
    packer.pack_field_int(...);

    // Encapsulation
    foreach (protocols[i]) begin
        byte unsigned tmp[];
	
        protocols[i].pack_bytes(tmp, uvm_packer::type_id::create("packer"));
		
        bytes = {bytes, tmp};
    end
	
    packer.pack_bytes(bytes);
endfunction

Now it works! We have added several extra lines of code to avoid some UVM packer and UVM object pack/unpack gotchas.

Engineer has spent almost two working days to simple pack header fields into array of bytes in a presumption UVM way...

Link to comment
Share on other sites

@tymonx-

Thanks again for the responses, I didn't get a notification about them, otherwise I would have responded a bit faster 🙂 

The UVM Packer is not specified as packing bits in any particular format... if the a developer or end user requires a specific format, then they are free to implement their own within the standard.  If you've come up with an alternative and you think it'd be useful, please post it!

That said, while the format/structure of the bits isn't specified, the LRM is very clear about how packers behave... it seems that the behavior just isn't exactly what you were expecting.  I can understand the frustration (FWIW: there's plenty of places wherein the library acts in a way I would consider unexpected, you're not alone there!).

The largest disconnect here seems to be with how UVM handles "metadata", ie. data that describes the data contained within the stream.  There are 3 basic forms of metadata that Accellera's implementation is concerned with:

  • The current position of pointers in the packer stream (Your magic 8 bytes)
  • The size of a string
  • The validity of an object handle

The use of a fixed-size array is an Accellera implementation artifact.  It was chosen back in the pre-UVM days, but I believe the basic reasoning for it is that accessing data within a fixed size array tends to be faster than constantly (re)allocating data inside of a dynamic array.  The reference implementation does allow the user to easily change the size of the fixed size array, but it will still be a fixed size array.  To be clear though, this is an Accellera decision, and you are free to implement a packer which uses a dynamic array instead.  Should you choose to make that a queue instead of a dynamic array, then you no longer need pointers for pack/unpack.  Your unpack pointer is always 0, your pack pointer is always [$].  

Strings can generally be dealt with one of two ways:  {size,string} or {string,'\0'}.  Accellera goes with the latter, but really either is fine so long as you're consistent.

The validity of an object handle is called out by the LRM.  The LRM dictates that is_null returns 1 if the next item in the stream is a null object, 0 otherwise.  Unfortunately, this is the one place wherein the LRM truly requires _some_ form of metadata being present.  You are absolutely free to create a packer which doesn't support this method, but then your packer won't work for 100% of the objects out there.

 

 As to your other concerns:

  • The initial values of the member variables is fine because they're 2-state ints, not 4-state integers.  They automatically initialize to a value of 0. 
    • The library will also automatically flush any packer passed to uvm_object::pack_*, so long as that packer is not actively packing another object (refer to 16.1.3.4 in the 2017 LRM here).
  • Having the pack_* methods use the default packer was another case wherein simplicity/performance was chosen over strict thread safety (again, pre-UVM). 
    • I would argue that instead of changing the behavior of "Default/Null packer" to clone, it would be cleaner to simply remove the option altogether.  Make it an error to pass null to pack_*, and now the user has to be explicit.  No more thread safety concerns (for the library), no more potential for unexpected behavior.  Downside?  Breaks a _ton_ of existing code, some of which dates back to before UVM existed.
  • The packer is actually one of the more heavily documented features of UVM, even going so far as separating those methods which packer developers need to worry about (16.5.3) from those that the users generally interact with (16.5.4).  The fact that the LRM doesn't dictate the format of the bitstream isn't a bug or an omission, it's an intenitonal feature.  It's left at the discretion of the developer.
  • The "do one thing, well" philosophy is a bit alive and well:  the one thing is that the packer allows you convert a sequence of pack_*/unpack_* calls to/from a bitstream.     
    • A quick side story:  During the discussions of the packer during the development of the 1800.2 standard, an example was shown wherein all of the methods in 16.5.4 didn't actually modify a bitstream at all, instead they simply pushed/popped their values in separate local queues of bits, bytes, ints, longints, uvm_integral_t, uvm_bitstream_t, and strings.  A hook was present which allowed a user to control how that data was eventually packed/unpacked inside of the set/get_packed_state methods.  In theory this implementation could be significantly faster, because the packer could choose the optimal layout for each type.  This was just an example though, the full source code was not provided.

A final note on the fact that the Accellera implementation doesn't exactly match the LRM:  You're 100% correct, which is why the release notes include the following:

Quote

2. The `uvm_packer::set_packed_*` and `uvm_packer::get_packed_*` methods as defined in sections 16.5.3.1 and 16.5.3.2 use `signed` arguments while the `uvm_object::pack` and `uvm_object::unpack` methods are specified in sections 5.3.10 and 5.3.11 as `unsigned`. The implementation keeps the arguments `unsigned`, so as to remain consistent with the other packing APIs.

The implemented sigatures are:
<pre>
virtual function void uvm_packer::set_packed_bits( ref bit <b>unsigned</b> stream[] );
virtual function void uvm_packer::set_packed_bytes( ref byte <b>unsigned</b> stream[] );
virtual function void uvm_packer::set_packed_ints( ref int <b>unsigned</b> stream[] );
virtual function void uvm_packer::set_packed_longints( ref longint <b>unsigned</b> stream[] );

virtual function void uvm_packer::get_packed_bits( ref bit <b>unsigned</b> stream[] );
virtual function void uvm_packer::get_packed_bytes( ref byte <b>unsigned</b> stream[] );
virtual function void uvm_packer::get_packed_ints( ref int <b>unsigned</b> stream[] );
virtual function void uvm_packer::get_packed_longints( ref longint <b>unsigned</b> stream[] );
</pre>

[Mantis 6423](https://accellera.mantishub.io/view.php?id=6423)

The inconsistency between sections 5.3 and 16.5.3 are being addressed by the IEEE in the next revision.

 

-Justin

PS-  I get that it's just an example, and therefore I can't tell if protocol_c is meant to extend from uvm_object or not, but you should never call packer.flush in a uvm_object::do_pack call, unless you explicitly created the packer!  If the packer has any data in it (including but not limited to metadata), then you just cleared all of that data!

 

 

Link to comment
Share on other sites

Sure. I will share: https://gitlab.com/tymonx/uvm-extensions

Benefits:

  • it does one simple thing but very good. It packs and unpacks data without any additional bits
  • very good for any kind of protocols (sequence items) like networking, encap/decap or streaming interfaces like AXI4-Stream or Avalon-ST
  • new packer class inherited from the UVM packer class and extended. The default UVM packer is replaced with UVM factory and uvm_packer::set_default() method
  • new coreservice class inherited from the default UVM coreservice class and extended. The default UVM coreservice is replaced with uvm_coreservice_t::set() method
  • no bits size limitation for packing data, unlimited
  • using foreach, fixed slice of the array (+:CONSTANT), stream operators ({>>{...}}) and forward ahead memory allocation for the best performance
  • reducing number of required allocation with the forward ahead memory allocation technique
  • removed unnecessary magic 8 bytes with magic values at the beginning of a packed bit stream that stores internal packer state
  • removed unnecessary magic bits during packing UVM objects
  • proper packer initialization in constructor. Default UVM packer was not initialized correctly (m_*_iter not set to 64 value but to 0)
  • adding the .clone() call for .get_default_*() methods in coreservice
  • default packer object in packing and unpacking methods are now thread safe
  • default packer object in packing and unpacking methods don't have any side effects (not shared state as a singleton)
  • code simplicity

Usage:

/* Place it somewhere in your custom UVM component class like test or env in
 * build_phase() method or new() constructor 
 */
uvm_extensions_pkg::setup();

 

It reduces the code complexity of previous example to simple:

byte bytes[];

protocol_c.pack_bytes(bytes);

Implementation:

function void protocol_c::do_pack(uvm_packer packer);
    uvm_object protocols[] = {
        protocol_a,
        protocol_b
    };
	
    // Packing protocol_c header
    packer.pack_field_int(...);

    // Encapsulation
    foreach (protocols[i]) begin
        packer.pack_object(protocols[i]);
    end
endfunction

 

Link to comment
Share on other sites

I have another minor issue but during extending UVM. I thing it will be better for everyone to split interface from implementation. Because currently I have inherited everything, including some unused private variable members. For example uvm_packer class as an empty class and a separate uvm_packer_impl class with implementation details. It will make inheritance more cleaner.

Link to comment
Share on other sites

Ah, I see what you mean by the 64 as opposed to 0, but it's still initialized correctly:

// Constructor implementation
function uvm_packer::new(string name="");
  super.new(name);
  flush();
endfunction

The constructor is just relying on flush() for initialization, so that we don't implement the same code twice.

I also agree with the base/implementation comment... that's more legacy than strict intent.  It's a good enhancement request for the library though (and the standard in general)!  We did it with the uvm_report_server back in UVM 1.2, it makes sense to do it for the policy classes.

Finally, you may wish to make to the following events errors in your contribution:

  • [un]pack_object(null)/is_null() - If an object does any of these, chances are they're going to get unexpected behavior at some point
  • Calling pack_* after calling set_packed_bytes/ints/longints - The m_packed pointer is going to be in a bad state here
  • Calling unpack_* without calling set_packed_* - Any subsequent get_packed_* calls are going to get unexpected behavior.
    • Technically the problem only occurs if you call get_packed_* after calling unpack_*

 Thanks again!

-Justin

 

Edited by jrefice
Grammar correction
Link to comment
Share on other sites

Thanks @jrefice.

  • Now I see that the .flush() method is called in the packer new() constructor. It was located at the bottom of the uvm_packer.svh header file (line 1188). I have debugged correctly my issue with invalid m_*_iter values. I see that changes were met in another thread because of a singleton default packer nature. In the standard UVM implementation, creating a new packer object each time when packing or unpacking methods are called, it solves the problem
  • In my implementation, the .is_null() method is not used because I never encode object handler. It is based on the user code flow like in the proto buffer framework. Not from encoded bitstream information. That why it always return false. I can add an UVM error/warning message in the .is_null() method and in un/pack_object() methods when the provided value parameter equal to null
  • I don't see why m_packed value will be in bad state? .pack_*() methods are working like vector append methods (moving pointer) and .set_packed_*() methods like vector assignment methods (setting pointer)
  • Calling .unpack_*() methods without calling .set_packed*() methods are safe but it will return zero value, nothing or empty string. It depends on unpacking method. But here I can add an UVM error/warning message to inform user about that
  • But these methods are called correctly in UVM object class. The behavior is the same like in the standard UVM implementation. More like in UVM 1.2 and previous implementation
Link to comment
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...