Community Forum

ANVEL 3.0 Crashes during Simulation



  • Hello,

    Currently my team is running multiple ANVEL instances for our simulation lab, and every so often an ANVEL instance crashes during the simulation. The ANVEL instances run within Windows virtual machines, which we manage remotely. They run a set of plugins we use to publish vehicle data, controls and sensor streams to remote ROS machines on the network. When ANVEL crashes, it does not generate a DebugLog or output any errors, which is unusual, since ANVEL normally generates a log when it crashes. The Windows Reliability Monitor reports a problem signature pointing to AnvelSimulation.dll.

    [Attached image: 0_1549997621482_96364f6f-b7a4-4cc8-a22a-53f88ad58756-image.png]

    Given the problem signature, I suspect this may be some sort of bug with ANVEL. Is there anything this specific error points to, and if this is not a bug with ANVEL, is there anything specific I should look at?
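
    For anyone triaging a similar crash, one way to check whether repeated crashes share a signature is to parse the text of the Windows "Application Error" events and group them by faulting module, exception code and fault offset. The sketch below is only an illustration: it assumes the usual "Faulting module name: ..." message layout, the function names are hypothetical, and the sample values are placeholders, not data from the crash described above.

```python
import re
from collections import Counter

# Labels as they usually appear in an "Application Error" event message.
# ASSUMPTION: the message follows this common layout; adjust if yours differs.
_FIELDS = {
    "application": r"Faulting application name:\s*([^,\r\n]+)",
    "module": r"Faulting module name:\s*([^,\r\n]+)",
    "exception_code": r"Exception code:\s*(0x[0-9a-fA-F]+)",
    "fault_offset": r"Fault offset:\s*(0x[0-9a-fA-F]+)",
}


def parse_fault_event(message):
    """Extract the crash fields from one event message; missing fields are None."""
    result = {}
    for key, pattern in _FIELDS.items():
        match = re.search(pattern, message)
        result[key] = match.group(1).strip() if match else None
    return result


def group_by_signature(messages):
    """Count crashes per (module, exception_code, fault_offset) signature."""
    counts = Counter()
    for message in messages:
        fields = parse_fault_event(message)
        counts[(fields["module"], fields["exception_code"], fields["fault_offset"])] += 1
    return counts


# Placeholder message for illustration only (hypothetical application name and values).
SAMPLE = (
    "Faulting application name: Example.exe, version: 0.0.0.0, time stamp: 0x00000000\n"
    "Faulting module name: AnvelSimulation.dll, version: 0.0.0.0, time stamp: 0x00000000\n"
    "Exception code: 0xc0000005\n"
    "Fault offset: 0x0000000000000000\n"
)

if __name__ == "__main__":
    print(parse_fault_event(SAMPLE))
```

    If every crash lands on the same module and fault offset, that suggests a single recurring fault rather than several unrelated ones, which is useful detail to include in a support request.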



  • @aamirmaj,

    Thanks for writing in, we're happy to help you look into this. I have a few questions, though. I'm curious if we can narrow the problem down, because it seems like there are several "moving parts" to your simulation environment. Can you tell us more about what is happening in the simulation when ANVEL crashes? Is it reproducible? Does ANVEL crash in a similar setup without your custom plugins loaded?

    Also worth trying (if possible) is upgrading to ANVEL 3.5. There are some stability fixes included in the latest version that may help you out here.

    Regards,
    ANVEL Support



  • Hello ANVEL Team,

    Apologies for the late reply. Yes, there are several "moving parts" in our current simulation environment. Given the complexity of our setup, we haven't been able to narrow down the crash to a specific action in our setup. Some plugins that may be of note include:

    • A CameraStreamer plugin running on top of ANVEL, which sends sensor feeds as RTSP streams to ROS virtual machines over a shared buffer.

    • A ROSController plugin running on top of ANVEL, which externally controls the vehicles by sending and receiving throttle, steering and brake inputs through a shared buffer with ROS.

    • A DISController plugin which follows the DIS protocol, where given a list of "DIS objects" (vehicles), each ANVEL instance periodically publishes its own DIS objects' positions and orientations across the network, and all ANVEL instances subscribe to each other. This allows us to run multiple vehicles in a single simulation.

    An important note is that when a crash is caused by one of our plugins, ANVEL's crash log or the Windows Reliability Monitor specifically points to the plugin responsible. That is not the case with the crash described here.

    Another note on the current simulation is that our ANVEL instances themselves run as virtual machines on a Linux machine. We are using IOMMU passthrough to pass 1 GPU (a GTX 1080 Ti) to each virtual machine running ANVEL. Each of our servers has 3 GPUs, so each server splits its CPU workload among 3 virtual machines, each of which runs ANVEL. We split the virtual machines based on the sensors per vehicle, where a master ANVEL instance controls the vehicle, and slave ANVEL instances run individual sensors (1 GPU per sensor). Networking between the instances is done using the plugins mentioned.

    There are additional interactions between the ROS virtual machines and the interfaces which allow us to control the vehicles, view the sensor feeds, and interact with the simulation in expected ways. These are external and technically separate from the ANVEL simulation, which is why it is hard to narrow down exactly when and why the ANVEL simulation crashes. However, we have had cases where one or more ANVEL instances crash after roughly 5-30 minutes of uptime, even when we do not interact with the simulation at all, meaning no plugins are receiving new interactions through the network.

    I have not been able to replicate the error outside of the simulation we are running, but I will try doing that. I will also test running the ANVEL instances without plugins, to see if we can replicate the error. We are considering switching to ANVEL 3.5, and have already started testing ANVEL 3.5 in a virtual machine with our server setup. The migration process may be long, but if we can get the simulation to run smoothly in ANVEL 3.5 we will consider switching to that.
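
    For the isolation runs, a small wrapper that records how long each instance stays up and what exit code it returns can make the plugin and no-plugin runs easier to compare. The following is a minimal sketch, not part of ANVEL: the function names are hypothetical, and the command in the example is a stand-in that should be replaced with the actual ANVEL launch command for the configuration being tested.

```python
import subprocess
import sys
import time


def run_and_time(command):
    """Run a command to completion and return (uptime_seconds, exit_code)."""
    start = time.monotonic()
    process = subprocess.Popen(command)
    exit_code = process.wait()
    return time.monotonic() - start, exit_code


def format_exit_code(exit_code):
    """Show the exit code in decimal and as 32-bit hex.

    Windows crash codes are usually quoted in hex (for example, an access
    violation is 0xC0000005), so the hex form is easier to search for.
    """
    return f"{exit_code} (0x{exit_code & 0xFFFFFFFF:08X})"


if __name__ == "__main__":
    # ASSUMPTION: stand-in command for illustration; substitute the real
    # launch command for the instance under test.
    stand_in = [sys.executable, "-c", "import sys; sys.exit(3)"]
    uptime, code = run_and_time(stand_in)
    print(f"uptime: {uptime:.1f} s, exit code: {format_exit_code(code)}")
```

    Logging one line per run for each plugin configuration gives a simple record of whether the 5-30 minute crash window still appears once a given plugin is removed.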

    If there are any more specifics you would like to know, I would be happy to send them. I appreciate the help!

    With Regards,
    Aamir


  • @aamirmaj,

    Thanks for contextualizing your simulation setup! It is indeed complicated, and since we are not the authors of the plugins you are using, it's going to be challenging to narrow down the solution without being able to debug.

    That being said, ANVEL 3.0 has some known instability issues. We would highly recommend that you encourage the author of your plugins to upgrade to ANVEL 3.5, and then follow suit with your overall simulation setup.

    Regards,
    ANVEL Support