C4D locks up, when opening dialog from MessageData.CoreMessage

Hi,

I have an issue with opening a modal dialog from a MessageData's CoreMessage().
There's a certain chance, this locks up C4D completely.

The problem occurs here in R21 as well as S24, both on Win10.

MessageData.CoreMessage() gets triggered by SpecialEventAdd(custom message id) from another dialog and is then supposed to show a custom modal requester.

I have the feeling, the issue may be related to this thread.

Yet, this thread never seems to have come to a conclusion. Also the suggested workaround (ExecuteOnMainThread()) is neither available to me in Python, nor does it seem to be a "main thread issue" as I test for this.
Also, I always thought that using SpecialEventAdd()/CoreMessage() is the way to transfer execution context to the main thread, as also documented in the C++ docs here and here. Have there been any changes to this in recent releases?

Here's some code stripped down from the actual plugin. In the comments of the example I tried to provide additional information about the different things I have already tried.

Please note when using the example: the problem only has a certain chance to occur. So you may need to go through the sequence "Open Popup"->"Press Me"->"Ok" in the modal requester several times. No hectic clicking needed, all steps can take their time.

I really hope, somebody finds out, that I'm just massively stupid (which I confess to be anyway, regardless of the outcome of this thread) and need to simply change a line of code, because the actual set up or combination of dialogs is quite crucial for our plugin and its workflow.

In brief the code below does the following:

  1. CommandData opens DialogMain in Execute()
  2. In reaction to button press, DialogMain.Command() opens DialogPopup
  3. In reaction to button press, DialogPopup.Command() calls SpecialEventAdd()
  4. In reaction to event, MessageData opens modal dialog in CoreMessage()

Thanks for any help in advance.

Cheers,
Andreas

Here's the code:

import c4d

PLUGIN_ID = 1234570 # MAKE SURE TO USE A UNIQUE ID FROM Plugin Café
PLUGIN_ID_MSGDATA = 1234571 # MAKE SURE TO USE A UNIQUE ID FROM Plugin Café
PLUGIN_ID_COREMESSAGE = 1234572 # MAKE SURE TO USE A UNIQUE ID FROM Plugin Café
PLUGIN_ID_POPUP = 1234573 # MAKE SURE TO USE A UNIQUE ID FROM Plugin Café
PLUGIN_ID_REQUESTER = 1234574 # MAKE SURE TO USE A UNIQUE ID FROM Plugin Café

ID_BUTTON = 1001


def OpenModalRequester():
    if not c4d.threading.GeIsMainThread():
        print('We are here, so the issue would be understandable...')
        # Fortunately, we never end up here.
        # Otherwise would have meant wrong understanding of SpecialEventAdd() leading to main thread.
        return

    dlg = DialogRequester()

    # Being unsure, which plugin ID to pass to subdialogs, I tried PLUGIN_ID, PLUGIN_ID_MSGDATA as well as a unique PLUGIN_ID_REQUESTER.
    # Unfortunately without any effect on the actual issue.
    # Assumption is, for modal or temporary dialogs plugin IDs won't matter much and are only used to store and retrieve size info.
    dlg.Open(dlgtype=c4d.DLG_TYPE_MODAL, pluginid=0, xpos=-2, ypos=-2, defaultw=0, defaulth=0)


class DialogRequester(c4d.gui.GeDialog):

    def CreateLayout(self):
        self.SetTitle('Modal Requester')

        self.AddDlgGroup(c4d.DLG_OK)

        return True


    def Command(self, id, msg):
        if id == c4d.DLG_OK:
            self.Close()

        return True


class MessageDataTest(c4d.plugins.MessageData):

    def CoreMessage(self, id, bc):
        if id == PLUGIN_ID_COREMESSAGE:
            # Here I also tried storing the dialog in a member variable to make sure
            # there's no conflict/race condition with the dialog closing and dialog
            # object being disposed. No effect on the issue, though.
            OpenModalRequester()

        return True


# Even though I have no idea why,
# it seems, having this popup dialog in the mix (compared to SpecialEventAdd() from DialogMain),
# massively increases the chance of the issue occurring.
class DialogPopup(c4d.gui.GeDialog):

    def CreateLayout(self):
        if self.GroupBegin(0, c4d.BFH_SCALEFIT | c4d.BFV_SCALEFIT, initw=200, cols=1):
            self.GroupBorderSpace(5, 5, 5, 5)
            self.GroupBorderNoTitle(c4d.BORDER_ROUND)

            self.AddButton(ID_BUTTON, c4d.BFH_LEFT, initw=0, name='Press Me')

        self.GroupEnd()

        return True


    def Message(self, msg, result):
        if msg.GetId() == 1649165891: # if lost focus
            self.Close()

        return c4d.gui.GeDialog.Message(self, msg, result)


    def Command(self, id, msg):
        c4d.SpecialEventAdd(PLUGIN_ID_COREMESSAGE)

        self.Close()

        return True


class DialogMain(c4d.gui.GeDialog):
    _dlgPopup = None


    def CreateLayout(self):
        self.SetTitle('Test Main Dialog')

        self.AddButton(ID_BUTTON, c4d.BFH_LEFT, initw=0, name='Open Popup')

        self.AddDlgGroup(c4d.DLG_OK)

        return True


    def CommandOpenPopUp(self):
        self._dlgPopup = DialogPopup()

        # Being unsure, which plugin ID to pass to subdialogs, I tried PLUGIN_ID as well as a unique PLUGIN_ID_POPUP.
        # Unfortunately without any effect on the actual issue.
        # Assumption is, for modal or temporary dialogs plugin IDs won't matter much and are only used to store and retrieve size info.
        self._dlgPopup.Open(dlgtype=c4d.DLG_TYPE_ASYNC_POPUPEDIT, pluginid=0, xpos=-1, ypos=-1, defaultw=0, defaulth=0, subid=0)


    def Command(self, id, msg):
        if id == ID_BUTTON:
            self.CommandOpenPopUp()

        elif id == c4d.DLG_OK:
            self.Close()

        return True


class CommandDataTest(c4d.plugins.CommandData):
    _dlgMain = None


    def Execute(self, doc):
        if self._dlgMain is None:
            self._dlgMain = DialogMain()

        self._dlgMain.Open(c4d.DLG_TYPE_ASYNC, PLUGIN_ID, xpos=-1, ypos=-1, defaultw=300, defaulth=0)

        return True


if __name__ == '__main__':
    c4d.plugins.RegisterCommandPlugin(PLUGIN_ID, 'Test Dlg From MsgData CoreMsg', 0, None, '', CommandDataTest())
    c4d.plugins.RegisterMessagePlugin(PLUGIN_ID_MSGDATA, str='Test MessageData', info=0, dat=MessageDataTest())

And above code as file:
TestDialogFromMessageData.pyp

Edit: Fixed last minute change of code. Sorry!

Hello @a_block,

thank you for reaching out to us. I understand the description of your problem, but unfortunately we are not able to reproduce it here. I ran your plugin on both R21 and S24 SP1 and had no luck in "locking up Cinema 4D completely".

Just to sort out some things here for clarity: A modal dialog is blocking, i.e., myValue in the example below will only be calculated once myDialog has been closed by the user. A non-modal dialog is not blocking, i.e., myValue will be calculated right after the dialog has opened.

myDialog.Open(c4d.DLG_TYPE_MODAL, *args)
myValue = myDialog.GetFloat(ID_M_YFLOAT) * 42

I assume you are aware of this, but if one opens a modal dialog, one has to be sure that one is okay with the consequences. If one opens a modal dialog in NodeData::Message, one will block the execution and consumption of that node and the message. Which in most cases is okay and what you want to do. With core messages I would be instinctively hesitant to do this, since they happen 'on a higher level'. But I do not see anything particularly wrong with your example, since you only hook into your own core message.

I cannot tell you anything more right now, since I cannot reproduce the problem. My recommendation would be to move out of CoreMessage with the GUI stuff, or make the dialog non-modal. Have you also tried removing all other plugins on the installations where your plugin causes problems? That would not solve your problem, but it would help us to pinpoint where this is coming from, e.g., another plugin which uses CoreMessage in a special way, which in turn causes your problem.

Cheers,
Ferdinand

Hi Ferdinand,

thanks for your quick reply. I'm aware of the consequences of modal dialogs; actually, that is the behavior I expect and want to achieve there. Using the requester in NodeData::Message() is not planned. And I'd happily move that requester out of CoreMessage(), but it needs to happen in response to some threaded stuff going on in the background. That is why the requester call went into MessageData.CoreMessage(), as I expected this to be the correct way to get into the main thread, where UI work is allowed. The above example with the event being posted directly from the popup is only a very simplified version for the purpose of reproduction.
Funny that it cannot be reproduced on your end. Here it happens quite reproducibly, at the latest after five to ten times going through the described sequence.
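
To illustrate the pattern I mean by "getting into the main thread", here is a plain-Python model without any c4d involved. The names post_event(), core_message() and pump() are made-up stand-ins for SpecialEventAdd(), CoreMessage() and the main loop. This is only a sketch of the idea and says nothing about how C4D implements it internally:

```python
import queue
import threading

EVENT_ID = 1234572  # hypothetical custom message id

events = queue.Queue()
handled = []  # (event id, handled on the pumping thread?)
posted = []   # (event id, posted from the pumping thread?)

# The thread that runs pump() plays the role of the main thread in this model.
pump_thread_ident = threading.get_ident()


def post_event(event_id):
    """Stand-in for SpecialEventAdd(): may be called from any thread."""
    posted.append((event_id, threading.get_ident() == pump_thread_ident))
    events.put(event_id)


def core_message(event_id):
    """Stand-in for CoreMessage(): records which thread handles the event."""
    handled.append((event_id, threading.get_ident() == pump_thread_ident))


def pump():
    """Stand-in for the main loop: drains the queue on the calling thread."""
    while True:
        try:
            event_id = events.get_nowait()
        except queue.Empty:
            return
        core_message(event_id)


# The background worker only posts the event ...
worker = threading.Thread(target=post_event, args=(EVENT_ID,))
worker.start()
worker.join()

# ... and the handler runs later on the pumping thread.
pump()
```

In this model the event is posted from the worker thread but handled on the thread that pumps the queue, which is the behavior I rely on for opening the requester.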

I will check my installations in regards to possibly interfering plugins. Though R21 and S24 installations are already pretty different in this regard, so I have no high hopes. I'll report back, if I find anything.

One more note, the CoreMessage itself works nicely without issues. Only when the modal requester comes into play, the problems start.

Thanks again, cheers

Uh oh...
Good news, indeed with all other plugins removed, I can not reproduce it in S24 either.
Of course good depends on the eye of the beholder... for you it's probably not so good, as I have wasted your time. Sorry for that. Yet, for me, it means, I can continue to search for the cause of the issue on my end. And this to me is good in a way...
Cheers

Hello Andreas,

no, you do not waste my time, and another plugin being the culprit does not necessarily mean that we will consider this as out of scope of support. It depends a bit on what the other plugin does. It would be good if you could pinpoint the exact plugin that causes this. I could ask the devs about their take on the consequences of doing GUI stuff in CoreMessage, but I fear with the current amount of information their answer will be the same as mine for now, i.e., 'rather not'.

And I forgot to mention, that I did open your dialog ten times in a row on each version of Cinema 4D without blocking it up. Manuel said he "clicked as fast as could" 😉

Cheers,
Ferdinand

Hi Ferdinand,

very kind, thanks 🙂

The click speed, as mentioned in my first post, is not relevant. I can reproduce it here, leaned back, having a cup of coffee between the clicks.
It is actually pretty funny. In S24 I'm no longer able to reproduce it since I removed all other plugins and then re-added them one by one. Problem still gone... and this made me think. Actually you made me think, something I can't tell you how grateful I am for. By now I have a hypothesis of what is going on here. A bit too early to make my incompetence public. But as soon as I have proven my theory right, I'll report back here and explain in detail, to hopefully prevent others from falling into the same self-inflicted pit...

I'll be back!

Hi Ferdinand,

I'm sorry. Unfortunately I can not close this thread, yet. I honestly do not know, what's going on.
So, yes, my S24 does not expose the issue currently. This is with the actual plugin under development (not the example provided here) and after removing and re-adding all plugins. I shouldn't have left this as my only test. It seems to have led me in the wrong direction. Yet, I was so happy...

So the removal of plugins led me to the hypothesis that I have some interdependency in my own plugins, which may have fixed itself due to perhaps a different load order.

I did not see any potential in the various MessageData components, though, but suspected something else:
Yes, here I have to admit (and I am aware, we are not supposed to be doing this), my plugins are separated into multiple submodules. So I am indeed polluting sys.path (and am also reloading modules in 'C4DPL_RELOADPYTHONPLUGINS'). In my defense, with projects of a certain size I actually see no other option. Anyway, my plugins indeed share a bunch of code in various utility modules and I suspected this to be the cause of all my misery.

But by now, I am pretty sure this is not the case. Even worse, when going back to R21, the problem was there again, immediately. So I started removing plugins there again. And after removing all plugins except the above posted test example, it does still happen. And here it really happens quite fast and reproducibly. Most of the time the first time I click Cancel in the modal requester. During a row of ten consecutive tests, most of the time it happened during the first three click cycles, at most I needed five.

By the way, something I hadn't mentioned before: C4D is not only locked up, but burning on one CPU core, so my educated guess would be C4D is spinning on some spin lock, though of course it may as well be caught in an infinite loop (please let's not discuss whether spin locks belong to the set of endless loops).

Taking into account that it only happens with a certain chance, it obviously also is some kind of race condition. This may also explain why small changes to the "C4D ecosystem" (i.e. removing plugins) are capable of masking the problem. And in the end CPU speeds and number of cores most likely have this potential as well. Here it's running on a quite old Intel Core i7-3930K (6 cores, 12 threads).

Nevertheless, after having spent all day with this issue, I am quite sure the above posted code is either doing something really sinister, or, if what it does is in the realm of things we are allowed to do, then there is some issue in C4D. It is most reproducible in my R21. In S24 I admit, with only the above test plugin in the system I can still no longer reproduce this. But I am sure I saw it in S24 as well before, so maybe due to different runtime characteristics it is just way less likely... who knows. I could as well be wrong though, and in S24 it was a different issue in my code, which got implicitly fixed by now.

During my experiments I have further changed the example code (e.g. removed use of DialogPopup.Message(), added more buttons to play with different ways to reproduce, removed flags not needed for reproduction, increased button sizes so Manuel could rapid fire them more easily, added a second CommandData to host the popup dialog...). So far, I can reproduce it only with the intermediate popup dialog, regardless of whether this popup dialog gets spawned directly from the main dialog or by use of the second CommandData to own the popup dialog.

This is the part that worries me most. Could it be, the real issue is not opening the modal dialog from MessageData.CoreMessage(), but there is something wrong with spawning another asynchronous dialog from an asynchronous dialog? But I only noticed this for whatever reason due to increasing the chances for the issue via the modal dialog spawned by the MessageData? Which in the end would mean, I have a completely different and way more serious issue in my plugin lingering around. It just went by unnoticed until this dreaded modal dialog came into play...

Anyway, here's the updated version of the test plugin, just in case somebody still feels motivated to look into the issue:
TestDialogFromMessageData_2.pyp

Sorry to be a nuisance.

Cheers,
Andreas

Edit: One more finding: Once I make the popup dialog modal (which would basically destroy my workflow) the problem also seems to be no longer reproducible.

Edit: I was shooting too fast. I also got it with a modal popup dialog.

Hello Andreas,

I am as unsure what to do as you are. But I just saw this in your new code, which made me flinch, although I understand the idea behind it.

global g_dlg
g_dlg = DialogRequester()

Where the flinching part is the global keyword, although the idea seems valid that Python's GC could erroneously collect the dialog, because for example something does not handle the GIL correctly in the C++ backend. And that reminded me that I found it already a bit odd that you had that function for opening the dialog floating around in your first code example. Have you tried attaching the dialog to your MessageData? Something like what I added at the end of this posting? This seems a bit safer than just saying "eh, something global".

PS: We will probably talk about this tomorrow, and then I will have another look, but I thought this might be worth a shot. I have written my example below "blind", it is only meant to convey an idea. It has not been tested.

Cheers,
Ferdinand

class MessageDataTest(c4d.plugins.MessageData):
    """MessageDataTest implementation that includes a replacement for your
    function OpenModalRequester().

    This is nothing special, I just added the dialog to this class, to
    ensure that we never produce a dangling dialog reference because Python's
    GC did something stupid.
    """

    # This all should not be necessary, since your dialog is modal, i.e, we 
    # should never run into the case that Python's GC does something stupid, 
    # since in your implementation we never leave OpenModalRequester() before 
    # the dialog is being closed. But here we are paranoid and attach the 
    # dialog to the MessageData, which at least judging by your example should 
    # not make a difference for you. But since you have access
    # to this MessageData implementation, you could still open it with 
    #   MessageDataTest._openModalRequester()
    # from the outside if you need to. The reason for this is of course to
    # make the reference counting never go below one for the dialog.
    _requesterDialog = None

    @classmethod
    def _openModalRequester(cls):
        """Handles opening the class bound modal dialog.
        """
        # Don't repeat yourself version of opening the dialog.
        openMe = lambda item: item.Open(dlgtype=c4d.DLG_TYPE_MODAL, 
                                        pluginid=PLUGIN_ID_REQUESTER)

        # The dialog has not been instantiated yet.
        if not isinstance(cls._requesterDialog, c4d.gui.GeDialog):
            cls._requesterDialog = DialogRequester()
            openMe(cls._requesterDialog)
        # There is a dialog instance, but it has been closed.
        elif not cls._requesterDialog.IsOpen():
            openMe(cls._requesterDialog)
        # This is being called not on the main thread for some reason, so we 
        # bail. The logic here being that when the modal dialog is still open,
        # we should be still on the main thread.
        else:
            pass

    def CoreMessage(self, id, bc):
        if id == PLUGIN_ID_COREMESSAGE:
            MessageDataTest._openModalRequester()
        return True

Hi Ferdinand,

in the original code this global variable does not exist. I am roughly aware of the intricacies of global variables and try to avoid them as much as our cat avoids water (for those not so familiar with cats: pretty much). It was more a sign of desperation, I added it here today, while experimenting and playing around. In the original code the dialog is actually just held in a local variable of the static global function. As it is modal, the dialog should not be needed anymore after Open() returned. But sure I will test your code as soon as I get back to work tomorrow morning.
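
To make that reasoning about the local variable concrete, here is a plain-Python sketch with no c4d involved. FakeDialog and open_modal() are hypothetical stand-ins for the dialog and its blocking Open(). It relies on CPython's reference counting and says nothing about what the C++ side of C4D does with the object:

```python
import weakref


class FakeDialog:
    """Hypothetical stand-in for a GeDialog subclass; not the c4d API."""

    def open_modal(self, while_open):
        # A real modal Open() would block here until the user closes the dialog.
        while_open()


observations = {}


def open_modal_requester():
    dlg = FakeDialog()      # the only strong reference is this local variable
    ref = weakref.ref(dlg)  # a weak reference does not keep the dialog alive

    # While the "modal" call is running, the local variable still exists.
    dlg.open_modal(
        lambda: observations.update(alive_while_open=ref() is not None))
    return ref


ref = open_modal_requester()

# After the function has returned, the local is gone and the object is freed.
observations["alive_after_return"] = ref() is not None
```

So, at least on the Python side, the dialog stays alive for as long as the blocking call runs and is only released once the function has returned.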

Thanks for the suggestion and still thinking about my problem.

Cheers,
Andreas

Good morning Ferdinand,

you guessed perfectly right, here in my plugins DialogRequester is basically a more generic and versatile replacement for c4d.gui.MessageDialog() and c4d.gui.QuestionDialog(). Thus the implementation of a static OpenModalRequester() function. It's used all over the place and so far worked very nicely, until I stumbled upon this issue, trying to use it in CoreMessage().

I tested your piece of code, unfortunately as we both expected, it doesn't make a difference. We can probably exclude any fears about Python's cleanup being an issue here.

One question though:
In your _openModalRequester() function, how do you derive not being in main thread, simply from checking if the dialog instance exists or is open? I can't follow your logic in the comment. After all it is more or less a static function. Even if implemented as member function, it could be called everywhere and in arbitrary context. But maybe this comment is to read only in the isolated environment of this example and other cases are simply not considered.

And lastly here is a vastly simplified example. Actually the MessageData is not even needed. Neither is the intermediate popup dialog. It is enough to send the event to 'DialogMain.CoreMessage()'. The chances are definitely lower, but I get C4D to lock up this way as well; it takes maybe ten to fifteen clicks.

Simplified version:
TestDialogFromMessageData_3.pyp

Cheers,
Andreas

Edit: Actually this last finding (issue also occurring with a CM sent directly to the dialog itself) is quite funny. Because in the previous version of the example there's the button sending the CM directly to the MessageData ("Send CM dirctly (no issue)") and with this I am not able to get C4D to lock up. But my assumption now is that it actually has a chance to lock up as well, even if ever so small, for whatever reason. At least I see no real difference between CoreMessage in a dialog or a MessageData.

Hello Andreas,

thanks for the updates. It was a bit a shot into the dark with the garbage collection issue. I will not have time to tackle it today, but I have asked Maxime to test if he can reproduce your issue. I will have a look at it tomorrow again with the newer versions. Just for my clarity:

  1. If I understood you correctly, you said you could reproduce this more reliably on R21, right? And although R21 has left the support cycle, this would be of interest for us, since it would give us better chances of reproducing it.
  2. And my current understanding now is that you were also able to reproduce this on R21 without any further plugins installed, right?
  3. Do you have any third party performance or security tools installed on your system? Examples would be ram managers, firewalls or virus scanners, although the last are probably not going to be an issue. The emphasis lies on third party, i.e., anything provided by Windows, e.g., Windows Defender, does not count.

Cheers,
Ferdinand

Hello Ferdinand,

  1. Yes, I can reproduce it way easier in R21.
  2. Correct. I even removed Redshift. Only above demo plugin is installed and running.
  3. Nope, no "performance tools". Only MSE/Windows Defender and Windows Firewall as protective tools.

Cheers,
Andreas

Edit: And please take your time. No stress. In the end, whatever the outcome of this thread will be, I will need to find some kind of solid workaround anyway. It will need to work in R19 to S24+ and most of these versions will never get any fixes anymore.

Hello Andreas,

sorry for the slight delay. So, I did try again with V3 of your plugin and in R21.116 I had no luck in locking up Cinema 4D. I did massage the Send CM button in short, medium, and long intervals of clicks. I did remove all my other plugins on that Cinema installation, and I did even install a tool to artificially put my machine under load so that Cinema only has a fraction of my RAM and CPU capacity left. All this was however fruitless, I am unable to reproduce this.

Which then got me thinking that I might misunderstand the circumstances that lead to the problem. Below is a screencast of what I was doing the whole time. Could you please confirm that this is what leads to a freeze on your machine?

demo2.gif

edit: We are able to reproduce this now, but it depends heavily on the exact version used. We are not able to reproduce this on S24 at all and not on R21.116 (the version I did use). But it is reproducible on R21.105 and .202.

edit2: So, I am still unable to reproduce this on my machine. I have tested R21.116, .202 and .208 (the last R21 revision). I also did test S24.108 and .111 again without any luck. The ability to reproduce this seems to be connected to the hardware configuration of the machine (I have 32GB Ram and 12GB VRam here). The weird thing is though, that we did test R21.116 on the machine where it did crash on R21.202 and could not make it crash there either.

We will have to see what we are going to do about this and how far it reaches into revisions of Cinema that are still on the support cycle. Probably not going to be fun to find the root of this. I will post an update here once we have come to a conclusion.

Cheers,
Ferdinand

Hi Ferdinand,

thanks for your extensive testing. I feel a bit sorry for having this brought up. Also I wouldn't have expected it to be that hard and specific to reproduce, because here in my R21 chances for it to occur are not that low. Anyway, I can confirm, you are doing the exact right steps to reproduce it.

As almost always with my work, in the end I need something working on multiple versions of C4D. So during the weekend I restructured some parts, to get those requesters out of CoreMessage() as much as possible. And where this was not possible, I went back to the good old standard requesters (c4d.gui.MessageDialog()). So far, this workaround seems to hold. Fingers crossed.
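
For anyone looking for a similar way out, one possible shape of such a restructuring is sketched below in plain Python. All names are hypothetical, nothing here is c4d API, and this is not the actual plugin code. The idea is only that the message handler records the request, and the requester gets opened later from some other place:

```python
from collections import deque


class RequesterQueue:
    """Collects requester requests so the handler itself never opens UI."""

    def __init__(self, show):
        self._pending = deque()
        self._show = show  # callable that actually opens the requester

    def on_core_message(self, text):
        # Called from the message handler: only remember the request.
        self._pending.append(text)
        return True

    def flush(self):
        # Called later, outside of the message handler.
        shown = 0
        while self._pending:
            self._show(self._pending.popleft())
            shown += 1
        return shown


shown_texts = []
requests = RequesterQueue(show=shown_texts.append)

requests.on_core_message("Render finished")
requests.on_core_message("Upload failed")
nothing_shown_in_handler = (shown_texts == [])

flushed = requests.flush()
```

Where exactly flush() would be called from depends on the plugin. The sketch makes no claim about which contexts in C4D are safe for opening a modal dialog.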

So, from my side, you can almost consider this closed. It's completely up to Maxon to judge if this may or may not point to some internal issue. And given the fact that it also seems way less reproducible in S24 on my system, I can fully understand that there are other, more pressing priorities.

I said "almost considered closed", because I'd like to ask two more questions in the context of this issue:

  1. Probably most important for me: There is nothing inherently wrong with my approach of opening requesters from CoreMessage() (when paying attention to some rules, like using a specific custom message,...)?

  2. This may already be considered off topic, let me know, if I should open a separate thread. Could you provide a bit more info on how the Plugin ID parameter is used by GeDialogs? I mean, the golden rule seems to be, when you open a GeDialog from a CommandData, the CommandData's plugin ID gets passed here. Fine. But what if not? If a GeDialog gets opened from another GeDialog? Or like in my case from a MessageData? Is passing zero ok? Is passing a unique plugin ID, registered only for this purpose ok, even if it is not related to a registered CommandData? One thing C4D seems to use the ID for is to store size information. Which can already be a bit annoying, if like in my case, this dialog is used as a requester with varying content, because once it was opened with some more content, it will remain large, even when displaying less content later on. But ok, that's just cosmetic (though I was thinking to register a bunch of plugin IDs to mitigate this issue). But I could imagine the plugin ID to be used for more serious stuff as well. Like managing contexts or event loops. So long text, short question, can I do harm with this plugin ID?
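
The "bunch of plugin IDs" idea from the second question could look roughly like the sketch below. The IDs and size limits are hypothetical placeholders, real IDs would have to be unique ones from Plugin Café, and whether passing such IDs to a dialog is safe at all is exactly the open question:

```python
import bisect

# Hypothetical placeholders: the upper bounds (in content lines) of the
# "small" and "medium" size classes, and one dialog ID per size class.
SIZE_CLASS_LIMITS = [3, 10]
SIZE_CLASS_IDS = [1234580, 1234581, 1234582]  # small, medium, large


def requester_plugin_id(num_lines):
    """Returns the ID of the smallest size class that fits the content."""
    return SIZE_CLASS_IDS[bisect.bisect_left(SIZE_CLASS_LIMITS, num_lines)]
```

With this, a requester showing two lines and one showing twenty lines would use different IDs, so the stored size of the large one would not carry over to the small one.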

Cheers,
Andreas

Hey Andreas,

no need to feel sorry. It is valuable for us to be aware of this, as this could cause more serious problems further down the road, even though it is hard to reproduce for now. We have pushed this off to QA for now, due to them having the required tools (hardware) to assess this more thoroughly.

  1. I do not see anything inherently wrong with that. But the only ones who could answer this with complete certainty are the developers who wrote the Cinema 4D core. And until we can say with a reasonable degree of certainty that this is a reproducible bug that we want to address, I will not bother them with this, since there is other "stuff" in front of the queue for them anyway.

  2. I have forked your second part of the question, as I would like to keep this thread clean, as I would anticipate that QA will confirm this bug, it then going to the developers, and we will then report back here. Which will get a bit convoluted when there is a second question being discussed here. The topic can be found here.

I understand that this is not the most satisfying procedure for you, as it will take a bit of time. The issue of yours must go through our bug tracking system now first, rather than taking the shortcut we sometimes offer here, of us talking with the developers and then creating an issue if we decide to do so.

Cheers,
Ferdinand