Problem/Motivation

1) Drupal\Tests\media\FunctionalJavascript\MediaUiJavascriptTest::testMediaTypes
Behat\Mink\Exception\ResponseTextException: The text "The media type pAkYI8C0 has been added." was not found anywhere in the text of the current page.

This is from https://www.drupal.org/pift-ci-job/853314

Originally, the patch on #2934424-13: Media has no collection route passed. I do not know why a re-test was triggered, but the re-test failed with the message above. I triggered another re-test, and that one passed.

Proposed resolution

Remaining tasks

User interface changes

API changes

Data model changes

Comments

benjifisher created an issue. See original summary.

benjifisher’s picture

Status: Active » Needs review
StatusFileSize
new866 bytes

The attached patch should run the intermittently failing test 100 times, with 1/4 second pause in between runs.

benjifisher’s picture

benjifisher’s picture

StatusFileSize
new10.09 KB

Now let's try the same test, also applying the patch from #2934424-13: Media has no collection route.

xjm’s picture

Issue tags: +Random test failure

Nice work @benjifisher -- looks like #2 caught one! https://www.drupal.org/pift-ci-job/853413

This means that the random fail is already in HEAD, and not introduced by #2934424: Media has no collection route. So we can go ahead with that issue, and instead need to sort out which previous commit introduced this random fail.

xjm’s picture

Status: Needs review » Needs work

Setting back to NW now that we've confirmed the random fail exists in HEAD.

It looks like this might be straightforward to fix. From the test:

    // Test the creation of a media type using the UI.                          
    $name = $this->randomMachineName();
    $description = $this->randomMachineName();
    $this->drupalGet('admin/structure/media/add');
    $page->fillField('label', $name);
    $machine_name = strtolower($name);
    $this->assertJsCondition("jQuery('.machine-name-value').html() == '$machine_name'");
    $page->selectFieldOption('source', 'test');
    $this->assertJsCondition("jQuery('.form-item-source-configuration-test-config-value').length > 0;");
    $page->fillField('description', $description);
    $page->pressButton('Save');
    $assert_session->pageTextContains('The media type ' . $name . ' has been added.');

That's totally missing any wait condition above the last line, so we need to add that to ensure that the request completes before the assertion runs.

benjifisher’s picture

What is the appropriate way to wait in this sort of test?

Note that the "Save" button submits the form at /admin/structure/media/add. This is a full page load, not an AJAX event. I considered using $assert_session->assertWaitOnAjaxRequest(); until I realized this. Looking for other tests that use $page->pressButton('Save');, I see that followed by $assert->addressMatches('...'); in MediaRevisionTest; does that assertion wait until the page loads?

benjifisher’s picture

Status: Needs work » Needs review
StatusFileSize
new2.22 KB

This is just a quick test to see if I am going in the right direction.

benjifisher’s picture

StatusFileSize
new2.22 KB

Same fix as in #8, but this will run the test 100x, like the patches in #2 and #4. Time for the testbot to do some work!

benjifisher’s picture

StatusFileSize
new1.37 KB

This patch is for real. It includes the same changes to MediaUiJavascriptTest as the previous patch, without hacking run-tests.sh.

When the test adds a new media type, and again when it modifies that media type, Drupal loads /admin/structure/media. Both times, the patch waits for the "Add media type" link and asserts the URL:

    $assert_session->waitForLink('Add media type');
    $assert_session->addressMatches('/^\/admin\/structure\/media$/');

The only thing that bothers me is that waitForLink() is normally used for AJAX interactions, but this is a full page load. If there is a better way to implement the wait, I am happy to update the patch.

marcoscano’s picture

+++ b/core/modules/media/tests/src/FunctionalJavascript/MediaUiJavascriptTest.php
@@ -63,6 +63,8 @@ public function testMediaTypes() {
     $page->pressButton('Save');
+    $assert_session->waitForLink('Add media type');
+    $assert_session->addressMatches('/^\/admin\/structure\/media$/');

@@ -134,6 +136,8 @@ public function testMediaTypes() {
     $page->pressButton('Save');
+    $assert_session->waitForLink('Add media type');
+    $assert_session->addressMatches('/^\/admin\/structure\/media$/');

Exactly for the reason you pointed out in #10, another developer is likely to come here in the future and just remove them, without even noticing a failure :) How about a comment before these lines, explaining the need and referring to this issue URL?

Note: it surprises me too that we need to wait for the element, on a non-AJAX call, but I have no other suggestion to make, unfortunately. :)

Also, in other tests, after waiting for an AJAX interaction, we assert the new element is really present, with something like:

    ...
    $page->attachFileToField('files[field_media_image_0]', \Drupal::root() . '/core/modules/media/tests/fixtures/example_1.jpeg');
    $result = $assert_session->waitForButton('Remove');
    $this->assertNotEmpty($result);

I'm no test expert, so may have nothing to do, just wanted to mention it for consistency with other tests.

benjifisher’s picture

StatusFileSize
new1.59 KB

But that makes the patch twice as big (+8 instead of +4). ;)

I added code comments as suggested.

I am also not an expert in testing, either. I notice several places where addressMatches() is used after pressing a button. Looking through the code and documentation, I do not see any indication that this assertion includes an implicit wait.

Should I add assertNotEmpty() as well?

Looking at the line after pressing the 'Delete' button, I notice that there is a method addressEquals(), so this patch uses that instead of addressMatches().

This issue has been marked as a critical bug. As such, I think we should limit work here to the lines that are known to cause test failures ... and by that standard, I may already be doing too much. If I get some guidance, I can work on adding more waits in the Media tests as a follow-up issue.

lendude’s picture

Note: it surprises me too that we need to wait for the element, on a non-AJAX call, but I have no other suggestion to make, unfortunately. :)

It surprises me too, but it's not the first time this is needed, see #2864177: Random failure in FormErrorHandlerCKEditorTest
But it would be nice to find out why and when this is needed.....

Also, in other tests, after waiting for an AJAX interaction, we assert the new element is really present

This is correct. You need to assert that the element was actually found, otherwise you might just be waiting for 10 seconds, which might help the test pass, but is fairly inefficient. We want to be sure we wait for as short a time as possible.

benjifisher’s picture

You need to assert that the element was actually found

I think this (from the patch in #12) is good enough, but I can change the assertion if you think it is better to be consistent:

    $assert_session->waitForLink('Add media type');
    $assert_session->addressEquals('admin/structure/media');
lendude’s picture

I feel we do need the assertion that $assert_session->waitForLink('Add media type'); found something, because if we change the link text to something else later, we have no indication that this test is running 10 seconds longer then necessary.

For reference if you change the waitFor to
$assert_session->waitForLink('Dingoes ate my baby');
the test passes just fine now, it just runs a little longer.

So I think we need to future proof this and assert that we wait for the correct link.

benjifisher’s picture

StatusFileSize
new1.68 KB

@Lendude, that sounds like a good reason. Assertion added.

lendude’s picture

Status: Needs review » Reviewed & tested by the community

@benjifisher thanks for adding it :)

We have proof of failure in #2, we have proof of passing in #9, nice work!

xjm’s picture

I queued several more runs of #9; the fail in #2 only showed up 1/6 times, which means we need several times that to be reasonably confident it is indeed fixed. :)

Another good way to test these issues is to come up with a patch that fails 100% of the time (e.g. with a very high sleep at exactly the step where we're waiting for the page load). Not a requirement, but helpful. We can also just wait to see what happens in like 20 runs. (I'm only queuing a few at a time to avoid overloading CI; will revisit later.)

xjm’s picture

Status: Reviewed & tested by the community » Needs work
+++ b/core/modules/media/tests/src/FunctionalJavascript/MediaUiJavascriptTest.php
@@ -63,6 +63,11 @@ public function testMediaTypes() {
+    // https://www.drupal.org/project/drupal/issues/2934997.

@@ -134,6 +139,11 @@ public function testMediaTypes() {
+    // The wait prevents intermittent test failures. See
+    // https://www.drupal.org/project/drupal/issues/2934997.

Oh, we should never put "See" to the issue in a patch. If people want to know what issue added the line, they can use git blame.

benjifisher’s picture

StatusFileSize
new1.55 KB

Another good way to test these issues is to come up with a patch that fails 100% of the time (e.g. with a very high sleep at exactly the step where we're waiting for the page load).

Agreed. A high sleep in the test would make the test less likely to fail, so I guess you mean putting a sleep into the submit callback that is being triggered? That should be easy enough, but not until later (maybe tomorrow).

A single failure is useless for statistics. Before knowing how many times we should exercise the patch, we should also run the failing test enough times to have an estimate for how often it fails.

tedbow’s picture

I tried to make this test fail (without the patch ) by just adding sleep(1) to index.php.

It didn't fail so I went to 5 and then 10. It then failed but in a different spot:

Javascript condition met:
jQuery('.form-item-source-configuration-test-config-value').length > 0;
Failed asserting that false is true.
/var/www/d8/core/tests/Drupal/KernelTests/AssertLegacyTrait.php:31
/var/www/d8/core/tests/Drupal/FunctionalJavascriptTests/JavascriptTestBase.php:121
/var/www/d8/core/modules/media/tests/src/FunctionalJavascript/MediaUiJavascriptTest.php:63

$page->selectFieldOption('source', 'test');
    $this->assertJsCondition("jQuery('.form-item-source-configuration-test-config-value').length > 0;");

Assuming this is some kind of ajax for reload where option show up after you select a value for field. Does this have to be done with assertJsCondition?

Could it be done with \Drupal\FunctionalJavascriptTests\JSWebAssert::waitForElement()? Of course checking for result.

but this is what is making this test fail. but the fact that it only fails after waiting 10 seconds and not on the same line makes me wonder if there is not something else going on besides just slow tests.

benjifisher’s picture

StatusFileSize
new866 bytes

I am not sure what to do with that. :-(

For one thing, we have documented test failures on one line of code, but not the one you found.

That part of the test is exercising /admin/structure/media/add. Choosing "test" from the "Media source" select list indeed triggers an AJAX event, loading the configuration for the "Test source" media source (from the media_test_source module). Then the assertion is looking for one of the elements of that configuration form.

Maybe, while I ponder the last line of Comment #21, I will upload a variant of the patch from #2, running this test 10 times, with a 10-sec pause in index.php.

benjifisher’s picture

Status: Needs work » Needs review
benjifisher’s picture

StatusFileSize
new867 bytes

Oops, that was only a 1-second pause.

Status: Needs review » Needs work

The last submitted patch, 24: 2934997-24.patch, failed testing. View results
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.

benjifisher’s picture

Status: Needs work » Needs review

OK, my patch in #24 failed at the same place as @tedbow's experiment.

I mulled it over, and I think this is not a problem.

If you put in a 10-second wait, as we did, then the test will fail at the first possible place. That place seems to be the assertJsCondition() on Line 63.

There is only one test method in this class, and this is how it starts:

  public function testMediaTypes() {
    $session = $this->getSession();
    $page = $session->getPage();
    $assert_session = $this->assertSession();

    $this->drupalGet('admin/structure/media');
    $assert_session->pageTextContains('No media types available. Add media type.');
    $assert_session->linkExists('Add media type');

    // Test the creation of a media type using the UI.
    $name = $this->randomMachineName();
    $description = $this->randomMachineName();
    $this->drupalGet('admin/structure/media/add');
    $page->fillField('label', $name);
    $machine_name = strtolower($name);
    $this->assertJsCondition("jQuery('.machine-name-value').html() == '$machine_name'");
    $page->selectFieldOption('source', 'test');
    $this->assertJsCondition("jQuery('.form-item-source-configuration-test-config-value').length > 0;");
    // ...
  }

Correct me if I am wrong on any of this. (I am pretty sure about the middle three points.)

  • The drupalGet() calls will wait patiently for the page to load.
  • There are a bunch of assertions and miscellaneous assignments, none of which trigger a Drupal bootstrap.
  • The fillField() might trigger an AJAX event, but on this page it does not.
  • The selectFieldOption() (Line 62) does trigger an AJAX event.
  • The next line looks for something that will only be there when the AJAX call completes, so the assertion fails.

When the assertion fails, the test stops running, so we do not see any further failures.

Conclusion: all that is going on is slow tests.

If we want to make this test bullet-proof, then we can insert a wait statement here, and look for other places to do the same. I think that is a good idea, but out of scope for this issue, as I said in #12. Am I right?

I am re-uploading the patch from #20. Maybe tomorrow I will take up @xjm's challenge from #18.

benjifisher’s picture

StatusFileSize
new1.55 KB
lendude’s picture

Status: Needs review » Reviewed & tested by the community

Having a 100% failing test might be nice to have but sounds unlikely to happen. Since it's still unclear why this test fails in the first place. Having to wait for a non-dynamic element on page load is not something that is usually needed. That might be something we want to look into, since it is now needed twice in core (so I assume we will run into this again), but seems out of scope for fixing this critical. I opened up a follow up for this, #2936122: Find out why JavscriptTestBase occasionally needs a waitForElement on a normal page load

Given the 1/600 failure rate in #2, it means the 2200 runs in #9 mean we now only have 0.0455% chance of not having caught the cause but still get 2200 passes. Sounds pretty solid to me.
Bumping back to RTBC to see if people agree :)

Version: 8.5.x-dev » 8.6.x-dev

Drupal 8.5.0-alpha1 will be released the week of January 17, 2018, which means new developments and disruptive changes should now be targeted against the 8.6.x-dev branch. For more information see the Drupal 8 minor version schedule and the Allowed changes during the Drupal 8 release cycle.

  • catch committed 14c499f on 8.6.x
    Issue #2934997 by benjifisher, Lendude, xjm: Intermittent failure in...
catch’s picture

Version: 8.6.x-dev » 8.5.x-dev

Committed 14c499f and pushed to 8.6.x. Thanks!

Will cherry-pick to 8.5.x after the alpha freeze.

catch’s picture

Status: Reviewed & tested by the community » Fixed

Done.

  • catch committed f3ca2c9 on 8.5.x
    Issue #2934997 by benjifisher, Lendude, xjm: Intermittent failure in...

Status: Fixed » Closed (fixed)

Automatically closed - issue fixed for 2 weeks with no activity.

xjm’s picture