Saturday, July 7, 2012

Yellow Squad Weekly Project Report: July 6

Summary: We made some good progress, but this week was not as successful as we had hoped.

[introduction] [project report] [tricks] [topics]

Weekly Goal Progress

  • Continue running parallel tests on the EC2 32 core machine and aggregating results.
    COMPLETED. Our test failures continued to be triggered only by three known bugs (974617/1011847, 1002820 and, apparently, a return of 1014916), though for some reason the success percentages fell significantly, to the 60-70% range.
  • Make another attempt on at least one of 974617/1011847 and 1002820.
    ATTEMPT COMPLETED UNSUCCESSFULLY.  Based on some of the log messages, we tried to relate 974617/1011847 to 504291 and build a workaround from that direction.  It didn't work, but we discussed it further with Stuart.  Thanks to him, we have some more things to attempt in that direction.
    • The code that resets the database when the pgbouncer is installed might remove the stores and then abort.  That appears to be an incorrect order, so we can investigate that.
    • The logs really seem to be saying that the database (which is supposed to be pgbouncer at this point) really is not there.  This happens after we have verified that the bouncer is accepting connections. To investigate this oddity, we could insert some diagnostics on each retry that check some or all of the following:
      • Is the bouncer pidfile still there? If so, is the process still running?
      • Is the bouncer still accepting connections on the expected port?
      • Does launchpad have the postgres port we expect for the pgbouncer?
  • Land initial and usable versions of the remaining lpsetup commands: get, update, and inittests.
    INCOMPLETE.  We landed "get" (and renamed it to "init-repo"), and after the weekly call we landed "update".  "inittests" is not ready yet.
  • Package and use a refactored version of lpsetup for our parallel testing setup to validate that it still works there.
    NOT DONE.  We tested lpsetup in this configuration manually but have not packaged and run it in our full automation.
  • Agree with IS on an approach to configuring the two new production machines.
    INCOMPLETE. We have tentative approval, but not confirmation.  The approach relies on lpsetup being in a workable state for this task.  The biggest concern there is the system update story, discussed on our call last week.  We have asked for Robert's input and are asking him to propose next steps in the discussion.
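
The per-retry pgbouncer diagnostics proposed under the second goal above (pidfile, process, and port checks) could be sketched roughly as follows. This is only an illustration, not code from the Launchpad tree: the function names, the pidfile path, and the host and port arguments are all hypothetical, and it assumes a POSIX system. The third check, whether launchpad is configured with the port we expect, depends on Launchpad's own configuration and is not covered here.

```python
import errno
import os
import socket


def pid_from_pidfile(pidfile_path):
    """Return the PID stored in the pidfile, or None if missing or unreadable."""
    try:
        with open(pidfile_path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def process_is_running(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    if pid <= 0:
        return False
    try:
        # Signal 0 performs the existence and permission checks
        # without actually delivering a signal.
        os.kill(pid, 0)
    except OSError as e:
        # EPERM means the process exists but belongs to another user.
        return e.errno == errno.EPERM
    return True


def port_is_accepting(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def bouncer_diagnostics(pidfile_path, host, port):
    """Collect the checks into one dict, suitable for logging on each retry."""
    pid = pid_from_pidfile(pidfile_path)
    return {
        "pidfile_readable": pid is not None,
        "process_running": pid is not None and process_is_running(pid),
        "accepting_connections": port_is_accepting(host, port),
    }
```

Logging the returned dict on every retry would show whether the bouncer disappears between the initial connection check and the later failure, or whether it stays up while launchpad looks for it somewhere else.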

Action Items

  • ACTION: gary_poster will try to arrange a time to discuss with Robert how to update lpsetup's system workarounds over time.
    IN PROGRESS: gary_poster contacted Robert about this via email and IRC, but he hasn't been available yet.
  • ACTION: gary_poster will make a kanban card for developing a basic, manual-run integration test suite.
    DONE: gary_poster created the card.  benji has a first cut at an lpsetup integration test, which should be ready to land at the start of this coming week.  We are using Juju (specifically the ubuntu charm) to provide our clean machine abstraction, so we could run integration tests of our installation commands on clean machines provided by EC2, LXC, or other Juju providers.

Other Status

IS is in the process of preparing the new testing machines for the configuration step.

Please Help

Goals for Next Week

frankban is unavailable Monday and Tuesday. gmb is unavailable Friday.
  • Continue running parallel tests on the EC2 32 core machine and aggregating results.
  • Make another effort to close at least one of the three known bugs (974617/1011847, 1002820 and 1014916).
  • Land an initial, working cut of the lpsetup init-tests command, make a new deb release of lpsetup, and successfully incorporate the release into our automatic test structure.
  • Land integration tests for lpsetup.
  • Provide an initial, incremental approach for letting lpsetup update its system workarounds across releases.
  • Give IS complete, tested instructions on how to set up a test machine in the data center.
  • Get confirmation from IS about the approach to configuring the two new machines.

Thanks for reading the project report!  See below for links to reports on other parts of our Friday meeting.

[introduction] [project report] [tricks] [topics]
