Efforts to Ease "Insights Analytics" Setup on Open edX



Deploying Insights Analytics on Open edX is extremely complicated. Several teams within edX and the consultancy OpenCraft have started a collaboration to address some of the pain points around Insights Analytics setup, deployment, and maintenance. Plans have been outlined in this Google document. The difficulties reported by the Analytics Team are summarized below:

  1. Maintaining jobs on the scheduler is a highly manual and rather difficult process
  2. Jobs fail periodically; we should identify all common causes and resolve them
  3. Schema changes are very painful (see the process above)
  4. The AWS configuration is rather complex and difficult to replicate
  5. The pipeline should be installed like every other component in the edX infrastructure. Currently it is not.
  6. We should seriously consider deprecating edx-analytics-configuration and just merging it into the edx/configuration monolith.
  7. The analyticstack (devstack) lags behind quite a bit, and generating new versions of it takes some manual intervention. It also doesn’t support Elasticsearch 1.5, which is used by currently-in-development features in Insights. We’d like to move this into Docker.
  8. Centralize event collection. We should probably be using Kafka or something similar.
  9. Non-AWS configuration is rather complex and difficult to set up, which is very painful for the open source community.

From OpenCraft:

  1. Lack of documentation
  2. Problems setting up the edX Analytics Devstack (the process took a long time and was impossible to complete for one team member; the overall complexity of the stack made it difficult to distribute work to additional team members as needed)
  3. Problems with Hadoop version conflicts (fixed at the time via a couple of PRs: #128, #127); this is not really an issue anymore
  4. No (straightforward) way to run acceptance tests for edx-analytics-pipeline
  5. Using Analytics in production:
    1. Many steps required to install the stack (partly due to Ansible scripts making assumptions about, e.g., AWS regions)
    2. Many steps required to configure Jenkins (manually creating jobs and setting parameters/interval for each Analytics task, etc.)
  6. The number of PRs required to implement major changes slows work down (these types of changes often require PRs in four different repos; see “Dependencies” in this example)
  7. Not being able to merge PRs implementing work done for clients; having to maintain changes separately
  8. Deciding where to add different types of functionality (instructor dashboard vs. Insights) was not straightforward in some cases