This document describes the backend side of trace collection, which runs on Google Compute Engine.
When the frontend spawns new tasks, it pushes them into a TaskQueue called clovis-queue with a unique tag. It then creates backend instances (as an instance group) and passes them the TaskQueue tag.
The backend instances pull tasks from the TaskQueue and process them until the queue is empty. When no task is left in the queue, the backend instances destroy themselves.
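The pull-until-empty behaviour described above can be sketched as follows. This is only an illustration: InMemoryQueue, run_worker and the self_destruct callback are hypothetical names, the ClovisTask class here is a simplified stand-in, and the real worker talks to the TaskQueue service rather than to an in-memory structure.

```python
import collections


class ClovisTask(object):
    """Simplified stand-in for a task: an action name plus parameters."""

    def __init__(self, action, params=None):
        self.action = action
        self.params = params or {}


class InMemoryQueue(object):
    """Hypothetical replacement for clovis-queue, with tasks grouped by tag."""

    def __init__(self):
        self._tasks = collections.defaultdict(collections.deque)

    def push(self, tag, task):
        self._tasks[tag].append(task)

    def pull(self, tag):
        """Returns the next task for `tag`, or None when none is left."""
        tasks = self._tasks[tag]
        return tasks.popleft() if tasks else None


def run_worker(queue, tag, handler, self_destruct):
    """Processes the tasks tagged `tag` until the queue has none left.

    `handler` is called once per task. `self_destruct` is called once,
    after the last task. Returns the number of processed tasks.
    """
    processed = 0
    while True:
        task = queue.pull(tag)
        if task is None:
            break
        handler(task)
        processed += 1
    self_destruct()
    return processed
```

Because the worker only pulls tasks carrying its own tag, several batches of tasks can share the same queue without interfering with each other.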
The main files for the backend are:

- startup-script.sh: initializes an instance (installs the dependencies, downloads the code and the configuration).
- worker.py: the main worker script.
- Task handlers, each with a Run() method taking a ClovisTask parameter:
    - clovis_task_handler.py: main entry point, dispatches the tasks to the more specialized handlers below.
    - trace_task_handler.py: handles trace tasks.
    - report_task_handler.py: handles report tasks.

Install the gcloud command line tool.
This step deploys all the source code needed by the backend workers, as well as the Chromium binaries required for trace collection.

    # Build Chrome (do not use the component build).
    BUILD_DIR=out/Release
    ninja -C $BUILD_DIR -j1000 -l60 chrome chrome_sandbox

    # Deploy to GCE
    # CLOUD_STORAGE_PATH is the path in Google Cloud Storage under which the
    # Clovis deployment will be uploaded.
    ./tools/android/loading/cloud/backend/deploy.sh $BUILD_DIR $CLOUD_STORAGE_PATH

The application is automatically started by the frontend, and should not need to be started manually.
If you really want to create an instance manually (for example, when debugging), you can do so as follows:

    gcloud compute instances create $INSTANCE_NAME \
        --machine-type n1-standard-1 \
        --image ubuntu-14-04 \
        --zone europe-west1-c \
        --scopes cloud-platform,https://www.googleapis.com/auth/cloud-taskqueue \
        --metadata \
            cloud-storage-path=$CLOUD_STORAGE_PATH,task-dir=dir,taskqueue-tag=tag \
        --metadata-from-file \
            startup-script=$CHROMIUM_SRC/tools/android/loading/cloud/backend/startup-script.sh

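When instances are created from a script, the comma-separated value passed to --metadata can be assembled programmatically. The sketch below is an assumption-laden illustration: build_metadata is a hypothetical helper, not part of the backend, and it assumes that omitting the auto-start and self-destruct keys (the debugging keys described in the next paragraph) leaves the default behaviour in place.

```python
def build_metadata(cloud_storage_path, task_dir, taskqueue_tag,
                   auto_start=True, self_destruct=True):
    """Returns the comma-separated KEY=VALUE string for --metadata.

    Hypothetical helper: the key names come from the gcloud command above,
    the function itself is only an illustration.
    """
    entries = [
        ('cloud-storage-path', cloud_storage_path),
        ('task-dir', task_dir),
        ('taskqueue-tag', taskqueue_tag),
    ]
    # Assumption: the debugging keys are only needed to turn a behaviour off.
    if not auto_start:
        entries.append(('auto-start', 'false'))
    if not self_destruct:
        entries.append(('self-destruct', 'false'))
    return ','.join('%s=%s' % (key, value) for key, value in entries)
```

For a debugging session, build_metadata(path, 'dir', 'tag', auto_start=False, self_destruct=False) yields the same three keys as the command above, followed by auto-start=false and self-destruct=false.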
If you are debugging, you probably want to set additional metadata:

- auto-start=false: starts an instance without automatically starting the app on it. This is useful for iterative development on the instance over ssh, because you can stop and restart the app manually.
- self-destruct=false: prevents the instance from self-destructing when the queue is empty.

Notes:

- If you use auto-start=false and then ssh into the instance and launch worker.py, it will not work because of various issues, such as:
    - You need to run sudo chown to give yourself permissions.
    - You need to activate virtualenv.
  Get in touch with droger@ if you need this or want to improve it.
- To follow the progress of the startup script, read the instance's serial port output:

    gcloud compute instances get-serial-port-output $INSTANCE_NAME

worker.py configuration file

worker.py takes a configuration file as a command line parameter. This is a JSON dictionary with the keys:

- project_name (string): Name of the Google Cloud project.
- task_storage_path (string): Path in Google Storage where task output is generated.
- binaries_path (string): Path to the executables (containing chrome).
- src_path (string): Path to the Chromium source directory.
- taskqueue_tag (string): Tag used by the worker when pulling tasks from clovis-queue.
- ad_rules_filename and tracking_rules_filename (string): Paths to the ad and tracking filtering rules.
- instance_name (string, optional): Name of the Compute Engine instance this script is running on.
- worker_log_path (string, optional): Path to the log file capturing the output of worker.py, to be uploaded to Cloud Storage.
- self_destruct (boolean, optional): Whether the worker will destroy the Compute Engine instance when there are no remaining tasks to process. This is only relevant when running in the cloud, and requires instance_name to be defined.

Create tasks from the associated AppEngine application, see documentation.
If you want the frontend to send tasks to a particular instance that you created manually, make sure the tag and storage_bucket of the AppEngine request match the ones of your ComputeEngine instance, and set instance_count to 0.
To stop a single instance that you started manually, run:

    gcloud compute instances delete $INSTANCE_NAME

To stop instances that were created by the frontend, you must delete the instance group, not the individual instances. Otherwise the instance group will just recreate the deleted instances. You can do this from the Google Cloud console web interface, or using the gcloud compute instance-groups commands.
To connect to an instance with SSH:

    gcloud compute ssh $INSTANCE_NAME

To run the worker locally, start from a new directory and set up a local environment:

    virtualenv env
    source env/bin/activate
    pip install -r \
        $CHROMIUM_SRC/tools/android/loading/cloud/backend/pip_requirements.txt

The first time, you may need to get more access tokens:

    gcloud beta auth application-default login --scopes \
        https://www.googleapis.com/auth/cloud-taskqueue \
        https://www.googleapis.com/auth/cloud-platform

Create a local configuration file for worker.py. Example:

    cat >$CONFIG_FILE << EOF
    {
      "project_name" : "$PROJECT_NAME",
      "cloud_storage_path" : "$CLOUD_STORAGE_PATH",
      "binaries_path" : "$BUILD_DIR",
      "src_path" : "$CHROMIUM_SRC",
      "taskqueue_tag" : "some-tag"
    }
    EOF

Launch the app, passing the path to the deployment configuration file:

    python $CHROMIUM_SRC/tools/android/loading/cloud/backend/worker.py \
        --config $CONFIG_FILE

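For readers curious about what happens to the --config argument, here is a minimal sketch of how a script could parse it and load the JSON dictionary. It is not the actual worker.py code: parse_args and load_config are hypothetical names, and the only rule enforced is the documented one that self_destruct requires instance_name.

```python
import argparse
import json


def parse_args(argv):
    """Parses the command line of a hypothetical worker entry point."""
    parser = argparse.ArgumentParser(description='Illustrative worker.')
    parser.add_argument('--config', required=True,
                        help='Path to the JSON configuration file.')
    return parser.parse_args(argv)


def load_config(path):
    """Loads the configuration and checks the documented constraint."""
    with open(path) as config_file:
        config = json.load(config_file)
    if not isinstance(config, dict):
        raise ValueError('The configuration must be a JSON dictionary.')
    # self_destruct is only meaningful when the instance name is known.
    if config.get('self_destruct') and not config.get('instance_name'):
        raise ValueError('self_destruct requires instance_name.')
    return config
```

A caller would typically chain the two, for example load_config(parse_args(sys.argv[1:]).config), and report any ValueError before pulling the first task.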
You can now use the app.
Tear down the local environment:
deactivate