Assemblyline is a scalable distributed file analysis framework. It is designed to process millions of files per day but can also be installed on a single box.
Canada’s electronic spy agency says it is taking the “unprecedented step” of releasing one of its own cyber defence tools to the public, in a bid to help companies and organizations better defend their computers and networks against malicious threats.
An Assemblyline cluster consists of three types of boxes: Core, Datastore, and Worker.
Components
Assemblyline Core
The Assemblyline Core server runs all the components required to receive tasks and dispatch them to the workers. It hosts the following processes:
- Redis (Queue/Messaging)
- FTP (proftpd: File transfer)
- Dispatcher (Worker tasking and job completion)
- Ingester (High volume task ingestion)
- Expiry (Data deletion)
- Alerter (Creates alerts when the score threshold is met)
- UI/API (NGINX, UWSGI, Flask, AngularJS)
- Websocket (NGINX, Gunicorn, GEvent)
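To make the division of labour among the Ingester, Dispatcher, and Alerter more concrete, here is a minimal sketch in plain Python. It is not Assemblyline code: a `deque` stands in for the Redis queue, the two worker functions are toy stand-ins for analysis services, and the names `ingest`, `dispatch`, `alert`, and `ALERT_THRESHOLD` are hypothetical. Summing worker scores into one total per file is also an assumption made for illustration.

```python
from collections import deque

# Hypothetical threshold; a real deployment would configure its own value.
ALERT_THRESHOLD = 500


def ingest(queue, files):
    """Ingester role: push incoming files onto the task queue."""
    for name in files:
        queue.append({"file": name})


def dispatch(queue, workers):
    """Dispatcher role: hand each queued task to every worker, total the scores."""
    results = []
    while queue:
        task = queue.popleft()
        score = sum(worker(task["file"]) for worker in workers)
        results.append({"file": task["file"], "score": score})
    return results


def alert(results, threshold=ALERT_THRESHOLD):
    """Alerter role: keep only the results whose score meets the threshold."""
    return [r for r in results if r["score"] >= threshold]


# Toy workers standing in for analysis services (assumptions, not real services).
def extension_worker(name):
    return 400 if name.endswith(".exe") else 0


def keyword_worker(name):
    return 200 if "invoice" in name else 0


tasks = deque()  # stands in for a Redis queue
ingest(tasks, ["invoice.exe", "notes.txt", "invoice.pdf"])
results = dispatch(tasks, [extension_worker, keyword_worker])
alerts = alert(results)
print(alerts)
```

In this sketch only `invoice.exe` reaches the threshold, so it is the only file that produces an alert. In a real cluster the queue would live in Redis and the workers would run on separate boxes.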
Assemblyline Datastore
Assemblyline uses Riak for persistent data storage. Riak is a key/value datastore with Solr integration for search. It is fully distributed and horizontally scalable.
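The idea of a key/value store paired with a search index can be illustrated with a small in-memory sketch. This is only an analogy under stated assumptions: it does not use the Riak client API, it does not model distribution across nodes, and the class `ToyDatastore` with its `put`, `get`, and `search` methods is hypothetical.

```python
from collections import defaultdict


class ToyDatastore:
    """In-memory stand-in for a key/value store with a search index."""

    def __init__(self):
        self._values = {}                # key -> document (a flat dict)
        self._index = defaultdict(set)   # (field, value) -> set of keys

    def put(self, key, document):
        # Drop index entries of any previous document stored under this key.
        old = self._values.get(key)
        if old is not None:
            for item in old.items():
                self._index[item].discard(key)
        self._values[key] = document
        # Index every (field, value) pair so it can be searched later.
        for item in document.items():
            self._index[item].add(key)

    def get(self, key):
        """Direct key lookup, the basic key/value operation."""
        return self._values.get(key)

    def search(self, field, value):
        """Return the sorted keys of documents whose field equals value."""
        return sorted(self._index.get((field, value), set()))


store = ToyDatastore()
store.put("sha256:aaa", {"type": "executable", "score": 600})
store.put("sha256:bbb", {"type": "document", "score": 0})
store.put("sha256:ccc", {"type": "executable", "score": 0})
executables = store.search("type", "executable")
print(executables)
```

Lookups by key go straight to the stored document, while `search` answers field-based queries from the index. That is roughly the split between the key/value layer and the search layer described above.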