This document discusses using StreamSets Data Collector (SDC) to build a logging infrastructure for microservices. SDC can ingest logs from microservices running in containers and cope with schema changes and new log formats as they appear. It processes and transforms the logs, then sends them to destinations such as Kafka. To handle large volumes of log data, SDC pipelines can run on Spark clusters managed by YARN or Mesos and load the data into systems such as HDFS, HBase, and Elasticsearch for analysis.
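
To make the idea of tolerating schema changes and new log formats concrete, here is a minimal sketch in plain Python. It is not SDC code and does not use any SDC API. The plain-text layout, the core field names, and the `normalize` function are all assumptions chosen for illustration. The sketch accepts either a JSON log line or a space-separated text line and maps both onto one stable record, keeping any unexpected fields instead of failing on them.

```python
import json
import re

# Hypothetical plain-text layout: "<timestamp> <LEVEL> <service> <message>"
PLAIN_LINE = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<level>[A-Z]+)\s+(?P<service>\S+)\s+(?P<message>.*)$"
)

# Hypothetical set of fields every downstream consumer can rely on.
CORE_FIELDS = ("timestamp", "level", "service", "message")


def normalize(line):
    """Turn one raw log line into a record with a stable set of core fields."""
    line = line.strip()

    # First try JSON, since structured logs carry their own field names.
    try:
        parsed = json.loads(line)
    except ValueError:
        parsed = None

    if isinstance(parsed, dict):
        fields = parsed
    else:
        # Fall back to the assumed plain-text layout; if that fails too,
        # keep the whole line as the message rather than dropping it.
        match = PLAIN_LINE.match(line)
        fields = match.groupdict() if match else {"message": line}

    # Missing core fields become None, so a removed field does not break consumers.
    record = {name: fields.get(name) for name in CORE_FIELDS}

    # Fields the schema does not know about are preserved under "extra",
    # so a newly added field passes through without a code change.
    record["extra"] = {k: v for k, v in fields.items() if k not in CORE_FIELDS}
    return record
```

A record produced this way could then be serialized and handed to whatever destination the pipeline writes to. The design choice worth noting is that unknown input is carried forward rather than rejected, which is one simple way to keep ingestion running while formats evolve.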