Unusually friendly for a Swede with a PhD in parallel and distributed systems. Worked on big data streams, optimizing system architecture for high throughput and spiky lookup load. Currently a developer on Bitbucket, focusing on how to predict, measure, and optimize the performance of cloud and BTF services.
YOW! Data 2016 Sydney
Big Data Feedback Architectures
TALK
Want to harness the real power of big data? Then you’ll need to build an architecture capable of closing the feedback loop through machine learning. In this presentation, I’ll share knowledge gathered from designing streaming big data systems for mobile advertising, where every minute taken off the feedback loop translates to real dollars.
The inherent challenge is balancing technology maturity, hardware cost, and the needs of machine learning. The streaming technology we used is similar to Apache Spark, and gave us a serious competitive edge in dealing with several hundred thousand auctions per second. By combining the power of Hadoop, Cassandra, Hive, and Pig, we managed to build a cost-effective solution capable of handling massive incoming traffic and tens of thousands of user-data enrichments per second while maintaining zero loss of business-critical data.