Invited talk at the 12th International Workshop on Large-Scale and Distributed Systems for Information Retrieval, co-located with ACM CIKM 2015, October 23, 2015, Melbourne, Australia.
Cxense helps companies understand their audience and build great online experiences. Cxense Insight and DMP let customers annotate, filter, segment and target their users in real time, based on the content they consume and the actions they perform. With more than 5000 active websites, Insight alone tracks more than a billion unique users and more than 15 billion page views per month. To leverage this huge amount of data in real time, we have built a large distributed system relying on techniques familiar from databases, information retrieval and data mining. In this talk, we outline our solutions and give some insight into the technology we use and the challenges we face. This introduction should be interesting to undergraduate and PhD students as well as experienced researchers and engineers. [Extended abstract/description: Preprint PDF, ACM DL]