Weekly Data News
if your data don't look weird you're not looking hard enough
- People : Business Analysts and IT coordination
- Log book : daily KPI checklist in Confluence page
- Next step : clear email output to management team
Most of the week was dedicated to PLR preparation. We should have more accurate statistics on the new training search by next week, as Eugen Cepoi is taking over David Stendardi's data crunching work.
Prepared the skills suggestion emails in collaboration with Julie S., with a first send planned for next week (about 700K members).
MoneyOff this week
Dealt with a lot of tracking issues. All of them should be fixed by today.
Audited Datameer jobs, cleaning up or deleting some of them. Wrote a clear summary of what has been done and what exists now.
Data Management and Machine Learning
- Conversion rates analysis.
- Next : take marketing campaigns into account.
- Jointly with the Love team, built skills suggestions based on headline + department*industry. 130,000 recipients to be targeted.
- Preparation of Position Department migration
- Improve error management
- Simplify API : Dead Code deletion
- Support API team to help implement their domain
- Next : extract trainings stats for Clémentine
- Next : Documentation of Resources / Queries / Message format
- Next : With David, see how to store legacy events
- Next : Automatic validation with annotations
Collaborative "Less is More" Filtering (CLiMF) for "People You May Know"
- Built full data set of member relations (all countries, no restrictions)
- Ran the algorithm on an Amazon High I/O instance for 24h : the data set is too massive to process at a reasonable cost in EC2 (sparse matrix of size 22Mx22M).
- Next : first option is to run the experiment on equivalent hardware on premise.
- Next : second option is to build smaller datasets based on country and activity. For instance, for all 7 million French members, predict the top 100 recommended contacts among our top 500K active members.
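
To give a sense of scale for the second option, here is a rough sketch in Python. It only counts (member, candidate) pairs using the figures quoted above; it does not measure memory or runtime, since the relation matrix is sparse. The function names, the dict-of-sets representation and the toy data are assumptions made for illustration, not the pipeline actually used.

```python
def score_matrix_cells(n_members: int, n_candidates: int) -> int:
    """Number of (member, candidate) pairs a full score matrix would cover."""
    return n_members * n_candidates


def restrict_relations(relations, members, candidates):
    """Keep only rows for `members`, and only contacts that are in `candidates`."""
    return {
        m: {c for c in contacts if c in candidates}
        for m, contacts in relations.items()
        if m in members
    }


# Figures quoted in the bullets above.
full = score_matrix_cells(22_000_000, 22_000_000)   # all members x all members
reduced = score_matrix_cells(7_000_000, 500_000)    # French members x top active members
reduction_factor = full / reduced                   # how much smaller the pair space gets
output_rows = 7_000_000 * 100                       # top 100 contacts per French member

# Toy relations, invented for illustration only.
toy_relations = {"a": {"b", "c", "d"}, "b": {"a"}, "e": {"a", "c"}}
toy_subset = restrict_relations(
    toy_relations, members={"a", "e"}, candidates={"c", "d"}
)

print(f"full: {full:.2e} pairs, reduced: {reduced:.2e} pairs")
print(f"reduction factor: {reduction_factor:.1f}x, output rows: {output_rows:,}")
print(toy_subset)
```

Under these assumptions, the country and activity restriction shrinks the pair space by roughly two orders of magnitude, while the final recommendation table stays at 700 million rows.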
- Focus is on finding a way to access all the data we need despite heavy network constraints between the SF data center and the Amazon VPC, until the Ops team starts transforming the infrastructure (soon).
- We can now browse and transform data extracted by the Core Search team in the SF Hadoop cluster, using the same file format as other Viadeo teams (Avro).
- Clémence Desneiges - Manager of Business Intelligence
- Clémentine Blanchon - Business Analyst B2B
- Julie Buire - Business Analyst B2C Growth
- Frédéric Chancholle - Business Analyst B2C Love
- Kristine Romero - Business Analyst Marketing and B2C Money
- Pierre-Emmanuel Servant - Business Analyst Search (San Francisco)
Data Management & Machine Learning
- Eugen Cepoi - Software Architect
- Arnaud de Myttenaere - PhD Candidate
- Julie Séguéla - Data Management Expert
- Julien Bille - Data Platform Engineer
- Jean-Luc Canela - Tech Lead
- Mathieu Chataigner - Data Platform Engineer
- Iñigo Mediavilla - Web Development Engineer