#050 #amld2019 #frictionlessdata

The Applied Machine Learning Days was a four-day conference at EPFL in Lausanne at the end of January 2019. This post focuses on the Shipshape open data workshop. Coverage of the AI & Cities and AI & Health tracks follows in the next post.

Ever been to a LAN party? Not to be confused with a GAN, we are talking about a bunch of computers temporarily strung together for a weekend of intense video gaming, sometimes for less ludic/legal purposes. Proximity optimizes bandwidth between nodes. Schools after hours and hackerspaces make for the ideal setting. You could try to beat your friends at a shooter game, watching expressions of rage and joy light up on their faces.

Whether or not you believe that body language is 50% of communication, we kind of miss out on it in the age of Massively Multiplayer and Virtual Reality experiences. You might say that what I am describing is a relic of the past. It is certainly with nostalgia that I remember the PolyLAN tournaments of my student days, or see friends from another kind of LAN - the demoparty - up on stage at Ludicious.

Yeah, whatever. Join a LAN! Go to a hackathon! Put your hardware to the test, see how much crypto you can mine in a few hours, how many alien signals you can process, what proteins you can fold, bake open data as we did in 2018, or devise a comeback strategy for the synthesized tennis court.

#AMLD2019 invited us to explore, in more than a metaphorical way, our developing relationships with machines. The quality of the interhuman connections in this People Area Network was exceptional. Two thousand heads bunched into the SwissTech Convention Center for intense brain-to-brain transfer, spurred by all the exciting research and industry behind the latest wave of interest in Artificial Intelligence, as machine learning helps tackle the world's "grand challenges" (actu.epfl.ch).

I ran a workshop to share the latest outputs of the group behind DataHub.io and Frictionless Data, and to demonstrate pipelines for transforming data and improving its quality at scale. I was talking from the experience of running a lot of hackathons, plus working on projects like Opendata.swiss, Dribdat, Data Central, the Julia libraries, and most recently SmartUse, where I have been applying some of these ideas in code.
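For readers who have not met Frictionless Data before, here is a rough sketch of the core idea: a dataset travels with a small JSON descriptor listing its files, their column types, and a checksum. This is not the workshop code and it does not use the official libraries. It is a standard-library approximation, the `describe` and `infer_type` helpers are hypothetical names, the sample values are made up, and the field layout follows my reading of the Data Package specification.

```python
import csv
import hashlib
import io
import json

# Made-up sample data, standing in for a real open dataset.
SAMPLE_CSV = "station,reading\nA,12\nB,7.5\n"


def infer_type(values):
    """Guess a column type: 'integer', then 'number', else 'string'."""
    for type_name, cast in (("integer", int), ("number", float)):
        try:
            for value in values:
                cast(value)
            return type_name
        except ValueError:
            continue
    return "string"


def describe(name, csv_text):
    """Build a Data Package style descriptor for one CSV resource."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    fields = [
        {"name": column, "type": infer_type([row[column] for row in rows])}
        for column in rows[0]
    ]
    data = csv_text.encode("utf-8")
    return {
        "name": name,
        "resources": [
            {
                "name": name,
                "path": f"{name}.csv",  # hypothetical file name
                "format": "csv",
                "bytes": len(data),
                # A checksum lets others verify they have the same bytes.
                "hash": "sha256:" + hashlib.sha256(data).hexdigest(),
                "schema": {"fields": fields},
            }
        ],
    }


descriptor = describe("sample-readings", SAMPLE_CSV)
print(json.dumps(descriptor, indent=2))
```

Saved next to the CSV as `datapackage.json`, a descriptor like this gives a pipeline step something to check its input against before any transformation happens.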

We discussed the philosophy, concepts, and roadmap of the initiative. Several of the participants were experienced in parallel efforts in open standards for data, which led to some very lively debate. The sources, tools and waypoints for starting with open data in a machine learning project are documented at School of Data CH.

The question of reproducibility in data science got to the heart of the issue. During the conference I had many discussions about the challenges involved, and the chance to learn about distributed workflows in various research endeavours - and about the kinds of concerns that keep people up at night. The opportunities for ‘industrial-strength’ open data were manifold at an event where most people already took for granted that experiments could be started and neural networks trained with readily available public datasets. The excitement in the room was palpable.

My original purpose here was to discuss and explore the ways the Open Data movement and community support Machine Learning research and applications. I came away inspired to apply ML with diligence, ethos and craft. Thank you very much to everyone who made AMLD2019 possible, to everyone who supported my workshop (the participants, and the referees at EPFL and Open Knowledge), and to everyone there who made this an epic LAN for many neural nets to remember.

Visit #AMLD2019 for visuals, drop oleg @ TLD a line if you have any questions, and stay tuned as we discuss AI for healthy cities in the next post.