Data^3 and Installfest

PyLadiesTC was invited to join PyMNtos and the Twin Cities R User Group to help plan an event they were calling Data Cubed, which aimed to bring together the local Python and R communities as well as some Data Visualization folks for a full day of data. I was selected to represent PyLadies on the organizing committee.

Since we expected that many people would have expertise with either Python or R but maybe not both, PyLadies wanted to host a pre-event that would help people install both languages and then start learning to use them. I was the lead organizer for the Installfest, and I had an amazing team of volunteers (PyLadies and others) to help plan and staff the evening. SPS Commerce was our generous host and sponsor: they provided us with food, space, and flash drives that we could load all the files onto for people to use. This way we didn’t need everyone to connect to wifi to download and install, and people also had immediate access to the activities and data set.

We used the Anaconda installation of Python, which comes with the IDE Spyder (and IPython Notebook). In addition to R, we had people install RStudio as well. With the help of some additional volunteers we created some activities that walked through the same tasks in both languages, with code that people could run or edit. We chose to use the Titanic dataset from Kaggle because we knew one of the speakers would be talking about it at the main event the next day; this way, people would already be familiar with it and know how to access it in both Python and R.
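To give a flavor of the kind of task the activities covered, here is a minimal sketch in Python. It is not the actual Installfest activity (those are in the PyLadiesTC GitHub). It assumes a few of the Kaggle Titanic column names (PassengerId, Survived, Pclass, Sex, Age), uses made-up placeholder rows instead of the real file, and sticks to the standard library so it runs without installing anything extra. The function name survival_rate_by is just something I picked for illustration.

```python
import csv
import io
from collections import defaultdict

# Made-up stand-in rows that reuse a few Kaggle Titanic column names.
# These are placeholders for illustration, not real passenger records.
SAMPLE_CSV = """PassengerId,Survived,Pclass,Sex,Age
1,0,3,male,30
2,1,1,female,40
3,1,3,female,25
4,0,2,male,50
"""


def survival_rate_by(column, csv_file):
    """Return {value: fraction survived} for each value in the given column."""
    totals = defaultdict(int)
    survived = defaultdict(int)
    for row in csv.DictReader(csv_file):
        key = row[column]
        totals[key] += 1
        survived[key] += int(row["Survived"])  # Survived is 0 or 1
    return {key: survived[key] / totals[key] for key in totals}


# With the real data you would pass an open file object instead of StringIO.
rates = survival_rate_by("Sex", io.StringIO(SAMPLE_CSV))
for value, rate in sorted(rates.items()):
    print(f"{value}: {rate:.0%} survived")
```

Swapping "Sex" for "Pclass" groups by passenger class instead, which is the sort of small edit the activities encouraged people to try.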

We had several PyLady and other volunteers on hand with expertise in both Python and R to assist with installation (on Windows, Mac, and Linux) and the activities. We also encouraged people to use the folks around them as resources: our nametags had indicators for what OS and languages we knew about so that people could look for another attendee to help them if the volunteers were busy. That worked out really well, since there was a lot of expertise in the room!

I was blown away by how well the event went: we had about 40 people attend, and lots of people were chatting, playing with the dataset, and troubleshooting issues together. The volunteers were fantastic and helped people with different skillsets talk to and learn from each other, as well as from the volunteers. It was a lot of work putting this thing together, but it was totally worth it!

The next day was the main event, Data^3, which was sponsored by Target and Cloudera. We had a full day of presentations on Python, R, data visualization, and tools to use.

Presentations:

Python for Data Analysis and What’s New (on GitHub or SlideShare) - Wes McKinney, Software Engineer at Cloudera, author of “Python for Data Analysis” and developer of the pandas package
Getting Started in R - Danny Kaplan, Professor of Statistics at Macalester College
Getting Started in Python - Ravi Shanbhag, Director of Data Science at UHG
An Introduction to Bayesian Belief Network in R Using the bnlearn Package (on GitHub or SlideShare; R file, Tinder data CSV) - Abraham Matthew, Analytics Strategist at Carmichael Lynch
Data inspection, validation, and conversion using csvkit - Matt Pettis, Data Scientist at Quantum Retail
Building A Classifier in Python and R - Jay Jacobs, Principal at Verizon and co-author of “Data Driven Security”
The Allure of a Kaggle Competition - Diane Rucker, Project Management Consultant for MIT Trust Center for Leadership
Paleontology in Python: Analyzing Dinosaur Trackways (on GitHub or online) - Scott Ernst, Scientific Research Consultant at Paléojura (A16)
Data Visualization in R - Winston Chang, Software Engineer at RStudio and author of “R Graphics Cookbook”
Panel Discussion: How do you practice Data Science? - Jay Jacobs, Principal at Verizon; Marc Light, Senior Engineering Manager at Honeywell Automation and Control Solutions; Ravi Shanbhag, Director of Data Science at OptumInsight; Alicia Hofelich Mohr, Data Management Research Associate at University of Minnesota

All of the materials and links to what we used at the Installfest are in our PyLadiesTC GitHub, as are the presentations from the Data^3 event.