There are a number of elements to my push into the world of Data Science, but ultimately the idea is to make Data Science my job. To understand how to get there, I’ve needed to look at what is missing from my experience and knowledge and then start to fill the gaps. I need to learn the whole piece and then create the environment required, which means equipment and tools.
I’ve made a good start on building the cluster and I’ve also looked into the tools. The obvious requirements are development environments for both R and Python, so I’ve picked those up and started working with them. I do still need an IDE for Python, though, and I think I’ve decided on Komodo with the ActivePython for Data Science add-ons.
In addition, I need data storage and a DB server with enough capacity for my ongoing needs. I’ve purchased rather a lot of DB space for projects over the last 18 months, but to be fair it’s pretty costly compared with picking up a Synology NAS and running MariaDB (a MySQL-compatible database) on it. I picked up a Synology DS216j and installed two 3TB volumes in it for now, but I will probably need to go for a larger Synology NAS at some point. The good thing about the ActivePython add-ons is that a DB connector for MySQL is included.
So, in the not too distant future I will have a working Python development environment, database space available for project work and a Pi 3 cluster for extra computation power, although some may find that last part funny! Seriously, Pi 3s aren’t to be sniffed at, especially when running massively parallel processes, and that’s exactly where I’ll be going with it.
The other part of the equation here is Data Science education. I need to learn the actual techniques, which requires finding the right course to start out with. That’s the next task.