MY JOURNEY THROUGH SHE CODE AFRICA MENTORSHIP PROGRAMME: Part 1.
My introduction to data science came in February 2020, as part of a quest to expand my knowledge beyond my professional scope.
Learning was slow at first, but I forged ahead. Then came the pandemic, with an enforced lockdown that afforded me a lot of free time. Still, I struggled, as I had no mentor to point me in the right direction or to the right materials. I ended up with loads of online courses that overwhelmed me, and I became too discouraged to continue.
By a stroke of luck, I read a tweet that advised following the people who are right for you and not just the popular ones. This prompted a search for data scientists on Twitter, and boom! I found the She Code Africa Mentorship Programme (Cohort 2).
The happiness was short-lived, as I realized I had to pass an assessment to get in. I knew I didn’t have the knowledge required to pass, but I now had a new target: to acquire that technical knowledge. In September 2020, a new cohort started and I got in.
The first month of mentorship has been wonderful. I have amassed so much knowledge this month that I am planning to include Data Scientist in my Twitter bio soon. I have learnt:
1. Python
Python is one of the recommended programming languages for data science. We took a Udacity course on Python, which improved my knowledge of data types, control flow, functions and errors.
We were given assignments to write two programs, a Guess the Number game and a password generator, which solidified my knowledge of control flow and functions. My solutions to the assignments can be found here.
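For readers who have not seen these two exercises, here is a minimal sketch of what they can look like. It is not my submitted solution; the function names (`generate_password`, `check_guess`, `play`) and the choice of the `secrets` module are my own assumptions for illustration.

```python
import secrets
import string


def generate_password(length=12):
    """Return a random password drawn from letters, digits and punctuation."""
    if length < 1:
        raise ValueError("length must be at least 1")
    alphabet = string.ascii_letters + string.digits + string.punctuation
    # secrets.choice picks each character with a cryptographically strong RNG
    return "".join(secrets.choice(alphabet) for _ in range(length))


def check_guess(secret_number, guess):
    """Compare a guess with the secret number and return a hint."""
    if guess < secret_number:
        return "too low"
    if guess > secret_number:
        return "too high"
    return "correct"


def play(secret_number, guesses):
    """Try the guesses in order; return the attempt count, or None if all miss."""
    for attempt, guess in enumerate(guesses, start=1):
        if check_guess(secret_number, guess) == "correct":
            return attempt
    return None
```

In an interactive version, the guesses would come from `input()` inside a loop; passing them in as a list simply makes the control flow easier to follow and to test.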
2. Probability and Statistics
Knowledge of probability and statistics is indispensable for a data scientist. We took an edX course on probability and read various articles on both subjects. This greatly improved my understanding of them and rid me of my fear of both.
I wrote an article on probability in relation to Naïve Bayes, which solidified my understanding of the topic and dispelled my fear of writing articles. My previous article on probability can be found here.
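The idea underneath Naïve Bayes is Bayes' theorem, P(class | evidence) = P(evidence | class) × P(class) / P(evidence). The sketch below works it through for a two-class case. The function name and the numbers are made up for illustration and are not taken from my article.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """Return P(class | evidence) for a two-class problem via Bayes' theorem.

    prior                -- P(class)
    likelihood           -- P(evidence | class)
    likelihood_given_not -- P(evidence | not class)
    """
    # Total probability of seeing the evidence under either class
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    if evidence == 0:
        raise ValueError("the evidence has zero probability under both classes")
    return likelihood * prior / evidence


# Hypothetical numbers: 20% of emails are spam, a given word appears in
# 50% of spam emails and in 5% of non-spam emails.
p_spam_given_word = posterior(prior=0.2, likelihood=0.5, likelihood_given_not=0.05)
print(round(p_spam_given_word, 2))  # 0.71
```

A Naïve Bayes classifier applies the same calculation to many pieces of evidence at once, under the "naïve" assumption that they are independent of one another given the class.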
3. Data Analysis
We were introduced to the basics of several Python libraries: NumPy, Matplotlib and Pandas. Each library has a specific use in the data analysis process: Pandas for manipulating tabular data, NumPy for working with numerical data, and Matplotlib for visualizing data.
I wrote a report on the insights I gained from analysing a diabetes data set. The analysis involved dropping rows with missing values, creating new columns, and visualizing the data with bar plots, scatter plots, pair plots and heat maps. This was a crucial point for me; the data was no longer just ones and twos but had real meaning that could be communicated to other people. Here is the link to my GitHub repository.
4. Data Wrangling
Data wrangling is a key skill for data scientists. The Pandas library can be used for data wrangling, which involves cleaning the data, dropping missing values, merging and grouping. These steps make the data easier to understand and to analyse.
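The wrangling steps named above can be sketched in a few lines of Pandas. The tables, column names and values here are invented for illustration; they are not the diabetes data set from my report.

```python
import pandas as pd

# Made-up records for illustration only
patients = pd.DataFrame({
    "patient_id": [1, 2, 3, 4],
    "clinic_id": [10, 10, 20, 20],
    "weight_kg": [70.0, None, 82.0, 64.0],
    "height_m": [1.75, 1.60, 1.80, 1.60],
})
clinics = pd.DataFrame({
    "clinic_id": [10, 20],
    "city": ["Lagos", "Abuja"],
})

# Cleaning: drop the rows where the weight is missing
cleaned = patients.dropna(subset=["weight_kg"]).copy()

# Creating a new column from existing ones
cleaned["bmi"] = cleaned["weight_kg"] / cleaned["height_m"] ** 2

# Merging: attach the city of each patient's clinic
merged = cleaned.merge(clinics, on="clinic_id", how="left")

# Grouping: average BMI per city
mean_bmi_by_city = merged.groupby("city")["bmi"].mean()
print(mean_bmi_by_city)
```

Dropping rows is only one way to handle missing values; filling them in (for example with `fillna`) is another option, and which one is appropriate depends on the data.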
The above are the quantifiable lessons. The most important lessons for me, however, are the unquantifiable ones: those I learnt from my mentor, Oguchi Ebube, and my teammate, Rasheedat Atinuke Jamiu. They have been very encouraging and supportive throughout my learning over the past month. I am quite excited for the next two months.
Thank you again, She Code Africa, for this opportunity.