About me
I’m a self-taught data scientist who is passionate about building better predictive models of The World. I spend a significant portion of my free time reading up on statistics and programming pet projects in Python. I also like cycling, building PCs, playing FPS and racing games, and travelling.
Tech stack
I like using whatever tools are best for the project, but I’m biased towards Python whenever I can use it.
- Python (NumPy, Pandas, Scikit-learn, PyTorch)
- Flask, Django, Rails
- SQL (I really like Postgres and SQLite; other databases are OK)
- Large Language Models, mostly as tools that do things for me, such as cleaning data or automating tasks.
Do I know how to write a fizz-buzz script?
'''
Yes. Branchless programming not necessary
'''
for i in range(1, 16):
    print(f"{'fizz'*(not i % 3)}{'buzz'*(not i % 5)}{str(i)*(i % 3 != 0 and i % 5 != 0)}")
Joel Grus’s implementation is also a good choice.
Research/Publications
I worked on four research projects on Machine Learning applications during my undergraduate degree. The results from two of the papers are published; the remaining two are works in progress that I am continuing now that I have graduated.
-
- First Author, Collaboration
- Technical Summary: Regression modelling
- General Summary: Predicted the listed prices of Dubai license plates using data web-scraped from various local e-commerce websites. Modelled prices as a time series and treated the task as a regression problem.
- Published in the 2022 International Conference on ICT for Smart Society (ICISS)
-
- First Author, Collaboration
- Technical Summary: Classification modelling, Representation Learning, Graph Attention Networks
- General Summary: Using Graph Attention Networks to learn node features of directed temporal graphs. Models are trained on the Elliptic Dataset, aiming for the highest recall score on the illicit class.
- Work in progress. Planning to publish in March 2023. If you want to collaborate, reach out to me by email.
-
- Collaboration
- Technical Summary: NLP Classification, fastText cc.en.300 sentence vectors, DialoGPT-large based chatbot, chatbot application back-end
- General Summary: Created a classifier to detect sexual predators in online chats, trained on the PAN-12 dataset. A DialoGPT-large based chatbot was tuned to pretend to be a teenage girl. The two parts are tied together using an SQLite database.
- In the process of being published in IEEE Access. Source code for sentence vector generation here.
-
Investigating students’ preferred camera settings on Zoom for optimal learning experiences across different interaction scenarios in online learning
- Collaboration
- Technical Summary: Discrete Clustering, Hypothesis Testing, ANOVA
- General Summary: Used statistical techniques to identify different groups of students based on camera usage (their own and their instructors’). Students were stratified by gender and academic achievement level. Suggested pedagogical methods to ensure optimal learning outcomes for the different groups.
- Under review for publication in the Journal of Computer Assisted Learning
Personal Projects
All my personal projects can be found on my GitHub. Here are a few.
-
- Technical Summary: Regression modelling, Algorithms
- Story: I combined five different datasets to predict housing prices in Dubai based on the features of the housing unit itself and its surrounding venues and transit options. I had to write an algorithm that uses GPS coordinates to assign venues to specific neighborhoods based on their polygon outlines.
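The venue-assignment step described above is a point-in-polygon problem. The following is only a minimal sketch of one common approach (ray casting), not the project's actual code. It assumes neighborhood outlines are lists of (longitude, latitude) pairs and treats them as flat coordinates, which is a reasonable approximation at city scale. The function names and the sample outlines are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: count the edges crossed by a ray going right from the point."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # an odd number of crossings means "inside"
    return inside


def assign_neighborhood(
    venue: Point, neighborhoods: Dict[str, List[Point]]
) -> Optional[str]:
    """Return the first neighborhood whose outline contains the venue, else None."""
    for name, outline in neighborhoods.items():
        if point_in_polygon(venue, outline):
            return name
    return None


# Hypothetical square outlines for illustration, not real Dubai boundaries.
neighborhoods = {
    "A": [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
    "B": [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)],
}
```

Calling `assign_neighborhood((0.5, 0.5), neighborhoods)` on these sample outlines picks neighborhood "A"; a venue outside every outline gets `None`. Points lying exactly on a shared boundary would need an explicit tie-breaking rule, which this sketch does not define.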
-
Subreddit Text Analysis (contact me for the source code)
- Technical Summary: Time series hypothesis testing, Text analysis, Language APIs
- Story: I carried out text analysis to figure out if the discourse quality in a subreddit got worse over time.
-
- Collaboration with Michael Jurasovic.
- Technical Summary: Mixed-Integer Linear Programming, Optimization
- Story: Gives you the smallest number of electronic components (resistors, capacitors, inductors) from a given list that achieves the required unit specification within a tolerance band. I created this project to help fellow EE students with their labs, so they don’t have to work out which components they need and how many of each.
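To make the problem concrete, here is a minimal sketch for one special case: resistors wired in series, where the values simply add. It uses exhaustive search over combinations instead of the Mixed-Integer Linear Programming formulation the project uses, so it only suits small parts lists. The function name, the parts list, and the choice of a relative tolerance are all assumptions made for illustration.

```python
from itertools import combinations
from typing import List, Optional, Tuple


def fewest_series_resistors(
    available: List[float], target: float, tolerance: float
) -> Optional[Tuple[float, ...]]:
    """Smallest set of resistors whose series sum is within tolerance of target.

    `tolerance` is relative, e.g. 0.01 means within 1% of the target value.
    """
    low, high = target * (1 - tolerance), target * (1 + tolerance)
    # Try combinations in order of size, so the first hit uses the fewest parts.
    for count in range(1, len(available) + 1):
        for combo in combinations(available, count):
            if low <= sum(combo) <= high:
                return combo
    return None  # no combination of the available parts meets the specification


# Hypothetical parts bin in ohms; each entry is one physical resistor.
parts = [100.0, 220.0, 330.0, 470.0, 1000.0]
```

With this parts bin, a 550 ohm target at 1% tolerance needs two resistors (220 and 330 ohms), since no single part falls inside the band. The search grows exponentially with the size of the list, which is the main reason an optimization formulation is a better fit for realistic inventories.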
-
- Technical Summary: Scripting, Webscraping, Multi-threading
- Story: I saw my dad wasting time manually looking up and entering the distances between pairs of locations listed in two columns of an Excel sheet. I wrote a script that does that work for him.
Hobby Writing
I like writing non-academic, non-fiction texts. However, I don’t have as much free time to write as I would like. I might start a Substack one day.
- Resume
- You can email me at fardinahsan146@gmail.com
- Find my LinkedIn
- WhatsApp me at +971501468233