Projects
Predicting Gene Expression Levels from DNA Sequences

[Link]
- This project was part of a hackathon on applying machine learning to a real-world biological dataset: the task was to develop a model that predicts gene expression levels across various cell types from DNA sequence information. Each input was a DNA sequence of 49,152 nucleotides centered on a gene’s transcription start site (TSS), extracted from the human reference genome (GRCh38) and containing regulatory information that could determine the gene’s expression in different cell types. Gene expression values across cell types and sequencing methods were gathered from the ENCODE website. For the solution, we developed a CNN-based model ensemble using kipoiseq and torch, with the Adam optimizer, a StepLR scheduler, and an early-stopping function. The model consisted of three branches, each with a different configuration of convolutional, pooling, dropout, and fully connected layers.
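A minimal sketch of what a three-branch CNN of this kind could look like in PyTorch. The kernel sizes, channel counts, dropout rates, the concatenation of branch features, the number of output targets, and the learning-rate settings are all illustrative assumptions, not the configuration used in the hackathon. The input is assumed to be a one-hot encoded sequence of shape (batch, 4, length).

```python
import torch
from torch import nn


class Branch(nn.Module):
    """One branch: conv -> ReLU -> pooling -> dropout -> global max pool."""

    def __init__(self, kernel_size: int, channels: int, dropout: float):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(4, channels, kernel_size, padding=kernel_size // 2),
            nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Dropout(dropout),
            nn.AdaptiveMaxPool1d(1),  # collapse the length dimension to 1
            nn.Flatten(),             # (batch, channels, 1) -> (batch, channels)
        )

    def forward(self, x):
        return self.net(x)


class ThreeBranchCNN(nn.Module):
    """Three branches with different (assumed) settings, merged by a dense head."""

    def __init__(self, n_targets: int):
        super().__init__()
        self.branches = nn.ModuleList([
            Branch(kernel_size=7, channels=32, dropout=0.1),
            Branch(kernel_size=15, channels=32, dropout=0.2),
            Branch(kernel_size=31, channels=32, dropout=0.3),
        ])
        self.head = nn.Sequential(
            nn.Linear(3 * 32, 64),
            nn.ReLU(),
            nn.Linear(64, n_targets),  # one regression output per target (assumption)
        )

    def forward(self, x):
        # Assumption: branch features are concatenated before the dense head.
        features = torch.cat([branch(x) for branch in self.branches], dim=1)
        return self.head(features)


model = ThreeBranchCNN(n_targets=5)  # 5 targets is a placeholder value
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)
```

Because each branch ends in a global pooling layer, the same sketch accepts the full 49,152-nucleotide input or a shorter sequence for quick experiments.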
Enhancing CIFAR-10 Image Classification with Deep Learning



[Link]
- Developed and optimized a multi-layer perceptron (MLP) to classify CIFAR-10 images using PyTorch. Preprocessed the data, split it into training, validation, and test sets, and implemented data loaders. The initial MLP had two hidden layers of 64 neurons each with ReLU activations, and was trained with CrossEntropyLoss and SGD (learning rate 0.001, momentum 0.9). Tracked and plotted training and validation loss and accuracy, saving the best model based on validation accuracy.
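The initial setup described above can be sketched roughly as follows. The layer sizes, loss, and optimizer settings come from the description; the helper names (`train_step`, `accuracy`, `save_if_best`) and the checkpoint path are hypothetical, and data loading is omitted.

```python
import torch
from torch import nn


class MLP(nn.Module):
    """Two hidden layers of 64 units with ReLU, for 3x32x32 CIFAR-10 images."""

    def __init__(self, in_features: int = 3 * 32 * 32, hidden: int = 64, n_classes: int = 10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),                    # (N, 3, 32, 32) -> (N, 3072)
            nn.Linear(in_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_classes),    # raw logits; the loss applies log-softmax
        )

    def forward(self, x):
        return self.net(x)


model = MLP()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)


def train_step(images, labels):
    """Run one optimization step on a batch and return the loss value."""
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()


@torch.no_grad()
def accuracy(images, labels):
    """Fraction of correctly classified samples in a batch."""
    model.eval()
    preds = model(images).argmax(dim=1)
    return (preds == labels).float().mean().item()


def save_if_best(val_acc, best_acc, path="best_mlp.pt"):
    """Hypothetical helper: checkpoint only when validation accuracy improves."""
    if val_acc > best_acc:
        torch.save(model.state_dict(), path)
        return val_acc
    return best_acc
```

In a full training loop, `save_if_best` would be called once per epoch with the validation accuracy, so the file on disk always holds the best-performing weights seen so far.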
Training a Chatbot using PyTorch
[Link]
- Trained a simple chatbot using a recurrent sequence-to-sequence model on movie scripts from the Cornell Movie-Dialogs Corpus, following the PyTorch Chatbot Tutorial. Implemented the tutorial code, then trained and evaluated the chatbot model. To improve model performance, experimented with different combinations of hyperparameters such as clip, teacher_forcing_ratio, and attn_model, and reported the results and rationale.
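To illustrate what two of these hyperparameters control, here is a small sketch using a toy GRU decoder rather than the tutorial's actual encoder and attention decoder. `teacher_forcing_ratio` is the probability of feeding the ground-truth token as the next decoder input, and `clip` is the maximum gradient norm. The vocabulary size, hidden size, start-token id, and the specific values below are assumptions chosen for illustration.

```python
import random

import torch
from torch import nn

clip = 50.0                  # maximum gradient norm
teacher_forcing_ratio = 0.5  # probability of using ground-truth inputs

vocab_size, hidden_size = 20, 16  # toy sizes (assumption)
embedding = nn.Embedding(vocab_size, hidden_size)
gru = nn.GRU(hidden_size, hidden_size)
out = nn.Linear(hidden_size, vocab_size)
params = list(embedding.parameters()) + list(gru.parameters()) + list(out.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)


def train_step(target, hidden):
    """target: (seq_len, batch) token ids; hidden: (1, batch, hidden_size)."""
    optimizer.zero_grad()
    use_teacher_forcing = random.random() < teacher_forcing_ratio
    # Hypothetical start-of-sentence token with id 0.
    decoder_input = torch.zeros(1, target.size(1), dtype=torch.long)
    loss = 0.0
    for t in range(target.size(0)):
        output, hidden = gru(embedding(decoder_input), hidden)
        logits = out(output.squeeze(0))
        loss = loss + nn.functional.cross_entropy(logits, target[t])
        if use_teacher_forcing:
            decoder_input = target[t].unsqueeze(0)        # feed the true token
        else:
            decoder_input = logits.argmax(dim=1).unsqueeze(0).detach()  # feed own guess
    loss = loss / target.size(0)
    loss.backward()
    nn.utils.clip_grad_norm_(params, clip)  # rescale gradients if their norm exceeds clip
    optimizer.step()
    return loss.item()
```

A higher `teacher_forcing_ratio` tends to speed up convergence but can make the model less robust at inference time, when no ground-truth tokens are available; a lower `clip` limits the size of each update more aggressively.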
NSF I-Corps Business Canvas Model for Biomedical Device Startup

[Link]
- To address the increasing prevalence of colon diseases such as IBD, Crohn’s disease, and colon cancer, proposed a non-invasive, motorized capsule equipped with a camera and an encapsulated robotic device to visualize the epithelial lining of the GI tract and perform small-scale biopsies, leveraging AI/ML-supported video analysis and a CADe tool. Addressed all nine sections of the NSF I-Corps Canvas Model for startups: completed ideation, value proposition development, market and competitive analysis, and sector intelligence analysis; identified funding and customer engagement strategies as well as initial supply chain and regulatory issues; conducted customer discovery interviews; completed a pro-forma budget; and addressed future directions and timeline.