Hi, I am pursuing a Master's in Electrical Engineering (specialization in Robotics and Computer Vision) at the University of California, Riverside, since Fall'22. In addition to my studies, I work as a Graduate Student Researcher in the Visual Computing Group at UCR, with a focus on vision-language models and gait recognition. Previously, I worked as a Project Assistant at the Visual Computing Lab, IISc Bangalore, under the supervision of Dr. Anirban Chakraborty. My primary research interest lies at the intersection of Deep Learning and Computer Vision. I focus on leveraging unsupervised and semi-supervised learning techniques for domain adaptation. I also have exposure to other fields such as Natural Language Processing, Electronics, Communication Systems, CAD, Graphic Design, and Video Editing.
As an undergraduate student, I did a research internship under Prof. Hongliang Ren at NUS. I have also worked with a senior scientist at DRDO on a drone-image-based object detection and tracking project.
I completed my undergraduate degree at Visvesvaraya National Institute of Technology (VNIT), Nagpur, India (Batch of 2021). Throughout my bachelor's, I worked at IvLabs, the AI and Robotics Lab, under Dr. Shital Chiddarwar. As a core coordinator of the lab, I mentored motivated juniors and conducted workshops. I am open to collaborations and look forward to connecting with people working in similar domains!
[Apr. 2023] Serving as a reviewer for the ICCV 2023 conference.
[Mar. 2023] Our work at IISc on Improving Domain Adaptation through Class Aware Frequency Transformation got accepted to the International Journal of Computer Vision (IJCV).
[Oct. 2022] Our paper on CoNMix for Source-free Single and Multi-target Domain Adaptation got accepted at the Winter Conference on Applications of Computer Vision (WACV) 2023.
[Apr. 2022] Paper on Holistic Approach to Measure Sample-level Adversarial Vulnerability and its Utility in Building Trustworthy Systems got accepted at HCIS Workshop, CVPR'22.
[Oct. 2021] Our paper Open-Set Multi-Source Multi-Target Domain Adaptation got accepted at Pre-registration Workshop, NeurIPS'21.
[Aug. 2021] Our work at IISc on CAFT: Class Aware Frequency Transform for Reducing Domain Gap got accepted at TradiCV Workshop, ICCV'21.
[Jul. 2021] Joining as a Project Assistant at Visual Computing Lab, IISc Bangalore.
[Nov. 2020] Completed my internship at the National University of Singapore (NUS) on 6DOF origami pose estimation [Video Link].
[Aug. 2020] Won first position in Smart India Hackathon 2020, conducted by MHRD, with a cash prize of Rs. 1,00,000 [Video Link].
[Aug. 2020] Collaborated with a senior scientist at CAIR Lab, DRDO on drone image tracking, detection and pose estimation.
[Jun. 2020] Created two GitHub repositories, IvLabs/resources and IvLabs/ResearchPaperNotes, which have started to gain popularity!
[Feb. 2020] Won first position in two different competitions, 'Techno.Docx' and 'Electroblitz', at AXIS’20, VNIT | Central India’s Largest Techfest.
Improving Domain Adaptation through Class Aware Frequency Transformation. Vikash Kumar*, Himanshu Patil*, Rohit Lal, Anirban Chakraborty, In. International Journal of Computer Vision (IJCV)
CoNMix for Source-free Single and Multi-target Domain Adaptation. Vikash Kumar*, Rohit Lal*, Himanshu Patil, Anirban Chakraborty, In. Winter Conference on Applications of Computer Vision (WACV), 2023 [Project Page]
Holistic Approach to Measure Sample-level Adversarial Vulnerability and its Utility in Building Trustworthy Systems. Gaurav Kumar Nayak*, Ruchit Rawal*, Rohit Lal*, Himanshu Patil, Anirban Chakraborty, In. HCIS Workshop, Conference on Computer Vision and Pattern Recognition (CVPR)-2022 [Project Page]
Open-Set Multi-Source Multi-Target Domain Adaptation. Rohit Lal, Arihant Gaur, Aadhithya Iyer, Muhammed Abdullah Shaikh, Ritik Agrawal, Shital Chiddarwar, In. Pre-registration Workshop, Neural Information Processing Systems (NeurIPS)-2021 [Project Page]
CAFT: Class Aware Frequency Transform for Reducing Domain Gap. Vikash Kumar, Sarthak Srivastava*, Rohit Lal*, Anirban Chakraborty, In. TradiCV, International Conference on Computer Vision (ICCV) Workshop-2021 [Project Page]
ScoopNet: A 6DOF Pose Estimation pipeline for Origami-inspired Worm Robots. Rohit Lal, Ruphan Swaminathan, Lalithkumar Seenivasan, Liang Qiu, Dr. Hongliang Ren, In. International Conference on Development and Learning (ICDL), Beijing, China [Video]
Deep Self Correcting Tracker (DeepSCT) Mechanism. Khush Agrawal*, Rohit Lal*, Himanshu Patil*, Dr K. Surender, Dr Deep Gupta, In. 26th National Conference on Communications (NCC)-2020, IIT Kharagpur [Paper] [Code]
DeepSCT based Person Following Drone. Himanshu Patil*, Rohit Lal*, Khush Agrawal*, Dr. K. Surender, In. Unmanned Aerial Systems in Geomatics-2021, IIT Roorkee [Paper] [Code]
Person Following Mobile Robot using Multiplexed Detection and Tracking. Khush Agrawal*, Rohit Lal*, In. International Conference on Advances in Mechanical Engineering (ICAME)-2020, VNIT Nagpur [Paper] [BibTex] [Code] [Video]
Cursor Control Using Face Gestures. Arihant Gaur, Akshata Kinage, Nilakshi Rekhawar, Shubhan Rukmangad, Rohit Lal, Dr. Shital Chiddarwar, In. Conference on Soft Computing and Pattern Recognition, Hyderabad-2019 [Paper] [BibTex] [Code]
Real Time Human Computer Interaction Using Facial Gestures. Rohit Lal, Dr. Shital Chiddarwar, In. 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT), IIT Kanpur-2019 [Paper] [BibTex] [Code]
* authors claim equal contribution
Navigation System for a Vehicle and for Navigation. Pandya Karan, Kotecha Prakrut, Iyer V Aadhithya, Gaikwad Ravishankar, Lal Rohit, Agrawal Rishesh and Shital Chiddarwar, Issued Dec 1, 2019 Patent no. 201921049473 [Patent File] [Video]
Aaron Bobick | Udacity
Helper robots are widely used in various situations, for example at airports and railway stations. This paper presents a pipeline to multiplex the tracking and detection of a person in dynamic environments using a stereo camera in real-time. Recent developments in object detection using ConvNets have led to robust person detection. These deep convolutional neural networks generally fail to run with high frame rates on devices with less computing power. Trackers are also used to retain the identity of the target person as well as impose fewer constraints on hardware. A concept of multiplexed detection and tracking is used which makes the pipeline faster by many folds. TurtleBot-2 is used for prototyping the robot and tuning of the motion controller. Robot Operating System (ROS) is used to set up communication between various nodes of the pipeline. The results found were comparable to current state-of-the-art person followers and can be readily used in day to day life.
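The idea of multiplexing detection and tracking can be sketched roughly as follows. This is only an illustration, not the paper's pipeline: the fixed `detect_every` interval, the `detect` and `track` callables, and the rule of re-detecting when the tracker returns `None` are all assumptions made for this example.

```python
def multiplexed_follow(frames, detect, track, detect_every=10):
    """Run the (slow) detector occasionally and the (fast) tracker in between.

    `detect(frame)` and `track(frame, box)` are hypothetical callables that
    return a bounding box, or None when the person is lost.
    Returns the list of boxes and which component produced each one.
    """
    box = None
    boxes, used = [], []
    for idx, frame in enumerate(frames):
        if box is None or idx % detect_every == 0:
            # Expensive step: re-acquire the person with the detector.
            box = detect(frame)
            used.append("detect")
        else:
            # Cheap step: propagate the previous box with the tracker.
            box = track(frame, box)
            used.append("track")
        boxes.append(box)
    return boxes, used


# Stub components standing in for a real detector and tracker.
def fake_detect(frame):
    return (0, 0, 10, 10)

def fake_track(frame, box):
    return box

boxes, used = multiplexed_follow(range(5), fake_detect, fake_track, detect_every=3)
```

With these stubs the detector runs on frames 0 and 3 and the tracker handles the rest, which is where the speed-up would come from on low-power hardware.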
Our current (implemented) solution provides robust registration plate detection, extracts other features such as car model, speed, face (if visible), and date and time of entry/exit, and uploads the extracted data to a centralized IoT-integrated database. Beneficiaries include malls, colleges, parking lots, and other sites with multiple gates. Whenever the gate camera detects a departing car, the corresponding owner gets notified. Further, the owner can use the Alert feature to warn the guard. The web application has two levels of access: the first provides general information about a specific car to its owner, and the second is for the Authority, which stores all the data of a campus. This can be used to monitor traffic on the campus and for surveillance applications.
A control system is designed to stabilise the camera gimbal system used in airborne systems for applications such as target tracking, surveillance, aerial photography, and autonomous navigation. The technique is applied in everything from self-stabilising cameras to helicopters and noise-reducing equipment. This camera gimbal system replaces many traditional tracking systems, such as radar, which are heavy and large to mount on air vehicles. Stabilising the camera gimbal is therefore very important: it eliminates shakes and vibrations in photography and improves accuracy.
NOTE: This project was selected for SIH-20 from the internal hackathon conducted by our college. Further details will be shared after the results of SIH-20.
There are four designated locations in the grid world indicated by R(ed), G(reen), Y(ellow), and B(lue). When the episode starts, the taxi starts off at a random square and the passenger is at a random location. The taxi drives to the passenger's location, picks up the passenger, drives to the passenger's destination (another one of the four specified locations), and then drops off the passenger. Once the passenger is dropped off, the episode ends. Observations: There are 500 discrete states since there are 25 taxi positions, 5 possible locations of the passenger (including the case when the passenger is in the taxi), and 4 destination locations. This problem was solved using the Q-learning approach. The trained model is consistently among the top 5 on the OpenAI Gym Leaderboard.
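A minimal sketch of the state count and the tabular Q-learning update is given below. It is not the code from this project: the six-action layout, the learning rate `alpha`, the discount `gamma`, and the helper names are assumptions chosen for illustration, and no Gym environment is used so that the snippet stays dependency-free.

```python
import random

N_ROWS, N_COLS = 5, 5          # 25 taxi positions
N_PASSENGER_LOCS = 5           # R, G, Y, B, or "in the taxi"
N_DESTINATIONS = 4             # R, G, Y, B
N_STATES = N_ROWS * N_COLS * N_PASSENGER_LOCS * N_DESTINATIONS
N_ACTIONS = 6                  # assumed: south, north, east, west, pickup, dropoff


def encode_state(row, col, passenger, destination):
    """Flatten (row, col, passenger, destination) into one index in [0, 500)."""
    return ((row * N_COLS + col) * N_PASSENGER_LOCS + passenger) * N_DESTINATIONS + destination


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """One Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    td_target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (td_target - q_table[state][action])


def epsilon_greedy(q_table, state, epsilon, rng):
    """Explore with probability epsilon, otherwise pick the best known action."""
    if rng.random() < epsilon:
        return rng.randrange(N_ACTIONS)
    row = q_table[state]
    return max(range(N_ACTIONS), key=row.__getitem__)


q_table = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
rng = random.Random(0)

# One illustrative transition: a move that costs -1 reward.
s = encode_state(0, 0, 1, 2)
s_next = encode_state(1, 0, 1, 2)
q_update(q_table, s, action=0, reward=-1.0, next_state=s_next)
```

In a full training loop, `epsilon_greedy` would choose the action at every step and `q_update` would be applied to each observed transition until the table converges.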
Word embedding is the collective name for a set of language modeling and feature learning techniques in natural language processing (NLP) where words or phrases from the vocabulary are mapped to vectors of real numbers. Conceptually, it involves a mathematical embedding from a space with many dimensions per word to a continuous vector space with a much lower dimension. Word2vec is a group of related models that are used to produce word embeddings. These models are shallow, two-layer neural networks that are trained to reconstruct linguistic contexts of words. Word2vec takes as its input a large corpus of text and produces a vector space, typically of several hundred dimensions, with each unique word in the corpus being assigned a corresponding vector in the space. Word vectors are positioned in the vector space such that words that share common contexts in the corpus are located close to one another in the space.
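To make "reconstructing linguistic contexts" more concrete, the sketch below generates the (center word, context word) training pairs that a skip-gram style Word2vec model would learn from. It is an illustration only: the window size and the toy sentence are made up, and the neural network itself is not shown.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for every token and its neighbours
    within `window` positions on either side."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


# Toy sentence, invented for this example.
tokens = "obama speaks to the press".split()
pairs = skipgram_pairs(tokens, window=1)
```

Words that keep appearing in the same pairs end up with similar vectors, which is why words sharing contexts land close together in the learned space.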
From the above image, we can see that the distance between the 'Obama' and 'President' word vectors in the projected 2D space is the smallest. This is because those two words are highly correlated. The same reasoning applies to word pairs like 'media' and 'press', 'speaks' and 'greets', etc. The accuracy will increase with a larger vocabulary and longer training time. For my results, visit the GitHub repository.
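The "closest pair" reading of such a plot can be checked numerically. In the sketch below the 2D coordinates are invented purely for illustration (they are not the project's results); the point is only how a nearest neighbour is found from projected vectors.

```python
import math

# Hypothetical 2D projections; real values would come from the trained model.
points = {
    "obama": (0.9, 1.0),
    "president": (1.0, 1.1),
    "media": (-1.0, 0.5),
    "press": (-1.1, 0.4),
}


def nearest(word, points):
    """Return the other word whose 2D point is closest in Euclidean distance."""
    return min(
        (w for w in points if w != word),
        key=lambda w: math.dist(points[word], points[w]),
    )


closest_to_obama = nearest("obama", points)
closest_to_media = nearest("media", points)
```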
This Summer Project was mentored by me at IvLabs, VNIT
MNIST handwritten digit classification is a standard problem in computer vision and deep learning. Although the dataset is effectively solved, it can be used as the basis for learning and practicing how to develop, evaluate, and use convolutional deep learning neural networks for image classification from scratch. This includes how to develop a robust test harness for estimating the performance of the model, how to explore improvements to the model, and how to save the model and later load it to make predictions on new data. This was coded from scratch using NumPy. For more results and details about the algorithm, visit the GitHub page. For the full demo, click the Video button.
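As a rough idea of what "from scratch" means for a convolutional layer, here is a minimal forward pass of a single 2D convolution followed by ReLU. It is a sketch, not the project's code: it uses plain Python lists instead of NumPy to stay dependency-free, and the tiny image and kernel are made up.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` without padding (stride 1) and return
    the resulting feature map as a list of lists."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            out[i][j] = sum(
                image[i + a][j + b] * kernel[a][b]
                for a in range(kh)
                for b in range(kw)
            )
    return out


def relu(feature_map):
    """Element-wise max(0, x)."""
    return [[max(0.0, v) for v in row] for row in feature_map]


# Toy 3x3 "image" and 2x2 kernel, invented for this example.
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, 1]]
feature_map = relu(conv2d_valid(image, kernel))
```

A real MNIST network would apply many such kernels to 28x28 images, add pooling and a dense classifier, and learn the kernel values by backpropagation.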
Simple harmonic motion can serve as a mathematical model for a variety of motions, such as the oscillation of a spring. With the aim of learning computer vision and MATLAB, I worked on analyzing the motion of a target object undergoing damped harmonic motion. The target object was separated from the background using color thresholding and approximated as a point object. The coordinates of this point were recorded and used to estimate the parameters associated with the mathematical model of the system, such as the maximum displacement, the mean position, and the velocity at different time instants. A mathematical model was estimated by fitting a curve to the recorded data using the MATLAB Curve Fitting Toolbox.
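The quantities mentioned above can be illustrated in Python (used here instead of MATLAB). This is a sketch under stated assumptions: the damped model x(t) = mean + A·exp(-b·t)·cos(ω·t + φ), the parameter values, and the synthetic "recorded" samples are all invented, and velocity is estimated with simple central differences rather than a fitted curve.

```python
import math


def damped_position(t, amplitude, damping, omega, phase, mean):
    """Position of a damped harmonic oscillator at time t."""
    return mean + amplitude * math.exp(-damping * t) * math.cos(omega * t + phase)


def central_difference_velocity(times, positions):
    """Approximate velocity at interior samples from neighbouring positions."""
    return [
        (positions[i + 1] - positions[i - 1]) / (times[i + 1] - times[i - 1])
        for i in range(1, len(times) - 1)
    ]


# Synthetic track standing in for the recorded point coordinates.
times = [i * 0.01 for i in range(500)]
positions = [damped_position(t, 2.0, 0.5, 2 * math.pi, 0.0, 10.0) for t in times]

mean_position = sum(positions) / len(positions)
max_displacement = max(abs(p - mean_position) for p in positions)
velocities = central_difference_velocity(times, positions)
```

With real tracking data, the same estimates would serve as starting values for a curve fit of the damped model.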
The hand-gesture-controlled bot receives its commands from the pitch and roll of the user's hand. This is helpful for people in wheelchairs who can't even move their fingers or hands. Such bots are useful in many applications, such as remote surveillance and the military. A hand-gesture-controlled robot can be used by physically challenged people for wheelchair control. Hand-gesture-controlled industrial-grade robotic arms can also be developed.
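One way the pitch and roll of the hand could be turned into drive commands is sketched below. The 15-degree dead zone, the sign conventions, and the command names are hypothetical choices for this example, not the values used in the project.

```python
def gesture_to_command(pitch_deg, roll_deg, dead_zone_deg=15.0):
    """Map hand tilt to a drive command.

    Assumed convention: positive pitch tilts the hand forward and
    positive roll tilts it to the right. Small tilts are ignored so
    that a resting hand keeps the bot stopped.
    """
    if abs(pitch_deg) < dead_zone_deg and abs(roll_deg) < dead_zone_deg:
        return "stop"
    # The dominant axis decides the direction.
    if abs(pitch_deg) >= abs(roll_deg):
        return "forward" if pitch_deg > 0 else "backward"
    return "right" if roll_deg > 0 else "left"


commands = [
    gesture_to_command(5.0, -3.0),    # nearly flat hand
    gesture_to_command(30.0, 4.0),    # tilted forward
    gesture_to_command(-2.0, -40.0),  # rolled to the left
]
```

In practice the tilt angles would come from an accelerometer on the hand, and the resulting command would be sent to the bot's motor driver.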
The code is written from scratch using PyTorch for data loading, matrix calculations, and GPU acceleration. This was my first introduction to DL, where I wrote the code myself while learning the mathematics and techniques required to optimize a network (P.S. this also included learning ways to tune hyperparameters). The Deep Learning models implemented are listed below: