Özgür Kara

Health Care Engineering Systems Center

1206 W Clark St. UIUC

Urbana, IL, USA, 61801

I am a PhD student in the UIUC CS PhD program, under the supervision of Founder Professor James Rehg.

My ultimate research objective is to develop controllable and computationally efficient generative models for video applications, including but not limited to text-to-video generation, video editing, and long-term video generation. Beyond these, I have also worked on continual learning and inverse image problems during my previous internships.

I am always looking for self-motivated students who want to work on Generative AI-related projects. Feel free to reach out to me if you are interested and located at UIUC.

Download my CV.

Education

PhD (Transferred): Computer Science - UIUC - 2024-Present
MSc, PhD: Machine Learning - Georgia Institute of Technology - 2022-2024
BSc: Electrical-Electronics Engineering - Bogazici University - 2018-2022
High School: Math and Science - Kadikoy Anadolu High School - 2013-2018

news

Feb-2025	The project I worked on during my Adobe internship, “ShotAdapter: Text-to-Multi-Shot Video Generation with Diffusion Models”, has been accepted to CVPR 2025! Stay tuned!
Dec-2024	The 6th edition of our workshop, CVEU (AI for Creative Visual Content Generation, Editing, and Understanding), where I serve as the primary organizer, has been accepted for CVPR 2025. Stay tuned!
Sep-2024	Two papers have been submitted to CVPR’25. Stay tuned!
Sep-2024	I have been recognized as an Outstanding Reviewer for ECCV 2024!
Jun-2024	I am attending CVPR’24 in Seattle with our highlight paper! Don’t forget to drop by our poster, RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models, on Wednesday the 19th, from 17:15 to 18:45, during Poster Session 2 in the Exhibit Hall (Arch 4A-E).

selected publications

CVPR 2025

ShotAdapter: Text-to-Multi-Shot Video Generation with Diffusion Models

Ozgur Kara, Krishna K. Singh, Feng Liu, Duygu Ceylan, James M. Rehg, and Tobias Hinz

CVPR, 2025

ShotAdapter enables text-to-multi-shot video generation with minimal fine-tuning, providing users control over shot number, duration, and content through shot-specific text prompts, along with a multi-shot video dataset collection pipeline.

In Submission

Optimization-Free Image Immunization Against Diffusion-Based Editing

Tarik C. Ozden*, Ozgur Kara*, Oguzhan Akcin, Kerem Zaman, Shashank Srivastava, Sandeep P. Chinchali, and James M. Rehg

In Submission, 2024

DiffVax is an optimization-free image immunization framework that effectively protects against diffusion-based editing, generalizes to unseen content, is robust against counter-attacks, and shows promise in safeguarding video content.

arXiv Website

In Submission to TPAMI

Towards Social AI: A Survey on Understanding Social Interactions

Sangmin Lee, Minzhi Li, Bolin Lai, Wenqi Jia, Fiona Ryan, Xu Cao, Ozgur Kara, Bikram Boote, Weiyan Shi, Diyi Yang, and James M. Rehg

In Submission to IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024

This is the first survey to provide a comprehensive overview of machine learning studies on social understanding, encompassing both verbal and non-verbal approaches.

arXiv

ECCVW 2024 (Oral)

Leveraging Object Priors for Point Tracking

Bikram Boote, Anh Thai, Wenqi Jia, Ozgur Kara, Stefan Stojanov, James M. Rehg, and Sangmin Lee

Instance-Level Recognition (ILR) Workshop at ECCV (Oral), 2024

We propose a novel objectness regularization approach that guides points to be aware of object priors by forcing them to stay inside the boundaries of object instances.

arXiv Code

CVPR 2024 (Highlight)

RAVE: Randomized Noise Shuffling for Fast and Consistent Video Editing with Diffusion Models

Ozgur Kara, Bariscan Kurtkaya, Hidir Yesiltepe, James M. Rehg, and Pinar Yanardag

CVPR (Highlight), 2024

RAVE is a zero-shot, lightweight, and fast framework for text-guided video editing that supports videos of any length using pretrained text-to-image diffusion models.

arXiv Demo Code Website

IEEE FG 2024

Transfer Learning for Cross-dataset Isolated Sign Language Recognition in Under-Resourced Datasets

Alp Kindiroglu*, Ozgur Kara*, Ogulcan Ozdemir, and Lale Akarun

IEEE International Conference on Automatic Face and Gesture Recognition (IEEE FG), 2024

This study provides a publicly available cross-dataset transfer learning benchmark built from two existing public Turkish sign language recognition (SLR) datasets.

arXiv Code

CVPR 2022

ISNAS-DIP: Image-Specific Neural Architecture Search for Deep Image Prior

Metin Ersin Arican*, Ozgur Kara*, Gustav Bredell, and Ender Konukoglu

CVPR, 2022

ISNAS-DIP is an image-specific Neural Architecture Search (NAS) strategy designed for the Deep Image Prior (DIP) framework, offering significantly reduced training requirements compared to conventional NAS methods.

PDF Video Code

IEEE TAC

Domain-Incremental Continual Learning for Mitigating Bias in Facial Expression and Action Unit Recognition

Nikhil Churamani, Ozgur Kara, and Hatice Gunes

IEEE Transactions on Affective Computing, 2022

We propose the novel use of Continual Learning (CL), in particular Domain-Incremental Learning (Domain-IL) settings, as a potent bias mitigation method to enhance the fairness of Facial Expression Recognition (FER) systems.

PDF Code

LEAP-HRI 2021

Towards Fair Affective Robotics: Continual Learning for Mitigating Bias in Facial Expression and Action Unit Recognition

Ozgur Kara, Nikhil Churamani, and Hatice Gunes

Workshop on Lifelong Learning and Personalization in Long-Term Human-Robot Interaction (LEAP-HRI), 16th ACM/IEEE International Conference on Human-Robot Interaction (HRI), 2021

We propose the novel use of Continual Learning (CL) as a potent bias mitigation method to enhance the fairness of Facial Expression Recognition (FER) systems.

PDF Code

Nano Communication Networks

Molecular index modulation using convolutional neural networks

Ozgur Kara, Gokberk Yaylali, Ali Emre Pusane, and Tuna Tugcu

Nano Communication Networks, 2022

We propose a novel convolutional neural network-based architecture for a uniquely designed molecular multiple-input-single-output topology, aimed at mitigating the detrimental effects of molecular interference in nano molecular communication.

PDF Code

Brain Stimulation

Neuroweaver: a platform for designing intelligent closed-loop neuromodulation systems

Parisa Sarikhani, Hao-Lun Hsu, Ozgur Kara, Joon Kyung Kim, Hadi Esmaeilzadeh, and Babak Mahmoudi

Brain Stimulation: Basic, Translational, and Clinical Research in Neuromodulation, 2021

Our interactive platform enables the design of neuromodulation pipelines through a visually intuitive and user-friendly interface. (Google Summer of Code 2021 project)

PDF Code