Canon Research Centre France S.A.S.

Internship proposals

Internships are offered as part of CRF's corporate social responsibility (CSR). The sole aim of CRF internships is to contribute to the education of interns, who benefit from the expertise of CRF researchers. CRF proposes several internships each year, so check this page regularly.

AI-based Annotation and Querying Tool of Scientific Publications (fulfilled)
Deep Metrics for Style distances in Medical Imaging (fulfilled)
Multiple Object Tracking (MOT) (fulfilled)
Simulation and Evaluation of the IEEE 802.11be/bn PHY layer (fulfilled)

Send your application to jobs@crf.canon.fr

AI-based Annotation and Querying Tool of Scientific Publications (fulfilled)

Duration: 5/6 months / Preferred starting date: February 2025

Internship subject

Keeping up with the latest scientific progress is a key challenge for research engineers. As the number of scientific publications grows, this challenge becomes increasingly difficult and time-consuming. However, document processing can now benefit from Artificial Intelligence. In particular, auto-classification of documents can help scientific teams by reducing the need for detailed analysis and manual categorization. Coupled with a powerful and intuitive Web interface, it can yield large efficiency gains.
Advances in Artificial Intelligence (AI) and Natural Language Processing (NLP) research have produced models that can serve as feature extractors for token classification, the cornerstone of Information Extraction tasks such as Named Entity Recognition and Relation Classification between entities.
The intern will develop a tool to organize large collections of documents by relying on both conventional approaches and the latest AI technology [1], with an open-minded approach focused on results.

Mission

In a first step (review and development), documents will be pre-processed (Information Retrieval) to create a database using conventional tools (e.g. keyword search, template-based methods). The pre-processing will produce a first set of annotations that can be used as ground-truth labels to evaluate further annotations produced by AI-based solutions (the object of the second step). It will also be used to validate the Web interface by submitting simple queries against the annotated documents.
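As an illustration only, the conventional keyword-based pre-processing mentioned above could be sketched as follows. The label names, keyword lists and function names are hypothetical and are not taken from the internship description; a real tool would define its own taxonomy and storage.

```python
import re

# Hypothetical label-to-keyword map; a real project would choose its own taxonomy.
KEYWORDS = {
    "nlp": ["named entity recognition", "relation classification", "language model"],
    "vision": ["object tracking", "image segmentation"],
    "wireless": ["802.11", "ofdma", "beamforming"],
}


def annotate(text, keywords=KEYWORDS):
    """Return the sorted labels whose keywords occur in the text (case-insensitive)."""
    lowered = text.lower()
    labels = []
    for label, terms in keywords.items():
        # Lookarounds act as word boundaries so a term is not matched inside a longer word.
        if any(re.search(r"(?<!\w)" + re.escape(term) + r"(?!\w)", lowered) for term in terms):
            labels.append(label)
    return sorted(labels)


def build_index(documents):
    """Map each document id to its keyword-based labels (the first set of annotations)."""
    return {doc_id: annotate(text) for doc_id, text in documents.items()}


def query(index, label):
    """Return the sorted ids of documents carrying the given label."""
    return sorted(doc_id for doc_id, labels in index.items() if label in labels)


docs = {
    "a": "We study Named Entity Recognition with a language model.",
    "b": "Beamforming gains in IEEE 802.11 networks.",
    "c": "A cooking recipe.",
}
index = build_index(docs)
wireless_docs = query(index, "wireless")
```

Labels produced this way are cheap but coarse, which is why they are best suited to serve as a baseline against which AI-based annotations can later be compared.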
In a second step (more research oriented), different kinds of information and relationships between documents (e.g. Relation Classification) will be identified to enrich the first set of annotations and to allow advanced queries on the publications database.
Annotating and querying the publications database will be done through a Web interface built on the Django framework [2].
At the end of the training period, the intern should have acquired good knowledge of AI tools relevant to the auto-classification of documents, as well as practical experience with a mainstream Python-based Web framework (Django).

[1] https://arxiv.org/abs/2406.00008 for an example of the targeted tool
[2] https://www.djangoproject.com/ as basis of the application to develop

Academic background

You are studying for a Master 2 diploma or an engineering degree in computer science. You are curious, open-minded, passionate about new technologies and have real interpersonal skills that will help you integrate into an innovative and multicultural environment.

Specific knowledge

AI, Natural Language Processing, Web frameworks
Python, Git

Deep Metrics for Style distances in Medical Imaging (fulfilled)

Duration: 4/6 months / Preferred starting date: February 2025

Internship subject

How can the ‘style’ of images be characterized and, more specifically, how can the distance between the ‘styles’ of medical images be measured? The answer to this question could help improve the processing of medical images for better diagnosis.
Recognizing the style of images has long been investigated in computer vision, and early deep learning methods have proven successful at this task [1]. The reason is that the inner visual features learned by deep learning models can capture the ‘style’ of images. Predicting styles with supervised methods, however, first requires creating datasets annotated with specific ‘styles’, which can be very costly, especially in medical imaging.
From a different perspective, many image style transfer or Image-to-Image (I2I) translation methods, including GANs and diffusion models, have been proposed to change the style of images [2]. In I2I, the goal is to modify the distribution of an input ‘style’ domain to match the distribution of the target ‘style’ domain. Datasets are composed of two sets of images, one representative of the input style and the other representative of the output style. Models are evaluated by computing a distance between the input and output distributions, for which the Fréchet Inception Distance (FID) is a popular choice. These Inception distances can be seen as statistical ‘Style distances’ between sets of images. The issue is that when the dataset is small (which is often the case in medical imaging), FID cannot be estimated reliably. Some alternatives to FID have recently been proposed [3]. For the case where only a pair of (unmatched) images is to be compared, a method computing a ‘style’ metric between the two images has also recently been proposed [4].
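To make the idea of a statistical distance between two sets of images concrete, here is a minimal sketch of the Fréchet distance between two Gaussians fitted to feature vectors. It rests on two loud assumptions that are not in the text above: covariances are treated as diagonal (the full FID uses complete covariance matrices and a matrix square root), and the feature vectors are given directly (the usual FID extracts them with an Inception network). Function names are illustrative.

```python
import math


def mean_and_variance(features):
    """Per-dimension mean and population variance of a list of feature vectors."""
    n = len(features)
    dim = len(features[0])
    means = [sum(f[d] for f in features) / n for d in range(dim)]
    variances = [sum((f[d] - means[d]) ** 2 for f in features) / n for d in range(dim)]
    return means, variances


def frechet_distance_diagonal(features_a, features_b):
    """Squared Frechet distance between two Gaussians with DIAGONAL covariances.

    General form: ||mu_a - mu_b||^2 + Tr(S_a + S_b - 2 (S_a S_b)^(1/2)).
    With diagonal S_a and S_b (an assumption made here for simplicity), the trace
    term reduces to the sum over dimensions of (sqrt(var_a) - sqrt(var_b))^2.
    """
    mu_a, var_a = mean_and_variance(features_a)
    mu_b, var_b = mean_and_variance(features_b)
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_a, mu_b))
    cov_term = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(var_a, var_b))
    return mean_term + cov_term


# Toy 2-D "features" standing in for network activations.
set_a = [[0.0, 0.0], [2.0, 0.0]]
set_b = [[3.0, 1.0], [5.0, 1.0]]
set_c = [[0.0, 0.0], [4.0, 0.0]]
dist_ab = frechet_distance_diagonal(set_a, set_b)
dist_ac = frechet_distance_diagonal(set_a, set_c)
```

Because the means and variances are estimated from the samples themselves, the result becomes noisy when each set contains only a few images, which is the small-dataset limitation described above.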
The internship subject is to study the different metrics for computing the ‘Style’ distance between medical images, and their performance as a function of the size of the available datasets.

Mission

In a first phase (scientific survey), the objective will be to study and document the different metrics that have been proposed for computing the ‘Style’ distance between images.
In a second phase (experiments and evaluations), the objective will be to implement these metrics and evaluate their performance by training and testing on datasets of various sizes. Several medical imaging modalities could be studied, ranging from X-ray to ultrasound imaging.
In a third phase, if time permits, the influence of the choice of backbone model, including medical foundation models, could also be studied [5].
At the end of the training period, the intern should have acquired good knowledge of generative AI and its application to medical imaging, as well as practical experience with a mainstream Python deep learning framework (PyTorch).

[1] https://arxiv.org/pdf/1311.3715
[2] https://pubmed.ncbi.nlm.nih.gov/36753766/
[3] https://arxiv.org/pdf/2401.09603v2
[4] https://arxiv.org/pdf/2405.14718
[5] https://arxiv.org/abs/2310.18689

Academic background

You are studying for a Master 2 diploma or an engineering degree in computer science or telecommunications. You are curious, open-minded, passionate about new technologies and have real interpersonal skills that will help you integrate into an innovative and multicultural environment.

Specific knowledge

AI, Computer Science and/or Image Processing
Python

Multiple Object Tracking (MOT) (fulfilled)

Duration: 4/6 months / Preferred starting date: February 2025

Internship subject

Canon CRF contributes to several European projects that aim to enhance the security and safety of road users by detecting and analyzing objects in streams captured by video cameras or LiDARs. Deep learning models perform these detections on the captured images or point clouds. Further processing is then applied to track objects from one frame to the next.
In recent years, much progress has been made in tracking detected objects with deep learning models. Some models extract features from detected objects to better re-identify them in subsequent frames (e.g., DeepSORT), while other models combine object detection and object identification (e.g., FairMOT, CSTrack).
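For readers unfamiliar with frame-to-frame tracking, the following minimal sketch shows the association step in its simplest form: greedy matching of existing tracks to new detections by bounding-box overlap (IoU). This is a plain baseline written for illustration, not DeepSORT, FairMOT or CSTrack, and the function names, box format and threshold are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, next_id, iou_threshold=0.3):
    """Greedily match tracks {id: box} to detections [box, ...] by descending IoU.

    Returns (updated tracks, next free id). Unmatched detections start new
    tracks; unmatched tracks are simply dropped (a simplification).
    """
    candidates = sorted(
        ((iou(box, det), track_id, det_idx)
         for track_id, box in tracks.items()
         for det_idx, det in enumerate(detections)),
        reverse=True,
    )
    updated, used_tracks, used_dets = {}, set(), set()
    for score, track_id, det_idx in candidates:
        if score < iou_threshold:
            break  # remaining pairs overlap too little to be the same object
        if track_id in used_tracks or det_idx in used_dets:
            continue
        updated[track_id] = detections[det_idx]
        used_tracks.add(track_id)
        used_dets.add(det_idx)
    for det_idx, det in enumerate(detections):
        if det_idx not in used_dets:
            updated[next_id] = det
            next_id += 1
    return updated, next_id


tracks = {0: (0, 0, 10, 10), 1: (50, 50, 60, 60)}
detections = [(51, 50, 61, 60), (1, 0, 11, 10), (100, 100, 110, 110)]
updated, next_id = associate(tracks, detections, next_id=2)
```

Overlap alone fails when objects cross or are occluded, which is the gap that appearance features and joint detection-identification models are meant to close.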

Mission

The aim of the internship is to evaluate deep learning models built for Multiple Object Tracking (MOT). In a first step, the intern will carry out a literature review of recent MOT deep learning models. Based on this study, she/he will select one model to build a MOT prototype. Finally, she/he will evaluate the model on both public datasets and road traffic scenes captured by CRF.

Academic background

Specific knowledge

AI, Deep Learning models
Python

Simulation and Evaluation of the IEEE 802.11be/bn PHY layer (fulfilled)

Duration: 5/6 months / Preferred starting date: February 2025

For several years, Canon, through its CRF research center, has been involved in activities related to the IEEE 802.11 standard, particularly the latest Wi-Fi 7 and future Wi-Fi 8 generations. The research center is recognized as an active contributor to the standard, with strong expertise in standardization. This internship offers close participation in the evolution of the Wi-Fi standard, specifically by investigating advanced technologies used in the physical layer.

Internship subject

Wi-Fi is one of the technologies that enable numerous electronic devices to exchange data or connect to the Internet wirelessly using radio waves. The main advantage of IEEE 802.11 or “Wireless LAN” devices is that they make local area network (LAN) deployments possible at low cost. Today, millions of IEEE 802.11 devices, including those used in Canon’s products, are in use worldwide and operate in the same frequency bands.
The IEEE 802.11 standard is a set of specifications describing the functionalities in the MAC (Medium Access Control) layer and the physical (PHY) layer for implementing wireless local area network (WLAN) communication.
To increase data rates and enhance spectral efficiency, new techniques have been introduced to boost Wi-Fi performance, reaching speeds on the order of gigabits per second (Gbps). Advanced techniques such as OFDMA, MU-MIMO, and beamforming have significantly improved performance in recent Wi-Fi generations.
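As a rough illustration of where gigabit-class figures come from, the sketch below computes a peak OFDM PHY rate from a handful of parameters. The numbers used in the example (3920 data subcarriers for a 320 MHz channel, 4096-QAM, coding rate 5/6, 12.8 µs symbols with a 0.8 µs guard interval) are assumptions based on commonly cited 802.11be parameters, not values stated in this proposal, and the function name is hypothetical. It is unrelated to the MATLAB simulator used at CRF.

```python
def phy_rate_mbps(data_subcarriers, bits_per_symbol, coding_rate,
                  symbol_us, guard_interval_us, spatial_streams=1):
    """Peak OFDM PHY data rate in Mbit/s.

    Each OFDM symbol carries data_subcarriers * bits_per_symbol * coding_rate
    information bits per spatial stream; dividing bits by a duration expressed
    in microseconds gives Mbit/s directly.
    """
    bits_per_ofdm_symbol = data_subcarriers * bits_per_symbol * coding_rate
    return spatial_streams * bits_per_ofdm_symbol / (symbol_us + guard_interval_us)


# Assumed 802.11be-style parameters (see the hedge in the paragraph above).
single_stream = phy_rate_mbps(3920, 12, 5 / 6, 12.8, 0.8)
sixteen_streams = phy_rate_mbps(3920, 12, 5 / 6, 12.8, 0.8, spatial_streams=16)
```

Under these assumptions a single spatial stream peaks at roughly 2.9 Gbps and sixteen streams at roughly 46 Gbps. These are theoretical ceilings; achieved throughput depends on the channel, MAC overhead and the number of users sharing the medium.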
A MATLAB-based simulator used by our research center allows the evaluation of the different technologies implemented by IEEE 802.11. In particular, it enables the assessment of the techniques employed in the physical layer across a range of scenarios.
The intern will work within the Wi-Fi team and will develop a unique experience in standardization and research within a research and development center.

Mission

Within the Wi-Fi team, the intern will enrich our modelling and simulation platform to evaluate the user experience.
The intern will carry out the following tasks:

Get familiar with the already developed simulator
Evaluate the different techniques used and their impact on system performance
Assess the channel model used
Apply the simulator to scenarios addressed by the standard

Academic background

You are a Master 2 or fifth-year engineering student in telecommunications. You are curious, open-minded, passionate about new technologies and have real interpersonal skills that will help you integrate into an innovative and multicultural environment.

Specific knowledge

MATLAB modeling
Programming languages: Python, C/C++