Junghyun (Andy) Kim

Machine learning-based code auto-completion implementation for firmware developers

Sponsored by Samsung Electronics


Research Motivation

Firmware developers write the software that operates a company's hardware products. Companies try to streamline the firmware development process because higher developer productivity translates directly into lower cost. One barrier to productivity is the considerable time developers spend writing repetitive code. Another is that developers working separately on different hardware products may end up writing similar code at the same time. Several research groups have already attempted to develop code auto-completion functionality. However, Samsung Electronics wanted to build an in-house solution for two reasons: 1) using commercial tools carries a risk of leaking source code, and 2) commercial tools might not transfer well to this specific domain, where the tool needs to predict unique code patterns and styles.


Key Idea

I propose a hybrid approach that combines machine learning techniques with advanced design methods to build a code auto-completion framework that helps firmware developers write code more efficiently. In addition, because the optimal diversity parameter values of the GPT-2 model have not yet been analyzed for the firmware development domain, I propose the following methodology to better understand the relationship between the GPT-2 model's diversity parameters and code auto-completion performance in the SSD firmware development domain.
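To make "diversity parameters" concrete, here is a minimal sketch in plain Python. The post does not name the parameters, so this assumes they are the usual sampling controls for GPT-2-style generation: temperature, top-k, and top-p. The function names and the toy logits are hypothetical and are not taken from the publication.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_top_p_filter(probs, top_k=0, top_p=1.0):
    """Keep the top_k most likely indices (0 disables the limit), then the
    shortest prefix whose cumulative probability reaches top_p, and renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k > 0:
        order = order[:top_k]
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

# Toy scores for three candidate tokens (assumed values, for illustration only).
toy_logits = [2.0, 1.0, 0.0]
for t in (0.5, 1.0, 2.0):
    print(t, softmax_with_temperature(toy_logits, temperature=t))
print(top_k_top_p_filter(softmax_with_temperature(toy_logits), top_k=2))
```

Raising the temperature or widening top-k/top-p lets less likely tokens through, which is the sense in which these settings control output diversity in this sketch.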


Overview of the proposed methodology


Results

The sensitivity analysis results show that the deterministic design reduces prediction accuracy because it sometimes generates unexpected output, whereas the probabilistic design provides a list of reasonable next code elements from which the developer can manually select one, which increases prediction accuracy.
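The mechanical difference between the two designs can be sketched as follows. This is an illustration under assumptions, not the framework from the publication: the token probabilities are made up, and "deterministic" is taken to mean always returning the single most likely token, while "probabilistic" is taken to mean sampling from the top candidates and showing the developer a shortlist.

```python
import random

# Hypothetical next-token probabilities after a prefix such as "status = read_".
# These values are invented for illustration.
NEXT_TOKEN_PROBS = {"page": 0.40, "block": 0.35, "register": 0.20, "erase": 0.05}

def deterministic_completion(probs):
    """Return exactly one suggestion: the most likely token."""
    return max(probs, key=probs.get)

def probabilistic_candidates(probs, top_k=3, num_samples=20, seed=0):
    """Sample from the top_k tokens and return the distinct tokens drawn,
    most frequently drawn first, as a shortlist for manual selection."""
    rng = random.Random(seed)
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    tokens = [token for token, _ in top]
    weights = [p for _, p in top]
    draws = rng.choices(tokens, weights=weights, k=num_samples)
    counts = {}
    for token in draws:
        counts[token] = counts.get(token, 0) + 1
    return sorted(counts, key=counts.get, reverse=True)

print("deterministic:", deterministic_completion(NEXT_TOKEN_PROBS))
print("shortlist:", probabilistic_candidates(NEXT_TOKEN_PROBS))
```

In this toy setting the deterministic path can only ever offer "page", so it is wrong whenever the developer intended "block" or "register"; the shortlist keeps those alternatives available for selection.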


Publication

Click here to access the publication.
