Securing OSS Supply Chain Data through Modeling & Analysis


Researcher Vineet Mohanty is working to resolve a number of defects and improve the mean repair time across the interdependencies of open-source software (OSS) supply chains. Reducing repair time and defects will lead to a more efficient supply chain model, which in turn could save a substantial amount of resources within supply chains.


Mohanty applies network science concepts in the hope of empirically evaluating the risks associated with OSS supply chain interdependencies. To achieve this, he analyzes the relationships between a source package and its corresponding dependent packages, as recorded in a Debian buildinfo file.

Motivation - What are the sources of threat and cause for ‘critical’ packages?

Packages can become vulnerable for many reasons, such as:

  • lack of security updates
  • low release frequency
  • defects in environmental configuration
  • malicious code alteration in the repository
  • failures in the packaging process

In general, as dependence on OSS products increases, more stakeholders get involved, and so the chances of a threat grow.


The following research milestones are explained in Mohanty’s words:

1. Identify the problem through the data

First, we need to pre-process the raw data from the Debian Linux OSS community, which designs and maintains large commercial OSS products. We then develop a graph-theoretical model that allows us to describe and explain the risk in OSS supply chain ecosystems by evaluating vulnerabilities in the design and identifying critical packages in the structure.
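As an illustration of the pre-processing step, the sketch below turns the text of one Debian buildinfo file into dependency edges. It is not the project's actual pipeline. It assumes the usual control-file layout, in which the Installed-Build-Depends field lists one "name (= version)" entry per continuation line; the function name parse_buildinfo and the sample data are hypothetical.

```python
def parse_buildinfo(text):
    """Return (source_name, [(dependency, version), ...]) for one buildinfo file.

    Assumption: fields follow the "Name: value" control-file layout and
    continuation lines start with whitespace. Signature armor and
    architecture qualifiers are not handled in this sketch.
    """
    fields = {}
    current = None
    for line in text.splitlines():
        if line[:1] in (" ", "\t"):
            # Continuation line: belongs to the most recent field.
            if current is not None:
                fields[current] += "\n" + line.strip()
        elif ":" in line:
            name, _, value = line.partition(":")
            current = name.strip()
            fields[current] = value.strip()

    dependencies = []
    raw = fields.get("Installed-Build-Depends", "").replace("\n", " ")
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue
        name, _, rest = entry.partition("(")
        version = rest.rstrip(")").strip().lstrip("=").strip()
        dependencies.append((name.strip(), version))
    return fields.get("Source"), dependencies


# Hypothetical sample, shortened to the fields this sketch reads.
SAMPLE = """Format: 1.0
Source: hello
Version: 2.10-2
Installed-Build-Depends:
 autoconf (= 2.69-11),
 gcc-9 (= 9.2.1-22),
 make (= 4.2.1-1.2)
"""

source, deps = parse_buildinfo(SAMPLE)
# One edge per (source package, build dependency) pair.
edges = [(source, name) for name, _version in deps]
```

Each edge links the source package to one installed build dependency, so the edges collected from many buildinfo files can be merged into a single dependency graph.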

We determine the critical packages by computing the degree of each package, that is, the number of other packages it is connected to in the network, and selecting those with the highest values. These packages are critical because they are highly connected: a cyber-attack on any one of them poses a direct threat to every package connected to it in the OSS ecosystem. In this way the security of the OSS supply chain is compromised and becomes vulnerable. For the same reason, these critical, highly connected packages form the “weak link” in our network.
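The degree computation described above can be sketched as follows. This is a minimal example under stated assumptions: the graph is treated as undirected, degree is counted as the number of distinct neighbours, and the function names and toy edge list are hypothetical. Whether the research uses in-degree, out-degree or total degree is not specified here.

```python
from collections import defaultdict


def degree_by_package(edges):
    """Map each package to its number of distinct neighbours (undirected)."""
    neighbours = defaultdict(set)
    for source, dependency in edges:
        if source == dependency:
            continue  # ignore self-loops
        neighbours[source].add(dependency)
        neighbours[dependency].add(source)
    return {package: len(linked) for package, linked in neighbours.items()}


def critical_packages(edges, top_n=3):
    """Return the top_n packages by degree, ties broken alphabetically."""
    degrees = degree_by_package(edges)
    ranked = sorted(degrees.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical toy data: (source package, build dependency) pairs.
toy_edges = [
    ("hello", "gcc-9"), ("hello", "make"),
    ("curl", "gcc-9"), ("curl", "libssl-dev"),
    ("nginx", "gcc-9"), ("nginx", "libssl-dev"),
]

top = critical_packages(toy_edges, top_n=2)
```

In this toy graph gcc-9 is linked to three packages, so it ranks first, which matches the intuition that a widely shared dependency is a weak link.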

2. Resolve the issue by identifying possible safeguards

To safeguard software products and packages, it is imperative that we apply specific computational tools and techniques to our research problem, such as Big Data engineering and Machine Learning, to classify the OSS risk profile. This will help us better predict the expected bugs in the repository or OSS ecosystem. Ultimately, we expect the knowledge emanating from our research to benefit software developers by helping them design more secure OSS products, add appropriate security features, and gain thorough insight into the development stages.

3. Protect the process

The objective is to obtain actionable insights from the data and the network science approach we plan to implement. These insights will enable better decision making by giving us a more informed view of the software design process and by letting us test our model in the field. We generate cumulative yearly graphs to investigate and monitor how the degree centrality of the highly connected packages changes over time.
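A minimal sketch of the cumulative yearly graphs follows. It assumes that each record carries a build year, that the graph for a given year contains every edge seen up to and including that year, and that degree centrality is the degree divided by (n - 1) for a graph with n packages. The function name and sample records are hypothetical.

```python
from collections import defaultdict


def cumulative_degree_centrality(records):
    """records: iterable of (year, source, dependency) tuples.

    Returns {year: {package: degree centrality}} for cumulative graphs.
    """
    edges_by_year = defaultdict(list)
    for year, source, dependency in records:
        edges_by_year[year].append((source, dependency))

    neighbours = defaultdict(set)  # grows year over year (cumulative graph)
    centrality_by_year = {}
    for year in sorted(edges_by_year):
        for source, dependency in edges_by_year[year]:
            if source != dependency:
                neighbours[source].add(dependency)
                neighbours[dependency].add(source)
        n = len(neighbours)
        scale = 1 / (n - 1) if n > 1 else 0.0
        centrality_by_year[year] = {
            package: len(linked) * scale
            for package, linked in neighbours.items()
        }
    return centrality_by_year


# Hypothetical records: (build year, source package, build dependency).
sample_records = [
    (2018, "hello", "gcc"), (2018, "hello", "make"),
    (2019, "curl", "gcc"), (2019, "curl", "openssl"),
]

trend = cumulative_degree_centrality(sample_records)
```

Comparing a package's value across years, for example trend[2018]["gcc"] against trend[2019]["gcc"], shows whether it is becoming more or less central as the ecosystem grows.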