Caijun Qin

Student NameCaijun Qin
Faculty Mentor NamePeter Dobbins
CollegeHerbert Wertheim College of Engineering
MajorComputer Science
Research InterestsGraph Theory, Combinatorics, Optimization Problems
Academic AwardsFSU University Freshman Scholarship (2018-2019), FSU Honors Program (2019), 1st Place in FSU Spring Programming Competition Lower Division (2019), UF University Scholars Program (2020)
OrganizationsACM, ACE, UF Programming Club
Hobbies and InterestsOpen-Source Software, Competitive Programming, Chess, Reading, Running

Research Project

Predictive Sampling Method for Spread Models in Networks?

This paper proposes an accurate, predictive sampling algorithm intended to forecast events or effectual range of an event in a spread model. The spread model is physically represented on a static, connected, and weighted network. Note that graph and network are synonymous. This type of graph is a set of vertices such that each vertex is connected via an edge to at least one other vertex. Applications of data science such as predicting influence in social networks, spreading of neural signals in the brain, changes in ecological food webs, and more depend on very large graphs. Many sampling techniques have been proposed to generate a smaller sub-graph as a representation of the original graph. In some cases, we do not have access to all vertices and edges immediately; rather, we can only know the local properties of one or a few vertices and edges. Under his context, sampling a large graph requires growing the sample from one or more sources. For our purpose of demonstration, a single source will be used. Iteratively at each vertex, a less restrictive variant of caterpillar graph is accumulated onto a growing and possibly branching sample network. The ratio of each unvisited edge weight to the sum of all incident edges’ weights determines whether a new edge will be appended to the growing sample. An existing probabilistic method of network sampling using random walks will be compared to our proposed method. A random walk, in the general sense, randomly appends an incident edge (and the attached vertex) to the sample, continues down the new edge, and repeats until some end criteria is reached. For weighted edges, a “random” walk would append an edge based on probability of being chosen, which is determined by edge weight. Our method binarizes the random walk approach by either appending an edge with 100% probability or certainly not at all, based on the edge weight. To compare performance, agent-based modeling (ABM) will be utilized to simulate the spread of agents from the same initial source. The accuracy, using either sampling method, is the ratio of vertices visited by any agent(s) found in the sample graph to the number of visited vertices overall.