Hypersparse Network Flow Analysis of Packets with GraphBLAS

13 September 2022
Tyler H. Trigg
C. Meiners
Sandeep Pisharody
Hayden Jananthan
Michael Jones
Adam Michaleas
Tim Davis
Erik Welch
William Arcand
David Bestor
William Bergeron
Chansup Byun
V. Gadepally
Micheal Houle
Matthew Hubbell
Anna Klein
Peter Michaleas
Lauren Milechin
J. Mullen
Andrew Prout
Albert Reuther
Antonio Rosa
S. Samsi
Douglas Stetson
Charles Yee
J. Kepner
Abstract

Internet analysis is a major challenge due to the volume and rate of network traffic. In lieu of analyzing traffic as raw packets, network analysts often rely on compressed network flows (netflows) that contain the start time, stop time, source, destination, and number of packets in each direction. However, many traffic analyses benefit from temporal aggregation of multiple simultaneous netflows, which can be computationally challenging. To alleviate this concern, a novel netflow compression and resampling method has been developed leveraging GraphBLAS hypersparse traffic matrices that preserve anonymization while enabling subrange analysis. Standard multitemporal spatial analyses are then performed on each subrange to generate detailed statistical aggregates of the source packets, source fan-out, unique links, destination fan-in, and destination packets of each subrange, which can then be used for background modeling and anomaly detection. A simple file format based on GraphBLAS sparse matrices is developed for storing these statistical aggregates. This method is scale-tested on the MIT SuperCloud using a 50 trillion packet netflow corpus from several hundred sites collected over several months. The resulting compression is significant (<0.1 bit per packet), enabling extremely large netflow analyses to be stored and transported. The single-node parallel performance is analyzed in terms of both processors and threads, showing that a single node can perform hundreds of simultaneous analyses at over a million packets/sec (roughly equivalent to a 10 Gigabit link).
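
The aggregates named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a traffic matrix stored as a plain Python dict keyed by (source, destination) instead of a GraphBLAS hypersparse matrix, and the function names `build_traffic_matrix` and `aggregates`, the integer endpoint IDs, and the sample packets are all hypothetical. The netflow resampling step is not shown.

```python
from collections import defaultdict


def build_traffic_matrix(packets):
    """Count packets per (source, destination) pair.

    Stand-in for a hypersparse traffic matrix: only nonzero
    entries are stored, keyed by (src, dst).
    """
    matrix = defaultdict(int)
    for src, dst in packets:
        matrix[(src, dst)] += 1
    return dict(matrix)


def aggregates(matrix):
    """Compute per-source and per-destination statistics of a traffic matrix."""
    source_packets = defaultdict(int)       # row sums
    destination_packets = defaultdict(int)  # column sums
    source_fan_out = defaultdict(int)       # nonzero entries per row
    destination_fan_in = defaultdict(int)   # nonzero entries per column

    for (src, dst), count in matrix.items():
        source_packets[src] += count
        destination_packets[dst] += count
        source_fan_out[src] += 1
        destination_fan_in[dst] += 1

    return {
        "total_packets": sum(matrix.values()),
        "unique_links": len(matrix),  # number of nonzero entries
        "source_packets": dict(source_packets),
        "source_fan_out": dict(source_fan_out),
        "destination_fan_in": dict(destination_fan_in),
        "destination_packets": dict(destination_packets),
    }


# Hypothetical subrange of four packets between anonymized integer endpoints.
packets = [(1, 2), (1, 2), (1, 3), (4, 2)]
traffic = build_traffic_matrix(packets)
stats = aggregates(traffic)
```

In this toy subrange, source 1 sends three packets to two distinct destinations, so its source packet count is 3 and its fan-out is 2. In matrix terms these are a row sum and a count of nonzero entries in the row, which is why a sparse matrix representation suits the analysis.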
