Big Data, Big Graphs, and Graph Theory: Tools and Methods

Big Data, Big Graphs, and Graph Theory: Tools and Methods

Big Graphs Need Specialized Data Storage and Computational Methods

{A Working Blogpost – Notes for research & study}

Processing large-scale graph data: A guide to current technology, by Sherif Sakr (ssakr@cse.unsw.edu.au), IBM Developer Works (10 June 2013).

Note: Dr. Sherif Sakr is a senior research scientist in the Software Systems Group at National ICT Australia (NICTA), Sydney, Australia. He is also a conjoint senior lecturer in the School of Computer Science and Engineering at University of New South Wales. He received his doctorate in computer science from Konstanz University, Germany, in 2007. His bachelor’s and master’s degrees in computer science are from Cairo University, Egypt. In 2011, Dr. Sakr held a visiting research scientist position in the eXtreme Computing Group (XCG) at Microsoft Research, in Redmond, Wash. In 2012, he held a research MTS position in Alcatel-Lucent Bell Labs.

Big Data Start-Ups: Graphs

GRAPH-THEORETICAL METHODS FOR DETECTING AND DESCRIBING GESTALT CLUSTERS, C. T. Zahn (1970)

Zhou, Haijun and Wang, Chuang (Date TBD). Region graph partition function expansion and approximate free energy landscapes: Theory and some numerical results, Journal of Statistical Physics.

Wainwright, Martin J. and Jordan, Michael I. (2008). Graphical Models, Exponential Families, and Variational Inference, Foundations and Trends in Machine Learning, 1, Nos. 1–2 (2008), 1–305.

Abstract:
framework for capturing complex dependencies among random variables, and building large-scale multivariate statistical models. Graphical models have become a focus of research in many statistical, computational and mathematical fields, including bioinformatics, communication theory, statistical physics, combinatorial optimization, signal and image processing, information retrieval and statistical machine learning. Many problems that arise in specific instances — including the key problems of computing marginals and modes of probability distributions — are best studied in the general setting.

Working with exponential family representations, and exploiting the conjugate duality between the cumulant function and the entropy for exponential families, we develop general variational representations of the problems of computing likelihoods, marginal probabilities and most probable configurations. We describe how a wide variety of algorithms — among them sum-product, cluster variational methods, expectation-propagation, mean field methods, max-product and linear programming relaxation, as well as conic programming relaxations — can all be understood in terms of exact or approximate forms of these variational representations. The variational approach provides a complementary alternative to Markov chain Monte Carlo as a general source of approximation methods for inference in large-scale statistical models.

Yedidia, Jonathan S., Freeman, William T., and Weiss, Yair (2002). Constructing Free Energy Approximations and Generalized Belief Propagation Algorithms, TR-2002-35 August 2002, MITSUBISHI ELECTRIC RESEARCH LABORATORIES, http://www.merl.com.

Abstract:
Important inference problems in statistical physics, computer vision, error-correcting coding theory, and artificial intelligence can all be reformulated as the computation of marginal probabilities on factor graphs. The belief propagation (BP) algorithm is an efficient way to solve these problems that is exact when the factor graph is a tree, but only approximate when the factor graph has cycles.

We show that BP fixed points correspond to the stationary points of the Bethe approximation to the free energy for a factor graph. We explain how to obtain region-based free energy approximations that improve the Bethe approximation, and corresponding generalized belief propagation (GBP) algorithms.

We emphasize the conditions a free energy approximation must satisfy in order to be a “valid” approximation. We describe the relationship between four different methods that can be used to generate valid approximations: the “Bethe method,” the “junction graph method,” the “cluster variation method,” and the “region graph method.”

The region graph method is the most general of these methods, and it subsumes all the other methods. Region graphs also provide the natural graphical setting for GBP algorithms. We explain how to obtain three different versions of GBP algorithms and show that their fixed points will always correspond to stationary points of the region graph approximation to the free energy. We also show that the region graph approximation is exact when the region graph has no cycles.
(With permission, MERL.)

Leave a Reply

Your email address will not be published. Required fields are marked *