pgmpy API Reference

models module

Directed Graphical Models

Undirected Graphical Models

factors module

class pgmpy.factors.FactorSet(*factors_list)[source]

Base class of DiscreteFactor Sets.

A factor set provides a compact representation of the higher-dimensional factor \phi_1\cdot\phi_2\cdots\phi_n.

For example, the factor set corresponding to the factor \phi_1\cdot\phi_2 would be the union of the factors \phi_1 and \phi_2, i.e. the factor set \vec\phi = \phi_1 \cup \phi_2.

add_factors(*factors)[source]

Adds factors to the factor set.

Parameters:

factors: Factor1, Factor2, ...., Factorn :

factors to be added into the factor set

Examples

>>> from pgmpy.factors import FactorSet
>>> from pgmpy.factors import DiscreteFactor
>>> phi1 = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 2], range(12))
>>> phi2 = DiscreteFactor(['x3', 'x4', 'x1'], [2, 2, 2], range(8))
>>> factor_set1 = FactorSet(phi1, phi2)
>>> phi3 = DiscreteFactor(['x5', 'x6', 'x7'], [2, 2, 2], range(8))
>>> phi4 = DiscreteFactor(['x5', 'x7', 'x8'], [2, 2, 2], range(8))
>>> factor_set1.add_factors(phi3, phi4)
>>> print(factor_set1)
set([<DiscreteFactor representing phi(x1:2, x2:3, x3:2) at 0x7f8e32b4ca10>,
     <DiscreteFactor representing phi(x5:2, x7:2, x8:2) at 0x7f8e4c393690>,
     <DiscreteFactor representing phi(x5:2, x6:2, x7:2) at 0x7f8e32b4c750>,
     <DiscreteFactor representing phi(x3:2, x4:2, x1:2) at 0x7f8e32b4cb50>])
copy()[source]

Create a copy of factor set.

Examples

>>> from pgmpy.factors import FactorSet
>>> from pgmpy.factors import DiscreteFactor
>>> phi1 = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 2], range(12))
>>> phi2 = DiscreteFactor(['x3', 'x4', 'x1'], [2, 2, 2], range(8))
>>> factor_set = FactorSet(phi1, phi2)
>>> factor_set
<pgmpy.factors.FactorSet.FactorSet at 0x7fa68f390320>
>>> factor_set_copy = factor_set.copy()
>>> factor_set_copy
<pgmpy.factors.FactorSet.FactorSet at 0x7f91a0031160>
divide(factorset, inplace=True)[source]

Returns a new factor set instance after division by the factor set

Division of two factor sets \frac{\vec\phi_1}{\vec\phi_2} translates to the union of all the factors present in \vec\phi_1 and \frac{1}{\phi_i} of all the factors \phi_i present in \vec\phi_2.

Parameters:

factorset: FactorSet :

The divisor

inplace: A boolean (Default value True) :

If inplace = True, the FactorSet object itself is modified; if False, a new FactorSet object is returned.

Returns:

If inplace = False, returns a new FactorSet object which is the division of the given factor sets.

Examples

>>> from pgmpy.factors import FactorSet
>>> from pgmpy.factors import DiscreteFactor
>>> phi1 = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 2], range(12))
>>> phi2 = DiscreteFactor(['x3', 'x4', 'x1'], [2, 2, 2], range(8))
>>> factor_set1 = FactorSet(phi1, phi2)
>>> phi3 = DiscreteFactor(['x5', 'x6', 'x7'], [2, 2, 2], range(8))
>>> phi4 = DiscreteFactor(['x5', 'x7', 'x8'], [2, 2, 2], range(8))
>>> factor_set2 = FactorSet(phi3, phi4)
>>> factor_set3 = factor_set2.divide(factor_set1, inplace=False)
>>> print(factor_set3)
set([<DiscreteFactor representing phi(x3:2, x4:2, x1:2) at 0x7f8e32b5ba10>,
     <DiscreteFactor representing phi(x5:2, x6:2, x7:2) at 0x7f8e32b5b650>,
     <DiscreteFactor representing phi(x1:2, x2:3, x3:2) at 0x7f8e32b5b050>,
     <DiscreteFactor representing phi(x5:2, x7:2, x8:2) at 0x7f8e32b5b8d0>])
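
The division rule can be sketched without pgmpy: dividing one factor set by another keeps the factors of the dividend and adds the reciprocal of every factor of the divisor. The snippet below is an illustration under stated assumptions, not the library's implementation. A factor is modelled as a plain dict from assignment tuples to values, a factor set as a list of such dicts, and the helper names `reciprocal` and `divide_factor_sets` are hypothetical.

```python
from fractions import Fraction


def reciprocal(factor):
    """Return 1/phi entry by entry (assumes every entry is a non-zero integer)."""
    return {assignment: Fraction(1, value) for assignment, value in factor.items()}


def divide_factor_sets(dividend, divisor):
    """Union of the dividend's factors and the reciprocals of the divisor's factors."""
    return list(dividend) + [reciprocal(phi) for phi in divisor]


# Two single-variable factors, each with two states.
phi1 = {(0,): 2, (1,): 4}
phi2 = {(0,): 5, (1,): 10}

quotient = divide_factor_sets([phi1], [phi2])
print(len(quotient))       # 2
print(quotient[1][(1,)])   # 1/10
```

No factor is multiplied out here. As with the product, the result stays a collection of small factors, which is what keeps the representation compact.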
get_factors()[source]

Returns all the factors present in factor set.

Examples

>>> from pgmpy.factors import FactorSet
>>> from pgmpy.factors import DiscreteFactor
>>> phi1 = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 2], range(12))
>>> phi2 = DiscreteFactor(['x3', 'x4', 'x1'], [2, 2, 2], range(8))
>>> factor_set1 = FactorSet(phi1, phi2)
>>> phi3 = DiscreteFactor(['x5', 'x6', 'x7'], [2, 2, 2], range(8))
>>> factor_set1.add_factors(phi3)
>>> factor_set1.get_factors()
{<DiscreteFactor representing phi(x1:2, x2:3, x3:2) at 0x7f827c0a23c8>,
 <DiscreteFactor representing phi(x3:2, x4:2, x1:2) at 0x7f827c0a2358>,
 <DiscreteFactor representing phi(x5:2, x6:2, x7:2) at 0x7f825243f9e8>}
marginalize(variables, inplace=True)[source]

Marginalizes the factors present in the factor sets with respect to the given variables.

Parameters:

variables: list, array-like :

List of the variables to be marginalized.

inplace: boolean (Default value True) :

If inplace=True, the factor set itself is modified; otherwise a new factor set is created.

Returns:

If inplace = False, returns a new marginalized FactorSet object.

Examples

>>> from pgmpy.factors import FactorSet
>>> from pgmpy.factors import DiscreteFactor
>>> phi1 = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 2], range(12))
>>> phi2 = DiscreteFactor(['x3', 'x4', 'x1'], [2, 2, 2], range(8))
>>> factor_set1 = FactorSet(phi1, phi2)
>>> factor_set1.marginalize('x1')
>>> print(factor_set1)
set([<DiscreteFactor representing phi(x2:3, x3:2) at 0x7f8e32b4cc10>,
     <DiscreteFactor representing phi(x3:2, x4:2) at 0x7f8e32b4cf90>])
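
Marginalizing a factor set amounts to summing the given variables out of each factor. A minimal pure-Python sketch follows. It assumes a factor is a pair of variable names and a dict from assignment tuples to values, which is a hypothetical stand-in for DiscreteFactor, and the function name `marginalize` is illustrative only.

```python
from collections import defaultdict


def marginalize(variables, table, drop):
    """Sum out the variables in `drop` from one factor.

    variables: tuple of variable names, e.g. ('x1', 'x2')
    table:     dict mapping assignment tuples (one state per variable) to values
    drop:      set of variable names to sum out
    """
    keep = [i for i, name in enumerate(variables) if name not in drop]
    summed = defaultdict(int)
    for assignment, value in table.items():
        summed[tuple(assignment[i] for i in keep)] += value
    return tuple(variables[i] for i in keep), dict(summed)


# phi(x1, x2) with two states per variable and values 0..3.
phi = (('x1', 'x2'), {(0, 0): 0, (0, 1): 1, (1, 0): 2, (1, 1): 3})
# psi(x3) does not mention x1, so it passes through unchanged.
psi = (('x3',), {(0,): 7, (1,): 9})

factor_set = [phi, psi]
marginalized = [marginalize(names, table, {'x1'}) for names, table in factor_set]
print(marginalized[0])  # (('x2',), {(0,): 2, (1,): 4})
```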
product(factorset, inplace=True)[source]

Returns the product of this factor set with the given factor set.

Suppose \vec\phi_1 and \vec\phi_2 are two factor sets; then their product is another factor set \vec\phi_3 = \vec\phi_1 \cup \vec\phi_2.

Parameters:

factorset: FactorSet :

The factor set to be multiplied with

inplace: A boolean (Default value True) :

If inplace = True, the FactorSet object itself is modified; if False, a new FactorSet object is returned.

Returns:

If inplace = False, returns a new FactorSet object which is the product of the two factor sets.

Examples

>>> from pgmpy.factors import FactorSet
>>> from pgmpy.factors import DiscreteFactor
>>> phi1 = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 2], range(12))
>>> phi2 = DiscreteFactor(['x3', 'x4', 'x1'], [2, 2, 2], range(8))
>>> factor_set1 = FactorSet(phi1, phi2)
>>> phi3 = DiscreteFactor(['x5', 'x6', 'x7'], [2, 2, 2], range(8))
>>> phi4 = DiscreteFactor(['x5', 'x7', 'x8'], [2, 2, 2], range(8))
>>> factor_set2 = FactorSet(phi3, phi4)
>>> print(factor_set2)
set([<DiscreteFactor representing phi(x5:2, x6:2, x7:2) at 0x7f8e32b5b050>,
     <DiscreteFactor representing phi(x5:2, x7:2, x8:2) at 0x7f8e32b5b690>])
>>> factor_set2.product(factor_set1)
>>> print(factor_set2)
set([<DiscreteFactor representing phi(x1:2, x2:3, x3:2) at 0x7f8e32b4c910>,
     <DiscreteFactor representing phi(x3:2, x4:2, x1:2) at 0x7f8e32b4cc50>,
     <DiscreteFactor representing phi(x5:2, x6:2, x7:2) at 0x7f8e32b5b050>,
     <DiscreteFactor representing phi(x5:2, x7:2, x8:2) at 0x7f8e32b5b690>])
>>> factor_set2 = FactorSet(phi3, phi4)
>>> factor_set3 = factor_set2.product(factor_set1, inplace=False)
>>> print(factor_set2)
set([<DiscreteFactor representing phi(x5:2, x6:2, x7:2) at 0x7f8e32b5b060>,
     <DiscreteFactor representing phi(x5:2, x7:2, x8:2) at 0x7f8e32b5b790>])
remove_factors(*factors)[source]

Removes factors from the factor set.

Parameters:

factors: Factor1, Factor2, ...., Factorn :

factors to be removed from the factor set

Examples

>>> from pgmpy.factors import FactorSet
>>> from pgmpy.factors import DiscreteFactor
>>> phi1 = DiscreteFactor(['x1', 'x2', 'x3'], [2, 3, 2], range(12))
>>> phi2 = DiscreteFactor(['x3', 'x4', 'x1'], [2, 2, 2], range(8))
>>> factor_set1 = FactorSet(phi1, phi2)
>>> phi3 = DiscreteFactor(['x5', 'x6', 'x7'], [2, 2, 2], range(8))
>>> factor_set1.add_factors(phi3)
>>> print(factor_set1)
set([<DiscreteFactor representing phi(x1:2, x2:3, x3:2) at 0x7f8e32b5b050>,
     <DiscreteFactor representing phi(x5:2, x6:2, x7:2) at 0x7f8e32b5b250>,
     <DiscreteFactor representing phi(x3:2, x4:2, x1:2) at 0x7f8e32b5b150>])
>>> factor_set1.remove_factors(phi1, phi2)
>>> print(factor_set1)
set([<DiscreteFactor representing phi(x5:2, x6:2, x7:2) at 0x7f8e32b4cb10>])

inference module

independencies module

class pgmpy.independencies.Independencies(*assertions)[source]

Base class for independencies. The Independencies class represents a set of conditional independence assertions (e.g. “X is independent of Y given Z”, where X, Y and Z are random variables) or independence assertions (e.g. “X is independent of Y”, where X and Y are random variables). Initialize the Independencies class with conditional independence assertions or independence assertions.

Parameters:

assertions: Lists or Tuples :

Each assertion is a list or tuple of the form [event1, event2, event3], e.g. the assertion ['X', 'Y', 'Z'] means X is independent of Y given Z.

Examples

Creating an Independencies object with one independence assertion: random variable X is independent of Y.

>>> from pgmpy.independencies import Independencies
>>> independencies = Independencies(['X', 'Y'])

Creating an Independencies object with three conditional independence assertions; the first assertion is that random variable X is independent of Y given Z.

>>> independencies = Independencies(['X', 'Y', 'Z'],
...             ['a', ['b', 'c'], 'd'],
...             ['l', ['m', 'n'], 'o'])
add_assertions(*assertions)[source]

Adds assertions to independencies.

Parameters:

assertions: Lists or Tuples :

Each assertion is a list or tuple of variable, independent_of and given.

Examples

>>> from pgmpy.independencies import Independencies
>>> independencies = Independencies()
>>> independencies.add_assertions(['X', 'Y', 'Z'])
>>> independencies.add_assertions(['a', ['b', 'c'], 'd'])
closure()[source]

Returns a new Independencies()-object that additionally contains those IndependenceAssertions that are implied by the current independencies (using the semi-graphoid axioms; see Pearl, 1989, Conditional Independence and its Representations).

Might be very slow if more than six variables are involved.

Examples

>>> from pgmpy.independencies import Independencies
>>> ind1 = Independencies(('A', ['B', 'C'], 'D'))
>>> ind1.closure()
(A _|_ B | D, C)
(A _|_ B, C | D)
(A _|_ B | D)
(A _|_ C | D, B)
(A _|_ C | D)
>>> ind2 = Independencies(('W', ['X', 'Y', 'Z']))
>>> ind2.closure()
(W _|_ Y)
(W _|_ Y | X)
(W _|_ Z | Y)
(W _|_ Z, X, Y)
(W _|_ Z)
(W _|_ Z, X)
(W _|_ X, Y)
(W _|_ Z | X)
(W _|_ Z, Y | X)
[..]
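
To make the idea of a closure concrete, here is a simplified pure-Python sketch. It is an assumption-laden illustration, not pgmpy's algorithm: it applies only symmetry, decomposition and weak union, and leaves out contraction, so in general it derives fewer assertions than the full semi-graphoid closure. The names `canon`, `proper_subsets` and `single_premise_closure` are hypothetical.

```python
from itertools import combinations


def canon(x, y, z):
    """Canonical form of 'x _|_ y | z'; the unordered pair encodes symmetry."""
    return (frozenset([frozenset(x), frozenset(y)]), frozenset(z))


def proper_subsets(s):
    """Yield every non-empty proper subset of the frozenset s."""
    items = sorted(s)
    for size in range(1, len(items)):
        for combo in combinations(items, size):
            yield frozenset(combo)


def single_premise_closure(assertions):
    """Fixed point of decomposition and weak union (contraction is omitted)."""
    closed = {canon(*a) for a in assertions}
    frontier = list(closed)
    while frontier:
        sides, z = frontier.pop()
        a, b = tuple(sides)
        for x, y in ((a, b), (b, a)):
            for part in proper_subsets(y):
                rest = y - part
                derived = (
                    canon(x, part, z),         # decomposition: x _|_ part | z
                    canon(x, part, z | rest),  # weak union:    x _|_ part | z, rest
                )
                for assertion in derived:
                    if assertion not in closed:
                        closed.add(assertion)
                        frontier.append(assertion)
    return closed


closure = single_premise_closure([({'A'}, {'B', 'C'}, {'D'})])
print(len(closure))  # 5
```

For the single assertion (A _|_ B, C | D), these two rules already yield the five assertions listed in the first example above.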
contains(assertion)[source]

Returns True if assertion is contained in this Independencies-object, otherwise False.

Parameters:

assertion: IndependenceAssertion()-object

Examples

>>> from pgmpy.independencies import Independencies, IndependenceAssertion
>>> ind = Independencies(['A', 'B', ['C', 'D']])
>>> IndependenceAssertion('A', 'B', ['C', 'D']) in ind
True
>>> # does not depend on variable order:
>>> IndependenceAssertion('B', 'A', ['D', 'C']) in ind
True
>>> # but does not check entailment:
>>> IndependenceAssertion('X', 'Y', 'Z') in Independencies(['X', 'Y'])
False
entails(entailed_independencies)[source]

Returns True if the entailed_independencies are implied by this Independencies-object, otherwise False. Entailment is checked using the semi-graphoid axioms.

Might be very slow if more than six variables are involved.

Parameters:

entailed_independencies: Independencies()-object

Examples

>>> from pgmpy.independencies import Independencies
>>> ind1 = Independencies([['A', 'B'], ['C', 'D'], 'E'])
>>> ind2 = Independencies(['A', 'C', 'E'])
>>> ind1.entails(ind2)
True
>>> ind2.entails(ind1)
False
get_assertions()[source]

Returns the independencies object which is a set of IndependenceAssertion objects.

Examples

>>> from pgmpy.independencies import Independencies
>>> independencies = Independencies(['X', 'Y', 'Z'])
>>> independencies.get_assertions()
is_equivalent(other)[source]

Returns True if the two Independencies-objects are equivalent, otherwise False. (i.e. any Bayesian Network that satisfies the one set of conditional independencies also satisfies the other).

Might be very slow if more than six variables are involved.

Parameters:

other: Independencies()-object

Examples

>>> from pgmpy.independencies import Independencies
>>> ind1 = Independencies(['X', ['Y', 'W'], 'Z'])
>>> ind2 = Independencies(['X', 'Y', 'Z'], ['X', 'W', 'Z'])
>>> ind3 = Independencies(['X', 'Y', 'Z'], ['X', 'W', 'Z'], ['X', 'Y', ['W','Z']])
>>> ind1.is_equivalent(ind2)
False
>>> ind1.is_equivalent(ind3)
True
latex_string()[source]

Returns a list of strings. Each string represents an IndependenceAssertion in LaTeX.

reduce()[source]

Add function to remove duplicate Independence Assertions

class pgmpy.independencies.IndependenceAssertion(event1=[], event2=[], event3=[])[source]

Represents Conditional Independence or Independence assertion.

Each assertion has 3 attributes: event1, event2, event3. For the assertion

U \perp X, Y | Z

which is read as "random variable U is independent of X and Y given Z", the attributes would be:

event1 = {U}

event2 = {X, Y}

event3 = {Z}

Parameters:

event1: String or List of strings :

Random Variable which is independent.

event2: String or list of strings. :

Random Variables from which event1 is independent

event3: String or list of strings. :

Random Variables given which event1 is independent of event2.

Examples

>>> from pgmpy.independencies import IndependenceAssertion
>>> assertion = IndependenceAssertion('U', 'X')
>>> assertion = IndependenceAssertion('U', ['X', 'Y'])
>>> assertion = IndependenceAssertion('U', ['X', 'Y'], 'Z')
>>> assertion = IndependenceAssertion(['U', 'V'], ['X', 'Y'], ['Z', 'A'])
get_assertion()[source]

Returns a tuple of the attributes: (event1, event2, event3), i.e. the variable, what it is independent of, and what is given.

Examples

>>> from pgmpy.independencies import IndependenceAssertion
>>> asser = IndependenceAssertion('X', 'Y', 'Z')
>>> asser.get_assertion()

readwrite module

base module

class pgmpy.base.DirectedGraph(ebunch=None)[source]

Base class for directed graphs.

Directed graph assumes that all the nodes in graph are either random variables, factors or clusters of random variables and edges in the graph are dependencies between these random variables.

Parameters:

ebunch: input graph :

Data to initialize graph. If ebunch=None (default) an empty graph is created. The data can be an edge list or any NetworkX graph object.

Examples

Create an empty DirectedGraph with no nodes and no edges

>>> from pgmpy.base import DirectedGraph
>>> G = DirectedGraph()

G can be grown in several ways

Nodes:

Add one node at a time:

>>> G.add_node('a')

Add the nodes from any container (a list, set or tuple or the nodes from another graph).

>>> G.add_nodes_from(['a', 'b'])

Edges:

G can also be grown by adding edges.

Add one edge,

>>> G.add_edge('a', 'b')

a list of edges,

>>> G.add_edges_from([('a', 'b'), ('b', 'c')])

If some edges connect nodes not yet in the model, the nodes are added automatically. There are no errors when adding nodes or edges that already exist.

Shortcuts:

Many common graph queries can be written with plain Python syntax.

>>> 'a' in G     # check if node in graph
True
>>> len(G)  # number of nodes in graph
3
add_edge(u, v, **kwargs)[source]

Add an edge between u and v.

The nodes u and v will be automatically added if they are not already in the graph

Parameters:

u,v : nodes

Nodes can be any hashable Python object.

Examples

>>> from pgmpy.base import DirectedGraph
>>> G = DirectedGraph()
>>> G.add_nodes_from(['Alice', 'Bob', 'Charles'])
>>> G.add_edge('Alice', 'Bob')
add_edges_from(ebunch, **kwargs)[source]

Add all the edges in ebunch.

If nodes referred in the ebunch are not already present, they will be automatically added. Node names should be strings.

Parameters:

ebunch : container of edges

Each edge given in the container will be added to the graph. The edges must be given as 2-tuples (u, v).

Examples

>>> from pgmpy.base import DirectedGraph
>>> G = DirectedGraph()
>>> G.add_nodes_from(['Alice', 'Bob', 'Charles'])
>>> G.add_edges_from([('Alice', 'Bob'), ('Bob', 'Charles')])
add_node(node, **kwargs)[source]

Add a single node to the Graph.

Parameters:

node: node :

A node can be any hashable Python object.

Examples

>>> from pgmpy.base import DirectedGraph
>>> G = DirectedGraph()
>>> G.add_node('A')
add_nodes_from(nodes, **kwargs)[source]

Add multiple nodes to the Graph.

Parameters:

nodes: iterable container :

A container of nodes (list, dict, set, etc.).

Examples

>>> from pgmpy.base import DirectedGraph
>>> G = DirectedGraph()
>>> G.add_nodes_from(['A', 'B', 'C'])
get_parents(node)[source]

Returns a list of parents of node.

Parameters:

node: string, int or any hashable python object. :

The node whose parents would be returned.

Examples

>>> from pgmpy.base import DirectedGraph
>>> G = DirectedGraph([('diff', 'grade'), ('intel', 'grade')])
>>> G.get_parents('grade')
['diff', 'intel']
moralize()[source]

Removes all the immoralities in the DirectedGraph and creates a moral graph (UndirectedGraph).

A v-structure X->Z<-Y is an immorality if there is no directed edge between X and Y.

Examples

>>> from pgmpy.base import DirectedGraph
>>> G = DirectedGraph([('diff', 'grade'), ('intel', 'grade')])
>>> moral_graph = G.moralize()
>>> moral_graph.edges()
[('intel', 'grade'), ('intel', 'diff'), ('grade', 'diff')]
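
Moralization can be sketched in a few lines of plain Python: drop the edge directions and connect ("marry") every pair of parents that share a child. The function below is an illustrative stand-in and not the pgmpy implementation. It works on a list of directed edges and returns undirected edges as frozensets.

```python
from itertools import combinations


def moralize(directed_edges):
    """Return the undirected edge set of the moral graph."""
    parents = {}
    for parent, child in directed_edges:
        parents.setdefault(child, set()).add(parent)

    # Drop directions: every directed edge becomes an undirected one.
    moral_edges = {frozenset(edge) for edge in directed_edges}

    # Marry the parents: connect each pair of nodes that share a child.
    for parent_set in parents.values():
        for a, b in combinations(parent_set, 2):
            moral_edges.add(frozenset((a, b)))
    return moral_edges


moral = moralize([('diff', 'grade'), ('intel', 'grade')])
print(len(moral))  # 3
```

For the graph in the example above, the only immorality is diff -> grade <- intel, so the single added edge joins diff and intel.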
class pgmpy.base.UndirectedGraph(ebunch=None)[source]

Base class for all the Undirected Graphical models.

UndirectedGraph assumes that all the nodes in graph are either random variables, factors or cliques of random variables and edges in the graphs are interactions between these random variables, factors or clusters.

Parameters:

ebunch: input graph :

Data to initialize graph. If ebunch=None (default) an empty graph is created. The data can be an edge list or any NetworkX graph object.

Examples

Create an empty UndirectedGraph with no nodes and no edges

>>> from pgmpy.base import UndirectedGraph
>>> G = UndirectedGraph()

G can be grown in several ways

Nodes:

Add one node at a time:

>>> G.add_node('a')

Add the nodes from any container (a list, set or tuple or the nodes from another graph).

>>> G.add_nodes_from(['a', 'b'])

Edges:

G can also be grown by adding edges.

Add one edge,

>>> G.add_edge('a', 'b')

a list of edges,

>>> G.add_edges_from([('a', 'b'), ('b', 'c')])

If some edges connect nodes not yet in the model, the nodes are added automatically. There are no errors when adding nodes or edges that already exist.

Shortcuts:

Many common graph queries can be written with plain Python syntax.

>>> 'a' in G     # check if node in graph
True
>>> len(G)  # number of nodes in graph
3
add_edge(u, v, **kwargs)[source]

Add an edge between u and v.

The nodes u and v will be automatically added if they are not already in the graph

Parameters:

u,v : nodes

Nodes can be any hashable Python object.

Examples

>>> from pgmpy.base import UndirectedGraph
>>> G = UndirectedGraph()
>>> G.add_nodes_from(['Alice', 'Bob', 'Charles'])
>>> G.add_edge('Alice', 'Bob')
add_edges_from(ebunch, **kwargs)[source]

Add all the edges in ebunch.

If nodes referred in the ebunch are not already present, they will be automatically added.

Parameters:

ebunch : container of edges

Each edge given in the container will be added to the graph. The edges must be given as 2-tuples (u, v).

Examples

>>> from pgmpy.base import UndirectedGraph
>>> G = UndirectedGraph()
>>> G.add_nodes_from(['Alice', 'Bob', 'Charles'])
>>> G.add_edges_from([('Alice', 'Bob'), ('Bob', 'Charles')])
add_node(node, **kwargs)[source]

Add a single node to the Graph.

Parameters:

node: node :

A node can be any hashable Python object.

Examples

>>> from pgmpy.base import UndirectedGraph
>>> G = UndirectedGraph()
>>> G.add_node('A')
add_nodes_from(nodes, **kwargs)[source]

Add multiple nodes to the Graph.

Parameters:

nodes: iterable container :

A container of nodes (list, dict, set, etc.).

Examples

>>> from pgmpy.base import UndirectedGraph
>>> G = UndirectedGraph()
>>> G.add_nodes_from(['A', 'B', 'C'])
check_clique(nodes)[source]

Check if the given nodes form a clique.

Parameters:

nodes: list, array-like :

List of nodes to check if they are a part of any clique.
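
A set of nodes forms a clique when every pair of them is joined by an edge. The sketch below shows that test on a plain adjacency dict. The function name `is_clique` and the dict-of-sets representation are assumptions made for illustration and say nothing about how check_clique is implemented.

```python
from itertools import combinations


def is_clique(adjacency, nodes):
    """Return True if every pair of the given nodes is adjacent."""
    return all(v in adjacency[u] for u, v in combinations(nodes, 2))


# Undirected graph stored as a symmetric adjacency dict.
adjacency = {
    'x1': {'x2', 'x3', 'x4'},
    'x2': {'x1', 'x4'},
    'x3': {'x1', 'x4'},
    'x4': {'x1', 'x2', 'x3'},
}

print(is_clique(adjacency, ['x1', 'x2', 'x4']))  # True
print(is_clique(adjacency, ['x2', 'x3']))        # False
```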

is_triangulated()[source]

Checks whether the undirected graph is triangulated or not.

Examples

>>> from pgmpy.base import UndirectedGraph
>>> G = UndirectedGraph()
>>> G.add_edges_from([('x1', 'x2'), ('x1', 'x3'), ('x1', 'x4'),
...                   ('x2', 'x4'), ('x3', 'x4')])
>>> G.is_triangulated()
True
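
One way to test whether a graph is triangulated (chordal) is to remove simplicial nodes repeatedly. A simplicial node is one whose neighbours are pairwise adjacent, and a graph is chordal exactly when this process can remove every node. The sketch below uses that technique on a plain edge list. It is an illustration, not a description of how is_triangulated is implemented, and the name `is_chordal` is hypothetical.

```python
from itertools import combinations


def is_chordal(edges):
    """Return True if the undirected graph given by `edges` is chordal."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    while adjacency:
        # Look for a node whose neighbours are pairwise adjacent.
        simplicial = next(
            (node for node, nbrs in adjacency.items()
             if all(b in adjacency[a] for a, b in combinations(nbrs, 2))),
            None,
        )
        if simplicial is None:
            return False  # a chordless cycle of length >= 4 remains
        # Remove the simplicial node from the graph.
        for nbr in adjacency.pop(simplicial):
            adjacency[nbr].discard(simplicial)
    return True


triangulated = is_chordal([('x1', 'x2'), ('x1', 'x3'), ('x1', 'x4'),
                           ('x2', 'x4'), ('x3', 'x4')])
square = is_chordal([('a', 'b'), ('b', 'c'), ('c', 'd'), ('d', 'a')])
print(triangulated, square)  # True False
```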