Add Clustering and TSP apps (#265)
* Prelim commit

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* NEBM+SCIF merger: first pass

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Prelim commit

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* NEBM+SCIF merger: second pass

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Post code-review: removed `debug` from NEBM, added docstrings, reinstated `best_solution` in read_gate

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* First VRP app commit; incomplete, needs tests

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Adding helper functions to generate Q matrices
for a) clustering b) tsp.
TODO: Add Typing code

* Second VRP app commit; functional with VRPy. Includes tests.

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Second VRP app commit; solver complete, needs correct Q matrices

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Changes to clustering matrix complete.
Made changes to TSP API only.
Further work needed on TSP logic

* New tsp matrix generator with tests

* Changed formulation of distance in TSP.
Encoding is now accurate

* Added proper clustering Q matrix generator
(with test)

* VRP Solver: Almost there

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* NEBM+SCIF merger: first pass

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Prelim commit

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* NEBM+SCIF merger: second pass

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Post code-review: removed `debug` from NEBM, added docstrings, reinstated `best_solution` in read_gate

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Adding helper functions to generate Q matrices
for a) clustering b) tsp.
TODO: Add Typing code

* Resolved conflicts

* Prelim commit

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* NEBM+SCIF merger: first pass

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Prelim commit

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* NEBM+SCIF merger: second pass

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Post code-review: removed `debug` from NEBM, added docstrings, reinstated `best_solution` in read_gate

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Changes to clustering matrix complete.
Made changes to TSP API only.
Further work needed on TSP logic

* New tsp matrix generator with tests

* Changed formulation of distance in TSP.
Encoding is now accurate

* VRPSolver first milestone: successfully solves VRPs

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* SCIF CPU backend model minor change to remove `state_hist` and pass tests

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Delinting working VRPSolver

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Commented out a piece of code dependent on a draft PR

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Remove lint and add VRPy to PyProject.TOML

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Sparsification attempt #1: DistProxy with sign inversion and max cut-off

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Intermediate check point commit for scenario sweep

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Profiling and sparsification related improvements to VRPSolver and VRPConfig

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Script to sweep various scenarios for performance modelling of VRPSolver

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Profiling and sparsification related improvements to VRPSolver and VRPConfig

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Fixed the way to check if VRPy is installed

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Code clean-up refactoring in LCA module

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Corrected TSP Q matrix name

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Added edge-pruning based sparsification

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Tests and scripts for quantification of the effect of dist-mat sparsity on solution quality

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Sparsification attempt #1: DistProxy with sign inversion and max cut-off

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Delint VRP solver.py

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* First commit of clustering and TSP. Clustering is almost complete.

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Cleaner unittest for solver

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Functioning Clustering and TSP apps

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Delinting

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Removed VRP from this branch

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Removed VRP unittests

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* Clustering demo jupyter notebook added

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

* TSP demo jupyter notebook added

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>

---------

Signed-off-by: Risbud, Sumedh <sumedh.risbud@intel.com>
Co-authored-by: Ashish Rao Mangalore <ashish.rao.mangalore@intel.com>
srrisbud and ashishrao7 committed Nov 9, 2023
1 parent 821b81f commit 4ae9122
Showing 25 changed files with 2,290 additions and 11 deletions.
1 change: 1 addition & 0 deletions pyproject.toml
@@ -60,6 +60,7 @@ scipy = "^1.10.1"
nbformat = "^5.7.1"
seaborn = "^0.12.2"


[tool.poetry.dev-dependencies]
bandit = "1.7.4"
coverage = "^6.3.2"
157 changes: 157 additions & 0 deletions src/lava/lib/optimization/apps/clustering/problems.py
@@ -0,0 +1,157 @@
# Copyright (C) 2023 Intel Corporation
# SPDX-License-Identifier: BSD-3-Clause
# See: https://spdx.org/licenses/

import networkx as ntx
import numpy as np
import typing as ty


class ClusteringProblem:
    """Problem specification for a clustering problem.

    N points need to be clustered into M clusters. The cluster centers
    are *given*. Clustering assigns a cluster ID to each point based on
    the closest cluster center.
    """
    def __init__(self,
                 point_coords: ty.List[ty.Tuple[int, int]],
                 center_coords: ty.List[ty.Tuple[int, int]],
                 edges: ty.Optional[ty.List[ty.Tuple[int, int]]] = None):
        """
        Parameters
        ----------
        point_coords : list(tuple(int, int))
            A list of integer tuples corresponding to the coordinates of
            the points to be clustered.
        center_coords : list(tuple(int, int))
            A list of integer tuples corresponding to the coordinates of
            the cluster centers.
        edges : list(tuple(int, int, float)), optional
            An optional list of edges connecting points and cluster
            centers, given as a list of triples (ID1, ID2, weight). See
            the note below for the ID scheme. If None, all-to-all
            connectivity between points is assumed, weighted by their
            pairwise distances.

        Notes
        -----
        IDs 1 to M correspond to cluster centers and (M+1) to (M+N)
        correspond to the points to be clustered.
        """
        super().__init__()
        self._point_coords = point_coords
        self._center_coords = center_coords
        self._num_points = len(self._point_coords)
        self._num_clusters = len(self._center_coords)
        self._cluster_ids = list(np.arange(1, self._num_clusters + 1))
        self._point_ids = list(np.arange(
            self._num_clusters + 1,
            self._num_clusters + self._num_points + 1))
        self._points = dict(zip(self._point_ids, self._point_coords))
        self._cluster_centers = dict(zip(self._cluster_ids,
                                         self._center_coords))
        if edges:
            self._edges = edges
        else:
            self._edges = []

        self._problem_graph = None

    @property
    def points(self):
        return self._points

    @points.setter
    def points(self, points: ty.Dict[int, ty.Tuple[int, int]]):
        self._points = points

    @property
    def point_ids(self):
        return self._point_ids

    @property
    def point_coords(self):
        return self._point_coords

    @property
    def num_points(self):
        return self._num_points

    @property
    def edges(self):
        return self._edges

    @property
    def cluster_centers(self):
        return self._cluster_centers

    @cluster_centers.setter
    def cluster_centers(self,
                        cluster_centers: ty.Dict[int, ty.Tuple[int, int]]):
        self._cluster_centers = cluster_centers

    @property
    def cluster_ids(self):
        return self._cluster_ids

    @property
    def center_coords(self):
        return self._center_coords

    @property
    def num_clusters(self):
        return self._num_clusters

    @property
    def problem_graph(self):
        """Create the NetworkX problem graph (on first access) and
        return it. If edges are specified, they are taken into account.

        Returns
        -------
        A graph object corresponding to the problem.
        """
        if not self._problem_graph:
            self._generate_problem_graph()
        return self._problem_graph

    def _generate_problem_graph(self):
        if len(self.edges) > 0:
            gph = ntx.DiGraph()
            # Add the points to be clustered as nodes
            gph.add_nodes_from(self.point_ids)
            # If there are user-provided edges, add them between the nodes
            gph.add_edges_from(self.edges)
        else:
            gph = ntx.complete_graph(self.point_ids,
                                     create_using=ntx.DiGraph())

        node_type_dict = dict(zip(self.point_ids,
                                  ["Point"] * len(self.point_ids)))
        # Associate the node type "Point" and the point coordinates as
        # node attributes
        ntx.set_node_attributes(gph, node_type_dict, name="Type")
        ntx.set_node_attributes(gph, self.points, name="Coordinates")

        # Add cluster centers as nodes
        gph.add_nodes_from(self.cluster_ids)
        # Associate the node type "Cluster Center" and the center
        # coordinates as node attributes
        cluster_center_type_dict = dict(
            zip(self.cluster_ids,
                ["Cluster Center"] * len(self.cluster_ids)))
        ntx.set_node_attributes(gph, cluster_center_type_dict, name="Type")
        ntx.set_node_attributes(gph, self.cluster_centers,
                                name="Coordinates")

        # Add one-way edges from every cluster center to every point
        for cid in self.cluster_ids:
            for pid in self.points:
                gph.add_edge(cid, pid)

        # Compute the Euclidean distance along all edges and assign it
        # as the edge weight
        # ToDo: Replace the loop with an independent distance matrix
        #  computation and then assign the distances as edge attributes
        for edge in gph.edges.keys():
            gph.edges[edge]["cost"] = np.linalg.norm(
                np.array(gph.nodes[edge[1]]["Coordinates"]) - np.array(
                    gph.nodes[edge[0]]["Coordinates"]))

        self._problem_graph = gph
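
A minimal usage sketch of the class above (not part of the diff; the coordinates are made up, and the `Type`, `Coordinates`, and `cost` attribute names follow `_generate_problem_graph`):

# Illustrative sketch: three cluster centers get IDs 1-3 and four
# points get IDs 4-7, per the ID scheme in the class docstring.
from lava.lib.optimization.apps.clustering.problems import ClusteringProblem

clp = ClusteringProblem(point_coords=[(1, 1), (2, 2), (8, 8), (9, 9)],
                        center_coords=[(0, 0), (10, 10), (5, 5)])

gph = clp.problem_graph  # the graph is built lazily on first access
print(gph.nodes[1])      # {'Type': 'Cluster Center', 'Coordinates': (0, 0)}
print(gph.edges[(1, 4)]["cost"])  # distance from (0, 0) to (1, 1), ~1.414
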
202 changes: 202 additions & 0 deletions src/lava/lib/optimization/apps/clustering/solver.py
@@ -0,0 +1,202 @@
# Copyright (C) 2023 Intel Corporation
# SPDX-License-Identifier: BSD-3-Clause
# See: https://spdx.org/licenses/


import numpy as np
from pprint import pprint
from dataclasses import dataclass

from lava.lib.optimization.problems.problems import QUBO
from lava.lib.optimization.solvers.generic.solver import OptimizationSolver, \
    SolverReport
from lava.lib.optimization.apps.clustering.problems import ClusteringProblem
from lava.lib.optimization.apps.clustering.utils.q_matrix_generator import \
    QMatrixClust

import typing as ty
import numpy.typing as npty

from lava.magma.core.resources import (
    CPU,
    Loihi2NeuroCore,
    NeuroCore,
)
from lava.lib.optimization.solvers.generic.solver import SolverConfig

BACKENDS = ty.Union[CPU, Loihi2NeuroCore, NeuroCore, str]
CPUS = [CPU, "CPU"]
NEUROCORES = [Loihi2NeuroCore, NeuroCore, "Loihi2"]

BACKEND_MSG = f""" was requested as backend. However,
the solver currently supports only Loihi 2 and CPU backends.
These can be specified by calling solve with any of the following:
backend = "CPU"
backend = "Loihi2"
backend = CPU
backend = Loihi2NeuroCore
backend = NeuroCore
The explicit resource classes can be imported from
lava.magma.core.resources"""


@dataclass
class ClusteringConfig(SolverConfig):
    """Solver configuration for the clustering solver.

    Parameters
    ----------
    do_distance_sparsification : bool
        If True, the distance matrix is sparsified before the Q matrix
        is generated.
    sparsification_algo : str
        The sparsification algorithm to use. Defaults to "cutoff".
    max_dist_cutoff_fraction : float
        The fraction of the maximum distance used as the cut-off during
        sparsification.
    profile_q_mat_gen : bool
        If True, Q-matrix generation is profiled and the generation time
        is stored on the solver.
    only_gen_q_mat : bool
        If True, only the Q matrix is generated and the QUBO solve is
        skipped.

    Notes
    -----
    ClusteringConfig inherits from the `SolverConfig` class in
    `lava.lib.optimization.solvers.generic.solver`. Please refer to the
    documentation of `SolverConfig` for the other arguments that can be
    passed.
    """

    do_distance_sparsification: bool = False
    sparsification_algo: str = "cutoff"
    max_dist_cutoff_fraction: float = 1.0
    profile_q_mat_gen: bool = False
    only_gen_q_mat: bool = False


@dataclass
class ClusteringSolution:
    """Clustering solution, holding two dictionaries:

    - `clustering_id_map` maps a cluster center ID to the list of point
      IDs belonging to that cluster
    - `clustering_coords_map` maps cluster center coordinates to the
      coordinates of the points belonging to that cluster
    """
    clustering_id_map: dict = None
    clustering_coords_map: dict = None


class ClusteringSolver:
    """Solver for clustering problems, given cluster centers."""
    def __init__(self, clp: ClusteringProblem):
        self.problem = clp
        self._solver = None
        self._profiler = None
        self.dist_sparsity = 0.
        self.dist_proxy_sparsity = 0.
        self.q_gen_time = 0.
        self.q_shape = None
        self.raw_solution = None
        self.solution = ClusteringSolution()

    @property
    def solver(self):
        return self._solver

    @property
    def profiler(self):
        return self._profiler

    def solve(self, scfg: ClusteringConfig = ClusteringConfig()):
        """Solve a clustering problem using a given solver configuration.

        Parameters
        ----------
        scfg : ClusteringConfig
            Configuration parameters.

        Notes
        -----
        The solver object also stores profiling data as its attributes.
        """
        # 1. Generate the Q matrix for clustering
        node_list_for_clustering = self.problem.center_coords + \
            self.problem.point_coords
        # Number of binary variables = total_num_nodes * num_clusters
        mat_size = len(node_list_for_clustering) * self.problem.num_clusters
        q_mat_obj = QMatrixClust(
            node_list_for_clustering,
            num_clusters=self.problem.num_clusters,
            lambda_dist=1,
            lambda_points=100,
            lambda_centers=100,
            fixed_pt=True,
            fixed_pt_range=(-128, 127),
            clust_dist_sparse_params={
                "do_sparse": scfg.do_distance_sparsification,
                "algo": scfg.sparsification_algo,
                "max_dist_cutoff_fraction": scfg.max_dist_cutoff_fraction},
            profile_mat_gen=scfg.profile_q_mat_gen)
        q_mat = q_mat_obj.matrix.astype(int)
        self.dist_sparsity = q_mat_obj.dist_sparsity
        self.dist_proxy_sparsity = q_mat_obj.dist_proxy_sparsity
        if scfg.profile_q_mat_gen:
            self.q_gen_time = q_mat_obj.time_to_gen_mat
        self.q_shape = q_mat.shape
        # 2. Call the Lava QUBO solver
        if not scfg.only_gen_q_mat:
            prob = QUBO(q=q_mat)
            self._solver = OptimizationSolver(problem=prob)
            hparams = {
                'neuron_model': 'nebm-sa-refract',
                'refract': 10,
                'refract_scaling': 6,
                'init_state': np.random.randint(0, 2, size=(mat_size,)),
                'min_temperature': 1,
                'max_temperature': 5,
                'steps_per_temperature': 200
            }
            if not scfg.hyperparameters:
                scfg.hyperparameters.update(hparams)
            report: SolverReport = self._solver.solve(config=scfg)
            if report.profiler:
                self._profiler = report.profiler
                pprint(f"Clustering execution"
                       f" took {np.sum(report.profiler.execution_time)}s")
            # 3. Post-process the clustering solution
            self.raw_solution: npty.NDArray = \
                report.best_state.reshape(
                    (self.problem.num_clusters,
                     len(node_list_for_clustering))).T
        else:
            self.raw_solution = -1 * np.ones(
                (self.problem.num_clusters,
                 len(node_list_for_clustering))).T

        self.post_process_sol()

    def post_process_sol(self):
        """Post-process the clustering solution returned by `solve()`.

        The clustering solution returned by the `solve` method is a 2-D
        binary numpy array whose columns correspond to clusters and whose
        rows correspond to points or cluster centers. Entry (i, j) is 1
        if point/cluster center 'i' belongs to cluster 'j'.
        """
        coord_list = (self.problem.center_coords + self.problem.point_coords)
        id_map = {}
        coord_map = {}
        for j, col in enumerate(self.raw_solution.T):
            node_idxs = np.nonzero(col)
            # The ID of "this" cluster is the only nonzero row in this
            # column from row 0 to row 'num_clusters' - 1
            this_cluster_id = \
                (node_idxs[0][node_idxs[0] < self.problem.num_clusters] + 1)
            if len(this_cluster_id) != 1:
                raise ValueError(
                    f"Expected exactly one cluster center in cluster {j}, "
                    f"found {len(this_cluster_id)}. Clustering might not "
                    f"have converged to a valid solution.")
            node_idxs = node_idxs[0][
                node_idxs[0] >= self.problem.num_clusters]
            id_map.update({this_cluster_id.item(): (node_idxs + 1).tolist()})

            this_center_coords = np.array(coord_list)[this_cluster_id - 1, :]
            point_coords_this_cluster = np.array(coord_list)[node_idxs, :]
            point_coords_this_cluster = \
                [tuple(point) for point in point_coords_this_cluster.tolist()]
            coord_map.update({
                tuple(this_center_coords.flatten()):
                    point_coords_this_cluster})

        self.solution.clustering_id_map = id_map
        self.solution.clustering_coords_map = coord_map
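
Taken together, a minimal end-to-end sketch of the solver (not from the diff; it assumes `backend` is a field inherited from `SolverConfig`, and the coordinates and printed output are purely illustrative):

# Illustrative sketch, assuming a working Lava installation with the
# CPU backend; all values below are made up.
from lava.lib.optimization.apps.clustering.problems import ClusteringProblem
from lava.lib.optimization.apps.clustering.solver import (ClusteringConfig,
                                                          ClusteringSolver)

clp = ClusteringProblem(point_coords=[(1, 1), (2, 2), (8, 8), (9, 9)],
                        center_coords=[(0, 0), (10, 10)])

solver = ClusteringSolver(clp=clp)
scfg = ClusteringConfig(backend="CPU",  # assumed SolverConfig field
                        do_distance_sparsification=False)
solver.solve(scfg=scfg)

# With 2 centers (IDs 1-2) and 4 points (IDs 3-6), a valid solution
# could look like: {1: [3, 4], 2: [5, 6]}
print(solver.solution.clustering_id_map)
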