Skip to content

pancak3/HMM-Viterbi-CUDA

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

47 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Viterbi_CUDA

We implement the Viterbi Algorithm both in sequential and in parallel. The sequential one is src/viterbi_sequential.cu and the parallel one using CUDA is src/viterbi_cuda.cu.

To learn more details about Viterbi Algorithm, please refer to our report. We will only introduce the dependencies and running environments.
NOTE:
Our algorithms were tested on Ubuntu 18.04 with Nvidia Pascal P100, due to the difference of the architecture of Nvidia products, you may define THREADS_PER_BLOCK at line 5 of src/viterbi_cuda.cu according to your Nvidia card.

Dependency

Environment

  • PATH=$PATH:/usr/local/cuda-10.1/bin
  • LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64

Instructions

To check CUDA environment,

user@machine$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243

To compile all the files without debug info (use build_debug.sh for debug info),

user@machine$ ./build.sh 
[*] use nvcc to compile 
	{driver.cu, viterbi_cuda.cu,viterbi_sequential.cu}
	... done.
	--> bin/driver_cuda
[*] use gcc to compile ... done.
	--> bin/generator

To generate problem with 1000 states, 2000 possible observations and 100 actual observations,

user@machine$ ./bin/generator > problem.in
1000 2000 100
user@machine$ 

To compare Viterbi_Sequential and Viterbi_CUDA,

user@machine$ ./bin/driver_cuda < problem.in 
States: 1000, Emissions: 2000, Observation_length: 100
SEQUENTIAL Time: 1081669
CUDA Time: 37556
[ SEQUENTIAL OPTIMAL PATH ]
  0   2   0   1   3   0   2   1   0   4   2   0   0   0   4   3   0   1   3   0   1   3   0   0   1   3   0   1   3   0   1   3   0   1   3   0   1   3   0   4   3   0   0   0   1   3   0   4   3   0   0   1   3   0   0   2   4   2   1   3   0   1   3   0   4   3   0   4   2   0   0   0   0   4   3   0   0   0   0   1   3   0   1   3   0   0   0   1   3   0   1   3   0   9   2   1   3   0   0   1 
[ CUDA OPTIMAL PATH ]
  0   2   0   1   3   0   2   1   0   4   2   0   0   0   4   3   0   1   3   0   1   3   0   0   1   3   0   1   3   0   1   3   0   1   3   0   1   3   0   4   3   0   0   0   1   3   0   4   3   0   0   1   3   0   0   2   4   2   1   3   0   1   3   0   4   3   0   4   2   0   0   0   0   4   3   0   0   0   0   1   3   0   1   3   0   0   0   1   3   0   1   3   0   9   2   1   3   0   0   1
user@machine$ 


NOTE:
There may be unexpected bug with specific input which causing core dump. We did not successfully find the pattern and debug it. It may be solved in next century :).

About

Parallel and Multicore Computing Project 2

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published