
smith-waterman
Align sequences of data, with sequential, vectorized, and parallel implementations
Created Mar 2019
Updated the README to provide more explicit instructions on setting up the X10 environment and running the Smith-Waterman implementation. By clarifying the build process and including specific command examples for various test cases, it's now easier to get the project running. This should reduce friction for anyone setting up the repository for the first time.
This merge introduces high-performance vectorized and striped variations of the Smith-Waterman algorithm for sequence alignment. The update includes significant optimizations, such as replacing cell-based matrix structures with raw memory representations to reduce overhead, alongside improvements to parallel execution patterns. These additions provide researchers with multiple strategies for exploring hardware acceleration and cache-efficient sequence analysis. 
Updated the expected output and timing benchmarks for parvect across test suites including 'gap', 'men', and 'subst'. These changes reflect adjustments in the algorithm's output alignment and performance metrics. These updates ensure the regression test suite remains synchronized with the latest implementation.
This change deletes duplicate or unnecessary files across several results/* benchmark directories, including gap, men, and subst variants in sequential, parallel, and vectorized runs. There’s no code change here, but it does clean up the recorded outputs so future comparisons and analysis are less noisy and less error-prone. The practical effect is a tidier benchmark corpus with fewer confusing extra entries.
This update restores a previous memory optimization that replaces the Cell object matrix with a pure Long matrix for the H-matrix. By avoiding object allocation for each matrix cell, this significantly reduces the overall memory footprint and avoids object overhead. Ultimately, this change improves runtime performance and cache locality for the Smith-Waterman sequence alignment implementation.
When calculating score profiles for the Smith-Waterman sequence alignment, the program previously attempted to look up the alphabet index even for padded illegal characters, potentially leading to incorrect scores or errors. This update introduces a check to ignore ILLEGAL_CHAR entries during generated vector evaluations and removes an unnecessary increment in the vector count initialization. This ensures accurate sequence scoring and avoids out-of-bounds operations on padded trailing fragments.
The sequence padding logic has been updated for when the input sequences don't perfectly align with the vector size. The a-sequence is now properly padded with a designated illegal character, and the substitution matrix S is carefully resized to accommodate this padding. We also ensure that padding characters correctly receive a large negative score so they don't skew or artificially inflate the alignment results.
This commit addresses a bug in the OR vector operation logic where elements could be incorrectly overwritten by zeros during vector shifts. By adding conditional checks before assigning shifted values, it ensures that an element ORed with a zero-element correctly retains its original value. This ensures accurate alignment score and gap penalty calculations in the parallelized algorithm.
This update reverts a previous memory optimization that converted the H-matrix from using Cell objects to primitive Long values. It appears that stripping away the Cell object also removed necessary tracking metadata (like cell coordinates), causing issues in the alignment logic. The matrix is once again leveraging Cells, trading that intended memory efficiency back for correctness.

This commit effectively reapplies a previously backed-out change in the parallel vectorized Smith-Waterman implementation. Most of the diff is comment cleanup, but the meaningful part restores the max-score cell update from zero-based to one-based coordinates when constructing maxH, which affects how the best alignment position is reported. For anyone consuming alignment results from parVect, this should bring the reported H-matrix location back in line with the intended semantics.
A missing shift operation was causing incorrect score calculations during sequence alignment. The logic for calculating the initial F-vector has been corrected by applying the necessary left shift before subtracting the gap penalty. This ensures accurate alignment scoring in the parallel vectorized version of the algorithm.
The H-matrix type in the parallel vectorized Smith-Waterman aligner was updated from an array of Cell objects to an array of raw Long values. By avoiding the instantiation of a wrapper object for every coordinate pair, this significantly cuts down the number of allocations required during execution. Ultimately, this refactor makes memory usage far more efficient when performing sequence alignments on a large scale.
This update resolves an off-by-one indexing error when tracking the maximum score cell in the H-matrix for the parallel vector implementation of Smith-Waterman. It also comprehensively refactors the inline documentation describing vector operations and gap penalty logic during matrix generation. Correcting these coordinates ensures the expected origin point is captured for the alignment's backtrace phase.

This change updates the inline comments in SmithWatermanParVect.x10 to better describe the actual roles of the vector registers and arrays used in the Rognes-style Smith-Waterman inner loop. The code path itself does not change, but the documentation now explains concepts like H/E/F vector updates, gap penalty handling, and how state is carried between iterations, which makes the SIMD-style logic much easier to follow and maintain. Practical effect: no runtime behavior change, but future debugging and algorithm work should be less error-prone.
This change is documentation-focused: it annotates the parallel and vectorized X10 implementations with clearer comments around matrix initialization, anti-diagonal scheduling, vector register flow, gap-penalty handling, and backtracking-related state. The underlying algorithms are largely unchanged, but the code is now much easier to read and maintain, especially in the more complex SIMD-style path where temporary vectors and shifts were previously hard to reason about. Practical effect: future contributors should have a much easier time understanding and modifying the parallel alignment code.
