Initially, when I was using CTW (context tree weighting), the problem was with the VQ (vector quantization) step: the quantization was done independently of the inference in the Markovian model. I solved that by switching to HMMs, so that the quantization is learned simultaneously with the structure.

Can we use left-to-right HMMs (probably with emissions from a mixture of Gaussians), and still use the number of states as a complexity measure? Or perhaps a combination of \(N_{\text{states}}\) and \(N_{\text{mixtures}}\)? This might function as an indirect way of representing temporal structure.
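To make the question a bit more concrete, here is a minimal sketch of what a left-to-right transition structure looks like and of one possible way to fold the number of states and the number of mixtures into a single complexity number, namely the count of free parameters. Everything here is an assumption for illustration, not something I have settled on: the function names, the self-loop probability `p_stay`, the choice of diagonal covariances, and the decision to fix the start state to the first state are all hypothetical.

```python
import numpy as np


def left_to_right_transmat(n_states, p_stay=0.5):
    """Transition matrix where each state can only stay or move to the next one.

    p_stay is an assumed initial self-loop probability; in practice it would
    be re-estimated during training.
    """
    A = np.zeros((n_states, n_states))
    for i in range(n_states - 1):
        A[i, i] = p_stay            # remain in the current state
        A[i, i + 1] = 1.0 - p_stay  # advance to the next state
    A[-1, -1] = 1.0                 # last state is absorbing
    return A


def n_free_params(n_states, n_mix, n_dim):
    """Free-parameter count for a left-to-right HMM with GMM emissions.

    Assumes diagonal covariances and a start distribution fixed on state 0.
    """
    trans = n_states - 1                # one self-loop probability per non-final state
    weights = n_states * (n_mix - 1)    # mixture weights sum to 1 in each state
    means = n_states * n_mix * n_dim
    covars = n_states * n_mix * n_dim   # diagonal covariance entries
    return trans + weights + means + covars


if __name__ == "__main__":
    print(left_to_right_transmat(4))
    print(n_free_params(n_states=4, n_mix=2, n_dim=3))
```

With a count like this, candidate models that trade states against mixtures could be compared on one axis, for example through a penalized likelihood. Whether that number still says anything meaningful about temporal structure is exactly the open question.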

Original post is here.

It works for the original dataset pretty well. Here's what comes out: