Skip to content

Generalize TransferOperator#434

Open
rusandris wants to merge 21 commits intoJuliaDynamics:mainfrom
rusandris:main
Open

Generalize TransferOperator#434
rusandris wants to merge 21 commits intoJuliaDynamics:mainfrom
rusandris:main

Conversation

@rusandris
Copy link
Copy Markdown
Contributor

@rusandris rusandris commented Jan 20, 2025

First draft PR. I tried to follow the design ideas mentioned here #424 .
TransferOperator becomes a ProbabilitiesEstimator.

TransferOperator <: ProbabilitiesEstimator #true

First results:

x = [1.0,2.0,1.0,2.0,1.0] #simple dummy time series

#1D counting-based outcome spaces
op = OrdinalPatterns{2}()
vb = ValueBinning(RectangularBinning(5))
d = Dispersion(; c = 3, m = 2, τ = 1)
ue = UniqueElements()

#try with different outcome spaces
transferoperator(op,x)
transferoperator(vb,x)
transferoperator(d,x)
transferoperator(ue,x)

which returns a TransferOperatorApproximation <: ProbabilitiesEstimator .

OP example:

codify(op,x)
    4-element Vector{Int64}:
     1
     2
     1
     2

to = transferoperator(op,x)

to.outcome_space #gives back op
    OrdinalPatterns{2}, with 2 fields:
    encoding = OrdinalPatternEncoding(perm = [2, 1], lt = isless_rand)
    τ = 1
to.outcomes #unique outcomes
  2-element Vector{Int64}:
   1
   2

Most important field is the transfermatrix

to.transfermatrix
  2×2 SparseMatrixCSC{Float64, Int64} with 2 stored entries:
       1.0
   1.0    

probabilities function does work formally in that it gives an output:

probabilities(TransferOperator(),op,x)
   Probabilities{Float64,1} over 2 outcomes
   Outcome(1)  0.6056021883538506
   Outcome(2)  0.3943978116461494

but for some reason, for this particular example, the iterative method used in invariant_measure does not converge.

probabilities(TransferOperator(),op,x;N=4000,tolerance=1e-12) #keyword args are propagated to invariant_measure
   Probabilities{Float64,1} over 2 outcomes
   Outcome(1)  0.9508731468636822
   Outcome(2)  0.04912685313631784

the correct probabs of course would be [0.5,0.5] as the eigenvector corresponding to eigenvalue 1 is [1,1]. This might be a separate issue though.
Let me know what you think.

@Datseris
Copy link
Copy Markdown
Member

is there any paper in the literature that has generalized the transfer operator like this? Among other outcome spaces? if not, perhaps you can consider making this a paper.

@kahaaga
Copy link
Copy Markdown
Member

kahaaga commented Jan 21, 2025

is there any paper in the literature that has generalized the transfer operator like this? Among other outcome spaces? if not, perhaps you can consider making this a paper.

At least I am not aware of any such paper. Would be a cool contribution for sure!

@kahaaga
Copy link
Copy Markdown
Member

kahaaga commented Jan 21, 2025

@rusandris Thanks for the PR! I am super busy at the moment, so I'm not able to review this in detail yet. I'll try to get to it within the next couple of weeks.

the correct probabs of course would be [0.5,0.5] as the eigenvector corresponding to eigenvalue 1 is [1,1]. This might be a separate issue though.
Let me know what you think.

I'll have to do a thorough review dig deeper to understand this issue. I'll ping you when I've tested this out a bit, @rusandris.

@Datseris
Copy link
Copy Markdown
Member

is there any paper in the literature that has generalized the transfer operator like this? Among other outcome spaces? if not, perhaps you can consider making this a paper.

At least I am not aware of any such paper. Would be a cool contribution for sure!

Just to be clear: i have no capacity to be involved in such a paper, but it is worth discussing with your supervisor Andras.

@rusandris
Copy link
Copy Markdown
Contributor Author

I'll have to do a thorough review dig deeper to understand this issue. I'll ping you when I've tested this out a bit, @rusandris.

I looked inside invariant_measure to see what happens exactly.
Adding some @show statements

julia> probabilities(TransferOperator(),op,x;N=5,tolerance=1e-12)

  distribution = [0.28892942757860407 0.7110705724213959]
  distribution = [0.7110705724213959 0.28892942757860407]
  distance = 0.7778172853930047
  distribution = [0.28892942757860407 0.7110705724213959]
  distance = 0.7778172853930047
  distribution = [0.7110705724213959 0.28892942757860407]
  distance = 0.7778172853930047
  distribution = [0.28892942757860407 0.7110705724213959]
  distance = 0.7778172853930047
  distribution = [0.7110705724213959 0.28892942757860407]
  distance = 0.7778172853930047

   Probabilities{Float64,1} over 2 outcomes
     Outcome(1)  0.7110705724213959
     Outcome(2)  0.28892942757860407

reveals of course that it works as it should. Since the transition matrix looks like this

to.transfermatrix
  2×2 SparseMatrixCSC{Float64, Int64} with 2 stored entries:
       1.0
   1.0    

the iterative method keeps flipping the initial random values inside the distribution. This wouldn't converge, unless the initial distribution is already given as

probabilities(RelativeAmount(),op,x)
 Probabilities{Float64,1} over 2 outcomes
 Outcome(1)  0.5
 Outcome(2)  0.5

@Datseris
Copy link
Copy Markdown
Member

Datseris commented Jan 9, 2026

I've seeen some new commits here, but it's been more than a year since i've looked at this and have forgotten many things, let me know if there is something in particular you'd like help with, or to discuss, and I'll bring myself back up to speed!

@rusandris
Copy link
Copy Markdown
Contributor Author

rusandris commented Jan 10, 2026

Thanks, great to hear! I'm still working on it, but in short TransferOperator works like this:

logistic_rule(x, p, n) = SVector{1}(p[1] * x[1] * (1.0 - x[1]))
p = [r]
ds = DeterministicIteratedMap(logistic_rule, [0.4], p)
x, t = trajectory(ds, 10^3; Ttr=10^4)
b = ValueBinning(FixedRectangularBinning(range(0,1;length=11)))
probabilities(TransferOperator(),b,x)
 Probabilities{Float64,1} over 10 outcomes
  1  0.21500000000000005
  2  0.08700000000000001
  3  0.06900000000000002
  4  0.06799999999999999
  5  0.06599999999999995
  6  0.06999999999999998
  7  0.053999999999999986
  8  0.081
  9  0.08999999999999998
 10  0.19999999999999993

through invariant_measure. I used binning here but the outcome space can be ordinal patterns etc.
Also, there are two methods of getting an invariant measure \rho using the transfermatrix: iterative (as it was before, this is the default) and the eigenvector (newly added). (Right now the eigen method uses eigsolve from KrylovKit, since it's said to work well for extremal eigenvalues, and we're after the vector corresponding to the largest eigenvalue of the transfermatrix which is 1. But this can change if we want to avoid adding KrylovKit as a dependency).

The eigenvector method can be better for example when the time series is periodic:

 ds = DeterministicIteratedMap(logistic_rule, [0.4], [3.84])
 x, t = trajectory(ds, 10^3; Ttr=10^4)

 probabilities(TransferOperator(),b,x;method=:iterate)
   Probabilities{Float64,1} over 3 outcomes
    2  0.09340928604503983
    5  0.23372777459244914
   10  0.6728629393625111

probabilities(TransferOperator(),b,x;method=:eigen)
   Probabilities{Float64,1} over 3 outcomes
    2  0.33333333333333337
    5  0.3333333333333334
   10  0.33333333333333326

Let me know how this looks for you.

@kahaaga
Copy link
Copy Markdown
Member

kahaaga commented Jan 10, 2026

@rusandris, great stuff! 🤘Super nice that the eigenvector approach works well for periodic time series. This is a major improvement which may have advantages in e.g. reducing false negatives in settings where e.g. extensive pseudoperiodic surrogate testing is done for some statistic. For that reason I vote for keeping the dependency.

One comment though: following the convention from the test of the packaged: any keywords that controls the probabilities estimation should be part of the discretization type ('TransferOperator'), not as a keyword to the 'probabilities' function.

@rusandris
Copy link
Copy Markdown
Contributor Author

One comment though: following the convention from the test of the packaged: any keywords that controls the probabilities estimation should be part of the discretization type ('TransferOperator'), not as a keyword to the 'probabilities' function.

My thinking was the following (and I'm sure there are some workarounds/better ways to do this): the method keyword argument in probabilities is to simplify things on the user side. The value of this keyword argument is then used to decide if TransferOperatorApproximation should be wrapped inside TransferOperatorApproximationIterative or TransferOperatorApproximationEigen in order to dispatch invariantmeasure on one of them eventually.

probabilities(TransferOperatorApproximationIterative(),outcome_space,x)
probabilities(TransferOperatorApproximationEigen(),outcome_space,x)

seemed too verbose and difficult. These types are supposed to be internal and are only used once probabilities is called (which might or might not happen, maybe the user only wanted to extract the transition probabilities from the transfermatrix).
Now, I guess method can be moved inside TO and used to dispatch like

 probabilities(TransferOperator(params...;method=:eigen),b,x)

which is what you mean, right?

@kahaaga
Copy link
Copy Markdown
Member

kahaaga commented Jan 15, 2026

@rusandris Thanks again for your great work! I'll do a thorough review of this in the coming days. In order to not give conflicting feedback here, I want to familiarize myself properly with the changes before I respond in detail. But in short:

Breaking changes

I see that there is a breaking change (converting TransferOperator from an OutcomeSpace to a ProbabilitiesEstimator.
We do not want to introduce any breaking changes in the package for what is happening here: introducing a new ProbabilitiesEstimator. We can easily introduce TransferOperatorApproximation <: ProbabilitiesEstimator, just as long as we don't re-define the original TransferOperator (which then should be deprecated instead).

Internal dispatch

probabilities(TransferOperator(params...;method=:eigen),b,x)
which is what you mean, right?

Something like that! For consistency with the rest of the code base, I think the following would work:

probabilities(probest::TransferOperatorApproximation(::ApproximationMethod), o::OutcomeSpace, x)

where ApproximationMethod is either TransferOperatorApproximationIterative or TransferOperatorApproximationEigen. Internally, depending on the method, this dispatches to

probabilities(::TransferOperatorApproximationIterative,outcome_space,x)
probabilities(::TransferOperatorApproximationEigen,outcome_space,x)

I just need to see how it relates to the manual extraction of the transfer matrix in your suggested code.

Why we don't want extra keywords in the function

We want the base functions like probabilities in the package to be "pure", in the sense that they predictably operate on combinations of arguments of different types, without any configuration at the function level.

If we were to break with this convention for the transfer operator, then there would be hundreds of possible ways to discretize and estimate probabilities from data using one unified syntax, EXCEPT for the transfer operator approximation, which has an extra keyword for probabilities. We want to avoid special cases and keep the unified approach: any configuration arguments always go in the relevant types (e.g. struct TransferOperatorApproximation <: ProbabilitiesEstimator; param1; param2 ... end).

We set reasonable, well-documented default behaviours, but leave the user the option to be as granular as they want.

@rusandris
Copy link
Copy Markdown
Contributor Author

Thanks for the suggestions! Now it's starting to shape up nicely, I would say.
The old TransferOperator<:OutcomeSpace is left alone and could be deprecated to avoid breaking changes.
The new TransferOperatorEstimator <: ProbabilitiesEstimator is an estimator dummy struct. The function transfer_operator creates a TransferOperatorApproximation{OC<:OutcomeSpace,AM<:ApproximationMethod} which is used by probabilities down the line.

x = rand(10000)
op = OrdinalPatterns{3}()
probabilities(TransferOperatorEstimator(),op,x)

where an ApproximationMethod is responsible for the dispatching between the two methods:

probabilities(TransferOperatorEstimator(ApproximationIterative()),op,x)
 Probabilities{Float64,1} over 6 outcomes
 3  0.16324623547971576
 2  0.1591237693781441
 6  0.16384218566116265
 5  0.17197826172205205
 4  0.16785579628202602
 1  0.17395375147689937

probabilities(TransferOperatorEstimator(ApproximationEigen()),op,x)
 Probabilities{Float64,1} over 6 outcomes
 3  0.16324623516644013
 2  0.15912376937533454
 6  0.16384218587939298
 5  0.17197826206173342
 4  0.1678557962706277
 1  0.17395375124647128

Still work in progress though, it needs to be moved to /probabilities_estimators/.

@rusandris
Copy link
Copy Markdown
Contributor Author

Now the tests pass, finally.
Also, I realized the name change from TransferOperator to TransferOperatorEstimator wasn't necesarry, so I changed it back. The previous TransferOperator method using binning was deprecated.

@rusandris
Copy link
Copy Markdown
Contributor Author

rusandris commented Mar 16, 2026

All right so here is a quick summary of the changes:
TransferOperator is a ProbabilitiesEstimator now, instead of an OutcomeSpace.
It can be used with (count-based) outcomes, instead of being tied to binnings (the old main constructor method from time series method signature was transferoperator(pts::AbstractStateSpaceSet{D, T}, binning::Union{FixedRectangularBinning, RectangularBinning}; ).

Basic usage goes like:

x = rand(10000)
op = OrdinalPatterns{3}()
probabilities(TransferOperator(),op,x)

where TransferOperator() creates a

TransferOperator, with 2 fields:
 approximation_method = ApproximationIterative(200, 1.0e-8, 1.0e-8, Random.TaskLocalRNG())
 boundary_condition = none

The approximation method can be of two types (ApproximationIterative being the default one):

probabilities(TransferOperator(ApproximationIterative()),op,x)
 Probabilities{Float64,1} over 6 outcomes
 3  0.16324623547971576
 2  0.1591237693781441
 6  0.16384218566116265
 5  0.17197826172205205
 4  0.16785579628202602
 1  0.17395375147689937

probabilities(TransferOperator(ApproximationEigen()),op,x)
 Probabilities{Float64,1} over 6 outcomes
 3  0.16324623516644013
 2  0.15912376937533454
 6  0.16384218587939298
 5  0.17197826206173342
 4  0.1678557962706277
 1  0.17395375124647128

Under the hood, the method

  transferoperator(o::OutcomeSpace,x; boundary_condition = :none,
  approximation_method=ApproximationIterative())

creates a

TransferOperatorApproximation{OC<:OutcomeSpace,AM<:ApproximationMethod}

from the time series x with all the relevant fields

approximation_method 
outcome_space
outcomes
transfermatrix

This is passed onto invariantmeasure (this takes care of the dispatch between the two approximation methods). The selected invariantmeasure method based on the approximation is called of course by probabilities.

I'm sure I missed some things, but that's the essence of it. A more general TransferOperator estimator tool that accepts many outcome spaces, and can be used to approximate the invariant measure using two different methods with different advantages and drawbacks (see above in other comments).

Transition probabilites can be useful in Markov chain based estimator methods as well (see here for example ).

I probably left some type annotations out from some constructors here and there but those can be corrected later.
Some tests and deprecations were added but I might need to add to them as well.

We also need to check if this breaks something in other packages (I'm thinking of Associations.jl).
Docstrings are still missing, those need to be added as well.

@rusandris rusandris marked this pull request as ready for review March 16, 2026 19:46
@Datseris
Copy link
Copy Markdown
Member

hi @rusandris , this looks fantastic. Thank you for putting in the work to finish this. We are very close to the finish line, but as you pointed out already, documentation is crucial before attempting to do the final review and merge. Please tag me here once you have added documentation (this includes both docstrings and a new usage example in the examples as this is a substantial new feature).

@Datseris
Copy link
Copy Markdown
Member

Oh, also: do we need to do anything for backwards compatibility in this PR? This way we won't have to worry about breaking stuff. You can likely still extend the old methods that only use binning in the deprecations.jl file?

@rusandris
Copy link
Copy Markdown
Contributor Author

Oh, also: do we need to do anything for backwards compatibility in this PR? This way we won't have to worry about breaking stuff. You can likely still extend the old methods that only use binning in the deprecations.jl file?

So far the old methods

x = rand(1000)
b = RectangularBinning(4)
est = TransferOperator(b)
transferoperator(x, b)

are added in deprecations.jl so they still work. But the previous probabilities(x, est) and probabilities_and_outcomes(x, est) do not, because it is changed now to probabilities_and_outcomes(TransferOperator(),o,x) which also requires an outcome space and a different order of arguments. But I can probably add to the methods in deprecations.jl as you said.

@Datseris
Copy link
Copy Markdown
Member

probably worth adding in the deprecations a method that where probabilities_and_outcomes(TransferOperator(),x) makes o = RectangularBinning(10) and comes the proper method for example? I don't rmember what was teh default binning for the oriignal TrOp.

@rusandris
Copy link
Copy Markdown
Contributor Author

@Datseris I have finally managed to put together a working version of the code and the docs. I disabled link checking in the docs for now (some of the references are not found apparently), so it can build.
There are probably still some things to add and to clarify, but at least it's something.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants